When your CAPTCHA pipeline processes thousands of tasks, grep does not scale. The ELK Stack (Elasticsearch, Logstash, Kibana) lets you search, aggregate, and visualize solve logs, so you can find error patterns, track latency trends, and diagnose problems in seconds.
Architecture
[CAPTCHA Workers] → JSON logs → [Filebeat] → [Logstash] → [Elasticsearch] → [Kibana]
Structured Logging
Python: JSON Log Output
import os
import json
import time
import logging
import sys
from datetime import datetime, timezone

import requests

API_KEY = os.environ["CAPTCHAAI_API_KEY"]

# Optional per-task fields copied into the JSON entry when present on the record
EXTRA_FIELDS = (
    "captcha_id", "captcha_type", "solve_time",
    "error_code", "target_url", "poll_count",
)


class JSONFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        log_entry = {
            # ISO 8601 in UTC, which the Logstash date filter parses unambiguously
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(timespec="milliseconds"),
            # Lowercase so queries such as level:error match the keyword field
            "level": record.levelname.lower(),
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Add extra fields passed through `extra=`
        for field in EXTRA_FIELDS:
            if hasattr(record, field):
                log_entry[field] = getattr(record, field)
        return json.dumps(log_entry)


# Configure logger. Output goes to stdout; redirect it to a file under
# /var/log/captcha-worker/ so Filebeat can pick it up.
logger = logging.getLogger("captchaai")
logger.setLevel(logging.INFO)
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JSONFormatter())
logger.addHandler(handler)

session = requests.Session()


def solve_captcha(sitekey, pageurl, captcha_type="recaptcha_v2"):
    extra = {"captcha_type": captcha_type, "target_url": pageurl}

    # Submit
    resp = session.post("https://ocr.captchaai.com/in.php", data={
        "key": API_KEY,
        "method": "userrecaptcha",
        "googlekey": sitekey,
        "pageurl": pageurl,
        "json": 1
    })
    data = resp.json()

    if data.get("status") != 1:
        logger.error("Submit failed", extra={
            **extra, "error_code": data.get("request")
        })
        return {"error": data.get("request")}

    captcha_id = data["request"]
    extra["captcha_id"] = captcha_id
    logger.info("Task submitted", extra=extra)

    # Poll every 5 seconds, up to 60 attempts (5 minutes)
    start = time.time()
    poll_count = 0
    for _ in range(60):
        time.sleep(5)
        poll_count += 1
        result = session.get("https://ocr.captchaai.com/res.php", params={
            "key": API_KEY, "action": "get", "id": captcha_id, "json": 1
        }).json()

        if result.get("status") == 1:
            elapsed = round(time.time() - start, 2)
            logger.info("Solve success", extra={
                **extra,
                "solve_time": elapsed,
                "poll_count": poll_count
            })
            return {"solution": result["request"]}

        # Any response other than "not ready" is a terminal error
        if result.get("request") != "CAPCHA_NOT_READY":
            logger.error("Solve failed", extra={
                **extra,
                "error_code": result.get("request"),
                "poll_count": poll_count
            })
            return {"error": result.get("request")}

    logger.error("Solve timeout", extra={
        **extra,
        "error_code": "TIMEOUT",
        "poll_count": poll_count
    })
    return {"error": "TIMEOUT"}
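The logger above writes to stdout, while the Filebeat configuration in this guide reads files under /var/log/captcha-worker/. One way to bridge the two is to attach a file handler directly. The following is a minimal sketch, not part of the original setup: the function name `add_json_file_handler`, the file name `worker.log`, and the rotation sizes are assumptions, and a compact stand-in formatter is used so the block runs on its own.

```python
import json
import logging
import os
from logging.handlers import RotatingFileHandler


class MinimalJSONFormatter(logging.Formatter):
    """Stand-in for the article's JSONFormatter so this sketch is self-contained."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname.lower(),
            "message": record.getMessage(),
        })


def add_json_file_handler(logger, log_dir, filename="worker.log",
                          max_bytes=10_000_000, backup_count=5):
    """Attach a size-rotated file handler that writes one JSON object per line."""
    os.makedirs(log_dir, exist_ok=True)
    handler = RotatingFileHandler(
        os.path.join(log_dir, filename),
        maxBytes=max_bytes,        # assumed rotation threshold (about 10 MB)
        backupCount=backup_count,  # assumed number of rotated files to keep
        encoding="utf-8",
    )
    handler.setFormatter(MinimalJSONFormatter())
    logger.addHandler(handler)
    return handler
```

In a real worker you would likely pass `/var/log/captcha-worker` as `log_dir` and use the full formatter in place of the stand-in.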
JavaScript: Structured Logging
const axios = require("axios");

const API_KEY = process.env.CAPTCHAAI_API_KEY;

// Emit one JSON object per line on stdout
function log(level, message, fields = {}) {
  const entry = {
    timestamp: new Date().toISOString(),
    level,
    message,
    service: "captcha-worker",
    ...fields,
  };
  console.log(JSON.stringify(entry));
}

async function solveCaptcha(sitekey, pageurl, captchaType = "recaptcha_v2") {
  // snake_case keys match the Python logger and the Elasticsearch index template
  const fields = { captcha_type: captchaType, target_url: pageurl };

  const submitResp = await axios.post("https://ocr.captchaai.com/in.php", null, {
    params: {
      key: API_KEY, method: "userrecaptcha",
      googlekey: sitekey, pageurl, json: 1,
    },
  });

  if (submitResp.data.status !== 1) {
    log("error", "Submit failed", { ...fields, error_code: submitResp.data.request });
    return { error: submitResp.data.request };
  }

  const captchaId = submitResp.data.request;
  fields.captcha_id = captchaId;
  log("info", "Task submitted", fields);

  // Poll every 5 seconds, up to 60 attempts (5 minutes)
  const startTime = Date.now();
  let pollCount = 0;

  for (let i = 0; i < 60; i++) {
    await new Promise((r) => setTimeout(r, 5000));
    pollCount++;

    const pollResp = await axios.get("https://ocr.captchaai.com/res.php", {
      params: { key: API_KEY, action: "get", id: captchaId, json: 1 },
    });

    if (pollResp.data.status === 1) {
      const solveTime = ((Date.now() - startTime) / 1000).toFixed(2);
      log("info", "Solve success", {
        ...fields, solve_time: parseFloat(solveTime), poll_count: pollCount,
      });
      return { solution: pollResp.data.request };
    }

    // Any response other than "not ready" is a terminal error
    if (pollResp.data.request !== "CAPCHA_NOT_READY") {
      log("error", "Solve failed", {
        ...fields, error_code: pollResp.data.request, poll_count: pollCount,
      });
      return { error: pollResp.data.request };
    }
  }

  log("error", "Solve timeout", { ...fields, error_code: "TIMEOUT", poll_count: pollCount });
  return { error: "TIMEOUT" };
}

module.exports = { solveCaptcha };
Filebeat Configuration
# filebeat.yml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/captcha-worker/*.log
    json:
      keys_under_root: true
      add_error_key: true
      message_key: message

output.logstash:
  hosts: ["logstash:5044"]
Logstash Pipeline
# logstash-captcha.conf
input {
  beats {
    port => 5044
  }
}

filter {
  # Filebeat already decodes each JSON line (json.keys_under_root: true),
  # so the log fields arrive at the top level of the event and no second
  # json filter is needed here.

  # Add computed fields
  if [solve_time] {
    mutate {
      add_field => {
        "solve_time_bucket" => "fast"
      }
    }
    if [solve_time] > 30 {
      mutate { update => { "solve_time_bucket" => "medium" } }
    }
    if [solve_time] > 90 {
      mutate { update => { "solve_time_bucket" => "slow" } }
    }
  }

  # Use the application timestamp as the event time
  date {
    match => ["timestamp", "ISO8601"]
    target => "@timestamp"
  }
}

output {
  elasticsearch {
    hosts => ["elasticsearch:9200"]
    index => "captcha-logs-%{+YYYY.MM.dd}"
  }
}
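The bucket thresholds in the filter are easy to misread because later conditionals overwrite earlier ones. As a sanity check, the same logic can be sketched in Python. The function name `solve_time_bucket` is hypothetical and exists only for illustration.

```python
from typing import Optional


def solve_time_bucket(solve_time: Optional[float]) -> Optional[str]:
    """Mirror the Logstash conditionals: fast by default, medium above 30 s, slow above 90 s."""
    if solve_time is None:
        # Logstash skips the whole block when the field is missing
        return None
    bucket = "fast"
    if solve_time > 30:
        bucket = "medium"
    if solve_time > 90:
        bucket = "slow"
    return bucket
```

Note that the comparisons are strict, so a solve of exactly 30 seconds is still "fast" and exactly 90 seconds is still "medium".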
Elasticsearch Index Template
{
  "index_patterns": ["captcha-logs-*"],
  "template": {
    "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 0
    },
    "mappings": {
      "properties": {
        "captcha_type": { "type": "keyword" },
        "captcha_id": { "type": "keyword" },
        "error_code": { "type": "keyword" },
        "solve_time": { "type": "float" },
        "poll_count": { "type": "integer" },
        "target_url": { "type": "keyword" },
        "level": { "type": "keyword" },
        "message": { "type": "text" }
      }
    }
  }
}
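The template only takes effect once it is registered with Elasticsearch, before the first captcha-logs-* index is created. A minimal sketch using only the standard library follows. It assumes an unsecured cluster at http://elasticsearch:9200 and a hypothetical template name `captcha-logs`, and it relies on the composable `_index_template` endpoint available in Elasticsearch 7.8 and later. The helper names are illustrative.

```python
import json
import urllib.request

TEMPLATE = {
    "index_patterns": ["captcha-logs-*"],
    "template": {
        "settings": {"number_of_shards": 1, "number_of_replicas": 0},
        "mappings": {
            "properties": {
                "captcha_type": {"type": "keyword"},
                "captcha_id": {"type": "keyword"},
                "error_code": {"type": "keyword"},
                "solve_time": {"type": "float"},
                "poll_count": {"type": "integer"},
                "target_url": {"type": "keyword"},
                "level": {"type": "keyword"},
                "message": {"type": "text"},
            }
        },
    },
}


def build_template_request(base_url, name="captcha-logs", template=TEMPLATE):
    """Build (but do not send) the PUT request that registers the index template."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/_index_template/{name}",
        data=json.dumps(template).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def install_template(base_url="http://elasticsearch:9200"):
    """Send the request; urllib raises HTTPError on an error response."""
    with urllib.request.urlopen(build_template_request(base_url), timeout=10) as resp:
        return json.load(resp)
```

Keeping request construction separate from sending makes the payload easy to inspect or test without a running cluster.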
Kibana Dashboard Panels
| Panel | Visualization | Query |
|---|---|---|
| Solve success rate | Metric | Count of level:info AND message:"Solve success" divided by total tasks (for example, the count of message:"Task submitted") |
| Error breakdown | Pie chart | level:error grouped by error_code |
| Latency over time | Line chart | Average solve_time over time |
| Errors over time | Bar chart | Count of level:error per 5-minute bucket |
| Slowest solves | Data table | Top 10 by solve_time, descending |
| Queue activity | Area chart | Count by message ("Task submitted" vs "Solve success") |
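The success-rate metric is a ratio of two counts. To make the arithmetic concrete, here is a small offline sketch that computes it from JSON log lines. It assumes "total" means the number of "Task submitted" entries, which is one reasonable reading rather than the only one, and the function name `success_rate` is hypothetical.

```python
import json
from typing import Iterable, Optional


def success_rate(lines: Iterable[str]) -> Optional[float]:
    """Return solved / submitted for an iterable of JSON log lines."""
    submitted = solved = 0
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore non-JSON noise
        if not isinstance(entry, dict):
            continue
        message = entry.get("message")
        if message == "Task submitted":
            submitted += 1
        elif message == "Solve success":
            solved += 1
    # None signals "no data" instead of dividing by zero
    return solved / submitted if submitted else None
```

Running this against a sample log file is a quick way to cross-check the number a Kibana metric panel shows for the same time range.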
Useful Queries
# All errors in the last hour
level:error AND @timestamp:[now-1h TO now]

# Timeout errors for reCAPTCHA
error_code:TIMEOUT AND captcha_type:recaptcha_v2

# Slow solves (> 60 seconds)
solve_time:>60

# Errors for a specific target URL (target_url is a keyword field, so use wildcards for partial matches)
level:error AND target_url:*example.com*

# Investigate a specific CAPTCHA ID
captcha_id:"73519847"
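The same Lucene-syntax queries can also be run outside Kibana through the Elasticsearch `_search` API by wrapping them in a `query_string` query. The sketch below uses only the standard library. The helper names and the http://elasticsearch:9200 address are assumptions, and no authentication is configured.

```python
import json
import urllib.request


def build_search_request(lucene_query, base_url="http://elasticsearch:9200",
                         index="captcha-logs-*", size=20):
    """Build a _search request that runs a Lucene-syntax query, newest hits first."""
    body = {
        "size": size,
        "sort": [{"@timestamp": {"order": "desc"}}],
        "query": {"query_string": {"query": lucene_query}},
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(lucene_query, **kwargs):
    """Send the request and return the matching documents."""
    request = build_search_request(lucene_query, **kwargs)
    with urllib.request.urlopen(request, timeout=10) as resp:
        return [hit["_source"] for hit in json.load(resp)["hits"]["hits"]]
```

For example, `search("error_code:TIMEOUT AND captcha_type:recaptcha_v2")` would return up to 20 of the most recent matching log entries.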
Troubleshooting
| Problem | Cause | Fix |
|---|---|---|
| Logs do not appear in Kibana | Filebeat is not shipping logs | Check the Filebeat logs; verify that the path pattern matches your log files |
| JSON parse errors | Non-JSON lines in the log file | Enable json.add_error_key in Filebeat to flag the offending lines; fix the logger output so every line is valid JSON |
| Too many indices | Daily indices without ILM | Set up Index Lifecycle Management with 30-day retention |
| Slow queries | Missing keyword mappings | Map filterable fields as keyword rather than text |
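For the JSON parse error case, it helps to locate the offending lines before Filebeat ships them. A minimal sketch of such a check follows. The function name `find_invalid_lines` is hypothetical, and the check only verifies that each non-empty line is a JSON object.

```python
import json
from typing import Iterable, List, Tuple


def find_invalid_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every non-empty line that is not a JSON object."""
    bad = []
    for number, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text:
            continue  # blank lines are harmless
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError:
            bad.append((number, text))
            continue
        if not isinstance(parsed, dict):
            # Valid JSON, but not the object shape the pipeline expects
            bad.append((number, text))
    return bad
```

To check a log file, pass an open file handle, for example `find_invalid_lines(open(path, encoding="utf-8"))`. Stray tracebacks and print statements are typical culprits.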
Frequently Asked Questions
How long should I keep CAPTCHA logs?
30 days for operational logs, or 90 days if you need trend analysis. Use Elasticsearch ILM to delete old indices automatically.
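As a concrete illustration of the retention rule, the sketch below builds an ILM policy that deletes indices 30 days after creation. The policy name `captcha-logs-30d`, the helper names, and the cluster address are assumptions. The policy must also be referenced from the index template through the `index.lifecycle.name` setting before it applies to new indices.

```python
import json
import urllib.request


def build_ilm_policy(retention_days=30):
    """Return an ILM policy body that deletes indices once they pass the retention age."""
    return {
        "policy": {
            "phases": {
                "hot": {"actions": {}},
                "delete": {
                    "min_age": f"{retention_days}d",
                    "actions": {"delete": {}},
                },
            }
        }
    }


def build_ilm_request(base_url="http://elasticsearch:9200",
                      name="captcha-logs-30d", retention_days=30):
    """Build (but do not send) the PUT request that creates or updates the policy."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/_ilm/policy/{name}",
        data=json.dumps(build_ilm_policy(retention_days)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

Passing `retention_days=90` gives the longer window suggested for trend analysis.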
Can I use OpenSearch instead of Elasticsearch?
Yes. OpenSearch is largely API-compatible with Elasticsearch. The Logstash output plugin for OpenSearch, Filebeat, and OpenSearch Dashboards (the Kibana replacement) work the same way in this pipeline.
Should I log the CAPTCHA solution text?
No. Solutions are single-use tokens with no diagnostic value. Logging them increases storage costs and can create security issues. Log only metadata (ID, type, latency, status).
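One way to enforce this rule is to filter fields before they reach the logger. The sketch below drops a set of sensitive keys from a field dictionary. The names `SENSITIVE_KEYS` and `safe_fields`, and the key list itself, are assumptions to adapt to your own code.

```python
# Hypothetical deny-list; extend it to match the field names your workers use
SENSITIVE_KEYS = frozenset({"solution", "token", "key", "api_key"})


def safe_fields(fields):
    """Return a copy of `fields` without keys that could leak tokens or credentials."""
    return {
        name: value
        for name, value in fields.items()
        if name.lower() not in SENSITIVE_KEYS
    }
```

A call such as `logger.info("Solve success", extra=safe_fields(fields))` then keeps metadata like the CAPTCHA ID while the token never reaches the log pipeline.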
Next Steps
Search and analyze your CAPTCHA logs: get your CaptchaAI API key and set up ELK.
Related guides:
- Structured Logging
- Datadog Monitoring
- OpenTelemetry Tracing