Tutorial

CAPTCHA Solve Audit Log: Request Tracking for Compliance

When your organization solves thousands of CAPTCHAs every day, you need a record of every request. An audit log answers questions such as: Who triggered this solve? For which site? How much did it cost? When did it happen? This guide shows how to implement comprehensive audit logging for CaptchaAI operations.

What to Log

Every CAPTCHA solve should record:

| Field | Purpose | Example |
| --- | --- | --- |
| timestamp | When the request was made | 2026-04-04T14:30:00Z |
| request_id | Unique identifier for this solve | uuid4() |
| captcha_type | CAPTCHA method used | userrecaptcha |
| target_site | URL of the page being solved | https://staging.example.com/qa-login |
| task_id | CaptchaAI task ID | 73829451 |
| status | Outcome | solved, failed, timeout |
| solve_time_ms | Time from submit to result | 18432 |
| error_code | Error, if the solve failed | ERROR_CAPTCHA_UNSOLVABLE |
| initiator | Who or what triggered the solve | scraper-job-42 |
| cost_estimate | Estimated cost | 0.003 |

Do not log: API keys, CAPTCHA tokens (they are short-lived), or personally identifiable information from the target site.
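The exclusion rule above can be enforced in code rather than left to convention. The following is a minimal sketch, not part of CaptchaAI's API: the names `SENSITIVE_FIELDS` and `sanitize_record` are hypothetical, and the deny-list is an assumption you would adapt to your own field names. It drops fields that should never reach the log and strips the query string and fragment from the target URL, since those can carry session identifiers or personal data.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical deny-list; adjust to the field names your own code uses.
SENSITIVE_FIELDS = {"key", "api_key", "token", "captcha_token", "cookies"}

def sanitize_record(record):
    """Return a copy of the record that is safe to write to the audit log."""
    clean = {k: v for k, v in record.items() if k not in SENSITIVE_FIELDS}
    url = clean.get("target_site")
    if url:
        parts = urlsplit(url)
        # Keep scheme, host, and path; drop the query string and fragment.
        clean["target_site"] = urlunsplit(
            (parts.scheme, parts.netloc, parts.path, "", "")
        )
    return clean
```

Calling `sanitize_record(audit_record)` just before the record is serialized would keep the rule in one place instead of relying on every caller to remember it.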

Python Implementation

# audit_solver.py
import os
import uuid
import time
import json
import logging
from datetime import datetime, timezone
from logging.handlers import RotatingFileHandler

import requests

API_KEY = os.environ.get("CAPTCHAAI_KEY", "YOUR_API_KEY")

# Configure audit logger, kept separate from application logs
audit_logger = logging.getLogger("captcha_audit")
audit_logger.setLevel(logging.INFO)
audit_logger.propagate = False  # keep audit records out of root/application handlers

# File handler with rotation
handler = RotatingFileHandler(
    "captcha_audit.jsonl",
    maxBytes=50_000_000,  # 50 MB per file
    backupCount=10,
)
handler.setFormatter(logging.Formatter("%(message)s"))
audit_logger.addHandler(handler)

def log_audit(record):
    """Write a structured audit record."""
    audit_logger.info(json.dumps(record, default=str))

def solve_with_audit(sitekey, pageurl, captcha_type="userrecaptcha",
                      initiator="unknown"):
    """Solve a CAPTCHA with full audit logging."""
    request_id = str(uuid.uuid4())
    start = time.time()

    audit_record = {
        "request_id": request_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "captcha_type": captcha_type,
        "target_site": pageurl,
        "initiator": initiator,
        "status": "submitted",
    }

    session = requests.Session()

    try:
        # Submit
        resp = session.get("https://ocr.captchaai.com/in.php", params={
            "key": API_KEY,
            "method": captcha_type,
            "googlekey": sitekey,
            "pageurl": pageurl,
            "json": "1",
        }, timeout=30)  # fail fast instead of hanging on a stalled connection
        result = resp.json()

        if result.get("status") != 1:
            audit_record.update({
                "status": "submit_failed",
                "error_code": result.get("request"),
                "solve_time_ms": int((time.time() - start) * 1000),
            })
            log_audit(audit_record)
            return None

        task_id = result["request"]
        audit_record["task_id"] = task_id

        # Poll
        time.sleep(15)
        for _ in range(25):
            poll = session.get("https://ocr.captchaai.com/res.php", params={
                "key": API_KEY, "action": "get",
                "id": task_id, "json": "1",
            }, timeout=30)  # a timeout here is caught below and logged as "error"
            poll_result = poll.json()

            if poll_result.get("status") == 1:
                solve_time = int((time.time() - start) * 1000)
                audit_record.update({
                    "status": "solved",
                    "solve_time_ms": solve_time,
                    "cost_estimate": 0.003,  # Adjust per your rate
                })
                log_audit(audit_record)
                return poll_result["request"]

            if poll_result.get("request") != "CAPCHA_NOT_READY":
                audit_record.update({
                    "status": "failed",
                    "error_code": poll_result.get("request"),
                    "solve_time_ms": int((time.time() - start) * 1000),
                })
                log_audit(audit_record)
                return None

            time.sleep(5)

        audit_record.update({
            "status": "timeout",
            "solve_time_ms": int((time.time() - start) * 1000),
        })
        log_audit(audit_record)
        return None

    except Exception as e:
        audit_record.update({
            "status": "error",
            "error_code": str(e)[:200],
            "solve_time_ms": int((time.time() - start) * 1000),
        })
        log_audit(audit_record)
        raise

# Usage
token = solve_with_audit(
    sitekey="6Le-wvkSAAAAAPBMRTvw0Q4Muexq9bi0DJwx_mJ-",
    pageurl="https://www.google.com/recaptcha/api2/demo",
    initiator="price-scraper-v2",
)

Audit Log Output (JSONL format)

{"request_id":"a1b2c3d4-...","timestamp":"2026-04-04T14:30:00+00:00","captcha_type":"userrecaptcha","target_site":"https://www.google.com/recaptcha/api2/demo","initiator":"price-scraper-v2","status":"solved","task_id":"73829451","solve_time_ms":18432,"cost_estimate":0.003}

JavaScript Implementation

// audit_solver.js
const fs = require('fs');
const { v4: uuidv4 } = require('uuid');
const axios = require('axios');

const API_KEY = process.env.CAPTCHAAI_KEY || 'YOUR_API_KEY';
const AUDIT_FILE = 'captcha_audit.jsonl';

function logAudit(record) {
  fs.appendFileSync(AUDIT_FILE, JSON.stringify(record) + '\n');
}

async function solveWithAudit(sitekey, pageurl, initiator = 'unknown') {
  const requestId = uuidv4();
  const start = Date.now();
  const record = {
    request_id: requestId,
    timestamp: new Date().toISOString(),
    captcha_type: 'userrecaptcha',
    target_site: pageurl,
    initiator,
    status: 'submitted',
  };

  try {
    const submit = await axios.get('https://ocr.captchaai.com/in.php', {
      params: {
        key: API_KEY, method: 'userrecaptcha',
        googlekey: sitekey, pageurl, json: '1',
      },
      timeout: 30000, // ms; abort a stalled request so it is logged as an error
    });

    if (submit.data.status !== 1) {
      record.status = 'submit_failed';
      record.error_code = submit.data.request;
      record.solve_time_ms = Date.now() - start;
      logAudit(record);
      return null;
    }

    record.task_id = submit.data.request;
    await new Promise(r => setTimeout(r, 15000));

    for (let i = 0; i < 25; i++) {
      const poll = await axios.get('https://ocr.captchaai.com/res.php', {
        params: { key: API_KEY, action: 'get', id: submit.data.request, json: '1' },
        timeout: 30000, // ms; abort a stalled poll so it is logged as an error
      });

      if (poll.data.status === 1) {
        record.status = 'solved';
        record.solve_time_ms = Date.now() - start;
        record.cost_estimate = 0.003;
        logAudit(record);
        return poll.data.request;
      }
      if (poll.data.request !== 'CAPCHA_NOT_READY') {
        record.status = 'failed';
        record.error_code = poll.data.request;
        record.solve_time_ms = Date.now() - start;
        logAudit(record);
        return null;
      }
      await new Promise(r => setTimeout(r, 5000));
    }

    record.status = 'timeout';
    record.solve_time_ms = Date.now() - start;
    logAudit(record);
    return null;
  } catch (e) {
    record.status = 'error';
    record.error_code = e.message.slice(0, 200);
    record.solve_time_ms = Date.now() - start;
    logAudit(record);
    throw e;
  }
}

Querying Audit Logs

Daily Summary

import json
from collections import Counter
from datetime import datetime, timezone

def daily_summary(log_file, target_date=None):
    """Generate a daily summary from audit logs."""
    # Audit timestamps are written in UTC, so default to today's UTC date.
    target = target_date or datetime.now(timezone.utc).date().isoformat()
    statuses = Counter()
    total_cost = 0
    solve_times = []

    with open(log_file) as f:
        for line in f:
            record = json.loads(line)
            if record["timestamp"].startswith(target):
                statuses[record["status"]] += 1
                total_cost += record.get("cost_estimate", 0)
                if record.get("solve_time_ms"):
                    solve_times.append(record["solve_time_ms"])

    print(f"Date: {target}")
    print(f"Total requests: {sum(statuses.values())}")
    print(f"Statuses: {dict(statuses)}")
    print(f"Estimated cost: ${total_cost:.2f}")
    if solve_times:
        print(f"Median solve time: {sorted(solve_times)[len(solve_times)//2]}ms")

daily_summary("captcha_audit.jsonl")
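The summary above reads a single file, while `RotatingFileHandler` moves older records into backups named `captcha_audit.jsonl.1`, `captcha_audit.jsonl.2`, and so on. The sketch below is one possible way to include those backups and break the totals down by initiator. The function names `iter_audit_records` and `summary_by_initiator` are hypothetical, and the code assumes every file matching the base path is a JSONL audit file.

```python
import glob
import json
from collections import defaultdict

def iter_audit_records(base_path):
    """Yield records from the active log file and its rotated backups."""
    # RotatingFileHandler names backups base_path.1, base_path.2, ...
    for path in sorted(glob.glob(glob.escape(base_path) + "*")):
        with open(path, encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if line:  # skip blank lines
                    yield json.loads(line)

def summary_by_initiator(base_path, target_date):
    """Count requests and sum estimated cost per initiator for one UTC date."""
    totals = defaultdict(lambda: {"requests": 0, "cost": 0.0})
    for record in iter_audit_records(base_path):
        if record["timestamp"].startswith(target_date):
            entry = totals[record.get("initiator", "unknown")]
            entry["requests"] += 1
            entry["cost"] += record.get("cost_estimate", 0)
    return dict(totals)
```

For example, `summary_by_initiator("captcha_audit.jsonl", "2026-04-04")` would return a dictionary keyed by initiator, which makes it easier to see which job drives volume or cost on a given day.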

Retention and Storage

| Volume | Daily log size | Monthly storage | Recommendation |
| --- | --- | --- | --- |
| 100 solves/day | ~30 KB | ~1 MB | Local file |
| 1,000 solves/day | ~300 KB | ~10 MB | Local file + rotation |
| 10,000 solves/day | ~3 MB | ~100 MB | Ship to a log aggregator |
| 100,000 solves/day | ~30 MB | ~1 GB | Centralized logging (ELK, Datadog) |
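The figures in the table work out to roughly 300 bytes per record (30 KB for 100 solves). As a rough sketch under that assumption, the hypothetical helper `estimate_storage` below reproduces the arithmetic so you can plug in your own volume and record size.

```python
def estimate_storage(solves_per_day, bytes_per_record=300, days=30):
    """Return (daily_bytes, period_bytes) for a given solve volume.

    bytes_per_record=300 is an assumption derived from the table above;
    measure your own records if they carry extra fields.
    """
    daily = solves_per_day * bytes_per_record
    return daily, daily * days

daily, monthly = estimate_storage(10_000)
print(f"{daily / 1e6:.1f} MB/day, {monthly / 1e6:.0f} MB/month")
# prints "3.0 MB/day, 90 MB/month"
```

The result is in line with the table's rounded value of about 100 MB per month at 10,000 solves per day.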

Troubleshooting

| Problem | Cause | Solution |
| --- | --- | --- |
| Log file too large | Rotation not configured | Use RotatingFileHandler or logrotate |
| Audit records missing | Exception raised before logging | Log in a finally block |
| Slow writes at high volume | Synchronous file I/O | Use asynchronous file writes or a buffer |
| Inconsistent timestamps | System clock drift | Use NTP; log in UTC |
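Two of the fixes in the table can be sketched with the standard library alone. The names `audited` and `build_buffered_audit_logger` are hypothetical and this is one possible arrangement, not the only one. The first wraps a request in a context manager whose `finally` block guarantees one audit write even when an exception escapes. The second moves file I/O off the calling thread using `QueueHandler` and `QueueListener` from `logging.handlers`.

```python
import json
import logging
import queue
import time
from contextlib import contextmanager
from logging.handlers import QueueHandler, QueueListener, RotatingFileHandler

@contextmanager
def audited(record, write):
    """Guarantee exactly one audit write per request, even on exceptions."""
    start = time.monotonic()
    try:
        yield record
    except Exception as exc:
        record["status"] = "error"
        record["error_code"] = str(exc)[:200]
        raise
    finally:
        # Runs on success and on failure, so the record is never lost.
        record["solve_time_ms"] = int((time.monotonic() - start) * 1000)
        write(json.dumps(record, default=str))

def build_buffered_audit_logger(path, name="captcha_audit_buffered"):
    """Return (logger, listener); call listener.stop() at shutdown to flush."""
    file_handler = RotatingFileHandler(path, maxBytes=50_000_000, backupCount=10)
    file_handler.setFormatter(logging.Formatter("%(message)s"))
    log_queue = queue.Queue(-1)  # unbounded in-memory buffer
    # The listener drains the queue on a background thread and writes the file.
    listener = QueueListener(log_queue, file_handler)
    listener.start()
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(QueueHandler(log_queue))  # enqueueing does no file I/O
    logger.propagate = False
    return logger, listener
```

A caller might write `with audited(record, logger.info) as rec:` around the submit-and-poll logic and set `rec["status"]` inside the block. The trade-off of the queue approach is that records still in memory are lost if the process is killed before `listener.stop()` runs.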

Frequently Asked Questions

Should I log CAPTCHA tokens in the audit trail?

No. Tokens are short-lived (they expire within 60–300 seconds) and have no audit value. Logging them only increases storage with no benefit.

Can I use audit logs for billing reconciliation?

Yes. Compare the totals in your audit log with the CaptchaAI usage dashboard to verify billing accuracy.
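A comparison like this can be scripted once you have the dashboard figures in hand. The sketch below is illustrative only: `reconcile` is a hypothetical helper, the dashboard numbers are assumed to be read manually and passed in, and it assumes only records with status `solved` are billed, which you should confirm against your plan.

```python
import json

def reconcile(log_lines, dashboard_solved, dashboard_cost, tolerance=0.01):
    """Compare audit-log totals with figures read manually from the dashboard."""
    solved = 0
    cost = 0.0
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Assumption: only successful solves are billed.
        if record.get("status") == "solved":
            solved += 1
            cost += record.get("cost_estimate", 0)
    return {
        "logged_solved": solved,
        "logged_cost": round(cost, 4),
        "count_diff": solved - dashboard_solved,
        "cost_diff": round(cost - dashboard_cost, 4),
        "within_tolerance": (
            solved == dashboard_solved
            and abs(cost - dashboard_cost) <= tolerance
        ),
    }
```

Because `cost_estimate` in the log is a fixed per-solve estimate, a small cost difference is expected; the count difference is usually the more reliable signal.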

What retention period should I set?

90 days is a common default for operational audit logs. For compliance-driven logging, check the requirements that apply to your industry (SOC 2, GDPR, HIPAA).
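Whatever period you choose, it can be applied mechanically. The following is a minimal sketch with a hypothetical `prune_old_records` helper; it assumes every record carries an ISO 8601 `timestamp` with a UTC offset, as both implementations above write it, and it normalizes the `Z` suffix produced by JavaScript's `toISOString()` so that older Python versions can parse it.

```python
import json
from datetime import datetime, timedelta, timezone

def prune_old_records(lines, retention_days=90, now=None):
    """Return only the JSONL lines whose timestamp is within the retention window."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    kept = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        raw = json.loads(line)["timestamp"]
        # fromisoformat() accepts a trailing "Z" only from Python 3.11 onward.
        ts = datetime.fromisoformat(raw.replace("Z", "+00:00"))
        if ts >= cutoff:
            kept.append(line)
    return kept
```

The function works on lines rather than files so that you decide how the result is written back, for example to a temporary file that replaces the original once the write succeeds.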

Related Articles

  • Serverless CAPTCHA Solve Tracking with DynamoDB

Next Steps

Add accountability to every CAPTCHA solve: get your CaptchaAI API key.
