Processing 10,000 CAPTCHAs per hour means sustaining about 2.8 solves per second. That is achievable with the right architecture. This guide covers the math, the code, and the tuning needed to get there with CaptchaAI.
The Math
If a single reCAPTCHA v2 solve takes 15 seconds (median):
- Sequential: 3,600 seconds / 15 seconds = 240 solves/hour
- To reach 10,000/hour: 10,000 / 240 ≈ 42 concurrent solves in flight at all times
Key insight: you are not waiting for CaptchaAI to get faster. You overlap requests so that 42 solves complete within the same 15-second window.
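The arithmetic above can be reproduced for other targets and solve times. The helper below is a minimal sketch; the function name `required_concurrency` is made up for illustration, and it assumes every slot is busy the whole time, which real workloads only approximate.

```python
import math


def required_concurrency(target_per_hour: float, solve_seconds: float) -> int:
    """Concurrent solves needed to sustain target_per_hour.

    Assumes each slot finishes one solve every solve_seconds with no idle gaps.
    """
    per_slot_per_hour = 3600 / solve_seconds  # 240 at a 15-second median
    return math.ceil(target_per_hour / per_slot_per_hour)


print(required_concurrency(10_000, 15))  # 42
```

Polling adds latency on top of the solve time itself, so the practical figure tends to sit somewhat above this lower bound.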
Architecture
┌──────────┐     ┌────────────┐     ┌─────────────┐     ┌──────────┐
│ Task     │────▶│ Submit     │────▶│ CaptchaAI   │────▶│ Result   │
│ Queue    │     │ Workers    │     │ API         │     │ Store    │
│ (Redis)  │     │ (async)    │     │             │     │ (DB)     │
└──────────┘     └────────────┘     └─────────────┘     └──────────┘
                       │                    ▲
                       │    ┌──────────┐    │
                       └───▶│ Poll     │────┘
                            │ Workers  │
                            └──────────┘
Components:
- Task queue – Holds pending CAPTCHA tasks with their sitekey and URL
- Submit workers – Submit tasks to the CaptchaAI API concurrently
- Poll workers – Check for results at an optimal interval
- Result store – Saves tokens as they arrive
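One way to picture how these four components hand work to each other is the sketch below. It is not the article's implementation: two in-memory `asyncio.Queue` objects stand in for the Redis task queue, a dict stands in for the result store, and `submit` and `poll` are hypothetical callables you would replace with real CaptchaAI requests.

```python
import asyncio


async def submit_worker(tasks, pending, submit):
    """Take tasks off the queue, submit them, pass task IDs to the poll stage."""
    while True:
        task = await tasks.get()
        task_id = await submit(task)
        await pending.put(task_id)
        tasks.task_done()


async def poll_worker(pending, results, poll, interval):
    """Poll a pending ID; re-queue it if not ready, otherwise store the token."""
    while True:
        task_id = await pending.get()
        token = await poll(task_id)
        if token is None:
            await asyncio.sleep(interval)
            await pending.put(task_id)  # re-queue before marking this item done
        else:
            results[task_id] = token
        pending.task_done()


async def run_pipeline(task_list, submit, poll, n_submit=2, n_poll=2, interval=0.01):
    tasks, pending, results = asyncio.Queue(), asyncio.Queue(), {}
    for task in task_list:
        tasks.put_nowait(task)
    workers = [asyncio.create_task(submit_worker(tasks, pending, submit))
               for _ in range(n_submit)]
    workers += [asyncio.create_task(poll_worker(pending, results, poll, interval))
                for _ in range(n_poll)]
    await tasks.join()    # every task has been submitted
    await pending.join()  # every submitted task has produced a token
    for worker in workers:
        worker.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results


async def demo():
    attempts = {}

    async def fake_submit(task):  # stand-in for a request to in.php
        return f"id-{task['n']}"

    async def fake_poll(task_id):  # stand-in for res.php; ready on the second poll
        attempts[task_id] = attempts.get(task_id, 0) + 1
        return f"token-{task_id}" if attempts[task_id] >= 2 else None

    return await run_pipeline([{"n": i} for i in range(5)], fake_submit, fake_poll)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The sketch has no error handling: a `submit` or `poll` call that raises will stop its worker, so a production version would need to catch failures and record them.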
Python: Async Pipeline
# high_throughput_solver.py
import os
import asyncio
import time
import aiohttp

API_KEY = os.environ.get("CAPTCHAAI_KEY", "YOUR_API_KEY")
BASE_URL = "https://ocr.captchaai.com"
MAX_CONCURRENT = 50     # Max simultaneous solves
POLL_INTERVAL = 5       # Seconds between polls
INITIAL_WAIT = 12       # Seconds before first poll
MAX_POLL_ATTEMPTS = 25  # Polls before a task is counted as failed

semaphore = asyncio.Semaphore(MAX_CONCURRENT)
stats = {"submitted": 0, "solved": 0, "failed": 0, "start": 0}


async def solve_one(session, sitekey, pageurl, task_num):
    """Submit and poll a single CAPTCHA."""
    async with semaphore:
        try:
            # Submit
            async with session.get(f"{BASE_URL}/in.php", params={
                "key": API_KEY, "method": "userrecaptcha",
                "googlekey": sitekey, "pageurl": pageurl, "json": "1",
            }) as resp:
                result = await resp.json(content_type=None)
            if result.get("status") != 1:
                stats["failed"] += 1
                return None
            stats["submitted"] += 1
            task_id = result["request"]

            # Wait before first poll
            await asyncio.sleep(INITIAL_WAIT)

            # Poll
            for _ in range(MAX_POLL_ATTEMPTS):
                async with session.get(f"{BASE_URL}/res.php", params={
                    "key": API_KEY, "action": "get",
                    "id": task_id, "json": "1",
                }) as resp:
                    poll_result = await resp.json(content_type=None)
                if poll_result.get("status") == 1:
                    stats["solved"] += 1
                    return poll_result["request"]
                if poll_result.get("request") != "CAPCHA_NOT_READY":
                    stats["failed"] += 1
                    return None
                await asyncio.sleep(POLL_INTERVAL)

            # Ran out of poll attempts
            stats["failed"] += 1
            return None
        except Exception:
            # Network errors and malformed responses count as failures
            stats["failed"] += 1
            return None


async def run_batch(tasks):
    """Process a batch of CAPTCHA tasks concurrently."""
    connector = aiohttp.TCPConnector(
        limit=MAX_CONCURRENT,
        keepalive_timeout=60,
    )
    async with aiohttp.ClientSession(connector=connector) as session:
        coros = [
            solve_one(session, task["sitekey"], task["pageurl"], i)
            for i, task in enumerate(tasks)
        ]
        results = await asyncio.gather(*coros)
    return results


async def main():
    # Generate test tasks (replace with your task source)
    tasks = [
        {
            "sitekey": "6Le-wvkSAAAAAPBMRTvw0Q4Muexq9bi0DJwx_mJ-",
            "pageurl": "https://www.google.com/recaptcha/api2/demo",
        }
        for _ in range(100)  # Start with 100 tasks
    ]
    stats["start"] = time.time()
    print(f"Processing {len(tasks)} tasks with {MAX_CONCURRENT} concurrent workers")
    results = await run_batch(tasks)
    elapsed = time.time() - stats["start"]
    print(f"\nCompleted in {elapsed:.0f}s")
    print(f"Submitted: {stats['submitted']}")
    print(f"Solved: {stats['solved']}")
    print(f"Failed: {stats['failed']}")
    print(f"Throughput: {stats['solved'] / (elapsed / 3600):.0f} solves/hour")


asyncio.run(main())
JavaScript: Concurrent Pipeline
// high_throughput_solver.js
const axios = require('axios');
const https = require('https');

const API_KEY = process.env.CAPTCHAAI_KEY || 'YOUR_API_KEY';
const BASE = 'https://ocr.captchaai.com';
const MAX_CONCURRENT = 50;

const agent = new https.Agent({ keepAlive: true, maxSockets: MAX_CONCURRENT });
const api = axios.create({ baseURL: BASE, httpsAgent: agent, timeout: 30000 });
const stats = { submitted: 0, solved: 0, failed: 0 };

async function solveOne(sitekey, pageurl) {
  try {
    const submit = await api.get('/in.php', {
      params: { key: API_KEY, method: 'userrecaptcha', googlekey: sitekey, pageurl, json: '1' },
    });
    if (submit.data.status !== 1) { stats.failed++; return null; }
    stats.submitted++;

    await new Promise(r => setTimeout(r, 12000));

    for (let i = 0; i < 25; i++) {
      const poll = await api.get('/res.php', {
        params: { key: API_KEY, action: 'get', id: submit.data.request, json: '1' },
      });
      if (poll.data.status === 1) { stats.solved++; return poll.data.request; }
      if (poll.data.request !== 'CAPCHA_NOT_READY') { stats.failed++; return null; }
      await new Promise(r => setTimeout(r, 5000));
    }
    stats.failed++;
    return null;
  } catch { stats.failed++; return null; }
}

async function runWithConcurrency(tasks, limit) {
  const results = [];
  const executing = new Set();
  for (const task of tasks) {
    const p = solveOne(task.sitekey, task.pageurl).then(r => {
      executing.delete(p);
      return r;
    });
    executing.add(p);
    results.push(p);
    if (executing.size >= limit) {
      await Promise.race(executing);
    }
  }
  return Promise.all(results);
}

(async () => {
  const tasks = Array.from({ length: 100 }, () => ({
    sitekey: '6Le-wvkSAAAAAPBMRTvw0Q4Muexq9bi0DJwx_mJ-',
    pageurl: 'https://www.google.com/recaptcha/api2/demo',
  }));
  const start = Date.now();
  console.log(`Processing ${tasks.length} tasks, ${MAX_CONCURRENT} concurrent`);
  await runWithConcurrency(tasks, MAX_CONCURRENT);
  const elapsed = (Date.now() - start) / 1000;
  console.log(`\nDone in ${elapsed.toFixed(0)}s`);
  console.log(`Solved: ${stats.solved}, Failed: ${stats.failed}`);
  console.log(`Throughput: ${(stats.solved / (elapsed / 3600)).toFixed(0)} solves/hour`);
  agent.destroy();
})();
Tuning Parameters
| Parameter | Conservative | Balanced | Aggressive |
|---|---|---|---|
| MAX_CONCURRENT | 20 | 50 | 100 |
| INITIAL_WAIT | 15 s | 12 s | 10 s |
| POLL_INTERVAL | 7 s | 5 s | 3 s |
| MAX_POLL_ATTEMPTS | 30 | 25 | 20 |
| Expected throughput | ~4,800/hour | ~10,000/hour | ~18,000/hour |
Start conservatively and raise MAX_CONCURRENT until you see diminishing returns or a rising error rate.
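The "raise it until it stops paying off" rule can be written down as a small decision function. This is only a sketch: `next_concurrency`, the step of 10, and the 5% minimum-gain threshold are illustrative assumptions, while the 5% error ceiling and the cap of 100 are borrowed from the figures used elsewhere in this guide.

```python
def next_concurrency(current, prev_throughput, throughput, error_rate,
                     step=10, max_error_rate=0.05, min_gain=0.05, ceiling=100):
    """Suggest the next MAX_CONCURRENT from the last two measurement windows.

    prev_throughput and throughput are solves/hour measured before and after
    the most recent increase; error_rate is a fraction between 0 and 1.
    """
    if error_rate > max_error_rate:
        return max(step, current - step)  # errors rising: back off
    if prev_throughput > 0 and throughput < prev_throughput * (1 + min_gain):
        return current                    # diminishing returns: hold
    return min(ceiling, current + step)   # still scaling: step up


print(next_concurrency(20, 0, 4800, 0.01))       # 30
print(next_concurrency(50, 9800, 10000, 0.02))   # 50
print(next_concurrency(80, 14000, 15000, 0.12))  # 70
```

Measure each level over a window long enough to cover many solves before calling it again, otherwise noise in a short window can look like a plateau.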
Monitoring Throughput
Track these metrics in real time:
- Solves per minute – Should hold at ~167 for a 10K/hour target
- Error rate – Keep it below 5%. If it spikes, reduce concurrency
- Queue depth – If it grows, add workers. If it stays empty, you are over-provisioned
- P90 solve time – If it climbs, CaptchaAI may be rate-limiting you
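Three of these metrics (solves per minute, error rate, and P90 solve time) can be derived from a rolling window of finished tasks. The class below is a minimal in-process sketch; `ThroughputMonitor` and its method names are hypothetical, and queue depth is left out because it depends on whichever queue you use.

```python
import time
from collections import deque


class ThroughputMonitor:
    """Rolling-window metrics over finished tasks (solved or failed)."""

    def __init__(self, window_seconds=60.0):
        self.window = window_seconds
        self.events = deque()  # (timestamp, ok, solve_seconds)

    def record(self, ok, solve_seconds, now=None):
        now = time.monotonic() if now is None else now
        self.events.append((now, ok, solve_seconds))
        self._trim(now)

    def _trim(self, now):
        # Drop events that have fallen out of the window
        while self.events and self.events[0][0] <= now - self.window:
            self.events.popleft()

    def snapshot(self, now=None):
        now = time.monotonic() if now is None else now
        self._trim(now)
        solved = sorted(s for _, ok, s in self.events if ok)
        total = len(self.events)
        # Nearest-rank P90: index ceil(0.9 * n) - 1, in integer arithmetic
        p90 = solved[(9 * len(solved) + 9) // 10 - 1] if solved else None
        return {
            "solves_per_minute": len(solved) * 60.0 / self.window,
            "error_rate": (total - len(solved)) / total if total else 0.0,
            "p90_solve_seconds": p90,
        }
```

Call `record(True, elapsed)` when a token arrives and `record(False, elapsed)` on failure, then compare `snapshot()` against the targets above on a timer.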
Troubleshooting
| Problem | Cause | Fix |
|---|---|---|
| Throughput plateaus at ~5K/hour | Not enough concurrency | Raise MAX_CONCURRENT to 80–100 |
| Error rate > 10% | API overloaded or bad proxies | Reduce concurrency; check proxy health |
| Memory keeps growing | Unbounded task accumulation | Process results as they arrive instead of buffering them |
| ERROR_NO_SLOT_AVAILABLE | CaptchaAI queue is full | Back off and retry after 5 seconds |
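The last row, retrying after a 5-second pause when no slot is available, might look like the sketch below. It assumes the error code arrives in the `request` field with a `status` other than 1, mirroring how the solver code in this guide reads responses; `submit_with_backoff` and the retry cap of 5 are illustrative choices, and `submit` is any coroutine function that performs one submit request and returns the parsed JSON.

```python
import asyncio

RETRYABLE = {"ERROR_NO_SLOT_AVAILABLE"}


async def submit_with_backoff(submit, max_retries=5, delay=5.0, sleep=asyncio.sleep):
    """Call submit() until it returns a task ID or a non-retryable error."""
    for attempt in range(max_retries + 1):
        result = await submit()
        if result.get("status") == 1:
            return result["request"]  # task ID
        if result.get("request") not in RETRYABLE or attempt == max_retries:
            raise RuntimeError(f"submit failed: {result.get('request')}")
        await sleep(delay)  # fixed 5-second pause, as in the table


async def demo_backoff():
    # Fake responses: two "no slot" errors, then success
    responses = iter([
        {"status": 0, "request": "ERROR_NO_SLOT_AVAILABLE"},
        {"status": 0, "request": "ERROR_NO_SLOT_AVAILABLE"},
        {"status": 1, "request": "12345"},
    ])
    waits = []

    async def fake_submit():
        return next(responses)

    async def fake_sleep(seconds):  # record the delay instead of waiting
        waits.append(seconds)

    task_id = await submit_with_backoff(fake_submit, sleep=fake_sleep)
    return task_id, waits


if __name__ == "__main__":
    print(asyncio.run(demo_backoff()))
```

With many workers hitting the same error at once, adding random jitter to `delay` is a common way to keep the retries from arriving in lockstep.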
FAQ
What is CaptchaAI's concurrency limit?
There is no hard limit on concurrent requests, but very high concurrency (500+) can trigger rate limiting. Start at 50 and scale up gradually.
Can I run this across multiple machines?
Yes. Use a shared queue (Redis, RabbitMQ) and run workers on several servers. Each worker processes tasks independently.
How should I monitor my balance at this volume?
At 10,000 solves/hour, watch your balance closely. Use the balance-check endpoint (res.php?action=getbalance) and set up alerts.
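A periodic balance check could be sketched as follows, using only the standard library. The response shape is an assumption: the code expects that adding `json=1` returns `{"status": 1, "request": "<balance>"}`, in line with the other responses in this guide, so verify that against the CaptchaAI documentation. The function names and the alert threshold are hypothetical.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://ocr.captchaai.com"


def parse_balance(body: str) -> float:
    """Parse a getbalance response body (JSON shape is an assumption)."""
    data = json.loads(body)
    if data.get("status") != 1:
        raise ValueError(f"balance check failed: {data.get('request')}")
    return float(data["request"])


def fetch_balance(api_key: str) -> float:
    """Query res.php?action=getbalance and return the balance as a float."""
    query = urllib.parse.urlencode(
        {"key": api_key, "action": "getbalance", "json": "1"}
    )
    with urllib.request.urlopen(f"{BASE_URL}/res.php?{query}", timeout=30) as resp:
        return parse_balance(resp.read().decode())


def is_low(balance: float, threshold: float) -> bool:
    """True when the balance has dropped below the alert threshold."""
    return balance < threshold
```

Run `fetch_balance` on a timer (every few minutes, for example) and send an alert through your usual channel when `is_low` returns true.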
Next Steps
Build your high-throughput CAPTCHA pipeline: get your CaptchaAI API key.
Related guides:
- HTTP/2 Keep-Alive Connections for CAPTCHA APIs
- Benchmarking CAPTCHA Solve Times