When several workers solve CAPTCHAs for the same site, they share one problem: each worker has its own session. The target site sees different cookies, different IPs, and different browser fingerprints. Session state management synchronizes context across workers so that solves stay consistent and the target site sees a coherent session.
The Session State Problem
Worker 1 → Login → Solve CAPTCHA → Get cookie A → Submit form ✅
Worker 2 → New session → Solve CAPTCHA → Get cookie B → Submit form ✅
Worker 3 → Reuse cookie A? → Cookie expired → Solve CAPTCHA → Fail ❌
Without shared state, workers waste solves on sessions that have already expired and produce inconsistent behavior that the target site can detect.
What Belongs in Session State
| State Component | Lifetime | Sharing Strategy |
|---|---|---|
| Authentication cookies | Minutes to hours | Redis with TTL |
| CAPTCHA tokens | 90–300 seconds | Redis list (short TTL) |
| qa_validation_cookie cookie | ~30 minutes | Redis hash |
| CSRF tokens | Per page load | Do not share; each worker fetches its own |
| Browser fingerprint | Permanent | Configuration, not runtime state |
| Proxy assignment | Per session | Redis-backed proxy pool |
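The lifetimes in the table can be turned into configuration so that every worker derives its Redis TTLs the same way. The sketch below is one possible approach, not part of the original implementation: the names `STATE_LIFETIMES` and `cache_ttl` and the 10-second safety margin are assumptions made for illustration, and the CAPTCHA token entry uses the low end of the 90–300 second range.

```python
# Hypothetical helper: derive Redis TTLs from the lifetimes in the table.
# Values are in seconds; the safety margin is an assumed default.
STATE_LIFETIMES = {
    "captcha_token": 90,               # low end of the 90-300 second range
    "qa_validation_cookie": 30 * 60,   # roughly 30 minutes
}


def cache_ttl(component: str, safety_margin: int = 10) -> int:
    """Return a TTL slightly shorter than the component's lifetime."""
    lifetime = STATE_LIFETIMES[component]
    # Never return a non-positive TTL, even with a large margin.
    return max(1, lifetime - safety_margin)
```

With these assumed values, `cache_ttl("captcha_token")` gives 80 seconds, so a cached token is dropped before the shortest expected token lifetime is reached.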
Architecture
┌──────────────────────────────────────┐
│         Session State Store          │
│               (Redis)                │
│                                      │
│  cookies:{domain}        → Hash      │
│  tokens:{sitekey}        → List      │
│  proxies:pool            → Set       │
│  locks:{domain}:{worker} → String    │
└─────┬──────────┬──────────┬──────────┘
      │          │          │
  ┌───▼───┐  ┌───▼───┐  ┌───▼───┐
  │Worker1│  │Worker2│  │Worker3│
  └───────┘  └───────┘  └───────┘
Python Implementation
Session Store
import os
import time

import redis
import requests

# Shared Redis connection; decode_responses=True returns str instead of bytes.
r = redis.Redis(
    host=os.environ.get("REDIS_HOST", "localhost"),
    port=int(os.environ.get("REDIS_PORT", 6379)),
    decode_responses=True,
)
API_KEY = os.environ["CAPTCHAAI_API_KEY"]

# Delete the lock only if it still holds this worker's ID (runs atomically in Redis).
RELEASE_LOCK_SCRIPT = """
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
end
return 0
"""


class SessionStore:
    """Shared session state across distributed workers."""

    def __init__(self, domain):
        self.domain = domain
        self.cookie_key = f"session:cookies:{domain}"
        self.token_key = f"session:tokens:{domain}"

    def save_cookies(self, cookies, ttl=1800):
        """Store cookies from a successful session."""
        cookie_data = dict(cookies.items())
        if not cookie_data:
            return  # redis-py rejects HSET with an empty mapping
        r.hset(self.cookie_key, mapping=cookie_data)
        r.expire(self.cookie_key, ttl)

    def get_cookies(self):
        """Retrieve shared cookies, or None if none are stored."""
        cookies = r.hgetall(self.cookie_key)
        return cookies if cookies else None

    def save_token(self, sitekey, token, ttl=80):
        """Store a solved CAPTCHA token."""
        key = f"{self.token_key}:{sitekey}"
        r.rpush(key, token)
        # EXPIRE applies to the whole list, so each push refreshes the TTL
        # for every token already cached under this key.
        r.expire(key, ttl)

    def get_token(self, sitekey):
        """Pop the oldest cached CAPTCHA token, or None if the list is empty."""
        key = f"{self.token_key}:{sitekey}"
        return r.lpop(key)

    def acquire_session_lock(self, worker_id, ttl=300):
        """Ensure only one worker manages the session at a time."""
        lock_key = f"session:lock:{self.domain}"
        # SET NX EX: succeeds only if no lock exists; expires after ttl seconds.
        return r.set(lock_key, worker_id, nx=True, ex=ttl)

    def release_session_lock(self, worker_id):
        """Release session lock if this worker holds it."""
        lock_key = f"session:lock:{self.domain}"
        # Compare and delete in one step, so a lock that expired and was
        # re-acquired by another worker is not deleted by mistake.
        r.eval(RELEASE_LOCK_SCRIPT, 1, lock_key, worker_id)
Worker with Shared State
class CaptchaWorker:
    def __init__(self, worker_id, domain):
        self.worker_id = worker_id
        self.store = SessionStore(domain)
        self.session = requests.Session()

    def setup_session(self):
        """Load shared cookies into this worker's session."""
        cookies = self.store.get_cookies()
        if cookies:
            for name, value in cookies.items():
                self.session.cookies.set(name, value)
            return True
        return False

    def solve_captcha(self, sitekey, pageurl):
        """Solve with token cache and session sharing."""
        # Check for cached token
        cached = self.store.get_token(sitekey)
        if cached:
            return {"solution": cached, "source": "cache"}

        # Solve via CaptchaAI
        resp = requests.post("https://ocr.captchaai.com/in.php", data={
            "key": API_KEY,
            "method": "userrecaptcha",
            "googlekey": sitekey,
            "pageurl": pageurl,
            "json": 1
        }, timeout=30)
        data = resp.json()
        if data.get("status") != 1:
            return {"error": data.get("request")}

        captcha_id = data["request"]
        # Poll every 5 seconds, up to 60 times (about 5 minutes)
        for _ in range(60):
            time.sleep(5)
            result = requests.get("https://ocr.captchaai.com/res.php", params={
                "key": API_KEY, "action": "get",
                "id": captcha_id, "json": 1
            }, timeout=30).json()
            if result.get("status") == 1:
                token = result["request"]
                # Cache the token for other workers. reCAPTCHA tokens are
                # single-use, so the cached copy is only valid if the
                # caller does not submit this one itself.
                self.store.save_token(sitekey, token)
                return {"solution": token, "source": "api"}
            if result.get("request") != "CAPCHA_NOT_READY":
                return {"error": result.get("request")}
        return {"error": "TIMEOUT"}

    def process_page(self, url, sitekey):
        """Full workflow: setup session → solve CAPTCHA → submit."""
        # Load shared session
        self.setup_session()

        # Solve CAPTCHA
        result = self.solve_captcha(sitekey, url)
        if "error" in result:
            return result

        # Submit form with token
        response = self.session.post(url, data={
            "g-recaptcha-response": result["solution"]
        }, timeout=30)

        # Share resulting cookies only after a successful request
        if response.ok:
            self.store.save_cookies(self.session.cookies.get_dict())
        return {"status": response.status_code, "source": result["source"]}
Proxy Pool Management
class ProxyPool:
    """Distribute proxies across workers to avoid IP conflicts."""

    def __init__(self, proxies):
        self.pool_key = "session:proxy_pool"
        self.assigned_key = "session:proxy_assigned"
        # Initialize pool, skipping proxies already assigned to a worker
        assigned = set(r.hvals(self.assigned_key))
        free = [p for p in proxies if p not in assigned]
        if free:
            r.sadd(self.pool_key, *free)

    def acquire_proxy(self, worker_id, ttl=600):
        """Assign an unused proxy to a worker."""
        # Check if worker already has one
        existing = r.hget(self.assigned_key, worker_id)
        if existing:
            return existing

        # Pop from available pool
        proxy = r.spop(self.pool_key)
        if proxy:
            r.hset(self.assigned_key, worker_id, proxy)
            # EXPIRE applies to the whole hash, so this refreshes the TTL
            # for every worker's assignment, not only this one.
            r.expire(self.assigned_key, ttl)
            return proxy
        return None

    def release_proxy(self, worker_id):
        """Return proxy to the pool."""
        proxy = r.hget(self.assigned_key, worker_id)
        if proxy:
            r.sadd(self.pool_key, proxy)
            r.hdel(self.assigned_key, worker_id)
JavaScript Implementation
const Redis = require("ioredis");
const axios = require("axios");

const redis = new Redis(process.env.REDIS_URL || "redis://localhost:6379");
const API_KEY = process.env.CAPTCHAAI_API_KEY;

// Delete the lock only if it still holds this worker's ID (runs atomically in Redis).
const RELEASE_LOCK_SCRIPT = `
if redis.call("get", KEYS[1]) == ARGV[1] then
  return redis.call("del", KEYS[1])
end
return 0`;

class SessionStore {
  constructor(domain) {
    this.domain = domain;
    this.cookieKey = `session:cookies:${domain}`;
    this.tokenKey = `session:tokens:${domain}`;
  }

  // Store cookies as a Redis hash; skip the write when there is nothing to save.
  async saveCookies(cookies, ttl = 1800) {
    const entries = Object.entries(cookies).flat();
    if (entries.length > 0) {
      await redis.hset(this.cookieKey, ...entries);
      await redis.expire(this.cookieKey, ttl);
    }
  }

  async getCookies() {
    return await redis.hgetall(this.cookieKey);
  }

  // EXPIRE applies to the whole list, so each push refreshes the TTL for all tokens.
  async saveToken(sitekey, token, ttl = 80) {
    const key = `${this.tokenKey}:${sitekey}`;
    await redis.rpush(key, token);
    await redis.expire(key, ttl);
  }

  async getToken(sitekey) {
    return await redis.lpop(`${this.tokenKey}:${sitekey}`);
  }

  async acquireLock(workerId, ttl = 300) {
    const result = await redis.set(`session:lock:${this.domain}`, workerId, "NX", "EX", ttl);
    return result === "OK";
  }

  // Compare and delete in one step so another worker's lock is never removed.
  async releaseLock(workerId) {
    await redis.eval(RELEASE_LOCK_SCRIPT, 1, `session:lock:${this.domain}`, workerId);
  }
}

async function workerSolve(store, sitekey, pageurl) {
  // Check for a cached token first
  const cached = await store.getToken(sitekey);
  if (cached) return { solution: cached, source: "cache" };

  // Submit the task to CaptchaAI
  const submit = await axios.post("https://ocr.captchaai.com/in.php", null, {
    params: { key: API_KEY, method: "userrecaptcha", googlekey: sitekey, pageurl, json: 1 },
  });
  if (submit.data.status !== 1) return { error: submit.data.request };

  const captchaId = submit.data.request;
  // Poll every 5 seconds, up to 60 times (about 5 minutes)
  for (let i = 0; i < 60; i++) {
    await new Promise((r) => setTimeout(r, 5000));
    const poll = await axios.get("https://ocr.captchaai.com/res.php", {
      params: { key: API_KEY, action: "get", id: captchaId, json: 1 },
    });
    if (poll.data.status === 1) {
      await store.saveToken(sitekey, poll.data.request);
      return { solution: poll.data.request, source: "api" };
    }
    if (poll.data.request !== "CAPCHA_NOT_READY") return { error: poll.data.request };
  }
  return { error: "TIMEOUT" };
}
State Management Patterns
| Pattern | When to Use |
|---|---|
| Session lock | One worker manages login; the other workers reuse its cookies |
| Token pool | High throughput: solve ahead of time and distribute tokens |
| Cookie sharing | Workers need an authenticated session |
| Proxy affinity | The target site ties a session to an IP address |
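For the token pool pattern, a single TTL on a shared list is refreshed on every push, so an older token can stay in the list past its own lifetime. One way around this is to store the solve time next to each token and let the consumer skip stale entries. The sketch below illustrates the idea with an in-memory queue; `TokenPool` is a hypothetical helper, the 80-second default is an assumption, and a shared deployment would keep the same (timestamp, token) pairs in a Redis list and use wall-clock time instead.

```python
import time
from collections import deque
from typing import Callable, Optional


class TokenPool:
    """In-memory stand-in for a shared token list (hypothetical helper).

    Each entry records when it was solved, so a consumer can discard
    tokens older than max_age seconds instead of relying on one list TTL.
    """

    def __init__(self, max_age: float = 80.0,
                 clock: Callable[[], float] = time.monotonic):
        self.max_age = max_age
        self.clock = clock
        self._tokens = deque()  # (solved_at, token) pairs, oldest first

    def put(self, token: str) -> None:
        """Add a freshly solved token to the pool."""
        self._tokens.append((self.clock(), token))

    def get(self) -> Optional[str]:
        """Pop the oldest token that is still fresh, or None if none is."""
        while self._tokens:
            solved_at, token = self._tokens.popleft()
            if self.clock() - solved_at <= self.max_age:
                return token
            # Stale token: drop it and keep looking.
        return None
```

A worker that receives `None` from `get()` would fall back to requesting a new solve.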
Troubleshooting
| Problem | Cause | Solution |
|---|---|---|
| Workers get different sessions | Cookies are not shared via Redis | Verify that save_cookies is called after a successful request |
| Token expires before another worker uses it | Cache TTL too long or network delay | Shorten the cache TTL so it stays below the token lifetime; use a token within 10 seconds of retrieving it |
| Session lock is never released | Worker crashed | The TTL on the lock key releases it automatically (default 300 seconds) |
| Target site blocks workers | All workers use the same proxy | Use a proxy pool with per-worker affinity |
Frequently Asked Questions
Should every worker share cookies?
Only for sites that require an authenticated session. For stateless CAPTCHA solving (send sitekey → get token), workers do not need shared cookies; sharing tokens is enough.
How do I handle session expiration?
Set the Redis TTL slightly shorter than the session lifetime. When the cookies expire, one worker takes the session lock, re-authenticates, and saves the new cookies for the other workers.
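The refresh flow described in this answer could look roughly like the sketch below. It assumes `store` exposes the `get_cookies`, `save_cookies`, `acquire_session_lock`, and `release_session_lock` methods of the `SessionStore` class shown earlier; `ensure_cookies` and the `login` callable are hypothetical names introduced here, and the retry counts are arbitrary defaults.

```python
import time
from typing import Callable, Dict, Optional


def ensure_cookies(store, worker_id: str,
                   login: Callable[[], Dict[str, str]],
                   wait: float = 1.0, attempts: int = 30,
                   sleep: Callable[[float], None] = time.sleep
                   ) -> Optional[Dict[str, str]]:
    """Return shared cookies, re-authenticating if they have expired.

    `login` is a hypothetical callable that performs the site login and
    returns a dict of cookies. Returns None if no cookies appear in time.
    """
    for _ in range(attempts):
        cookies = store.get_cookies()
        if cookies:
            return cookies
        if store.acquire_session_lock(worker_id):
            try:
                # Re-check: another worker may have finished logging in
                # between our first read and taking the lock.
                cookies = store.get_cookies()
                if not cookies:
                    cookies = login()
                    store.save_cookies(cookies)
                return cookies
            finally:
                store.release_session_lock(worker_id)
        # Another worker holds the lock; wait for its cookies to appear.
        sleep(wait)
    return None
```

Only the worker that wins the lock calls `login`; the others poll until the new cookies are stored or the attempts run out.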
What about browser-based sessions (Puppeteer/Playwright)?
Serialize the browser cookies with page.cookies() and store them in Redis. Other workers load them with page.setCookie(). Those are the Puppeteer calls; in Playwright the equivalents are context.cookies() and context.addCookies(). This works across different machines and browser instances.
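As a rough illustration of the serialization step, the helpers below turn a browser cookie list into a JSON string suitable for a Redis value and filter out expired cookies when loading it back. The function names are hypothetical, and the sketch assumes each cookie is a dict that may carry an `expires` Unix timestamp, with -1 (or no value) meaning a session cookie, which is how Puppeteer and Playwright report cookies.

```python
import json
import time
from typing import Dict, List, Optional


def dump_cookies(cookies: List[Dict]) -> str:
    """Serialize a browser cookie list to a JSON string for storage."""
    return json.dumps(cookies)


def load_cookies(payload: str, now: Optional[float] = None) -> List[Dict]:
    """Parse stored cookies and drop those that have already expired."""
    now = time.time() if now is None else now
    fresh = []
    for cookie in json.loads(payload):
        expires = cookie.get("expires", -1)
        # Session cookies (no expiry or -1) are always kept.
        if expires is None or expires < 0 or expires > now:
            fresh.append(cookie)
    return fresh
```

With Playwright for Python, the list would come from `context.cookies()` and the loaded result would be passed to `context.add_cookies()`; the Redis key used to store the string is left to the caller.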
Next Steps
Coordinate your distributed CAPTCHA workers efficiently: get your CaptchaAI API key.
Related guides:
- Redis Token TTL Management
- Browser Session Persistence