Every new browser session starts from zero: no cookies, no history, no trust. CAPTCHA systems treat a fresh session as higher risk and challenge it more often. A session that persists between runs builds up trust, lowers CAPTCHA frequency, and avoids solving the same challenge over and over.
Why Session Persistence Reduces CAPTCHAs
| Session State | CAPTCHA Frequency | Reason |
|---|---|---|
| Fresh session (no cookies) | High | No trust history, unknown user |
| Session with Google cookies | Medium | reCAPTCHA recognizes the Google login |
| Warm session (browsing history) | Low | Organic behavioral signals |
| Persistent profile (days old) | Very low | Established trust score |
Cookie Persistence (Selenium/Python)
Save and Restore Cookies
```python
import json
import os
import time

from selenium import webdriver


class PersistentSession:
    def __init__(self, profile_name="default", cookie_dir="./sessions"):
        self.profile_name = profile_name
        self.cookie_dir = cookie_dir
        self.cookie_file = os.path.join(cookie_dir, f"{profile_name}_cookies.json")
        os.makedirs(cookie_dir, exist_ok=True)

    def create_driver(self):
        options = webdriver.ChromeOptions()
        options.add_argument("--window-size=1920,1080")
        return webdriver.Chrome(options=options)

    def save_cookies(self, driver):
        """Save all cookies to disk."""
        cookies = driver.get_cookies()
        with open(self.cookie_file, "w") as f:
            json.dump(cookies, f, indent=2)
        print(f"Saved {len(cookies)} cookies to {self.cookie_file}")

    def load_cookies(self, driver, domain=None):
        """Restore cookies from disk."""
        if not os.path.exists(self.cookie_file):
            print("No saved cookies found")
            return False
        with open(self.cookie_file) as f:
            cookies = json.load(f)
        loaded = 0
        for cookie in cookies:
            # Filter by domain if specified
            if domain and domain not in cookie.get("domain", ""):
                continue
            # Remove problematic fields
            cookie.pop("sameSite", None)
            cookie.pop("storeId", None)
            try:
                driver.add_cookie(cookie)
                loaded += 1
            except Exception as e:
                print(f"Skip cookie {cookie.get('name')}: {e}")
        print(f"Loaded {loaded}/{len(cookies)} cookies")
        return loaded > 0

    def run_with_session(self, url, callback):
        """Run a task with persistent session."""
        driver = self.create_driver()
        try:
            # Navigate to domain first (required for cookie loading)
            driver.get(url)
            time.sleep(1)
            # Load saved cookies
            self.load_cookies(driver)
            # Reload the page so the restored cookies are sent
            driver.get(url)
            time.sleep(2)
            # Execute task
            result = callback(driver)
            # Save updated cookies
            self.save_cookies(driver)
            return result
        finally:
            driver.quit()


# Usage
session = PersistentSession("target-site")


def my_task(driver):
    # Check if already logged in
    if "dashboard" in driver.current_url:
        print("Session restored: no login needed")
        return driver.page_source
    else:
        print("Need to login + solve CAPTCHA")
        # Solve CAPTCHA with CaptchaAI...
        return None


result = session.run_with_session("https://example.com", my_task)
```
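The `domain` filter in `load_cookies` is a substring check, so a filter of `example.com` would also accept a cookie saved for `notexample.com`. A minimal sketch of a stricter suffix match follows, assuming cookie dictionaries shaped like those returned by Selenium's `get_cookies()`. The helpers `cookie_matches_host` and `filter_cookies_for_host` are hypothetical names, not Selenium APIs, and the matching is a simplification: it treats every cookie as a domain cookie and ignores the host-only distinction.

```python
def cookie_matches_host(cookie_domain: str, host: str) -> bool:
    """Return True if a cookie saved for cookie_domain applies to host."""
    cookie_domain = cookie_domain.lstrip(".").lower()
    host = host.lower()
    if not cookie_domain:
        return False
    # Exact match, or host is a subdomain of the cookie's domain
    return host == cookie_domain or host.endswith("." + cookie_domain)


def filter_cookies_for_host(cookies, host):
    """Keep only the cookies whose domain applies to host."""
    return [c for c in cookies if cookie_matches_host(c.get("domain", ""), host)]
```

Filtering the saved list this way before calling `add_cookie` should also cut down on the "Skip cookie" messages for cookies that belong to other sites.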
Chrome User Data Directory (Full Profile Persistence)
The most complete form of persistence: it stores cookies, localStorage, cache, history, and browser state.
```python
import os

from selenium import webdriver

PROFILE_DIR = os.path.abspath("./chrome-profiles/profile-1")


def create_persistent_driver():
    options = webdriver.ChromeOptions()
    options.add_argument(f"--user-data-dir={PROFILE_DIR}")
    options.add_argument("--profile-directory=Default")
    options.add_argument("--no-sandbox")
    return webdriver.Chrome(options=options)


# First run: builds fresh profile
driver = create_persistent_driver()
driver.get("https://example.com")
# ... solve CAPTCHA, login, etc.
driver.quit()

# Second run: same profile, cookies and state preserved
driver = create_persistent_driver()
driver.get("https://example.com")
# Often skips CAPTCHA because session is recognized
driver.quit()
```
Benefits of the User Data Directory
| What Is Stored | Impact on CAPTCHA |
|---|---|
| Cookies | Session tokens, Google NID cookie |
| LocalStorage | Site-specific trust tokens |
| IndexedDB | reCAPTCHA internal state |
| Cache | Faster page loads |
| History | Browsing pattern signals |
| Service Workers | Background CAPTCHA checks |
Puppeteer Persistent Context
```javascript
const puppeteer = require("puppeteer-extra");
const StealthPlugin = require("puppeteer-extra-plugin-stealth");
const path = require("path");

puppeteer.use(StealthPlugin());

const USER_DATA_DIR = path.resolve("./chrome-profiles/profile-1");

async function runWithPersistentProfile() {
  const browser = await puppeteer.launch({
    headless: false,
    userDataDir: USER_DATA_DIR, // reuse the same profile on every run
    args: [
      "--no-sandbox",
      "--window-size=1920,1080",
    ],
  });

  try {
    const page = await browser.newPage();
    await page.goto("https://example.com", { waitUntil: "networkidle0" });

    // Check if session is active
    const isLoggedIn = await page.evaluate(
      () => document.querySelector(".user-menu") !== null
    );

    if (isLoggedIn) {
      console.log("Session active: no CAPTCHA needed");
    } else {
      console.log("Session expired: solving CAPTCHA");
      // Solve with CaptchaAI...
    }
  } finally {
    // Close the browser even on errors so the profile lock is released
    await browser.close();
  }
}

runWithPersistentProfile().catch(console.error);
```
localStorage and sessionStorage
```python
import json
import os


def save_storage(driver, filepath):
    """Save localStorage and sessionStorage for the current origin."""
    storage = driver.execute_script("""
        return {
            localStorage: Object.fromEntries(Object.entries(localStorage)),
            sessionStorage: Object.fromEntries(Object.entries(sessionStorage)),
        };
    """)
    with open(filepath, "w") as f:
        json.dump(storage, f, indent=2)


def restore_storage(driver, filepath):
    """Restore localStorage and sessionStorage (navigate to the origin first)."""
    if not os.path.exists(filepath):
        return
    with open(filepath) as f:
        storage = json.load(f)
    # Pass key and value as script arguments instead of formatting them into
    # the JavaScript source, so quotes in the stored data cannot break the call.
    for key, value in storage.get("localStorage", {}).items():
        driver.execute_script(
            "window.localStorage.setItem(arguments[0], arguments[1]);", key, value
        )
    for key, value in storage.get("sessionStorage", {}).items():
        driver.execute_script(
            "window.sessionStorage.setItem(arguments[0], arguments[1]);", key, value
        )
```
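Web storage is scoped to an origin (scheme, host, and port), so a dump taken on one origin should not be restored onto another. One way to keep dumps apart is to derive the file name from the origin. The sketch below assumes that layout; `origin_of`, `storage_file_for`, and the `_storage.json` suffix are illustrative choices, not part of Selenium.

```python
import os
import re
from urllib.parse import urlsplit


def origin_of(url: str) -> str:
    """Return the scheme://host[:port] origin of an absolute URL."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {url!r}")
    return f"{parts.scheme}://{parts.netloc}".lower()


def storage_file_for(url: str, base_dir: str = "./sessions") -> str:
    """Map a URL to a per-origin storage file path under base_dir."""
    # Replace every run of characters that is unsafe in a file name
    safe = re.sub(r"[^a-z0-9.-]+", "_", origin_of(url))
    return os.path.join(base_dir, f"{safe}_storage.json")
```

The returned path can be passed as the `filepath` argument when saving and restoring storage, so that each origin reads back only its own data.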
Session Warming Strategy
Fresh sessions trigger more CAPTCHAs. "Warming" a session with organic-looking behavior builds trust:
```python
import random
import time


def warm_session(driver, warm_urls=None):
    """Simulate organic browsing to build session trust."""
    default_urls = [
        "https://www.google.com",
        "https://www.google.com/search?q=weather",
        "https://www.wikipedia.org",
    ]
    urls = warm_urls or default_urls
    for url in urls:
        driver.get(url)
        time.sleep(random.uniform(2, 5))
        # Simulate scroll
        driver.execute_script(
            f"window.scrollTo(0, {random.randint(200, 800)})"
        )
        time.sleep(random.uniform(1, 3))
    print(f"Session warmed with {len(urls)} pages")


# Usage (create_persistent_driver is defined in the user data directory example)
driver = create_persistent_driver()
warm_session(driver)
# Now navigate to target: lower CAPTCHA chance
driver.get("https://staging.example.com/form")
```
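`warm_session` decides the order, pauses, and scroll offsets while it drives the browser, which makes a run hard to inspect or replay. A small sketch that builds the schedule first follows. `build_warm_plan` is a hypothetical helper, and the pause and scroll ranges simply mirror the values used above.

```python
import random


def build_warm_plan(urls, min_pause=2.0, max_pause=5.0, rng=None):
    """Return a shuffled list of (url, pause_seconds, scroll_px) steps."""
    rng = rng or random.Random()
    order = list(urls)
    rng.shuffle(order)  # vary the visiting order between runs
    return [
        (url, round(rng.uniform(min_pause, max_pause), 2), rng.randint(200, 800))
        for url in order
    ]
```

Each step can then be executed with `driver.get(url)`, a `time.sleep(pause)`, and a `window.scrollTo(0, scroll_px)` script. Passing a seeded `random.Random` makes a plan reproducible when debugging.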
Multi-Profile Session Manager
```python
import json
import os
from datetime import datetime

from selenium import webdriver


class SessionManager:
    """Manage multiple persistent browser profiles."""

    def __init__(self, base_dir="./sessions"):
        self.base_dir = base_dir
        self.meta_file = os.path.join(base_dir, "profiles.json")
        os.makedirs(base_dir, exist_ok=True)
        if os.path.exists(self.meta_file):
            with open(self.meta_file) as f:
                self.profiles = json.load(f)
        else:
            self.profiles = {}

    def _save_meta(self):
        with open(self.meta_file, "w") as f:
            json.dump(self.profiles, f, indent=2)

    def get_profile_dir(self, name):
        return os.path.join(self.base_dir, f"profile-{name}")

    def create_profile(self, name, proxy=None):
        """Create a new browser profile (resets metadata if the name exists)."""
        profile_dir = self.get_profile_dir(name)
        os.makedirs(profile_dir, exist_ok=True)
        self.profiles[name] = {
            "created": datetime.now().isoformat(),
            "last_used": None,
            "use_count": 0,
            "proxy": proxy,
            "captcha_solves": 0,
        }
        self._save_meta()
        return profile_dir

    def get_least_used_profile(self):
        """Return the least recently used profile; never-used profiles come first."""
        if not self.profiles:
            return None
        return min(
            self.profiles.items(),
            key=lambda x: x[1].get("last_used") or ""
        )[0]

    def record_use(self, name, solved_captcha=False):
        """Record profile usage."""
        if name in self.profiles:
            self.profiles[name]["last_used"] = datetime.now().isoformat()
            self.profiles[name]["use_count"] += 1
            if solved_captcha:
                self.profiles[name]["captcha_solves"] += 1
            self._save_meta()

    def get_stats(self):
        """Print profile statistics."""
        for name, meta in self.profiles.items():
            print(f"Profile: {name}")
            print(f"  Uses: {meta['use_count']}")
            print(f"  CAPTCHAs: {meta['captcha_solves']}")
            # last_used is stored as None until the first use
            print(f"  Last used: {meta.get('last_used') or 'never'}")
            print()


# Usage
manager = SessionManager()

# Create 5 rotating profiles, skipping existing ones so their stats survive
for i in range(5):
    name = f"worker-{i}"
    if name not in manager.profiles:
        manager.create_profile(name)

# Get next profile to use
profile_name = manager.get_least_used_profile()
profile_dir = manager.get_profile_dir(profile_name)

# Use with Selenium
options = webdriver.ChromeOptions()
options.add_argument(f"--user-data-dir={os.path.abspath(profile_dir)}")
driver = webdriver.Chrome(options=options)

# After task
driver.quit()
manager.record_use(profile_name, solved_captcha=True)
manager.get_stats()
```
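Picking the least recently used profile can still hand back one that was used moments ago when the pool is small. A minimal sketch of a cooldown-aware picker follows, assuming the metadata layout written by `SessionManager` (`last_used` as an ISO timestamp or `None`). `pick_profile` and the 300-second default are illustrative assumptions, not values from any CAPTCHA provider.

```python
from datetime import datetime, timedelta


def pick_profile(profiles, cooldown_seconds=300, now=None):
    """Return the profile idle the longest, or None if all are cooling down."""
    now = now or datetime.now()
    candidates = []
    for name, meta in profiles.items():
        last_used = meta.get("last_used")
        if last_used is None:
            return name  # never used: take it immediately
        idle = (now - datetime.fromisoformat(last_used)).total_seconds()
        if idle >= cooldown_seconds:
            candidates.append((idle, name))
    # max() compares idle time first, so the longest-idle profile wins
    return max(candidates)[1] if candidates else None
```

A caller could pass `manager.profiles` and wait or create another profile when the function returns `None`.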
Cookie Rotation and Expiry
```python
import json
import os
from datetime import datetime, timezone


def clean_expired_cookies(cookie_file):
    """Remove expired cookies from saved file."""
    if not os.path.exists(cookie_file):
        return
    with open(cookie_file) as f:
        cookies = json.load(f)
    now = datetime.now(timezone.utc).timestamp()
    # Session cookies have no "expiry" field and are always kept
    valid = [c for c in cookies if c.get("expiry", float("inf")) > now]
    removed = len(cookies) - len(valid)
    if removed > 0:
        with open(cookie_file, "w") as f:
            json.dump(valid, f, indent=2)
        print(f"Removed {removed} expired cookies")


def merge_cookies(existing_file, new_cookies):
    """Merge new cookies with existing, preferring newer values."""
    existing = []
    if os.path.exists(existing_file):
        with open(existing_file) as f:
            existing = json.load(f)
    # Index by (name, domain)
    cookie_map = {}
    for c in existing:
        key = (c["name"], c.get("domain", ""))
        cookie_map[key] = c
    for c in new_cookies:
        key = (c["name"], c.get("domain", ""))
        cookie_map[key] = c  # Newer overwrites
    merged = list(cookie_map.values())
    with open(existing_file, "w") as f:
        json.dump(merged, f, indent=2)
    return len(merged)
```
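Cleaning up after expiry is reactive. To refresh a session before it lapses, it helps to know which cookies are about to expire. A short sketch follows, assuming the Selenium cookie format used above, where `expiry` is a Unix timestamp in seconds. `cookies_expiring_within` is a hypothetical helper.

```python
import time


def cookies_expiring_within(cookies, seconds, now=None):
    """Return names of cookies whose expiry falls within the next `seconds`."""
    now = time.time() if now is None else now
    soon = []
    for c in cookies:
        expiry = c.get("expiry")
        if expiry is None:
            continue  # session cookie: no expiry recorded
        if now < expiry <= now + seconds:
            soon.append(c["name"])
    return soon
```

If the returned list contains a cookie the target site relies on, that is a reasonable cue to reopen the session and save the cookies again.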
Troubleshooting
| Problem | Cause | Fix |
|---|---|---|
| Cookies do not load | The browser has not navigated to the domain yet | Call driver.get(url) before add_cookie |
| Profile locked error | A previous Chrome instance was not closed | Kill the Chrome process and delete SingletonLock |
| Session still expires | Cookie sameSite mismatch | Remove sameSite before loading |
| Storage blocked | CORS/security context | Load storage after navigating to the correct origin |
| CAPTCHA rate rises over time | IP flagged | Rotate proxies per profile |
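The locked-profile fix in the table can be scripted. The sketch below assumes the lock artifacts sit at the top level of the user data directory under the names `SingletonLock`, `SingletonCookie`, and `SingletonSocket`, which is the layout on Linux builds as far as this example is aware; other platforms may differ. `remove_stale_locks` is a hypothetical helper and should only run once no Chrome process is using the profile.

```python
import os

# Assumed lock artifact names; verify them against your own profile directory
LOCK_FILES = ("SingletonLock", "SingletonCookie", "SingletonSocket")


def remove_stale_locks(profile_dir):
    """Delete leftover Chrome singleton files; return the names removed."""
    removed = []
    for name in LOCK_FILES:
        path = os.path.join(profile_dir, name)
        # lexists also finds symlinks whose targets no longer exist
        if os.path.lexists(path):
            os.remove(path)
            removed.append(name)
    return removed
```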
Frequently Asked Questions
How long does a browser session reduce CAPTCHA frequency?
Google's NID cookie lasts 6 months. Cloudflare's cf_clearance typically lasts 15 minutes to 1 hour. Persist both and refresh them periodically.
Can I share sessions between machines?
Yes: export the cookie file and the user data directory. Match the original session's time zone and proxy for best results.
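A minimal sketch of such an export, bundling the JSON cookie file and a profile directory into one zip archive. `export_session` is a hypothetical helper; skipping `Singleton*` lock files and cache folders is a design choice made here to keep the archive small, not a Chrome requirement. Chrome may also encrypt stored cookie values with a key tied to the operating system account, so a copied profile is not guaranteed to carry its cookies to another machine. The JSON cookie export is likely the more portable of the two.

```python
import os
import shutil


def export_session(cookie_file, profile_dir, out_base):
    """Bundle a cookie file and a profile directory into a zip; return its path."""
    staging = out_base + "_staging"
    shutil.rmtree(staging, ignore_errors=True)
    os.makedirs(staging)
    shutil.copy2(cookie_file, os.path.join(staging, "cookies.json"))
    shutil.copytree(
        profile_dir,
        os.path.join(staging, "profile"),
        # Leave out lock files and caches; they are rebuilt on the next launch
        ignore=shutil.ignore_patterns("Singleton*", "*Cache*"),
        symlinks=True,
    )
    archive = shutil.make_archive(out_base, "zip", root_dir=staging)
    shutil.rmtree(staging)
    return archive
```

On the receiving machine, `shutil.unpack_archive` restores both parts, after which the profile path can be passed to `--user-data-dir`.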
Does session persistence work with headless Chrome?
Yes. The user data directory and cookie files work the same way in headless mode. The saved cookies carry the same trust signals.
How many profiles should I maintain?
For rotating use, keep 5 to 10 profiles per target site. Rotate usage to avoid rate limiting on any single profile.
Does CaptchaAI benefit from session persistence?
Indirectly. Session persistence lowers CAPTCHA frequency, which reduces the number of CaptchaAI calls needed and saves cost. When a CAPTCHA does appear, CaptchaAI solves it as usual.
Related Guides
- Browser Profile Isolation + CaptchaAI Integration
- Puppeteer + CaptchaAI for QA
Build persistent browser sessions that reduce CAPTCHA challenges, and get your CaptchaAI key for the challenges that still appear.