Use Cases

CAPTCHA Handling for WHOIS Domain Lookup Automation

WHOIS lookup portals protect domain registration data with reCAPTCHA v2, image CAPTCHAs, and rate limiting. Whether you are checking domain availability, verifying ownership, or monitoring expiration dates, a CAPTCHA typically appears after just a few queries. Here is how to handle it.

CAPTCHA Patterns on WHOIS Portals

Portal type | CAPTCHA | Trigger threshold
ICANN WHOIS | reCAPTCHA v2 | 3–5 queries per session
Registrar lookup pages | reCAPTCHA v2/v3 | 5–10 queries per minute
RIRs (APNIC, RIPE) | Image CAPTCHA | 10–20 queries
Domain auction WHOIS | Cloudflare Turnstile | Rapid queries
Bulk WHOIS tools | Custom CAPTCHA | After the free rate limit is reached
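Because each portal type serves a different challenge, it can help to classify a response before deciding how to solve it. The following is a minimal sketch, not a definitive detector: the function name `detect_captcha` and the labels it returns are hypothetical, and the markers are only the widgets' usual public embed conventions (the `g-recaptcha` and `cf-turnstile` class names, and the `render=` parameter on the reCAPTCHA script URL). Custom CAPTCHAs will not be recognized.

```python
import re


def detect_captcha(html: str) -> str:
    """Return a coarse CAPTCHA label for a portal response body.

    The markers below are common embed conventions, not guarantees.
    Treat "none" as "nothing recognized", not as proof that no CAPTCHA exists.
    """
    lowered = html.lower()
    if "cf-turnstile" in lowered:
        return "turnstile"
    # reCAPTCHA v3 is usually loaded with api.js?render=<site key>;
    # render=explicit is a v2 loading mode, so it is excluded here.
    if re.search(r"recaptcha/api\.js\?[^\"']*render=(?!explicit)", lowered):
        return "recaptcha_v3"
    if "g-recaptcha" in lowered:
        return "recaptcha_v2"
    # Heuristic for classic image challenges: an <img> tag mentioning "captcha".
    if re.search(r"<img[^>]+captcha", lowered):
        return "image"
    return "none"
```

A caller could branch on the label, for example falling back to a longer delay when the result is `"none"` but the expected WHOIS data is missing.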

WHOIS Lookup with CAPTCHA Solving

import requests
import time
import re

class WhoisLookup:
    def __init__(self, api_key):
        self.api_key = api_key
        self.session = requests.Session()
        self.session.headers.update({
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
        })

    def lookup(self, domain, whois_url):
        """Look up WHOIS data for a domain, solving CAPTCHAs as needed."""
        response = self.session.get(whois_url, params={"domain": domain})

        if self._has_recaptcha(response.text):
            site_key = self._extract_site_key(response.text)
            token = self._solve_recaptcha(site_key, whois_url)
            response = self.session.post(whois_url, data={
                "domain": domain,
                "g-recaptcha-response": token
            })

        return self._parse_whois(response.text)

    def bulk_lookup(self, domains, whois_url, delay=3):
        """Look up WHOIS for multiple domains."""
        results = {}
        for domain in domains:
            try:
                results[domain] = self.lookup(domain, whois_url)
            except Exception as e:
                results[domain] = {"error": str(e)}
            time.sleep(delay)
        return results

    def check_availability(self, domains, whois_url):
        """Check which domains are available for registration."""
        results = self.bulk_lookup(domains, whois_url)
        available, taken, failed = [], [], []

        for domain, data in results.items():
            if data.get("error"):
                # A failed lookup says nothing about availability.
                failed.append(domain)
            elif data.get("status") == "available":
                available.append(domain)
            else:
                taken.append(domain)

        return {"available": available, "taken": taken, "failed": failed}

    def _has_recaptcha(self, html):
        return "g-recaptcha" in html or "recaptcha" in html.lower()

    def _extract_site_key(self, html):
        match = re.search(r'data-sitekey="([^"]+)"', html)
        if match:
            return match.group(1)
        raise ValueError("reCAPTCHA site key not found")

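    def _solve_recaptcha(self, site_key, page_url):
        # Submit the solve task
        resp = requests.post("https://ocr.captchaai.com/in.php", data={
            "key": self.api_key,
            "method": "userrecaptcha",
            "googlekey": site_key,
            "pageurl": page_url,
            "json": 1
        }, timeout=30)
        submitted = resp.json()
        # As in the polling loop below, status == 1 signals success; otherwise
        # "request" is assumed to carry an error code rather than a task ID.
        if submitted.get("status") != 1:
            raise RuntimeError(f"CAPTCHA submit failed: {submitted.get('request')}")
        task_id = submitted["request"]

        # Poll every 3 seconds, for up to 3 minutes in total
        for _ in range(60):
            time.sleep(3)
            result = requests.get("https://ocr.captchaai.com/res.php", params={
                "key": self.api_key,
                "action": "get",
                "id": task_id,
                "json": 1
            }, timeout=30)
            data = result.json()
            if data["status"] == 1:
                return data["request"]

        raise TimeoutError("reCAPTCHA solve timed out")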

    def _parse_whois(self, html):
        from bs4 import BeautifulSoup
        soup = BeautifulSoup(html, "html.parser")

        # Look for WHOIS data in pre-formatted blocks or tables
        raw_whois = soup.select_one("pre, .whois-data, #whois-result")
        if raw_whois:
            text = raw_whois.get_text()
            return self._extract_fields(text)

        return {"raw": soup.get_text()[:2000]}

    def _extract_fields(self, text):
        fields = {}
        patterns = {
            "registrar": r"Registrar:\s*(.+)",
            "created": r"Creat(?:ed|ion) Date:\s*(.+)",
            "expires": r"(?:Expir(?:y|ation)|Registry Expiry) Date:\s*(.+)",
            "updated": r"Updated Date:\s*(.+)",
            "status": r"(?:Domain )?Status:\s*(.+)",
            "nameservers": r"Name Server:\s*(.+)",
            "registrant": r"Registrant (?:Name|Organization):\s*(.+)"
        }

        for field, pattern in patterns.items():
            # Strip every match so trailing "\r" or spaces never leak into values
            matches = [m.strip() for m in re.findall(pattern, text, re.IGNORECASE)]
            if matches:
                fields[field] = matches if len(matches) > 1 else matches[0]

        return fields


# Usage
whois = WhoisLookup("YOUR_API_KEY")

# Single lookup
result = whois.lookup("example.com", "https://whois.example.com/lookup")
print(f"Registrar: {result.get('registrar')}")
print(f"Expires: {result.get('expires')}")

# Bulk availability check
domains = ["startup-name.com", "my-project.io", "cool-app.dev"]
availability = whois.check_availability(domains, "https://whois.example.com/lookup")
print(f"Available: {availability['available']}")

Domain Monitoring (JavaScript)

class DomainMonitor {
  constructor(apiKey) {
    this.apiKey = apiKey;
    this.watchList = new Map();
  }

  addDomain(domain, whoisUrl) {
    this.watchList.set(domain, { url: whoisUrl, history: [] });
  }

  async checkExpirations() {
    const expiring = [];

    for (const [domain, config] of this.watchList) {
      try {
        const data = await this.lookup(domain, config.url);
        config.history.push({ ...data, checkedAt: new Date().toISOString() });

        if (data.expires) {
          const daysLeft = Math.ceil(
            (new Date(data.expires) - new Date()) / (1000 * 60 * 60 * 24)
          );
          if (daysLeft <= 30) {
            expiring.push({ domain, daysLeft, expires: data.expires });
          }
        }
      } catch (error) {
        console.error(`Failed to check ${domain}: ${error.message}`);
      }
    }

    return expiring;
  }

  async lookup(domain, whoisUrl) {
    // Encode the domain so unusual characters cannot break the query string
    const response = await fetch(`${whoisUrl}?domain=${encodeURIComponent(domain)}`);
    const html = await response.text();

    if (html.includes('g-recaptcha')) {
      return this.solveAndLookup(domain, whoisUrl, html);
    }

    return this.parseWhois(html);
  }

  async solveAndLookup(domain, whoisUrl, html) {
    const match = html.match(/data-sitekey="([^"]+)"/);
    if (!match) throw new Error('No reCAPTCHA site key found');

    const submitResp = await fetch('https://ocr.captchaai.com/in.php', {
      method: 'POST',
      body: new URLSearchParams({
        key: this.apiKey,
        method: 'userrecaptcha',
        googlekey: match[1],
        pageurl: whoisUrl,
        json: '1'
      })
    });
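    const submitted = await submitResp.json();
    // status === 1 signals success; otherwise "request" is assumed to hold an error code
    if (submitted.status !== 1) {
      throw new Error(`CAPTCHA submit failed: ${submitted.request}`);
    }
    const taskId = submitted.request;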

    for (let i = 0; i < 60; i++) {
      await new Promise(r => setTimeout(r, 3000));
      const result = await fetch(
        `https://ocr.captchaai.com/res.php?key=${this.apiKey}&action=get&id=${taskId}&json=1`
      );
      const data = await result.json();
      if (data.status === 1) {
        const response = await fetch(whoisUrl, {
          method: 'POST',
          body: new URLSearchParams({
            domain,
            'g-recaptcha-response': data.request
          })
        });
        return this.parseWhois(await response.text());
      }
    }
    throw new Error('reCAPTCHA solve timed out');
  }

  parseWhois(html) {
    const extract = (pattern) => {
      const match = html.match(pattern);
      return match ? match[1].trim() : null;
    };

    return {
      registrar: extract(/Registrar:\s*([^\n<]+)/i),
      created: extract(/Creat(?:ed|ion) Date:\s*([^\n<]+)/i),
      expires: extract(/(?:Expir(?:y|ation)|Registry Expiry) Date:\s*([^\n<]+)/i),
      status: extract(/(?:Domain )?Status:\s*([^\n<]+)/i)
    };
  }
}

// Usage
const monitor = new DomainMonitor('YOUR_API_KEY');
monitor.addDomain('example.com', 'https://whois.example.com/lookup');
monitor.addDomain('mysite.io', 'https://whois.example.com/lookup');

const expiring = await monitor.checkExpirations();
expiring.forEach(d => console.log(`${d.domain} expires in ${d.daysLeft} days`));

Optimizing WHOIS Queries

Strategy | Benefit
Cache results locally | Avoids repeat lookups for the same domain
Use a 3–5 second delay | Lowers the CAPTCHA trigger rate
Rotate across WHOIS portals | Spreads load across providers
Session persistence | Preserves CAPTCHA clearance state
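The first two strategies can be combined in a thin wrapper around any lookup function. The sketch below is illustrative only: `CachedLookup`, its parameters, and the one-day default TTL are assumptions rather than part of any library, and the clock and sleep functions are injectable purely so the behavior can be tested without waiting.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class CachedLookup:
    """Wrap a lookup callable with a TTL cache and a minimum delay between calls."""

    def __init__(self, lookup_fn: Callable[[str], dict], ttl: float = 86400.0,
                 min_delay: float = 3.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep):
        self._lookup_fn = lookup_fn
        self._ttl = ttl              # seconds a cached result stays valid
        self._min_delay = min_delay  # seconds between two real lookups
        self._clock = clock
        self._sleep = sleep
        self._cache: Dict[str, Tuple[float, dict]] = {}
        self._last_call: Optional[float] = None

    def get(self, domain: str) -> dict:
        key = domain.strip().lower()
        now = self._clock()
        hit = self._cache.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # fresh cache entry: no network call, no delay
        if self._last_call is not None:
            wait = self._min_delay - (now - self._last_call)
            if wait > 0:
                self._sleep(wait)  # space out real lookups
        result = self._lookup_fn(key)
        self._last_call = self._clock()
        self._cache[key] = (self._last_call, result)
        return result
```

With the Python client from earlier in this article, usage might look like `cached = CachedLookup(lambda d: whois.lookup(d, "https://whois.example.com/lookup"))` followed by `cached.get("example.com")`. The cache lives in memory only, so a long-running monitor would need to persist it separately.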

Troubleshooting

Problem | Cause | Fix
CAPTCHA after 3 queries | Portal rate limit | Increase the delay, use a proxy
WHOIS returns "No match" | Privacy redaction/RDAP | Try an alternative WHOIS portal
reCAPTCHA token rejected | Token expired before submission | Solve and submit within 2 minutes
IP blocked | Daily query limit exceeded | Rotate authorized network egress
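The rejected-token case is easier to avoid if the client records when each token was solved and refuses to submit a stale one. Here is a minimal sketch of that idea. The names `SolvedToken` and `token_for_submit` are hypothetical, the 120-second limit is the two-minute window from the table above, and the 15-second safety margin is an assumption to cover network latency.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional

TOKEN_MAX_AGE = 120.0  # seconds; the two-minute window from the troubleshooting table
SAFETY_MARGIN = 15.0   # assumed allowance for the time the submit request takes


@dataclass
class SolvedToken:
    value: str
    solved_at: float = field(default_factory=time.monotonic)

    def is_fresh(self, now: Optional[float] = None) -> bool:
        """True while the token is still safely inside its validity window."""
        current = time.monotonic() if now is None else now
        return (current - self.solved_at) < (TOKEN_MAX_AGE - SAFETY_MARGIN)


def token_for_submit(cached: Optional[SolvedToken],
                     solve: Callable[[], SolvedToken],
                     now: Optional[float] = None) -> SolvedToken:
    """Return the cached token if it is still fresh, otherwise solve a new one."""
    if cached is not None and cached.is_fresh(now):
        return cached
    return solve()
```

In practice `solve` would wrap the solver call and return `SolvedToken(token_string)`, so the timestamp is taken the moment the solution arrives.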

Frequently Asked Questions

How many WHOIS lookups can I automate per day?

Most web-based WHOIS portals allow 50–200 queries per IP per day before aggressive rate limiting kicks in. With proxy rotation and CaptchaAI handling the CAPTCHAs, you can scale to thousands of queries.
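As a rough planning aid, the arithmetic behind that answer can be written down directly. This is a back-of-the-envelope sketch: the helper names are hypothetical, the per-IP quota is the 50–200 range quoted above, and CAPTCHA solve time is ignored, so real throughput will be lower.

```python
import math


def egress_ips_needed(target_per_day: int, per_ip_quota: int) -> int:
    """Number of IPs needed so that no single IP exceeds its daily quota."""
    if target_per_day < 0 or per_ip_quota <= 0:
        raise ValueError("target must be >= 0 and quota must be > 0")
    return math.ceil(target_per_day / per_ip_quota)


def max_sequential_lookups(delay_seconds: float, hours: float = 24.0) -> int:
    """Upper bound on lookups one worker can issue with a fixed delay between them."""
    if delay_seconds <= 0:
        raise ValueError("delay must be > 0")
    return int(hours * 3600 // delay_seconds)


# 5,000 lookups per day at the conservative end of the quoted quota range
print(egress_ips_needed(5000, 50))   # 100
# The same target at the generous end
print(egress_ips_needed(5000, 200))  # 25
# A 3-second delay alone would allow far more than any per-IP quota
print(max_sequential_lookups(3))     # 28800
```

The last figure shows that the delay is rarely the bottleneck: the per-IP daily quota is what limits a single address.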

Should I use the WHOIS protocol (port 43) instead of web portals?

Port 43 WHOIS has no CAPTCHA, but it enforces strict rate limits and returns limited data because of GDPR redaction. Web portals often show more data behind the CAPTCHA.
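For comparison, a port 43 query is a very small exchange: open a TCP connection, send the domain followed by CRLF, and read until the server closes the connection (RFC 3912). The sketch below uses only the standard library. The function names are hypothetical, the server to query depends on the TLD and must be supplied by the caller, and no rate limiting or referral handling is included.

```python
import socket


def build_query(domain: str) -> bytes:
    """Encode a domain as a WHOIS request line: ASCII/IDNA name plus CRLF."""
    return domain.strip().lower().encode("idna") + b"\r\n"


def whois_port43(domain: str, server: str, port: int = 43,
                 timeout: float = 10.0) -> str:
    """Send one WHOIS query and return the raw text response."""
    chunks = []
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(build_query(domain))
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

A call such as `whois_port43("example.com", "whois.verisign-grs.com")` returns the registry's raw text for a `.com` name, which can then be passed through the same field-extraction patterns used for portal responses.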

Can I monitor domain expiration dates automatically?

Yes. Schedule daily or weekly WHOIS lookups for the domains you monitor. CaptchaAI handles any CAPTCHA that appears during the checks.
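A scheduled check ultimately reduces to comparing each expiry timestamp with the current time. The following sketch shows that step in Python. The function names and the 30-day default threshold are assumptions, and only ISO 8601 timestamps (such as `2026-08-13T04:00:00Z`) are parsed; other date formats found in WHOIS output are reported as unparseable rather than guessed at.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional, Tuple


def days_until_expiry(expires: str, now: Optional[datetime] = None) -> Optional[int]:
    """Whole days remaining (rounded down), or None if the date cannot be parsed."""
    text = expires.strip()
    if text.endswith(("Z", "z")):
        text = text[:-1] + "+00:00"  # make the UTC suffix parseable on older Pythons
    try:
        when = datetime.fromisoformat(text)
    except ValueError:
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)  # assume UTC when unspecified
    current = now or datetime.now(timezone.utc)
    return (when - current).days


def expiring_soon(expiry_by_domain: Dict[str, str], threshold_days: int = 30,
                  now: Optional[datetime] = None) -> List[Tuple[str, int]]:
    """Return (domain, days_left) pairs at or below the threshold."""
    flagged = []
    for domain, expires in expiry_by_domain.items():
        days = days_until_expiry(expires, now)
        if days is not None and days <= threshold_days:
            flagged.append((domain, days))
    return flagged
```

The input dictionary could be built from the `expires` field of each lookup result, and the function run from whatever scheduler you already use, such as cron.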

Related Articles

  • Solving reCAPTCHA v2 with a Callback via API
  • Handling reCAPTCHA v2 and Turnstile on the Same Site
  • reCAPTCHA v2 Callback Mechanism

Next Steps

Automate your domain lookups: get your CaptchaAI API key and handle WHOIS portal CAPTCHAs.
