Gurugram, Haryana, India
Information Technology
Full-Time
Simform
Job Description
Key Responsibilities:
- Design and implement scalable web scraping frameworks to collect data from complex and dynamic websites.
- Develop custom spiders/crawlers using libraries and tools such as Playwright, Puppeteer, Selenium, Scrapy, or BeautifulSoup.
- Apply advanced anti-bot evasion strategies such as CAPTCHA solving, IP rotation, user-agent spoofing, browser fingerprinting, and session/cookie management.
- Automate scraping tasks across distributed systems using tools like Celery, Airflow, cron, and ETL orchestration platforms.
- Optimize scraper performance for speed, accuracy, and resilience to website structure changes.
- Implement network interception, DOM traversal, WebSocket handling, and headless browser control.
- Store and manage scraped data in cloud or local storage using PostgreSQL, MongoDB, or S3.
- Integrate scraping systems with APIs or microservices for data consumption and downstream workflows.
- Monitor scraper reliability and handle retry logic, error logging, and dynamic throttling.
- Write modular, well-documented, and testable Python code with proper unit testing.
- Collaborate with engineers, data scientists, and stakeholders to understand scraping goals.
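The retry logic, dynamic throttling, and user-agent rotation mentioned above might look something like the following minimal sketch. It assumes a caller-supplied `fetch(url, headers)` callable; the function name, user-agent strings, and parameter defaults are illustrative, not part of any specific framework:

```python
import random
import time

# Illustrative user-agent pool for rotation (values are examples only).
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
    "Mozilla/5.0 (X11; Linux x86_64)",
]

def fetch_with_retries(fetch, url, max_retries=3, base_delay=1.0):
    """Call fetch(url, headers) with a rotating User-Agent header and
    exponential backoff between attempts; re-raise the last error
    once max_retries attempts have failed."""
    last_error = None
    for attempt in range(max_retries):
        headers = {"User-Agent": random.choice(USER_AGENTS)}
        try:
            return fetch(url, headers)
        except Exception as exc:
            last_error = exc
            # Exponential backoff: base_delay, 2x, 4x, ... (simple
            # dynamic throttling when a site starts rejecting requests).
            time.sleep(base_delay * (2 ** attempt))
    raise last_error
```

In practice the `fetch` callable would wrap an HTTP client or headless-browser call, and the backoff parameters would be tuned per target site.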
Required Skills & Qualifications
- Bachelor's/Master's degree in Computer Science, Engineering, or a related field.
- 3+ years of experience in Python development with specialization in web scraping.
- Deep understanding of modern anti-scraping defenses and bypass techniques (e.g., CAPTCHA, IP bans, dynamic rendering).
- Proficiency with headless browser tools like Playwright, Puppeteer, or Selenium.
- Strong grasp of DOM manipulation, JavaScript execution, network inspection, and asynchronous scraping using asyncio, aiohttp, etc.
- Experience in handling large-scale data extraction and storage using SQL and NoSQL databases.
- Hands-on experience deploying scrapers and automation workflows on AWS, GCP, or Azure.
- Familiarity with containerization using Docker; experience with Kubernetes is a plus.
- Comfortable with REST API integration, job scheduling, and microservices-based architectures.
- Strong debugging, optimization, and testing skills.
- Clear understanding of legal and ethical scraping boundaries.
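The asynchronous scraping with `asyncio` called for above typically combines `asyncio.gather` with a semaphore to cap concurrent requests. A minimal sketch, using a caller-supplied async `fetch` function instead of a real HTTP client such as aiohttp (function and parameter names are illustrative):

```python
import asyncio

async def gather_pages(fetch, urls, max_concurrency=5):
    """Fetch many URLs concurrently, capping the number of in-flight
    requests with a semaphore -- a common throttling pattern when the
    fetch function wraps an aiohttp session or a headless browser."""
    sem = asyncio.Semaphore(max_concurrency)

    async def bounded(url):
        async with sem:
            return await fetch(url)

    # gather preserves input order, so results line up with urls.
    return await asyncio.gather(*(bounded(u) for u in urls))
```

Swapping the stub for `aiohttp.ClientSession.get` (or a Playwright page fetch) turns this into a working concurrent scraper while the semaphore keeps request volume polite.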