Chennai, Tamil Nadu, India
Information Technology
Other
Futurism

Overview
ID: 849 | 2-5 yrs | India | careers
Job Title: Data Scraping Engineer
Job Location: Baner, Pune
Experience: 2 to 5 Years
Shift: Monday to Friday (10:00 AM to 7:00 PM IST)
Qualification: BTech, MBA
Job Objective:
Futurism Technologies is looking for a Data Scraping Engineer with 2+ years of experience scraping data from high-security websites. The ideal candidate will be proficient in traditional and AI-driven scraping techniques, capable of bypassing complex anti-bot systems, and skilled in filtering, modifying, and storing large-scale structured data. A strong command of Python, Excel, and Screaming Frog SEO Spider is also essential for data analysis and website auditing.
Key Responsibilities:
Develop robust scraping pipelines for websites with advanced bot protection (CAPTCHA, Cloudflare, rate limiting).
Implement and leverage AI/ML techniques (e.g., visual DOM parsing, content classification, anomaly detection) to enhance scraping capabilities where traditional methods fall short.
Use Screaming Frog SEO Spider for comprehensive crawling, data extraction, and SEO-focused analysis.
Use Python to scrape high-security websites.
Work with headless browsers (Playwright, Puppeteer) to render and extract dynamic JavaScript content.
Clean, transform, and structure raw data for business-ready consumption using Excel (advanced formulas, pivot tables, lookups, macros, etc.).
Store and manage scraped data in databases like MongoDB, PostgreSQL, or structured file formats (CSV, JSON).
Create automated, fault-tolerant scraping jobs with retry logic, proxy rotation, and alerting systems.
Stay up to date with scraping trends, legal compliance, and AI tools to optimize workflows.
Required Skills:
Proficiency in Python and scraping libraries (Scrapy, Selenium, Playwright, BeautifulSoup).
Hands-on experience with anti-bot bypass techniques (proxy rotation, CAPTCHA solving, header spoofing).
Strong Excel knowledge – including advanced data manipulation and automation (macros, formulas, VBA optional).
Working knowledge of Screaming Frog SEO Spider for crawling and extracting structured website data.
Exposure to AI-based scraping enhancements, such as:
Visual DOM recognition using ML/computer vision
NLP for parsing unstructured content
Content-type classifiers or dynamic selector generators
Experience with structured data handling in MongoDB, MySQL/PostgreSQL, and flat file formats.
Familiarity with XPath, CSS selectors, regex, and dynamic content handling.
Nice to Have:
Familiarity with Docker, CI/CD pipelines, and cloud environments (AWS, GCP).
Experience integrating with external APIs or handling real-time data feeds.
Bash scripting or task automation (Airflow, cron jobs).
Understanding of ethical/legal considerations around scraping.
What We’re Looking For:
An engineer who thinks outside the box and solves scraping challenges creatively.
Passion for automation and data accuracy.
Someone who’s hands-on, detail-focused, and eager to work with cutting-edge scraping and AI tech.
Talk to us
Feel free to call, email, or hit us up on our social media accounts.
Email
info@antaltechjobs.in