Hiring Data Analyst (Web Scraping using Python) | Antal Tech Jobs
14 Weeks ago

Hiring Data Analyst (Web Scraping using Python)

214318 - 1051540 INR - Annual
Pune, India
Information Technology
Full-Time
Texila Educare Healthcare and Technology Enterprise Pvt Ltd

Overview

Experience: 2 years

Key Responsibilities:

  • Develop and Maintain Web Scraping Scripts: Build efficient, scalable, and robust web scraping tools using Python and relevant libraries (e.g., BeautifulSoup, Scrapy, Selenium).
  • Data Extraction: Extract structured and unstructured data from websites and APIs, focusing on gathering high-quality and clean datasets.
  • Data Processing and Storage: Process, clean, and store extracted data in databases (SQL/NoSQL) or data warehouses, ensuring it's ready for analysis and reporting.
  • Website Parsing and HTML Manipulation: Parse complex HTML structures and interact with websites that require JavaScript rendering.
  • Error Handling and Logging: Develop error handling and logging mechanisms to ensure scripts run reliably and provide useful diagnostics when failures occur.
  • Automation and Scheduling: Automate scraping jobs to run on a regular basis using task schedulers (e.g., cron jobs) and ensure minimal downtime.
  • Ensure Compliance: Implement scraping systems that comply with website Terms of Service, robots.txt directives, and applicable laws (e.g., GDPR and copyright law).
  • Optimize Performance: Optimize scraping performance for speed and reliability. Handle rate limits, CAPTCHAs, and IP blocking mechanisms to ensure smooth operations.
  • Documentation and Reporting: Maintain clear documentation of scraping processes, data flows, and any issues encountered. Provide status updates and reports to stakeholders.
  • Collaboration: Work closely with data analysts, product teams, and engineers to ensure data quality and availability for decision-making processes.
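
As a rough illustration of the compliance, rate-limit, and error-handling responsibilities above, here is a minimal sketch using only the Python standard library. The function names (`is_allowed`, `fetch_with_retry`), the logger name, and the retry settings are hypothetical choices for illustration and are not part of the role description.

```python
import logging
import time
from urllib import robotparser

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("scraper")  # hypothetical logger name


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def fetch_with_retry(fetch, url, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url), retrying with exponential backoff and logging each failure.

    `fetch` is any callable that takes a URL and returns the response body;
    the last exception is re-raised once all attempts are used up.
    """
    for attempt in range(1, retries + 1):
        try:
            return fetch(url)
        except Exception as exc:
            log.warning("attempt %d/%d for %s failed: %s", attempt, retries, url, exc)
            if attempt == retries:
                raise
            # Wait 1x, 2x, 4x ... base_delay between attempts.
            sleep(base_delay * 2 ** (attempt - 1))
```

Passing the fetch function and the sleep function as parameters keeps the retry logic independent of any particular HTTP library and makes it easy to test without real network calls or real waiting.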

Required Skills and Qualifications:

  • Proficiency in Python: Strong experience with Python, especially in libraries like BeautifulSoup, Scrapy, Requests, Selenium, and Pandas.
  • Web Scraping Frameworks: Experience with scraping tools such as Scrapy, Selenium, or Puppeteer.
  • HTML, CSS, JavaScript: Deep understanding of web technologies, including HTML, CSS, and JavaScript to navigate websites and handle dynamic content.
  • Data Manipulation and Storage: Experience with SQL and NoSQL databases (e.g., MySQL, PostgreSQL, MongoDB) and data processing libraries (e.g., Pandas).
  • APIs: Experience working with RESTful APIs to extract or push data.
  • Data Formats: Knowledge of data formats like JSON, XML, CSV, and how to parse/handle them.
  • Error Handling and Debugging: Strong skills in troubleshooting, debugging, and optimizing web scraping operations.
  • Networking and HTTP Protocols: Familiarity with HTTP requests, headers, cookies, and web scraping proxies (e.g., rotating proxies, IP management, VPNs).
  • Version Control: Experience using version control systems like Git.
  • Problem Solving and Critical Thinking: Ability to handle complex scraping challenges like dynamic content, CAPTCHA, JavaScript rendering, etc.
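
To give a sense of the HTML parsing and data-format skills listed above, the following is a small sketch that extracts links from an HTML snippet and serializes them as JSON and CSV. It deliberately uses the standard library's `html.parser` instead of BeautifulSoup or Scrapy so that it runs without extra installs. The class and function names (`LinkExtractor`, `extract_links`, `to_json`, `to_csv`) are hypothetical.

```python
import csv
import io
import json
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the href and visible text of every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open <a href=...> element.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = "".join(self._text).strip()
            self.links.append({"href": self._href, "text": text})
            self._href = None


def extract_links(html: str) -> list:
    """Parse an HTML string and return a list of {"href", "text"} dicts."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


def to_json(rows: list) -> str:
    """Serialize the extracted rows as a JSON array."""
    return json.dumps(rows, indent=2)


def to_csv(rows: list) -> str:
    """Serialize the extracted rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["href", "text"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

In practice the same rows would more likely be loaded into Pandas or written to a database, as the posting describes; the sketch only shows the parse-then-serialize shape of the task.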

Preferred Qualifications:

  • Experience with Cloud Technologies: Familiarity with cloud platforms such as AWS, Google Cloud, or Azure for scalable scraping and storage solutions.
  • Distributed Systems: Experience with managing distributed web scraping jobs using tools like Celery, RabbitMQ, or Kubernetes.
  • Data Quality and Validation: Experience in data validation, cleaning, and transforming data for downstream processes.
  • Knowledge of Machine Learning: Familiarity with applying machine learning techniques to parse and extract data from semi-structured or unstructured sources.

Job Type: Full-time

Pay: ₹214,318.07 - ₹1,051,539.21 per year

Schedule:

  • Day shift

Experience:

  • Total work: 2 years (preferred)

Work Location: In person

