Information Technology
Full-Time
eMedEvents Global Marketplace for CME/CE
Overview
Python Developer - Web Scraping & Data Processing
Experience: 3+ Years
Employment Type: Full-time
Job Overview
We are seeking a skilled and detail-oriented Python Developer with 3+ years of hands-on experience in web scraping, document parsing (PDF, HTML, XML), and structured data extraction. You will be a vital part of a core team focused on aggregating biomedical content from diverse sources, including grant repositories, scientific journals, conference abstracts, treatment guidelines, and clinical trial databases. This role demands strong technical proficiency in various parsing and scraping libraries, along with solid data processing and integration skills.
Key Responsibilities
- Develop scalable Python scripts to effectively scrape and parse biomedical data from a wide range of web sources, including websites, pre-print servers, citation indexes, scientific journals, and treatment guidelines.
- Build robust modules specifically for splitting multi-record documents (such as PDFs, HTML, and other formats) into individual, manageable content units.
- Implement NLP-based field extraction pipelines utilizing libraries like spaCy, NLTK, or advanced regex for precise metadata tagging.
- Design and automate complex data acquisition workflows using schedulers and orchestrators like cron, Celery, or Apache Airflow for periodic scraping and content updates.
- Store parsed and processed data efficiently in both relational (PostgreSQL) and NoSQL (MongoDB) databases, ensuring optimal schema design for performance and scalability.
- Ensure robust logging, comprehensive exception handling, and rigorous content quality validation across all data processing and scraping workflows.
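To illustrate the kind of work described above, here is a minimal sketch of splitting a multi-record HTML page into individual content units and extracting basic metadata with BeautifulSoup. The markup, class names, and fields below are hypothetical examples, not from any real source:

```python
# Sketch: split a multi-record HTML document into individual records
# and extract metadata. Markup and field names are illustrative only.
from bs4 import BeautifulSoup

SAMPLE_HTML = """
<div class="abstract"><h3>Gene therapy in rare disease</h3>
  <span class="authors">Lee J, Patel R</span></div>
<div class="abstract"><h3>Phase II trial of drug X</h3>
  <span class="authors">Garcia M</span></div>
"""

def split_records(html: str) -> list[dict]:
    """Parse the page and return one dict per record unit."""
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for node in soup.select("div.abstract"):
        records.append({
            "title": node.h3.get_text(strip=True),
            "authors": [a.strip()
                        for a in node.select_one(".authors")
                                     .get_text().split(",")],
        })
    return records

if __name__ == "__main__":
    for rec in split_records(SAMPLE_HTML):
        print(rec["title"], "-", rec["authors"])
```

Real sources vary widely in structure, so per-source selector configuration and validation layers would sit on top of a core like this.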
Required Skills
- 3+ years of hands-on experience in Python, with a particular focus on data extraction, transformation, and loading (ETL).
- Strong command of web scraping libraries, including:
- BeautifulSoup
- Scrapy
- Selenium
- Playwright
- Proficiency in PDF parsing libraries, such as:
- PyMuPDF
- pdfminer.six
- PDFPlumber
- Experience with HTML/XML parsers: lxml, XPath, html5lib.
- Familiarity with regular expressions, NLP concepts, and advanced field extraction techniques.
- Working knowledge of SQL and/or NoSQL databases (MySQL, PostgreSQL, MongoDB).
- Understanding of API integration (RESTful APIs) for interacting with structured data sources.
- Experience with task schedulers and workflow orchestrators (cron, Apache Airflow, Celery).
- Proficiency in version control using Git/GitHub and comfort working in collaborative development environments.
- Exposure to biomedical or healthcare data parsing (scientific abstracts, clinical trials data, drug labels).
- Familiarity with cloud environments like AWS (specifically Lambda, S3 for data storage and processing).
- Experience with data validation frameworks and building robust QA rules for data quality.
- Understanding of ontologies and taxonomies (UMLS, MeSH) for structured content tagging.
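As a small illustration of the regex-based field extraction mentioned above, the sketch below pulls labeled fields from a semi-structured abstract header and validates a ClinicalTrials.gov identifier. The field layout is hypothetical; a production pipeline would typically combine patterns like these with an NLP library such as spaCy:

```python
# Sketch: regex field extraction from a semi-structured abstract header.
# The input layout is illustrative only.
import re

ABSTRACT = """Title: Outcomes of Early Statin Therapy
Authors: Chen L; Okafor N
Registry ID: NCT01234567
Background: We evaluated..."""

# One labeled field per line: "Label: value"
FIELD_RE = re.compile(r"^(Title|Authors|Registry ID):\s*(.+)$", re.MULTILINE)
# ClinicalTrials.gov identifiers are "NCT" followed by eight digits.
NCT_RE = re.compile(r"\bNCT\d{8}\b")

def extract_fields(text: str) -> dict:
    """Return labeled fields plus a validated NCT identifier (or None)."""
    fields = {k: v.strip() for k, v in FIELD_RE.findall(text)}
    nct = NCT_RE.search(fields.get("Registry ID", ""))
    fields["nct_id"] = nct.group(0) if nct else None
    return fields

if __name__ == "__main__":
    print(extract_fields(ABSTRACT))
```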
Talk to us
Feel free to call, email, or hit us up on our social media accounts.
Email
info@antaltechjobs.in