Information Technology
Full-Time
Provido Solutions
Overview
Location: Hyderabad or Bangalore preferred; remote is possible for the right candidate.
Experience: 8-10+ years
Skillset
- Design and implement robust, production-grade pipelines using Python, Spark SQL, and Airflow to process high-volume file-based datasets (CSV, Parquet, JSON).
- Own the full lifecycle of core pipelines — from file ingestion to validated, queryable datasets — ensuring high reliability and performance.
- Build resilient, idempotent transformation logic with data quality checks, validation layers, and observability.
- Refactor and scale existing pipelines to meet growing data and business needs.
- Tune Spark jobs and optimize distributed processing performance.
- Implement schema enforcement and versioning aligned with internal data standards.
- Collaborate deeply with Data Analysts, Data Scientists, Product Managers, Engineering, Platform, SMEs, and AMs to ensure pipelines meet evolving business needs.
- Monitor pipeline health, participate in on-call rotations, and proactively debug and resolve production data flow issues.
- Contribute to the evolution of our data platform — driving toward mature patterns in observability, testing, and automation.
- Build and enhance streaming pipelines (Kafka, SQS, or similar) where needed to support near-real-time data needs.
- Help develop and champion internal best practices around pipeline development and data modeling.
Experience
- 8-10 years of experience as a Data Engineer (or equivalent), building production-grade pipelines.
- Strong expertise in Python, Spark SQL, and Airflow.
- Experience processing large-scale file-based datasets (CSV, Parquet, JSON, etc.) in production environments.
- Experience mapping and standardizing raw external data into canonical models.
- Familiarity with AWS (or any cloud), including file storage and distributed compute concepts.
- Ability to work across teams, manage priorities, and own complex data workflows with minimal supervision.
- Strong written and verbal communication skills — able to explain technical concepts to non-engineering partners.
- Comfortable designing pipelines from scratch and improving existing pipelines.
- Experience working with large-scale or messy datasets (healthcare, financial, logs, etc.).
- Experience building streaming pipelines with tools such as Kafka or SQS, or a willingness to learn.
- Bonus: Familiarity with healthcare data (837, 835, EHR, UB04, claims normalization).
Please share your updated resume along with the following details:
Highest Education:
Total and Relevant Exp:
CCTC:
ECTC:
Any offer in hand or in pipeline:
Notice period:
Current location:
Job Type: Full-time
Pay: ₹3,000,000.00 - ₹3,500,000.00 per year
Benefits:
- Health insurance
Schedule:
- Day shift
Supplemental Pay:
- Performance bonus
Email: info@antaltechjobs.in