Overview
Data Engineer
Experience: 3-5 Years
Location: Bangalore - Hybrid
Type: Full-time
About Digit88:
Digit88 is an AI-native product engineering partner helping startups and enterprises build, scale and operate intelligent software products.
We are a lean, high-impact team of 75+ technologists, backed by leaders with deep experience across startups and global enterprises. We build strong, outcome-driven engineering teams that solve complex, real-world problems.
From GenAI applications and data platforms to enterprise-grade SaaS, we deliver scalable, reliable, production-ready systems. Our expertise spans AI/ML, RAG systems, agentic workflows and large-scale data engineering - enabling businesses to move from idea to production with speed and confidence.
Our teams operate as a true extension of our clients, with full ownership and flexible engagement models focused on measurable business outcomes - not just delivery.
With 80+ AI implementations and proven success in scaling dedicated teams and driving significant cost efficiencies, we partner for long-term impact.
We bring experience across B2B and B2C SaaS, web and mobile platforms, e-commerce and domains such as Conversational AI, HealthTech, IoT, ESG/Energy, and Data Engineering - thriving in fast-paced, high-ownership environments.
The Vision:
To be the most trusted AI-native product engineering partner for innovative software companies worldwide, delivering ownership, speed, and measurable outcomes.
The Opportunity:
As a Data Engineer, you will build and support scalable data platforms and pipelines for global customers. You will work closely with senior engineers, customers, product teams, and stakeholders to deliver reliable, production-grade data systems.
You will play a key role in shaping data engineering best practices and AI-driven data platforms at Digit88, enabling customers to move from raw data to actionable insights and intelligent systems.
Key Responsibilities:
● Assist in modernizing legacy data systems and contribute to scalable data platform design
● Implement Medallion Architecture (Bronze, Silver, Gold) using Delta Lake and Databricks components such as Delta Live Tables (DLT), Delta Sharing, Workflows (Lakeflow Jobs), and Lakebase (a brief PySpark sketch follows this list)
● Design, build, and maintain scalable, reliable, and production-ready ETL/ELT pipelines using PySpark, SQL, and Databricks notebooks to ingest and transform data from diverse sources
● Create and manage workflows using Databricks Workflows (Jobs) or orchestration tools to automate pipelines and dependencies
● Optimize Spark jobs for performance, scalability, and cost efficiency (partitioning, caching, query tuning, cluster optimization)
● Implement data quality checks (e.g., DLT Expectations) and enforce governance via Unity Catalog (access control, PII masking, lineage)
● Design and implement event-driven and streaming pipelines (Kafka or equivalent)
● Ensure high data reliability through monitoring, observability, and alerting
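For a flavour of the day-to-day work, here is a minimal, illustrative Delta Live Tables sketch in PySpark covering Kafka ingestion, a Bronze-to-Silver Medallion step, and a DLT Expectation. The broker address, topic, table, and column names are placeholders rather than an actual customer stack, and the code assumes it runs inside a Databricks DLT pipeline where spark and the dlt module are provided:

    import dlt
    from pyspark.sql import functions as F

    @dlt.table(comment="Raw events landed as-is from Kafka (Bronze)")
    def bronze_events():
        return (
            spark.readStream.format("kafka")
            .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
            .option("subscribe", "events")                      # placeholder topic
            .load()
            .select(F.col("value").cast("string").alias("payload"),
                    F.col("timestamp").alias("ingest_ts"))
        )

    @dlt.table(comment="Parsed, de-duplicated events (Silver)")
    @dlt.expect_or_drop("valid_event_id", "event_id IS NOT NULL")  # data quality check
    def silver_events():
        parsed = dlt.read_stream("bronze_events").select(
            F.get_json_object("payload", "$.event_id").alias("event_id"),
            F.get_json_object("payload", "$.event_type").alias("event_type"),
            F.col("ingest_ts"),
        )
        return parsed.dropDuplicates(["event_id"])

A Gold layer would typically aggregate the Silver table into business-level marts, with Unity Catalog governing access, masking, and lineage across all three layers.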
Requirements:
● BE/MS in Computer Science or a related field with 3-5 years of experience in data engineering
● Strong experience in ETL/ELT pipelines, data modeling, and distributed data systems, with hands-on expertise in Databricks
● Deep proficiency in PySpark, including performance optimization, job orchestration, and large-scale data processing
● Good understanding of event streaming systems such as Kafka or equivalent technologies
● Experience working with the Azure data ecosystem (ADLS, AHDS, or similar data platforms)
● Strong foundation in event-driven architecture and scalable distributed systems
● Experience leveraging AI-assisted development tools (e.g., Claude, Antigravity) to improve productivity
● Solid experience in Agile delivery, estimation, and program execution
● Experience working with global customers (US/EU) in a client-facing role
● Excellent written and verbal communication skills across engineering, business, and customer stakeholders
● Strong analytical thinking and structured problem-solving ability
● Ownership mindset with the ability to deliver independently with minimal guidance
Good to have skills:
● Experience in Healthcare, EHR/EMR data migration, Clinical Trials, or Life Sciences domains (at least one)
● Exposure to handling large-scale EMR/EHR integrations and reducing technical complexity across multiple data sources
● Experience working with healthcare data standards such as CDA, FHIR, and HL7, including data normalization into modern data models (a brief sketch follows this list)
● Ability to build and manage scalable data pipelines for ingestion, transformation (FHIR), data quality, governance, and near real-time processing
● Understanding of interoperability challenges across diverse healthcare systems and approaches to solve them
● Experience building patient-centric data platforms (e.g., Patient 360, Master Patient Index)
● Familiarity with data privacy, security, and compliance standards such as HIPAA
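As an illustration of the healthcare-oriented work, here is a minimal PySpark sketch that flattens FHIR R4 Patient resources (exported as newline-delimited JSON) into a relational-style Delta table. The input path, table name, and selected fields are assumptions for illustration only, and the Delta write assumes a Delta-enabled environment such as Databricks:

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("fhir-patient-normalize").getOrCreate()

    # Newline-delimited FHIR Patient resources (placeholder path)
    patients_raw = spark.read.json("/mnt/raw/fhir/Patient.ndjson")

    # Flatten a few commonly used Patient fields into tabular columns
    patients = patients_raw.select(
        F.col("id").alias("patient_id"),
        F.col("gender"),
        F.to_date("birthDate").alias("birth_date"),
        F.expr("name[0].family").alias("family_name"),
        F.expr("concat_ws(' ', name[0].given)").alias("given_name"),
        F.expr("address[0].postalCode").alias("postal_code"),
    )

    # Persist as a Silver-layer Delta table (placeholder name)
    patients.write.format("delta").mode("overwrite").saveAsTable("silver.fhir_patient")

On top of normalized tables like this, a Patient 360 view or Master Patient Index is typically built by matching and merging records across source systems.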
Benefits/Culture @ Digit88:
● Comprehensive insurance coverage (Life, Health, and Accident; optional coverage for parents and in-laws)
● Flexible work model focused on outcomes
● Accelerated learning with non-linear growth opportunities
● Flat organization with high ownership & accountability
● Opportunity to work on cutting-edge AI and SaaS products with global customers (primarily North America, Australia, EU, and UAE)
● Accomplished global peers - work with some of the best engineers and professionals globally, from the likes of Apple, Amazon, IBM Research, Adobe, and other innovative product companies
● Direct exposure to building and scaling real-world systems across Conversational AI, Energy/Utilities, ESG, HealthTech, IoT, and more
● High-impact roles with the ability to influence product, architecture, and business outcomes globally
● Learn from a founding team of serial entrepreneurs with multiple exits - high growth, high ownership, and real challenges
This is an exciting time to join Digit88 - build, scale, and grow with us as part of our journey!
Digit88 Technologies Private Limited, www.digit88.com