Bangalore, Karnataka, India
Information Technology
Full-Time
Dolby Laboratories
Overview
Opportunity: We are seeking a highly skilled and experienced Senior Data Engineer to join our growing data team. In this role, you will be instrumental in designing, building, and maintaining robust and scalable data pipelines and infrastructure that power critical data-driven initiatives across our organization. You will work with vast datasets and cutting-edge technologies, and collaborate closely with AI researchers, other data engineers, data scientists, machine learning engineers, and product teams to deliver insights that shape the future of our products and user experiences.
What You'll Do:
Build and Optimize Data Infrastructure:
- Develop, construct, test, and maintain large-scale data ingest architecture consisting of diverse cloud-based services (messaging, storage, Kubernetes, persistent data stores, serverless functions, etc.).
- Create tooling such as SDKs and APIs to enable user self-service.
- Contribute to the design and evolution of our core data platform, ensuring its scalability, reliability, and cost-effectiveness.
- Implement robust monitoring, alerting, and logging solutions for data pipelines and infrastructure to proactively identify and resolve issues.
- Design and implement highly reliable and efficient ETL/ELT processes to ingest, transform, and load data from diverse sources (e.g., real-time events, third-party APIs, rich media datasets) into our data lake and data warehouses.
- Use distributed data processing frameworks such as Spark to handle large-scale data volumes with high throughput and low latency.
- Describe and annotate datasets using industry-standard schemas and internal specifications.
- Cultivate data catalogs and metadata management solutions to improve data discoverability and understanding across the organization.
- Implement data validation, cleansing, and reconciliation processes to ensure the accuracy and integrity of our data assets.
- Work closely with stakeholders (research, engineering, product, and peers more broadly) to translate their data needs into robust data solutions.
- Provide technical leadership and mentorship to junior data engineers, fostering a culture of technical excellence and continuous learning.
- Contribute to the evolution of our data architecture and engineering best practices.
- Extensive Experience: 5+ years of experience in data engineering, with a strong focus on building and maintaining large-scale data pipelines and infrastructure.
- Programming Proficiency: Expert-level proficiency in at least one major programming language such as Python, Scala, or Java.
- Distributed Data Processing: Deep experience with distributed data processing frameworks (e.g., Apache Spark, Apache Beam). Strong foundation in event-based approaches and systems including messaging/topics, pub/sub, queues, etc.
- Data Warehousing/Lakes: Hands-on experience with data warehousing solutions (e.g., Databricks, Snowflake, Redshift, BigQuery) and data lake technologies (e.g., S3, HDFS). Deep experience with managing large-scale, heterogeneous datasets on Databricks is highly preferred.
- SQL Mastery: Advanced SQL skills for data manipulation, analysis, and optimization.
- Cloud Platforms: Strong experience with one or more major cloud providers (AWS, GCP, Azure) and their data-related services.
- Orchestration and DevOps: Familiarity with containerization and orchestration technologies (e.g., Docker, Kubernetes). Proficiency in CI/CD-based deployment.
- Database Knowledge: Solid understanding of relational and NoSQL databases.
- Data Modeling: Expertise in data modeling, schema design, and data architecture principles.
- Problem-Solving: Excellent analytical and problem-solving skills, with a track record of tackling complex data challenges.
- Communication: Strong communication and interpersonal skills, with the ability to collaborate effectively with cross-functional teams.
- Master’s degree in Computer Science, Data Science, or a related quantitative field.
- Strong theoretical understanding of distributed computing concepts such as concurrency, parallelism, queueing, consistency, coordination protocols, etc.
- Experience with machine learning pipelines and MLOps principles.
- Contributions to open-source data projects.