Overview
Job Title: Data Engineer (CDC / Realtime Data Integration)
Location: L&T Finance, Mahape, Navi Mumbai, India
Experience Level: 4-6 Years
WHO WE ARE:
L&T Finance is one of India’s leading Non-Banking Financial Companies (NBFCs), known for its innovation-driven lending solutions across retail, rural, and infrastructure finance. With a strong commitment to digital transformation and data-led decision making, we offer a dynamic workplace where your contributions shape the financial future of millions. Join us to be a part of an organization that values growth, integrity, and impact.
About the Role:
We are seeking a skilled and experienced Data Engineer with a focus on Change Data Capture (CDC) and real-time data integration to join our dynamic data team. The ideal candidate will have 4-6 years of experience in designing, implementing, and managing real-time data pipelines, particularly leveraging technologies such as Debezium and Kafka. You will play a crucial role in enabling instant data availability for analytics, operational reporting, and downstream systems, ensuring data integrity and low latency across our enterprise data assets within a fast-paced NBFC environment.
CANDIDATE PROFILE:
We are looking for a results-driven Data Engineer with 4-6 years of experience specializing in Change Data Capture (CDC) and real-time data integration, with proven expertise in building robust, low-latency data pipelines using technologies such as Debezium and Kafka. The ideal candidate is skilled at ensuring immediate data synchronization and at supporting real-time analytics and operational systems, and brings a strong commitment to data accuracy and high availability. Experience in the BFSI or NBFC domain is highly preferred.
RESPONSIBILITIES:
- Design, develop, and maintain real-time data ingestion pipelines using Change Data Capture (CDC) mechanisms from various source systems (e.g., relational databases, NoSQL databases).
- Implement and manage data streaming solutions primarily using Apache Kafka, Kafka Connect, and Debezium for reliable and low-latency data propagation.
- Configure, monitor, and optimize Debezium connectors for various database sources to ensure efficient and accurate capture of data changes (an illustrative configuration sketch follows this list).
- Develop robust data transformation and processing logic for real-time streams using frameworks like Apache Spark Streaming, Flink, or Kafka Streams.
- Ensure data quality, consistency, and integrity within real-time data streams and downstream systems.
- Collaborate closely with source system owners, application teams, data scientists, and analysts to understand real-time data requirements and deliver appropriate solutions.
- Troubleshoot and resolve complex issues related to real-time data pipelines, Kafka clusters, and Debezium connectors.
- Implement monitoring, alerting, and logging for real-time data infrastructure to ensure high availability and performance.
- Optimize streaming applications and Kafka configurations for scalability, throughput, and cost-efficiency.
- Document real-time data architecture, pipeline designs, and operational procedures.
- Adhere to data governance policies, security standards, and regulatory compliance (e.g., RBI guidelines) in all real-time data integration efforts.
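For illustration only, below is a minimal sketch of the kind of connector registration work described above. It assumes a hypothetical Kafka Connect REST endpoint and a MySQL source; the connector class and property names follow standard Debezium conventions, but every hostname, credential, topic, and table name is a placeholder, and real configurations will vary by environment.

```python
import requests

# Hypothetical sketch: register a Debezium MySQL connector with Kafka Connect.
# The endpoint, database details, and table names below are placeholders.
CONNECT_URL = "http://connect.example.internal:8083/connectors"

connector = {
    "name": "loans-db-cdc",  # placeholder connector name
    "config": {
        "connector.class": "io.debezium.connector.mysql.MySqlConnector",
        "database.hostname": "loans-db.example.internal",  # placeholder host
        "database.port": "3306",
        "database.user": "cdc_user",        # placeholder credentials
        "database.password": "********",
        "database.server.id": "5401",
        "topic.prefix": "loans",            # prefix for Kafka change topics
        "table.include.list": "loans.disbursements,loans.repayments",
        "schema.history.internal.kafka.bootstrap.servers": "kafka:9092",
        "schema.history.internal.kafka.topic": "schema-changes.loans",
    },
}

# Kafka Connect exposes a REST API; POSTing this payload creates the connector,
# which then streams row-level changes from the listed tables into Kafka topics.
response = requests.post(CONNECT_URL, json=connector, timeout=30)
response.raise_for_status()
print("Registered connector:", response.json()["name"])
```

Downstream consumers (for example, Spark Streaming, Flink, or Kafka Streams jobs) would then subscribe to the resulting change topics for transformation and delivery to analytical or operational targets.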
TECHNICAL SKILLS:
- Real-time Data Integration: Deep hands-on experience with Change Data Capture (CDC) mechanisms.
- Streaming Technologies: Expertise in Apache Kafka, Kafka Connect, and stream processing concepts.
- CDC Tools: Strong proficiency with Debezium for real-time data ingestion from databases.
- Programming Languages: Strong proficiency in Python or Java (Java 8 and above, Spring Boot, REST APIs), along with shell scripting.
- Big Data Frameworks: Experience with stream processing frameworks like Apache Spark Streaming, Apache Flink, or Kafka Streams is highly desirable.
- Cloud Platforms: Experience working in cloud environments (preferably GCP) and utilizing the relevant managed services for real-time data.
- SQL Proficiency: Strong SQL skills for querying source databases and validating data.
- Database Knowledge: Solid understanding of relational and NoSQL databases for CDC source integration.
- Data Warehousing/Lakes: Familiarity with data warehouse and data lake environments such as BigQuery and GCP-based lakes.
- DevOps/Deployment: Experience with containerization (Docker, Kubernetes) and CI/CD pipelines for deploying streaming applications.
- Monitoring & Alerting: Experience with monitoring tools for real-time data pipelines and Kafka (e.g., Prometheus, Grafana).
- Other: Working knowledge of JSON, strong debugging skills, and sound design skills.
COLLABORATION & COMMUNICATION:
- Proven ability to collaborate with cross-functional teams, including application developers, business stakeholders, and engineering teams.
- Strong documentation skills to clearly articulate real-time data flow architectures and processes.
- Skilled in stakeholder management and explaining real-time data concepts to technical and non-technical audiences.
- Experience driving adoption of real-time data integration practices.
PERSONALITY TRAITS & LEADERSHIP:
- Detail-oriented with a strong sense of data integrity, accuracy, and real-time reliability.
- Self-driven, with the ability to take ownership of real-time data initiatives independently.
- Process-oriented thinker with a structured approach to problem-solving in a fast-paced environment.
- Proactive in identifying and resolving potential real-time data flow issues.
- Adaptable to change and comfortable working in a dynamic NBFC environment.
QUALIFICATIONS:
- BE/B.Tech and/or M.Tech in any discipline.
- 4-6 years of industry experience in data engineering, with a specific focus on real-time data integration and CDC.
- Strong problem-solving skills.
- Good collaboration skills.
- Good communication skills.
- Prior exposure to NBFC processes (e.g., real-time transaction processing, fraud detection, immediate reporting) is a significant plus.