Bangalore, Karnataka, India
Information Technology
Full-Time
Octro Inc.
Overview
Description
As a Data Engineer, the candidate will design and maintain scalable data pipelines and analytics systems. The ideal candidate will have 2-4 years of experience with Apache Spark, Scala/Python, Trino/Presto, Hadoop, Kafka, and data lake technologies such as Delta Lake.
Experience with Elasticsearch, streaming data, and modern analytics platforms is preferred.
Mandatory Skills Requirements
- Proficient in Python and/or Scala with strong experience in developing and optimizing data processing applications using Apache Spark.
- Extensive experience with Apache Spark Structured Streaming for near real-time and streaming data processing.
- Strong hands-on experience with Apache Kafka, including integration with Spark for reliable real-time data ingestion and event-driven pipelines.
- Experience working with analytical and distributed data stores such as ClickHouse, Trino/Presto, and data lake technologies (Delta Lake or equivalent).
- Solid understanding of data modeling and metric design for large-scale analytics systems, including fact/dimension modeling and event-based schemas.
- Proven ability to design and implement ETL / ELT pipelines for data ingestion, transformation, aggregation, and performance optimization using Spark.
- Demonstrated experience in writing efficient, scalable, and maintainable code for large-scale data processing workloads.
- Experience operating on-prem or hybrid data platforms, with a working understanding of cluster resource management, performance tuning, and capacity planning.
- Familiarity with Elasticsearch for search, observability, or analytical use cases is a plus.
- Bachelor's degree in Computer Science, Software Engineering, or a related field, or equivalent practical experience.
- Strong familiarity with version control systems, particularly Git, and collaborative development workflows.
- Working knowledge of cloud platforms such as AWS, Azure, or Google Cloud, primarily for data services, storage, or hybrid deployments.
- Understanding of distributed data systems and database administration principles, including performance tuning, reliability, and scaling of analytical or NoSQL databases (e.g., ClickHouse, Elasticsearch, HBase, or similar).
(ref:hirist.tech)