Thiruvananthapuram, Kerala, India
Information Technology
Full-Time
Citi
Overview
Responsible for designing, developing, and optimizing data processing solutions using a combination of Big Data technologies. Focus on building scalable and efficient data pipelines for handling large datasets and enabling batch & real-time data streaming and processing.
Responsibilities:
> Develop Spark applications using Scala or Python (PySpark) for data transformation, aggregation, and analysis.
> Develop and maintain Kafka-based data pipelines: This includes designing Kafka Streams, setting up Kafka clusters, and ensuring efficient data flow.
> Create and optimize Spark applications using Scala and PySpark: Leveraging these languages to process large datasets and implement data transformations and aggregations.
> Integrate Kafka with Spark for real-time processing: Building systems that ingest real-time data from Kafka and process it using Spark Streaming or Structured Streaming.
> Collaborate with data teams: Working with data engineers, data scientists, and DevOps to design and implement data solutions.
> Tune and optimize Spark and Kafka clusters: Ensuring high performance, scalability, and efficiency of data processing workflows.
> Write clean, functional, and optimized code: Adhering to coding standards and best practices.
> Troubleshoot and resolve issues: Identifying and addressing any problems related to Kafka and Spark applications.
> Maintain documentation: Creating and maintaining documentation for Kafka configurations, Spark jobs, and other processes.
> Stay updated on technology trends: Continuously learning and applying new advancements in functional programming, big data, and related technologies.
Proficiency in:
Hadoop ecosystem big data tech stack (HDFS, YARN, MapReduce, Hive, Impala).
Spark (Scala, Python) for data processing and analysis.
Kafka for real-time data ingestion and processing.
ETL processes and data ingestion tools.
Deep hands-on expertise in PySpark, Scala, and Kafka.
Programming Languages:
Scala, Python, or Java for developing Spark applications.
SQL for data querying and analysis.
Other Skills:
Data warehousing concepts.
Linux/Unix operating systems.
Problem-solving and analytical skills.
Version control systems.
------------------------------------------------------
Job Family Group:
Technology
------------------------------------------------------
Job Family:
Applications Development
------------------------------------------------------
Time Type:
Full time
------------------------------------------------------
Most Relevant Skills
Please see the requirements listed above.
------------------------------------------------------
Other Relevant Skills
For complementary skills, please see above and/or contact the recruiter.
------------------------------------------------------
Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.
If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.
View Citi’s EEO Policy Statement and the Know Your Rights poster.