
Overview
● Participate in the customer’s system design meetings and collect the functional/technical requirements.
● Build up data pipelines for consumption by the data science team.
● Skilled in ETL processes and tools.
● Clear understanding of, and experience with, Python and PySpark (or Spark and Scala), along with Hive, Airflow, Impala, Hadoop, and RDBMS architecture.
● Experience in writing Python programs and SQL queries.
● Experience in SQL query tuning.
● Experienced in shell scripting (Unix/Linux).
● Build and maintain data pipelines in Spark/PySpark with SQL and Python or Scala (see the sketch after this list).
● Knowledge of cloud technologies (Azure, AWS, GCP, etc.) is a plus.
● Good to have: knowledge of Kubernetes, CI/CD concepts, and Apache Kafka.
● Suggest and implement best practices in data integration.
● Guide the QA team in defining system integration tests as needed.
● Split the planned deliverables into tasks and assign them to the team.
● Maintain and deploy the ETL code, following the Agile methodology.
● Work on optimization wherever applicable.
● Good oral, written and presentation skills.
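
To illustrate the pipeline work described above, here is a minimal PySpark ETL sketch: extract raw data, transform it with the DataFrame API and Spark SQL, and load a partitioned Hive table for the data science team. All paths, column names, and table names are hypothetical placeholders, not part of the role description.

```python
# Minimal sketch of the kind of Spark/PySpark pipeline described above.
# All paths, column names, and the table name are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("example_etl_pipeline")  # hypothetical app name
    .enableHiveSupport()              # Hive is listed in the requirements
    .getOrCreate()
)

# Extract: read raw events from a hypothetical landing zone.
raw = spark.read.parquet("/data/landing/events/")

# Transform: basic cleansing with the DataFrame API, then an
# aggregation in Spark SQL, mixing the two as the posting suggests.
cleaned = (
    raw.dropDuplicates(["event_id"])
       .filter(F.col("event_ts").isNotNull())
)
cleaned.createOrReplaceTempView("events")

daily_counts = spark.sql("""
    SELECT event_date, event_type, COUNT(*) AS event_count
    FROM events
    GROUP BY event_date, event_type
""")

# Load: write a partitioned Hive table for downstream consumption.
(daily_counts.write
    .mode("overwrite")
    .partitionBy("event_date")
    .saveAsTable("analytics.daily_event_counts"))

spark.stop()
```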
Preferred Qualifications:
● Degree in Computer Science, IT, or a similar field; a Master’s is a plus.
● Hands-on experience with Python and PySpark, or with Spark and Scala.
● Great numerical and analytical skills.
● Working knowledge of cloud platforms such as MS Azure, AWS, etc.
● Technical expertise with data models, data mining, and segmentation techniques.
Job Type: Full-time
Pay: ₹1,500,000.00 - ₹3,000,000.00 per year
Benefits:
- Flexible schedule
- Food provided
- Health insurance
- Leave encashment
- Life insurance
- Paid sick time
- Paid time off
- Provident Fund
- Work from home
Work Location: Remote
Expected Start Date: 31/07/2025