Overview
About the job:
At CDPG, we are committed to democratising data. Our mission is to help harness its power by creating data exchange platforms and seamlessly integrating them into the broader context of Data for Public Good. By ensuring that data exchange is conducted ethically, with a focus on privacy and security, we strive to make the benefits of data accessible to all, promoting inclusivity in decision-making processes.
Key Responsibilities:
1. Evaluate APIs and datasets, create data models, develop software ETL modules, perform unit testing, and deploy them in cloud environments.
2. Develop ETL modules in Python to ingest data into the data exchange using REST APIs and streaming protocols such as AMQP and MQTT. This includes containerising the adapters, creating data models, and catalogue entries according to data exchange specifications.
3. Follow best practices for software development and adhere to Agile methodology throughout.
4. Analyse data using statistical models that drive product strategy and make data-informed decisions, while designing and maintaining data pipelines.
5. Collect requirements and collaborate with agencies, system integrators, solution providers, and other data sources to integrate relevant datasets into in-house products.
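To give a flavour of the ETL work described above, here is a minimal sketch of a transform step for ingesting streamed sensor readings. The record schema (`device_id`, `observed_at`, `value`) and the sample data are hypothetical, invented for illustration; it shows the kind of normalisation (timestamp handling, de-duplication of repeated stream messages) an adapter might perform before pushing data to the exchange.

```python
import json
from datetime import datetime, timezone

def transform(raw_records):
    """Normalise a batch of raw readings (hypothetical schema):
    keep one reading per (device_id, timestamp), parse ISO
    timestamps to UTC, and coerce values to float."""
    seen = set()
    cleaned = []
    for rec in raw_records:
        key = (rec["device_id"], rec["observed_at"])
        if key in seen:  # drop repeated stream messages
            continue
        seen.add(key)
        cleaned.append({
            "device_id": rec["device_id"],
            "observed_at": datetime.fromisoformat(rec["observed_at"])
                                   .astimezone(timezone.utc).isoformat(),
            "value": float(rec["value"]),
        })
    return cleaned

# Example batch as it might arrive over AMQP/MQTT (note the duplicate)
raw = json.loads("""[
  {"device_id": "aqm-01", "observed_at": "2025-01-05T10:00:00+05:30", "value": "41.2"},
  {"device_id": "aqm-01", "observed_at": "2025-01-05T10:00:00+05:30", "value": "41.2"},
  {"device_id": "aqm-02", "observed_at": "2025-01-05T10:05:00+05:30", "value": "17.0"}
]""")

rows = transform(raw)
print(len(rows))  # duplicate message removed -> 2
```

In a real adapter this transform would sit between a REST/AMQP/MQTT fetch step and a load step that posts catalogue-conformant records to the data exchange API.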
Who can apply:
Only those candidates can apply who:
- have a minimum of 1 year of experience
- are Computer Science Engineering students
Salary:
₹ 12,00,000 /year
Experience:
1 year(s)
Deadline:
2025-12-27 23:59:59
Other perks:
5 days a week, Free snacks & beverages, Health Insurance
Skills required:
Python, Linux, Statistical Modeling, Git, JSON, REST API, Data Analysis, Grafana and Prometheus
Other Requirements:
1. Python (NumPy, pandas) for data cleaning, transformation, and numerical computing.
2. Exploratory Data Analysis (EDA) with visualisation using Matplotlib and Plotly.
3. Statistical analysis and hypothesis testing for data-driven insights. Solid understanding of descriptive statistics and inferential analysis on data.
4. REST API integration for data ingestion and application development.
5. Solid understanding of available data sources and APIs, including the ability to evaluate data quality, availability, and update frequency, and to recognise common issues such as repeated stream messages, as well as familiarity with various data fields and their meanings, and data structure formats (JSON, GeoJSON).
6. Strong command of data visualisation best practices for clear, actionable dashboards and plots.
7. Proficient in Linux, with experience in Git version control and cloud computing platforms.
8. Strong understanding of IoT, GIS, Big Data, and Cloud applications aimed at improving operational efficiency and service delivery, with a commitment to creating a positive societal impact.
9. Experience in containerisation using Docker, with familiarity with Kubernetes for orchestration, is a plus.
10. Knowledge of monitoring and logging tools such as Prometheus, Grafana, and Logstash would be a strong plus.
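Requirement 3 above calls for statistical analysis and hypothesis testing. As an illustrative sketch (the data and the ward names are invented for the example), here is Welch's t statistic computed with only the standard library, the kind of check one might run when comparing readings from two monitoring sites:

```python
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples
    with possibly unequal variances."""
    na, nb = len(sample_a), len(sample_b)
    va, vb = variance(sample_a), variance(sample_b)
    return (mean(sample_a) - mean(sample_b)) / ((va / na + vb / nb) ** 0.5)

# Hypothetical daily PM2.5 readings from two monitoring wards
ward_a = [38.0, 41.5, 40.2, 39.8, 42.1]
ward_b = [35.1, 34.8, 36.0, 35.5, 34.2]

t = welch_t(ward_a, ward_b)
print(round(t, 2))
```

In practice one would reach for `scipy.stats.ttest_ind(..., equal_var=False)`, which also reports the p-value; the point here is the underlying inferential reasoning, not the library call.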
About Company:
IUDX was born out of the need to enable data exchange between city departments, government agencies, citizens, and the private sector. IUDX helps cities use data intelligently to address complex urban challenges, establish integrated development across the urban sector, and catapult them to the next stage of innovation. IUDX is completely open source, built on an underlying framework of open-standard APIs, data models, and the security, privacy, and accounting mechanisms that facilitate its easy adoption across the digital ecosystem.