Overview
About the Role
We are seeking a highly skilled Senior Data Engineer with strong hands-on experience in Neo4j production environments, unstructured data processing, and data pipeline development. The ideal candidate will have a proven track record of designing and operating scalable graph-based systems, building robust data extraction pipelines, and transforming complex unstructured datasets into actionable knowledge graphs.
This role requires a self-driven engineer who can independently own technical initiatives, communicate effectively with stakeholders, and contribute to the architecture and execution of graph-powered data platforms.
Key Responsibilities
1. Graph Database Architecture & Development
Design, develop, and maintain enterprise-scale graph data models using Neo4j.
Architect and optimize graph storage, indexing, querying, and relationship modeling for high-performance workloads.
Build and maintain knowledge graph solutions that integrate data from multiple structured and unstructured sources.
Ensure scalability, reliability, and performance of Neo4j deployments in production environments.
2. Data Pipeline Engineering
Design and implement end-to-end data extraction, transformation, and loading (ETL/ELT) pipelines.
Build production-grade data ingestion frameworks for processing large volumes of data from diverse sources.
Develop automated workflows for data validation, enrichment, lineage tracking, and monitoring.
Optimize pipeline performance and operational reliability.
3. Unstructured Data Processing
Develop systems for processing and extracting insights from documents, PDFs, reports, emails, web content, and other unstructured datasets.
Transform extracted entities and relationships into graph-ready formats for Neo4j ingestion.
Ensure high-quality data normalization, deduplication, and graph enrichment processes.
4. Workflow Orchestration
Build, schedule, monitor, and maintain data workflows using Dagster or Airflow.
Design reusable, modular, and observable pipeline architectures.
Implement workflow monitoring, error handling, retries, and operational dashboards.
5. Collaboration & Ownership
Work closely with data engineers, software engineers, AI/ML engineers, product teams, and business stakeholders.
Take ownership of technical initiatives from design through production deployment.
Communicate architecture decisions, trade-offs, and implementation plans effectively.
Qualifications
- 5+ years of experience in Data Engineering or Graph Data Engineering.
- Strong production experience with Neo4j, including deployment, scaling, optimization, and maintenance.
- Proven experience designing and implementing knowledge graphs and graph-based architectures.
- Experience building production-grade data extraction and ingestion pipelines.
- Strong experience working with unstructured data processing systems.
- Hands-on experience with Dagster or Airflow for workflow orchestration and pipeline management.
- Strong proficiency in Python and related data engineering libraries.
- Experience with data modeling, ETL/ELT design, and distributed data processing.
- Strong understanding of data quality, observability, monitoring, and operational best practices.
- Excellent communication skills and the ability to work independently with minimal supervision.
Work Location: Ahmedabad/Pune
Contact us to apply
If you would like to apply for this role, send your resume to careers@infocusp.com.