
Overview
Role description
Senior Data Architect – Big Data & Cloud Solutions
Experience: 10+ years
Industry: Information Technology / Data Engineering / Cloud Computing
Job Summary:
We are seeking a highly experienced and visionary Data Architect to lead the design and implementation of scalable, high-performance data solutions. The ideal candidate will have deep expertise in Apache Kafka, Apache Spark, AWS Glue, PySpark, and cloud-native architectures, with a strong background in solution architecture and enterprise data strategy.

Key Responsibilities:
- Design and implement end-to-end data architecture solutions on AWS using Glue, S3, Redshift, and other services.
- Architect and optimize real-time data pipelines using Apache Kafka and Spark Streaming.
- Lead the development of ETL/ELT workflows using PySpark and AWS Glue.
- Collaborate with stakeholders to define data strategies, governance, and best practices.
- Ensure data quality, security, and compliance across all data platforms.
- Provide technical leadership and mentorship to data engineers and developers.
- Evaluate and recommend new tools and technologies to improve data infrastructure.
- Translate business requirements into scalable and maintainable data solutions.
Required Skills & Qualifications:
- 10+ years of experience in data engineering, architecture, or related roles.
- Strong hands-on experience with:
  - Apache Kafka (event streaming, topic design, schema registry)
  - Apache Spark (batch and streaming)
  - AWS Glue, S3, Redshift, Lambda, CloudFormation/Terraform
  - PySpark for large-scale data processing
- Proven experience in solution architecture and designing cloud-native data platforms.
- Deep understanding of data modeling, data lakes, and data warehousing concepts.
- Strong programming skills in Python and SQL.
- Experience with CI/CD pipelines and DevOps practices for data workflows.
- Excellent communication and stakeholder management skills.
Preferred Qualifications:
- AWS Certified Solutions Architect or Big Data Specialty certification.
- Experience with data governance tools and frameworks.
- Familiarity with containerization (Docker, Kubernetes) and orchestration tools (Airflow, Step Functions).
- Exposure to machine learning pipelines and MLOps is a plus.
Skills
Apache Spark, PySpark, AWS Cloud, Apache Kafka