Data Architect (GCP, PySpark) | Antal Tech Jobs
3 Days ago

Data Architect (GCP, PySpark)

Hyderabad, Telangana, India
Information Technology
Full-Time
Assembly Global


About Us

At Assembly, we help brands find the change to fuel business growth. We are an award-winning global brand performance agency, home to 1,600 talented people across 25 offices globally. We create unique data, technology and media solutions that enable faster and smarter problem solving and an inspired, collaborative workplace culture.

At Assembly we embody three core values: Show Up - actively contribute to a space of personal and collective growth; Make Change - embrace obstacles as opportunities, taking intentional steps to drive positive change; and Win Well - approach success with integrity, responsibility, and a commitment to collaboration, understanding that the journey is as important as the destination. Together, we create an environment that fosters continuous learning, adaptability, and a shared passion for making a meaningful impact.

We stay ahead of what’s next, providing fresh insights to spark new ideas. We’re a trusted partner to our clients, working behind the scenes to bring imagination, depth, and clarity to their biggest challenges in entertainment, technology, lifestyle, sports, and gaming. Together, we create with confidence.

Overview

We are seeking a skilled Data Lead/Architect to design, govern, and optimize our enterprise-wide data ecosystem. This role spans cloud architecture, ingestion, data modeling, governance, and data consumption patterns. Using modern data architecture principles, the Data Architect will define the end-to-end data supply chain, from data capture to curation to consumption, ensuring scalability, reliability, and business value.

Responsibilities

Part 1: Enterprise Architecture

  • a) Cloud & Data Architecture
    • Define cloud-based data architecture (AWS / GCP / Azure) aligned with enterprise strategy.
    • Establish the blueprint for data lakes, raw zones, curated zones, integrated zones, purpose-built marts, and consumption layers.
    • Design real-time, pub-sub, API-led, and batch-based data pipelines.
    • Build unified/industry-standard data models across domains.
  • b) Data & Analytics Vendor Selection
    • Evaluate modern data platforms (BigQuery, Snowflake, Databricks, Redshift, Kafka, dbt, Airflow, Collibra, Alation).
    • Make recommendations on ingestion, ETL/ELT tooling, cataloging, ML platforms, and reporting stacks.
  • c) Data Team Org Structure
    • Define roles across data engineering, governance, data management, ML engineering, and analytics.
    • Contribute to operating model design: centralized vs. federated vs. hybrid data teams.

Part 2: Capture (Data Ingestion & Modeling)

  • a) Data Ingestion
    • Architect ingestion frameworks for:
      • Batch, streaming, API ingestion, change data capture (CDC)
      • File copy & 3rd-party connectors
      • Device and sensor data flows (IoT)
    • Define ingestion patterns for as-is, source mirror, and standardized landing zones.
  • b) Data Model
    • Build enterprise logical and physical data models.
    • Define schema evolution strategies, metadata standards, and modeling approaches for structured (SQL), semi-structured (JSON/Parquet), and unstructured data.
    • Implement conformed dimensions, master/reference data standards, and data linking strategies.

Part 3: Curate (Data Lake & Data Services)

  • a) Data Lake
    • Define raw, curated, integrated, and purpose-fit data zones.
    • Architect data integration processes: cleanse, standardize, conform, shape, business rule assertion, lineage capture.
    • Establish unified data model and industry-standard schema mappings.
  • b) Data Services
    • Enable data provisioning through APIs, microservices, and data virtualization.
    • Design sandbox, discovery, and development environments for analysts and data scientists.
    • Oversee data quality frameworks, profiling, master data, glossary, and taxonomy creation.

Part 4: Consume (AI/ML, Reporting, Analytics)

  • a) Support AI/ML Workloads
    • Support ML feature pipelines, model training data sets, model versioning, and MLOps integrations.
    • Ensure curated zones support machine learning and ad-hoc analysis with scalable compute layers.
  • b) Reporting & Analytics
    • Define BI consumption patterns (dashboards, semantic layers, visualization).
    • Architect SQL query optimization, semantic models, and data virtualization for analysts.
    • Enable self-serve analytics: data search, data preparation, visual intelligence.

Part 5: Enterprise Essentials (Governance & Security)

  • a) Data Governance
    • Implement metadata management, lineage, cataloging, quality rules, reference data, classification.
    • Ensure compliance with GDPR, HIPAA, SOC2, and internal data governance processes.
    • Define operating model for governance: stewardship, ownership, custodianship.
  • b) Security
    • Implement enterprise data security controls:
      • identity & access management
      • encryption & data protection
      • audit & monitoring
      • DevSecOps integration
      • data privacy frameworks
    • Ensure secure handling of PII, PHI, and sensitive datasets.

Required Skills

Technical Expertise

  • Strong understanding of modern data architecture (data supply chain, raw-to-consume architecture).
  • Expertise with cloud platforms: AWS / GCP / Azure.
  • Strong hands-on experience with:
    • Data ingestion tools: Kafka, Pub/Sub, Kinesis, Fivetran, CDC tools
    • Data engineering technologies: Python, SQL, PySpark, dbt
    • Data processing engines: Spark, Databricks, Beam
    • Data storage: BigQuery, Snowflake, Redshift, S3/GCS/ADLS
    • Metadata & governance: Collibra, Alation, Purview
    • Streaming & Messaging: Kafka, Pub/Sub
    • ML & Analytics: Feature stores, ML pipelines, BI tools
  • Experience building data models, taxonomies, lineage, and data catalogs.
  • Experience in building large-scale enterprise data platforms, especially in regulated or data-heavy industries.

Architecture & Design

  • Ability to define conceptual, logical, and physical data models.
  • Strong knowledge of microservices, event-driven architecture, and API-based data services.
  • Proven ability to design large-scale distributed systems.

Governance & Security

  • Experience implementing enterprise data governance, classification, cataloging, and retention rules.
  • Strong grasp of IAM, encryption, DevSecOps, and compliance frameworks.

Soft Skills

  • Excellent communication; able to work across engineering, analytics, product, and business teams.
  • Ability to create architectural documentation and present complex concepts clearly.
  • Strategic thinker who can build long-term data roadmaps.

Preferred Qualifications

  • Certifications:
    • AWS/GCP Professional Data Engineer
    • Databricks Data Architect
    • Snowflake Architect
    • TOGAF or equivalent enterprise architecture frameworks

Benefits

  • Annual leave: 20 days, allotted to all employees at the beginning of every calendar year.
  • Sick leave: 12 days, allotted effective from the date of joining (DOJ) and at the beginning of every calendar year.
  • Other leave: maternity leave, paternity leave, and birthday leave entitlement.
  • Dedicated L&D budget for all teams to upskill and get certified.
  • All employees are entitled to Group Personal Accident Cover and Life Cover Insurance.
  • Insurance coverage for the entire family (Employee + up to 7 dependents - Self, Spouse, up to 4 children, and Parents)
  • Monthly Cross Team Lunch
  • Rewards and recognition program: Employee of the Month, Star Performer, Tenure Celebration, and many more.

Equal Opportunities

Assembly is an advocate for equal opportunity in the workplace. We are committed to ensuring equal opportunities regardless of race, colour, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability and gender identity. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. If you have a disability or special need that requires accommodation, please let us know.

Social and Environmental Responsibility

At Assembly, we have a responsibility to bring impact into our every day. This means we must always look for ways to be conscious citizens in our roles and to support society and environmental sustainability. We encourage employees to be conscious citizens by actively participating in our organisation's sustainability efforts, to help us promote environmentally friendly practices within the workplace, to collaborate with community organisations and stakeholders to support initiatives aligned with our company's values, and to participate in volunteer activities that benefit the community. Employees are also encouraged to make suggestions and evaluate our business practices to identify areas for improvement in social and environmental performance. Employees at Assembly demonstrate commitment to sustainability and inclusivity in their actions and behaviors.
