Overview
About the Position
As a Mid-level Data Engineer on the MIDAS (Management Integration & Data Analytics System) Data Platform Team, you will build from scratch and maintain the central data hub connecting most of the systems inside one of Japan's most innovative digital banks.
You will work with modern cloud-based data technologies to ingest data from various banking systems, apply complex business logic to it, and then serve it to downstream systems for enterprise management, regulatory reporting, risk management, and many other applications.
Because of the high standards of the banking domain, you will have the opportunity to work on complex data engineering challenges, including data quality, reconciliation across multiple systems, time-critical data processing, and complete traceability.
This is a mid-level position where you will work with increasing independence on data pipeline development, while collaborating closely with senior engineers and the technical lead for guidance on complex problems.
This position involves employment with Money Forward, Inc., and a secondment to the new company (SMBC Money Forward Bank Preparatory Corporation). The evaluation system and employee benefits will follow the policies of Money Forward, Inc.
Who We Are
We are a startup team partnering with Sumitomo Mitsui Financial Group and Sumitomo Mitsui Banking Corporation to establish a new digital bank. Our mission is to build embedded financial products from the ground up, with a strong focus on supporting small and medium-sized businesses (SMBs).
Development Structure
We operate in a small, agile team while collaborating closely with partners from the banking industry. The MIDAS team is growing rapidly, with the aim of exceeding 10 data engineers within the year.
Technology Stack and Tools Used
- Cloud Infrastructure
  - AWS (primary cloud platform in the Tokyo region)
  - S3 for data lake storage with VPC networking for secure connectivity
  - AWS IAM for security and access management
- Data Lakehouse Architecture
  - Modern lakehouse architecture using Delta Lake or Apache Iceberg for ACID transactions, time travel, and schema evolution
  - Columnar storage formats (Parquet) optimized for analytics
  - Bronze/Silver/Gold medallion architecture for progressive data refinement
  - Partitioning strategies and Z-ordering for query performance
- Orchestration & Processing
  - Managed workflow orchestration platforms (Amazon MWAA/Apache Airflow, Databricks Workflows, or similar); an illustrative sketch follows this list
  - Distributed data processing with Apache Spark
  - Serverless compute options for cost optimization
  - Streaming and batch ingestion patterns (Auto Loader, scheduled jobs)
- Data Transformation
  - dbt (data build tool) for SQL-based analytics engineering
  - Delta Live Tables or AWS Glue for declarative ETL pipelines
  - SQL and Python for data transformations
  - Incremental materialization strategies for efficiency
- Query & Analytics
  - Serverless query engines (Amazon Athena, Databricks SQL, or Redshift Serverless)
  - Auto-scaling compute for variable workloads
  - Query result caching and optimization
  - REST APIs for data serving to downstream consumers
- Data Quality & Governance
  - Automated data quality frameworks (AWS Glue Data Quality, Delta Live Tables expectations, Great Expectations)
  - Cross-system reconciliation and validation logic
  - Fine-grained access control with column/row-level security (AWS Lake Formation or Unity Catalog)
  - Automated data lineage tracking for regulatory compliance
  - Audit logging and 10-year data retention policies
- Business Intelligence
  - Amazon QuickSight and/or Databricks SQL Dashboards
  - Integration with enterprise BI tools (Tableau, Power BI, Looker)
- Development & DevOps
  - Languages: SQL (primary), Python
  - Version Control: GitHub
  - CI/CD: GitHub Actions
  - Infrastructure as Code: Terraform
  - Monitoring: CloudWatch, Databricks monitoring, or similar
  - AI-Assisted Development: Claude Code, GitHub Copilot, ChatGPT
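To give a concrete feel for this stack, here is a minimal sketch of the kind of daily orchestration workflow described above, assuming Airflow 2.4+ (as on Amazon MWAA). The DAG id, task ids, and dbt project path are illustrative placeholders, not actual MIDAS components.

```python
# Minimal sketch of a daily ingestion-and-transformation DAG (assumes Airflow 2.4+).
# All identifiers (DAG id, task ids, dbt project path) are hypothetical examples.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.python import PythonOperator


def ingest_to_bronze(**_):
    """Placeholder: land the nightly source-system extract in the S3 bronze layer."""


with DAG(
    dag_id="daily_core_banking_refresh",
    start_date=datetime(2024, 1, 1),
    schedule="0 2 * * *",  # once daily, after the source systems finish their batch close
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=10)},
) as dag:
    ingest = PythonOperator(task_id="ingest_to_bronze", python_callable=ingest_to_bronze)

    # dbt rebuilds the silver/gold models incrementally on top of the fresh bronze data.
    transform = BashOperator(
        task_id="dbt_run",
        bash_command="dbt run --project-dir /opt/dbt/midas_example",
    )

    # dbt tests act as a first line of data quality validation before downstream serving.
    validate = BashOperator(
        task_id="dbt_test",
        bash_command="dbt test --project-dir /opt/dbt/midas_example",
    )

    ingest >> transform >> validate
```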
Responsibilities
- Develop and maintain data ingestion pipelines from multiple banking source systems
- Build data transformations to ensure data quality, consistency, and business logic correctness
- Set up and maintain orchestration workflows for scheduled data processing
- Implement data quality checks and validation rules based on business requirements (see the reconciliation sketch after this list)
- Develop and maintain API interfaces for data serving to downstream systems
- Set up BI tool integrations and develop reports and dashboards
- Write tests for data pipelines and transformations
- Monitor scheduled jobs, troubleshoot failures, and implement fixes
- Optimize data pipeline performance and query efficiency
- Document data flows, transformation logic, and system configurations
- Participate in code reviews and collaborate with team members
- Learn banking domain concepts and regulatory requirements
- Contribute to team knowledge sharing and best practices
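As one illustration of the reconciliation work listed above, the sketch below compares a landed source extract with its bronze table using PySpark. The bucket, table names, columns, and date are hypothetical stand-ins for real banking feeds.

```python
# Illustrative reconciliation check between a raw extract and the bronze table built from it.
# Paths, table names, and columns are hypothetical; assumes a Spark session with S3 access.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("bronze_reconciliation").getOrCreate()

source = spark.read.parquet("s3://example-landing/core_banking/transactions/dt=2024-01-01/")
bronze = spark.table("bronze.core_banking_transactions").filter(F.col("dt") == "2024-01-01")

src = source.agg(F.count("*").alias("rows"), F.sum("amount").alias("total")).first()
brz = bronze.agg(F.count("*").alias("rows"), F.sum("amount").alias("total")).first()

# Row counts must match exactly, and monetary totals must agree to the smallest currency unit.
if src["rows"] != brz["rows"]:
    raise ValueError(f"Row count mismatch: source={src['rows']}, bronze={brz['rows']}")
if src["total"] != brz["total"]:
    raise ValueError(f"Amount total mismatch: source={src['total']}, bronze={brz['total']}")

print("Reconciliation passed for dt=2024-01-01")
```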
Requirements
- 2-5 years of experience in data engineering or analytics engineering
- Strong proficiency in SQL and working knowledge of Python
- Hands-on experience building data pipelines using tools like Airflow, dbt, or similar
- Experience with cloud platforms (AWS, Azure, or GCP) and object storage (S3, ADLS, GCS)
- Understanding of data modeling concepts including dimensional modeling and fact/dimension tables
- Experience with data quality validation and testing
- Ability to debug and troubleshoot data pipeline issues
- Experience with version control (Git) and basic understanding of CI/CD concepts
- Understanding of data governance basics: access control and audit logging
- Good problem-solving skills and ability to work with moderate independence
- Good communication skills and willingness to ask questions when blocked
- Bachelor's degree in Computer Science, Engineering, Mathematics, or a related field, or equivalent practical experience
- Language ability: Japanese at Business level and/or English at Business level (TOEIC score of 700 or above)
While not specifically required, please tell us if you have any of the following:
- Experience in financial services, fintech, or regulated industries
- Basic knowledge of banking domain concepts: core banking, payments, or regulatory reporting
- Exposure to data platforms in regulated environments (FISC Guidelines, GDPR, APPI)
- Hands-on experience with Databricks platform or AWS native data services
- Experience with performance tuning: partitioning strategies, file formats, query optimization
- Experience building REST APIs with Python (FastAPI, Flask, or similar); a short sketch follows this list
- Knowledge of streaming data pipelines (Kafka, Kinesis, or similar)
- Basic experience with Terraform
- Experience with BI tools (QuickSight, Tableau, Looker, Power BI)
- Experience with data visualization and dashboard design
- Interest in obtaining certifications (AWS Certified Data Analytics, Databricks certifications)
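For candidates curious about the API side of the role, here is a short, purely hypothetical FastAPI sketch of a data-serving endpoint. In the real platform a handler like this would query the lakehouse (Athena or Databricks SQL) rather than an in-memory dictionary, and all names and fields below are placeholders.

```python
# Hypothetical data-serving endpoint sketch; model fields, routes, and data are placeholders.
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI(title="example-data-serving-api")


class AccountBalance(BaseModel):
    account_id: str
    balance: int        # JPY has no minor unit, so an integer amount is used here
    as_of_date: str


# Stand-in for a gold-layer table; a real handler would query Athena or Databricks SQL.
_BALANCES = {
    "ACC-001": AccountBalance(account_id="ACC-001", balance=125000, as_of_date="2024-01-01"),
}


@app.get("/v1/accounts/{account_id}/balance", response_model=AccountBalance)
def get_balance(account_id: str) -> AccountBalance:
    record = _BALANCES.get(account_id)
    if record is None:
        raise HTTPException(status_code=404, detail="account not found")
    return record
```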
This role offers exceptional learning opportunities:
- Cloud Engineering: Hands-on experience with AWS services (S3, IAM, VPC networking) and cloud-native architectures
- Lakehouse Technologies: Deep dive into Delta Lake or Apache Iceberg including ACID transactions, time-travel queries, and schema evolution
- Data Orchestration: Build production workflows with Apache Airflow or Databricks Workflows including DAG design, dependency management, and error handling
- Analytics Engineering: Master dbt (data build tool) for SQL-based transformations, incremental models, and data testing
- Data Processing: Work with Apache Spark for distributed data processing and learn optimization techniques
- Data Modeling: Learn dimensional modeling, slowly changing dimensions (SCD), fact/dimension tables, and star schema design
- Banking Domain: Understand core banking systems, payment flows, regulatory reporting (FSA/BOJ), and financial reconciliation
- Data Quality: Implement validation frameworks, cross-system reconciliation, and automated testing for data pipelines
- Governance & Compliance: Experience with fine-grained access control, audit logging, data lineage tracking, and regulatory compliance (FISC Guidelines)
- Performance Optimization: Learn query optimization, partitioning strategies, Z-ordering, and cost management for cloud data platforms (a brief sketch follows this list)
- Professional Development: Mentorship from experienced data engineers and architects, code review practices, and engineering best practices
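As a taste of the performance-optimization topics above, the following sketch partitions a Delta table by business date and Z-orders it on a frequent filter column. Table and column names are hypothetical, and the OPTIMIZE ... ZORDER BY command assumes Databricks or open-source Delta Lake 2.0+.

```python
# Hypothetical example of partitioning and Z-ordering a Delta table.
# Assumes a Spark session with Delta Lake support (Databricks, or OSS Delta Lake >= 2.0);
# table and column names are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("silver_optimization").getOrCreate()

# Partition the silver table by business date so daily jobs prune to a single partition.
(
    spark.table("bronze.core_banking_transactions")
    .write.format("delta")
    .mode("overwrite")
    .partitionBy("dt")
    .saveAsTable("silver.core_banking_transactions")
)

# Co-locate rows for the most common filter column within each partition.
spark.sql("OPTIMIZE silver.core_banking_transactions ZORDER BY (account_id)")
```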
We are committed to your professional growth:
- Clear progression path from Mid-level → Senior Data Engineer → Technical Lead
- Regular 1-on-1s with PM and Tech Lead for feedback and career planning
- Opportunities to lead features and projects as you gain experience
- Support for certifications and training (AWS, Databricks, dbt)
- Exposure to architecture decisions and system design discussions
- Increasing ownership of data platform components
- Potential to specialize in areas of interest (data governance, real-time streaming, ML infrastructure, cost optimization)
- Mentorship opportunities as the team grows