Overview
Who is this for?
If building scalable data infrastructure excites you, this is the place to be. Fornax is a cross-functional team that solves critical business challenges with analytics and innovative data solutions.
We are seeking a skilled Data Engineer to work on our cutting-edge data product.
The ideal candidate combines strong technical expertise in modern data stack technologies with a passion for building robust, scalable data platforms.
The Data Engineer will play a critical role in architecting, developing, and maintaining our data product infrastructure. The role involves working closely with data scientists, analytics engineers, and product stakeholders to build high-performance data pipelines, optimize query performance, and deliver reliable data solutions, and it calls for a strong background in data engineering, distributed systems, and modern data tooling.
Key Responsibilities
Data Infrastructure & Pipeline Development (50%)
- Design, build, and maintain scalable data pipelines using Prefect for orchestration and workflow management (a minimal sketch of such a flow follows this list)
- Implement ELT processes using Dlthub to efficiently load data from various sources into the data platform
- Develop and optimize data transformation workflows using DBT to ensure clean, modeled, and business-ready datasets
- Build and manage Apache Iceberg table formats to enable efficient data lakehouse operations with ACID transactions
- Leverage DuckDB for fast local analytics, development testing, and embedded analytical workloads
- Ensure data pipeline reliability, monitoring, and error handling with comprehensive logging and alerting mechanisms
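To make the orchestration, ingestion, and transformation pieces above concrete, here is a minimal, hedged sketch of a daily pipeline: a Prefect flow that runs a dlt (Dlthub) load into DuckDB and then triggers dbt through its CLI. The API endpoint, table name, and dbt project path are illustrative assumptions, not part of our actual codebase.

```python
# Minimal sketch: Prefect orchestrating a dlt load and a dbt run.
# Assumes prefect, dlt, requests, and the dbt CLI are installed;
# the endpoint, table, and project names are hypothetical.
import subprocess

import dlt
import requests
from prefect import flow, task


@task(retries=2, retry_delay_seconds=60)
def load_raw_events() -> None:
    """Load a source API into the warehouse with dlt (illustrative endpoint)."""
    rows = requests.get("https://api.example.com/events", timeout=30).json()
    pipeline = dlt.pipeline(
        pipeline_name="events_ingest",
        destination="duckdb",   # local dev target; swap for the real destination
        dataset_name="raw",
    )
    info = pipeline.run(rows, table_name="raw_events", write_disposition="append")
    print(info)


@task
def run_dbt_models() -> None:
    """Build business-ready models; dbt is invoked via its CLI for simplicity."""
    subprocess.run(["dbt", "run", "--project-dir", "analytics"], check=True)


@flow(log_prints=True)
def daily_events_pipeline() -> None:
    load_raw_events()
    run_dbt_models()


if __name__ == "__main__":
    daily_events_pipeline()
```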
Query Optimization & Performance Engineering (20%)
- Design and optimize distributed query execution using Trino for high-performance analytics across diverse data sources
- Utilize DuckDB for rapid prototyping, local query testing, and in-process analytical operations
- Implement query optimization strategies including partition pruning, predicate pushdown, and materialized views (see the DuckDB sketch after this list for a local illustration)
- Monitor and tune query performance to ensure sub-second response times for critical business queries
- Develop best practices for efficient data access patterns and resource utilization across different query engines
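To give a concrete feel for this optimization work, the hedged sketch below uses DuckDB to write a small Hive-partitioned Parquet dataset and then inspects a filtered query plan, a quick local stand-in for the partition pruning and predicate pushdown we expect from Trino over Iceberg in production. The dataset path and column names are illustrative assumptions.

```python
# Minimal sketch: checking partition pruning / predicate pushdown with DuckDB.
# The dataset layout and names are illustrative; in production the same ideas
# apply to Iceberg tables queried through Trino.
import duckdb

con = duckdb.connect()

# Write a small Hive-partitioned Parquet dataset, partitioned by event_date.
con.execute("""
    COPY (
        SELECT
            DATE '2024-01-01' + CAST(i % 30 AS INT) AS event_date,
            i                                       AS event_id,
            i * 1.5                                 AS amount
        FROM range(100000) t(i)
    )
    TO 'events_parquet' (FORMAT PARQUET, PARTITION_BY (event_date), OVERWRITE_OR_IGNORE)
""")

# A partition filter lets the engine skip every file outside the requested date:
# the plan should show only the matching partition being scanned.
plan = con.execute("""
    EXPLAIN
    SELECT sum(amount)
    FROM read_parquet('events_parquet/**/*.parquet', hive_partitioning = true)
    WHERE event_date = DATE '2024-01-15'
""").fetchall()
print("\n".join(row[1] for row in plan))
```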
Data Modeling & Architecture (15%)
- Implement semantic layer and metrics definitions for consistent business logic across applications
- Design dimensional models and data mart architectures to support analytics and reporting use cases (a toy example appears after this list)
- Collaborate with analytics engineers and stakeholders to translate analytical requirements into optimized data structures
- Establish and maintain data modeling standards and documentation for the data product
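As a hedged illustration of the dimensional-modeling and semantic-layer work above, the sketch below builds a tiny star schema in DuckDB and defines one shared metric on top of it. The table, column, and metric names are invented for the example and do not reflect our actual models.

```python
# Minimal sketch: a toy star schema plus one shared metric definition.
# All names (dim_customers, fct_orders, net_revenue) are hypothetical.
import duckdb

con = duckdb.connect()

con.execute("""
    CREATE TABLE dim_customers (
        customer_key  INTEGER PRIMARY KEY,
        customer_name TEXT,
        region        TEXT
    )
""")
con.execute("""
    CREATE TABLE fct_orders (
        order_key       INTEGER PRIMARY KEY,
        customer_key    INTEGER REFERENCES dim_customers (customer_key),
        order_date      DATE,
        gross_amount    DECIMAL(12, 2),
        discount_amount DECIMAL(12, 2)
    )
""")

# A single definition of "net revenue" that dashboards and notebooks reuse,
# instead of each one re-deriving the business logic slightly differently.
NET_REVENUE_BY_REGION = """
    SELECT
        d.region,
        date_trunc('month', f.order_date)       AS order_month,
        sum(f.gross_amount - f.discount_amount) AS net_revenue
    FROM fct_orders f
    JOIN dim_customers d USING (customer_key)
    GROUP BY 1, 2
"""

print(con.execute(NET_REVENUE_BY_REGION).fetchall())
```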
Data Quality & Governance (10%)
- Implement data quality frameworks and validation checks within DBT models and Prefect workflows (see the sketch after this list)
- Develop automated data testing and monitoring solutions to ensure data accuracy and consistency
- Document data lineage, schema definitions, and transformation logic to maintain data governance standards
- Establish SLAs for data freshness, quality, and pipeline reliability
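To illustrate the kind of automated checks described above (outside of dbt's own test framework), here is a hedged Python sketch of a Prefect task running not-null, uniqueness, and freshness assertions against a DuckDB table. The database path, table and column names, and the 24-hour freshness SLA are assumptions made for the example.

```python
# Minimal sketch: automated data quality assertions of the kind dbt tests
# encode, run here as a Prefect task against DuckDB. The table name, column
# names, and the 24-hour freshness SLA are illustrative assumptions.
from datetime import datetime, timedelta

import duckdb
from prefect import flow, task


@task
def check_orders_quality(db_path: str = "warehouse.duckdb") -> None:
    con = duckdb.connect(db_path, read_only=True)

    # Not-null check on the primary key.
    null_keys = con.execute(
        "SELECT count(*) FROM fct_orders WHERE order_key IS NULL"
    ).fetchone()[0]
    assert null_keys == 0, f"{null_keys} rows with NULL order_key"

    # Uniqueness check on the primary key.
    dup_keys = con.execute(
        "SELECT count(*) FROM (SELECT order_key FROM fct_orders"
        " GROUP BY order_key HAVING count(*) > 1)"
    ).fetchone()[0]
    assert dup_keys == 0, f"{dup_keys} duplicated order_key values"

    # Freshness SLA: the newest load must be less than 24 hours old.
    latest = con.execute("SELECT max(loaded_at) FROM fct_orders").fetchone()[0]
    assert latest is not None and datetime.now() - latest < timedelta(hours=24), (
        f"fct_orders last loaded at {latest}, outside the freshness SLA"
    )


@flow
def data_quality_checks() -> None:
    check_orders_quality()


if __name__ == "__main__":
    data_quality_checks()
```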
Collaboration & Product Development (5%)
- Work closely with product managers, data scientists, and business stakeholders to understand data product requirements
- Participate in technical design reviews and contribute to architectural decisions
- Provide technical guidance and mentorship to junior team members
- Stay current with emerging technologies and best practices in the data engineering ecosystem