Overview
About Us
Vieu is building a best-in-class knowledge graph: collecting, analyzing, and indexing billions of entities and relationships to power revolutionary experiences in the Sales space. Engineers at Vieu contribute across the stack, from data pipelines to user interfaces, to deliver powerful, intuitive experiences that unlock new opportunities for our customers.
About the Role
We’re looking for a Graph Data Engineer to design and build large-scale data pipelines that transform unstructured web data into structured, searchable knowledge systems.
You will play a foundational role in architecting our data platform from scratch, and you will help scale our data systems to handle millions of data points daily.
You’ll work at the intersection of:
- Data engineering
- Distributed systems (Spark/Flink)
- NLP & entity extraction
- Web scraping & automation
- Vector search & graph-based intelligence
This is a hands-on engineering role focused on building scalable pipelines that power intelligent data products.
What You’ll Do
- Build and maintain scalable ETL/data pipelines for large-scale structured & unstructured datasets
- Design distributed processing workflows using Apache Spark or Apache Flink
- Develop web scraping and crawling frameworks using Selenium, Puppeteer, or similar tools
- Implement entity extraction, NLP pipelines, and feature engineering workflows
- Integrate data into Elasticsearch / OpenSearch / Vector Databases
- Design data models for graph-based and search-based applications
- Optimize pipeline performance and reliability, and improve monitoring (Airflow or equivalent orchestration)
- Collaborate with ML engineers to support model training & evaluation workflows
- Ensure data quality through cleaning and validation at scale
What We’re Looking For
Core Requirements:
- 3–7+ years of experience in data engineering and backend systems
- Experience building multiple systems from scratch in a fast-paced environment
- Experience with large-scale data processing at the scale of 100M+ nodes
- Hands-on experience building ETL pipelines
- Experience with Airflow or workflow orchestration systems
Strong Plus:
- Experience with Python/TypeScript
- Experience with NLP / entity extraction pipelines
- Strong experience with Apache Spark or Apache Flink
- Experience building web scraping / crawling systems
- Familiarity with Elasticsearch / OpenSearch
- Experience with Vector Databases
- Exposure to graph data modeling or knowledge graph systems
- Experience with the ML lifecycle: feature engineering, model evaluation, deployment, and monitoring of the final system
What Makes This Role Interesting
- Work on real-world unstructured data at scale
- Build graph-powered intelligence systems
- High ownership across data ingestion → extraction → indexing → ML enablement
- Opportunity to shape foundational data infrastructure