Kochi, Kerala, India
Information Technology
Full-Time
Karomi Technology Private Limited
Overview
- Data pipeline v1 that converts > 500 ECMA dielines + 200 PDFs into training-ready JSON.
- Panel-encoder checkpoint with < 5 % masked-panel error.
- MVP copy-placement model (LayoutLM-v3 backbone + heads) hitting 85 % IoU on validation.
- REST inference service + designer preview UI able to draft lid/side-wrap artwork for one SKU in [...]
- Nightly active-learning retrain loop.
Role Summary
You will own the full ML stack that turns raw dielines, PDFs, and e-commerce images into a self-learning system that reads, reasons about, and designs packaging artwork.
That includes:
- Building data-ingestion & annotation pipelines (SVG/PDF to JSON),
- Designing / modifying model heads on top of LayoutLM-v3, CLIP, GNNs, diffusion LoRAs,
- Training & fine-tuning on GPUs,
- Shipping inference APIs and evaluation dashboards.
You'll work day-to-day with packaging designers and a product manager; you are the technical authority on everything deep-learning for this domain.
Area: Tasks
- Data & Pre-processing (about 40 %): Write robust Python scripts to parse PDF, AI, and SVG files; extract text, colour separations, images, and panel polygons.
- Implement Ghostscript, Tesseract, YOLO, CLIP pipelines.
- Automate synthetic-copy generation for ECMA dielines.
- Maintain vocabulary YAMLs & JSON schemas.
- Model R&D (about 40 %): Modify LayoutLM-v3 heads (panel-ID, bbox-reg, colour, contrastive).
- Build panel-encoder pre-train (mask-panel prediction).
- Add Graph-Transformer & CLIP-retrieval heads; optional diffusion generator.
- Run experiments, hyperparameter sweeps, and ablations; track KPIs (IoU, panel-F1, colour recall).
- Maintain CI/CD and experiment tracking (Weights & Biases, MLflow).
- Serve REST/GraphQL endpoints that designers and the web front-end call.
- Implement active-learning loop that ingests designer corrections nightly.
- 5+ years Python, 3+ years deep learning (PyTorch, Hugging Face).
- Hands-on with Transformer-based vision-language models (e.g. LayoutLM, Pix2Struct) and at least one object-detection pipeline (YOLOv5/8, DETR).
- Comfortable hacking PDF/SVG tool-chains: PyMuPDF/pdfplumber, Ghostscript, svgpathtools, OpenCV.
- Experience designing custom heads / loss functions and fine-tuning large pre-trained checkpoints on limited data.
- Solid Linux & GPU know-how; can spin up, monitor, and profile multi-GPU jobs.
- Familiarity with graph neural networks or relational transformers.
- Clear, idiomatic Git & code-review discipline; writes reproducible experiments.
- Knowledge of colour science (Lab, ICC profiles, Pantone tables) or print production.
- Prior work on multimodal retrieval (CLIP, ImageBind) or diffusion fine-tuning (LoRA, ControlNet).
- Packaging / CPG industry exposure (Nutrition Facts, Drug Facts, ECMA codes).
- Experience standing up FAISS or similar vector search, and with AWS/GCP ML tooling.
- Familiarity with TypeScript/React front-ends for quick label-preview UIs.
Domain: Primary tools
- DL frameworks: PyTorch, Hugging Face Transformers, torch-geometric
- Parsing / CV: PyMuPDF, pdfplumber, svgpathtools, OpenCV, Ghostscript
- OCR / Detectors: Tesseract, YOLOv8, Grounding DINO (optional)
- Retrieval: CLIP / ImageBind + FAISS
- MLOps: Docker, GitHub Actions, W&B or MLflow, AWS SageMaker / GCP Vertex
- Languages: 95 % Python, occasional Bash / JSON / YAML
Reporting & Team
- Reports to Head of AI (or CTO).
- Collaborates with 1 front-end engineer, 1 product manager, 2 packaging-design SMEs.