Information Technology
Full-Time
Muoro
Role Overview:
We are seeking a highly skilled Software Engineer specializing in Large Language Models (LLMs) to design, develop, and deploy cutting-edge AI solutions leveraging state-of-the-art transformer architectures.
- The ideal candidate will have strong expertise in deep learning, NLP, and model optimization, combined with software engineering best practices for building scalable AI systems in production.
- You'll collaborate with data scientists, ML engineers, and product teams to build intelligent applications powered by advanced generative AI models such as GPT, LLaMA, Falcon, Mistral, Claude, or similar open-source and proprietary models.
- Design, train, fine-tune, and evaluate Large Language Models (LLMs) for specific use cases (e.g., summarization, code generation, chatbots, reasoning, and retrieval-augmented generation).
- Experiment with transformer-based architectures (e.g., GPT, T5, BERT, LLaMA, Mistral).
- Develop parameter-efficient fine-tuning (PEFT) strategies such as LoRA, QLoRA, adapters, or prompt-tuning (a minimal LoRA sketch appears after this list).
- Create and maintain high-quality datasets for pretraining, fine-tuning, and evaluation.
- Optimize model inference using techniques like quantization, distillation, and tensor parallelism for real-time or edge deployment.
- Integrate LLMs into production environments using frameworks like Hugging Face Transformers, PyTorch Lightning, or DeepSpeed.
- Implement scalable model serving solutions using FastAPI, Ray Serve, Triton Inference Server, or similar frameworks (see the serving sketch after this list).
- Build and maintain APIs or SDKs that expose LLM capabilities to other teams and products.
- Evaluate and experiment with open-source and proprietary foundation models.
- Keep up with the latest trends in Generative AI, NLP, and Transformer models.
- Perform benchmarking, ablation studies, and A/B testing to measure performance, cost, and quality improvements.
- Collaborate with MLOps and DevOps teams to design CI/CD pipelines for model training and deployment.
- Manage and optimize GPU/TPU clusters for distributed training and inference.
- Implement robust monitoring, logging, and alerting for deployed AI systems.
- Ensure software follows clean code principles, version control, and proper documentation.
- Partner with product managers, data scientists, and UX teams to identify and translate business problems into AI-driven solutions.
- Contribute to internal research initiatives and help shape the company's AI strategy.
- Mentor junior engineers in AI model development, coding standards, and best practices.
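To make the fine-tuning work above concrete, here is a minimal sketch of wrapping a causal language model with a LoRA adapter via the Hugging Face peft library; the checkpoint name, rank, and target modules below are illustrative assumptions, not project requirements.

# Minimal LoRA sketch (assumed checkpoint and hyperparameters).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

base_model = "meta-llama/Llama-2-7b-hf"  # assumed example checkpoint; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# LoRA trains small low-rank update matrices instead of all base weights.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                  # rank of the low-rank updates
    lora_alpha=16,                        # scaling factor applied to the updates
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections; model-dependent
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()        # typically well under 1% of base parameters

The wrapped model can then be trained with a standard Trainer or PyTorch Lightning loop, and only the small adapter weights need to be saved and shipped.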
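The serving item is sketched below with FastAPI wrapping a Hugging Face text-generation pipeline; the route, request schema, and demo model are hypothetical placeholders rather than a prescribed design.

# Minimal FastAPI serving sketch (assumed route and demo model).
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI()
generator = pipeline("text-generation", model="gpt2")  # assumed small demo model

class GenerateRequest(BaseModel):
    prompt: str
    max_new_tokens: int = 64

@app.post("/generate")
def generate(req: GenerateRequest):
    # Synchronous generation; a production setup would add batching, streaming, and auth.
    outputs = generator(req.prompt, max_new_tokens=req.max_new_tokens)
    return {"completion": outputs[0]["generated_text"]}

This could be launched with a standard ASGI server, e.g. uvicorn main:app, assuming the file is named main.py.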
Core Expertise:
- Strong proficiency in Python and deep learning frameworks (PyTorch, TensorFlow, JAX).
- Hands-on experience with transformer architectures and LLM fine-tuning.
- Deep understanding of tokenization, attention mechanisms, embeddings, and sequence modeling.
- Experience with Hugging Face Transformers, LangChain, LlamaIndex, or OpenAI API.
- Experience deploying models using Docker, Kubernetes, or cloud ML services (AWS SageMaker, GCP Vertex AI, Azure ML, OCI Data Science).
- Familiarity with model optimization (quantization, pruning, distillation).
- Knowledge of retrieval-augmented generation (RAG) pipelines and vector databases (FAISS, Pinecone, Weaviate, Chroma); a minimal retrieval sketch follows this list.
- Experience with multi-modal models (text + image, text + code).
- Familiarity with MLOps tools like MLflow, Kubeflow, or Weights & Biases (W&B).
- Understanding of Responsible AI practices: bias mitigation, data privacy, and model explainability.
- Experience contributing to open-source AI projects.
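As a rough illustration of the RAG knowledge listed above, here is a minimal retrieval sketch assuming sentence-transformers embeddings and a FAISS inner-product index; the corpus, embedding model, and query are placeholder assumptions.

# Minimal RAG retrieval sketch (assumed corpus and embedding model).
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

documents = [
    "LoRA trains low-rank adapters instead of full model weights.",
    "Quantization reduces memory use by lowering numeric precision.",
    "RAG augments prompts with passages retrieved from a vector index.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
embeddings = encoder.encode(documents, normalize_embeddings=True)

# Inner product over normalized vectors is cosine similarity.
index = faiss.IndexFlatIP(embeddings.shape[1])
index.add(np.asarray(embeddings, dtype="float32"))

query = encoder.encode(["How does retrieval-augmented generation work?"],
                       normalize_embeddings=True)
scores, ids = index.search(np.asarray(query, dtype="float32"), 2)
context = "\n".join(documents[i] for i in ids[0])
# `context` would be prepended to the LLM prompt in the generation step.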