
Overview
Values Rooted in Purpose
We embrace the power to lead, the courage to innovate, and the determination to grow. At our core, we believe in humanizing our approach, recognizing that our people are our greatest strength. With a shared vision of transformation, we endeavor to shape a brighter future for higher education.
We’re seeking a seasoned Senior Cloud Engineer to help modernize our Ellucian application ecosystem and drive AI-powered automation in service management. You’ll be part of a transformational journey from manual infrastructure to a fully pipeline-driven, scalable SaaS platform that combines agentic AI, SMART workflows, and self-healing APIs coded in Python. With a focus on delivering intelligent workflows and creating a highly available CI/CD platform, this high-impact role combines deep technical expertise with a passion for innovation, shaping Ellucian into a leader in SaaS for higher education.
If you get excited about automation and creation, and about the impact you can make on shaping the future of higher education, then we should talk!
Where you will make an impact
- Lead the creation of agentic AI–powered service management applications, using LLMs, embeddings, and vector databases to automate incident triage, change requests, and user support workflows.
- Architect and optimize scalable data lakes and vectorization pipelines that transform logs, tickets, and knowledge articles into high-quality embeddings for semantic search and proactive problem detection.
- Design orchestration frameworks that integrate LLM agents with ITSM platforms, internal tools, and third-party services via REST APIs and event-driven architectures to deliver zero-touch operations.
- Champion MLOps best practices—implement CI/CD for models, establish monitoring and alerting on SLAs, and automate retraining—to ensure AI agents maintain peak performance in live service environments.
- Collaborate with service management, product, and engineering teams to translate operational challenges into AI-driven solutions, providing technical leadership and mentorship throughout delivery.
- Stay at the cutting edge of generative AI and vector search research to continuously enhance our service management applications and drive innovation in automated support.
What you will bring
- 7+ years of AI/ML engineering experience with deep expertise in Python, TensorFlow or PyTorch, and cloud platforms (AWS preferred).
- Proven track record building LLM-powered applications and autonomous agent frameworks (e.g., LangChain), with a focus on IT and service management use cases.
- Strong proficiency in embedding strategies and vector databases such as Pinecone, FAISS, or Milvus, including designing indexing and retrieval pipelines for ticketing and knowledge data.
- Solid background in data engineering and data lake architectures, ensuring seamless support for advanced service management workloads.
- Demonstrated ability to integrate AI services via REST APIs, message queues, and event routers, delivering robust, scalable service management solutions.
- Extensive MLOps experience—using tools like MLflow, SageMaker Pipelines, Kubeflow, MCPS, and AGNO—to automate model lifecycle from training to monitoring and retraining in production support environments.
- Expertise in observability tools and practices to instrument, monitor, and troubleshoot AI/ML pipelines, ensuring system reliability and performance.
- Exceptional problem-solving skills, clear communication, and a passion for mentoring peers and driving cross-functional collaboration between ITSM and engineering teams.
What we offer
- 22 days annual leave plus 11 public holidays
- Competitive gratuity policy
- Group insurance and annual health checkup plan with a variety of family and wellness benefits
- Thrive Flex Lifestyle Account (LSA) that allows you to contribute towards your health, financial or learning interests
- 5 charitable days to support the community that supports us
- Wellness
o Headspace (mental health)
o Wellbeats (virtual fitness classes)
o RethinkCare (caregiver support)
- Diversity and inclusion programs that promote employee resource groups such as Buzzinga and Lean In Team, to name a few
- Parental leave
- Employee referral bonuses to encourage the addition of great new people to the team
- We foster a learning culture with:
o Education Assistance Program
o Professional development opportunities