Pune, Maharashtra, India
Information Technology
Full-Time
MulticoreWare Inc
Overview
Key Responsibilities
Debugging and Troubleshooting:
- Investigate and resolve complex software issues within OpenStack environments (particularly those running on Ubuntu), including networking, compute, and storage.
- Diagnose and troubleshoot problems related to Kubernetes container orchestration, including pod failures, service outages, and networking issues.
- Debug and analyze issues with Docker containers and their interaction with the underlying system.
- Analyze and resolve issues related to Ceph distributed storage, including data replication, performance tuning, and storage availability.
- Work on Octavia load balancers to troubleshoot L2/L3 networking issues and ensure reliable load balancing for cloud-native applications.
- Lead incident resolution efforts for platform outages or performance degradation, coordinating across different teams to ensure swift recovery.
- Perform root cause analysis (RCA) and provide long-term fixes for recurring or critical issues.
- Document incident postmortems to prevent future occurrences and improve processes.
- Analyze performance bottlenecks across the cloud stack, including OpenStack components, Kubernetes, and Ceph, and implement optimizations to improve reliability and efficiency.
- Optimize networking setups, including Octavia load balancers, to enhance cloud service delivery.
- Monitor and improve containerized application performance and scaling across Docker and Kubernetes clusters.
- Assist in upgrading and maintaining cloud infrastructure, ensuring that all components (Ubuntu, OpenStack, Kubernetes, Ceph, etc.) are kept secure and up to date.
- Participate in the deployment of software updates, security patches, and configuration changes in a controlled manner with minimal downtime.
- Build and maintain automation scripts for monitoring, troubleshooting, and resolving cloud platform issues, focusing on OpenStack, Kubernetes, Ceph, and Docker environments.
- Implement and optimize Infrastructure as Code (IaC) solutions to improve the deployment and configuration of cloud resources.
Required Skills
- Strong debugging skills and familiarity with cloud and software debugging tools.
- Experience with networking, compute, and storage components in OpenStack.
- Hands-on experience with containerization (Docker) and orchestration (Kubernetes).
- Familiarity with Ceph distributed storage solutions and troubleshooting storage issues.
- Experience with monitoring and logging tools, such as Prometheus, Grafana, and Elasticsearch.
- Solid understanding of networking principles, including L2/L3 networking, load balancing (Octavia), and software-defined networking (SDN).
- Proficiency in scripting languages such as Python or Bash for automation.
- Strong communication skills and the ability to work in a collaborative environment.