
Overview
We are looking for a Site Reliability Engineering Manager to lead our new Product Reliability Team. Your responsibility will be to help drive innovation, increase platform stability, and create operational excellence from within a development team, across Imperva’s product portfolio. In this role you will have input into decisions that significantly affect our products, how they are deployed, and the infrastructure and operations behind them. You will build a team from the ground up and tangibly improve our platform and applications through your knowledge and diligence.
Responsibilities:
Apply the core SRE tenets of measurement (SLI/SLO/SLA), toil elimination, and reliability modeling
Enable and educate development teams on industry best-practice design patterns, ways of working, and operational knowledge to ensure platform continuity
Develop and architect solutions to infrastructure and operational aspects of new products and feature sets
Work within development teams to troubleshoot and resolve business-affecting issues
Contribute to improving Imperva’s global systems performance and stability
Qualifications:
At least 5 years of professional experience, with at least 2 years in a leadership role
A strong technical background, with current capabilities and a willingness to get hands-on when needed
Excellent knowledge of Linux and scripting languages (Bash, Python, Golang)
Excellent knowledge of cloud and on-premises systems administration principles
End-user experience with CI/CD systems (Jenkins, ArgoCD)
Significant understanding of DevOps principles
Excellent knowledge of, and experience with, logging and monitoring tooling (Datadog, Coralogix, Grafana, Prometheus)
Good knowledge of web-scale networking (routing protocols, DNS)
A strong team player who takes accountability and responds to business urgency