
Overview
Wood Mackenzie is the global data and analytics business for the renewables, energy, and natural resources industries. Enhanced by technology. Enriched by human intelligence. In an ever-changing world, companies and governments need reliable and actionable insight to lead the transition to a sustainable future. That’s why we cover the entire supply chain with unparalleled breadth and depth, backed by over 50 years’ experience. Our team of over 2,400 experts, operating across 30 global locations, are enabling customers’ decisions through real-time analytics, consultancy, events and thought leadership. Together, we deliver the insight they need to separate risk from opportunity and make confident decisions when it matters most.
WoodMac.com
Wood Mackenzie Values
- Inclusive – we succeed together
- Trusting – we choose to trust each other
- Customer committed – we put customers at the heart of our decisions
- Future Focused – we accelerate change
- Curious – we turn knowledge into action
Site Reliability Engineer II
Job Description
Wood Mackenzie has an exciting opportunity for a Site Reliability Engineer (SRE) II to join a dynamic global business to help drive change and innovation. We are looking for an SRE professional to help us manage and support our products and services within the enterprise.
Role Purpose
The principal responsibility of this role is to provide operational expertise within the SRE team and work with the software engineering teams to release and maintain new and existing applications. This encompasses:
Working in partnership with the business and the technology teams, bringing awareness and insight of the different operational constraints / opportunities for projects targeting cloud-based or on-premises deployment.
Advanced implementation and maintenance of more complex cloud and on-prem resources and environments.
Promotion of mutual feedback in cross-functional groups, following SRE best practices within a DevOps culture.
Implementation of advanced continuous integration/delivery toolsets or the processes, resources, and platforms that use those tools.
Strong focus on service availability and proactive detection of problems.
Ability to articulate technical and business concepts to different audiences and to influence technical decisions with solid metrics collection and proofs of concept.
Responsibilities:
Advanced Pipeline Implementation and Optimization: Work closely with cross-functional teams, including developers, QA, and product managers, to develop, implement, and maintain advanced delivery pipelines for efficiency and scalability using tools like Jenkins, TeamCity, Octopus Deploy, and GitHub Actions.
Operational Insights: Provide advanced solutions for operational excellence through identifying operational constraints and opportunities such as auto-scaling, container orchestration, and system resiliency.
Proactive Monitoring: Develop proactive monitoring solutions to predict and mitigate potential issues. Regularly analyze monitoring data to identify trends and areas for improvement.
Tooling Innovation: Drive innovation in team tooling and processes. Continuously evaluate the fit and purpose of industry tools used across the team.
Operational Leadership: Mentor Level 1 engineers and lead operational improvements.
Continuous Improvement: Embrace a mindset of continuous improvement, regularly reviewing operational processes to identify inefficiencies. Assist with drafting and proposing actionable plans for process enhancements to increase team efficiency and system reliability.
Incident Management: Actively lead incident response efforts (P3 and P4) and conduct post-incident reviews. Be actively involved in troubleshooting and collaborating on active incidents.
Documentation and Knowledge Sharing: Ensure thorough documentation, adding or updating it where necessary, and foster a culture of knowledge sharing.
On-Call Rotation: Participate in a 24/7 on-call support rotation, providing advanced support and responding to system alerts and incidents to ensure continuous system availability and performance.
Qualifications
We understand every organization is different and professionals have their own unique history and experience, so we don’t expect to find a 100% match between a candidate's competencies and the tech stack we use at Wood Mackenzie. We list our preferred technologies, but if you have transferable knowledge and are willing to learn what you do not know, we will consider your application.
Skill Requirements:
Experience: 2-4 years in SRE/DevOps roles.
Leadership in Agile: Experience leading agile processes, supporting multiple software engineering teams for product releases, and providing monitoring and insights into issues.
Advanced Cloud Skills: Strong Amazon Web Services (AWS) understanding (Cloud Practitioner or equivalent knowledge), working knowledge of Azure (AZ-900 or equivalent knowledge).
DevOps Mindset: Strong understanding of DevOps principles and operational model, including continuous integration, continuous delivery, and infrastructure as code. Experience with agile methodologies such as Kanban or Scrum and familiarity with JIRA for issue tracking.
Team Collaboration: Demonstrated ability to work independently as well as part of a cross-functional, multi-locational team. Effective communication skills to collaborate with team members and stakeholders.
Resource Maintenance: Develop and oversee routine system maintenance tasks, including patch management, system backups, and performance tuning.
Advanced Linux Skills: Proficient in Linux administration (RHEL/Ubuntu) and automation.
Automation Expertise: Expertise with configuration management tools (e.g., Ansible, SaltStack) for automating system configurations and deployments.
Mentorship: Ability to mentor SRE I engineers.
Additional Preferred Skills:
Advanced Cloud Proficiency: Extensive AWS and Azure experience.
Containerization Expertise: Proficiency in Docker and Kubernetes.
Advanced Scripting and Automation: Strong scripting skills in Python, Bash, or PowerShell.
Infrastructure as Code: Experience with Terraform, CloudFormation, or Pulumi.
CI/CD Pipelines: Deep understanding of CI/CD principles.
Monitoring and Logging: Proficiency with tools such as Prometheus, Nagios, Grafana, ELK Stack, Splunk, App Insights, and CloudWatch.
Networking Knowledge: Basic understanding of networking concepts.
Security Best Practices: Familiarity with security best practices and compliance frameworks (e.g. SOX, SOC 2, NIST).
Database Management: Experience with SQL and NoSQL databases.
Enterprise SaaS applications: Experience working with SaaS applications such as Okta, Jira, and Confluence.
Collaboration Tools: Experience with Git, GitHub, and documentation platforms like Confluence.
Equal Opportunities
We are an equal opportunities employer. This means we are committed to recruiting the best people regardless of their race, colour, religion, age, sex, national origin, disability or protected veteran status. You can find out more about your rights under the law at www.eeoc.gov
If you are applying for a role and have a physical or mental disability, we will support you with your application or through the hiring process.