Manager, Site Reliability Engineering - Cloud Operations
Job title: Manager, Site Reliability Engineering - Cloud Operations in San Diego, CA at ServiceNow
Company: ServiceNow
Job description:

Company Description
It all started in sunny San Diego, California in 2004 when a visionary engineer, Fred Luddy, saw the potential to transform how we work. Fast forward to today: ServiceNow stands as a global market leader, bringing innovative AI-enhanced technology to over 8,100 customers, including 85% of the Fortune 500®. Our intelligent cloud-based platform seamlessly connects people, systems, and processes to empower organizations to find smarter, faster, and better ways to work. But this is just the beginning of our journey. Join us as we pursue our purpose to make the world work better for everyone.

Job Description
We are currently seeking a Manager, Site Reliability Engineering.

Our Site Reliability Engineering (SRE) teams are a group of highly technical engineers who are tasked with maintaining and developing the reliability, scalability and performance of the ServiceNow cloud infrastructure. Our SREs are empowered to drive technical resolutions across the technology stack from hardware through to application and all stops in between. They are also tasked with driving forward the operability of the platform to drive down the number of incidents and to reduce MTTR.

To accomplish this, the team combines software development, networking and systems engineering expertise with a strong desire to be challenged by problems of scale and complexity and to make services better for our customers.

Let's get straight to the point. Do you:
- have a technical background in roles like systems engineering or devops?
- know operating systems well enough to troubleshoot and diagnose them at various levels?
- have a low tolerance for repetitive tasks and automate your way through work?
- have experience in leading a team of engineers and exposure to people management?
Responsibilities:
- Team management, career development, project prioritization and performance review.
- Drive a culture of intolerance of manual activities that promotes automation efforts.
- Drive initiatives with partner teams to improve the reliability of the infrastructure.
- Act as crisis manager, orchestrating actions towards sustainable solutions.
- Analysis and evaluation of existing processes to drive continuous improvement and efficiencies.
- Provide training and support to partner teams that interface with SRE.
- Onboarding of new hires to enable their success in their roles.
- Onboarding of new technologies, systems and automations into the team.
Qualifications:
- Experience in leveraging or critically thinking about how to integrate AI into work processes, decision-making, or problem-solving. This may include using AI-powered tools, automating workflows, analyzing AI-driven insights, or exploring AI's potential impact on the function or industry.
- 6+ years of experience managing technical teams
- 4+ years of experience managing an SRE team or similar
- Hands-on technical skills in Linux systems, Kubernetes, and software development.
- Design and implementation of monitoring solutions for large and scalable environments.
- Experience with cloud operations, follow-the-sun and geographically distributed teams.
- Experience working in software, platform and infrastructure delivered as a service.
- Outstanding interpersonal skills and strong communication skills, both written and verbal.
- Uncompromising attention to detail.
- Bachelor's degree in Computer Science or related technical field preferred
Location: San Diego, CA