Senior Site Reliability

1 day ago


Región Centro, Jalisco, Mexico · Canonical · Full-time

Senior Site Reliability / GitOps Engineer

Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation, and IoT. Canonical is a pioneer of globally distributed collaboration, with over 1,200 colleagues in 75+ countries and very few office-based roles.

Job Summary

The IS team at Canonical supports and maintains all of Canonical's IT production services. The team is responsible for running services used by more than 60 million Ubuntu users. As a Senior SRE & GitOps Engineer you will drive operations automation to the next level in both private and public clouds, leveraging open source infrastructure-as-code tools, CI/CD pipelines, and Canonical products for software operation automation.

Responsibilities

• Drive the development of automation and GitOps as an embedded tech lead.
• Collaborate closely with the IS architect to align solutions with the IS architecture vision.
• Design and architect services that the IS team can offer to the organization as products.
• Apply your IaC experience to advance infrastructure-as-code practices, continually increasing automation and improving IaC processes.
• Automate software operations for reusability and consistency across private and public clouds, taking distributed system complexities into account.
• Maintain operational responsibility for all of Canonical's core services, networks, and infrastructure.
• Develop skills in troubleshooting, capacity planning, and performance investigation.
• Set up, maintain, and use observability tools such as Prometheus, Grafana, and Elasticsearch.
• Design, implement, and maintain monitoring and alerting for various systems and services.
• Provide assistance to and collaborate with globally distributed engineering, operations, and support peers.
• Receive uninterrupted development time to focus on larger projects and automation of manual tasks.
• Share experience, know-how, and best practices with other team members in design sessions, mentorship, and collaborative work.
• Carry final responsibility for time-critical escalations.

Qualifications

• A modern view on hosting architecture, driven by infrastructure as code across both private and public clouds.
• A product mindset, striving to develop products rather than solutions.
• Python software development experience with large projects.
• Experience working with Kubernetes or other container orchestration systems.
• Proven experience managing and deploying cloud infrastructure with code.
• Practical knowledge of Linux networking, routing, and firewalls.
• Affinity with various forms of Linux storage, from Ceph to databases.
• Hands-on experience administering enterprise Linux servers.
• Extensive knowledge of cloud computing concepts and technologies.
• Bachelor's degree or higher, preferably in computer science or a related engineering field.
• Effective written and verbal communication skills in English.
• Motivation to troubleshoot from kernel to web, and willingness to ask for help when appropriate.
• A willingness to be flexible and learn new things quickly.
• Inspired by the needs of fast-changing environments.
• Happy to work within distributed teams.
• Passionate about open source, especially Ubuntu or Debian.

Benefits

• Distributed work environment with twice-yearly team sprints in person.
• Personal learning and development budget of USD 2,000 per year.
• Annual compensation review.
• Recognition rewards.
• Annual holiday leave.
• Maternity and paternity leave.
• Team Member Assistance Program & Wellness Platform.
• Opportunity to travel to new locations to meet colleagues.
• Priority Pass and travel upgrades for long-haul company events.

About Canonical

Canonical is a pioneering tech firm at the forefront of the global move to open source. As the company that publishes Ubuntu, one of the most important open source projects and the platform for AI, IoT, and the cloud, we are changing the world of software. We recruit on a global basis and set a very high standard for people joining the company. We expect excellence; to succeed, we must be the best at what we do. Most colleagues at Canonical have worked from home since 2004. Working here challenges you to think differently, work smarter, learn new skills, and raise your game.

We are proud to foster a workplace free from discrimination. Diversity of experience, perspectives, and background creates a better work environment and better products. Whatever your identity, we will give your application fair consideration.

Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Engineering and Information Technology
Industries: Software Development



  • Región Centro, Jalisco, Mexico · NTT DATA, Inc. · Full-time

    Site Reliability Engineer – GDL, Jalisco, Mexico. We are currently seeking a Site Reliability Engineer to join our team in GDL, Jalisco (MX-JAL), Mexico (MX). Responsibilities: Perform L1.5 activities such as monitoring, deployment, and rollback. Monitor the efficiency of the Azure cloud systems to prevent outages and initiate an Incident Management bridge in...


  • Región Centro, Jalisco, Mexico · NTT DATA · Full-time

    SRE – Site Reliability Engineer. We are currently seeking a Site Reliability Engineer to join our team in GDL, Jalisco (MX-JAL), Mexico (MX). Responsibilities: Perform L1.5 activities such as monitoring, deployment, and rollback. Monitor the efficiency of the Azure cloud systems to prevent outages and initiate an Incident Management bridge in case of an outage....


  • Región Centro, Jalisco, Mexico · Oracle · Full-time

    A leading cloud solutions provider in Mexico is seeking a skilled Cloud Region Build Site Reliability Engineer to join its team. This full-time role focuses on ensuring the performance, availability, and scalability of cloud infrastructure services. Responsibilities include building and maintaining OCI infrastructure, responding to incidents, and improving...


  • Región Centro, Jalisco, Mexico · Canonical · Full-time

    Senior Site Reliability Engineer. Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation and IoT. Our...


  • Región Centro, Jalisco, Mexico · NTT DATA North America · Full-time

    SRE – Site Reliability Engineer. We are currently seeking a Site Reliability Engineer to join our team in Guadalajara, Jalisco, Mexico. In this role you will perform L1.5 activities including monitoring, deployment, and rollback. You will monitor the efficiency of Azure cloud systems to prevent outages and initiate an Incident Management bridge...


  • Región Centro, Jalisco, Mexico · Oracle · Full-time

    We are looking for a skilled and motivated Cloud Region Build Site Reliability Engineer (SRE) to join our Oracle Cloud Infrastructure Region Build team. In this role, you will be responsible for building, deploying, and maintaining compute cloud infrastructure services across multiple regions to ensure high availability, scalability, and performance. You...


  • Región Centro, Jalisco, Mexico · GrainChain Inc · Full-time

    We are looking for new talent! GrainChain is a technology company dedicated to closing the digital divide in the agricultural industry. Our platforms make transactions fast, secure, and simple for our users. We are looking for a Site Reliability Engineer capable of integrating and automating the areas of...


  • Región Centro, Jalisco, Mexico · F5 · Full-time

    Site Reliability Engineer – Incident Management. At F5, we strive to bring a better digital world to life. Our teams empower organizations across the globe to create, secure, and run applications that enhance how we experience our evolving digital world. We are passionate...


  • Región Centro, Jalisco, Mexico · Wizeline · Full-time

    Senior Site Reliability Engineer (AWS/Airflow). Wizeline, a global AI-native technology solutions provider, develops cutting-edge AI-powered digital products and platforms. We partner with clients to leverage data and AI, accelerating market entry and driving business transformation. As a global community of innovators, we foster a culture of...


  • Región Centro, Jalisco, Mexico · F5 Networks, Inc. · Full-time

    At F5, we strive to bring a better digital world to life. Our teams empower organizations across the globe to create, secure, and run applications that enhance how we experience our evolving digital world. We are passionate about cybersecurity, from protecting consumers from fraud to enabling companies to focus on innovation. The Reliability Engineer will be...