Senior Site Reliability Engineer

Posted 4 weeks ago


Guadalajara, México · Tech Holding · Full-time

**About us**:
Working at Tech Holding isn't just a job, it's an opportunity to be a part of something bigger. We are a full-service consulting firm that was founded on the premise of delivering predictable outcomes and high-quality solutions to our clients. Our founders and team members have industry experience and have held senior positions in a wide variety of companies - from emerging startups to large Fortune 50 firms - and we have taken our combined experiences and developed a unique approach that is supported by the principles of deep expertise, integrity, transparency, and dependability.

**The Role**:
**Responsibilities**:
**Site Reliability Engineering**:

- Partner with development teams to implement best practices for building reliable and scalable systems.
- Stay up to date on the latest SRE trends and technologies.

**Monitoring and Observability**:

- Design, implement, and maintain robust monitoring solutions using tools like Prometheus and Grafana.
- Develop and configure alerts within tools like PagerDuty to ensure timely notification of potential issues.

**Incident Management**:

- Lead incident response, ensuring timely resolution and minimizing downtime.
- Document and communicate incident details effectively to stakeholders.
- Conduct post-incident reviews to identify root causes and implement preventative measures.

**Service Level Agreements (SLAs)**:

- Collaborate with product and engineering teams to define clear and measurable SLAs for our SaaS offerings.
- Establish Service Level Objectives (SLOs) for key metrics based on SLA requirements.
- Define Service Level Indicators (SLIs) to track progress towards achieving SLOs.
- Monitor SLO compliance and proactively identify potential SLA breaches.

**Automation**:

- Identify opportunities for automation to improve efficiency and reliability.
- Develop and implement automation scripts in languages like Python or Bash.
- Automate routine tasks and incident response workflows.

**Cross-Team Collaboration**:

- Act as a liaison between SRE, Product, Security, Application Engineering, and Customer Operations teams.
- Facilitate communication and information sharing across teams to ensure smooth operations.
- Work collaboratively to define and implement solutions that meet the needs of all stakeholders.

**Mentorship and Knowledge Sharing**:

- Mentor and collaborate with junior SREs.
- Share knowledge and best practices within the team.
- Contribute to the development and documentation of internal SRE processes.

**Required Skills**:

- 5-8 years of experience as a Site Reliability Engineer (SRE) or related role.
- Experience with Google Cloud Platform (GCP).
- Proven experience with monitoring tools like Prometheus and Grafana.
- Strong understanding of incident management best practices.
- Experience with alerting tools like PagerDuty.
- Experience with scripting languages like Python or Bash for automation.
- Excellent communication and collaboration skills.
- Ability to work independently and as part of a team.
- Strong problem-solving and analytical skills.
- Passion for building reliable and scalable systems.

**Nice to Have**:

- Experience with container orchestration platforms like Kubernetes.
- Experience with chaos engineering principles.
- Experience with configuration management tools like Ansible or Chef.

**What we offer**:

- Remote Work Opportunities
- Flexible Work Hours


