Principal Site Reliability Engineer

1 week ago


Zapopan, México · Oracle · Full-time

**Responsibilities**
- Solve complex problems related to Linux infrastructure and Oracle Cloud Infrastructure
- Act as an escalation point for critical issues that may not have a documented procedure, and provide Root Cause Analysis (RCA)
- Understand the end-to-end configuration, technical dependencies, and characteristics of production infrastructure and services
- Quickly grasp and analyze new technologies that are complex and rapidly changing and integrate those into automation and infrastructure support
- Design and deliver mission-critical automation, with a focus on security, resiliency, scale, and performance.
- Identify opportunities and drive the implementation of automation to improve service health, availability and reliability
- Author functional and technical documentation and standard operating procedures (SOPs)
- Collaborate with development teams in defining and implementing improvements in service architecture.
- Articulate technical characteristics of services and technology areas and guide cross-functional teams to engineer and add capabilities to internal tools.
- Partner with DevOps, Oracle Cloud Infrastructure deployment, and development teams to identify and resolve issues.

**Knowledge and Skills**
- 6-12 years of experience in Site Reliability Engineering and automation.
- Experience in Linux administration with good knowledge of kernel-level debugging
- Experience in debugging operating system performance issues and performance tuning
- Experience working with fault tolerant, highly available, high throughput, distributed and scalable systems
- Expertise in developing scripts, utilities, and tools to automate routine or manually intensive tasks
- Experience in cloud infrastructure technologies
- Experience in operations and problem management
- Development experience using Python and building Infrastructure using Terraform
- Experience working with global teams across different time zones.
- Strong, demonstrated logical-thinking skills, intellectual curiosity, and a strong drive for self-development.
- Aptitude to be a good team player and the desire to learn and implement new Cloud technologies as needed
- Good understanding of Agile software development principles, including the use of common tools such as JIRA
- Good understanding of cloud security and compliance management, including patching
- Excellent organizational, verbal, and written communication skills

**Qualifications required**
- 6 to 12 years of experience working in an IT Operations/Infrastructure team
- Bachelor's degree in Computer Science, Computer Engineering, Software Engineering, or a related area is preferred



  • Zapopan, México · Oracle · Full-time

    About the job: At Oracle, we're seeking a talented and skilled Site Reliability Engineer to work on the Oracle Cloud Observability and Management platform. As a Site Reliability Engineer, you will solve interesting technical challenges by designing, deploying, and troubleshooting key Cloud services, platforms, and infrastructure, always thinking about reliability,...


  • Zapopan, México · Oracle · Full-time

    **Responsibilities**
    - Solve complex problems related to Linux infrastructure and Oracle Cloud Infrastructure
    - Act as a partner concern point for critical issues that may not have a detailed procedure and provide Root Cause Analysis (RCA)
    - Understand the end-to-end configuration, technical dependencies, characteristics of production infrastructure and...


  • Zapopan, México · Oracle · Full-time

    The role provides a mixture of production platform Operations ownership as well as engineering. You will solve challenging technical problems, identify improvements, and work on implementing your recommendations. You will also work directly with high-level developers on projects and work to blur the lines between traditional system operations and development...


  • Zapopan, Jalisco, México · Oracle · Full-time

    We are looking for a skilled and motivated Cloud Region Build Site Reliability Engineer (SRE) to join our Oracle Cloud Infrastructure Region Build team. In this role, you will be responsible for building, deploying, and maintaining compute cloud infrastructure services across multiple regions to ensure high availability, scalability, and performance. You...

  • Site Reliability Engineer

    4 weeks ago


    Zapopan, México · GrainChain Inc · Full-time

    We're looking for you, join GrainChain! We are looking for a Site Reliability Engineer able to integrate and automate the development and operations areas, ensuring the quality and delivery of software solutions. We are a technology company that helps the agricultural industry close the digital gap, with different platforms that...


  • Zapopan, México · Oracle · Full-time

    Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Design and develop designs, architectures, standards, and methods for large-scale distributed systems. Facilitate...

