Principal Site Reliability Engineer

2 weeks ago


Zapopan, México · Oracle · Full-time

**Responsibilities**

- Solve complex problems related to Linux infrastructure and Oracle Cloud Infrastructure.
- Act as the escalation point for critical issues that may not have a documented procedure, and provide Root Cause Analysis (RCA).
- Understand the end-to-end configuration, technical dependencies, and characteristics of production infrastructure and services.
- Quickly grasp and analyze new technologies that are complex and rapidly changing, and integrate them into automation and infrastructure support.
- Design and deliver mission-critical automation, with a focus on security, resiliency, scale, and performance.
- Identify opportunities and drive the implementation of automation to improve service health, availability, and reliability.
- Author functional and technical documentation and standard operating procedures (SOPs).
- Collaborate with development teams in defining and implementing improvements in service architecture.
- Articulate technical characteristics of services and technology areas, and guide cross-functional teams to engineer and add capabilities to internal tools.
- Partner with DevOps teams, Oracle Cloud Infrastructure deployment teams, and development teams to identify and resolve issues.

**Knowledge Skills**

- 6 to 12 years of experience in Site Reliability Engineering and automation.
- Experience in Linux administration, with good knowledge of kernel-level debugging.
- Experience in debugging operating system performance issues and in performance tuning.
- Experience working with fault-tolerant, highly available, high-throughput, distributed, and scalable systems.
- Expertise in developing scripts, utilities, and tools to automate routine or manually intensive tasks.
- Experience in cloud infrastructure technologies.
- Experience in operations and problem management.
- Development experience using Python and building infrastructure using Terraform.
- Experience working with global teams across different time zones.
- Strong logical-thinking skills, intellectual curiosity, and a strong drive for self-development.
- Aptitude to be a good team player and the desire to learn and implement new cloud technologies as needed.
- Good understanding of Agile software development principles, including the use of common tools such as JIRA.
- Good understanding of cloud security and compliance management, including patching.
- Excellent organizational, verbal, and written communication skills.

**Qualifications required**

- 6 to 12 years of experience working in an IT Operations/Infrastructure team.
- A bachelor's degree in Computer Science, Computer Engineering, Software Engineering, or a related area is preferred.



  • Zapopan, Jalisco, México · Oracle · Full-time

    We are looking for a skilled and motivated Cloud Region Build Site Reliability Engineer (SRE) to join our Oracle Cloud Infrastructure Region Build team. In this role, you will be responsible for building, deploying, and maintaining compute cloud infrastructure services across multiple regions to ensure high availability, scalability, and performance. You...

  • Site Reliability Engineer

    3 weeks ago


    Zapopan, México · GrainChain Inc · Full-time

    We're looking for you. Join GrainChain! We are seeking a Site Reliability Engineer able to integrate and automate the development and operations areas, ensuring the quality and delivery of software solutions. We are a technology company that helps the agricultural industry close the digital gap, with different platforms that...


  • Zapopan, México · Oracle · Full-time

    Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Design and develop designs, architectures, standards, and methods for large-scale distributed systems. Facilitate...


  • Zapopan, Jalisco, México · Oracle · Full-time

    Description: As a senior member of the Site Reliability Engineering (SRE) team, you'll take ownership of highly available systems, influence service design, and work across teams to drive resiliency, automation, and operational excellence. This is a hands-on engineering role where deep infrastructure knowledge meets software engineering expertise, ideal for...

  • Site Reliability Engineer

    2 weeks ago


    Zapopan, México · Oracle · Full-time

    Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Design and develop designs, architectures, standards, and methods for large-scale distributed systems. Facilitate...


  • Zapopan, México · BairesDev · Full-time

    WinDifferent specializes in helping businesses achieve rapid and sustainable growth through our powerful proprietary marketing system. Our data-driven solutions generate positive engagement that leads to ready-to-close opportunities, massively expanding sales pipelines and enabling companies to scale faster than the competition. As one of WinDifferent's...


  • Zapopan, México · Oracle · Full-time

    Technical Skills: Strong knowledge of Exadata, Real Application Clusters, Oracle Database, storage, and Linux fundamentals. Oracle Exadata Database Machine and Oracle Cloud Infrastructure (OCI) certifications preferred. Knowledge of network fundamentals such as VCN, Ethernet, RoCE, TCP/IP, routing, DHCP, etc. Experience automating management of Linux based...