Principal Site Reliability Developer

1 week ago


Zapopan, México Oracle Full-time

Be comfortable with mission-critical production issues and manage customer anxiety appropriately. We would like to see some combination of the following skills:

- 5+ years of experience in software design, development, or a DevOps role with distributed, highly scalable, maximum-availability (HA, brownout), multi-node environments (partitioning, isolation with VLAN, pkeys, QinQ, VRF, EVPN)
- On-call experience
- Knowledge of server virtualization technologies: Xen, KVM, Linux containers, and Docker, including vNUMA, domain groups, and SR-IOV
- Knowledge of Linux kernel internals (memory management, scheduler, builds), the TCP/IP networking stack, InfiniBand/OFED architecture (RDS, RoCE v2, OCFS2), and filesystems/volumes
- Familiarity with x86 systems; network switches from Cisco, Arista, Juniper, or Mellanox; L3 top of switch routing (OSPF, BGP); and the Mellanox HCA (CX3, CX5 and newer) programmer's guide
- Experience working with Cloud infrastructure APIs, REST API model, and developing REST APIs
- Demonstrated experience with Java, as well as strong experience with scripting languages such as Python and Bash
- Strong troubleshooting and performance tuning skills, OPS or system administration
- Knowledge of any of the following areas is a plus:

- Understanding the latest features of Exadata / Engineered systems, Oracle Grid Infrastructure and Database is a plus
- Familiar with OpenStack and/or other Cloud infrastructure products is a plus
- Understanding and experience of Cloud Networking & Security (like Application Firewall, IPSec VPN, NAT, IPv6, websockets, TLS, certificates, and tunneling protocols) architectures
- Strong understanding of I/O characteristics and storage systems
- A background in multi-tenant service offerings and Service Level Availability concepts is a strong plus
- PCI, HIPAA audits, UK gov, security vulnerability remediation

Work with the Site Reliability Engineering (SRE) team on the shared full-stack ownership of a collection of services and/or technology areas. Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services. Be responsible for the design and delivery of the mission-critical stack, with a focus on security, resiliency, scale, and performance. Hold authority for end-to-end performance and operability. Partner with development teams in defining and implementing improvements in service architecture. Articulate technical characteristics of services and technology areas and guide development teams to engineer and add premier capabilities to the Oracle Cloud service portfolio. Understand and communicate the scale, capacity, security, performance attributes, and requirements of the service and technology stack. Demonstrate a clear understanding of automation and orchestration principles. Act as the ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs). Use a deep understanding of service topology and dependencies to troubleshoot issues and define mitigations. Understand and explain the effect of product architecture decisions on distributed systems. Bring professional curiosity and a desire to develop a deep understanding of services and technologies.


