Principal Site Reliability Developer

2 days ago


Zapopan, México Oracle Full-time

Work with an elite team to provide Oracle Database Administration support for customer production systems in the Oracle Cloud, with the opportunity to work on the latest Oracle database releases and features as part of the cloud-first strategy. Provide DBA operational support with a high degree of customer service, technical expertise, and timeliness. Deliver accurate and creative solutions to customer problems to ensure customer success.

Must be a motivated team player and self-starter with a “can-do” and “do-it-right” attitude, able to work cross-functionally and constantly looking for ways to improve the work.

Leading contributor individually and as a team member. 7+ years of demonstrated ability supporting relational databases as a DBA, with a focus on Oracle and related tools. Experience in an organization with a key 24x7 reliance on its databases is preferred.

Roles & Responsibilities:

- Alert and diagnostic monitoring
- Participate in 24x7 on-call support on a weekly rotating schedule, including weekends
- Work closely with Product Development teams and provide feedback to improve product quality
- Develop administration automation and monitoring scripts as needed
- Develop and maintain Standard Operating Procedures/documentation
- Provide escalation support for database related issues
- Share knowledge and mentor junior team members as needed
- Attend weekly team meetings and participate in team-building activities
- Other duties, as assigned

Key Requirements include:

- Bachelor's degree in Computer Science or equivalent
- Must have 7+ years of extensive database management experience with a solid sense of ownership, vitality, and drive
- Willing to work in a 24x7 mission-critical production environment, including weekends
- Experience with Oracle Database 12c, 19c, RAC, ASM, Data Guard, and performance tuning
- Experience with Enterprise Manager for monitoring and administering critically important databases
- Experience with database backup and recovery
- Experience with database patching and upgrades
- Experience with scripting in any Unix shell, Perl, and Python
- Experience in Oracle SQL/PLSQL
- Experience administering databases on Exadata is a strong plus
- Experience with Public Cloud
- Understanding of REST APIs
- Knowledge of standard processes and DevOps methodologies in an always-up, always-available service
- Strong troubleshooting skills to investigate errors and performance bottlenecks
- Excellent written, verbal communication and documentation skills

Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Design and develop architectures, standards, and methods for large-scale distributed systems. Facilitate service capacity planning and demand forecasting, software performance analysis, and system tuning.

Career Level - IC4

Work with the Site Reliability Engineering (SRE) team on the shared full-stack ownership of a collection of services and/or technology areas. Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services. Responsible for the design and delivery of the mission-critical stack, with a focus on security, resiliency, scale, and performance. Authority for end-to-end performance and operability. Partner with development teams in defining and implementing improvements in service architecture. Articulate technical characteristics of services and technology areas and guide development teams to engineer and add premier capabilities to the Oracle Cloud service portfolio. Understand and communicate the scale, capacity, security, performance attributes, and requirements of the service and technology stack. Demonstrate a clear understanding of automation and orchestration principles. Act as the ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs). Utilize a deep understanding of service topology and dependencies to troubleshoot issues and define mitigations. Understand and explain the effect of product architecture decisions on distributed systems. Bring professional curiosity and a desire to develop a deep understanding of services and technologies.


