Principal Site Reliability Developer

7 days ago


Zapopan, Jalisco, México | Oracle | Full-time
Description

Work with an elite team to provide Oracle Database Administration support for customer production systems in the Oracle Cloud, with the opportunity to work on the latest Oracle database releases and features as part of the cloud-first strategy. Provide DBA operational support with a high degree of customer service, technical expertise, and timeliness. Deliver accurate and creative solutions to customer problems to ensure customer success.

Must be a motivated team player and self-starter with a "can-do" and "do-it-right" attitude, able to work cross-functionally and constantly thinking of ways to improve the work.

Leading contributor individually and as a team member. 7+ years of demonstrated ability supporting relational databases as a DBA, with a focus on Oracle and related tools. Experience at an organization with a key 24x7 reliance on its databases is preferred.

Roles & Responsibilities:

  • Alert and diagnostic monitoring
  • Participate in 24x7 on-call support on a weekly rotating schedule, including weekends
  • Work closely with Product Development teams and provide feedback to improve product quality
  • Develop administration automation and monitoring scripts as needed
  • Develop and maintain Standard Operating Procedures/documentation
  • Provide escalation support for database related issues
  • Share knowledge and mentor junior team members as needed
  • Attend weekly team meetings and participate in team-building activities
  • Other duties, as assigned

Key Requirements include: 

  • Bachelor's degree in Computer Science or equivalent
  • Must have 7+ years of extensive database management experience with a solid sense of ownership, vitality, and drive
  • Willing to work in a 24x7 mission-critical production environment, including weekends
  • Experience with Oracle Database 12c, 19c, RAC, ASM, Data Guard, and performance tuning
  • Experience with Enterprise Manager for monitoring and administering critically important databases
  • Experience with database backup and recovery
  • Experience with database patching and upgrades
  • Experience with Unix shell, Perl, and Python scripting
  • Experience with Oracle SQL and PL/SQL
  • Experience administering databases on Exadata is a strong plus
  • Experience with Public Cloud
  • Understanding of REST APIs
  • Knowledge of standard processes and DevOps methodologies in an always-up, always-available service
  • Strong troubleshooting skills to investigate errors and performance bottlenecks
  • Excellent written and verbal communication and documentation skills

Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Design and develop architectures, standards, and methods for large-scale distributed systems. Facilitate service capacity planning and demand forecasting, software performance analysis, and system tuning.

Career Level - IC4

Responsibilities

Work with the Site Reliability Engineering (SRE) team on the shared full-stack ownership of a collection of services and/or technology areas. Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services. Responsible for the design and delivery of the mission-critical stack, with a focus on security, resiliency, scale, and performance. Authority for end-to-end performance and operability. Partner with development teams in defining and implementing improvements in service architecture. Articulate technical characteristics of services and technology areas and guide development teams to engineer and add premier capabilities to the Oracle Cloud service portfolio. Understand and communicate the scale, capacity, security, performance attributes, and requirements of the service and technology stack. Demonstrate a clear understanding of automation and orchestration principles. Act as the ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs). Use a deep understanding of service topology and dependencies to troubleshoot issues and define mitigations. Understand and explain the effect of product architecture decisions on distributed systems. Professional curiosity and a desire to develop a deep understanding of services and technologies.



