Site Reliability Developer

2 weeks ago


Guadalajara, México Oracle Full-time

Site Reliability Developer-22000D5T

**Applicants are required to read, write, and speak the following languages:** English

**Preferred Qualifications**

Site Reliability Engineer - IC3

**Oracle's Demonstration Services**
**Site Reliability Engineering Team**

**As part of the SRE team you...**

- Monitor and manage uptime, end-to-end performance, and operability of all service processes and dependent infrastructure to meet SLAs
- Solve complex problems related to infrastructure cloud services to prevent problem recurrence
- Contribute to making our infrastructure simple, reliable, and easy to operate
- Document your system knowledge as you acquire it over time, create runbooks, and ensure critical system information is readily available to those who need it, then turn it into repeatable actions and finally into automation
- Conduct periodic on-call duties, respond to production incidents, and provide support for development to address customer incidents
- Be results driven and thrive in a development environment that is agile, collaborative, and in start-up mode, even when faced with ambiguity
- Possess a contagious sense of ownership and be capable of using all available tools to solve any issues you encounter
- Participate in the development of tools and processes that leverage observability best practices to proactively identify and resolve issues before they become incidents
- Model and maintain our Autonomous Data Warehouse and data flows. You will also develop, maintain, and debug our internal reporting system on Oracle Analytics Cloud

**Your Skills...**

- You have a bachelor’s degree in Computer Science, Software Engineering, Information Systems, or equivalent and 4+ years of relevant work experience
- You have worked in an SRE/DevOps role and managed highly complex production environments at scale
- You have practical experience with continuous integration and continuous delivery methodologies, using tools like GitLab, Jenkins, or others
- You have hands-on experience with orchestration and configuration management tools such as Ansible, Terraform, Puppet, or others
- You have experience monitoring and analyzing infrastructure performance using standard performance monitoring tools: Prometheus, Alertmanager, Grafana
- You are knowledgeable about networking concepts: DNS, load balancing, VCN, firewalls, proxy servers, etc.
- You are familiar with Linux and its administration life cycle: deployment, upgrading, compiling, and debugging
- You are adept in one or more of the following languages: JavaScript, NodeJS, Java, Python, Perl, Go, Shell Scripting
- You are able and willing to work in an on-call rotation that will include rotating weekend coverage
- You can operate independently, make decisions, take action, and take responsibility
- You have effective communication and interpersonal skills and can work and coordinate between multiple teams
- You have a software-centric mindset

**Your Bonus Skills...**

- You have a master’s degree in Computer Science or related studies
- You have experience working with major cloud platform(s): Oracle Cloud, Microsoft Azure, Google Cloud Platform, or AWS; any certification(s) are a plus
- You have experience with container and container management technologies: Docker, Kubernetes
- You are adept with SQL, PL/SQL, and query performance tuning
- You have experience with data modeling, analytics, and report building
- You have a solid foundation in database administration and are comfortable with the complete database life cycle, including provisioning, backup and recovery, cloning, performance tuning, maintenance, and troubleshooting

**Detailed Description and Job Requirements**

Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Design and develop architectures, standards, and methods for large-scale distributed systems. Facilitate service capacity planning and demand forecasting, software performance analysis, and system tuning.

Work with the Site Reliability Engineering (SRE) team on the shared full stack ownership of a collection of services and/or technology areas. Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services. Responsible for the design and delivery of the mission critical stack, with a focus on security, resiliency, scale, and performance. Authority for end-to-end performance and operability. Partner with development teams in defining and implementing improvements in service architecture. Articulate technical characteristics of services and technology areas and guide Development Teams to engineer and add premier capabilities to the Oracle Cloud service portfolio. Understand and communicate the scale, capacity, security, performance attributes, and requirements of the service and technology stack. Demo



  • Guadalajara, México Oracle Full-time

    Site Reliability Developer- Z **Applicants are required to read, write, and speak the following languages:** English **Preferred Qualifications** We're looking for a Site Reliability Engineer (SRE) to join our team and develop automated software solutions for the operational aspects of an organization. - Incorporate SRE and DevOps practices to develop and...


  • Guadalajara, Jalisco, México Oracle Full-time

    Description: Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Design and develop architectures, standards, and methods for large-scale distributed systems....


  • Guadalajara, México Oracle Full-time

    Site Reliability Developer *******D2D **Applicants are required to read, write, and speak the following languages:** English **Preferred Qualifications** Department Description: Compute Substrate of OCI - We are Cloud builders. Compute is the core organization within OCI. We are responsible for providing the Compute hosts that power the Cloud. The Compute Substrate...


  • Guadalajara, México Oracle Full-time

    Site Reliability Developer 3-22000D1W **Applicants are required to read, write, and speak the following languages:** English **Preferred Qualifications** The Oracle Cloud Infrastructure (OCI) team can provide you the opportunity to build and operate a suite of massive scale, integrated cloud services in a broadly distributed, multi-tenant cloud environment....

  • Site Reliability Engineer

    2 weeks ago


    Guadalajara, México Valce Talent Solutions Full-time

    We are looking for a Lead Site Reliability Engineer who takes the initiative on developing and maintaining the systems and services for our Cash Management Platform, automating the deployment process, ensuring system scaling, investigating and resolving outages, identifying and implementing preventive measures proactively, collaborating with key stakeholders,...

  • Site Reliability Engineer

    4 weeks ago


    Guadalajara, México Valce Talent Solutions Full-time

    We are looking for a Lead Site Reliability Engineer who takes the initiative on developing and maintaining the systems and services for our Cash Management Platform, automating the deployment process, ensuring system scaling, investigating and resolving outages, identifying and implementing preventive measures proactively, collaborating with key stakeholders,...


  • Guadalajara, México Intel Full-time

    Come and join a dynamic and challenging team within the Intel Data Center and Artificial Intelligence Group focused on engineering, developing, and supporting world class platforms and component building blocks aligned to the Data Center roadmap and strategies. We are seeking a well-rounded Site Reliability Engineer to work with a team of architects and other...

  • Site Reliability Engineer

    3 weeks ago


    Guadalajara, México F5 Full-time

    Everything we do centers around people. That means we obsess over how to make the lives of our customers, and their customers, better. And it means we prioritize a diverse F5 community where each individual can thrive. - Site Reliability Engineer III Why do you want to join our team? - Everything we do centers around people. That means we obsess over how to make...


  • Guadalajara, México F5 Full-time

    Everything we do centers around people. That means we obsess over how to make the lives of our customers, and their customers, better. And it means we prioritize a diverse F5 community where each individual can thrive. - Site Reliability Engineer III Why do you want to join our team? - Everything we do centers around people. That means we obsess over how to...