Principal Site Reliability Engineer
2 weeks ago
**Responsibilities**

- Solve complex problems related to Linux infrastructure and Oracle Cloud Infrastructure
- Act as the escalation point for critical issues that may not have a documented procedure, and provide Root Cause Analysis (RCA)
- Understand the end-to-end configuration, technical dependencies, and characteristics of production infrastructure and services
- Quickly grasp and analyze new technologies that are sophisticated and constantly evolving, and integrate them into automation and infrastructure support
- Design and deliver mission-critical automation, with a focus on security, resiliency, scale, and performance
- Identify opportunities and drive the implementation of automation to improve service health, availability, and reliability
- Author functional and technical documentation and standard operating procedures (SOPs)
- Collaborate with development teams in defining and implementing improvements in service architecture
- Articulate technical characteristics of services and technology areas, and guide multi-functional teams to engineer and add capabilities to internal tools
- Partner with DevOps teams, Oracle Cloud Infrastructure deployment teams, and development teams to identify and resolve issues

**Knowledge and Skills**

- Proven experience in Site Reliability Engineering and automation
- Experience in Linux administration with good knowledge of kernel-level debugging
- Experience in debugging operating system performance issues and performance tuning
- Experience working with fault-tolerant, highly available, high-efficiency, distributed, and scalable systems
- Expertise in developing scripts, utilities, and tools to automate routine or manually intensive tasks
- Experience in cloud infrastructure technologies
- Experience in operations and problem management
- Development experience using Python and building infrastructure using Terraform
- Experience working with global teams across different time zones
- Strong logical-thinking skills, intellectual curiosity, and a strong drive for self-development
- Ability to be a good teammate and the desire to learn and implement new cloud technologies as needed
- Good understanding of Agile software development principles, including common tools such as JIRA
- Good understanding of cloud security and compliance management, including patching
- Excellent interpersonal, verbal, and written communication skills

**Qualifications required**

- Proven experience working in an IT Operations/Infrastructure team
- A Bachelor's degree in Computer Science, Computer Engineering, Software Engineering, or a related area is helpful
-
Site Reliability Engineer
1 week ago
Zapopan, México | Oracle | Full-time
About the Job: At Oracle, we're seeking a talented and skilled Site Reliability Engineer to work on the Oracle Cloud Observability and Management platform. As a Site Reliability Engineer, you will solve interesting technical challenges by designing, deploying, and troubleshooting key Cloud services, platforms, and infrastructure, always thinking about reliability,...
-
Principal Site Reliability Engineer
1 week ago
Zapopan, México | Oracle | Full-time
**Responsibilities** - Solve complex problems related to Linux infrastructure and Oracle Cloud Infrastructure - Act as escalation point for critical issues that may not have a documented procedure and provide Root Cause Analysis (RCA) - Understand the end-to-end configuration, technical dependencies, characteristics of production infrastructure and...
-
Principal Site Reliability Engineer
2 weeks ago
Zapopan, México | Oracle | Full-time
The role provides a mixture of production platform Operations ownership as well as engineering. You will solve challenging technical problems, identify improvements, and work on implementing your recommendations. You will also work directly with high-level developers on projects and work to blur the lines between traditional system operations and development...
-
Principal Site Reliability Engineer
1 week ago
Zapopan, México | Oracle | Full-time
The role provides a mixture of production platform Operations ownership as well as engineering. You will solve challenging technical problems, identify improvements, and work on implementing your recommendations. You will also work directly with high-level developers on projects and work to blur the lines between traditional system operations and development...
-
Site Reliability Developer 4
1 day ago
Zapopan, Jalisco, México | Oracle | Full-time
We are looking for a skilled and motivated Cloud Region Build Site Reliability Engineer (SRE) to join our Oracle Cloud Infrastructure Region Build team. In this role, you will be responsible for building, deploying, and maintaining compute cloud infrastructure services across multiple regions to ensure high availability, scalability, and performance. You...
-
Site Reliability Engineer
4 weeks ago
Zapopan, México | GrainChain Inc | Full-time
We're looking for you: join GrainChain! We are seeking a Site Reliability Engineer able to integrate and automate the development and operations areas, ensuring the quality and delivery of software solutions. We are a technology company that helps the agricultural industry close the digital gap, with various platforms that...
-
Site Reliability Engineer
1 day ago
Zapopan, México | GrainChain Inc | Full-time
We're looking for you: join GrainChain! We are seeking a Site Reliability Engineer able to integrate and automate the development and operations areas, ensuring the quality and delivery of software solutions. We are a technology company that helps the agricultural industry close the digital gap, with various platforms that...
-
Sr Principal Site Reliability Developer
3 weeks ago
Zapopan, México | Oracle | Full-time
Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Design and develop architectures, standards, and methods for large-scale distributed systems. Facilitate...
-
Site Reliability Developer
6 days ago
Zapopan, México | Oracle | Full-time
We are looking for a skilled and motivated Cloud Region Build Site Reliability Engineer (SRE) to join our Oracle Cloud Infrastructure Region Build team. In this role, you will be responsible for building, deploying, and maintaining compute cloud infrastructure services across multiple regions to ensure high availability, scalability, and performance. You...
-
Site Reliability Developer
6 days ago
Zapopan, México | Oracle | Full-time
We are looking for a skilled and motivated Cloud Region Build Site Reliability Engineer (SRE) to join our Oracle Cloud Infrastructure Region Build team. In this role, you will be responsible for building, deploying, and maintaining compute cloud infrastructure services across multiple regions to ensure high availability, scalability, and performance. You...