Senior Reliability Engineer
3 days ago
Senior Reliability Engineer-22000DX9
**Applicants are required to read, write, and speak the following languages**: English
**Preferred Qualifications**
Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Develop designs, architectures, standards, and methods for large-scale distributed systems. Facilitate service capacity planning and demand forecasting, software performance analysis, and system tuning.
Work with the Site Reliability Engineering (SRE) team on the shared full stack ownership of a collection of services and/or technology areas. Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services. Responsible for the design and delivery of the mission critical stack, with a focus on security, resiliency, scale, and performance. Authority for end-to-end performance and operability. Partner with development teams in defining and implementing improvements in service architecture. Articulate technical characteristics of services and technology areas and guide development teams to engineer and add premier capabilities to the Oracle Cloud service portfolio. Understand and communicate the scale, capacity, security, performance attributes, and requirements of the service and technology stack. Demonstrate a clear understanding of automation and orchestration principles. Act as the ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs). Use a deep understanding of service topologies and their dependencies to troubleshoot issues and define mitigations. Understand and explain the effect of product architecture decisions on distributed systems. Bring professional curiosity and a desire to develop a deep understanding of services and technologies.
A BS or MS in Computer Science, or equivalent. Knowledge of server hardware and software configuration, networking, standard internet services, scripting languages, cloud computing patterns, and technology security and compliance. Understanding of load balancing technologies and experience with development in programming languages, databases and big data stores, and container technologies. Work involves defining and documenting the technical architecture of complex and highly scalable products. A minimum of 5 years of experience running large-scale, customer-facing web services.
Be comfortable with mission critical production issues and manage customer anxiety appropriately. We would like to see some combination of the following skills:
- 5+ years of software design or development experience, or a DevOps role, with distributed, highly scalable, maximum-availability (HA, brownout), multi-node environments (partitioning, isolation with VLAN, pkeys, QinQ, VRF, EVPN)
- On-call experience
- Knowledge of server virtualization technologies: Xen, KVM, Linux containers, Docker, including vNUMA, domain groups, SR-IOV
- Knowledge of Linux kernel internals (memory management, scheduler, builds), TCP/IP networking stack, InfiniBand/OFED architecture (RDS, RoCE v2, OCFS2), filesystems/volumes
- Familiarity with x86 systems; network switches from Cisco, Arista, Juniper, or Mellanox; L3 top of switch routing (OSPF, BGP); and the Mellanox HCA (CX3, CX5 and newer) programmer's guide
- Experience working with Cloud infrastructure APIs, REST API model, and developing REST APIs
- Demonstrated experience with Java, as well as strong experience with scripting languages such as Python and Bash
- Strong troubleshooting and performance tuning skills, OPS or system administration
Knowledge of any of the following areas is a plus:
- Understanding of the latest features of Exadata / Engineered Systems, Oracle Grid Infrastructure, and Database
- Familiarity with OpenStack and/or other cloud infrastructure products
- Understanding and experience of cloud networking and security architectures (such as Application Firewall, IPSec VPN, NAT, IPv6, WebSockets, TLS, certificates, tunneling protocols)
- Strong understanding of I/O characteristics and storage systems
- A background in multi-tenant service offerings and Service Level Availability concepts is a strong plus
- PCI, HIPAA audits, UK gov, security vulnerabilities remediation
-
Senior Network Reliability Engineer
7 months ago
Guadalajara, México AstraZeneca Full-time **Senior Network Reliability Engineer** **About the AstraZeneca** AstraZeneca is a global, innovation-driven biopharmaceutical business that focuses on the discovery, development, and commercialization of prescription medicines for some of the world's most serious diseases. But we're more than one of the world's leading pharmaceutical companies. At...
-
Senior Network Reliability Engineer
7 months ago
Guadalajara, México AstraZeneca Full-time **WHY JOIN US** We’re a network of entrepreneurial self-starters who contribute to something far bigger. There’s a diversity of expertise in our Technology group that’s unique to AstraZeneca - it allows us to dive deep into exploring new leading-edge technology. A place to be open and transparent - we speak up, think creatively, and share ideas. Our...
-
Site Reliability Engineer
2 months ago
Guadalajara, Jalisco, México Tech Holding Full-time About Us Tech Holding is a full-service consulting firm that delivers predictable outcomes and high-quality solutions to clients. Our team has industry experience and holds senior positions in various companies, including emerging startups and large Fortune 50 firms. Our unique approach is supported by the principles of deep expertise, integrity, transparency,...
-
Sr Site Reliability Engineer
2 months ago
Guadalajara, México f5 Full-time Everything we do centers around people. That means we obsess over how to make the lives of our customers, and their customers, better. And it means we prioritize a diverse F5 community where each individual can thrive. Business/Job Title: Senior Site Reliability Engineer Position Summary Software engineering is a core discipline at F5 for many roles. As a...
-
Senior Site Reliability Engineer
7 months ago
Guadalajara, México C3 AI Full-time We are looking for a Senior Site Reliability Engineer to join our team in Guadalajara. **Responsibilities**: - Maximize system uptime and availability, ensuring functional and performance SLAs. - Establish end-to-end monitoring and alerting on all critical aspects. - Solve complex problems for critical services and build automation to prevent problem...
-
Senior Site Reliability Engineer
5 days ago
Guadalajara, México Nextiva Full-time At Nextiva, we create connected communication tools that help businesses stay in touch with their customers and teams. Over 100,000 companies rely on Nextiva for phone service and customer management tools. We're not your parent's phone company. Founded in 2008, Nextiva took on the trillion-dollar telecom industry and succeeded in changing the game by...
-
Senior Site Reliability Engineer
6 months ago
Guadalajara, México Tech Holding Full-time **About us**: Working at Tech Holding isn't just a job, it's an opportunity to be a part of something bigger. We are a full-service consulting firm that was founded on the premise of delivering predictable outcomes and high-quality solutions to our clients. Our founders and team members have industry experience and have held senior positions in a wide...
-
Senior Reliability Engineer
6 days ago
Guadalajara, México Oracle Full-time Senior Reliability Engineer-22000E28 **Applicants are required to read, write, and speak the following languages**: English **Preferred Qualifications** The Database cloud service team can provide you the opportunity to build and operate a suite of massive scale, integrated cloud services in a broadly distributed, multi-tenant cloud environment. Oracle...
-
Site Reliability Engineer
2 months ago
Guadalajara, México f5 Full-time Everything we do centers around people. That means we obsess over how to make the lives of our customers, and their customers, better. And it means we prioritize a diverse F5 community where each individual can thrive. - Site Reliability Engineer III Why do you want to join our team? - Everything we do centers around people. That means we obsess over how to...
-
Site Reliability Engineer
6 months ago
Guadalajara, México Finastra USA Corporation Full-time **Responsibilities**: **What will you contribute?** As a Site Reliability Engineer your mission is to protect and advance the software & systems behind Finastra’s Cloud hosted services running on Fusion Operate. Finastra believes in a blameless culture where the primary objective is continuous improvement. You’ll be treating operations as a software...
-
Senior Network Reliability Engineer
4 months ago
Guadalajara, México AstraZeneca Full-time **Positions are open to Mexican Citizens and official residents of Mexico.** **Location: Guadalajara (hybrid)** **Strong English interpersonal skills required**: **About the AstraZeneca**: **AstraZeneca is a global, science-led, patient-focused pharmaceutical company that focuses on the discovery, development, and commercialization of prescription medicines...
-
Senior Network Reliability Engineer
2 weeks ago
Guadalajara, Jalisco, México Capgemini Engineering Full-time About the Role We are seeking a seasoned Senior Network Reliability Engineer to join our Capgemini Engineering team. In this role, you will be responsible for designing, implementing, and maintaining highly available and scalable network infrastructure for our production and development environments. Key Responsibilities Design and implement large-scale network...
-
Site Reliability Engineer
3 weeks ago
Guadalajara, México Valce Talent Solutions Full-time We are looking for a Lead Site Reliability Engineer who takes the initiative on developing and maintaining the system and services for our Cash Management Platform, automating the deployment process, ensuring system scaling, investigating and resolving outdates, identifying and implementing preventive measures proactively, collaborating with key stakeholders,...
-
Site Reliability Engineer
2 months ago
Guadalajara, México f5 Full-time Everything we do centers around people. That means we obsess over how to make the lives of our customers, and their customers, better. And it means we prioritize a diverse F5 community where each individual can thrive. Business/Job Title: Site Reliability Engineer - IAM - III Position Summary: Software engineering is a core discipline at F5 for many...
-
Senior .NET Software Engineer
1 month ago
Guadalajara, Jalisco, México Cognizant Full-time Senior .NET Software Engineer - Mexico We are seeking a highly skilled and experienced individual to join our team as a Senior .NET Software Engineer in Mexico City, Guadalajara, and Monterrey. Overview: Cognizant is a leading global company that provides consulting and IT services. As a Senior .NET Software Engineer, you will play a crucial role in supporting...
-
Senior iOS Engineer
3 days ago
Guadalajara, México Brillio Full-time **Senior iOS Engineer**: **About Brillio**: **Senior iOS Engineer** **Primary Skills**: - iOS Native **Secondary Skills**: - Jenkins, Objective C, Swift **Specialization**: - Mobile - iOS: Senior Engineer, XT **Job requirements**: - Experience building software with Redux or other unidirectional state management paradigms - Experience writing with...
-
Systems Reliability Engineer II
3 days ago
Guadalajara, México f5 Full-time Everything we do centers around people. That means we obsess over how to make the lives of our customers, and their customers, better. And it means we prioritize a diverse F5 community where each individual can thrive. But our success isn’t driven solely by what we do. We also care deeply about how we do it. At F5, our culture is how we live, every single...
-
Site Reliability Engineer
7 months ago
Guadalajara, México Finastra Full-time Your deliverables as a Site Reliability Engineer will include, but are not limited to, the following: - Work with containers and container orchestration systems such as Kubernetes - Capacity Planning to determine resource requirements of your service for it to be scalable, efficient, and reliable - Collaborate with other engineers to implement operational...
-
Infrastructure Engineer
2 months ago
Guadalajara, Jalisco, México Broadridge Full-time Broadridge fosters a culture where innovation meets reliability, empowering associates to drive scalable solutions. **Job Overview** We are seeking an experienced Infrastructure Engineer - Site Reliability to join our team. As a key member of our SRE group, you will be responsible for designing and implementing scalable and highly reliable software...
-
Site Reliability Engineer
2 days ago
Guadalajara, México Oracle Full-time Site Reliability Engineer-2200059K **Applicants are required to read, write, and speak the following languages**: English, Spanish **Preferred Qualifications** Are you a seasoned Site Reliability Engineer or Cloud DevOps guru? Are you a backup, restore and recovery expert? If you are, we are looking for you to join our exciting growing Cloud DevOps...