Site Reliability Engineer
2 days ago
Senior Site Reliability Engineer

We are looking for a Senior Site Reliability Engineer to build and maintain reliable, high‑capacity, and high‑performing systems that support our mission to protect and improve customer platforms, with a strong focus on reliability, security, performance, cost, and operational excellence. As a Site Reliability Engineer on a small team, you will collaborate in a DevOps model with product development teams, designing, deploying, and managing automation tools that improve predictability and accelerate time to market while reducing costs.

Key Responsibilities

Cloud Engineering
- Hands‑on design, analysis, development, and troubleshooting of highly distributed, large‑scale production systems and event‑driven, cloud‑based services.
- Ensure repeatability, traceability, and transparency of our infrastructure automation (infrastructure‑as‑code, monitoring‑as‑code).
- Participate in continual learning of the AWS ecosystem, game day scenarios, and professional conferences.
- Collaborative solutioning of enterprise applications with development teams utilizing our software stack.
- Actively monitor AWS cost and utilize optimizer to maximize ROI while maintaining Service Level Objectives.

Observability Engineering
- Ownership of reliability, uptime, system security, cost, operations, capacity, resiliency, and performance analysis thereof.
- Define, monitor, and report on service level indicators for application workloads.
- Support on‑call rotations for operational duties that have not been addressed with automation, with an eye for correcting issues that result in on‑call alarms.
- Maintain telemetry that improves visibility into our applications' performance and business metrics, and keep operational workload in check.
- Develop, communicate, collaborate on, and monitor standard processes to promote the long‑term health and sustainability of operational development tasks.

DevSecOps
- Support healthy software development practices, including complying with agile software development methodology and building standards for code reviews, work packaging, and continuous delivery.
- Partner with Cybersecurity and develop plans and automation to respond to new risks and vulnerabilities.
- Collaborate with Systems Admins to coordinate middleware, network, storage, database, Windows, Linux, and VMware maintenance.
- Automate legacy on‑prem system maintenance and migrate to cloud via thoughtful redesign.
- Collaborate with dev teams to identify failure points and blast radius of systems.
- Validate effectiveness of monitoring and observability configurations.
- Observe and document steady‑state production levels and growth patterns.
- Plan and forecast for seasonal growth, communicate trend lines with leadership, and enhance infrastructure scaling plans to accommodate 2x planned load.
- Coordinate improvements of existing software and infrastructure to meet resiliency goals.

Must Haves
- 3+ years of experience as a Site Reliability Engineer or in a similar role.
- Strong experience with Terraform, AWS EKS, and Kubernetes.
- Ability to work with stakeholders and experience leading P1 and P2 teams.
- Experience supporting on‑call rotations for operational duties that are not yet automated, with a focus on resolving issues that trigger on‑call alerts.
- Ability to collaborate effectively with cross‑functional teams.

Nice to Haves (Team Stack)
- Code: Java, PHP, Node, and Golang
- Containers: ECS, Docker
- Cloud: Amazon AWS
- Telemetry: New Relic, CloudWatch
- Build: Jenkins, CircleCI, GitHub Actions
- Infrastructure‑As‑Code: CloudFormation

Seniority level: Mid‑Senior level
Employment type: Full‑time
Job function: Information Technology
Industries: IT Services and IT Consulting
-
Site Reliability Engineer
3 days ago
Ciudad de México | Atos | Full-time
- Publication Date: Jan 8, 2025
- Ref. No:
- Location: Mexico City, MX
- Some scripting experience in languages such as Java, Python, or shell.
- 3+ years of significant experience working as a Site Reliability Engineer
- Strong in Terraform, Ansible, Packer,...
-
Site Reliability Engineer
5 days ago
Ciudad de México | Atos | Full-time
- Publication Date: Jan 8, 2025
- Ref. No: 523940
- Location: Mexico City, MX
- Some scripting experience in languages such as Java, Python, or shell.
- 3+ years of significant experience working as a Site Reliability Engineer
- Strong in Terraform,...
-
Site Reliability Engineer
4 weeks ago
Ciudad de México | UST | Full-time
-
Site Reliability Engineer
2 weeks ago
Ciudad de México | Royal Caribbean Group | Full-time
Journey with us! Combine your career goals and sense of adventure by joining our incredible team...
-
Site Reliability Engineer
3 days ago
Ciudad de México | Atos | Full-time
- Publication Date: Jan 14, 2025
- Ref. No: 523941
- Location: Mexico City, MX
Eviden, part of the Atos Group, with an annual revenue of circa €5 billion, is a global leader in data-driven, trusted, and sustainable digital transformation. As a next generation digital business with worldwide...
-
Site Reliability Engineer
2 weeks ago
Ciudad de México | Royal Caribbean Group | Full-time
Journey with us! Combine your career goals and sense of adventure by joining our incredible team of employees at Royal Caribbean Group. We are proud to offer a competitive compensation and benefits package, and excellent career development...
-
Site Reliability Engineer
2 weeks ago
Ciudad de México | Tata Consultancy Services | Full-time
We are looking for a Site Reliability Engineer (SRE) to join our team and help us ensure seamless, high-performing, and reliable technology operations.
What you'll work with:
- Azure DevOps: pipelines, repositories, and automation
- ServiceNow: incident, change, and problem management
- AppDynamics: application performance monitoring and alerting
- Microsoft...
-
Senior Site Reliability Engineer
2 weeks ago
Ciudad de México | Royal Caribbean Group | Full-time
Journey with us! Combine your career goals and sense of adventure by joining our incredible team of employees at Royal Caribbean Group. We are proud to offer a competitive compensation and benefits package, and excellent career...
-
Senior Site Reliability Engineer
2 weeks ago
Ciudad de México | Royal Caribbean Group | Full-time
Journey with us! Combine your career goals and sense of adventure by joining our incredible team at Royal Caribbean Group. We offer a competitive compensation and benefits package, along with excellent career development...
-
Site Reliability Engineer
3 weeks ago
Ciudad de México | Thomson Reuters | Full-time
Are you passionate about the chance to bring your extensive technical experience to drive the Site Reliability Engineering team using industry best practices in a world-class company? Thomson Reuters ONESOURCE Platform's SRE team is looking for a Site Reliability Engineer who will provide hands-on technical skills and share industry best practices with...