Site Reliability Engineer
3 weeks ago
We are hiring a development-oriented, collaborative, and detail-focused Site Reliability Engineer (SRE) responsible for solving operational, scalability, and reliability challenges. In this role, you will apply software engineering methodologies to system administration processes and collaborate with software engineers and product developers to optimize system performance, stability, and reliability. The ideal candidate will focus on improving and automating operational tasks while ensuring system availability and scalability. You will manage critical aspects such as latency, performance efficiency, monitoring, emergency response, capacity planning, and change management alongside your team. We are seeking a proactive individual with strong leadership, resource administration, and communication skills who thrives in a team-oriented environment. A background in development, combined with hands-on SRE or DevOps experience, is essential.

What will you do?
Gain a deep understanding of our platform, how it serves our clients, and how they interact with it.
Monitor and maintain system availability, performance, and overall health.
Build tools and systems to automate infrastructure management and operations.
Run production environments with a holistic view of reliability, uptime, and scalability.
Implement Infrastructure as Code (IaC) using tools like Terraform.
Develop and manage CI/CD pipelines for seamless code integration and deployment.
Create and maintain robust monitoring, alerting, and logging frameworks using tools such as New Relic, SumoLogic, Pingdom, CloudWatch, and CloudTrail.
Lead incident response efforts, perform root cause analysis, and implement preventative measures.
Participate in on-call rotations and ensure proper incident management and escalation.
Collaborate with developers to enhance release processes, testing, and deployment automation.
Document operational processes and create detailed runbooks/playbooks for emergency response.
Measure and optimize system performance using SLOs, SLIs, and key metrics.

Requirements:
5–7 years of proven experience in a Site Reliability Engineering or DevOps role.
Bachelor's Degree in Computer Science, Engineering, or related field, or equivalent practical experience.
Advanced English communication skills, both verbal and written.
Background in software development (no longer a full-time developer but with hands-on past experience).
Expert-level experience with AWS (mandatory) and cloud-native technologies.
Strong understanding of Linux system internals, networking, distributed systems, and service-oriented architectures.
Proficiency in containerization and orchestration technologies (e.g., Docker, Kubernetes).
Hands-on experience with Infrastructure as Code (IaC), particularly with Terraform.
Experience with relational databases (MSSQL, MySQL, Aurora MySQL) and NoSQL (especially DynamoDB).
Knowledge of observability concepts, including metrics, logging, tracing, SLOs, and SLIs.
Familiarity with CI/CD tools (e.g., Jenkins, CodePipeline, CodeDeploy).
Ability to lead and influence technical decisions in a cross-functional team environment.
Proactive mindset with strong problem-solving and automation skills.
Passion for continuous improvement, scalability, and operational excellence.
-
Site Reliability Engineer
2 weeks ago
Mexico City | UST | Full-time
-
Site Reliability Engineer
2 days ago
Mexico City | Royal Caribbean Group | Full-time
Journey with us! Combine your career goals and sense of adventure by joining our incredible team...
-
Site Reliability Engineer
3 weeks ago
Mexico City | Atos | Full-time
**Site Reliability Engineer**
Publication Date: Jan 14, 2025
Ref. No:
Location: Mexico City, MX
Eviden, part of the Atos Group, with an annual revenue of circa €5 billion, is a global leader in data-driven, trusted and sustainable digital transformation. As a next generation digital business with worldwide leading...
-
Site Reliability Engineer
2 days ago
Mexico City | Royal Caribbean Group | Full-time
Journey with us! Combine your career goals and sense of adventure by joining our incredible team of employees at Royal Caribbean Group. We are proud to offer a competitive compensation and benefits package, and excellent career development...
-
Site Reliability Engineer
2 days ago
Mexico City | Tata Consultancy Services | Full-time
We are looking for a Site Reliability Engineer (SRE) to join our team and help us ensure seamless, high-performing, and reliable technology operations.
What you’ll work with:
Azure DevOps - Pipelines, repositories, and automation
ServiceNow - Incident, change, and problem management
AppDynamics - Application performance monitoring and alerting
Microsoft...
-
Site Reliability Engineer
3 weeks ago
Mexico City | Royal Caribbean Group | Full-time
**Journey with us!** Combine your career goals and sense of adventure by joining our incredible team of employees at **Royal Caribbean Group**. We are proud to offer a competitive compensation and benefits package and excellent career development opportunities, each offering unique ways to explore the world. We are proud to be the vacation-industry leader with...
-
Senior Site Reliability Engineer
2 days ago
Mexico City | Royal Caribbean Group | Full-time
Journey with us! Combine your career goals and sense of adventure by joining our incredible team of employees at Royal Caribbean Group. We are proud to offer a competitive compensation and benefits package, and excellent career...
-
Senior Site Reliability Engineer
2 days ago
Mexico City | Royal Caribbean Group | Full-time
Journey with us! Combine your career goals and sense of adventure by joining our incredible team at Royal Caribbean Group. We offer a competitive compensation and benefits package, along with excellent career development...
-
Site Reliability Engineer
6 days ago
Mexico City | Thomson Reuters | Full-time
Are you passionate about the chance to bring your extensive technical experience to drive the Site Reliability Engineering team using industry best practices in a world-class company? Thomson Reuters ONESOURCE Platform’s SRE team is looking for a Site Reliability Engineer who will provide hands-on technical skills and share industry best practices with...