Lead Site Reliability Engineer
2 weeks ago
Journey with us
Combine your career goals and sense of adventure by joining our incredible team of employees at Royal Caribbean Group. We are proud to offer a competitive compensation and benefits package, and excellent career development opportunities, each offering unique ways to explore the world.
We are proud to be the vacation-industry leader with global brands — including Royal Caribbean International, Celebrity Cruises and Silversea Cruises — the most innovative fleet and private destinations, and the best people. Together, we are dedicated to turning the vacation of a lifetime into a lifetime of vacations for our guests.
Royal Caribbean Group's Global eCommerce has an exciting career opportunity for a full-time Lead Site Reliability Engineer reporting to the Sr. Manager, Site Reliability Engineer.
This position will work on-site in Mexico City.
Position Summary:
The Lead Site Reliability Engineer (Lead SRE) will assist the SRE Manager in supporting the Royal Caribbean website ($183M gross revenue in 2021), using application and user performance data to guide informed decision-making. The Lead SRE will use site performance metrics collected from various sources and tools to support tasks such as the initial triage of critical production incidents, bug analysis, implementation of site reliability engineering best practices, infrastructure optimization, and seamless collaboration between internal teams and external service providers, among other operational initiatives.
Essential Duties and Responsibilities:
Critical Incident Support
Review ticket analysis and approve closure of tickets/incidents.
Understand the architecture of the Royal Caribbean website and escalate incidents as needed to the appropriate team for further triage.
Synthesize and communicate incident details to the production team and stakeholders, including executive-level stakeholders.
Review the postmortem/RCA document and follow up.
Monitor and Optimize Systems
Build the case for prioritizing bug and enhancement tickets.
Create reports on new deployment build performance for product teams to ensure quality.
Ensure System Reliability and Performance
Adjust health thresholds and other monitoring settings based on historical performance.
Create and maintain performance dashboards used by support and product teams.
Maintain the alerting, communication, and documentation toolchain to ensure it is up to date and efficient.
Collaboration with Cross-Functional Teams
Establish and maintain clear communication channels (e.g., Slack, Teams) with the scrum and marketing teams.
Ensure all team members are informed about relevant updates and changes that may affect the website.
Qualifications, Knowledge and Skills:
Experience
Minimum Years of Experience: 10+ years in Site Reliability Engineering (SRE), DevOps, or a related IT operations role.
Management Experience:
At least 3 years of experience managing teams and collaborating with external service providers.
Skills and Abilities
Technical Expertise:
Proficiency in cloud platforms and services such as AWS, including AWS Elastic Beanstalk.
Understanding of API design principles: REST, SOAP, GraphQL.
Advanced knowledge of monitoring and logging tools (AppDynamics, Datadog, Splunk, New Relic, etc.).
Problem-Solving Skills:
Strong analytical and troubleshooting skills to diagnose and resolve complex production issues swiftly.
Ability to develop and implement effective incident response plans.
Communication and Collaboration:
Excellent written and verbal communication skills for effective interaction with cross-functional teams and documentation.
Ability to collaborate with Development, QA, IT, and external managed service providers to ensure seamless operations.
Education
Bachelor's Degree:
In Computer Science, Information Technology, Engineering, or a related field.
Certifications
Preferred Certifications:
Any certification in monitoring and alerting tools, or equivalent knowledge
Any certification or equivalent knowledge of IT service management
We know there's a lot to consider.
As you go through the application process, our recruiters will be glad to provide guidance and more relevant details to answer any additional questions. Thank you again for your interest in Royal Caribbean Group. We hope to see you onboard soon!
It is the policy of the Company to ensure equal employment and promotion opportunity to qualified candidates without discrimination or harassment on the basis of race, color, religion, sex, age, national origin, disability, sexual orientation, sexuality, gender identity or expression, marital status, or any other characteristic protected by law. Royal Caribbean Group and each of its subsidiaries prohibit and will not tolerate discrimination or harassment.