Cloud Service Reliability Engineer
3 weeks ago
Thales people architect identity management and data protection solutions at the heart of digital security. Businesses and governments rely on us to bring trust to the billions of digital interactions they have with people. Our technologies and services help banks exchange funds, people cross borders, energy become smarter, and much more. More than 30,000 organizations already rely on us to verify the identities of people and things, grant access to digital services, analyze vast quantities of information, and encrypt data to make the connected world more secure.

As a **Cloud Service Reliability Engineer**, you will drive the execution and evolution of our Cloud Service Quality (CSQ) framework with a strong emphasis on service reliability, Cloud operations, and best-in-class customer experience. Your role involves implementing Cloud Service Quality program components across all CPL Cloud SaaS products, collaborating with cross-functional teams to understand and evaluate Cloud Service risks, monitor reliability standards, and measure service performance.

**Key Responsibilities**:
- Service Reliability: Implement Cloud Service Quality program components to provide an objective and measurable assessment of cloud service health, as well as to identify best practices to improve operational excellence.
- Incident Management: Lead post-incident analysis to continuously improve the reliability and quality of Cloud services by conducting root cause analysis, implementing corrective and preventative actions for incidents affecting service performance, and ensuring minimal service disruption during outages.
- Data Analytics and KPIs/Metrics: Develop, maintain, and conduct data analytics by defining and implementing insightful business metrics, key performance indicators (KPIs), and dashboards using PowerBI. Monitor KPIs for service resiliency (SLA, Mean Time, Root Cause) and service delivery to inform strategic decisions and drive improvements, including analyzing operational data to enhance cloud performance.
- SLI/SLO Implementation: Provide expertise to assist teams in identifying and implementing effective Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to align with business goals and user experience, with a focus on Cloud operational metrics.
- Managed Supplier Program: Assist in implementing a supplier relationship program for critical cloud service providers, defining firm metrics/targets for responsiveness, root cause analysis (RCA), and prevention, measuring supplier performance, and setting clear expectations for maintenance and issue resolution, including collaboration with suppliers to enhance operational reliability.
- Collaboration: Collaborate with cross-functional teams to understand and evaluate cloud service risks, providing recommendations to enhance resilience and performance.
- Continuous Improvement: Monitor and track progress of continuous improvement actions in both service reliability and Cloud operational practices, ensuring their effective implementation.
- Reporting: Participate in management meetings and provide quality-related updates and insights to the management team.

Secondary Responsibilities:
- Software Quality Support: Contribute to implementing software quality program components and maintaining quality standards across our software products.
- PowerBI Maintenance: Support the maintenance of PowerBI visualizations and reports related to software quality metrics.

**Qualifications**:
- Bachelor’s degree in computer science, engineering, or a related field.
- Proven experience in Cloud Service reliability engineering or a similar role.
- Knowledge of Cloud platforms (e.g., AWS, Azure, GCP) and understanding of Cloud operations best practices.
- Proficiency in PowerBI, data analytics, scripting or programming.
- Familiarity with QA methodologies, such as DevOps, Scaled Agile, and CI/CD models.
- Excellent problem-solving and communication skills.

Education
- Bachelor’s degree (or similar) with a concentration in a discipline that focuses on problem-solving, data analytics, cloud service quality, Information Systems, or equivalent experience.

Competencies
- Data-driven decision-making and visualization.
- Microsoft Office Suite: Word, PowerPoint, Excel, PowerBI.
-
Customer Service Reliability Engineer
2 weeks ago
Federal, México Thales Full-time
Thales people architect identity management and data protection solutions at the heart of digital security. Businesses and governments rely on us to bring trust to the billions of digital interactions they have with people. Our technologies and services help banks exchange funds, people cross borders, energy become smarter, and much more. More than 30,000...
-
Cloud Customer Reliability Engineer
7 days ago
Distrito Federal (Polanco), México Thales Full-time
A global technology company is seeking a Customer Reliability Engineer based in Polanco, Mexico City. The role involves ensuring excellent customer experiences by managing incidents and service requests efficiently. Candidates should have a Bachelor’s degree in a related field and at least 5 years of relevant experience. Proficiency in public cloud...
-
Site Reliability Engineer
2 weeks ago
Distrito Federal, México Royal Caribbean Group Full-time
Talent Acquisition @Royal Caribbean Group. Journey with us! Combine your career goals and sense of adventure by joining our incredible team of employees at Royal Caribbean Group. We are proud to offer a competitive compensation and benefits package, and excellent career development opportunities, each offering unique ways to explore the world. We are proud...
-
Site Reliability Engineer
1 day ago
Distrito Federal, México Sur Global Full-time
Site Reliability Engineer - 100% Remote in Mexico. As the Site Reliability Engineer, you will support and scale the infrastructure powering their secure, mission‑critical SaaS platform. You must be confident in operating and debugging both modern infrastructure (cloud‑native, containerized services) and classic Windows production environments (IIS, SQL...
-
Customer Reliability Engineer
3 days ago
Federal, México Thales Full-time
Thales people architect identity management and data protection solutions at the heart of digital security. Businesses and governments rely on us to bring trust to the billions of digital interactions they have with people. Our technologies and services help banks exchange funds, people cross borders, energy become smarter, and much more. More than 30,000...
-
Site Reliability Engineer
2 weeks ago
Distrito Federal, México BairesDev Full-time
Site Reliability Engineer - Remote Work | REF# At BairesDev, we've been...
-
Senior SRE: Cloud-Native Reliability
7 days ago
Distrito Federal, México AgileEngine Full-time
A leading technology firm in Mexico City is looking for a Site Reliability Engineer to design secure, scalable cloud-native systems. You will develop infrastructure solutions, enhance CI/CD practices, and ensure system reliability through effective DevSecOps strategies. This role offers the chance to work with innovative technologies in a collaborative...
-
Cloud System Engineer
2 weeks ago
Distrito Federal, México Infor Full-time
The Cloud System Engineer Associate is responsible for delivering Cloud Operations services, including maintaining applications and technical environments, troubleshooting issues, applying configuration changes and patches, and managing data copies. This...
-
Site Reliability Engineer
7 days ago
Distrito Federal, México W3Global Full-time
Required qualifications: AWS experience; GitLab; Terraform or AWS CDK; Python; familiarity with Go; Linux OS administration, advanced scripting (Bash); Windows OS administration, advanced scripting (PowerShell). Seniority level: Entry level. Employment type: Full-time. Job function...