Site Reliability Engineer
2 weeks ago
Our Purpose

Mastercard powers economies and empowers people in 200+ countries and territories worldwide. Together with our customers, we’re helping build a sustainable economy where everyone can prosper. We support a wide range of digital payments choices, making transactions secure, simple, smart and accessible. Our technology and innovation, partnerships and networks combine to deliver a unique set of products and services that help people, businesses and governments realize their greatest potential.

Title and Summary

Site Reliability Engineer (Automation & virtualization)

About the Role

We’re looking for a passionate and skilled Site Reliability Engineer (SRE) to join our Platform Engineering team. This role is pivotal in automating and managing VMware ESXi hypervisors across Dell and Cisco UCS platforms, ensuring high reliability, scalability, and performance of our infrastructure. You’ll work at the intersection of infrastructure and software, driving automation, observability, and operational excellence across our virtualization stack.

Key Responsibilities

Hypervisor & Infrastructure Management: Deploy, configure, and patch ESXi hosts using tools like VMware Update Manager, iDRAC, and UCS Central. Validate host readiness and enforce consistency across environments.

Automation & Infrastructure as Code: Build and maintain automation pipelines using PowerCLI, Python, Terraform, and Ansible. Develop Infrastructure-as-Code (IaC) templates for scalable provisioning.

NSX & Network Integration: Administer NSX-T/V for logical switching, routing, and micro-segmentation. Troubleshoot endpoint tagging and network performance issues between NSX and ESXi.

Monitoring & Observability: Implement observability stacks using Prometheus, Grafana, Splunk, and Dynatrace. Define and track SLOs, SLIs, and error budgets.

Security & Compliance

Planning & Optimization: Lead modernization efforts including UCS blade decommissioning and Dell R760 upgrades. Optimize cluster and VM sizing for performance and cost efficiency.

Collaboration & Stakeholder Engagement: Partner with application, storage, and network teams to align infrastructure with workload needs. Communicate upgrade plans and maintenance schedules across teams.

Documentation & Knowledge Sharing: Maintain build guides, validation checklists, and operational runbooks. Contribute to internal wikis and onboarding materials.

Required Skills

5+ years in SRE, DevOps, or Platform Engineering roles.
Strong scripting in PowerCLI, Python, or Go.
Experience with VMware ESXi, vCenter, NSX, and UCS Manager.
Proficiency in Terraform, Ansible, and CI/CD pipeline tools.
Familiarity with observability platforms and incident response workflows.

Preferred Qualifications

Experience with REST API integration for ESXi and vCenter.
Knowledge of GitOps, AIOps, and chaos engineering practices.
Certifications: VMware VCP, CKA/CKAD, or equivalent.

Corporate Security Responsibility

Abide by Mastercard’s security policies and practices.
Ensure the confidentiality and integrity of the information being accessed.
Report any suspected information security violation or breach.
Complete all periodic mandatory security trainings in accordance with Mastercard’s guidelines.
-
Site Reliability Engineer
3 weeks ago
WorkFromHome, México KI people Full-time. In Search of the Best Global IT & Digital Talent. We are looking for a Site Reliability Engineer to work in hybrid mode from GDL, MTY, or CDMX on a multicultural project with stability and growth in the short, medium, and long term. Role Overview: The SRE Operations...
-
Site Reliability Engineer
3 weeks ago
WorkFromHome, México BairesDev Full-time. Site Reliability Engineer - Remote Work | REF# At BairesDev, we've been leading the way in...
-
Sr. Site Reliability Engineer
3 weeks ago
WorkFromHome, México Nova Full-time. Sr. Site Reliability Engineer (Remote, Mexico)...
-
Site Reliability Engineer
3 weeks ago
WorkFromHome, México BairesDev Full-time. Site Reliability Engineer - Remote Work | REF# At BairesDev, we've been...
-
Remote Site Reliability Engineer
4 weeks ago
WorkFromHome, México Resend Full-time. A modern email platform company is seeking a Site Reliability Engineer for a fully remote position. In this role, you will enhance system reliability and automation, monitor performance parameters, and collaborate with engineering teams. Ideal candidates will have over 5 years in Site Reliability or Infrastructure Engineering, strong skills in Node.js and...
-
Site Reliability Engineer
2 weeks ago
WorkFromHome, México National Oilwell Varco, Inc. Full-time. Site Reliability Engineer (SRE) – Application Performance Monitoring (APM). Location: Monterrey, Nuevo León, Mexico (Hybrid – candidates must reside in Monterrey or the metropolitan area). Language requirement: Fluent English (spoken and written). About the Role: We’re looking for a Site Reliability Engineer (SRE) with a passion for Application...
-
Site Reliability Engineer
3 weeks ago
WorkFromHome, México BairesDev Full-time. Site Reliability Engineer - Remote Work | REF# At BairesDev, we've been leading the way in...
-
Senior Site Reliability Engineer
3 weeks ago
WorkFromHome, México DuckDuckGo Full-time. Who We Are: Hi, we're DuckDuckGo, the online protection company and remote-first team of 300+ on a mission to raise the standard of trust online. Founded in 2008 and profitable since 2014, our annual revenue now exceeds $100 million USD. Millions use our browser on Mac, Windows, iOS, and Android, our search engine,...
-
Site Reliability Engineer
3 weeks ago
WorkFromHome, México Epam Full-time. A leading digital services company in Mexico City seeks a Site Reliability Engineer to improve communication between the operations and development sides of software delivery. You will guide teams in designing, building, testing, and deploying software changes while maintaining and improving cloud infrastructure. Ideal candidates are proficient in Site Reliability...
-
Site Reliability Engineer, Infra
4 weeks ago
WorkFromHome, México Resend Full-time. Site Reliability Engineer, Infra (Americas). Americas / Remote / Full‑time. Resend is building the most accessible email platform for developers. As we’ve grown to over 15K customers and continue to onboard thousands of new users every day, the challenge of maintaining a...