Reliability Engineer
4 weeks ago
Job Role Summary
Seeking an experienced SRE Lead to drive reliability, scalability, and automation across multi‑cloud and application platforms.

Job Description
As a seasoned SRE and DevOps Lead, you will combine leadership, hands‑on engineering, and strategic thinking to ensure high availability and performance of mission‑critical systems.

Key Responsibilities
Design and implement SRE best practices for monitoring, alerting, and incident response.
Define and track SLIs, SLOs, and SLAs to improve system reliability.
Lead CI/CD pipeline design and optimization for multi‑cloud environments (Azure & AWS).
Automate infrastructure provisioning and deployments using Infrastructure as Code (IaC).
Own incident response processes leveraging PagerDuty and Datadog for alerting and observability.
Conduct post‑mortems and implement preventive measures.
Architect and manage hybrid cloud environments (Azure, AWS).
Optimize cost, performance, and security across cloud services.
Ensure high availability and performance of MongoDB clusters.
Implement backup, recovery, and disaster recovery strategies.
Mentor SRE / DevOps engineers and foster a culture of reliability and automation.
Collaborate with development, QA, and product teams to embed reliability into the SDLC.

Required Skills & Qualifications
The ideal candidate should possess the following skills:
Strong experience with Datadog, PagerDuty, Azure, AWS, and MongoDB.
Proficiency in scripting (Python, Bash) and Infrastructure as Code (Terraform, ARM templates).
Hands‑on experience with containerization (Docker, Kubernetes).
Deep understanding of SLIs / SLOs, error budgets, and reliability engineering practices.
Expertise in CI/CD tools (Azure DevOps, Jenkins, GitHub Actions).
Strong automation mindset and experience with configuration management tools (Ansible, Chef, or similar).

Soft Skills
Excellent communication and leadership skills.
Ability to work in a fast‑paced, collaborative environment.

Preferred Qualifications
Experience in regulated industries (Healthcare, Finance, etc.).
Certification as an AWS Solutions Architect, Azure Administrator, or Datadog Certified Professional.
-
Site Reliability Engineer
4 weeks ago
WorkFromHome, Mexico Hcl International Ltd Full-time
Senior Site Reliability Engineer
Site Reliability Engineer to join this fast-growing, well-funded business whose cloud is built on AWS. With first-class AWS skills, the Site Reliability Engineer must demonstrate expertise in spinning up featured environments. Reporting to the CTO, this is an excellent opportunity for an ambitious Site Reliability Engineer to...
-
Site Reliability Engineer
2 weeks ago
WorkFromHome, Mexico KI people Full-time
In Search of the Best Global IT & Digital Talent
We are looking for a Site Reliability Engineer to work in hybrid mode from GDL, MTY, or CDMX on a multicultural project with stability and growth in the short, medium, and long term.
Role Overview: The SRE Operations...
-
Site Reliability Engineer
2 weeks ago
WorkFromHome, Mexico BairesDev Full-time
Site Reliability Engineer - Remote Work | REF#
At BairesDev, we've been leading the way in...
-
Sr. Site Reliability Engineer
2 weeks ago
WorkFromHome, Mexico Nova Full-time
Sr. Site Reliability Engineer (Remote, Mexico)
-
Site Reliability Engineer
2 weeks ago
WorkFromHome, Mexico BairesDev Full-time
Site Reliability Engineer - Remote Work | REF#
At BairesDev, we've been...
-
Senior Site Reliability Engineer
2 weeks ago
WorkFromHome, Mexico DuckDuckGo Full-time
Who We Are
Hi, we're DuckDuckGo, the online protection company and remote-first team of 300+ on a mission to raise the standard of trust online. Founded in 2008 and profitable since 2014, our annual revenue now exceeds $100 million USD. Millions use our browser on Mac, Windows, iOS, and Android, our search engine,...
-
Site Reliability Engineer
3 weeks ago
WorkFromHome, Mexico - Full-time
JOB DESCRIPTION
Site Reliability Engineer (SRE) - Application Performance Monitoring (APM)
Location: Monterrey, Nuevo León, Mexico (Hybrid - candidates must reside in Monterrey or the metropolitan area)
Language requirement: Fluent English (spoken and written)
About the Role
We're looking for a Site Reliability Engineer (SRE) with a passion for Application...
-
Senior Cloud Reliability Engineer — Remote
3 days ago
WorkFromHome, Mexico Zipdev Full-time
A technology company is seeking a System Reliability Engineer in Mexico City. The role involves designing and maintaining resilient systems on Google Cloud Platform, enhancing reliability practices, and automating operational tasks. Ideal candidates have 5+ years of experience, strong skills in GCP and container technologies, and a proactive mindset....
-
Remote Site Reliability Engineer
2 weeks ago
WorkFromHome, Mexico Resend Full-time
A modern email platform company is seeking a Site Reliability Engineer for a fully remote position. In this role, you will enhance system reliability and automation, monitor performance parameters, and collaborate with engineering teams. Ideal candidates will have over 5 years in Site Reliability or Infrastructure Engineering, strong skills in Node.js and...
-
Site Reliability Engineer
2 weeks ago
WorkFromHome, Mexico BairesDev Full-time
Site Reliability Engineer - Remote Work | REF#
At BairesDev, we've been leading the way in...