Site Reliability Engineer

3 weeks ago


WorkFromHome, Mexico - Full-time

JOB DESCRIPTION

Site Reliability Engineer (SRE) - Application Performance Monitoring (APM)

Location: Monterrey, Nuevo León, Mexico (Hybrid - candidates must reside in Monterrey or the metropolitan area)
Language requirement: Fluent English (spoken and written)

About the Role

We're looking for a Site Reliability Engineer (SRE) with a passion for Application Performance Monitoring (APM) and system optimization. In this role, you'll be at the heart of ensuring the reliability, scalability, and performance of NOV's mission‑critical applications. You'll work closely with software engineering and operations teams to design monitoring strategies, analyze performance, and prevent issues before they affect users. If you thrive in fast‑paced environments, love solving complex technical challenges, and enjoy turning data into insight, this is the role for you.

What You'll Do

  • Design and manage APM strategies using tools like Elastic APM, Datadog, Dynatrace, or similar platforms.
  • Perform deep performance analysis, tracing distributed requests and identifying bottlenecks in both code and infrastructure.
  • Build real‑time dashboards and alerting systems using Grafana, Kibana, or equivalent tools to visualize system health.
  • Proactively monitor systems to detect performance degradations, security threats, and system failures before users are impacted.
  • Define and track Service Level Objectives (SLOs) and Service Level Agreements (SLAs) to continuously improve reliability.
  • Lead Root Cause Analysis (RCA) sessions after incidents and implement corrective actions to prevent recurrence.
  • Automate repetitive tasks and monitoring setups using Python, Bash, or PowerShell.
  • Collaborate with cross‑functional teams to embed reliability, performance, and observability best practices into every stage of development.
  • Continuously refine tools, processes, and APM strategies to enhance efficiency, reliability, and visibility across platforms.
  • Engage with stakeholders to understand performance challenges and shape the platform roadmap.

What You Bring

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
  • 5+ years of experience in Site Reliability, DevOps, or Performance Engineering roles.
  • Proven hands‑on experience with APM tools such as Elastic APM, Datadog, Dynatrace, New Relic, or AppDynamics.
  • Expertise in the Elastic Stack (Elasticsearch, Logstash, Kibana, Beats) for logging, monitoring, and APM.
  • Deep understanding of SRE principles, DevOps methodologies, and Production Support operations.
  • Strong scripting ability in Python, Bash, or PowerShell for automation and analysis.
  • Solid grasp of Linux/Unix systems, networking fundamentals, and distributed system architecture.
  • Experience with containerization (Docker) and orchestration (Kubernetes).
  • Excellent analytical, problem‑solving, and collaboration skills, with the ability to communicate effectively in a global team.

Preferred Skills

  • Fluent English (Mandatory)
  • Experience with Infrastructure as Code (IaC) tools such as Terraform, Ansible, or Chef.
  • Familiarity with cloud‑native services (AWS, Azure, or GCP) and serverless architectures (AWS Lambda, Azure Functions).
  • Knowledge of CI/CD tools like GitHub Actions, Azure DevOps, or Jenkins.
  • Understanding of other observability pillars, including metrics (Prometheus) and logging.
  • Experience working in agile environments.

Why NOV

At NOV, we combine over 150 years of innovation with cutting‑edge technology to power the global energy industry. You'll join a global engineering team that values collaboration, curiosity, and continuous improvement, giving you the opportunity to make a real impact on systems that matter.


  • Site Reliability Engineer

    4 weeks ago


    WorkFromHome, Mexico - Hcl International Ltd - Full-time

    Senior Site Reliability Engineer. Site Reliability Engineer to join this fast-growing, well-funded business with a cloud platform built on AWS. With first-class AWS skills, the Site Reliability Engineer must demonstrate expertise in spinning up featured environments. Reporting to the CTO, this is an excellent opportunity for an ambitious Site Reliability Engineer to...

  • Site Reliability Engineer

    2 weeks ago


    WorkFromHome, Mexico - KI people - Full-time

    In Search of the Best Global IT & Digital Talent. We are looking for a Site Reliability Engineer to work in hybrid mode from GDL, MTY, or CDMX on a multicultural project with stability and growth in the short, medium, and long term. Role Overview: The SRE Operations...

  • Site Reliability Engineer

    2 weeks ago


    WorkFromHome, Mexico - BairesDev - Full-time

    Site Reliability Engineer - Remote Work | REF# At BairesDev, we've been leading the way in...


  • WorkFromHome, Mexico - Nova - Full-time

    Sr. Site Reliability Engineer (Remote, Mexico)

  • Site Reliability Engineer

    2 weeks ago


    WorkFromHome, Mexico - BairesDev - Full-time

    Site Reliability Engineer - Remote Work | REF# At BairesDev, we've been...


  • WorkFromHome, Mexico - Resend - Full-time

    A modern email platform company is seeking a Site Reliability Engineer for a fully remote position. In this role, you will enhance system reliability and automation, monitor performance parameters, and collaborate with engineering teams. Ideal candidates will have over 5 years in Site Reliability or Infrastructure Engineering, strong skills in Node.js and...


  • WorkFromHome, Mexico - National Oilwell Varco, Inc. - Full-time

    Site Reliability Engineer (SRE) – Application Performance Monitoring (APM) Location: Monterrey, Nuevo León, Mexico (Hybrid – candidates must reside in Monterrey or the metropolitan area) Language requirement: Fluent English (spoken and written) About the Role We’re looking for a Site Reliability Engineer (SRE) with a passion for Application...

  • Site Reliability Engineer

    2 weeks ago


    WorkFromHome, Mexico - BairesDev - Full-time

    Site Reliability Engineer - Remote Work | REF# At BairesDev, we've been leading the way in...


  • WorkFromHome, Mexico - DuckDuckGo - Full-time

    Who We Are: Hi, we're DuckDuckGo, the online protection company and remote-first team of 300+ on a mission to raise the standard of trust online. Founded in 2008 and profitable since 2014, our annual revenue now exceeds $100 million USD. Millions use our browser on Mac, Windows, iOS, and Android, our search engine,...

  • Site Reliability Engineer

    4 weeks ago


    WorkFromHome, Mexico - S&P Global - Full-time

    Site Reliability Engineer - Data Support | S&P Dow Jones Indices. We are seeking a Site Reliability Engineer - Data Support to be a key player in the implementation and support of our Global Index Data Platform, which supports our major headline indices like the S&P 500 and Dow Jones Industrial Average, as well as the co-branded indices with our exchange partners such as...