Site Reliability Engineer
Posted 1 day ago
Site Reliability Engineer (SRE) – Application Performance Monitoring (APM)
Location: Monterrey, Nuevo León, Mexico (Hybrid – candidates must reside in Monterrey or the metropolitan area).
Language requirement: Fluent English (spoken and written).
About the Role:
We're looking for a Site Reliability Engineer (SRE) with a passion for Application Performance Monitoring (APM) and system optimization.
In this role, you'll be at the heart of ensuring the reliability, scalability, and performance of NOV's mission-critical applications. You'll work closely with software engineering and operations teams to design monitoring strategies, analyze performance, and prevent issues before they affect users.
If you thrive in fast-paced environments, love solving complex technical challenges, and enjoy turning data into insight, this is the role for you.
What You'll Do
- Design and manage APM strategies using tools like Elastic APM, Datadog, Dynatrace, or similar platforms.
- Perform deep performance analysis, tracing distributed requests and identifying bottlenecks in both code and infrastructure.
- Build real-time dashboards and alerting systems using Grafana, Kibana, or equivalent tools to visualize system health.
- Proactively monitor systems to detect performance degradations, security threats, and system failures before users are impacted.
- Define and track Service Level Objectives (SLOs) and Service Level Agreements (SLAs) to continuously improve reliability.
- Lead Root Cause Analysis (RCA) sessions after incidents and implement corrective actions to prevent recurrence.
- Automate repetitive tasks and monitoring setups using Python, Bash, or PowerShell.
- Collaborate with cross-functional teams to embed reliability, performance, and observability best practices into every stage of development.
- Continuously refine tools, processes, and APM strategies to enhance efficiency, reliability, and visibility across platforms.
- Engage with stakeholders to understand performance challenges and shape the platform roadmap.
What You Bring
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 5+ years of experience in Site Reliability, DevOps, or Performance Engineering roles.
- Proven hands-on experience with APM tools such as Elastic APM, Datadog, Dynatrace, New Relic, or AppDynamics.
- Expertise in the Elastic Stack (Elasticsearch, Logstash, Kibana, Beats) for logging, monitoring, and APM.
- Deep understanding of SRE principles, DevOps methodologies, and Production Support operations.
- Strong scripting ability in Python, Bash, or PowerShell for automation and analysis.
- Solid grasp of Linux/Unix systems, networking fundamentals, and distributed system architecture.
- Experience with containerization (Docker) and orchestration (Kubernetes).
- Excellent analytical, problem-solving, and collaboration skills, with the ability to communicate effectively in a global team.
Preferred Skills
- Experience with Infrastructure as Code (IaC) tools such as Terraform, Ansible, or Chef.
- Familiarity with cloud-native services (AWS, Azure, or GCP) and serverless architectures (AWS Lambda, Azure Functions).
- Knowledge of CI/CD tools like GitHub Actions, Azure DevOps, or Jenkins.
- Understanding of other observability pillars, including metrics (Prometheus) and logging.
- Experience working in agile environments.
Why NOV
At NOV, we combine over 150 years of innovation with cutting-edge technology to power the global energy industry.
You'll join a global engineering team that values collaboration, curiosity, and continuous improvement, giving you the opportunity to make a real impact on systems that matter.