Monitoring and Observability Analyst
7 days ago
About Us
Coderio designs and delivers scalable digital solutions for global businesses. With a strong technical foundation and a product mindset, our teams lead complex software projects from architecture to execution. We value autonomy, clear communication, and technical excellence. We work closely with international teams and partners, building technology that makes a difference.
As a Monitoring and Observability Analyst, you will design, implement, and maintain proactive monitoring and alerting systems to ensure the availability, performance, and health of IT infrastructure, applications, and services. Your main focus will be on designing end-to-end monitoring solutions using metrics, logs, and traces, configuring business-impact-based alert thresholds (SLIs/SLOs), and supporting incident resolution by providing detailed monitoring data for Root Cause Analysis (RCA). You will work closely with Operations and Development (DevOps) teams to minimize MTTR (Mean Time to Recovery) and support the continuous improvement of the ecosystem.
The role of Monitoring and Observability Engineer/Analyst is critical to our operation and requires continuous coverage (24/7).
Since we support the infrastructure in the United States, all shifts and holidays are governed by the United States (U.S.) time zone and schedule.
From Monday to Friday, we need coverage during these two main blocks:
Evening/Night Shift: From 17:00 to 00:00 (7 hours).
Night/Early Morning Shift: From 00:00 to 08:00 (8 hours).
What to Expect in This Role (Responsibilities)
Contribute to the definition of the company's observability strategy, aligned with industry best practices (SRE/DevOps).
Design and implement end-to-end monitoring solutions.
Configure alert thresholds (SLIs/SLOs) based on business impact and minimize notification noise.
Develop and maintain informative and visually clear dashboards (e.g., Grafana, Kibana) for real-time visibility.
Implement and optimize monitoring automation, from agent deployment to automatic alert response (basic to intermediate AIOps).
Administer and maintain monitoring platforms (updates, patches, cost optimization).
Create and maintain technical documentation (runbooks, monitoring procedures, service maps).
Requirements
Minimum 3 years of experience in Monitoring, IT Operations, or SRE roles.
Advanced experience with one or more monitoring platforms: Prometheus/Grafana, ELK Stack, New Relic, Datadog, or similar.
Proficiency in monitoring cloud environments (AWS/Azure/GCP) and containers (Docker, Kubernetes).
Solid understanding of log aggregation (Fluentd, Logstash, Loki) and distributed tracing (Jaeger, Zipkin, OpenTelemetry).
Practical experience with scripting languages (e.g., Python, Bash) for task automation and custom check development.
Deep knowledge of Linux operating systems.
Strong analytical skills: the ability to correlate events and data from multiple sources to identify the root cause of complex problems.
Proactive orientation: the ability to anticipate problems instead of just reacting to alerts.
Excellent oral and written communication skills.
Experience in a collaborative work environment with a DevOps mindset.
Bachelor's degree in Systems Engineering, Computer Science, or a related field.
Nice to Have
Certifications related to Cloud (AWS, Azure).
Certifications related to Observability Platforms (Datadog, Dynatrace).
Certifications related to DevOps/SRE practices.
Understanding of basic networking concepts (TCP/IP, DNS, Load Balancers).
Benefits
100% remote
Long-term commitment, with autonomy and impact
Strategic and high-visibility role in a modern engineering culture
Collaborative international team and strong technical leadership
Clear path to growth and leadership within Coderio
Why join Coderio?
At Coderio, we value talent regardless of location. We are a remote-first company, passionate about technology, collaborative work, and fair compensation. We offer an inclusive, challenging environment with real opportunities for growth. If you are motivated to build solutions with impact, we are waiting for you.
Apply now.