Engineer - Observability

3 weeks ago


Tlahuac, México Rockwell Automation Full-time

Rockwell Automation is a global technology leader focused on helping the world’s manufacturers be more productive, sustainable, and agile. With more than 28,000 employees who make the world better every day, we know we have something special. Behind our customers - amazing companies that help feed the world, provide life-saving medicine on a global scale, and focus on clean water and green mobility - our people are energized problem solvers that take pride in how the work we do changes the world for the better.

We welcome all makers, forward thinkers, and problem solvers who are looking for a place to do their best work. And if that’s you, we would love to have you join us.

**Job Description**:

Sr Engineer - Observability

**Executive Summary**

**Key Responsibilities**:

- Analyzes, designs, programs, debugs, and modifies observability tools and interfaces.
- Code may be used to enrich and correlate telemetry from many data sources in order to isolate events that indicate future or immediate IT availability issues.
- Interacts with users to define system requirements and any necessary modifications.
- Design and Implement Observability Solutions: Develop and implement comprehensive observability solutions utilizing industry-standard tools and technologies such as Prometheus, Grafana, ELK Stack (Elasticsearch, Logstash, Kibana), Jaeger, and OpenTelemetry.
- Distributed Tracing: Implement distributed tracing techniques to trace and visualize the flow of requests across microservices architectures. Utilize tracing data to identify performance bottlenecks and optimize system performance.
- Performance Analysis and Optimization: Analyze system performance metrics and identify opportunities for optimization. Collaborate with development teams to implement performance improvements and ensure scalability of systems.
- Incident Response and Post-Mortems: Actively participate in incident response activities, providing expertise in diagnosing and resolving complex issues. Conduct thorough post-incident reviews to identify root causes and recommend preventive measures.
- Documentation and Knowledge Sharing: Document observability best practices, standards, and procedures. Share knowledge and insights with team members through presentations, workshops, and documentation to foster a culture of continuous learning and improvement.
- Cross-Functional Collaboration: Collaborate with cross-functional teams including DevOps, SRE, and software engineering to drive observability initiatives and ensure alignment with organizational goals and objectives.

**Qualifications**:

- Bachelor's or Master's degree in Computer Science, Information Technology, or related field.
- 2+ years of experience in software engineering, with a focus on observability, monitoring, and/or site reliability engineering.
- 1-2 years of experience with one or more of the following: Application Performance Management (APM), Monitoring / Alerting, New Relic, Dynatrace, AppDynamics, Zabbix, BigPanda, and ServiceNow.
- Proficiency in designing and implementing observability solutions using tools such as Prometheus, Grafana, ELK Stack, Jaeger, and OpenTelemetry.
- Strong understanding of distributed systems, microservices architectures, and cloud computing platforms (e.g., AWS, Azure, GCP).
- Experience with containerization technologies such as Docker and Kubernetes.
- Ideally 2+ years of development experience with programming languages such as C#, .NET, or JavaScript.
- Excellent analytical and problem-solving skills, with a strong attention to detail.
- Effective communication and collaboration skills, with the ability to work across teams and influence stakeholders.
- Experience working in an Agile/Scrum environment is preferred.



