Engineer - Observability

3 weeks ago


Tlahuac, México - Rockwell Automation - Full-time

Rockwell Automation is a global technology leader focused on helping the world’s manufacturers be more productive, sustainable, and agile. With more than 28,000 employees who make the world better every day, we know we have something special. Behind our customers - amazing companies that help feed the world, provide life-saving medicine on a global scale, and focus on clean water and green mobility - our people are energized problem solvers that take pride in how the work we do changes the world for the better.

We welcome all makers, forward thinkers, and problem solvers who are looking for a place to do their best work. And if that’s you, we would love to have you join us.

**Job Description**:
Sr Engineer - Observability

**Executive Summary**

**Key Responsibilities**:

- Analyze, design, program, debug, and modify observability tools and interfaces.
- Write code that enriches and correlates telemetry from many data sources to isolate events that indicate future or immediate IT availability issues.
- Interact with users to define system requirements and any necessary modifications.
- Design and Implement Observability Solutions: Develop and implement comprehensive observability solutions using industry-standard tools and technologies such as Prometheus, Grafana, ELK Stack (Elasticsearch, Logstash, Kibana), Jaeger, and OpenTelemetry.
- Distributed Tracing: Implement distributed tracing to follow and visualize the flow of requests across microservices architectures. Use tracing data to identify performance bottlenecks and optimize system performance.
- Performance Analysis and Optimization: Analyze system performance metrics and identify opportunities for optimization. Collaborate with development teams to implement performance improvements and ensure scalability of systems.
- Incident Response and Post-Mortems: Actively participate in incident response activities, providing expertise in diagnosing and resolving complex issues. Conduct thorough post-incident reviews to identify root causes and recommend preventive measures.
- Documentation and Knowledge Sharing: Document observability best practices, standards, and procedures. Share knowledge and insights with team members through presentations, workshops, and documentation to foster a culture of continuous learning and improvement.
- Cross-Functional Collaboration: Collaborate with cross-functional teams including DevOps, SRE, and software engineering to drive observability initiatives and ensure alignment with organizational goals and objectives.

**Qualifications**:

- Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
- 2+ years of experience in software engineering, with a focus on observability, monitoring, and/or site reliability engineering.
- 1-2 years of experience with one or more of the following: Application Performance Management (APM), monitoring and alerting, New Relic, Dynatrace, AppDynamics, Zabbix, BigPanda, and ServiceNow.
- Proficiency in designing and implementing observability solutions using tools such as Prometheus, Grafana, ELK Stack, Jaeger, and OpenTelemetry.
- Strong understanding of distributed systems, microservices architectures, and cloud computing platforms (e.g., AWS, Azure, GCP).
- Experience with containerization technologies such as Docker and Kubernetes.
- Ideally 2+ years of development experience with programming languages such as C#, .NET, or JavaScript.
- Excellent analytical and problem-solving skills, with a strong attention to detail.
- Effective communication and collaboration skills, with the ability to work across teams and influence stakeholders.
- Experience working in an Agile/Scrum environment is preferred.


