Lead Operational Intelligence Developer

2 weeks ago


Work from home, Mexico | EPAM Systems, Inc. | Full-time

We are looking for a highly experienced and dynamic **Lead Operational Intelligence Developer** to join our team.

In this role, you will take ownership of leading the development, maintenance, and enhancement of our Elastic & Observability Platform deployed across GCP and Elastic Cloud. You will drive strategic initiatives, guide a high-performing technical team, and ensure platform reliability while fostering innovation and enabling self-service capabilities for platform consumers. This position also involves participating in an on-call rotation to oversee platform health and functionality.

**Responsibilities**
- Oversee the availability, functionality, performance, and security of observability and search platforms to exceed business SLAs
- Provide technical leadership during complex incidents and escalate resolutions promptly during on-call periods
- Develop and maintain comprehensive platform documentation, standard operating procedures, and knowledge-sharing resources
- Collaborate with cross-functional teams, stakeholders, and vendors to oversee operational requirements, drive strategic initiatives, and manage installations, troubleshooting, and upgrades
- Lead the enhancement of platform features and self-service capabilities, including advanced Elastic Synthetics and chargeback automation
- Architect and implement proof-of-concepts for platform innovation, such as AI-driven observability, advanced data processing models, or Kubernetes-based platform migration
- Supervise the building, deployment, and maintenance of Elastic clusters using Infrastructure-as-Code (IaC) tools like Terraform and Ansible, while mentoring team members on best practices
- Oversee platform lifecycle management activities, including component upgrades, capacity planning, cost optimization, and evolving compliance requirements
- Continuously assess and fine-tune ELK stack performance, including ingestion, indexing, and query optimization for large-scale environments
- Establish and enhance comprehensive alerting and incident management workflows, integrating sophisticated monitoring tools such as Kibana Rules, Watchers, and PagerDuty
- Supervise the ingestion, enrichment, backup, and restoration of large-scale platform data while optimizing data workflows
- Lead and plan critical operational events such as SSL certificate rotations, cluster migrations, or scalability optimization projects

**Requirements**
- 5+ years of experience in Operational Intelligence, with a proven track record of leadership and technical expertise in managing large-scale observability platforms
- Demonstrated ability to architect and manage Elastic clusters in complex, multi-cloud environments
- In-depth knowledge of Elastic Stack components, including advanced configurations of Elasticsearch, Kibana, and Logstash
- Advanced proficiency in Infrastructure-as-Code (IaC) tools like Terraform and Ansible, with demonstrated flexibility in adapting to other tools like Jenkins CI or GitOps frameworks
- Advanced Python scripting skills for automation, data processing, and extending platform interoperability
- Deep understanding of incident management frameworks and workflows with tools like PagerDuty, Uptrends, and other enterprise monitoring solutions
- Proven expertise in troubleshooting and resolving complex platform challenges under tight SLAs
- Strong capability in managing and scaling fault-tolerant platforms while ensuring performance, security, and compliance across large distributed systems
- Demonstrated ability to mentor and grow team members, manage priorities, and act as a bridge between technical and non-technical teams
- Excellent command of English (B2+ level), both written and spoken, with a strong emphasis on technical communication skills

**Nice to have**
- Expertise in scripting with Groovy or experience in advanced Linux administration to optimize platform processes
- Track record of optimizing observability workflows with additional integrations or customizations in tools like Uptrends, PagerDuty, or Elastic features
- Hands-on experience with advanced Elastic Synthetics setups for robust monitoring and custom synthetic testing frameworks
- Experience driving strategic initiatives such as modernization through AI tooling, cloud-native transitions, or cost-saving observability optimizations

**We offer**
- Career plan and real growth opportunities
- Unlimited access to LinkedIn Learning solutions
- International Mobility Plan within 25 countries
- Constant training, mentoring, online corporate courses, eLearning and more
- English classes with a certified teacher
- Support for employee initiatives (Algorithms club, Toastmasters, Agile club and more)
- Enjoyable working environment (Gaming room, napping area, amenities, events, sports teams and more)
- Flexible work schedule and dress code
- Collaborate in a multicultural environment and share best practices from around the globe
- Hired directly by EPAM and 100% on payroll
- Law benef


