IT Site Reliability Engineer

6 days ago


Guadalajara Mexico Metropolitan Area Tata Consultancy Services Full-time

About the Role

We are seeking a talented and experienced IT Engineer / Architect with a strong focus on site reliability engineering to join our team. As a key member, you will be responsible for ensuring the reliability, scalability, and performance of our infrastructure and applications, with a specific focus on architecture design and implementation.

Responsibilities

  • Design, build, and maintain the architecture of our cloud-based infrastructure to ensure high availability, scalability, and security for our medical device applications, including but not limited to Ignition, PostgreSQL, HiveMQ, Qlik, Confluent Kafka, and Tanzu.
  • Collaborate with cross-functional teams to develop and implement best practices for container orchestration and management, with a specific focus on Kubernetes.
  • Develop and maintain CI/CD pipelines to automate the deployment and testing of applications and infrastructure changes, utilizing tools such as Tanzu, Confluent Kafka, and others as needed.
  • Manage and maintain the repository of infrastructure as code, ensuring proper version control and documentation, with a specific focus on the specified applications.
  • Monitor and analyze system performance, identifying and resolving potential issues to ensure optimal reliability and performance for the specified applications.
  • Lead efforts to implement disaster recovery and business continuity plans for critical systems and applications, including those utilizing Ignition, PostgreSQL, HiveMQ, Qlik, Confluent Kafka, and Tanzu.
  • Strong Linux experience:
    • Proficient in administering Linux systems (e.g., Ubuntu, CentOS, RHEL, Debian) in production environments.
    • Strong knowledge of Linux internals, including system calls, process management, networking, and filesystems.
    • Experience with system monitoring and performance tuning on Linux servers.
  • DevOps:
    • Implement GitOps workflows for Kubernetes using declarative infrastructure in Git.
    • Manage manifests, Helm charts, or Kustomize in version control.
    • Automate reconciliation between Git and clusters for consistent deployments.
    • Monitor and troubleshoot GitOps deployment issues, enforcing drift detection with Git-centric tools.

Qualifications

  • Bachelor's degree in Computer Science, Engineering, or a related field.
  • Proven experience in designing and implementing architecture for cloud-based infrastructure, preferably in the medical device or healthcare industry, with expertise in the specified applications.
  • Strong expertise in Kubernetes and other container orchestration technologies, with experience in managing the specified applications.
  • Experience with infrastructure as code tools such as Terraform, Ansible, or CloudFormation, with a focus on the specified applications.
  • Proficiency in developing and maintaining CI/CD pipelines using tools such as Tanzu, Confluent Kafka, and others as needed.
  • Solid understanding of networking, security, and monitoring concepts in a cloud environment, with a focus on the specified applications.
  • Experience with global, multisite deployments of new architecture, including change control for new requests and support when issues arise.

Required Skills

  • Drive for Results
  • Interpersonal Relationships
  • Adaptability

Preferred Skills

  • Previous Medical Devices or Pharma experience
  • AWS Certified Solutions Architect
  • Certified Kubernetes (CKS or CKA)
  • ITILv4

Equal Opportunity Statement

We are committed to diversity and inclusivity in our hiring practices.


  • Site Reliability Engineer

    2 weeks ago


    Mexico City Azka IT Full-time

    Site Reliability Engineer (SRE) AZKAIT is a Mexican company that finds and connects the best IT talent with companies in Latin America and the United States. Requirements: Bachelor's or engineering degree in Systems, Computer Science, or a related field. 5+ years of experience in SRE, DevOps, or Software Engineering roles. Experience programming in Python. Experience with Docker...


  • Mexico City Royal Caribbean Group Full-time

    Talent Acquisition @Royal Caribbean Group Journey with us! Combine your career goals and sense of adventure by joining our incredible team of employees at Royal Caribbean Group. We are proud to offer a competitive compensation and benefits package, and excellent career development opportunities, each offering unique ways to explore the world. We are proud to...

  • Site Reliability Engineer

    3 weeks ago


    Mexico Pyramid Consulting, Inc Full-time

    IT Recruiter at Pyramid || SOURCING || MEXICO/LATAM/US Must have: AWS & EKS; CI/CD tools (e.g., Bamboo, Bitbucket); configuration management (e.g., Chef); observability tools (e.g., Datadog, New Relic); automation & scripting; security & compliance awareness. Seniority level: Associate. Employment type: Full-time. Job function: Information Technology. Industries: IT...

  • Site Reliability Engineer

    3 weeks ago


    Mexico City W3Global Full-time

    Site Reliability Engineer at W3Global. Required qualifications: AWS experience; GitLab; Terraform or AWS CDK; Python; familiarity with Go; Linux OS administration and advanced scripting (Bash); Windows OS administration and advanced scripting (PowerShell). Seniority level: Entry level. Employment type: Full-time. Job function...

  • Site Reliability Engineer

    3 weeks ago


    Mexico City Yochana Full-time

    We’re Hiring | Site Reliability Engineer (SRE) | Hybrid | Mexico City | Full-Time. We are looking for a highly skilled Site Reliability Engineer (SRE) to join our team in Mexico City. This role is ideal for a proactive engineer with strong AWS expertise, a passion for automation, and a solid background in systems reliability, scalability, and performance. Key...


  • Mexico City Royal Caribbean Group Full-time

    Combine your career goals and sense of adventure by joining our incredible team of employees at Royal Caribbean Group. We are proud to be the vacation-industry leader with global brands (including Royal Caribbean International, Celebrity Cruises and Silversea Cruises), the most innovative fleet and private destinations, and the best people. Royal...


  • Mexico City Tata Consultancy Services Full-time

    We are looking for a Site Reliability Engineer (SRE) to join our team and help us ensure seamless, high-performing, and reliable technology operations. What you’ll work with: Azure DevOps - Pipelines, repositories, and automation; ServiceNow - Incident, change, and problem management; AppDynamics - Application performance monitoring and alerting; Microsoft Azure...

