IT Site Reliability Engineer

2 weeks ago


Guadalajara, México Tata Consultancy Services Full-time

About the Role
We are seeking a talented and experienced IT Engineer / Architect with a strong focus on site reliability engineering responsibilities to join our team. As a key member of our team, you will be responsible for ensuring the reliability, scalability, and performance of our infrastructure and applications, with a specific focus on the architecture design and implementation.

Responsibilities
- Design, build, and maintain the architecture of our cloud-based infrastructure to ensure high availability, scalability, and security for our medical device applications, including but not limited to Ignition, PostgreSQL, HiveMQ, Qlik, Confluent Kafka, and Tanzu.
- Collaborate with cross-functional teams to develop and implement best practices for container orchestration and management, with a specific focus on Kubernetes.
- Develop and maintain CI/CD pipelines to automate the deployment and testing of applications and infrastructure changes, utilizing tools such as Tanzu, Confluent Kafka, and others as needed.
- Manage and maintain the repository of infrastructure as code, ensuring proper version control and documentation, with a specific focus on the specified applications.
- Monitor and analyze system performance, identifying and resolving potential issues to ensure optimal reliability and performance for the specified applications.
- Lead efforts to implement disaster recovery and business continuity plans for critical systems and applications, including those utilizing Ignition, PostgreSQL, HiveMQ, Qlik, Confluent Kafka, and Tanzu.

Strong Linux Experience
- Proficient in administering Linux systems (e.g., Ubuntu, CentOS, RHEL, Debian) in production environments.
- Strong knowledge of Linux internals including system calls, process management, networking, and filesystems.
- Experience with system monitoring and performance tuning on Linux servers.

DevOps
- Implements GitOps workflows for Kubernetes using declarative infrastructure in Git.
- Manages manifests, Helm charts, or Kustomize in version control.
- Automates reconciliation between Git and clusters for consistent deployments.
- Monitors and troubleshoots GitOps deployment issues, enforcing drift detection with Git-centric tools.

Qualifications
- Bachelor’s degree in Computer Science, Engineering, or a related field.
- Proven experience in designing and implementing architecture for cloud-based infrastructure, preferably in the medical device or healthcare industry, with expertise in the specified applications.
- Strong expertise in Kubernetes and other container orchestration technologies, with experience in managing the specified applications.
- Experience with infrastructure as code tools such as Terraform, Ansible, or CloudFormation, with a focus on the specified applications.
- Proficiency in developing and maintaining CI/CD pipelines using tools such as Tanzu, Confluent Kafka, and others as needed.
- Solid understanding of networking, security, and monitoring concepts in a cloud environment, with a focus on the specified applications.
- Experience working in Global / Multisite deployments of new architecture, change control for new requests as well as support in cases of issues.

Required Skills
- Drive for Results
- Interpersonal Relationships
- Adaptability

Preferred Skills
- Previous Medical Devices or Pharma experience
- Certified AWS Solution Architect
- Certified Kubernetes (CKS or CKA)
- ITILv4

Equal Opportunity Statement
We are committed to diversity and inclusivity in our hiring practices.



  • Guadalajara, Jalisco, México NTT DATA Full-time

    SRE - Site Reliability Engineer. We are currently seeking a Site Reliability Engineer to join our team in GDL, Jalisco (MX-JAL), Mexico (MX). Perform L1.5 activities such as monitoring, deployment, rollback. Monitor the efficiency of the Azure cloud systems to prevent outages and initiate an Incident Management bridge in case of an outage. Troubleshoot Azure...


  • Guadalajara, Jalisco, México NTT DATA North America Full-time

    SRE – Site Reliability Engineer. We are currently seeking a Site Reliability Engineer to join our team in GDL, Jalisco (MX-JAL), Mexico (MX). Perform L1.5 activities such as monitoring, deployment, rollback. Monitor the efficiency of the Azure cloud systems to prevent outages and initiate an Incident Management bridge in case of an outage. Troubleshoot Azure...

  • Site Reliability Engineer

    3 weeks ago


    Guadalajara, México F5 Full-time

    Everything we do centers around people. That means we obsess over how to make the lives of our customers, and their customers, better. And it means we prioritize a diverse F5 community where each individual can thrive. Site Reliability Engineer III. Why do you want to join our team? Everything we do centers around people. That means we obsess over how to...

  • Site Reliability Engineer

    2 weeks ago


    Guadalajara, México F5 Full-time

    Everything we do centers around people. That means we obsess over how to make the lives of our customers, and their customers, better. And it means we prioritize a diverse F5 community where each individual can thrive. Site Reliability Engineer III. Why do you want to join our team? Everything we do centers around people. That means we obsess over how to make...


  • Guadalajara, México Valce Talent Solutions Full-time

    We are looking for a Lead Site Reliability Engineer who takes the initiative on developing and maintaining the systems and services for our Cash Management Platform, automating the deployment process, ensuring system scaling, investigating and resolving outages, identifying and implementing preventive measures proactively, collaborating with key stakeholders,...

  • Site Reliability Engineer

    3 weeks ago


    Guadalajara, México Valce Talent Solutions Full-time

    We are looking for a Lead Site Reliability Engineer who takes the initiative on developing and maintaining the systems and services for our Cash Management Platform, automating the deployment process, ensuring system scaling, investigating and resolving outages, identifying and implementing preventive measures proactively, collaborating with key stakeholders,...


  • Guadalajara, México Finastra Full-time

    Your deliverables as a Site Reliability Engineer will include, but are not limited to, the following: - Work with containers and container orchestration systems such as Kubernetes - Capacity Planning to determine resource requirements of your service for it to be scalable, efficient, and reliable - Collaborate with other engineers to implement operational...


  • Guadalajara, México F5 Full-time

    Everything we do centers around people. That means we obsess over how to make the lives of our customers, and their customers, better. And it means we prioritize a diverse F5 community where each individual can thrive. Business/Job Title: Senior Site Reliability Engineer. Position Summary: Software engineering is a core discipline at F5 for many roles. As a...


  • Guadalajara, México Finastra USA Corporation Full-time

    **Responsibilities**: **What will you contribute?** As a Site Reliability Engineer your mission is to protect and advance the software & systems behind Finastra’s Cloud hosted services running on Fusion Operate. Finastra believes in a blameless culture where the primary objective is continuous improvement. You’ll be treating operations as a software...