IT Site Reliability Engineer

5 hours ago

Guadalajara Mexico Metropolitan Area · Tata Consultancy Services · Full-time

About the Role

We are seeking a talented and experienced IT Engineer / Architect with a strong focus on site reliability engineering to join our team. In this role, you will be responsible for ensuring the reliability, scalability, and performance of our infrastructure and applications, with a specific focus on architecture design and implementation.

Responsibilities

Design, build, and maintain the architecture of our cloud-based infrastructure to ensure high availability, scalability, and security for our medical device applications, including but not limited to Ignition, PostgreSQL, HiveMQ, Qlik, Confluent Kafka, and Tanzu.
Collaborate with cross-functional teams to develop and implement best practices for container orchestration and management, with a specific focus on Kubernetes.
Develop and maintain CI/CD pipelines to automate the deployment and testing of applications and infrastructure changes, utilizing tools such as Tanzu, Confluent Kafka, and others as needed.
Manage and maintain the repository of infrastructure as code, ensuring proper version control and documentation, with a specific focus on the specified applications.
Monitor and analyze system performance, identifying and resolving potential issues to ensure optimal reliability and performance for the specified applications.
Lead efforts to implement disaster recovery and business continuity plans for critical systems and applications, including those utilizing Ignition, PostgreSQL, HiveMQ, Qlik, Confluent Kafka, and Tanzu.
Strong Linux Experience:
Proficient in administering Linux systems (e.g., Ubuntu, CentOS, RHEL, Debian) in production environments.
Strong knowledge of Linux internals including system calls, process management, networking, and filesystems.
Experience with system monitoring and performance tuning on Linux servers.
DevOps:
Implement GitOps workflows for Kubernetes using declarative infrastructure in Git.
Manage manifests, Helm charts, or Kustomize in version control.
Automate reconciliation between Git and clusters for consistent deployments.
Monitor and troubleshoot GitOps deployment issues, enforcing drift detection with Git-centric tools.
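
For candidates less familiar with the term, drift detection means comparing the state declared in Git with the state observed in the cluster and reporting any difference. The following is a minimal sketch of that idea only, not the tooling used on this team. The function name `diff_manifests` and the sample data are hypothetical, and real GitOps tools also handle server-defaulted fields, lists, and many resource types that this sketch ignores.

```python
from typing import Any


def diff_manifests(desired: dict[str, Any], live: dict[str, Any], path: str = "") -> list[str]:
    """Return readable differences between desired (Git) and live (cluster) state.

    Hypothetical helper for illustration; it only walks nested dictionaries.
    """
    drift: list[str] = []
    for key in sorted(set(desired) | set(live)):
        here = f"{path}.{key}" if path else key
        if key not in live:
            drift.append(f"missing in cluster: {here}")
        elif key not in desired:
            drift.append(f"not declared in Git: {here}")
        elif isinstance(desired[key], dict) and isinstance(live[key], dict):
            # Recurse into nested sections such as "spec".
            drift.extend(diff_manifests(desired[key], live[key], here))
        elif desired[key] != live[key]:
            drift.append(f"changed: {here} (Git={desired[key]!r}, cluster={live[key]!r})")
    return drift


# Assumed sample data: a deployment as declared in Git vs. as observed live.
desired_state = {"spec": {"replicas": 3, "image": "example/app:1.4.2"}}
live_state = {"spec": {"replicas": 5, "image": "example/app:1.4.2", "paused": True}}

drift_report = diff_manifests(desired_state, live_state)
for line in drift_report:
    print(line)
```

An empty report means the cluster matches Git. A non-empty report is what a reconciliation step would act on, typically by re-applying the declared state.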

Qualifications

Bachelor's degree in Computer Science, Engineering, or a related field.
Proven experience in designing and implementing architecture for cloud-based infrastructure, preferably in the medical device or healthcare industry, with expertise in the specified applications.
Strong expertise in Kubernetes and other container orchestration technologies, with experience in managing the specified applications.
Experience with infrastructure as code tools such as Terraform, Ansible, or CloudFormation, with a focus on the specified applications.
Proficiency in developing and maintaining CI/CD pipelines using tools such as Tanzu, Confluent Kafka, and others as needed.
Solid understanding of networking, security, and monitoring concepts in a cloud environment, with a focus on the specified applications.
Experience with global, multisite deployments of new architecture, including change control for new requests and support when issues arise.

Required Skills

Drive for Results
Interpersonal Relationships
Adaptability

Preferred Skills

Previous Medical Devices or Pharma experience
AWS Certified Solutions Architect
Certified Kubernetes (CKS or CKA)
ITIL 4

Equal Opportunity Statement

We are committed to diversity and inclusivity in our hiring practices.

Site Reliability Engineer

4 weeks ago

Mexico City · Royal Caribbean Group · Full-time

Talent Acquisition @Royal Caribbean Group Journey with us! Combine your career goals and sense of adventure by joining our incredible team of employees at Royal Caribbean Group. We are proud to offer a competitive compensation and benefits package, and excellent career development opportunities, each offering unique ways to explore the world. We are proud to...
Site Reliability Engineer

2 hours ago

Mexico City · W3Global · Full-time

Required qualifications: AWS experience; GitLab; Terraform or AWS CDK; Python; familiarity with Go; Linux OS administration and advanced scripting (Bash); Windows OS administration and advanced scripting (PowerShell). Seniority level: Entry level. Employment type: Full-time. Job function...
Site Reliability Engineer

2 hours ago

Mexico City · Tata Consultancy Services · Full-time

We are looking for a Site Reliability Engineer (SRE) to join our team and help us ensure seamless, high-performing, and reliable technology operations. What you’ll work with: Azure DevOps (pipelines, repositories, and automation); ServiceNow (incident, change, and problem management); AppDynamics (application performance monitoring and alerting); Microsoft Azure...
Site Reliability Engineer

2 hours ago

Mexico City · The Functionary · Full-time

Senior Site Reliability Engineer. We are looking for a Senior Site Reliability Engineer to build and maintain reliable, high‑capacity, and high‑performing systems that support our mission to protect and improve customer platforms, with a strong focus on reliability, security, performance, cost, and operational excellence. As a Site Reliability Engineer on...
Site Reliability Engineer

2 hours ago

Mexico City · Sur Global · Full-time

Site Reliability Engineer, 100% remote in Mexico. As the Site Reliability Engineer, you will support and scale the infrastructure powering their secure, mission‑critical SaaS platform. You must be confident in operating and debugging both modern infrastructure (cloud‑native, containerized services) and classic Windows production environments (IIS, SQL...
Site Reliability Engineer

3 weeks ago

Guadalajara, Mexico · Finastra USA Corporation · Full-time

Responsibilities: What will you contribute? As a Site Reliability Engineer, your mission is to protect and advance the software & systems behind Finastra’s Cloud hosted services running on Fusion Operate. Finastra believes in a blameless culture where the primary objective is continuous improvement. You’ll be treating operations as a software...
