IT Site Reliability Engineer
6 days ago
About the Role
We are seeking a talented and experienced IT Engineer / Architect with a strong focus on site reliability engineering. As a key member of our team, you will be responsible for ensuring the reliability, scalability, and performance of our infrastructure and applications, with a specific focus on architecture design and implementation.
Responsibilities
- Design, build, and maintain the architecture of our cloud-based infrastructure to ensure high availability, scalability, and security for our medical device applications, including but not limited to Ignition, PostgreSQL, HiveMQ, Qlik, Confluent Kafka, and Tanzu.
- Collaborate with cross-functional teams to develop and implement best practices for container orchestration and management, with a specific focus on Kubernetes.
- Develop and maintain CI/CD pipelines to automate the deployment and testing of applications and infrastructure changes, utilizing tools such as Tanzu, Confluent Kafka, and others as needed.
- Manage and maintain the repository of infrastructure as code, ensuring proper version control and documentation, with a specific focus on the specified applications.
- Monitor and analyze system performance, identifying and resolving potential issues to ensure optimal reliability and performance for the specified applications.
- Lead efforts to implement disaster recovery and business continuity plans for critical systems and applications, including those utilizing Ignition, PostgreSQL, HiveMQ, Qlik, Confluent Kafka, and Tanzu.
- Strong Linux experience:
  - Proficient in administering Linux systems (e.g., Ubuntu, CentOS, RHEL, Debian) in production environments.
  - Strong knowledge of Linux internals, including system calls, process management, networking, and filesystems.
  - Experience with system monitoring and performance tuning on Linux servers.
- DevOps:
  - Implement GitOps workflows for Kubernetes using declarative infrastructure in Git.
  - Manage manifests, Helm charts, or Kustomize in version control.
  - Automate reconciliation between Git and clusters for consistent deployments.
  - Monitor and troubleshoot GitOps deployment issues, enforcing drift detection with Git-centric tools.
Qualifications
- Bachelor's degree in Computer Science, Engineering, or a related field.
- Proven experience in designing and implementing architecture for cloud-based infrastructure, preferably in the medical device or healthcare industry, with expertise in the specified applications.
- Strong expertise in Kubernetes and other container orchestration technologies, with experience in managing the specified applications.
- Experience with infrastructure as code tools such as Terraform, Ansible, or CloudFormation, with a focus on the specified applications.
- Proficiency in developing and maintaining CI/CD pipelines using tools such as Tanzu, Confluent Kafka, and others as needed.
- Solid understanding of networking, security, and monitoring concepts in a cloud environment, with a focus on the specified applications.
- Experience with global, multisite deployments of new architecture, including change control for new requests and support when issues arise.
Required Skills
- Drive for Results
- Interpersonal Relationships
- Adaptability
Preferred Skills
- Previous Medical Devices or Pharma experience
- Certified AWS Solution Architect
- Certified Kubernetes (CKS or CKA)
- ITILv4
Equal Opportunity Statement
We are committed to diversity and inclusivity in our hiring practices.
-
Site Reliability Engineer
2 weeks ago
Mexico City | Azka IT | Full-time
Site Reliability Engineer (SRE). AZKAIT is a Mexican company that finds and connects the best IT talent with companies in Latin America and the United States. Requirements: Bachelor's or engineering degree in Systems, Computer Science, or a related field. 5+ years of experience in SRE, DevOps, or Software Engineering roles. Experience programming in Python. Experience with Docker...
-
Site Reliability Engineer
4 days ago
Mexico City | Royal Caribbean Group | Full-time
Talent Acquisition @Royal Caribbean Group: Journey with us! Combine your career goals and sense of adventure by joining our incredible team of employees at Royal Caribbean Group. We are proud to offer a competitive compensation and benefits package, and excellent career development opportunities, each offering unique ways to explore the world. We are proud to...
-
Site Reliability Engineer
3 weeks ago
Mexico | Pyramid Consulting, Inc | Full-time
IT Recruiter at Pyramid || SOURCING || MEXICO/LATAM/US
Must have:
- AWS & EKS
- CI/CD Tools (e.g., Bamboo, Bitbucket)
- Configuration Management (e.g., Chef)
- Observability Tools (e.g., Datadog, New Relic)
- Automation & Scripting
- Security & Compliance Awareness
Seniority level: Associate
Employment type: Full-time
Job function: Information Technology
Industries: IT...
-
Site Reliability Engineer
3 weeks ago
Mexico City | W3Global | Full-time
Required qualifications:
- AWS experience
- GitLab
- Terraform or AWS CDK
- Python
- Familiarity with Go
- Linux OS administration, advanced scripting (Bash)
- Windows OS administration, advanced scripting (PowerShell)
Seniority level: Entry level
Employment type: Full-time
Job function...
-
Site Reliability Engineer
3 weeks ago
Mexico City | Yochana | Full-time
We’re Hiring | Site Reliability Engineer (SRE) | Hybrid | Mexico City | Full-Time
We are looking for a highly skilled Site Reliability Engineer (SRE) to join our team in Mexico City. This role is ideal for a proactive engineer with strong AWS expertise, a passion for automation, and a solid background in systems reliability, scalability, and performance. Key...
-
Site Reliability Engineer Lead
4 weeks ago
Mexico City | Royal Caribbean Group | Full-time
Combine your career goals and sense of adventure by joining our incredible team of employees at Royal Caribbean Group. We are proud to be the vacation-industry leader with global brands (including Royal Caribbean International, Celebrity Cruises and Silversea Cruises), the most innovative fleet and private destinations, and the best people. Royal...
-
Site Reliability Engineer
11 hours ago
Mexico City | Tata Consultancy Services | Full-time
We are looking for a Site Reliability Engineer (SRE) to join our team and help us ensure seamless, high-performing, and reliable technology operations.
What you’ll work with:
- Azure DevOps - Pipelines, repositories, and automation
- ServiceNow - Incident, change, and problem management
- AppDynamics - Application performance monitoring and alerting
- Microsoft Azure...
-
Site Reliability Engineer
8 hours ago
Mexico City | Tata Consultancy Services | Full-time
We are looking for a Site Reliability Engineer (SRE) to join our team and help us ensure seamless, high-performing, and reliable technology operations.
What you’ll work with:
- Azure DevOps - Pipelines, repositories, and automation
- ServiceNow - Incident, change, and problem management
- AppDynamics - Application performance monitoring and alerting
- Microsoft Azure...