IT Site Reliability Engineer
5 hours ago
About the Role
We are seeking a talented and experienced IT Engineer / Architect with a strong focus on site reliability engineering to join our team. In this role, you will be responsible for ensuring the reliability, scalability, and performance of our infrastructure and applications, with a specific focus on architecture design and implementation.
Responsibilities
- Design, build, and maintain the architecture of our cloud-based infrastructure to ensure high availability, scalability, and security for our medical device applications, including but not limited to Ignition, PostgreSQL, HiveMQ, Qlik, Confluent Kafka, and Tanzu.
- Collaborate with cross-functional teams to develop and implement best practices for container orchestration and management, with a specific focus on Kubernetes.
- Develop and maintain CI/CD pipelines to automate the deployment and testing of applications and infrastructure changes, utilizing tools such as Tanzu, Confluent Kafka, and others as needed.
- Manage and maintain the repository of infrastructure as code, ensuring proper version control and documentation, with a specific focus on the specified applications.
- Monitor and analyze system performance, identifying and resolving potential issues to ensure optimal reliability and performance for the specified applications.
- Lead efforts to implement disaster recovery and business continuity plans for critical systems and applications, including those utilizing Ignition, PostgreSQL, HiveMQ, Qlik, Confluent Kafka, and Tanzu.
Strong Linux Experience:
- Proficient in administering Linux systems (e.g., Ubuntu, CentOS, RHEL, Debian) in production environments.
- Strong knowledge of Linux internals including system calls, process management, networking, and filesystems.
- Experience with system monitoring and performance tuning on Linux servers.
DevOps:
- Implement GitOps workflows for Kubernetes using declarative infrastructure in Git.
- Manage manifests, Helm charts, or Kustomize in version control.
- Automate reconciliation between Git and clusters for consistent deployments.
- Monitor and troubleshoot GitOps deployment issues, enforcing drift detection with Git-centric tools.
Qualifications
- Bachelor's degree in Computer Science, Engineering, or a related field.
- Proven experience in designing and implementing architecture for cloud-based infrastructure, preferably in the medical device or healthcare industry, with expertise in the specified applications.
- Strong expertise in Kubernetes and other container orchestration technologies, with experience in managing the specified applications.
- Experience with infrastructure as code tools such as Terraform, Ansible, or CloudFormation, with a focus on the specified applications.
- Proficiency in developing and maintaining CI/CD pipelines using tools such as Tanzu, Confluent Kafka, and others as needed.
- Solid understanding of networking, security, and monitoring concepts in a cloud environment, with a focus on the specified applications.
- Experience with global, multisite deployments of new architecture, including change control for new requests and support when issues arise.
Required Skills
- Drive for Results
- Interpersonal Relationships
- Adaptability
Preferred Skills
- Previous Medical Devices or Pharma experience
- AWS Certified Solutions Architect
- Kubernetes certification (CKS or CKA)
- ITIL 4
Equal Opportunity Statement
We are committed to diversity and inclusivity in our hiring practices.
-
Site Reliability Engineer
4 weeks ago
Mexico City | Royal Caribbean Group | Full-time
Talent Acquisition @Royal Caribbean Group
Journey with us! Combine your career goals and sense of adventure by joining our incredible team of employees at Royal Caribbean Group. We are proud to offer a competitive compensation and benefits package, and excellent career development opportunities, each offering unique ways to explore the world. We are proud to...
-
Site Reliability Engineer
2 hours ago
Mexico City | W3Global | Full-time
Site Reliability Engineer
Required qualifications: AWS experience; GitLab; Terraform or AWS CDK; Python; familiarity with Go; Linux OS administration and advanced scripting (Bash); Windows OS administration and advanced scripting (PowerShell).
Seniority level: Entry level
Employment type: Full-time
Job function...
-
Site Reliability Engineer
2 hours ago
Mexico City | Tata Consultancy Services | Full-time
We are looking for a Site Reliability Engineer (SRE) to join our team and help us ensure seamless, high-performing, and reliable technology operations.
What you’ll work with: Azure DevOps (pipelines, repositories, and automation); ServiceNow (incident, change, and problem management); AppDynamics (application performance monitoring and alerting); Microsoft Azure...
-
Site Reliability Engineer
2 hours ago
Mexico City | The Functionary | Full-time
Senior Site Reliability Engineer
We are looking for a Senior Site Reliability Engineer to build and maintain reliable, high‑capacity, and high‑performing systems that support our mission to protect and improve customer platforms, with a strong focus on reliability, security, performance, cost, and operational excellence. As a Site Reliability Engineer on...
-
Site Reliability Engineer
2 hours ago
Mexico City | Sur Global | Full-time
Site Reliability Engineer - 100% Remote in Mexico
As the Site Reliability Engineer you will support and scale the infrastructure powering their secure, mission‑critical SaaS platform. You must be confident in operating and debugging both modern infrastructure (cloud‑native, containerized services) and classic Windows production environments (IIS, SQL...
-
Site Reliability Engineer
3 weeks ago
Guadalajara, Mexico | Finastra USA Corporation | Full-time
Responsibilities: What will you contribute?
As a Site Reliability Engineer your mission is to protect and advance the software & systems behind Finastra’s Cloud hosted services running on Fusion Operate. Finastra believes in a blameless culture where the primary objective is continuous improvement. You’ll be treating operations as a software...