Reliability Solutions Engineer
7 days ago
The SRE Software Engineer is responsible for designing, configuring, implementing, monitoring, and maintaining our observability solutions, and for troubleshooting Ford Credit IT systems and applications to ensure optimal performance and reliability.
Key Responsibilities
- Using observability and monitoring tools to detect and resolve issues that affect the user experience.
- Automating alerting and remediation processes to reduce mean time to resolution (MTTR) and improve system uptime.
- Working with the Splunk query language and monitoring database connection health using Splunk DB Connect health dashboards, log parsing, and complex Splunk searches (including external table lookups), with knowledge of Splunk data flow, components, features, and product capabilities.
- Implementing comprehensive monitoring and alerting solutions using GCP monitoring services and external services.
- Gathering and analyzing metrics from operating systems and applications to assist in performance tuning and fault finding.
- Building essential, efficient tooling that lowers the barrier to entry for engineering teams to plug in and benefit from observability-focused reliability.
- 6+ years of SRE observability engineering experience.
- 6+ years of experience in observability best practices working with Dynatrace or similar tools (NewRelic, DataDog, AppDynamics, or other similar APM suites), delivering solutions across all environments, and integrating platforms and applications with monitoring and APM tools.
- Knowledge of CI/CD and automation tools such as Puppet, Jenkins, Terraform, and Ansible.
- Minimum 4 to 5 years' working experience in OpenShift and Docker/K8s.
- Proficiency in implementing monitoring and observability solutions using GCP monitoring services such as Cloud Monitoring, Logging, and Tracing.
- Deep understanding of IT infrastructure monitoring and observability best practices.
- Experience gathering and organizing large amounts of data for instrumentation in an enterprise monitoring solution.
- Experience with recommending baseline monitoring thresholds and performance monitoring KPIs and SLAs.
- At least 4 years of experience developing Grafana dashboards and standardizing metrics and monitoring (metrics, collection, dashboards); Grafana experience is a must.
- 3-5 years of experience with SQL and familiarity with at least one managed Kubernetes platform (EKS, AKS, GKE).
- Strong background in software engineering, with expertise in relevant programming languages (like Python, Java, Go) and cloud platforms (like AWS, GCP, Azure).
- Experience with container orchestration tools like Kubernetes.
- Strong interpersonal and organizational skills.
- Strong verbal and written skills.
- Attention to detail.
- Excellent time management.
- Extraordinary teamwork and collaborative skills.
-
Reliability Engineer
3 weeks ago
Xico, México | Jones Lang Lasalle Incorporated | Full-time
Jones Lang Lasalle Incorporated is committed to empowering its employees to shape a brighter future for the real estate industry. Our team of experts is dedicated to delivering world-class services, advisory, and technology solutions to our clients. We are seeking a highly skilled Reliability Engineer to join our team and contribute to...
-
Site Reliability Engineer
3 weeks ago
Xico, México | Ford Motor Company | Full-time
Job Title: Site Reliability Engineer. At Ford Motor Company, we are seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for designing, configuring, and maintaining our observability solutions to ensure optimal performance and reliability of our IT systems and applications. Key...
-
Cloud Reliability Engineer with Kubernetes Expertise
2 weeks ago
Xico, México | It Crowd Argentina | Full-time
Are you a skilled Cloud Reliability Engineer looking for a new challenge? We're seeking a talented professional to join our team as a Cloud Reliability Engineer with expertise in Kubernetes. About Our Client: Our client is a leading global IT services company offering enterprise digital transformation and...
-
Reliability Engineer
3 weeks ago
Xico, México | Kyndryl Inc. | Full-time
We are seeking a highly skilled Reliability Engineer to join our team at Kyndryl Inc. As a Reliability Engineer, you will be responsible for ensuring the reliability and resiliency of our information systems and ecosystems. Your key responsibilities will include: Designing and implementing application monitoring to ensure reliability and...
-
Cloud Reliability Engineer
3 weeks ago
Xico, México | Global Payments | Full-time
At Global Payments, we're driven by our passion for success and our commitment to delivering best-in-class payment technology and software solutions. As a Cloud Reliability Engineer, you'll play a critical role in ensuring the reliability and stability of our cloud-based systems. Key Responsibilities: Design and implement chaos...
-
Kyndryl Reliability Engineer
6 days ago
Xico, México | Kyndryl | Full-time
About the Role: We are seeking a highly skilled Site Reliability Engineer to join our Kyndryl team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, resiliency, and innovation of our information systems and ecosystems. Your Key Responsibilities: Analyze complex problems and develop strategic advice and designs to drive...
-
Customer Reliability Specialist
3 weeks ago
Xico, México | Thales | Full-time
Job Title: Customer Reliability Engineer. Thales is seeking a highly skilled Customer Reliability Engineer to join our team in Mexico City, Mexico. As a key member of our team, you will be responsible for ensuring the best customer experience by assuring service reliability and resolving incidents in the shortest timeframe. Key Responsibilities: Manage...
-
Service Reliability Specialist
3 weeks ago
Xico, México | Thales Group | Full-time
Service Reliability Engineer: Thales is seeking a skilled Service Reliability Engineer to ensure the highest level of service quality and reliability for our customers. As a key member of our team, you will be responsible for managing incidents and service requests, ensuring timely resolution, and maintaining a high level of technical expertise on our...
-
Reliability Engineer II
2 weeks ago
Xico, México | Global Payments (Beamery) | Full-time
About the Role: Global Payments (Beamery) is seeking an experienced Reliability Engineer II to join our team. In this role, you will be responsible for designing and implementing strategies to ensure the reliability and scalability of our systems. This includes conducting chaos engineering, pushing systems to their limits, and implementing remediation plans as...
-
Site Reliability Engineer
2 weeks ago
Xico, México | Jaak-It S.A.P.I. De C.V. | Full-time
We are looking for a Site Reliability Engineer to join our technology team at Jáak-It S.A.P.I. De C.V. As a Site Reliability Engineer, you will be responsible for ensuring the availability and reliability of our systems and services. Responsibilities: Contribute to troubleshooting and technical support to resolve incidents...
-
Xico, México | Comau Automatizacion S De Rl De Cv | Full-time
About the Role: We are seeking a highly skilled Software Engineer to join our team. The ideal candidate will have a strong background in software architecture and a passion for scalability and reliability. Key Responsibilities: Design and implement scalable software solutions. Collaborate with cross-functional teams to deliver high-quality software...
-
Cloud Reliability Engineer
3 weeks ago
Xico, México | Global Payments | Full-time
At Global Payments, we're driven by our passion for success and proud to deliver best-in-class payment technology and software solutions. As a Cloud Reliability Engineer, you'll play a critical role in ensuring the stability and reliability of our systems. Key Responsibilities: Design and implement chaos engineering tests to identify potential system...
-
Software Engineer
1 week ago
Xico, México | Pepsico | Full-time
About the Role: We are seeking a skilled Software Engineer to join our team. As a Software Engineer, you will be responsible for designing and developing innovative software solutions that meet the needs of our customers. Key Responsibilities: Design and develop software applications using various programming languages. Collaborate with cross-functional teams to...
-
Site Reliability Engineer
2 weeks ago
Xico, México | Thomson Reuters | Full-time
About the Role: We are seeking a highly skilled Senior Site Reliability Engineer to join our Service Reliability and Operation group at Thomson Reuters. As a key member of our team, you will be responsible for implementing site reliability engineering and DevOps best practices, ensuring the scalability, reliability, and security of our cloud...
-
Senior Software Engineer
1 week ago
Xico, México | Manhattan | Full-time
We are seeking a Senior Software Engineer to join our team in Software Development. The ideal candidate will have expertise in software engineering and a passion for innovation. As a key member of our team, you will be responsible for designing and developing software solutions that meet the needs of our clients. Key Responsibilities: Design and develop...
-
Site Reliability Engineer
2 weeks ago
Xico, México | Kyndryl | Full-time
About Us: Kyndryl is a global leader in designing, building, and managing mission-critical technology systems. Our team is passionate about delivering exceptional service to our customers and driving continuous improvement in our information systems and ecosystems. The Role: We are seeking a highly skilled Site Reliability Engineer to join our team. As an SRE at...
-
Service Reliability Engineer
2 weeks ago
Xico, México | Thomson Reuters | Full-time
About the Role: We are seeking a skilled Service Reliability Engineer to join our Global Command Center team. As a key member of our team, you will be responsible for ensuring the smooth operation of our cloud infrastructure and applications. Key Responsibilities: Perform daily service checks and health evaluations to...
-
Senior Software Engineer
1 week ago
Xico, México | Opal Group | Full-time
About the Job: As a Senior Software Engineer – Cloud Solutions, you will design, develop, and deploy scalable cloud-based systems that meet the needs of our dynamic business environment. A strong understanding of cloud computing, software development, and technical leadership is required. If you have experience with cloud architecture, migration, and...
-
Software Engineer | Contribute to scalable solutions
1 week ago
Xico, México | Centro De Atracción De Talento | Full-time
About the Role: As a software engineer at {company}, you will contribute to the development of scalable solutions that meet the needs of our clients. You will work closely with cross-functional teams to design, implement, and maintain high-quality software. Collaborate with team members to identify and prioritize project requirements. Design, implement, and test...
-
Site Reliability Engineer III/Network
3 weeks ago
Xico, México | F5 | Full-time
Job Summary: F5 is seeking a highly skilled Site Reliability Engineer III to join our team. As a key member of our engineering team, you will be responsible for ensuring the reliability, availability, and scalability of our critical systems, networks, and SaaS platforms. Key Responsibilities: Apply modern engineering principles and practices to operational...