Reliability Solutions Engineer

2 weeks ago


Xico, México Ford Motor Company Full-time
Job Description

The SRE Software Engineer is responsible for designing, configuring, implementing, monitoring, and maintaining our observability solutions, and for troubleshooting Ford Credit IT systems and applications to ensure optimal performance and reliability.

Key Responsibilities
  • Using observability and monitoring tools to detect and resolve issues that degrade the user experience.
  • Automating alerting and remediation processes to reduce mean time to resolution (MTTR) and improve system uptime.
  • Working with the Splunk query language and monitoring database connection health using Splunk DB Connect health dashboards, log parsing, and complex Splunk searches (including external table lookups), with knowledge of Splunk data flow, components, features, and product capabilities.
  • Implementing comprehensive monitoring and alerting solutions using GCP monitoring services and external services.
  • Gathering and analyzing metrics from operating systems as well as applications to assist in performance tuning and fault finding.
  • Building essential, efficient tooling that lowers the barrier to entry for engineering teams to plug in and benefit from observability-focused reliability.
Requirements
  • 6+ years of SRE observability engineering experience.
  • 6+ years of experience in observability best practices working with Dynatrace or similar tools (New Relic, Datadog, AppDynamics, or other APM suites), delivering solutions across all environments, and integrating platforms and applications with monitoring and APM tools.
  • Knowledge of CI/CD and infrastructure automation tools such as Puppet, Jenkins, Terraform, and Ansible.
  • Minimum of 4 to 5 years of working experience with OpenShift and Docker/Kubernetes (K8s).
  • Proficiency in implementing monitoring and observability solutions using GCP monitoring services such as Cloud Monitoring, Logging, and Tracing.
  • Deep understanding of IT infrastructure monitoring and observability best practices.
  • Experience with gathering and organizing large amounts of data to use for instrumentation into an Enterprise monitoring solution.
  • Experience with recommending baseline monitoring thresholds and performance monitoring KPIs and SLAs.
  • At least 4 years of experience developing Grafana dashboards and standardizing metrics and monitoring (metrics, collection, dashboards); Grafana experience is a must.
  • 3-5 years of experience with SQL and familiarity with at least one managed Kubernetes platform (EKS, AKS, GKE).
  • Strong background in software engineering, with expertise in relevant programming languages (like Python, Java, Go) and cloud platforms (like AWS, GCP, Azure).
  • Experience with container orchestration tools like Kubernetes.
Competencies
  • Strong interpersonal and organizational skills.
  • Strong verbal and written skills.
  • Attention to detail.
  • Excellent time management.
  • Extraordinary teamwork and collaborative skills.

  • Reliability Engineer

    4 days ago


    Xico, México Thales Full-time

    About the Role: Thales is seeking a highly skilled Customer Reliability Engineer to ensure seamless customer experiences by guaranteeing service reliability and swift incident resolution. This hybrid position, based in Mexico City, Mexico, offers an exciting opportunity to collaborate with a talented team. Key Responsibilities: Manage incidents and service...


  • Xico, México It Crowd Argentina Full-time

    Cloud Reliability Engineer with Kubernetes Expertise. Are you a skilled Cloud Reliability Engineer looking for a new challenge? We're seeking a talented professional to join our team as a Cloud Reliability Engineer with expertise in Kubernetes. About Our Client: Our client is a leading global IT services company offering enterprise digital transformation and...

  • Reliability Engineer

    4 weeks ago


    Xico, México Kyndryl Inc. Full-time

    Reliability Engineer. We are seeking a highly skilled Reliability Engineer to join our team at Kyndryl Inc. As a Reliability Engineer, you will be responsible for ensuring the reliability and resiliency of our information systems and ecosystems. Your key responsibilities will include: Designing and implementing application monitoring to ensure reliability and...


  • Xico, México Global Payments Full-time

    Cloud Reliability Engineer. At Global Payments, we're driven by our passion for success and our commitment to delivering best-in-class payment technology and software solutions. As a Cloud Reliability Engineer, you'll play a critical role in ensuring the reliability and stability of our cloud-based systems. Key Responsibilities: Design and implement chaos...


  • Xico, México Kyndryl Full-time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our Kyndryl team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, resiliency, and innovation of our information systems and ecosystems. Your Key Responsibilities: Analyze complex problems and develop strategic advice and designs to drive...

  • Reliability Engineer

    7 days ago


    Xico, México Think Hst Full-time

    About Us: At Think Hst, we are seeking a skilled Senior Site Reliability Engineer to join our team. This role is ideal for someone with strong technical expertise and leadership abilities. About the Role: We are looking for an experienced engineer who can set direction and inspire team members to deliver high-quality application support. The ideal candidate...

  • Reliability Engineer II

    3 weeks ago


    Xico, México Global Payments (Beamery) Full-time

    About the Role: Global Payments (Beamery) is seeking an experienced Reliability Engineer II to join our team. In this role, you will be responsible for designing and implementing strategies to ensure the reliability and scalability of our systems. This includes conducting chaos engineering, pushing systems to their limits, and implementing remediation plans as...

  • Site Reliability Engineer

    4 weeks ago


    Xico, México Jaak-It S.A.P.I. De C.V. Full-time

    We are looking for a Site Reliability Engineer to join our technology team at Jáak-It S.A.P.I. De C.V. As a Site Reliability Engineer, you will be responsible for ensuring the availability and reliability of our systems and services. Responsibilities: Contribute to troubleshooting and technical support to resolve incidents...


  • Xico, México Comau Automatizacion S De Rl De Cv Full-time

    About the Role: We are seeking a highly skilled Software Engineer to join our team. The ideal candidate will have a strong background in software architecture and a passion for scalability and reliability. Key Responsibilities: Design and implement scalable software solutions. Collaborate with cross-functional teams to deliver high-quality software...


  • Xico, México The Bridge México Full-time

    Transforming Systems, Transforming Lives. At The Bridge Mexico, we are on a mission to create a world where technology and people thrive together. We are seeking a highly skilled Site Reliability Engineer to join our team and help us achieve this vision. About the Role: We are looking for a talented engineer who can design, implement, and maintain scalable and...


  • Xico, México Global Payments Full-time

    At Global Payments, we're driven by our passion for success and proud to deliver best-in-class payment technology and software solutions. As a Cloud Reliability Engineer, you'll play a critical role in ensuring the stability and reliability of our systems. Key Responsibilities: Design and implement chaos engineering tests to identify potential system...

  • Site Reliability Engineer

    3 weeks ago


    Xico, México Thomson Reuters Full-time

    About the Role: We are seeking a highly skilled Senior Site Reliability Engineer to join our Service Reliability and Operation group at Thomson Reuters. As a key member of our team, you will be responsible for implementing site reliability engineering and DevOps best practices, ensuring the scalability, reliability, and security of our cloud...

  • Software Engineer

    3 weeks ago


    Xico, México Pepsico Full-time

    About the Role: We are seeking a skilled Software Engineer to join our team. As a Software Engineer, you will be responsible for designing and developing innovative software solutions that meet the needs of our customers. Key Responsibilities: Design and develop software applications using various programming languages. Collaborate with cross-functional teams to...

  • Site Reliability Engineer

    3 weeks ago


    Xico, México Kyndryl Full-time

    About Us: Kyndryl is a global leader in designing, building, and managing mission-critical technology systems. Our team is passionate about delivering exceptional service to our customers and driving continuous improvement in our information systems and ecosystems. The Role: We are seeking a highly skilled Site Reliability Engineer to join our team. As an SRE at...


  • Xico, México Thomson Reuters Full-time

    Service Reliability Engineer. About the Role: We are seeking a skilled Service Reliability Engineer to join our Global Command Center team. As a key member of our team, you will be responsible for ensuring the smooth operation of our cloud infrastructure and applications. Key Responsibilities: Perform daily service checks and health evaluations to...

  • Senior Software Engineer

    2 weeks ago


    Xico, México Manhattan Full-time

    We are seeking a Senior Software Engineer to join our team in Software Development. The ideal candidate will have expertise in Software Engineering and a passion for Innovation. As a key member of our team, you will be responsible for designing and developing software solutions that meet the needs of our clients. Key Responsibilities: Design and develop...

  • Senior Software Engineer

    2 weeks ago


    Xico, México Opal Group Full-time

    About the Job: As a Senior Software Engineer – Cloud Solutions, you will design, develop, and deploy scalable cloud-based systems that meet the needs of our dynamic business environment. A strong understanding of cloud computing, software development, and technical leadership is required. If you have experience with cloud architecture, migration, and...


  • Xico, México Gsb Solutions Full-time

    Gsb Solutions seeks a seasoned Cloud Engineer to join its team in a critical role that combines technical expertise and business acumen. The ideal candidate will possess a strong background in Azure Cloud Platform and Site Reliability Engineering (SRE) philosophies. The successful applicant will have a minimum of 5 years' experience in Azure Cloud Platform,...


  • Xico, México Centro De Atracción De Talento Full-time

    About the Role: As a software engineer at {company}, you will contribute to the development of scalable solutions that meet the needs of our clients. You will work closely with cross-functional teams to design, implement, and maintain high-quality software. Collaborate with team members to identify and prioritize project requirements. Design, implement, and test...

  • Reliability Engineer

    3 weeks ago


    Xico, México Bsb-Jll - Lasalle Services, Mex Full-time

    Bsb-Jll - Lasalle Services, Mex is seeking a highly skilled Reliability Engineer to join their team. The successful candidate will be responsible for implementing a strategic asset management plan to integrate clients' existing systems, including building automation, energy management, maintenance programs, and a life-cycle asset management approach. The ideal...