Site Reliability Engineer III

5 months ago


Remote, Mexico Cabify Full-time

Do you want to change the world? At Cabify, that’s what we’re doing. We aim to make cities better places to live by improving mobility for the people living in them, connecting riders to drivers, and providing mobility alternatives such as scooters, mopeds and many others to come, all at the touch of a button. Maybe one day cities will be places where nobody needs a private car. But we’ve still got a long way to go. Fancy joining us?

Our Product & Engineering teams are both based in Madrid, with a strong remote culture, and include an eclectic bunch of awesome people from different backgrounds like Ruby, Go, Elixir, JavaScript, and Python.

Right now, we’re working on some greenfield projects with a solid set of product ideas lined up ready for innovative engineers to tackle. And of course, we have big plans to take over the taxi app service industry.

Site Reliability Engineers at Cabify work on improving all aspects of our platform and have an impact across the whole organisation. They are a blend of systems engineers and software developers who solve scalability issues with software and implement the best production engineering and security practices.

**As a Site Reliability Engineer, you will be**:

- Evolving our infrastructure platform by building self-service components that will be used by the whole engineering team and by millions of users around the world.
- Working closely with our Product and Infrastructure teams to architect and develop world-class infrastructure components.
- Designing and implementing tooling to improve the availability, scalability, observability and latency of our services, which are used by internal customers to deploy and operate their services.
- Increasing reliability awareness with other teams, helping with the adoption of reliability principles and reviewing observability implementations or software architectures.
- Defining SLIs, SLOs and SLAs as part of the services' lifecycle.
- Sharing an on-call schedule for the platform services you own.
- Solving problems in our highly available platform together with other teams, then building automations to prevent incidents from happening again.
- Participating in our recruiting process to help grow our engineering team.

**You may be a fit for this role if you**:

- Think Unix, you know the networking stack, the OSI model, containers (and schedulers), and you know your way around monitoring, logging and the CAP theorem (bonus).
- Have strong programming skills in at least one language, and know your way around a few more or can learn them if the opportunity arises.
- Automate yourself out of everything by nature, making machines do the toil.
- Communicate effectively and asynchronously.
- Care about the things that affect the company, your team, and yourself.
- Embrace diversity and humbleness (and a bit of trolling).
- Prefer taking iterative action over waiting for things to happen or to be perfect.
- Strongly favor simplicity over complexity, i.e., adhere to the KISS principle.
- Have a sense for identifying, exploiting and elevating bottlenecks.
- Are not afraid of expressing yourself in English. We aren't expecting you to have the Queen's accent, but you'll be part of an international team and we communicate in English, so you should be comfortable with that.
- Enjoy herding cats and shaving yaks, i.e., being a great influence on other product teams and teaching them best practices, as well as analyzing and simplifying our setup.

**Projects you could work on**:

- Helping us iterate on and improve our Kubernetes setup (AWS EKS).
- Iterating on our networking layer to implement network policies, service mesh, and more.
- Evolving our time-series monitoring platform (Cortex), in order to provide a first-class service to all of our engineering teams.
- Helping grow our adoption of distributed tracing (OTLP + Tempo), with the goal of providing request latency observability across microservices (as a service).
- Scaling our ever-growing logging platform (Loki) to keep up with the business demands.
- Maintaining our company-wide code repository and continuous integration solution (GitLab).

**What’s it like to work at Cabify?**

We’re a company full of happy, motivated people, and we never want that to change. Here are some more reasons why it rocks to be part of our high-performance team:

- Excellent salary conditions: **L3 - up to 52K**.
- Recharge day.
- Flexible work environment & hours.
- Regular team events.
- Cabify staff free rides.
- Personal development programs based on our career paths.
- iFeel: free access to the iFeel platform, so you can take care of your emotional well-being through therapy sessions.
- Coursera: your own Coursera license to take as many courses as you wish and continue developing your skills.
- Flexible compensation plan: restaurant tickets, transport tickets, healthcare and childcare.
- All the equipment you need (you only have to bring your talent).

Cabify is proud of being an equal opportunity



  • Remote, Mexico thegetch mexico Full-time

    **Role: Site Reliability Engineer** **Openings: more than 10 hires** **Location: any city with a TCS office presence (Queretaro, Guadalajara, Mexico City or Monterrey)** **Salary: 25-33 USD/hr** **English communication: advanced** **Experience: 4+ years** **Site Reliability Engineer responsibilities**: Gather and analyze metrics...


  • Remote, Mexico Right Balance Full-time

    **Overview** We're looking for a Site Reliability Engineer. Headquartered in Los Angeles, California, Right Balance provides top-tier technology talent for innovative companies in the US. We’re in the top 50 companies to watch in LA. **Engagement Details** Our client is a USA-based company producing video solutions with the mission to advance scientific...


  • Remote, Mexico Tekshapers Inc Full-time

    **Position : Lead Site Reliability Engineer** **Location : Remote** **Duration : Contract** - Lead and mentor a team of SREs to ensure operational excellence and maximize the reliability and availability of client systems. - Minimum 10 years of work experience in DevOps/SRE, including leadership roles. - Architect and design highly scalable and available...


  • Remote, Mexico EPAM Systems Full-time

    **DESCRIPTION**: Are you a skilled Azure DevOps Site Reliability Engineer with a passion for ensuring business continuity and helping businesses always be near their clients? Do you have experience in optimizing and supporting OSDU deployment, performing monitoring including incidents resolution, and suggesting improvements? If so, we have an exciting...


  • Remote, Mexico Synechron Full-time

    Synechron is a self-funded, leading digital transformation consulting firm focused on the financial services industry, working to accelerate digital initiatives for banks, asset managers and insurers. We achieve this by providing our clients with innovative solutions that solve their most complex business challenges and combining Synechron’s unique,...


  • Remote, Mexico EPAM Systems Full-time

    **DESCRIPTION**: Join EPAM as a **Senior Site Reliability Engineer specializing in AWS!** In this role, you'll ensure fleet services reliability and availability under the SRE model. If you have a good track record of highly scalable, distributed systems projects and previous experience working as an SRE, we'd love to hear from you. EPAM is a leading...


  • Remote, Mexico EPAM Systems Full-time

    **DESCRIPTION**: Are you a skilled **Cloud Site Reliability Engineer with experience in AWS or GCP?** If so, we have an exciting opportunity for you! We're currently seeking a Cloud Site Reliability Engineer to join our vibrant team. This role offers the chance to help the product team in maximizing the reliability of software solutions and ensure that...


  • Remote, Mexico EPAM Systems Full-time

    **DESCRIPTION**: Join EPAM as an **AWS Cloud Site Reliability Engineer.** In this role, you'll transfer security processes, manage authentication technologies, and support the implementation of a Palo Alto firewall. If you have 3+ years of experience with AWS, proficiency in designing and managing data migration processes, and superior communication...


  • Remote, Mexico EPAM Systems Full-time

    **DESCRIPTION**: Join EPAM as a remote **Site Reliability Engineer specializing in Java.** In this role, you'll provide 24/7 on-call support for Java backend services, prepare and deploy patches, and assist in establishing top-of-the-line metrics and dashboards. If you have 5-8 years of experience as a DevOps/SRE, proficiency in Java, and experience with...


  • Remote, Mexico Consultoria Aguilar Full-time

    Cloud Operations Engineer / Site Reliability Engineer (SRE) About the Company: Datascore.ai, through its EngageIQ platform, specializes in enhancing lead generation and engagement. The company leverages advanced data science and AI to score and enrich leads, optimizing outreach...

  • TFA III

    5 months ago


    Remote, Mexico FieldCore Full-time

    **Job Summary**: The Controls TFA III, having completed level III competencies, manages controls activities on site during the installation, commissioning and maintenance of heavy-duty turbines/equipment (generator or mechanical drive). Responsible for reviewing plant engineering documents and P&IDs, troubleshooting of plant systems and equipment,...


  • Remote, Mexico EPAM Systems Full-time

    We are on the lookout for a skilled **Senior C++ Software Engineer** with deep expertise in Site Reliability Engineering, Borg, Spanner, and Google Cloud Platform. As a critical member of our Engineering team, you'll engage with a prestigious global Google infrastructure project, deploying various cutting-edge backend and cloud technologies. Your...

  • C++ Software Engineer

    1 week ago


    Remote, Mexico EPAM Systems Full-time

    We are seeking an experienced **C++ Software Engineer** with expertise in Site Reliability Engineering, Borg, Spanner, and Google Cloud Platform. You will be an integral part of the Engineering team, working on a top-notch global Google infrastructure project involving a variety of modern backend and cloud technologies. Your role will involve...

  • Cloud Network Engineer

    5 months ago


    Remote, Mexico RED AMIGO DAL S.A.P.I. of C.V. S.O.F.O.M. E.N.R Full-time

    **What's Konfio?** A financial technology company dedicated to supporting small and medium-sized companies in Mexico, developing and offering financial solutions to solve their main problems, and seeking to be the best ally of entrepreneurs with dreams and ambitions to create value, consolidate their well-being and contribute to the...

  • DevOps Engineer

    5 months ago


    Remote, Mexico Soft Dev Team Full-time

    DevOps engineers play a pivotal role in streamlining the software development process, ensuring swift time-to-market, and adapting to market dynamics and competition. They are responsible for maintaining system stability and reliability while continually improving the mean time to recovery. Expertise in AWS (Amazon Web Services) is essential for this role....


  • Remote, Mexico Rejuve.AI Full-time

    **Position: Systems/DevOps Engineer** **Location**: 100% Remote **About Rejuve.AI** Rejuve.AI is an emerging spin-off project of SingularityNET, focused on extending the healthy human lifespan by creating a decentralized self-sustained research community powered by blockchain, AI, and the valuable contributions of data and AI models. Rejuve.AI’s core...

  • Maintenance Engineer

    5 months ago


    Remote, Mexico Gold Media Tech Full-time

    As a Maintenance Engineer, your primary responsibility will be to maintain and improve our technical infrastructure, ensuring it operates efficiently and effectively for our lending solutions. This role requires a deep understanding of engineering principles and a commitment to preventative maintenance and quick problem resolution. Working closely with the...

  • Break-fix Engineer

    5 months ago


    Remote, Mexico Staff4Me Full-time

    Staff4Me is currently seeking a dedicated and experienced Break-fix Engineer to join our team. As a Break-fix Engineer, you will be responsible for providing technical support and resolving hardware and software issues for our clients. You will work closely with our clients and cross-functional teams to diagnose and fix problems in a timely...

  • Maintenance Engineer

    5 months ago


    Remote, Mexico AltScore Full-time

    **What we’re looking for**: As a Maintenance Engineer at AltScore, your primary responsibility will be to maintain and improve our technical infrastructure, ensuring it operates efficiently and effectively for our lending solutions. This role requires a deep understanding of engineering principles and a commitment to preventative maintenance and quick...

  • Wi-Fi Support Engineer

    5 months ago


    Remote, Mexico Staff4Me Full-time

    Staff4Me is currently looking for a skilled and dedicated Wi-Fi Support Engineer to join our team. As a Wi-Fi Support Engineer, you will be responsible for providing technical support and troubleshooting assistance for Wi-Fi-related issues. You will collaborate with clients and cross-functional teams to deliver reliable and high-performance wifi...