Site Reliability Engineer

3 months ago


Remote, Mexico · Right Balance · Full-time

**Overview**

We're looking for a Site Reliability Engineer. Headquartered in Los Angeles, California, Right Balance provides top-tier technology talent for innovative companies in the US. We’re in the top 50 companies to watch in LA.

**Engagement Details**

Our client is a USA-based company producing video solutions with the mission to advance scientific research and education. Their institutional clients comprise over 1,000 universities, colleges, and biopharma companies, including such leaders as Harvard, MIT, Yale, and Stanford. As a rapidly growing company, with offices in the USA, UK, Australia, and India servicing clients in over 60 countries, our client is seeking talented individuals to join their company.

Our client is looking for an amazing Site Reliability Engineer who will be part of their centralized Site Reliability Team. You will play an integral role in the deployment of highly scalable systems and in the optimization, documentation, and support of the infrastructure components of their software products hosted on AWS. Cloud Infrastructure and Operations are critical in enabling our client to provide users with its technology offerings.

**What’s in it for you?**
- Learn and evolve your skills using the latest and greatest technology tools in a rapidly growing company.
- Learn from the best people around you. We constantly challenge the status quo and invent new ways of building a great product.
- 100% remote. Work from anywhere: the comfort of your home, a shared co-working space, an RV on the beach, or another country as a nomad.
- Work on challenging problems, innovate, and positively impact many people's lives while having fun doing it.

**Required Qualifications**
- Upper-intermediate to fluent spoken and written English. Able to hold a real-time conversation.
- 3+ years of full-time hands-on Site Reliability Engineer experience.
- 3+ years of full-time hands-on DevOps experience.
- 3+ years of full-time hands-on AWS experience.
- 2+ years of full-time hands-on Docker experience.
- 2+ years of full-time hands-on Kubernetes experience.
- 2+ years of full-time hands-on IaC (Infrastructure as Code) experience.
- 2+ years of full-time hands-on Software Developer experience.
- 2+ years of full-time hands-on JavaScript experience.
- 1+ years of full-time hands-on Terraform experience.
- 1+ years of full-time hands-on PHP experience.
- Extensive in-depth experience with cloud-based provisioning, monitoring, troubleshooting, and related SRE and DevOps technologies, in addition to networking knowledge.
- MUST have working experience with cloud-native infrastructure such as AWS or GCP (ideally AWS).
- MUST understand AWS VPC, subnets, Network ACLs, Security Groups, IAM roles, and EKS.
- Experience configuring Kubernetes RBAC Authorization, Ingress controller, ServiceAccount, and AWS role annotations.
- Strong experience with CI/CD automation and configuration management.
- Experience with monitoring and observability systems such as New Relic, DataDog, Grafana, Kibana, CloudWatch, and Kafka.
- Ability to triage and resolve incidents and lead incident investigations.
- Experience with security practices, credential rotations, and secrets management systems like the Vault project.
- Must be able to ensure Agile/Scrum concepts and principles are adhered to and be a voice of reason.
- Experience working in a 24/7 on-call, highly transactional, or streaming production environment.

**Nice to Haves**
- Working knowledge of GitOps, FluxCD, or ArgoCD.
- Experience building Kubernetes Operators is a plus.
- Go (programming language) expertise.
- Crossplane experience.
- Bachelor’s degree in Computer Science or equivalent demonstrated ability.

**Frequently Asked Questions**
- _What are your typical clients?_

The majority of our clients are venture-backed startups at the growth stage. Usually, at this stage, the company has already achieved product-market fit and is looking to expand rapidly. That’s where we bring the best engineering practices, strong architecture, the latest technologies, and consistent processes to help companies scale.
- _What is the length of your engagements?_

Most of our long-term full-time engagements last multiple years. This allows you to grow your career with the client company while taking on more responsibilities.
- _What’s your company size?_

The Right Balance team is 60+ engineers, growing to 100+ by the end of the year. The client's team currently numbers 584+ people. The timing is great to be a part of a rapidly growing team making meaningful contributions.
- _What happens if the engagement is completed?_

Most of our engagements are long-term in nature. That said, if the current engagement is ramping down, we’ll present you with more long-term opportunities to transition into.
- _What are your core values?_
- Client First: we only win when our clients win. We treat client challenges as our own.
- Ownership: we embrace responsibility, taking on challenges, getti



  • Remote, Mexico · thegetch mexico · Full-time

    **Role: Site Reliability Engineer** **Openings: more than 10 hires** **Location: any city with a TCS office presence (Queretaro, Guadalajara, Mexico City, or Monterrey)** **Salary: 25-33 USD/hr** **English communication: advanced** **Experience: 4+ years** **Site Reliability Engineer responsibilities**: Gather and analyze metrics...

  • Site Reliability Engineer

    2 weeks ago


    Remote, Mexico · Wise Athena · Full-time

    **Join Our Team as an SRE!** Wise Athena is looking for a **Site Reliability Engineer (SRE)** to join our dynamic and innovative team! At our company, we’re revolutionizing Revenue Growth Management (RGM) with the power of AI. You will work with a passionate, forward-thinking team. This is a fully remote position. **Key Responsibilities** - **Problem...


  • Remote, Mexico · Tekshapers Inc · Full-time

    **Position: Lead Site Reliability Engineer** **Location: Remote** **Duration: Contract** - Lead and mentor a team of SREs to ensure operational excellence and maximize the reliability and availability of client systems. - Minimum 10 years of work experience in DevOps/SRE, including leadership roles. - Architect and design highly scalable and available...


  • Remote, Mexico · EPAM Systems · Full-time

    **DESCRIPTION**: Are you a skilled Azure DevOps Site Reliability Engineer with a passion for ensuring business continuity and helping businesses always be near their clients? Do you have experience in optimizing and supporting OSDU deployment, performing monitoring including incidents resolution, and suggesting improvements? If so, we have an exciting...


  • Remote, Mexico · Synechron · Full-time

    Synechron is a self-funded, leading digital transformation Consulting firm focused on the financial services industry working to accelerate digital initiatives for Banks, Asset Managers and Insurance. We achieve this by providing our clients with innovative solutions that solve their most complex business challenges and combining Synechron’s unique,...


  • Remote, Mexico · Cabify · Full-time

    Do you want to change the world? At Cabify, that’s what we’re doing. We aim to make cities better places to live by improving mobility for the people living in them, connecting riders to drivers, providing mobility alternatives such as scooters and mopeds and many others to come, all at the touch of a button. Maybe one day cities will be places where...


  • Remote, Mexico · EPAM Systems · Full-time

    **DESCRIPTION**: Join EPAM as a **Senior Site Reliability Engineer specializing in AWS!** In this role, you'll ensure fleet services reliability and availability under the SRE model. If you have a good track record of highly scalable, distributed systems projects and previous experience working as an SRE, we'd love to hear from you. EPAM is a leading...


  • Remote, Mexico · EPAM Systems · Full-time

    **DESCRIPTION**: Are you a skilled **Cloud Site Reliability Engineer with experience in AWS or GCP?** If so, we have an exciting opportunity for you! We're currently seeking a Cloud Site Reliability Engineer to join our vibrant team. This role offers the chance to help the product team in maximizing the reliability of software solutions and ensure that...


  • Remote, Mexico · EPAM Systems · Full-time

    **DESCRIPTION**: Join EPAM as an **AWS Cloud Site Reliability Engineer.** In this role, you'll transfer security processes, manage authentication technologies, and support the implementation of a Palo Alto firewall. If you have 3+ years of experience with AWS, proficiency in designing and managing data migration processes, and superior communication...


  • Remote, Mexico · EPAM Systems · Full-time

    **DESCRIPTION**: Join EPAM as a remote **Site Reliability Engineer specializing in Java.** In this role, you'll provide 24/7 on-call support for Java backend services, prepare and deploy patches, and assist in establishing top-of-the-line metrics and dashboards. If you have 5-8 years of experience as a DevOps/SRE, proficiency in Java, and experience with...


  • Remote, Mexico · Consultoria Aguilar · Full-time

    Cloud Operations Engineer / Site Reliability Engineer (SRE) Job Description: Cloud Operations Engineer / Site Reliability Engineer (SRE) About the Company: Datascore.ai, through its EngageIQ platform, specializes in enhancing lead generation and engagement. The company leverages advanced data science and AI to score and enrich leads, optimizing outreach...


  • Remote, Mexico · EPAM Systems · Full-time

    We are on the lookout for a skilled **Senior C++ Software Engineer** with deep expertise in Site Reliability Engineering, Borg, Spanner, and Google Cloud Platform. As a critical member of our Engineering team, you'll engage with a prestigious global Google infrastructure project, deploying various cutting-edge backend and cloud technologies. Your...


  • Remote, Mexico · EPAM Systems · Full-time

    We are seeking an experienced **C++ Software Engineer** with expertise in Site Reliability Engineering, Borg, Spanner, and Google Cloud Platform. You will be an integral part of the Engineering team, working on a top-notch global Google infrastructure project involving a variety of modern backend and cloud technologies. Your role will involve...

  • Cloud Network Engineer

    6 months ago


    Remote, Mexico · RED AMIGO DAL S.A.P.I. de C.V. S.O.F.O.M. E.N.R · Full-time

    **What's Konfio?** A financial technology company dedicated to supporting the small and medium-sized companies in Mexico, developing and offering financial solutions to solve their main problems, and seeking to be the best ally of entrepreneurs with dreams and ambitions to create value, consolidate their well-being and contribute to the...

  • DevOps Engineer

    6 months ago


    Remote, Mexico · Soft Dev Team · Full-time

    DevOps engineers play a pivotal role in streamlining the software development process, ensuring swift time-to-market, and adapting to market dynamics and competition. They are responsible for maintaining system stability and reliability while continually improving the mean time to recovery. Expertise in AWS (Amazon Web Services) is essential for this role....


  • Remote, Mexico · Rejuve.AI · Full-time

    **Position: Systems/DevOps Engineer** **Location**: 100% Remote **About Rejuve.AI** Rejuve.AI is an emerging spin-off project of SingularityNET, focused on extending the healthy human lifespan by creating a decentralized self-sustained research community powered by blockchain, AI, and the valuable contributions of data and AI models. Rejuve.AI’s core...

  • Maintenance Engineer

    6 months ago


    Remote, Mexico · Gold Media Tech · Full-time

    As a Maintenance Engineer, your primary responsibility will be to maintain and improve our technical infrastructure, ensuring it operates efficiently and effectively for our lending solutions. This role requires a deep understanding of engineering principles and a commitment to preventative maintenance and quick problem resolution. Working closely with the...

  • Break-fix Engineer

    6 months ago


    Remote, Mexico · Staff4Me · Full-time

    Staff4Me is currently seeking a dedicated and experienced Break-fix Engineer to join our team. As a Break-fix Engineer, you will be responsible for providing technical support and resolving hardware and software issues for our clients. You will work closely with our clients and cross-functional teams to diagnose and fix problems in a timely...

  • Maintenance Engineer

    6 months ago


    Remote, Mexico · AltScore · Full-time

    **What we’re looking for**: As a Maintenance Engineer at AltScore, your primary responsibility will be to maintain and improve our technical infrastructure, ensuring it operates efficiently and effectively for our lending solutions. This role requires a deep understanding of engineering principles and a commitment to preventative maintenance and quick...

  • Wifi Support Engineer

    6 months ago


    Remote, Mexico · Staff4Me · Full-time

    Staff4Me is currently looking for a skilled and dedicated Wifi Support Engineer to join our team. As a Wifi Support Engineer, you will be responsible for providing technical support and troubleshooting assistance for wifi-related issues. You will collaborate with clients and cross-functional teams to deliver reliable and high-performance wifi...