Staff Site Reliability Engineer

1 week ago


Mexico City, CDMX | Crunchyroll, LLC | Full-time

**About Crunchyroll**:
WE HELP EVERYONE BELONG. IT'S OUR PURPOSE.

Founded by fans, Crunchyroll delivers the art and culture of anime to a passionate community. We super-serve over 100 million anime and manga fans across 200+ countries and territories, and help them connect with the stories and characters they crave. Whether that experience is online or in-person, streaming video, theatrical, games, merchandise, events and more, it's powered by the anime content we all love.

Join our team and help us shape the future of anime.

**Who We Are**:
We're a cast of characters working to shine a spotlight on anime. Crunchyroll is an international business focused on creating both online and offline experiences for fans through content (licensed, co-produced, originals, distribution), merchandise, events, gaming, news, and more. Visit our About Us pages for more information about our collection of brands.

**About the Team**:
The Site Reliability Engineering (SRE) team is dedicated to ensuring the reliability, scalability, and performance of our data infrastructure. We focus on standardizing and implementing monitoring and alerting across all datastores to track key metrics like errors, latency, and throughput, and to ensure critical systems are covered. Our team also leads efforts to keep databases up-to-date, implements Infrastructure as Code (IaC) for high availability and performance, and automates key processes to enhance operational efficiency.

We lead and evangelize the principle of 100% automation. Additionally, we define and document operational requirements, develop incident response processes, and automate monitoring and compliance checks to maintain a secure and reliable data environment. By continuously improving load testing and optimizing data governance practices, we support the overall health and efficiency of our data systems.

**About the Role**

Crunchyroll is growing and changing, presenting unique challenges and opportunities to support millions of anime fans around the world. The Data Engineering team provides seamless help to our internal stakeholders, ensuring an exceptional experience for all Crunchyroll fans.

As a Staff Site Reliability Engineer for the Data Engineering team, you will be responsible for maintaining and enhancing the reliability of our data infrastructure. Your work will directly impact the availability and performance of our data services, enabling the organization to make better decisions. You will collaborate closely with data engineers and software engineers to drive 100% automation and develop best practices for deep monitoring and alerting. This role will report to our Director of Data Engineering and will be based out of our Mexico City office.

**About You**:

- Bachelor's degree in Computer Science, Information Technology, or a related field.
- 12+ years of experience in site reliability engineering, database operations, or a related role with a focus on data platforms, data stores, and data operations.
- Extensive experience with the AWS cloud platform and its data-related services.
- Proficiency in monitoring tools (e.g., Datadog, CloudWatch, DevOps Guru, DB Performance Insights).
- Proficiency in one or more programming languages (e.g., Python, Java).
- Proficiency in automation frameworks (e.g., Terraform, CloudFormation).
- Strong understanding of performance metrics at both a high level and a low level, such as disk I/O saturation.
- Experience in identifying and eliminating system bottlenecks.
- Strong understanding of database internals such as index types, schemas, and query plans.
- Strong understanding of database systems (e.g., SQL, NoSQL) and experience in managing large-scale data infrastructures.
- Strong understanding and hands-on implementation of CI/CD pipelines and DataOps practices.
- Experience with data governance, compliance, and lifecycle management.
- Ability to own and execute projects while effectively collaborating with the team to influence and shape the vision of the data engineering organization.

#LifeAtCrunchyroll #LI-Hybrid

**About our Values**:
We want to be everything for someone rather than something for everyone, and we do this by living and modeling our values in all that we do. We value:
- Courage. We believe that when we overcome fear, we enable our best selves.
- Curiosity. We are curious, which is the gateway to empathy, inclusion, and understanding.
- Service. We serve our community with humility, enabling joy and belonging for others.
- Kaizen. We have a growth mindset committed to constant forward progress.

**Our commitment to diversity and inclusion**:
Our mission of helping people belong reflects our commitment to diversity & inclusion. It's just the way we do business.

We are an equal opportunity employer and value diversity at Crunchyroll. Pursuant to applicable law, we do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran statu


  • Site Reliability Engineer

    3 weeks ago


    Mexico City | Atos | Full-time

    **Job Applicant Privacy Notice**: **Site Reliability Engineer**: - Publication Date: Jan 8, 2025 - Ref. No: - Location: Mexico City, MX **_Site Reliability Engineer_** Certain Scripting experience in languages like Java or Python or Shell scripting. - +3 years of significant experience in working as Site Reliability Engineer - Strong in Terraform, Ansible, Packer,...

  • Site Reliability Engineer

    2 weeks ago


    Mexico City | Zenta Group | Full-time

    **Site Reliability Engineer | On-site - CDMX** **Role Summary**: As a **Site Reliability Engineer (SRE)** at Zenta Group, you will be the bridge between development and operations, ensuring that services are **scalable, reliable, and resilient**. You will design and implement solutions that improve the stability and performance of the infrastructure,...


  • Mexico City | Thomson Reuters | Full-time

    Are you passionate about the chance to bring your extensive technical experience to drive the Site Reliability Engineering team using industry best practices in a world class company? Thomson Reuters ONESOURCE Platform’s SRE team is looking for a Site Reliability Engineer who will provide hands-on technical skills and share industry best practices with...

  • Site Reliability Engineer

    4 weeks ago


    Mexico City | Thomson Reuters | Full-time

    Location Mexico City, Mexico Category Technology Careers Job Id JREQ Job Type Full time Hybrid **Senior Site Reliability Engineer** Are you passionate about the chance to bring your experience to a world-class company that is market-leading for both content and technology? If yes, we’re looking for you. Join our team! We are looking for a Senior Site...

  • Site Reliability Engineer

    2 weeks ago


    Mexico City | ITJ | Full-time

    Site Reliability Engineer (SRE). The Site Reliability Engineering team constantly practices the DevOps mindset to build and deploy distributed, fault-tolerant systems at scale. As part of this team, you will work with developers, operations, and product sponsors to help design, build, and deploy the critical infrastructure needed. Essential Duties Include,...


  • Mexico City | ITJ | Full-time

    Mid-level Site Reliability Engineer (SRE). The Site Reliability Engineering team constantly practices the DevOps mindset to build and deploy distributed, fault-tolerant systems at scale. As part of this team, you will work with developers, operations, and product sponsors to help design, build, and deploy the critical infrastructure needed. Essential Duties...


  • Mexico City | 1210 Kyndryl Mexico S. de R.L. de C.V. | Full-time

    A leading global technology firm in Mexico City seeks a Site Reliability Engineer to ensure system reliability and drive continuous improvement. The ideal candidate will have operational management experience and strong skills in application monitoring and scripting languages. Join a dynamic team that values innovation and offers extensive growth...


  • Mexico City | Royal Caribbean Group | Full-time

    **Journey with us!** Combine your career goals and sense of adventure by joining our incredible team of employees at **Royal Caribbean Group**. We are proud to offer a competitive compensation and benefits package and excellent career development opportunities, each offering unique ways to explore the world. We are proud to be the vacation-industry leader with...

  • Site Reliability Engineer

    2 weeks ago


    Mexico City | Zenta Group | Full-time

    We are currently looking for **specialists in roles such as Site Reliability Engineer (SRE)**, with experience and proficiency in programming and scripting languages such as Python, Go, Java, or Bash, as well as familiarity with cloud environments (AWS, Google Cloud, Azure) and infrastructure automation (Terraform,...

  • Site Reliability Engineer

    2 weeks ago


    Mexico City | GrainChain Inc | Full-time

    We are looking for new talent! GrainChain is a technology company dedicated to closing the digital divide in the agricultural industry. Our platforms make transactions fast, secure, and simple for our users. We are looking for a Site Reliability Engineer able to integrate and automate the areas of...