Site Reliability Engineer
1 month ago
About The Position
Job Description
The Site Reliability Engineer (SRE) is responsible for implementing and maintaining the cloud infrastructure that runs the services developed by Trax.
SREs are responsible for the reliability and scalability of Trax services. This includes supporting both our production-critical systems and our internal tools for developer productivity.
A strong candidate for this position would be a generalist who can maintain our cloud infrastructure while being an advocate for DevOps principles throughout our organization.
Responsibilities:
- Implement cost-effective and scalable solutions to complex cloud infrastructure problems.
- Maintain the reliability of our cloud infrastructure while simultaneously improving and upgrading it.
- Perform low-level analysis and debugging of problems in both containerized and VM-based Linux workloads.
- Automate manual processes to improve developer productivity.
- Ensure stable and reliable releases by maintaining and improving our CI/CD systems.
- Be an advocate for DevOps best practices in both the Infrastructure team and across the organization.
- Manage and participate in a rotating on-call team that is responsible for handling high-priority bugs and issues.
Requirements:
- 5+ years of experience managing Linux-based server operating systems.
- 5+ years of experience managing cloud infrastructure (GCP, AWS, or Azure).
- 5+ years of experience managing large high-performance databases and data processing jobs for business-critical reporting applications.
- 5+ years of experience managing environments using Infrastructure and Configuration-as-Code (Terraform / CloudFormation / Puppet / Chef / etc.).
- 5+ years of experience with CI/CD and test automation systems (Jenkins / GitLab / Argo / Helm / etc.).
- Excellent written and verbal communication skills and ability to communicate with stakeholders across the business.
- Knowledge of monitoring systems including host/OS metrics, logging, and web application performance, using both SaaS products (Datadog / New Relic / etc.) and open-source solutions (syslog / Loki / Grafana / etc.).
- Knowledge of container orchestration systems such as Kubernetes, including autoscaling, service mesh, rollout strategies, and cost management.
- Knowledge of network protocols, including TCP/IP, HTTP/S, DNS, DHCP, and NAT.
- Thorough understanding of web service fundamentals, such as caching, CDNs, load balancing, and traffic shaping.
- MySQL database performance tuning and high-availability experience.
- Experience with security systems, including WAF, firewall rules, public key infrastructure, and cryptography.
- Experience writing code in any programming language.
- Experience writing optimized SQL queries.
Preferred Skills and Experience:
- Production experience with Google Cloud Platform (GCP).
- Ability to code modern, containerized web applications.
- Strong understanding of the Python programming language.
- Ability to perform low-level network debugging, including packet analysis and an understanding of the Linux network stack.
-
Senior Site Reliability Engineer
1 month ago
Distrito Federal, México; Trimble; Full-time
Your Title: Senior Site Reliability Engineer. Location: Mexicali, Mexico. We are seeking a skilled and motivated Senior Site Reliability Engineer to join our team in Trimble’s Core Cloud Platform. The ideal candidate will have a strong background in cloud platforms, infrastructure as code, and automation via programming/scripting languages. You will embed...
-
Senior Site Reliability Engineer
3 weeks ago
Distrito Federal, México; Refinitiv; Full-time
Senior Site Reliability Engineer. Remote type: Hybrid. Location: MEX-Distrito Federal-Reforma 26. Time type: Full time. Posted 13 days ago. End date: November 8, 2024 (4 days left to apply). Job requisition ID: JREQ177645 ...
-
Site Reliability Engineer III/Network
1 month ago
Distrito Federal, México; F5; Full-time
Position Summary: Software engineering is a core discipline at F5 for many roles. As a software engineer specializing in site reliability, you will bring a software engineering and automated solution mindset to your work. The Site Reliability Engineer III will be responsible for ensuring the reliability, availability, and scalability of critical systems,...
-
Site Reliability Engineer
1 month ago
Distrito Federal, México; Thales; Full-time
Thales people architect identity management and data protection solutions at the heart of digital security. Business and governments rely on us to bring trust to the billions of digital interactions they have with people. Our technologies and services help banks exchange funds, people cross borders, energy become smarter and much more. More than 30,000...
-
Staff Site Reliability Engineer
3 weeks ago
Distrito Federal, México; Crunchyroll, LLC; Full-time
About Crunchyroll: WE HELP EVERYONE BELONG. IT'S OUR PURPOSE. Founded by fans, Crunchyroll delivers the art and culture of anime to a passionate community. We super-serve over 100 million anime and manga fans across 200+ countries and territories, and help them connect with the stories and characters they crave. Whether that experience is online or...
-
Staff Site Reliability Engineer
3 weeks ago
Distrito Federal, México; Ellation US; Full-time
WE HELP EVERYONE BELONG. IT’S OUR PURPOSE. Founded by fans, Crunchyroll delivers the art and culture of anime to a passionate community. We super-serve over 100 million anime and manga fans across 200+ countries and territories, and help them connect with the stories and characters they crave. Whether that experience is online or in-person, streaming...
-
Site Reliability Engineer
3 weeks ago
Distrito Federal, México; Improving; Full-time
Improving is committed to building a great place to work by cultivating an environment that fosters professional and personal relationships. We value open communication, personal growth, and shared rewards, which result in sustainable success. Voted “best place to work” numerous times, Improving strives to create and maintain a culture that exemplifies...
-
Service Reliability Engineer
1 month ago
Distrito Federal, México; Thales Group; Full-time
Service Reliability Engineer. This is a hybrid position within Mexico City, Mexico. Thales is looking for a Service Reliability Engineer who is primarily responsible for ensuring the best customer experience by assuring service reliability from the customer's perspective and making sure Incident/Service Requests are resolved in the...
-
Site Reliability Engineer
1 month ago
Distrito Federal, México; Ford Motor Company; Full-time
The SRE Software Engineer is responsible for designing, configuring, monitoring, implementing, and maintaining our observability solutions and troubleshooting Ford Credit IT systems and applications to ensure optimal performance and reliability. Major responsibilities: Utilizing Observability and Monitoring tools to detect and resolve issues affecting positive...
-
Reliability Engineer
1 month ago
Distrito Federal, México; Wipro; Full-time
Role: Reliability Engineer. Great opportunity to work at a global company in the education/publishing sector. Location: Hybrid in Mexico City (1 or 2 days per week in office). Required Skills and Experience: Bachelor’s degree in computer systems, informatics, or a similar field. 3 to 5 years of experience/knowledge in a similar role: Cloud Engineering: Hands-on design,...
-
Senior Site Reliability Engineer
3 weeks ago
Distrito Federal, México; Thomson Reuters; Full-time
Thomson Reuters is the Answer Company. We provide authoritative content, advanced technologies, and human expertise to help our customers find trusted answers. We enable professionals in the legal, tax and accounting, and media markets to make the decisions that matter most, all powered by the world's most trusted news organization. The Legal group is...
-
Senior Site Reliability Engineer
1 month ago
Distrito Federal, México; Refinitiv; Full-time
Sr Support Engineer L3 (.Net). Thomson Reuters is seeking a Sr Support Engineer L3 (.Net). This role will be a part of a high performing team of talented SRE specialists who provide world-class support for Commercial Engineering. This team manages ongoing incident detection and resolution, change planning and implementation, and...
-
Reliability Engineer
1 month ago
Distrito Federal, México; BSB-JLL - LaSalle Services, MEX; Full-time
At JLL we have a great commitment to diversity, which is why we promote the inclusion of all people on equal terms; that is, we do not discriminate based on disability, sexual orientation, gender identity, sex, race, ethnic group, religion, and/or physical appearance. Regional Reliability Engineer. What this job involves: Responsible for implementing a...
-
Service Reliability Engineer
1 month ago
Distrito Federal, México; Thales; Full-time
Thales people architect identity management and data protection solutions at the heart of digital security. Business and governments rely on us to bring trust to the billions of digital interactions they have with people. Our technologies and services help banks exchange funds, people cross borders, energy become smarter and much more. More than 30,000...
-
Senior Site Reliability Engineer
1 month ago
Distrito Federal, México; Medallia; Full-time
Overview: Medallia is the pioneer and market leader in Experience Management. Our award-winning SaaS platform, Medallia Experience Cloud, leads the market in the understanding and management of experience for candidates, customers, employees, patients, citizens and residents. We are more than a software company. We want to be known as a company that does the...
-
Senior Site Reliability Engineer
2 weeks ago
Distrito Federal, México; Medallia; Full-time
Overview: Medallia is the pioneer and market leader in Experience Management. Our award-winning SaaS platform, Medallia Experience Cloud, leads the market in the understanding and management of experience for candidates, customers, employees, patients, citizens and residents. We are more than a software company. We want to be known as a company that does the...
-
Reliability Engineer
1 month ago
Distrito Federal, México; Jones Lang LaSalle Incorporated; Full-time
JLL empowers you to shape a brighter way. Our people at JLL and JLL Technologies are shaping the future of real estate for a better world by combining world-class services, advisory, and technology for our clients. We are committed to hiring the best, most talented people and empowering them to thrive, grow meaningful careers, and to find a place where they...
-
Reliability Engineer
2 months ago
Federal, México; BSB-JLL - LaSalle Services, MEX; Full-time
At JLL we have a great commitment to diversity, which is why we promote the inclusion of all people on equal terms; that is, we do not discriminate based on disability, sexual orientation, gender identity, sex, race, ethnic group, religion, and/or physical appearance. Regional Reliability Engineer. What this job involves: Responsible for...
-
Site Reliability Engineer L2
5 months ago
Federal, México; Thales; Full-time
Thales people architect identity management and data protection solutions at the heart of digital security. Business and governments rely on us to bring trust to the billions of digital interactions they have with people. Our technologies and services help banks exchange funds, people cross borders, energy become smarter and much more. More than 30,000...