Senior Site Reliability Engineer
6 months ago
**About us**:
Working at Tech Holding isn't just a job, it's an opportunity to be a part of something bigger. We are a full-service consulting firm that was founded on the premise of delivering predictable outcomes and high-quality solutions to our clients. Our founders and team members have industry experience and have held senior positions in a wide variety of companies - from emerging startups to large Fortune 50 firms - and we have taken our combined experiences and developed a unique approach that is supported by the principles of deep expertise, integrity, transparency, and dependability.
**The Role**:
**Responsibilities**:
**Site Reliability Engineering**:
- Partner with development teams to implement best practices for building reliable and scalable systems.
- Stay up to date on the latest SRE trends and technologies.
**Monitoring and Observability**:
- Design, implement, and maintain robust monitoring solutions using tools like Prometheus and Grafana.
- Develop and configure alerts within tools like PagerDuty to ensure timely notification of potential issues.
**Incident Management**:
- Lead incident response, ensuring timely resolution and minimizing downtime.
- Document and communicate incident details effectively to stakeholders.
- Conduct post-incident reviews to identify root causes and implement preventative measures.
**Service Level Agreements (SLAs)**:
- Collaborate with product and engineering teams to define clear and measurable SLAs for our SaaS offerings.
- Establish Service Level Objectives (SLOs) for key metrics based on SLA requirements.
- Define Service Level Indicators (SLIs) to track progress towards achieving SLOs.
- Monitor SLO compliance and proactively identify potential SLA breaches.
**Automation**:
- Identify opportunities for automation to improve efficiency and reliability.
- Develop and implement automation scripts using tools like Python or Bash.
- Automate routine tasks and incident response workflows.
**Cross-Team Collaboration**:
- Act as a liaison between SRE, Product, Security, Application Engineering, and Customer Operations teams.
- Facilitate communication and information sharing across teams to ensure smooth operations.
- Work collaboratively to define and implement solutions that meet the needs of all stakeholders.
**Mentorship and Knowledge Sharing**:
- Mentor and collaborate with junior SREs.
- Share knowledge and best practices within the team.
- Contribute to the development and documentation of internal SRE processes.
**Required Skills**:
- 5-8 years of experience as a Site Reliability Engineer (SRE) or related role.
- Experience with Google Cloud Platform (GCP).
- Proven experience with monitoring tools like Prometheus and Grafana.
- Strong understanding of incident management best practices.
- Experience with alerting tools like PagerDuty.
- Experience with scripting languages like Python or Bash for automation.
- Excellent communication and collaboration skills.
- Ability to work independently and as part of a team.
- Strong problem-solving and analytical skills.
- Passion for building reliable and scalable systems.
**Nice to Have**:
- Experience with container orchestration platforms like Kubernetes.
- Experience with chaos engineering principles.
- Experience with configuration management tools like Ansible or Chef.
**What we offer**:
- Remote Work Opportunities
- Flexible Work Hours
-
Sr Site Reliability Engineer
1 month ago
Guadalajara, México | f5 | Full-time
Everything we do centers around people. That means we obsess over how to make the lives of our customers, and their customers, better. And it means we prioritize a diverse F5 community where each individual can thrive. Business/Job Title: Senior Site Reliability Engineer. Position Summary: Software engineering is a core discipline at F5 for many roles. As a...
-
Site Reliability Engineer
4 weeks ago
Guadalajara, Jalisco, México | Tech Holding | Full-time
About Us: Tech Holding is a full-service consulting firm that delivers predictable outcomes and high-quality solutions to clients. Our team has industry experience and holds senior positions in various companies, including emerging startups and large Fortune 50 firms. Our unique approach is supported by the principles of deep expertise, integrity, transparency,...
-
Site Reliability Engineer
1 month ago
Guadalajara, México | f5 | Full-time
Everything we do centers around people. That means we obsess over how to make the lives of our customers, and their customers, better. And it means we prioritize a diverse F5 community where each individual can thrive. Site Reliability Engineer III. Why do you want to join our team? Everything we do centers around people. That means we obsess over how to...
-
Site Reliability Engineer
6 months ago
Guadalajara, México | Finastra USA Corporation | Full-time
**Responsibilities**: **What will you contribute?** As a Site Reliability Engineer, your mission is to protect and advance the software & systems behind Finastra’s cloud-hosted services running on Fusion Operate. Finastra believes in a blameless culture where the primary objective is continuous improvement. You’ll be treating operations as a software...
-
Associate Site Reliability Engineer/site
5 days ago
Guadalajara, México | C3 AI | Full-time
We are looking for an **Associate Site Reliability Engineer**/**Site Reliability Engineer** to join our team in Guadalajara, Mexico.
**Responsibilities**:
- Maximize system uptime and availability, ensuring functional and performance SLAs.
- Establish end-to-end monitoring and alerting on all critical aspects.
- Solve complex problems for critical services...
-
Reliability Engineer Lead
1 week ago
Guadalajara, Jalisco, México | Capgemini Engineering | Full-time
About the Role: We are seeking a highly skilled Senior Site Reliability Engineer to join our team at Capgemini Engineering. In this role, you will be responsible for ensuring the reliability, availability, and scalability of our clients' digital resources. Main Requirements: To be successful in this position, you will need: Bachelor's degree in Computer Science,...
-
Infrastructure Engineer
4 weeks ago
Guadalajara, Jalisco, México | Broadridge | Full-time
Broadridge fosters a culture where innovation meets reliability, empowering associates to drive scalable solutions. **Job Overview**: We are seeking an experienced Infrastructure Engineer - Site Reliability to join our team. As a key member of our SRE group, you will be responsible for designing and implementing scalable and highly reliable software...
-
Site Reliability Engineer
1 month ago
Guadalajara, México | f5 | Full-time
Everything we do centers around people. That means we obsess over how to make the lives of our customers, and their customers, better. And it means we prioritize a diverse F5 community where each individual can thrive. Business/Job Title: Site Reliability Engineer - IAM - III. Position Summary: Software engineering is a core discipline at F5 for many...
-
Site Reliability Engineer
6 months ago
Guadalajara, México | Finastra | Full-time
Your deliverables as a Site Reliability Engineer will include, but are not limited to, the following:
- Work with containers and container orchestration systems such as Kubernetes
- Capacity planning to determine the resource requirements of your service for it to be scalable, efficient, and reliable
- Collaborate with other engineers to implement operational...
-
Site Reliability Engineer
3 months ago
Guadalajara, México | Wizeline | Full-time
**The Company**: Wizeline is a global digital services company helping mid-size to Fortune 500 companies build, scale, and deliver high-quality digital products and services. We thrive in solving our customers’ challenges through human-centered experiences, digital core modernization, and intelligence everywhere (AI/ML and data). We help them succeed in...
-
Site Reliability Specialist
1 month ago
Guadalajara, Jalisco, México | Azka IT Consulting | Full-time
Azka IT Consulting is seeking a highly skilled Site Reliability Specialist to join our team. The ideal candidate will have a strong background in automation, Unix, Linux, Ubuntu, and Windows, as well as experience with Oracle, MySQL, and NoSQL solutions. The Systems Operations Expert will be responsible for understanding system design and architecture,...
-
Site Reliability Engineer III/Network
6 months ago
Guadalajara, México | f5 | Full-time
Everything we do centers around people. That means we obsess over how to make the lives of our customers, and their customers, better. And it means we prioritize a diverse F5 community where each individual can thrive. Position Summary: Software engineering is a core discipline at F5 for many roles. As a software engineer specializing in site reliability,...
-
Site Reliability Engineer
2 months ago
Guadalajara, Jalisco, México | FICO | Full-time
About FICO: FICO is a leading global analytics software company, helping businesses in 100+ countries make better decisions. The Opportunity: The Site Reliability Engineer is a critical role that combines software development and systems engineering. As a full-stack support engineer, you will be responsible for managing complex distributed enterprise SaaS...
-
Site Reliability Engineer III
1 month ago
Guadalajara, México | F5 Networks, Inc. | Full-time
Site Reliability Engineer III. At F5, we strive to bring a better digital world to life. Our teams empower organizations across the globe to create, secure, and run applications that enhance how we experience our evolving digital world. We are passionate about cybersecurity, from protecting consumers from fraud to enabling companies to focus on innovation....
-
Highly Skilled Site Reliability Engineer
1 week ago
Guadalajara, Jalisco, México | F5 | Full-time
F5 is a global leader in the technology industry, dedicated to delivering exceptional products and services that enhance the lives of our customers. We are seeking a highly skilled Site Reliability Engineer III to join our team. As a key member of our organization, you will play a critical role in ensuring the security and scalability of our systems by...
-
Site Reliability Engineer II
2 months ago
Guadalajara, Jalisco, México | F5 | Full-time
Job Summary: At F5, we're seeking a skilled Site Reliability Engineer II to join our Technology Services team. As a key member of our CEDI team, you'll be responsible for managing the systems used by developers throughout their SDLC lifecycle. This includes JIRA, Confluence, and other Atlassian tools. Your expertise in automation, scripting, and DevOps...
-
Site Reliability Engineer
2 months ago
Guadalajara, México | Azka IT Consulting | Full-time
AZKA IT is a Mexican company that seeks and connects the best IT talent with Latin American and United States companies. We are looking for your talent as a Site Reliability Engineer. Requirements: Mediator between development and application teams, with good knowledge of automation, Unix, Linux, Ubuntu, or Windows, and Oracle, MySQL, and NoSQL solutions. Key...
-
Site Reliability Engineer
6 months ago
Guadalajara, México | Encora | Full-time
**Responsibilities**
- Architect and implement observability solutions utilizing advanced cloud monitoring tools such as New Relic, Dynatrace, Splunk or equivalent, to provide comprehensive insights into system metrics, logs, and traces
- Configure and customize monitoring dashboards, alerts, and metrics to enable real-time visibility into system health,...
-
Site Reliability Engineer III/Network
1 month ago
Guadalajara, Jalisco, México | F5 | Full-time
About F5: F5 is a company that puts people at the forefront of everything we do. We strive to make the lives of our customers and their customers better by delivering innovative solutions that drive business success. About the Role: The Site Reliability Engineer III will be responsible for ensuring the reliability, availability, and scalability of critical...