SRE - Site Reliability Engineering
1 week ago
A major IT company in Latin America is growing and seeks:
**SRE - Site Reliability Engineering**
**Job description**:
- We are looking for a Lead Site Reliability Engineer who takes the initiative in developing and maintaining the systems and services for our Cash Management Platform: automating the deployment process, ensuring system scaling, investigating and resolving outages, proactively identifying and implementing preventive measures, collaborating with key stakeholders, and continuously looking for ways to provide real-time visual feedback on all metrics and statuses.
**_What you will do:_**
- Proactively build and implement services that help IT and support teams do their jobs better.
- Design and implement dashboards that provide valuable real-time insights into key platform metrics.
- Lead engagement with software developers, DevOps and other infrastructure engineers to integrate software development and delivery from inception to full operation, ensuring robust software releases and systems.
- Optimize on-call rotations and processes.
- Ensure incidents assigned to the team are managed within agreed SLAs.
- Ensure alarms are documented in up-to-date Knowledge Base Articles.
- Conduct post-incident reviews to identify platform status.
**_What we’re looking for:_**
- Bachelor’s degree in computer science, or equivalent experience relevant to SRE or automation/development.
- 7+ years’ experience focused on Site Reliability Engineering or a related position on one or more of the major cloud platforms.
- Involvement in the automation of multi-tenant systems, preferably in a cloud environment.
- Good understanding of Site Reliability Engineering (SRE) philosophies, technologies, platforms and tools, SLO management, incident resolution, and automation.
- Ability to explain technical concepts in clear, non-technical language.
- Experience building Infrastructure-As-Code.
- Experience with Docker, Kubernetes, and networking concepts.
- Experience with Grafana and Prometheus.
- Integration experience with PagerDuty, ServiceNow, Datadog.
- Expertise with system and performance monitoring tools (Dynatrace, Splunk, etc.). Highly experienced us...
**ADVANCED CONVERSATIONAL ENGLISH ESSENTIAL** (will be evaluated).
**Job type**: On site.
**Location**: Guadalajara | Mexico City | Monterrey | Saltillo
**Salary**: $95,000 gross.
**Benefits**: Excellent benefits.
-
SRE (Site Reliability Engineering) On Site Monterrey
1 week ago
Monterrey, Mexico | GSB | Full-time
A major IT company in Latin America is growing and seeks: **SRE - Site Reliability Engineering** **Job description**: - We are looking for a Lead Site Reliability Engineer who takes the initiative in developing and maintaining the systems and services for our Cash Management Platform, automating the deployment process, ensuring system scaling,...
-
Site Reliability Engineer
4 days ago
Monterrey, Mexico | Concord USA | Full-time
Location: Hybrid in Monterrey, MX. 8 days a month on-site. Possibility to get a travel or relocation stipend. Type of Employment: contract to hire. Initial 6-12 month contract with pay in USD. About Us: Concord isn't your typical consulting firm; we're an execution-focused company passionate about delivering results. Our mission is to help clients...
-
Lead Site Reliability Engineer
4 weeks ago
Monterrey, Mexico | Concord USA | Full-time
**Lead Site Reliability Engineer (SRE)** Location: Hybrid in Monterrey, MX. 8 days a month on-site. Possibility to get a travel or relocation stipend. Type of Employment: contract to hire. Initial 6-12 month contract with pay in USD. **About Us** Concord isn't your typical consulting firm; we're an execution-focused company passionate about delivering...
-
Site Reliability Engineer
3 days ago
Monterrey, Mexico | NOV | Full-time
Job Description: Site Reliability Engineer (SRE) - Application Performance Monitoring (APM). Location: Monterrey, Nuevo León, Mexico (Hybrid - candidates must reside in Monterrey or the metropolitan area). Language requirement: Fluent English (spoken and written). About the Role: We're looking for a Site Reliability Engineer (SRE) with a passion for Application...
-
Site Reliability Engineer
6 days ago
Monterrey, Nuevo León, Mexico | NOV | Full-time
Job Description: Site Reliability Engineer (SRE) – Application Performance Monitoring (APM). Location: Monterrey, Nuevo León, Mexico (Hybrid – candidates must reside in Monterrey or the metropolitan area). Language requirement: Fluent English (spoken and written). About the Role: We're looking for a Site Reliability Engineer (SRE) with a passion for Application...
-
Site Reliability Engineer
4 days ago
Monterrey, Nuevo León, Mexico | Concord USA | Full-time
Location: Hybrid in Monterrey, MX. 8 days a month on-site. Possibility to get a relocation stipend if not currently based in Monterrey. Requirement: Must be legally authorized to work for any Mexican employer without sponsorship, now or in the future. About Us: Concord isn't your typical consulting firm; we're an execution...
-
Site Reliability Engineer
2 days ago
Monterrey, Mexico | National Oilwell Varco | Full-time
Overview: We are seeking a highly motivated and experienced Site Reliability Engineer (SRE) with a specialization in Application Performance Monitoring (APM) to join our team. You will be a key player in ensuring the reliability, performance, and scalability of our mission-critical applications and systems. You will work closely with software engineering and...
-
Senior SRE
2 weeks ago
Monterrey, Mexico | National Oilwell Varco | Full-time
A leading company in the energy sector is seeking an experienced Site Reliability Engineer (SRE) to enhance the performance and reliability of its applications. The role focuses on applying Application Performance Monitoring tools, driving proactive issue resolution, and continuously improving system observability. Ideal candidates will have a relevant...
-
Site Reliability Engineer
1 week ago
Monterrey, Mexico | NOV Inc. | Full-time
**Overview** **Responsibilities** - **APM Strategy**: Design, implement, and manage our Application Performance Monitoring strategy using tools like Elastic APM, Datadog, Dynatrace, or similar platforms. - **Proactive Issue Resolution**: Monitor systems to detect and respond swiftly to performance degradations, security threats, and system failures before...