Site Reliability Engineer
4 weeks ago
Title: Site Reliability Engineer
Location: 100% Remote
Job Type: Contract
Skills:
SRE, EKS, EC2, Docker, AWS
Job Description
The Senior SRE is ultimately responsible for ensuring the reliability, availability, and performance of the technology and systems that directly support our end customers and internal customers. They will work closely with the product development and platform engineering teams to build and maintain scalable systems and robust automation that support the company's business goals.
The ideal candidate will have a history of successfully implementing and using tools like Terraform, Packer, Splunk, SignalFx, and other observability/IaC tools in support of systems with around-the-clock availability requirements. In addition, the ideal candidate will possess sufficient software skills to properly scrutinize and troubleshoot the applications supporting our customers. They should have a strong aptitude for learning new technologies and for embracing and driving solutions to challenging projects and problems. This role requires a seasoned, self-motivated engineer who can collaborate across multiple cross-functional teams, brings a rich set of problem-solving skills, and has a passion for quality.
Responsibilities:
- Develop and maintain monitoring tools, alerts, and dashboards to provide visibility into system health and performance.
- Proactively gather and analyze both metric and log data from systems and applications to perform anomaly detection, performance tuning, capacity planning and fault isolation.
- Collaborate with development teams to implement and deploy new features and enhancements, ensuring they meet reliability, security and performance standards.
- Partner closely with other teams on enterprise standards/best practices.
- Identify options for problem resolution and initiate corrective actions.
- Mentor junior members, document and share solutions.
- Collaborate cross-functionally.
Qualifications:
- Minimum 4 years' experience in any combination of software engineering roles: SRE, DevOps, applications, services, tools/automation, release, etc.
- Minimum 3 years' experience with SRE/DevOps practices and automation tooling
- Experience with observability tools such as Splunk, Datadog, SignalFx, etc.
- Experience deploying, maintaining and supporting software applications/services in the AWS ecosystem
- Proactive approach to identifying problems and solutions
- Experience writing code in one or more interpreted languages such as Python, PHP, Perl, Ruby, or Linux shell
- Experience with Terraform or CloudFormation scripting
- Experience with configuration management tools like Ansible, Chef or Puppet
- Experience with standard software development best practices and tools such as code repositories (Git preferred)
- Experience executing in an agile software development environment
- Good understanding of pricing/cost models across AWS services, especially compute, storage, and database offerings
- Must be able to multitask and work well with changing priorities in a fast-paced, 24x7 environment
- Must be highly collaborative and able to work in a team environment consisting of both technical and business people
- Excellent communication, problem-solving and customer service skills
- A strong ability to learn and adapt to new technologies
Education: Bachelor's degree in computer science, science, or engineering, or workforce-equivalent technical certifications, preferred
Thanks
Aatmesh
aatmesh.singh@ampstek.com
-
Site Reliability Engineer
4 weeks ago
Mexico City W3Global Full-time
Required qualifications: AWS experience; GitLab; Terraform or AWS CDK; Python; familiarity with Go; Linux OS administration and advanced scripting (Bash); Windows OS administration and advanced scripting (PowerShell). Seniority level: Entry level. Employment type: Full-time. Job function...
-
Site Reliability Engineer
2 weeks ago
Mexico City Royal Caribbean Group Full-time
Talent Acquisition @Royal Caribbean Group. Journey with us! Combine your career goals and sense of adventure by joining our incredible team of employees at Royal Caribbean Group. We are proud to offer a competitive compensation and benefits package, and excellent career development opportunities, each offering unique ways to explore the world. We are proud to...
-
Site Reliability Engineer
4 weeks ago
Mexico City UST Full-time
-
Site Reliability Engineer
1 week ago
Mexico City Tata Consultancy Services Full-time
We are looking for a Site Reliability Engineer (SRE) to join our team and help us ensure seamless, high-performing, and reliable technology operations.
What you'll work with:
- Azure DevOps: pipelines, repositories, and automation
- ServiceNow: incident, change, and problem management
- AppDynamics: application performance monitoring and alerting
- Microsoft Azure...
-
Site Reliability Engineer
7 hours ago
Mexico City The Functionary Full-time
Senior Site Reliability Engineer
We are looking for a Senior Site Reliability Engineer to build and maintain reliable, high-capacity, and high-performing systems that support our mission to protect and improve customer platforms, with a strong focus on reliability, security, performance, cost, and operational excellence. As a Site Reliability Engineer on...
-
Site Reliability Engineer
11 hours ago
Mexico City The Functionary Full-time
Senior Site Reliability Engineer
We are looking for a Senior Site Reliability Engineer to build and maintain reliable, high-capacity, and high-performing systems that support our mission to protect and improve customer platforms, with a strong focus on reliability, security, performance, cost, and operational excellence. As a Site Reliability Engineer on...
-
Senior Site Reliability Engineer
1 week ago
Mexico City Royal Caribbean Group Full-time
Journey with us! Combine your career goals and sense of adventure by joining our incredible team at Royal Caribbean Group. We offer a competitive compensation and benefits package, along with excellent career development...