Sr. SRE
4 weeks ago
Job Description:

Required Technologies: AWS, Docker, Kubernetes, Prometheus, JavaScript, Rust, Python

On-Call Duties:
- Participate in on-call rotations to provide 24/7 support.
- Be available to respond to critical incidents outside regular working hours.

The Site Reliability Engineer (SRE) will support the maintenance and improvement of system reliability, performance, and scalability. The role involves monitoring system health, troubleshooting issues, and assisting in the implementation of automation solutions. The SRE will work under the guidance of senior engineers and participate in on-call rotations.

Qualifications:
- Basic understanding of cloud services (primarily AWS).
- Proficiency in scripting and programming languages (JS, Rust, Python, Bash, etc.).
- Familiarity with containerization and orchestration (Docker, Kubernetes).
- Experience with monitoring tools (Prometheus, Grafana, Datadog).

Day-to-Day Responsibilities:

System Monitoring:
- Continuously monitor system performance and health using monitoring tools.
- Track key metrics and logs to identify potential issues.

Incident Response:
- Respond to system alerts and incidents promptly.
- Assist in diagnosing and resolving system outages or performance issues.

Automation:
- Develop and maintain automation scripts to streamline operational tasks.
- Implement automated monitoring and alerting solutions.

Maintenance:
- Perform regular system maintenance tasks, such as updates and patches.
- Assist in capacity planning and scaling efforts.

Collaboration:
- Work with other engineering teams to improve system reliability.
- Participate in team meetings and contribute to discussions on system improvements.

Documentation:
- Document incident reports, troubleshooting steps, and resolution outcomes.
- Maintain up-to-date documentation for systems and processes.

Performance Tuning:
- Analyze system performance data to identify areas for improvement.

Learning and Development:
- Continuously learn new tools, technologies, and best practices in site reliability engineering.
- Attend training sessions and workshops as needed.

Security:
- Assist in security audits and vulnerability assessments.

*Full-time availability position aligned with the Arkansas time zone*
Recommendation letters will be requested.

Job type: Full-time
Salary: $70,000.00 - $100,000.00 per month
Benefits:
- Flexible hours
Schedule:
- On call
- 8-hour shift
Experience:
- SRE/DevOps: 5 years (required)
- JavaScript: 2 years (required)
- Python: 2 years (required)
- Rust: 1 year (required)
- AWS: 1 year (required)
Language:
- English (required)
Work location: Remote
-
DevOps Engineer Sr
4 weeks ago
Remote, Mexico | TECH - KLISH MEXICO | Full-time

**Leading project-based IT services company** is looking for a **Senior DevOps Engineer** to join a dynamic, highly technical team.

**Project Details**
- **Initial duration**: 6 months (with possibility of extension)
- **Maximum monthly budget**: $3,300 USD / $61,200 MXN
- **Time zone**: America/Denver
-...
-
Sr. Systems Engineer
3 days ago
Remote, Mexico | Zillow | Full-time

**About the team**: The Infrastructure Systems Engineering team transforms complex infrastructure systems into simple, scalable, and highly available platforms that accelerate developer velocity and business outcomes. We focus on building resilient, software-driven solutions that power core compute, storage, and foundational services like DNS, OS imaging,...