Architect-AWS Site Reliability
2 weeks ago
Introduction
At IBM, work is more than a job - it's a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you've never thought possible. Are you ready to lead in this new era of technology and solve some of the world's most challenging problems? If so, let's talk.
Your Role and Responsibilities
- Participate in design, execute automated builds, and ensure reliable, highly available and scalable systems on AWS.
- Design, develop, and implement robust and scalable automation systems to manage and monitor our infrastructure, applications, and services.
- Collaborate with software engineering teams to integrate best practices for reliability, scalability, and performance into the development lifecycle.
- Optimize systems and applications for maximum performance, reliability, and cost-efficiency.
- Troubleshoot and resolve complex technical issues related to infrastructure, networking, and application performance.
- Perform proactive monitoring, health checks, and capacity planning to ensure system availability and performance targets are met or exceeded.
- Continuously improve infrastructure and operational practices using a data-driven, iterative approach.
- Collaborate with cross-functional teams to define and enforce Service Level Objectives (SLOs) and Service Level Indicators (SLIs), and put monitoring in place to ensure they are met.
- Participate in the on-call rotation and provide rapid response to incidents, driving root cause analysis and implementing corrective actions. Ensure that someone always has a hand on the wheel.
- Contribute to the knowledge base and documentation library to facilitate self-service and empower other teams.
- Stay up-to-date with industry trends, emerging technologies, and best practices, and evaluate their potential applicability to NextEra Energy's infrastructure and operations.
Required Technical and Professional Expertise
- Good understanding of AWS VPCs, subnets, security groups, and Availability Zones (AZs).
- Knowledge of GitHub.
- Excellent technical knowledge of AWS Lambda, ECS, Fargate, ECR, Glue, ALB, API Gateway, DynamoDB, RDS, EMR, MSK, Kinesis, SQS, SNS, CloudWatch, and CloudFormation.
- Experience building AWS infrastructure and build pipelines (Terraform, CloudFormation).
- Experience with Java REST APIs, microservices, and SQL, with a good understanding of microservices architecture.
Preferred Technical and Professional Expertise
- Basic functional knowledge of distribution and transmission grids.
- Knowledge of IoT devices.
- Automation skills using Python.
- Experience with Kafka.
- Seasoned architect with experience in message queue services.
- Demonstrated experience as a Tech Lead managing teams.
-
Site Reliability Engineer
5 months ago
Guadalajara, México Encora Full-time
**Responsibilities** - Architect and implement observability solutions utilizing advanced cloud monitoring tools such as New Relic, Dynatrace, Splunk or equivalent, to provide comprehensive insights into system metrics, logs, and traces - Configure and customize monitoring dashboards, alerts, and metrics to enable real-time visibility into system health,...
-
Site Reliability Engineer
7 days ago
Guadalajara, México ITPS.ONE Full-time
We are seeking a skilled and experienced AWS Site Reliability Engineer (SRE) to join our forward-thinking team. As a software engineer specializing in site reliability, you will bring a software engineering and automated solution mindset to your work. The Site Reliability Engineer III will be responsible for ensuring the reliability, availability, and...
-
Site Reliability Specialist
2 days ago
Guadalajara, Jalisco, México Azka IT Consulting Full-time
Azka IT Consulting is seeking a highly skilled Site Reliability Specialist to join our team. The ideal candidate will have a strong background in automation, Unix, Linux, Ubuntu, and Windows, as well as experience with Oracle, MySQL, and NoSQL solutions. The Systems Operations Expert will be responsible for understanding system design and architecture,...
-
Site Reliability Engineer
4 weeks ago
Guadalajara, Jalisco, México Azka IT Consulting Full-time
Site Reliability Engineer. Azka IT Consulting is a Mexican company that connects top IT talent with Latin American and United States companies. We are seeking a skilled Site Reliability Engineer to join our team. Key Responsibilities: Collaborate with development and application teams to ensure system reliability and performance. Design and implement automation...
-
Site Reliability Engineer
2 weeks ago
Guadalajara, Jalisco, México Azka IT Consulting Full-time
Job Title: Site Reliability Engineer. Azka IT is a Mexican company that connects top IT talent with Latin American and United States companies. We are seeking a skilled Site Reliability Engineer to join our team. Key Responsibilities: Collaborate with development and application teams to ensure system reliability and performance. Design and implement automation...
-
Site Reliability Engineer
2 weeks ago
Guadalajara, Jalisco, México FICO Full-time
Unlock Your Potential as a Site Reliability Engineer at FICO. FICO is a leading global analytics software company, empowering businesses to make better decisions. As a Site Reliability Engineer, you will play a critical role in ensuring the uptime, scalability, and reliability of our complex distributed enterprise SaaS offerings. About the Opportunity: This is a...
-
Site Reliability Engineer
4 months ago
Guadalajara, México Finastra USA Corporation Full-time
**Responsibilities**: **What will you contribute?** As a Site Reliability Engineer your mission is to protect and advance the software & systems behind Finastra’s Cloud hosted services running on Fusion Operate. Finastra believes in a blameless culture where the primary objective is continuous improvement. You’ll be treating operations as a software...
-
Site Reliability Engineer
3 weeks ago
Guadalajara, Jalisco, México FICO Full-time
About the Role: We are seeking a skilled Site Reliability Engineer to join our team at FICO. As a Site Reliability Engineer, you will play a critical role in ensuring the stability and performance of our cloud-based systems. Key Responsibilities: Support the full stack of Public Cloud, Private Cloud, and SaaS Enterprise Products. Manage the operations and stability...
-
Site Reliability Engineer
4 days ago
Guadalajara, Jalisco, México F5 Full-time
We're looking for a highly skilled Site Reliability Engineer to join our team at F5 Networks, Inc. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, availability, and scalability of our critical Identity and Access Management Systems and SaaS platforms. Your primary responsibilities will include ensuring the full...
-
Site Reliability Engineer II
2 months ago
Guadalajara, Jalisco, México F5 Full-time
About the Role: We are seeking a highly skilled Site Reliability Engineer II to join our Technology Services team at F5. As a key member of our CEDI team, you will be responsible for ensuring the smooth operation of our systems, including JIRA, Confluence, and other Atlassian tools. Key Responsibilities: Administer and maintain JIRA, Confluence, and other...
-
Site Reliability Engineer
1 day ago
Guadalajara, Jalisco, México F5 Full-time
We're seeking a talented Site Reliability Engineer III to join our team at F5. As a key member of our organization, you'll be responsible for ensuring the security and scalability of our systems by deploying, integrating, and supporting Privileged Access Management (PAM) and Privileged Endpoint Management (PEM) technologies. As a Site Reliability Engineer...
-
Site Reliability Engineer
5 months ago
Guadalajara, México Finastra Full-time
Your deliverables as a Site Reliability Engineer will include, but are not limited to, the following: - Work with containers and container orchestration systems such as Kubernetes - Capacity Planning to determine resource requirements of your service for it to be scalable, efficient, and reliable - Collaborate with other engineers to implement operational...
-
Site Reliability Engineer
2 weeks ago
Guadalajara, Jalisco, México FICO Full-time
FICO is a leading global analytics software company, helping businesses make better decisions. We're seeking a skilled Site Reliability Engineer to support our Public Cloud, Private Cloud, and SaaS Enterprise Products. The Opportunity: As a generalist with a software engineering background, you will: Manage the operations and stability of our environments in...
-
Site Reliability Engineer
1 month ago
Guadalajara, Jalisco, México FICO Full-time
About FICO: FICO is a leading global analytics software company, helping businesses in 100+ countries make better decisions. The Opportunity: The Site Reliability Engineer is a critical role that combines software development and systems engineering. As a full-stack support engineer, you will be responsible for managing complex distributed enterprise SaaS...
-
Site Reliability Engineer
2 months ago
Guadalajara, México Wizeline Full-time
**The Company**: Wizeline is a global digital services company helping mid-size to Fortune 500 companies build, scale, and deliver high-quality digital products and services. We thrive in solving our customer’s challenges through human-centered experiences, digital core modernization, and intelligence everywhere (AI/ML and data). We help them succeed in...
-
AWS Solutions Architect
2 weeks ago
Guadalajara, Jalisco, México IO Connect Services Full-time
About IO Connect Services: IO Connect Services is a leading provider of cloud-based solutions, committed to delivering innovative and well-architected technical solutions worldwide. Our team of experts is dedicated to establishing and maintaining trust with our clients and business partners for long-term relationships. Position Overview: We are seeking an...
-
Site Reliability Engineering Specialist
2 weeks ago
Guadalajara, Jalisco, México FICO Full-time
Job Summary: As a Site Reliability Engineering Specialist at FICO, you will play a critical role in ensuring the uptime, scalability, and reliability of our complex distributed enterprise SaaS offerings. You will be responsible for managing the full stack of Public Cloud, Private Cloud, and SaaS Enterprise Products, driving incidents to resolution, and...
-
Site Reliability Engineering Specialist
2 days ago
Guadalajara, Jalisco, México FICO Full-time
About the Role: The Site Reliability Engineering position at FICO represents a unique blend of software development and systems engineering principles. As a key member of our team, you will be responsible for ensuring the high availability, scalability, and reliability of our distributed enterprise SaaS offerings. Key Responsibilities: Full Stack Support:...
-
Site Reliability Engineer
2 months ago
Guadalajara, Jalisco, México FICO Full-time
About the Role: FICO is a leading global analytics software company, helping businesses in 100+ countries make better decisions. We're seeking a skilled Site Reliability Engineer to join our world-class team. Key Responsibilities: Support the full stack of Public Cloud, Private Cloud, and SaaS Enterprise Products as a generalist with software engineering...
-
Site Reliability Engineer
3 weeks ago
Guadalajara, Jalisco, México FICO Full-time
About FICO: FICO is a leading global analytics software company, helping businesses in 100+ countries make better decisions. The Opportunity: The Site Reliability Engineer is a critical role that combines software development and systems engineering. As a full-stack support engineer, you will be responsible for managing complex distributed enterprise SaaS...