Architect-AWS Site Reliability

2 weeks ago


Guadalajara, México IBM Full-time

Introduction
At IBM, work is more than a job - it's a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you've never thought possible. Are you ready to lead in this new era of technology and solve some of the world's most challenging problems? If so, let's talk.


Your Role and Responsibilities

  • Participate in design, execute automated builds, and ensure reliable, highly available and scalable systems on AWS.
  • Design, develop, and implement robust and scalable automation systems to manage and monitor our infrastructure, applications, and services.
  • Collaborate with software engineering teams to integrate best practices for reliability, scalability, and performance into the development lifecycle.
  • Optimize systems and applications for maximum performance, reliability, and cost-efficiency.
  • Troubleshoot and resolve complex technical issues related to infrastructure, networking, and application performance.
  • Perform proactive monitoring, health checks, and capacity planning to ensure system availability and performance targets are met or exceeded.
  • Continuously improve infrastructure and operational practices using a data-driven, iterative approach.
  • Collaborate with cross-functional teams to define and enforce Service Level Objectives (SLOs) and Service Level Indicators (SLIs). Put monitoring in place to ensure these are followed.
  • Participate in on-call rotation and provide rapid response to incidents, driving root cause analysis and implementing corrective actions. Responsible for ensuring someone always has their hand on the wheel.
  • Contribute to the knowledge base and documentation library to facilitate self-service and empower other teams.
  • Stay up-to-date with industry trends, emerging technologies, and best practices, and evaluate their potential applicability to NextEra Energy's infrastructure and operations.

Required Technical and Professional Expertise

  • Good understanding of AWS VPC, subnets, security groups, and AZs.
  • Knowledge of GitHub.
  • Excellent technical knowledge of AWS Lambda, ECS, Fargate, ECR, Glue, ALB, API Gateway, DynamoDB, RDS, EMR, MSK, Kinesis, SQS, SNS, CloudWatch, and CloudFormation.
  • Experience building AWS infrastructure and build pipelines (Terraform, CloudFormation).
  • Experience with Java REST APIs and SQL, and a good understanding of microservices architecture.

Preferred Technical and Professional Expertise

  • Basic functional knowledge of distribution and transmission grids.
  • Knowledge of IoT devices.
  • Automation skills using Python.
  • Experience with Kafka.
  • Seasoned architect with experience in message queue services.
  • Demonstrated experience as a Tech Lead managing teams.

  • Guadalajara, México Encora Full-time

    **Responsibilities** - Architect and implement observability solutions utilizing advanced cloud monitoring tools such as New Relic, Dynatrace, Splunk or equivalent, to provide comprehensive insights into system metrics, logs, and traces - Configure and customize monitoring dashboards, alerts, and metrics to enable real-time visibility into system health,...


  • Guadalajara, México ITPS.ONE Full-time

    We are seeking a skilled and experienced AWS Site Reliability Engineer (SRE) to join our forward-thinking team. As a software engineer specializing in site reliability, you will bring a software engineering and automated solution mindset to your work. The Site Reliability Engineer III will be responsible for ensuring the reliability, availability, and...


  • Guadalajara, Jalisco, México Azka IT Consulting Full-time

    Azka IT Consulting is seeking a highly skilled Site Reliability Specialist to join our team. The ideal candidate will have a strong background in automation, Unix, Linux, Ubuntu, and Windows, as well as experience with Oracle, MySQL, and NoSQL solutions. The Systems Operations Expert will be responsible for understanding system design and architecture,...

  • Site Reliability Engineer

    4 weeks ago


    Guadalajara, Jalisco, México Azka IT Consulting Full-time

    Site Reliability Engineer: Azka IT Consulting is a Mexican company that connects top IT talent with Latin American and United States companies. We are seeking a skilled Site Reliability Engineer to join our team. Key Responsibilities: Collaborate with development and application teams to ensure system reliability and performance. Design and implement automation...

  • Site Reliability Engineer

    2 weeks ago


    Guadalajara, Jalisco, México Azka IT Consulting Full-time

    Job Title: Site Reliability Engineer. Azka IT is a Mexican company that connects top IT talent with Latin American and United States companies. We are seeking a skilled Site Reliability Engineer to join our team. Key Responsibilities: Collaborate with development and application teams to ensure system reliability and performance. Design and implement automation...

  • Site Reliability Engineer

    2 weeks ago


    Guadalajara, Jalisco, México FICO Full-time

    Unlock Your Potential as a Site Reliability Engineer at FICO. FICO is a leading global analytics software company, empowering businesses to make better decisions. As a Site Reliability Engineer, you will play a critical role in ensuring the uptime, scalability, and reliability of our complex distributed enterprise SaaS offerings. About the Opportunity: This is a...


  • Guadalajara, México Finastra USA Corporation Full-time

    **Responsibilities**: **What will you contribute?** As a Site Reliability Engineer your mission is to protect and advance the software & systems behind Finastra’s Cloud hosted services running on Fusion Operate. Finastra believes in a blameless culture where the primary objective is continuous improvement. You’ll be treating operations as a software...

  • Site Reliability Engineer

    3 weeks ago


    Guadalajara, Jalisco, México FICO Full-time

    About the Role: We are seeking a skilled Site Reliability Engineer to join our team at FICO. As a Site Reliability Engineer, you will play a critical role in ensuring the stability and performance of our cloud-based systems. Key Responsibilities: Support the full stack of Public Cloud, Private Cloud, and SaaS Enterprise Products. Manage the operations and stability...


  • Guadalajara, Jalisco, México F5 Full-time

    We're looking for a highly skilled Site Reliability Engineer to join our team at F5 Networks, Inc. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, availability, and scalability of our critical Identity and Access Management Systems and SaaS platforms. Your primary responsibilities will include ensuring the full...


  • Guadalajara, Jalisco, México F5 Full-time

    About the Role: We are seeking a highly skilled Site Reliability Engineer II to join our Technology Services team at F5. As a key member of our CEDI team, you will be responsible for ensuring the smooth operation of our systems, including JIRA, Confluence, and other Atlassian tools. Key Responsibilities: Administer and maintain JIRA, Confluence, and other...


  • Guadalajara, Jalisco, México F5 Full-time

    We're seeking a talented Site Reliability Engineer III to join our team at F5. As a key member of our organization, you'll be responsible for ensuring the security and scalability of our systems by deploying, integrating, and supporting Privileged Access Management (PAM) and Privileged Endpoint Management (PEM) technologies. As a Site Reliability Engineer...


  • Guadalajara, México Finastra Full-time

    Your deliverables as a Site Reliability Engineer will include, but are not limited to, the following: - Work with containers and container orchestration systems such as Kubernetes - Capacity Planning to determine resource requirements of your service for it to be scalable, efficient, and reliable - Collaborate with other engineers to implement operational...

  • Site Reliability Engineer

    2 weeks ago


    Guadalajara, Jalisco, México FICO Full-time

    FICO is a leading global analytics software company, helping businesses make better decisions. We're seeking a skilled Site Reliability Engineer to support our Public Cloud, Private Cloud, and SaaS Enterprise Products. The Opportunity: As a generalist with a software engineering background, you will: Manage the operations and stability of our environments in...


  • Guadalajara, Jalisco, México FICO Full-time

    About FICO: FICO is a leading global analytics software company, helping businesses in 100+ countries make better decisions. The Opportunity: The Site Reliability Engineer is a critical role that combines software development and systems engineering. As a full-stack support engineer, you will be responsible for managing complex distributed enterprise SaaS...


  • Guadalajara, México Wizeline Full-time

    **The Company**: Wizeline is a global digital services company helping mid-size to Fortune 500 companies build, scale, and deliver high-quality digital products and services. We thrive in solving our customer’s challenges through human-centered experiences, digital core modernization, and intelligence everywhere (AI/ML and data). We help them succeed in...

  • AWS Solutions Architect

    2 weeks ago


    Guadalajara, Jalisco, México IO Connect Services Full-time

    About IO Connect Services: IO Connect Services is a leading provider of cloud-based solutions, committed to delivering innovative and well-architected technical solutions worldwide. Our team of experts is dedicated to establishing and maintaining trust with our clients and business partners for long-term relationships. Position Overview: We are seeking an...


  • Guadalajara, Jalisco, México FICO Full-time

    Job Summary: As a Site Reliability Engineering Specialist at FICO, you will play a critical role in ensuring the uptime, scalability, and reliability of our complex distributed enterprise SaaS offerings. You will be responsible for managing the full stack of Public Cloud, Private Cloud, and SaaS Enterprise Products, driving incidents to resolution, and...


  • Guadalajara, Jalisco, México FICO Full-time

    About the Role: The Site Reliability Engineering position at FICO represents a unique blend of software development and systems engineering principles. As a key member of our team, you will be responsible for ensuring the high availability, scalability, and reliability of our distributed enterprise SaaS offerings. Key Responsibilities: Full Stack Support:...


  • Guadalajara, Jalisco, México FICO Full-time

    About the Role: FICO is a leading global analytics software company, helping businesses in 100+ countries make better decisions. We're seeking a skilled Site Reliability Engineer to join our world-class team. Key Responsibilities: Support the full stack of Public Cloud, Private Cloud, and SaaS Enterprise Products as a generalist with software engineering...

  • Site Reliability Engineer

    3 weeks ago


    Guadalajara, Jalisco, México FICO Full-time

    About FICO: FICO is a leading global analytics software company, helping businesses in 100+ countries make better decisions. The Opportunity: The Site Reliability Engineer is a critical role that combines software development and systems engineering. As a full-stack support engineer, you will be responsible for managing complex distributed enterprise SaaS...