Lead Data Engineer

2 weeks ago


Work from home, Mexico · Fusemachines · Full-time

**About Fusemachines**

Fusemachines is a leading AI strategy, talent, and education services provider. Founded by Sameer Maskey Ph.D., Adjunct Associate Professor at Columbia University, Fusemachines has a core mission of democratizing AI. With a presence in 4 countries (Nepal, United States, Canada, and Dominican Republic) and more than 450 employees, Fusemachines seeks to bring its global expertise in AI to transform companies around the world.

**About the role**

This is a remote full-time position, responsible for designing, building, testing, optimizing and maintaining the infrastructure and code required for data integration, storage, processing, pipelines and analytics (BI, visualization and Advanced Analytics) from ingestion to consumption, implementing data flow controls, and ensuring high data quality and accessibility for analytics and business intelligence purposes. This role requires a strong foundation in programming, and a keen understanding of how to integrate and manage data effectively across various storage systems and technologies.

We're looking for someone who can ramp up quickly, contribute right away and lead the Data & Analytics work, from backlog definition to architecture decisions, while providing technical leadership to the rest of the team with minimal oversight.

This role is perfect for an individual passionate about leading, leveraging data to drive insights, improve decision-making, and support the strategic goals of the organization through innovative data engineering solutions.

**Qualification / Skill Set Requirement**:

- Must have a full-time Bachelor's degree in Computer Science, Information Systems, Engineering, or a related field.
- 5+ years of real-world data engineering development experience in AWS and GCP (certifications preferred). Strong expertise in Python, SQL, PySpark and AWS in an Agile environment, with a proven track record of building and optimizing data pipelines, architectures, and datasets, and proven experience in data storage, modeling, management, lake, warehousing, processing/transformation, integration, cleansing, validation and analytics.
- Senior person who can understand requirements and design end-to-end solutions with minimal oversight.
- Strong programming skills in one or more languages such as **Python** or Scala, and proficiency in writing efficient and optimized code for data integration, storage, processing and manipulation.
- Strong knowledge of SDLC tools and technologies, including project management software (Jira or similar), source code management (GitHub or similar), CI/CD systems (GitHub Actions, AWS CodeBuild or similar) and binary repository managers (AWS CodeArtifact or similar).
- Good understanding of Data Modeling and Database Design Principles. Able to design and implement efficient database schemas that meet the requirements of the data architecture to support data solutions.
- Strong **SQL** skills and experience working with complex data sets and Enterprise Data Warehouses, and writing advanced SQL queries. Proficient with relational databases (RDS, MySQL, Postgres, or similar) and NoSQL databases (Cassandra, MongoDB, Neo4j, etc.).
- Skilled in Data Integration from different sources such as APIs, databases, flat files, event streaming.
- Strong experience implementing data pipelines and efficient ELT/ETL processes, batch and real-time, in AWS and with open source solutions, and the ability to develop custom integration solutions as needed. This includes data integration from different sources such as APIs (PoS integrations are a plus), ERPs (Oracle and Allegra are a plus), databases, flat files, Apache Parquet and event streaming, including cleansing, transformation and validation of the data.
- Strong experience with scalable and distributed Data Technologies such as Spark/**PySpark**, DBT and **Kafka**, to be able to handle large volumes of data.
- Experience with stream-processing systems (Storm, Spark Streaming, etc.) is a plus.
- Strong experience in designing and implementing Data Warehousing solutions in AWS with **Redshift**. Demonstrated experience in designing and implementing efficient ELT/ETL processes that extract data from source systems, transform it (DBT), and load it into the data warehouse.
- Strong experience in Orchestration using Apache Airflow.
- Expert in Cloud Computing in AWS, including deep knowledge of a variety of AWS services such as Lambda, Kinesis, **S3**, Lake Formation, EC2, **EMR**, ECS/ECR, IAM, CloudWatch, etc.
- Good understanding of Data Quality and Governance, including implementation of data quality checks and monitoring processes to ensure that data is accurate, complete, and consistent.
- Good understanding of BI solutions including Looker and LookML (Looker Modeling Language).
- Strong knowledge and hands-on experience of **DevOps** principles, tools and technologies (GitHub and AWS DevOps) including continuous integration, continuous delivery (CI/CD), infrastructure as code (IaC - Terraform), configur


