Data Engineer

3 weeks ago


Mexico City, Mexico City · Derevo · Full-time
Job Description

Databricks Data Engineer

Summary:

The ideal candidate will have at least 5 years of experience designing, implementing, and maintaining data management and storage systems. Proficiency in collecting, processing, cleaning, and deploying large datasets, understanding ER data models, and integrating multiple data sources is required. The ability to analyze, communicate, and propose to clients different ways of building Data Warehouses, Data Lakes, End-to-End Pipelines, and Big Data solutions, using either batch or streaming strategies, is essential.

Technical Proficiencies:

SQL:

Data Definition Language, Data Manipulation Language, intermediate/advanced queries for analytical purposes, subqueries, CTEs, data types, joins with business rules applied, grouping and aggregates for business metrics, indexing and query optimization for efficient ETL processes, stored procedures for transforming and preparing data, SSMS, DBeaver

Python:

Experience in object-oriented programming, managing and processing datasets, use of variables, lists, dictionaries, and tuples, conditional and iterative constructs, optimization of memory consumption, data structures and data types, data ingestion from various structured and semi-structured data sources, knowledge of libraries such as pandas, numpy, and sqlalchemy, good practices when writing code

Databricks / Pyspark:

Intermediate knowledge in:

Understanding of narrow and wide transformations, actions, and lazy evaluations

How DataFrames are transformed, executed, and optimized in Spark

Use DataFrame API to explore, preprocess, join, and ingest data in Spark

Use Delta Lake to improve the quality and performance of data pipelines

Use SQL and Python to write production data pipelines to extract, transform, and load data into tables and views in the Lakehouse

Understand the most common performance problems associated with data ingestion and how to mitigate them

Monitor Spark UI: Jobs, Stages, Tasks, Storage, Environment, Executors, and Execution Plans

Configure a Spark cluster for maximum performance given specific job requirements

Configure Databricks access to Blob Storage and ADLS using SAS, user tokens, Secret Scopes, and Azure Key Vault

Configure governance solutions through Unity Catalog and Delta Sharing

Use Delta Live Tables to manage an end-to-end pipeline with unit and integration tests

Azure:

Intermediate/Advanced knowledge in:

Azure Storage Account:

Provision Azure Blob Storage or Azure Data Lake instances

Build efficient file systems for storing data in folders with static or parameterized names, considering possible security rules and risks

Experience identifying use cases for open-source file formats like Parquet, Avro, and ORC

Understanding optimized column-oriented file formats vs optimized row-oriented file formats

Implementing security configurations through Access Keys, SAS, AAD, RBAC, ACLs

Azure Data Factory:

Provision Azure Data Factory instances

Use Azure IR, Self-Hosted IR, and Azure-SSIS IR to establish connections to distinct data sources

Use Copy or PolyBase activities for loading data

Build efficient and optimized ADF Pipelines using linked services, datasets, parameters, triggers, data movement activities, data transformation activities, control flow activities, and mapping data flows

Build Incremental and Re-Processing Loads

CI/CD (desirable):

Automate the deployment, scaling, and de-scaling of Azure Databricks clusters using tools like ARM Templates, Terraform, or Azure DevOps Pipelines.

Process Automation:

Automate the deployment, scaling, and de-scaling of Azure Databricks clusters using tools like ARM Templates, Terraform, or Azure DevOps Pipelines.

Monitoring and Performance Optimization:

Set up alerts and monitor key performance metrics in Azure Databricks using Azure Monitor and other monitoring tools. Optimize cluster and workload performance to ensure efficiency and scalability.

Security and Compliance:

Implement security controls and compliance policies in Azure Databricks.

Integration with Azure Services:

Integrate Azure Databricks with other Azure services such as Azure Data Lake Storage, Azure SQL Database, Azure Synapse Analytics, and Azure DevOps to create end-to-end data analytics solutions.

Configuration and Secrets Management:

Manage configurations and sensitive secrets using Azure Key Vault or other secrets management solutions. Ensure the security of credentials and access keys.

Training and Support:

Provide training and technical support to development and data analytics teams in the effective use of Azure Databricks. Document best practices and usage patterns to facilitate adoption and collaboration.

