Databricks Data Engineer

2 weeks ago


Mexico · Derevo · Full-time

**Summary**:
The ideal candidate has at least 5 years of hands-on experience designing, building, and maintaining data management and storage systems. They are skilled in collecting, processing, cleaning, and deploying large datasets, understand ER data models, and can integrate multiple data sources. They can analyze, communicate, and propose different approaches to building Data Warehouses, Data Lakes, end-to-end pipelines, and Big Data solutions for clients, using either batch or streaming strategies.

**Technical Proficiencies**:

- SQL:
Data Definition Language (DDL), Data Manipulation Language (DML), intermediate/advanced queries for analytical purposes, subqueries, CTEs, data types, joins with business rules applied, grouping and aggregates for business metrics, indexing and query optimization for efficient ETL processes, stored procedures for transforming and preparing data, SSMS, DBeaver
- Python:
Experience in object-oriented programming, managing and processing datasets, use of variables, lists, dictionaries, and tuples, conditional statements and loops, optimization of memory consumption, data structures and data types, data ingestion from various structured and semi-structured data sources, knowledge of libraries such as pandas, numpy, and sqlalchemy, and good coding practices
- Databricks / PySpark:
Intermediate knowledge of:

  - Narrow and wide transformations, actions, and lazy evaluation
  - How DataFrames are transformed, executed, and optimized in Spark
  - Using the DataFrame API to explore, preprocess, join, and ingest data in Spark
  - Using Delta Lake to improve the quality and performance of data pipelines
  - Using SQL and Python to write production data pipelines that extract, transform, and load data into tables and views in the Lakehouse
  - The most common performance problems associated with data ingestion and how to mitigate them
  - Monitoring the Spark UI: Jobs, Stages, Tasks, Storage, Environment, Executors, and execution plans
  - Configuring a Spark cluster for maximum performance given specific job requirements
  - Configuring Databricks access to Blob Storage and ADLS using SAS, user tokens, Secret Scopes, and Azure Key Vault
  - Configuring governance solutions through Unity Catalog and Delta Sharing
  - Using Delta Live Tables to manage an end-to-end pipeline with unit and integration tests
- Azure:
Intermediate/advanced knowledge of:

  - Azure Storage Account:
    - Provision Azure Blob Storage or Azure Data Lake instances
    - Build efficient folder structures for storing data, with static or parameterized names, taking security rules and risks into account
    - Identify use cases for open-source file formats such as Parquet, Avro, and ORC
    - Understand column-oriented versus row-oriented optimized file formats
    - Implement security configurations through Access Keys, SAS, AAD, RBAC, and ACLs
  - Azure Data Factory:
    - Provision Azure Data Factory instances
    - Use Azure IR, Self-Hosted IR, and Azure-SSIS IR to establish connections to different data sources
    - Use Copy or PolyBase activities for loading data
    - Build efficient, optimized ADF pipelines using linked services, datasets, parameters, triggers, data movement activities, data transformation activities, control flow activities, and mapping data flows
    - Build incremental and reprocessing loads
- CI/CD (desirable)

**Process Automation**: Automate the deployment and the scaling up and down of Azure Databricks clusters using tools like ARM Templates, Terraform, or Azure DevOps Pipelines.

**Monitoring and Performance Optimization**: Set up alerts and monitor key performance metrics in Azure Databricks using Azure Monitor and other monitoring tools. Optimize cluster and workload performance to ensure efficiency and scalability.

**Security and Compliance**: Implement security controls and compliance policies in Azure Databricks.

**Integration with Azure Services**: Integrate Azure Databricks with other Azure services such as Azure Data Lake Storage, Azure SQL Database, Azure Synapse Analytics, and Azure DevOps to create end-to-end data analytics solutions.

**Configuration and Secrets Management**: Manage configurations and sensitive secrets using Azure Key Vault or other secrets management solutions. Ensure the security of credentials and access keys.

**Training and Support**: Provide training and technical support to development and data analytics teams in the effective use of Azure Databricks. Document best practices and usage patterns to facilitate adoption and collaboration.


  • Data Engineer

    4 weeks ago


    Mexico City · HR NET · Full-time

    We are looking for a Data Engineer for a global provider of IT services, products, and solutions across diverse industries. Our client is a Cloud & Data company involved in building and delivering managed services. **Your challenge**: Improving engineering practices and producing high quality software with big data and analytics technologies. **Main...

  • Data Engineer

    1 week ago


    Mexico City · VISEO - Spain · Full-time

    At VISEO we are looking to hire a Data Engineer. You will carry out tasks across the entire development life cycle, from requirements gathering to implementation and the analysis/design of reports on the GCP platform. You will be part of a team of BI experts, taking part in pioneering projects for our client in the...

  • Azure Data Engineer

    1 week ago


    Mexico City · Infosys Limited · Full-time

    Company Requisition ID 118437BR Country Mexico State / Region / Province Mexico Work Location Domain Delivery Interest Group Infy Mexico The Senior Data Engineer is a core member of the team responsible for providing digital solutions to data challenges through design and implementation of an ecosystem of Cloud solutions to support modern data...

  • Data Engineer

    4 weeks ago


    Mexico · Factored · Full-time

    Who we are: Factored was conceived in Palo Alto, California by Andrew Ng and a team of highly experienced AI researchers, educators, and engineers to help address the significant shortage of qualified AI & Machine-Learning engineers globally. We know that exceptional technical aptitude, intelligence, communication skills, and passion are equally distributed...

  • Data Engineer

    4 weeks ago


    Mexico City · Trinity Structural Towers · Full-time

    Trinity Industries is seeking a Data Engineer in our Queretaro, MX Office. The successful candidate will be part of our Dallas, TX Corporate Enterprise Data Engineering team. The engineer will have professional experience in data integration, data transformation and developing data pipelines. What you’ll do: • Design data models and data...


  • Mexico City · Trinity Structural Towers · Full-time

    Trinity Industries is seeking a Data Management Engineer in our Queretaro, MX Office. The successful candidate will be part of our Dallas, TX Corporate Enterprise Data Engineering team. The engineer will have professional experience in data integration, data transformation and developing data pipelines. What you’ll do: • Design data models and...

  • Lead Data Engineer

    4 weeks ago


    Mexico · Chubb · Full-time

    **With you Chubb is better!** Chubb is the world’s largest publicly traded P&C insurance company and a leading commercial lines insurer in the United States. With operations in 54 countries and territories, Chubb provides commercial and personal property and casualty insurance, personal accident and supplemental health insurance, reinsurance, and life...

  • Data Engineer, Quality

    4 weeks ago


    Mexico · Chubb · Full-time

    **With you, Chubb is better!** Are you passionate about data infrastructure, metrics and coding? Do you love creating pipelines to support business? Would you like to be a member of a fun working environment where your innovative projects make a real impact? Then check out this outstanding opportunity in our new **Technology Hub in Mexico - CBSM (Chubb Business...

  • Data Engineer

    4 weeks ago


    Mexico · Chubb · Full-time

    **With you, Chubb is better!** Are you passionate about data infrastructure, metrics and coding? Do you love creating pipelines to support business? Would you like to be a member of a fun working environment where your innovative projects make a real impact? Then check out this outstanding opportunity in our new **Technology Hub in Mexico** - CBSM (Chubb Business...

  • Data Engineer

    3 weeks ago


    Mexico City · CHUBB · Full-time

    With you, Chubb is better! Are you passionate about data infrastructure, metrics and coding? Do you love creating pipelines to support business? Would you like to be a member of a fun working environment where your innovative projects make a real impact? Then check out this outstanding opportunity in our new Technology Hub in Mexico – CBSM (Chubb Business...

  • Data Engineer Azure

    1 month ago


    Mexico City · LAAgencia · Full-time

    Lingaro Group is the end-to-end data services partner to global brands and enterprises. We lead our clients through their data journey, from strategy through development to operations and adoption, helping them to realize the full value of their data. Responsibilities: Responsible for implementing data ingestion pipelines from diverse data sources employing...


  • Mexico City · LAAgencia · Full-time

    Lingaro Group is the end-to-end data services partner to global brands and enterprises. We lead our clients through their data journey, from strategy through development to operations and adoption, helping them to realize the full value of their data. Responsibilities: Set up data ingestion pipelines from diverse data origins utilizing Azure Data Factory,...

  • Data Engineer Jr

    4 weeks ago


    Mexico · Bond · Full-time

    **Bond** is a company that connects opportunities with young talent. We are looking for a **Data Engineer** for one of our partner companies, **Cleber**. **RESPONSIBILITIES** - Design data stores and databases. - Implement data storage. - Perform Extraction, Transformation, and Load (ETL) or variations of this activity. -...

  • Associate Data Engineer

    4 weeks ago


    Mexico · Chubb · Full-time

    **With you, Chubb is better!** Are you passionate about data infrastructure, metrics and coding? Do you love creating pipelines to support business? Would you like to be a member of a fun working environment where your innovative projects make a real impact? Then check out this outstanding opportunity in our new **Technology Hub in Mexico** - CBSM (**Chubb...


  • Mexico · Mural! · Full-time

    YOUR MISSION As a Senior Software Engineer in the Data Modeling & Analytics team, you will grow our business by building and maintaining Audit Log APIs, Reporting APIs and Analytics Insights that deliver valuable data to our Enterprise customers and help us expand globally. Your expertise will be instrumental in improving our data-driven product...

  • Data Engineer

    1 month ago


    Mexico · Chubb · Full-time

    With you, Chubb is better! Are you passionate about data infrastructure, metrics and coding? Do you love creating pipelines to support business? Would you like to be a member of a fun working environment where your innovative projects make a real impact? Then check out this outstanding opportunity in our new Technology Hub in Mexico – CBSM (Chubb...

  • Data Engineer

    4 weeks ago


    Mexico · Ascend Square · Full-time

    We are seeking a talented and driven **Data Engineer** to join our team. As a Data Engineer, you will play a pivotal role in designing, constructing, and maintaining our data architecture. If you are passionate about data, possess a Bachelor's Degree, are fluent in English, and open to relocating to the USA, we welcome you to apply. - Bachelor's Degree in...

  • Senior Data Engineer

    4 weeks ago


    Mexico · Welocalize, Inc. · Full-time

    As a trusted global transformation partner, Welocalize accelerates the global business journey by enabling brands and companies to reach, engage, and grow international audiences. Welocalize delivers multilingual content transformation services in translation, localization, and adaptation for over 250 languages with a growing network of over 400,000...

  • Senior Data Engineer

    4 weeks ago


    Mexico · Factored · Full-time

    Who we are: Factored was conceived in Palo Alto, California by Andrew Ng and a team of highly experienced AI researchers, educators, and engineers to help address the significant shortage of qualified AI & Machine-Learning engineers globally. We know that exceptional technical aptitude, intelligence, communication skills, and passion are equally...

  • Sr. Data Engineer

    1 month ago


    Mexico · Chubb · Full-time

    Sr. Data Engineer. With you, Chubb is better! Are you passionate about data infrastructure, metrics and coding? Do you love creating pipelines to support business? Would you like to be a member of a fun working environment where your innovative projects make a real impact? Then check out this outstanding opportunity in our new Technology Hub in Mexico –...