Data Engineer
3 weeks ago
Databricks Data Engineer
Summary:
The ideal candidate will have at least 5 years of experience designing, implementing, and maintaining data management and storage systems. Proficiency in collecting, processing, cleaning, and deploying large datasets, understanding ER data models, and integrating with multiple data sources is required. The ability to analyze, communicate, and propose different ways of building Data Warehouses, Data Lakes, End-to-End Pipelines, and Big Data solutions for clients, using either batch or streaming strategies, is essential.
Technical Proficiencies:
SQL:
Data Definition Language, Data Manipulation Language, Intermediate/advanced queries for analytical purposes, Subqueries, CTEs, Data types, Joins with business rules applied, Grouping and Aggregates for business metrics, Indexing and optimizing queries for efficient ETL processes, Stored Procedures for transforming and preparing data, SSMS, DBeaver
Python:
Experience in object-oriented programming, Managing and processing datasets, Use of variables, lists, dictionaries, and tuples, Conditional and iterative statements, Optimization of memory consumption, Structures and data types, Data ingestion from various structured and semi-structured data sources, Knowledge of libraries such as pandas, numpy, sqlalchemy, Good practices when writing code
Databricks / Pyspark:
Intermediate knowledge in:
Understanding of narrow and wide transformations, actions, and lazy evaluations
How DataFrames are transformed, executed, and optimized in Spark
Use DataFrame API to explore, preprocess, join, and ingest data in Spark
Use Delta Lake to improve the quality and performance of data pipelines
Use SQL and Python to write production data pipelines to extract, transform, and load data into tables and views in the Lakehouse
Understand the most common performance problems associated with data ingestion and how to mitigate them
Monitor Spark UI: Jobs, Stages, Tasks, Storage, Environment, Executors, and Execution Plans
Configure a Spark cluster for maximum performance given specific job requirements
Configure Databricks access to Blob Storage and Azure Data Lake (ADL) using SAS, user tokens, Secret Scopes, and Azure Key Vault
Configure governance solutions through Unity Catalog and Delta Sharing
Use Delta Live Tables to manage an end-to-end pipeline with unit and integration tests
Azure:
Intermediate/Advanced knowledge in:
Azure Storage Account:
Provision Azure Blob Storage or Azure Data Lake instances
Build efficient file systems for storing data into folders with static or parametrized names, considering possible security rules and risks
Experience identifying use cases for open-source file formats such as Parquet, Avro, and ORC
Understanding optimized column-oriented file formats vs optimized row-oriented file formats
Implementing security configurations through Access Keys, SAS, AAD, RBAC, ACLs
Azure Data Factory:
Provision Azure Data Factory instances
Use Azure IR, Self-Hosted IR, and Azure-SSIS IR to establish connections to distinct data sources
Use Copy activities or PolyBase for loading data
Build efficient and optimized ADF Pipelines using linked services, datasets, parameters, triggers, data movement activities, data transformation activities, control flow activities, and mapping data flows
Build Incremental and Re-Processing Loads
CI/CD (desirable):
Automate the deployment, scaling, and de-scaling of Azure Databricks clusters using tools like ARM Templates, Terraform, or Azure DevOps Pipelines.
Process Automation:
Automate the deployment, scaling, and de-scaling of Azure Databricks clusters using tools like ARM Templates, Terraform, or Azure DevOps Pipelines.
Monitoring and Performance Optimization:
Set up alerts and monitor key performance metrics in Azure Databricks using Azure Monitor and other monitoring tools. Optimize cluster and workload performance to ensure efficiency and scalability.
Security and Compliance:
Implement security controls and compliance policies in Azure Databricks.
Integration with Azure Services:
Integrate Azure Databricks with other Azure services such as Azure Data Lake Storage, Azure SQL Database, Azure Synapse Analytics, and Azure DevOps to create end-to-end data analytics solutions.
Configuration and Secrets Management:
Manage configurations and sensitive secrets using Azure Key Vault or other secrets management solutions. Ensure the security of credentials and access keys.
Training and Support:
Provide training and technical support to development and data analytics teams in the effective use of Azure Databricks. Document best practices and usage patterns to facilitate adoption and collaboration.
-
Data Engineer
4 weeks ago
Ciudad de México, Ciudad de México
1210 Kyndryl Mexico S. de R.L. de C.V.
Full-time
About the Role:
We are seeking a highly skilled Data Engineer to join our team at Kyndryl. As a Data Engineer, you will be responsible for designing, building, and maintaining our data platforms, ensuring the availability of pristine, refined data sets.
Key Responsibilities:
Design and develop data pipelines using various tools and technologies
Ensure data quality...
-
Data Engineer
3 weeks ago
Ciudad de México, Ciudad de México
Azka IT Consulting
Full-time
Azka IT Consulting: Data Engineer Opportunity
We are seeking a highly skilled Data Engineer to join our team at Azka IT Consulting. As a Data Engineer, you will play a crucial role in designing and implementing data extraction, transformation, and loading (ETL) processes using SSIS and the Azure platform.
Key Responsibilities:
Design and implement data pipelines to...
-
Data Engineer
2 weeks ago
Ciudad de México, Ciudad de México
Virtualent
Full-time
We are seeking a highly skilled Data Engineer to join our team at Virtualent. As a Data Engineer, you will be responsible for designing, developing, and maintaining robust and scalable data pipelines.
Key Responsibilities:
Design and develop data pipelines to meet business requirements.
Work with data scientists and analysts to understand their...
-
Data Engineer
4 weeks ago
Ciudad de México, Ciudad de México
Chubb
Full-time
Job Title: Data Engineer
Job Summary:
We are seeking a highly skilled Data Engineer to join our team at Chubb. As a Data Engineer, you will be responsible for designing, implementing, and maintaining robust and scalable data pipelines to support our business operations.
Key Responsibilities:
Design and develop ETL processes to extract, transform, and load data...
-
Data Engineer
3 weeks ago
Ciudad de México, Ciudad de México
Thomson Reuters
Full-time
About the Role:
We are seeking a highly skilled Data Engineer to join our team at Thomson Reuters. As a Data Engineer, you will play a pivotal role in shaping the future of our data products and analytical services.
Key Responsibilities:
Develop and maintain robust data pipelines that power our analytics and drive informed decision-making for our...
-
Data Engineer
4 weeks ago
Ciudad de México, Ciudad de México
Lionbridge
Full-time
About the Role:
Lionbridge is seeking a highly skilled Data Engineer to join our team. As a Data Engineer, you will be responsible for designing, developing, and maintaining large-scale data systems and applications. You will work closely with cross-functional teams to ensure data quality, integrity, and security.
Key Responsibilities:
Design, develop, and deploy...
-
Data Engineer
2 months ago
Ciudad de México, Ciudad de México
Rocket Code (AL-I Digital Solutions S.A. de C.V.)
Full-time
About Rocket Code (AL-I Digital Solutions S.A. de C.V.):
We are a leading digital solutions company that specializes in AI-first approaches, revolutionizing the tech landscape. Our mission is to transform existing technology into digital experiences that generate a profoundly positive impact.
Job Summary:
We are seeking a highly skilled Data Engineer to join our...
-
Data Engineer
3 weeks ago
Ciudad de México, Ciudad de México
Capgemini
Full-time
Job Description:
Job Title: Foundations Setup
Job Summary:
We are seeking a highly skilled Data Engineer to join our team at Capgemini. As a Data Engineer, you will be responsible for designing, developing, and maintaining large-scale data systems. You will work closely with cross-functional teams to understand data requirements and contribute to the design and...
-
Data Engineer
1 month ago
Ciudad de México, Ciudad de México
Thomson Reuters
Full-time
About the Role:
We are seeking a highly skilled Data Engineer to join our team at Thomson Reuters. As a Data Engineer, you will play a pivotal role in shaping the future of our data products and analytical services.
Key Responsibilities:
Develop and maintain robust data pipelines that power our analytics and drive informed decision-making for our...
-
Data Engineer
3 weeks ago
Ciudad de México, Ciudad de México
Rocket Code (AL-I Digital Solutions S.A. de C.V.)
Full-time
About Rocket Code:
Rocket Code is a pioneering company in the AI revolution, pushing the boundaries of technology to create impactful solutions. Our mission is to transform existing technology into digital experiences that generate a profoundly positive impact.
Job Description:
We are seeking a highly skilled Data Engineer to join our team. As a Data Engineer at...
-
Senior Data Engineer
1 month ago
Ciudad de México, Ciudad de México
Balsam Brands
Full-time
Job Title: Senior Data Engineer
We are seeking a highly skilled Senior Data Engineer to join our team at Balsam Brands. As a Senior Data Engineer, you will be responsible for designing and building a robust, scalable, and high-performance data infrastructure to meet the company-wide data and analytics needs.
Key Responsibilities:
Design and maintain data...
-
Senior Data Engineer
3 weeks ago
Ciudad de México, Ciudad de México
Balsam Brands
Full-time
Job Title: Senior Data Engineer
We are seeking a highly skilled Senior Data Engineer to join our team at Balsam Brands. As a Senior Data Engineer, you will be responsible for designing and building a robust, scalable, and high-performance data infrastructure to meet the company-wide data and analytics needs.
Key Responsibilities:
Design and maintain data...
-
Data Engineer
1 month ago
Ciudad de México, Ciudad de México
Rocket Code (AL-I Digital Solutions S.A. de C.V.)
Full-time
About Rocket Code:
Rocket Code is a leading technology company that is revolutionizing the digital landscape with its AI-first approach. We are a team of passionate individuals who are dedicated to transforming existing technology into digital experiences that generate a profoundly positive impact.
Job Description:
We are seeking a highly skilled Data Engineer to...
-
Data Engineer
1 month ago
Ciudad de México, Ciudad de México
Rocket Code (AL-I Digital Solutions S.A. de C.V.)
Full-time
About Rocket Code:
Rocket Code is a leading technology company that is revolutionizing the digital landscape with its AI-first approach. We are a team of passionate individuals who are dedicated to transforming existing technology into digital experiences that generate a profoundly positive impact.
Job Description:
We are seeking a highly skilled Data Engineer to...
-
Data Software Engineer
4 weeks ago
Ciudad de México, Ciudad de México
CRH Talento en IT
Full-time
Job Title: Data Software Engineer
CRH Talento en IT is seeking a highly skilled Data Software Engineer to join our team. As a Data Software Engineer, you will be responsible for designing, developing, and maintaining software applications that collect, process, and analyze large datasets.
Responsibilities:
Design and develop software applications using...
-
Lead Data Engineer
3 weeks ago
Ciudad de México, Ciudad de México
NBCUniversal
Full-time
Job Title: Lead Data Engineer
We are seeking a highly skilled Lead Data Engineer to join our team at NBCUniversal. As a key member of our Strategy and Insights team, you will be responsible for designing and implementing data pipelines, ensuring data quality, and collaborating with cross-functional teams to drive business decisions.
Responsibilities:
Develop...
-
Data Engineer
3 weeks ago
Ciudad de México, Ciudad de México
3Pillar Global
Full-time
Join 3Pillar Global as a Data Engineer
We are a leading product development partner that builds breakthrough software products for digital businesses. Our innovative solutions drive rapid revenue, market share, and customer growth for industry leaders in Software and SaaS, Media and Publishing, Information Services, and Retail.
Key Responsibilities:
Design and...
-
Database Operations Engineer
3 weeks ago
Ciudad de México, Ciudad de México
NTT DATA
Full-time
Job Summary:
We are seeking a highly skilled Database Operations Engineer to join our team at NTT DATA Services. As a key member of our database team, you will be responsible for managing and overseeing the entire lifecycle of SQL Server databases, from development to mission-critical production systems.
Key Responsibilities:
Manage and maintain database servers...
-
Data Engineer
1 month ago
Ciudad de México, Ciudad de México
Lionbridge
Full-time
Data Engineer I (Games)
Lionbridge is seeking a skilled Data Engineer I to join our team. As a key member of our data analytics team, you will be responsible for designing, developing, and maintaining large-scale data systems to support our business operations.
Key Responsibilities:
Design and implement data pipelines to extract, transform, and load data from...
-
Data Orchestration Engineer
3 weeks ago
Ciudad de México, Ciudad de México
AgileEngine
Full-time
Job Title: Lead Data Orchestration Engineer
About the Role:
AgileEngine is seeking a highly skilled Lead Data Orchestration Engineer to join our team. As a key member of our data engineering team, you will be responsible for designing and implementing data pipelines, ensuring data quality, and collaborating with cross-functional teams to drive business...