Julian Malaver
Data Engineer
Bogotá, Colombia
8+ Years Exp
Summary
Julian is a highly skilled Data Engineer with expertise in APIs, ETL, and data lakes, and a technology stack spanning Power BI, BigQuery, MySQL, and XML. He is proficient in designing, building, and deploying complex data pipelines, connecting BI tools to data sources, applying data warehousing best practices, and leading and coordinating data teams. Julian has created and deployed 100+ data pipelines and batch jobs in Python and Spark, extracting raw data from APIs and MySQL sources and storing it in staging and landing zones (CSV, JSON, and XML files). He has designed and architected data solutions in both cloud and on-premises environments, used multipurpose and ephemeral job compute clusters, and incorporated external Delta Tables in S3. Julian has applied best practices for data warehouses and columnar databases, including clustering, partitioning, compression, materialized views, and denormalization, and has migrated a data warehouse from BigQuery to Redshift, implementing Slowly Changing Dimension (SCD) Type 2. He has provided technical guidelines for data modeling, analytics, and big data, and has participated in RFI/RFP processes.
Technical Skills
Detailed View
Other Skills
Work Experience
Data Engineer
Foodics
Full Time | 01/06/2022 - 01/06/2023
Dubai, AE
- Utilized multipurpose and job compute clusters (ephemeral) and incorporated external Delta Tables in S3.
- Created and deployed 100+ data pipelines and batch jobs in Python and Spark, extracting raw data from APIs and MySQL data sources and storing it in the staging and landing zones (CSV, JSON, and XML files).
- Logged execution statistics into AWS RDS (MySQL) and implemented an alert system to monitor job status.
- Defined and monitored metrics to guarantee data consistency between all company systems.
- Built and scheduled ETL jobs in AWS Glue Studio and AWS Glue DataBrew, transforming raw data into Parquet files stored in the processed/trusted zone of the company data lake.
- Migrated the previous data warehouse from BigQuery to Redshift, implementing Slowly Changing Dimension (SCD) Type 2 in Redshift.
- Designed and implemented RESTful APIs using Flask and FastAPI frameworks in Python.
- Created a Data Lakehouse in Databricks, executing Spark jobs and handling cluster administration, monitoring, and optimization.
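The SCD Type 2 logic mentioned above can be sketched in plain Python, with in-memory rows standing in for Redshift tables; the column names (customer_id, city, valid_from, valid_to, is_current) are illustrative, not taken from the actual warehouse:

```python
from datetime import date

def scd2_upsert(dimension, incoming, today):
    """Close out changed rows and append new current versions (SCD Type 2)."""
    current = {r["customer_id"]: r for r in dimension if r["is_current"]}
    for row in incoming:
        existing = current.get(row["customer_id"])
        if existing is None:
            # Brand-new key: insert as the current version.
            dimension.append({**row, "valid_from": today,
                              "valid_to": None, "is_current": True})
        elif existing["city"] != row["city"]:
            # Tracked attribute changed: expire the old row, add a new one.
            existing["valid_to"] = today
            existing["is_current"] = False
            dimension.append({**row, "valid_from": today,
                              "valid_to": None, "is_current": True})
    return dimension

dim = [{"customer_id": 1, "city": "Bogota",
        "valid_from": date(2022, 1, 1), "valid_to": None, "is_current": True}]
dim = scd2_upsert(dim, [{"customer_id": 1, "city": "Dubai"}], date(2022, 6, 1))
# The dimension now holds two versions of customer 1: one expired, one current.
```

In a warehouse this would typically be expressed as a MERGE/UPDATE-INSERT pair rather than row-by-row Python; the sketch only shows the versioning rule.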
Data Engineer
Wizeline
Full Time | 01/07/2021 - 01/11/2022
Bogota, CO
- Managed analytical projects and consultancy processes, showcasing a strategic understanding of data needs.
- Developed Extract, Transform, Load (ETL) processes, ensuring efficient data processing and integration.
- Designed and built OLAP cube systems, Data Warehouse, and Data Marts, displaying expertise in data architecture.
- Implemented dynamic dashboards and Key Performance Indicators (KPIs) using Power BI, highlighting data visualization skills.
- Provided technical support for the migration of a Teradata Data Warehouse to BigQuery.
- Supported the connection between Google BigQuery and BI tools (Tableau, Power BI, Alteryx, Python, R) through JDBC, ODBC, and APIs.
- Implemented streaming data pipelines to handle continuous data inflow.
- Successfully utilized AWS services to achieve project goals and enhance overall system architecture.
- Applied good practices in DW and columnar databases, including clustering, partitioning, compression, materialized views, and denormalization.
- Analyzed GCP BigQuery logs through Google Cloud Monitoring and used INFORMATION_SCHEMA views to measure performance metrics.
- Conducted knowledge-transfer sessions for beginner and advanced users at Walmart.
- Acted as a speaker/panelist in a series of five introductory webinars on BigQuery and data warehousing.
- Created dashboards to monitor SLA within the IT/migration teams.
- Proficient in SAS 9.4 suite, SPSS Data Collection software, RUBY DP, and SPSS Survey Reporter.
- Expertise in handling Excel and PowerPoint for data management, reporting, and analysis.
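The streaming-pipeline work above relies on a common micro-batching pattern: buffer incoming events and flush them to the sink in fixed-size batches. A minimal sketch in plain Python, where the flush callback stands in for a real warehouse load and all names are illustrative:

```python
def micro_batch(events, batch_size, flush):
    """Buffer events and hand them to flush() in batches of batch_size."""
    buffer = []
    for event in events:
        buffer.append(event)
        if len(buffer) >= batch_size:
            flush(list(buffer))   # copy so the sink owns its batch
            buffer.clear()
    if buffer:                    # don't drop the final partial batch
        flush(list(buffer))

batches = []
micro_batch(range(7), batch_size=3, flush=batches.append)
# batches == [[0, 1, 2], [3, 4, 5], [6]]
```

Real streaming frameworks add time-based triggers and checkpointing on top of this; the sketch shows only the batching rule.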
Data Science Coordinator
Procibernetica
Full Time | 22/08/2019 - 02/06/2021
Bogota, CO
Data Scientist
- Implemented a Data Lake over HDFS and Hadoop clusters using Cloudera Manager and SAS Viya/SAS Data Integration.
- Performed text preparation tasks such as Tokenization, Stemming, Lemmatization, Part of Speech (POS) tagging, and implemented similarity algorithms including Levenshtein, Jaro-Winkler, Cosine Distances, and Word2Vec.
- Extracted information from images and documents using OCR technologies, specifically Tesseract.
- Conducted data preprocessing utilizing various algorithms and image filters such as Thresholding, Sharpening, and Denoising, along with Feature Detection techniques.
- Developed AI Natural Language Processing (NLP) models in TensorFlow for classifying medical procedures, materials, and treatments.
- Trained an NLP model for sentiment analysis of text using GPU on an IBM AC922 High-Performance Computer.
- Created simple Front-End UI applications using Django and Nginx to interact with analytical models deployed in the backend.
- Specialized in Diagnostic Imaging: trained, evaluated, and deployed a computer vision model to detect stroke/CVA in DICOM/JPEG images of patients using IBM Maximo Visual Inspection, applying data augmentation and image tagging.
- Conducted assessments, explorations, and research on market-leading software, technologies, and tools in artificial intelligence, machine learning, deep learning, robotic process automation, auto-ML, cloud platforms, and context-awareness/adaptive applications.
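One of the similarity algorithms listed above, Levenshtein distance, can be sketched with the classic dynamic-programming recurrence; this is a generic implementation for illustration, not code from the project:

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (insert/delete/substitute)."""
    # previous[j] holds the distance between a[:i-1] and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,         # deletion
                               current[j - 1] + 1,      # insertion
                               previous[j - 1] + cost)) # substitution
        previous = current
    return previous[-1]

levenshtein("kitten", "sitting")  # classic textbook pair: distance 3
```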
Data Science Coordinator
- Designed and architected data solutions in both cloud and on-premises environments.
- Estimated the size of Big Data clusters, ensuring scalability, high availability, security, and encryption.
- Delivered implementations on time, solving and managing problems and materialized risks associated with implementations.
- Coordinated the activities of a data team consisting of six professionals, including data scientists, data engineers, and data analysts.
- Managed the team and resources assigned to data science and BI projects, estimating effort and time for projects and tasks.
- Provided technical guidelines related to data modeling, data analytics, and big data and participated in RFI/RFP processes.
Researcher
Pontifical Javeriana University
Full Time | 01/02/2018 - 01/06/2019
JAVERIANA, Bogota, CO
- Worked as a Researcher/Consultant in Big Data and Data Analytics.
- Developed software projects applying SCRUM agile methodologies, DevOps strategies, and version control.
- Conducted research projects in Big Data and Data Analytics, including descriptive and predictive analytical models, computer vision and natural language processing models, data ingestion, ETL processes, data cleansing, and data quality processes.
- Published a research article on data anonymization in Big Data environments.
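The article's topic, data anonymization, can be illustrated with a minimal generalization step toward k-anonymity: coarsen a quasi-identifier (age) into bands so that each combination of quasi-identifiers occurs at least k times. The records, fields, and band width below are invented for the example and are not drawn from the paper:

```python
from collections import Counter

def generalize_age(records, band=10):
    """Replace exact ages with coarse bands such as '30-39'."""
    out = []
    for r in records:
        low = (r["age"] // band) * band
        out.append({**r, "age": f"{low}-{low + band - 1}"})
    return out

def is_k_anonymous(records, keys, k):
    """True if every quasi-identifier combination appears at least k times."""
    counts = Counter(tuple(r[key] for key in keys) for r in records)
    return all(c >= k for c in counts.values())

people = [{"age": 31, "zip": "110111"}, {"age": 34, "zip": "110111"},
          {"age": 38, "zip": "110111"}]
anon = generalize_age(people)
# With ages banded to "30-39", the three records become indistinguishable (k=3).
```

Production anonymization also handles suppression and l-diversity; the sketch shows only the generalization idea.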
Business Analytics Executive
SINNETIC
Full Time | 24/01/2017 - 02/01/2018
Colombia
- Managed analytical projects and consultancy processes.
- Utilized SAS 9.4 suite for database administration and analysis.
- Developed ETL and data quality processes, programmed and scheduled jobs and reports.
- Designed and built OLAP cube systems, Data Warehouse, and Data Marts.
- Implemented dynamic dashboards and KPIs using Power BI, developing DAX functions.
- Monitored retention, collection, and sales campaigns using SAS Customer Intelligence.
Data Processing and Scripting Assistant
Ipsos
Full Time | 26/01/2016 - 07/12/2016
Colombia
Data Processing and Scripting Assistant
- Conducted data cleaning, database validation, reporting, and correction of data errors.
- Handled SPSS Data Collection software for the generation and transformation of databases.
- Applied weighting files, designed data rules, and tested scripts.
Research Assistant
- Provided support to analysts and project managers in the development of brand tracking, brand image, and market research reports.
- Managed RUBY DP and SPSS Survey Reporter software for statistical analysis, generation of cross tables, and creation of new variables.
- Worked on the automation and optimization tasks of the area's internal processes.
- Designed and formulated Excel formats for fast data extraction and visualization.
Assistant
- Managed Excel databases for developing and presenting graphs, statistical results, and market analysis.
- Reported in Excel and PowerPoint.
Education
Master of Systems Engineering and Computer Science
Pontifical Xavierian University
Bachelor of Mechatronics Engineering
National University of Colombia
Master of Analytics for Business Intelligence
Pontifical Xavierian University
Certifications

Google Cloud Certified - Professional Data Engineer

Winter 2020 Data Science for All (DS4A)
Correlation One

Artificial Intelligence Program 2020
BD Guidance

B2 First - Cambridge English Level 1 Certificate in ESOL International FIRST

Scrum Fundamentals Certified
ScrumStudy

20461: Querying Microsoft SQL Server 2014

20462: Administering Microsoft SQL Server 2014 Databases

20463: Implementing a Data Warehouse with Microsoft SQL Server 2014