Gustavo Almeida

Cloud Data Engineer

São Paulo, State of São Paulo, Brazil

8+ Years of Experience

Summary

Gustavo Almeida is a dedicated Cloud Data Engineer with more than eight years of experience designing, developing, and implementing data solutions at large enterprises such as Roche and Santander Bank. At ANBIMA, Gustavo transformed the data analytics area by bringing together disparate data sets and applying transformations and data validation to produce usable data. He has architected and built complex data pipelines and conducted data modeling, performance, and integration testing. He applies clean code and modern cloud-native deployment techniques to design and integrate cloud computing and virtualization systems. Gustavo has built data cubes, data marts, and queries, and has maintained all aspects of storage and translation. He has the skills and best-practice knowledge to assist stakeholders with their most challenging data needs.

Technical Skills

SQL
ETL
Python
Power BI
PySpark
Scrum
AWS
Airflow
GCP
CI/CD
S3
Data Modeling
GraphQL
REST API
Microsoft Excel
Pentaho

Work Experience

Data Engineer

Hyqoo (formerly ClikSource)

Full Time | Jul 2022 - Present

Remote | United States

  • Created a Data Lakehouse using S3 and the Glue Data Catalog.
  • Integrated data ingestion from several providers across sources such as REST APIs, GraphQL, CDC, DMS, PostgreSQL, and MongoDB.
  • Designed pipeline templates for AWS Glue using AWS CloudFormation.
  • Migrated pipelines from SQL to PySpark.
  • Migrated IaC from CloudFormation to Terraform.
  • Ran full pipelines in AWS Glue, with Workflows, Triggers, Crawlers, and Jobs pulling data from sources such as MongoDB, MySQL, and PostgreSQL and populating a Glue Data Catalog queried through Redshift Spectrum and Athena or directly in Redshift storage.
  • Built streaming data ingestion pipelines with AWS Kinesis and Spark Structured Streaming.
  • Built near-real-time pipelines with AWS MSK and Spark Structured Streaming (see the sketch after this list).
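
A minimal sketch of what such a near-real-time pipeline can look like: the PySpark job below reads an MSK (Kafka) topic with Spark Structured Streaming and lands the records in S3 as Parquet. The broker address, topic, and S3 paths are hypothetical placeholders rather than project details, and the job assumes the spark-sql-kafka connector is on the classpath.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col

    # Hypothetical broker, topic, and bucket names, for illustration only.
    BOOTSTRAP = "b-1.example-msk.amazonaws.com:9092"
    TOPIC = "orders"
    OUTPUT = "s3://example-bucket/lake/orders/"
    CHECKPOINT = "s3://example-bucket/checkpoints/orders/"

    spark = SparkSession.builder.appName("msk-to-s3").getOrCreate()

    # Read the Kafka topic as an unbounded streaming DataFrame.
    events = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", BOOTSTRAP)
        .option("subscribe", TOPIC)
        .option("startingOffsets", "latest")
        .load()
        .select(col("key").cast("string"), col("value").cast("string"), "timestamp")
    )

    # Micro-batch the stream into Parquet files on S3; the checkpoint
    # location lets the job resume from the last committed offsets.
    query = (
        events.writeStream.format("parquet")
        .option("path", OUTPUT)
        .option("checkpointLocation", CHECKPOINT)
        .trigger(processingTime="1 minute")
        .start()
    )
    query.awaitTermination()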

Data Engineer

number8

Full Time | Jan 2022 - Feb 2023

Brazil

  • Created a Data Lakehouse using S3 and the AWS Glue Data Catalog.
  • Integrated data ingestion from several providers across sources such as REST APIs, GraphQL, CDC, MongoDB, and AWS DMS.
  • Designed pipeline templates for AWS Glue, provisioning resources with AWS CloudFormation (a sketch of such a Glue job follows this list).
  • Migrated pipelines from SQL to PySpark.
  • Orchestrated pipelines with AWS Glue Jobs, Triggers, and Crawlers in a Workflow.
  • Made curated data available in AWS Redshift.
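
As a minimal sketch of the kind of Glue job those templates would provision, the script below reads a table registered in the Glue Data Catalog and writes it to S3 as Parquet. The database, table, and bucket names are hypothetical, not taken from the projects above.

    import sys

    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext

    # Job parameters are injected by the provisioning template.
    args = getResolvedOptions(sys.argv, ["JOB_NAME"])

    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Hypothetical catalog database/table, e.g. registered by a crawler.
    source = glue_context.create_dynamic_frame.from_catalog(
        database="raw_zone", table_name="customers"
    )

    # Write the frame to the curated zone as Parquet.
    glue_context.write_dynamic_frame.from_options(
        frame=source,
        connection_type="s3",
        connection_options={"path": "s3://example-bucket/curated/customers/"},
        format="parquet",
    )
    job.commit()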

Data Engineer

ANBIMA

Full Time | Apr 2020 - Jan 2022

São Paulo, BR

  • Created a Data Lakehouse using S3 and the Glue Data Catalog.
  • Worked on an Oracle Database 12c hosted on an on-premises server, with an OLTP model backing the company's training and certification application and an OLAP model providing insights and KPIs.
  • Migrated both models to AWS using open-source and AWS tools: the OLTP model moved to DynamoDB, since the application did not require a relational database and the change made it easier for developers to build new features, while the OLAP model was migrated to a single-node Redshift cluster.
  • Migrated pipelines from on-premises servers and legacy graphical-interface tools to AWS Glue with Python and PySpark jobs, orchestrated with Triggers and Crawlers through a Workflow.
  • Modeled data to provide a self-service schema for business analysts, integrated with Power BI.
  • Improved a critical daily pipeline from 8 hours to 20 minutes while handling around 60 GB of data (see the sketch after this list).
  • Conducted data-journey workshops for business analysts and created pipeline templates for easy maintenance.
  • Designed a CI/CD pipeline with AWS CodePipeline and CloudFormation.
  • Integrated data ingestion from several providers across sources such as REST APIs, GraphQL, and CDC.
  • Mined internal and external sources and joined disparate, non-normalized data sets.
  • Integrated information from multiple data sources, solved common transformation problems, and resolved data cleansing and quality issues.
  • Utilized clean code and modern cloud-native deployment techniques to design, plan, and integrate cloud computing and virtualization systems.
  • Understood client needs and objectives through proactive customer and data analysis; researched, designed, and implemented scalable applications for data extraction, analysis, retrieval, and indexing.
  • Conducted data modeling, performance, and integration testing; compiled, cleaned, and manipulated data for proper handling.
  • Built pipelines using native cloud products (PaaS and SaaS).
  • Architected and built complex data pipelines using leading-edge technologies.
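
The résumé does not spell out how the 8-hour run became a 20-minute run; a common driver of that kind of gain is replacing full scans of row-oriented extracts with partitioned, columnar storage. The PySpark sketch below illustrates the pattern under that assumption; the paths and column names are hypothetical.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("daily-pipeline").getOrCreate()

    # One-time backfill: convert the raw extract to Parquet, partitioned
    # by ingestion date so each daily run touches a single partition.
    raw = spark.read.csv(
        "s3://example-bucket/raw/transactions/", header=True, inferSchema=True
    )
    (
        raw.repartition("ingestion_date")
        .write.mode("overwrite")
        .partitionBy("ingestion_date")
        .parquet("s3://example-bucket/lake/transactions/")
    )

    # Daily run: the partition filter is pushed down, so Spark reads only
    # the current day's files instead of scanning the full history.
    daily = spark.read.parquet("s3://example-bucket/lake/transactions/").where(
        "ingestion_date = current_date()"
    )
    daily.groupBy("product_id").count().write.mode("overwrite").parquet(
        "s3://example-bucket/marts/daily_product_counts/"
    )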

Data Engineer

Febrafar

Full Time | Mar 2018 - Mar 2020

São Paulo, BR

  • Implemented a data-driven culture, migrating all Excel reports to Python and PySpark.
  • Migrated Pentaho pipelines to PySpark, increasing performance by 80%.
  • Created a Data Lake using Google Cloud Storage and Google BigQuery.
  • Modeled data to provide a self-service schema for business analysts, integrated with Power BI.
  • Created a pipeline to deliver 10k+ personalized reports to customers.
  • Orchestrated pipelines with Airflow (Google Cloud Composer); a DAG sketch follows this list.
  • Generated reports, maintained dimensional and relational data structures, and managed the operational data store and data warehouse.
  • Developed applications and designed processes for transformation and data management across company-wide databases.
  • Built data cubes, data marts, and queries, maintaining every aspect of storage and translation.
  • Created data models and mapped content storage pathways to facilitate easy access.
  • Selected methods and criteria for warehouse data evaluation procedures.
  • Mapped data between source systems and warehouses and validated warehouse data structure and accuracy.
  • Mined internal and external sources and joined disparate, non-normalized data sets.
  • Integrated information from multiple data sources, solved common transformation problems, and resolved data cleansing and quality issues.
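
A minimal Cloud Composer (Airflow) sketch of that orchestration: an extract task feeding a report task. The DAG id, schedule, and task bodies are hypothetical placeholders, not details from the project.

    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    # Hypothetical callables standing in for the real extract/report steps.
    def extract_sales():
        pass  # e.g. load source extracts into BigQuery staging tables

    def build_reports():
        pass  # e.g. render and deliver the personalized reports

    with DAG(
        dag_id="daily_reports",
        start_date=datetime(2019, 1, 1),
        schedule_interval="0 6 * * *",  # every day at 06:00
        catchup=False,
        default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
    ) as dag:
        extract = PythonOperator(task_id="extract_sales", python_callable=extract_sales)
        report = PythonOperator(task_id="build_reports", python_callable=build_reports)
        extract >> report  # reports build only after extraction succeeds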

Business Analytics Engineer

Roche

Full Time | Jul 2017 - Mar 2018

São Paulo, BR

  • Created a data warehouse in SQL Server and PySpark to pull data from Salesforce, enabling customer-journey KPIs.
  • Developed KPIs and dashboards in Power BI, giving the organization a full view of the forecast process and strategic focus.
  • Migrated VBA and Excel reports to Python.
  • Designed and developed analytical data structures.
  • Built databases and table structures following the star-schema architecture (see the sketch after this list).
  • Explained data results clearly and discussed how they could be utilized to support project objectives.
  • Selected methods and criteria for warehouse data evaluation procedures.
  • Mapped data between source systems and warehouses.
  • Validated warehouse data structure and accuracy.
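
A minimal PySpark sketch of the star-schema pattern named above: the fact table holds foreign keys and additive measures, and the small dimension tables are joined in at query time. All table, path, and column names are hypothetical.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("star-schema").getOrCreate()

    # Dimension tables: one row per customer/product, keyed by surrogate keys.
    dim_customer = spark.read.parquet("/data/dw/dim_customer/")
    dim_product = spark.read.parquet("/data/dw/dim_product/")

    # Fact table: foreign keys plus additive measures only.
    fact_sales = spark.read.parquet("/data/dw/fact_sales/")

    # A typical KPI query: revenue by customer segment and product line.
    kpi = (
        fact_sales.join(dim_customer, "customer_key")
        .join(dim_product, "product_key")
        .groupBy("customer_segment", "product_line")
        .sum("revenue")
    )
    kpi.show()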

Business Intelligence Analyst

Santander

Full Time | Aug 2014 - Jun 2017

São Paulo, BR

  • Automated Excel reports, reducing errors and inconsistencies.
  • Improved the flow of information by developing a centralized pipeline using SQL Server triggers and stored procedures.
  • Developed KPIs and dashboards for strategic focus.
  • Optimized data sources and processing rules to enhance data quality through the design and development phases.
  • Determined data storage and optimization policies, shaping the organization's efforts to enhance performance.
  • Compiled, cleaned, and manipulated data for proper handling.
  • Explained data results clearly and discussed how they could be utilized to support project objectives.

Education

MBA: Data Science

Universidade de São Paulo

Bachelor's Degree, Information Systems

FIAP

Certifications

Microsoft Certified: Data Analyst Associate
