Great on Implement LRU Cache Algorithm, Difficulty Medium
Great on Design an ETL Pipeline for Integrated Insights, Difficulty Hard
Great on Optimize AWS S3 Storage and Processing Architecture, Difficulty Hard
Built an end-to-end pipeline to sync reconciliation systems between SAP, Odoo, and e-commerce platforms using Airflow
Designed scalable ETL pipelines between legacy platforms and cloud platforms/CRMs, including Odoo
Led the implementation of a change data capture (CDC) pipeline migrating data from PostgreSQL to Azure Data Lake, where it was consumed to enable efficient data updates for business intelligence
Worked with cross-functional team members to design a scalable data migration solution, built with Python and Airflow, enabling accurate reconciliation between the client's previous system (Foodics) and its current system (Odoo)
Optimized data ingestion pipelines with Airbyte (ELT) and connected reporting to a Metabase dashboard, empowering stakeholders to make data-driven decisions and identify growth, and achieving a 30% reduction in processing time
Implemented AWS Redshift data warehouses enabling rapid access to high-volume marketing datasets for advanced analytics
Consumed real-time streaming data with Azure Event Hubs and Microsoft Fabric for a financial institution, improving data processing speed by 50% for business intelligence
Worked with RPA developers to pull daily financial data from SFTP to ADLS, using Azure Data Factory for incremental loading and Databricks to transform daily records
Mentored team in data governance, data quality standards, and performance optimization for large-scale datasets.
Engineered Databricks Delta Live Tables pipelines on Azure (ADLS + Synapse SQL), improving data accuracy by 30% and cutting end-to-end processing time by 50% across a Medallion Lakehouse
Architected and maintained AWS Glue Jobs and Glue Catalog metadata to structure raw partner files into analytics-ready datasets. Designed partition strategies (daily/monthly) optimized for Athena query performance and long-term lake maintainability, supporting BI consumption via Athena SQL
Led the end-to-end migration of NetSuite data infrastructure from V1 to V2 (SuiteAnalytics Connect), re-engineering 17 legacy tables with updated OAuth2 JWT authentication, ensuring full data integrity and continuity of historical financial records for the FP&A team.
Designed and developed reconciliation tables by authoring complex ad-hoc SQL queries, enabling accurate financial reconciliation and data validation across key reporting datasets.
Conducted root cause analysis on data quality issues using Snowflake, collaborating with data integrity teams to resolve discrepancies and drive continuous improvement.
Built complex queries, stored procedures, and views to support advanced analytics requirements and data integrity rules
Migrated legacy stored procedures to modular dbt models, improving transparency and reducing maintenance time by 45%
I also engineered a partition monitoring solution that queries partition counts across entire databases and tables in AWS Glue and calculates usage against the 21 million partition threshold enforced by AWS. The system applies configurable alert thresholds: when usage approaches or exceeds the limit, it automatically logs the breach, giving the team proactive visibility and preventing pipeline failures caused by partition limit violations.
I led an initiative to migrate Athena queries to Spark queries within AWS Glue jobs. This transition significantly improved job execution speed and reduced cloud compute costs for the company, demonstrating my ability to identify performance bottlenecks and deliver optimizations with direct business impact.
Fraud-Detection-PySpark is a real-time fraud detection pipeline using PySpark, Kafka, Google Cloud services, and BigQuery. It processes transaction data from multiple sources, addresses class imbalance in fraud detection, and prioritizes precision and recall over accuracy. The project covers business understanding, data infrastructure, EDA, data preparation, and deployment readiness.

This project involves an ETL (Extract, Transform, Load) process to analyze sleep data exported from Apple Health to iCloud in XML format. The data is then processed and transformed using AWS services, queried through Amazon Athena, and visualized in an Apache Superset dashboard.


Adunoluwa A. is a senior-level developer