Great score on Find Indices of Two Numbers That Add Up to Target, Difficulty Easy
Great score on Design a Scalable and Compliant SaaS Platform Architecture, Difficulty Medium
Great score on Enhance Kubernetes Security on GCP, Difficulty Easy
Great score on Reverse Digits of a 32-bit Integer, Difficulty Easy
Great score on Word Search in Grid, Difficulty Medium
Great score on Validate Parentheses in a String, Difficulty Hard
Collaborate with business clients and stakeholders to gather business requirements and build data pipelines aligned with client goals and reporting needs.
Deliver weekly live training sessions for 20+ intern data engineers, turning complex technical concepts into practical, beginner-friendly lessons and raising learner participation and engagement by an estimated 60%.
Guide interns through hands-on projects aligned with team work, strengthening their problem-solving skills and their confidence in building ETL, ELT, and streaming data pipelines both independently and collaboratively.
Trained 30+ students in SQL and dbt query writing, improving learners’ confidence and SQL proficiency.
Created case studies and projects that strengthened learners’ ability to write SQL queries and perform data analysis.
Designed analytics projects and case studies that improved learners’ confidence in data cleaning, visualization, and dashboard creation by an estimated 70%.
Built a real-time data pipeline with Databricks and Python using Bronze, Silver, and Gold layers, reducing data latency by 35%. The layered flow separates raw, cleaned, and business-ready data, making fraud analysis easier and more reliable. Handled late-arriving and duplicate events in the stream so that real-time fraud detection results stay clean and consistent.
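
The late and duplicate event handling mentioned above can be illustrated with a minimal sketch in plain Python. This is not the project's Databricks code: the `Event` fields, the `clean_stream` helper, and the 10-minute lateness threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    # Hypothetical event shape; the real pipeline's schema is not known.
    event_id: str
    event_time: datetime
    amount: float


def clean_stream(events, allowed_lateness=timedelta(minutes=10)):
    """Split events into accepted and rejected (late or duplicate).

    The watermark is the largest event time seen so far minus the
    allowed lateness (an assumed threshold). Events older than the
    watermark count as late; repeated event_ids count as duplicates.
    """
    seen_ids = set()  # unbounded here; a real stream would evict old ids
    max_event_time = None
    accepted, rejected = [], []
    for event in events:
        if max_event_time is None or event.event_time > max_event_time:
            max_event_time = event.event_time
        watermark = max_event_time - allowed_lateness
        if event.event_time < watermark:
            rejected.append((event, "late"))
        elif event.event_id in seen_ids:
            rejected.append((event, "duplicate"))
        else:
            seen_ids.add(event.event_id)
            accepted.append(event)
    return accepted, rejected


t0 = datetime(2024, 1, 1, 12, 0)
sample = [
    Event("a", t0, 10.0),
    Event("b", t0 + timedelta(minutes=5), 20.0),
    Event("a", t0, 10.0),                          # repeated id
    Event("c", t0 + timedelta(minutes=30), 5.0),   # moves watermark to 12:20
    Event("d", t0 + timedelta(minutes=15), 7.0),   # older than the watermark
]
accepted, rejected = clean_stream(sample)
print([e.event_id for e in accepted])
print([(e.event_id, reason) for e, reason in rejected])
```

In a Spark-based pipeline the same idea is usually expressed with a watermark on the event-time column combined with deduplication on an event key, so that deduplication state can be dropped once the watermark passes.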

Built a serverless ETL pipeline with AWS Kinesis Firehose and Lambda, helping reduce infrastructure and maintenance costs by 50%. Improved analyst query speed by 4x by transforming raw event logs into partitioned Parquet files using PySpark on Amazon EMR. Used Terraform to automate the full AWS infrastructure setup, reducing deployment time from days to a few minutes and keeping environments consistent.

Engineered a streaming pipeline using Kafka, PySpark, and PostgreSQL to process live stock trades, reducing data delay and enabling near real-time market analytics. Transformed the streaming stock market data into structured datasets for reporting and analysis, giving executives, clients, and stakeholders faster and more consistent access to market insights for decision-making. Added monitoring and observability with Prometheus and Grafana to track pipeline health and failures, enabling faster issue detection.
