Developed an end-to-end conversational AI voice agent that lets users query private PDF documents using natural speech. Unlike a standard chatbot, the system uses Retrieval-Augmented Generation (RAG) to ground every spoken response in the provided source material, reducing hallucinations and providing a hands-free research experience.
Technical Architecture: The "Listen-Think-Speak" Pipeline
Speech Processing (The Ear): Integrated SpeechRecognition with Google's Web Speech API to capture live user audio and transcribe it into text queries.
Document Intelligence (The Brain):
Data Ingestion: Used PyPDFLoader to load and partition the PDF documents.
Vector Search: Used the HuggingFace all-MiniLM-L6-v2 model for local text embeddings and FAISS (Facebook AI Similarity Search) as a fast, in-memory vector index for low-latency retrieval.
Reasoning Engine: Ran Llama 3.3 (70B) on Groq's LPU (Language Processing Unit) for low-latency inference, keeping the conversation flowing naturally.
Conversational Logic & Memory: Designed a ConversationalRetrievalChain with a custom Condense Question Prompt. The prompt rewrites each follow-up into a standalone question using the history held in ConversationBufferMemory, so the agent can resolve pronouns (like "it" or "that") across multi-turn dialogues.
Audio Synthesis (The Voice): Used gTTS (Google Text-to-Speech) to convert the grounded text responses back into speech.
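The retrieval step of the pipeline can be illustrated with a minimal sketch. It is not the project's code: it swaps FAISS for a brute-force cosine-similarity search in plain Python, and the three-dimensional vectors are made-up stand-ins for real all-MiniLM-L6-v2 embeddings (which have 384 dimensions). The names `cosine`, `top_k`, and `chunk_vecs` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunk_vecs, k=2):
    """Return the indices of the k chunks most similar to the query."""
    ranked = sorted(
        range(len(chunk_vecs)),
        key=lambda i: cosine(query_vec, chunk_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


# Toy "embeddings" for three document chunks (assumed values, for illustration only).
chunk_vecs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.9, 0.1, 0.0],
]
query_vec = [1.0, 0.0, 0.0]

print(top_k(query_vec, chunk_vecs, k=2))  # -> [0, 2]
```

In a RAG setup, the chunks at the returned indices would be passed to the language model as context, which is what keeps the spoken answer tied to the source document.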
Great on Find the Length of the Longest Substring Without Repeating Characters, Difficulty Easy
Great on LLM Integration Producing Inconsistent Outputs, Difficulty Medium
Great on Evaluate and Improve RAG Chatbot Response Quality, Difficulty Medium
Great on Reverse Digits of a 32-bit Integer, Difficulty Easy
Great on Word Search in Grid, Difficulty Medium
Perfect on Validate Parentheses in a String, Difficulty Hard
Developed and deployed AI-powered solutions using machine learning and deep learning techniques for real-world applications.
Created voice-to-voice AI agents integrating speech-to-text and text-to-speech systems for interactive applications.
Worked on custom model fine-tuning (e.g., Whisper, LLMs) using domain-specific datasets to improve accuracy and performance.
Developed AI chatbots for customer support, healthcare assistance, and business automation.
Implemented recommendation systems (collaborative and hybrid filtering) using datasets like MovieLens.
Conducted data preprocessing, feature engineering, and exploratory data analysis (EDA) for structured and unstructured datasets.
Optimized model performance through hyperparameter tuning, evaluation metrics, and error analysis.
Built and deployed AI applications using Python, PyTorch, and Hugging Face.
Integrated AI models into real-time applications and APIs for scalable solutions.
Collaborated with clients to understand business requirements and translate them into AI-driven solutions.
Developed healthcare AI models for disease detection (e.g., EEG-based seizure detection).
Built liveness detection systems using computer vision techniques to prevent spoofing.
Designed AI agents for insurance sales and customer interaction automation.
Ensured data privacy, security, and ethical AI practices in all projects.
Automated end-to-end business processes, including call center operations.
Built and fine-tuned Large Language Models (LLMs) (e.g., LLaMA-based models) for conversational AI, particularly for Urdu call center and healthcare use cases.
Designed and implemented end-to-end speech pipelines, including audio preprocessing, transcription (Whisper), speaker diarization, and text generation.
Created voice-to-voice AI agents integrating speech-to-text and text-to-speech systems for interactive applications.
Worked on custom model fine-tuning (e.g., Whisper, LLMs) using domain-specific datasets to improve accuracy and performance.
Fine-tuned models for low-resource languages (Urdu) to improve accessibility and inclusivity.
Developed an advanced speech emotion recognition system capable of identifying human emotions from voice data using state-of-the-art deep learning models. The project focused on analyzing acoustic features and linguistic content to improve emotion classification accuracy in real-world conversational settings. Leveraged Librosa for audio preprocessing and feature extraction, including MFCCs and spectral features. Integrated Whisper for robust speech-to-text transcription, enabling the model to incorporate contextual understanding alongside audio signals. Utilized transformer-based architectures, specifically wav2vec2-large and Whisper, to capture both raw audio representations and semantic information. Implemented the pipeline using the Transformers library for model training and inference, and used the evaluate library to assess performance through key metrics such as accuracy, precision, recall, and F1-score. The system was designed to handle noisy and real-world audio inputs, making it suitable for applications in call centers, healthcare monitoring, and conversational AI systems.
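For reference, the evaluation metrics named above can be computed as follows for a multi-class emotion task. This is a minimal plain-Python sketch rather than the project's evaluation code; it assumes macro averaging (each emotion class weighted equally), and the function name `macro_scores` and the labels are hypothetical.

```python
def macro_scores(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 over all labels."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Made-up labels for four utterances (illustration only).
y_true = ["angry", "happy", "happy", "sad"]
y_pred = ["angry", "happy", "sad", "sad"]
print(macro_scores(y_true, y_pred))
```

Macro averaging is one reasonable choice when emotion classes are imbalanced, since a rare class counts as much as a frequent one; weighted or micro averaging would give different numbers.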
Designed and developed an end-to-end voice-based AI agent capable of handling real-time customer interactions in a call center environment. The system processes spoken input, generates intelligent responses, and delivers natural-sounding voice output, enabling seamless human-like conversations. Built a complete pipeline integrating Whisper for accurate speech-to-text transcription and gTTS for text-to-speech synthesis, allowing full voice-to-voice communication. Implemented advanced language understanding and response generation using LLaMA with a Retrieval-Augmented Generation (RAG) framework, enhancing the agent’s ability to provide context-aware and knowledge-grounded responses. Additionally, leveraged GPT-4.5 for improved conversational quality and dynamic dialogue handling. Utilized Transformers, Torchaudio, and Librosa for audio processing, model integration, and feature handling, while Accelerate was used to optimize training and inference performance. Model evaluation was conducted using the evaluate library to ensure accuracy, response relevance, and robustness. The system was specifically designed for real-world deployment scenarios such as insurance sales and customer support, with support for multilingual (including Urdu) conversations, making it highly adaptable for diverse user bases.
Developed a deep learning-based image classification system to accurately identify fish species from visual data. The project focused on building a robust model capable of distinguishing between multiple species under varying lighting and environmental conditions. Utilized TensorFlow and Keras to design, train, and evaluate convolutional neural network architectures. Implemented a custom CNN model for feature extraction and classification, and leveraged MobileNet for transfer learning to improve performance and reduce computational complexity. Applied image preprocessing and data augmentation techniques to enhance model generalization and handle limited or imbalanced datasets. The model was evaluated using standard performance metrics such as accuracy, precision, and recall, demonstrating strong classification performance. This system can be applied in areas such as marine biodiversity monitoring, fisheries management, and automated inspection systems.
Developed a machine learning-based predictive system to assess the likelihood of heart disease using patient health data. The project aimed to support early diagnosis by identifying key risk factors and patterns within clinical datasets. Utilized Pandas and NumPy for data preprocessing, cleaning, and feature engineering, while Seaborn was used for exploratory data analysis and visualization of correlations and trends. Implemented multiple machine learning models using Scikit-learn, including Random Forest (RF), Gradient Boosting (GB), and Logistic Regression, to compare performance and select the most effective approach. Conducted model evaluation using metrics such as accuracy, precision, recall, and F1-score to ensure reliable predictions. The final model demonstrated strong performance and interpretability, making it suitable for real-world healthcare applications and decision support systems.
Developed a machine learning-based system to predict the likelihood of accidents using historical and environmental data. The project focused on identifying key risk factors such as road conditions, time, weather, and traffic patterns to enable proactive safety measures. Utilized Pandas and NumPy for data cleaning, preprocessing, and feature engineering, while Seaborn was used for exploratory data analysis and visualization of trends and correlations. Implemented multiple machine learning models, including Random Forest (RF) and Support Vector Machine (SVM) from Scikit-learn alongside XGBoost, to compare performance and improve prediction accuracy. Performed model evaluation using metrics such as accuracy, precision, recall, and F1-score to ensure reliability and robustness. The final model provided actionable insights that can support traffic management systems, urban planning, and accident prevention strategies.
Developed a machine learning-based system to detect epileptic seizures from EEG (electroencephalogram) signals, aiming to support early diagnosis and continuous patient monitoring. The project focused on extracting meaningful patterns from high-dimensional time-series brain signal data. Utilized Pandas and NumPy for data preprocessing, cleaning, and feature engineering, and applied Seaborn for exploratory data analysis and visualization of signal patterns and class distributions. Implemented multiple machine learning models, including Random Forest (RF), Gradient Boosting (GB), Support Vector Machine (SVM), and K-Nearest Neighbors (KNN) from Scikit-learn alongside XGBoost, to compare performance across different algorithms. Performed comprehensive model evaluation using metrics such as accuracy, precision, recall, and F1-score to ensure robust and reliable detection. The final system demonstrated strong predictive capability and can be applied in real-time healthcare monitoring systems and clinical decision support tools.
Developed a time series forecasting model to analyze and predict employee retirement trends over time, supporting workforce planning and organizational decision-making. The project focused on identifying temporal patterns, seasonality, and long-term trends in retirement data. Utilized Pandas and NumPy for data preprocessing and time-series structuring, while Seaborn was used for exploratory data analysis and visualization of trends and seasonal patterns. Applied statistical modeling techniques using statsmodels, including Linear Regression and VARIMA (Vector Autoregressive Integrated Moving Average), to capture both univariate and multivariate temporal dependencies. Performed model evaluation using appropriate time-series metrics and residual analysis to ensure accuracy and reliability of forecasts. The resulting insights can assist organizations in proactive workforce management, succession planning, and policy development.
Developed a high-precision computational pipeline for retinal image analysis, focusing on the emerging field of oculomics. Using deep learning architectures, the project automates the segmentation of critical ocular structures such as the retinal vasculature, optic disc, and macula to identify non-invasive biomarkers for systemic and neurological health. The technical core of this work involves engineering robust feature extraction models to quantify morphological changes, including vessel tortuosity and the arteriole-to-venule ratio. These metrics are integrated into a predictive framework designed to detect early-stage indicators of cardiovascular and neurodegenerative conditions. By turning fundus imaging into a broader diagnostic tool, this research aims to bridge the gap between computer vision and personalized, preventative medicine, making health screening more accessible and scalable.
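The two morphological metrics mentioned above can be sketched in a few lines. This is an illustration under simplifying assumptions, not the pipeline's implementation: tortuosity is taken as the arc-to-chord ratio of a vessel centerline, and the arteriole-to-venule ratio as a plain ratio of mean vessel widths. The function names and all input values are hypothetical.

```python
import math


def tortuosity(points):
    """Arc-to-chord ratio of a centerline given as a list of (x, y) points.

    A perfectly straight vessel scores 1.0; more winding vessels score higher.
    """
    arc = sum(math.dist(p, q) for p, q in zip(points, points[1:]))
    chord = math.dist(points[0], points[-1])
    if chord == 0:
        raise ValueError("centerline start and end points coincide")
    return arc / chord


def arteriole_to_venule_ratio(arteriole_widths, venule_widths):
    """Ratio of mean arteriole width to mean venule width (simplified)."""
    mean_a = sum(arteriole_widths) / len(arteriole_widths)
    mean_v = sum(venule_widths) / len(venule_widths)
    return mean_a / mean_v


# Made-up centerline and width measurements in pixels (illustration only).
straight = [(0, 0), (1, 0), (2, 0)]
winding = [(0, 0), (3, 4), (6, 0)]
print(tortuosity(straight))  # -> 1.0
print(tortuosity(winding))
print(arteriole_to_venule_ratio([8.0, 10.0], [12.0, 12.0]))  # -> 0.75
```

In practice the centerline points and widths would come from the segmentation output, so the quality of these metrics depends directly on how clean the vessel masks are.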