Led migration of healthcare risk‑scoring pipelines to AWS using PySpark, EMR, Glue and S3, optimising large‑scale risk and anomaly detection workloads. Owned 75% of the core migration runtime, improved performance by 45%, processed 2.8 TB daily and cut monthly costs by $12K. Designed AI‑driven RAG solutions and predictive models integrated with ML workflows, built distributed data pipelines for LLM dataset preparation that processed over 1.2 billion tokens, and engineered ETL orchestration with AWS Step Functions and Airflow, tripling concurrency and reducing recovery time by 70%.
AWS (Redshift, S3, Lambda, EMR, Step Functions, API Gateway, IAM, CloudFormation), Apache Spark (PySpark), Airflow, distributed systems, ETL pipeline automation and data warehousing.
LLM data preparation (tokenisation, embeddings, fine‑tuning datasets), NLP, predictive analytics, RAG models, Generative AI (Bedrock), Scikit‑learn, model deployment, experiment tracking and model evaluation (AUC, precision, recall).
Object‑oriented programming, design patterns, algorithms, data structures, concurrency and system design.
CMS RAF risk scoring, HIPAA, FHIR, HL7 and healthcare claims & payer data frameworks.
CloudWatch, Splunk, SES, SNS, data validation frameworks, schema design and data dictionaries.
Python, PySpark, SQL (plus familiarity with Scala and Java), QuickSight for dashboards and reporting, modern ETL/orchestration (Airflow/Step Functions).
Developed a supervised machine‑learning model using Python and Scikit‑learn to classify high‑risk transactions from historical payment data, achieving 81% recall on fraudulent transactions.
Implemented a Spark‑based anomaly detection pipeline on AWS to analyse log data and identify suspicious access patterns, improving detection speed and scalability.
GPA: 3.94, Scholastic Excellence Award.
GPA: 3.64.
GPA: 3.78.
Machine Learning Specialisation (Coursera, Andrew Ng); Certified AWS Cloud Practitioner; Certified Data Analytics (AWS); PySpark; Python; Natural Language Processing; Machine Learning; Pattern Recognition; Advanced SQL; AWS; Problem Solving.