Hello, I'm

Kush Patel

Transforming raw data into decisions that matter. Specializing in scalable machine learning models and predictive analytics.

About Me

01

I am a Data Scientist and Machine Learning Engineer passionate about bridging the gap between complex algorithms and real-world business value. With a strong foundation in statistical modeling and distributed computing, I architect end-to-end data pipelines that empower decision-makers. My focus is on deploying scalable, interpretable AI solutions that drive measurable impact.

🧠 ML Research 📊 Data Viz ⚡ Scalable APIs ☕ Coffee-Driven Dev
Currently focusing on: Advanced LLM fine-tuning, agentic AI, and Model Context Protocol (MCP)

Arsenal

02

Core Competencies

Python & R: 95%
SQL & Databases: 90%
Machine Learning: 85%
Deep Learning (PyTorch, TF): 80%
Data Visualization (Plotly, Tableau, Power BI): 85%

Technologies & Tools

AWS
Docker
Git
Spark
FastAPI
Pandas
NLP
BERT
LangChain
LLMs
MySQL
MongoDB
JavaScript
Excel
Django
RAG
Pinecone

Featured Work

03

Auto-Grading Subjective Test Platform

A full-stack Django-based automatic subjective grading platform that uses BERT-driven semantic similarity (Sentence Transformers, cosine scoring) with role-based dashboards to evaluate unstructured answers at scale, achieving a 0.87 F1 score and retaining educator control via a human-in-the-loop override.

Django NLP SQLite
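
A minimal sketch of the cosine scoring idea behind the grading step, using only the Python standard library. The toy vectors stand in for sentence embeddings, and `suggest_marks` with its clamp-and-scale mapping is a hypothetical illustration, not the platform's actual scoring rule.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero embedding as "no similarity"
    return dot / (norm_a * norm_b)


def suggest_marks(student_vec, reference_vec, max_marks):
    """Hypothetical mapping: clamp similarity to [0, 1], then scale to marks."""
    score = max(0.0, min(1.0, cosine_similarity(student_vec, reference_vec)))
    return round(score * max_marks, 1)


# Toy 3-dimensional "embeddings"; real sentence embeddings are much larger.
reference = [0.2, 0.7, 0.1]
student = [0.25, 0.65, 0.05]
print(suggest_marks(student, reference, max_marks=10))
```

In a human-in-the-loop setup, a value like this would be a suggestion that the educator can override rather than a final grade.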

CareBot: Clinical-Support Medical Chatbot

A context-aware conversational AI for preliminary symptom analysis, integrating LangChain and Gemini LLMs with a Pinecone vector database to perform high-speed semantic search across chunks of clinical text from the Gale Encyclopedia of Medicine.

RAG Pinecone LLM
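
A minimal sketch of how source text might be split into overlapping chunks before embedding and indexing. The `chunk_text` helper, the character-based windows, and the default sizes are assumptions for illustration; they are not CareBot's actual splitter or settings.

```python
def chunk_text(text, chunk_size=200, overlap=40):
    """Split text into overlapping character windows.

    Overlap keeps a sentence that straddles a boundary visible in both
    neighbouring chunks, which tends to help retrieval.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


sample = "Fever is a temporary rise in body temperature. " * 10
for i, chunk in enumerate(chunk_text(sample, chunk_size=120, overlap=20)):
    print(i, len(chunk))
```

Each chunk would then be embedded and stored in the vector index, so that a user query can be matched against the most similar passages.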

Biological Age Prediction

Developed a reproducible, end-to-end ML pipeline for estimating biological age from DNA methylation data using interpretable models and a calibrated stacked ensemble, with robust cross-study validation demonstrating strong generalization on an external cohort.

Python Pandas ML Pipeline

Diabetes Prediction

Built an end-to-end, interpretable diabetes prediction pipeline on a 100K-record clinical dataset using Lasso-based feature selection and ensemble modeling, achieving a 0.97 AUC with a tuned XGBoost model while translating predictions into clinically actionable risk insights.

Python Regression XGBoost

Experience

04

Machine Learning Intern

InternshipStudio

May 2023 - Jun 2023
  • Engineered predictive regression models (Random Forest, Keras/TensorFlow) to forecast digital asset monetization based on audience interaction signals and content metadata in Google BigQuery, outperforming the baseline decision-tree model by 15% in Mean Absolute Error (MAE) to support targeted promotional strategies.
  • Architected a data preprocessing pipeline in Python using Pandas, NumPy, and Spark to handle corrupted data flags, encode complex categorical taxonomies, and filter statistical anomalies in engagement metrics. Delivered this while managing multiple projects, streamlining data preparation so that downstream models could train on clean data without manual intervention.
  • Developed interpretability frameworks using Random Forest feature importance to demystify complex predictive outputs, identifying the primary drivers of user engagement and translating model behavior into actionable content strategies for stakeholders.
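
A minimal sketch of how a Mean Absolute Error comparison against a baseline can be computed. It assumes the improvement is read as a relative reduction in MAE, which is one common interpretation; the numbers are made up and are not the internship's data.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_improvement(baseline_mae, model_mae):
    """Fraction by which the model's MAE is lower than the baseline's."""
    if baseline_mae <= 0:
        raise ValueError("baseline MAE must be positive")
    return (baseline_mae - model_mae) / baseline_mae


# Illustrative values only.
actual = [100.0, 150.0, 200.0, 250.0]
baseline_pred = [120.0, 130.0, 220.0, 230.0]
model_pred = [117.0, 133.0, 217.0, 233.0]

baseline_mae = mean_absolute_error(actual, baseline_pred)
model_mae = mean_absolute_error(actual, model_pred)
print(f"baseline MAE={baseline_mae:.1f}, model MAE={model_mae:.1f}")
print(f"relative improvement={relative_improvement(baseline_mae, model_mae):.0%}")
```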

Data Analyst Intern

Trainity

Nov 2022 - Dec 2022
  • Engineered data sanitization workflows using SQL and Excel on a 300,000+ row loan portfolio, eliminating 41 high-null features and imputing missing financial data to establish a reliable baseline for risk modeling.
  • Conducted segmented bivariate analysis and designed KPI-driven dashboards in Power BI and other Microsoft applications for a highly imbalanced credit dataset (92% non-default vs. 8% default), visualizing demographic distributions to uncover hidden risk correlations in income and employment.
  • Presented a data-driven risk mitigation strategy to business stakeholders, highlighting the Transport sector's 16% default rate and proposing dynamic interest rate adjustments to optimize the underwriting process, while coordinating multiple project work streams.
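
A minimal sketch of the high-null feature filtering idea from the data sanitization step, in plain Python. The 50% threshold, the column names, and the `drop_high_null_columns` helper are hypothetical; the original work was done in SQL and Excel.

```python
def null_fraction(rows, column):
    """Share of rows where the column is missing (None or empty string)."""
    missing = sum(1 for row in rows if row.get(column) in (None, ""))
    return missing / len(rows)


def drop_high_null_columns(rows, threshold=0.5):
    """Remove columns whose null fraction exceeds the threshold.

    Returns the cleaned rows and the list of dropped column names.
    """
    if not rows:
        return [], []
    columns = list(rows[0].keys())
    dropped = [c for c in columns if null_fraction(rows, c) > threshold]
    kept = [c for c in columns if c not in dropped]
    cleaned = [{c: row.get(c) for c in kept} for row in rows]
    return cleaned, dropped


# Hypothetical loan records.
loans = [
    {"income": 50000, "ext_score": None, "sector": "Transport"},
    {"income": None, "ext_score": None, "sector": "Trade"},
    {"income": 42000, "ext_score": 0.4, "sector": "Transport"},
]
cleaned, dropped = drop_high_null_columns(loans, threshold=0.5)
print(dropped)
```

Columns that survive the filter but still contain gaps, such as `income` here, would then be candidates for imputation.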

Education

05

Illinois Institute of Technology

M.S. in Data Science

2024 - 2026 · GPA: 3.8/4.0
ARC MATH Tutor Indian Student Association

University of Mumbai

B.Tech in Computer Science

2019 - 2024
Data Structures Big Data Technologies Python Database Management

Want the full picture?

My resume has everything, from specific project architectures to publications.

Download Resume PDF · Last updated February 2026

Let's Connect

06

Currently open for new opportunities. Whether you have a question, a project proposal, or just want to say hi, I'll try my best to get back to you!

Email

patel.h.kush@gmail.com

LinkedIn

linkedin.com/in/kush-patel2416

Location

Chicago, IL (Remote OK)
