Currently Open to Work • New York, NY

Atharva Deshmukh
Data Scientist & AI Engineer

3+ years building production GenAI systems, multi-agent workflows, and ML solutions. Expert in MCP, LLM fine-tuning, and statistical ML, delivering 70–90% efficiency gains across healthcare and enterprise.

CVS Health CrowdDoing VR Digital Solutions Rochester Institute of Technology AWS SageMaker Hugging Face LangChain PyTorch Microsoft Azure Oracle Cloud
3+ Years Experience: Building production ML & GenAI systems
10+ Projects Delivered: From MVPs to enterprise platforms
2 Cloud Certifications: Azure & Oracle Cloud Data Science

About Me

Building AI That Actually Scales

I'm a Data Scientist and AI Engineer with 3+ years of experience building production GenAI systems, multi-agent workflows, and machine learning solutions that deliver measurable business impact.

My expertise spans Model Context Protocol (MCP), LLM fine-tuning, and statistical ML across healthcare, enterprise, and environmental domains. Currently completing my Master's in IT & Analytics at RIT.

MS, IT & Analytics (RIT, 2025)
BE, Computer Engineering (Pune University)
01 DATA SCIENCE
02 MACHINE LEARNING
03 GENERATIVE AI
04 DATA ENGINEERING

Professional Journey

Where I've Worked

Forward Deployed AI Engineer (Contract)

ContextQA Feb 2026 – Mar 2026
  • Built React + TypeScript CRM to capture and manage 115+ AI-sourced leads, reducing sales rep processing time by 60% through automated pipeline management.
  • Delivered 18 qualified prospects (16% conversion rate) within 2 weeks, validating an agentic outreach pipeline from POC to production.
React TypeScript Agentic AI CRM

Data Scientist (AI/ML)

CrowdDoing Sep 2025 – Present
  • Built wildfire satellite image-classification model using PyTorch (Landsat-8 + MODIS), achieving 87% precision and 82% recall, reducing false alerts by 30% across monitored regions.
  • Developed a real-time inference pipeline on AWS SageMaker that combines environmental sensor signals with ensemble ML models to catch wildfire spread early.
  • Integrated LLM-based natural language search (LangChain + OpenAI) into dashboards for conversational access to risk insights, improving coordination efficiency by 25%.
  • Built MCP-powered AI agent for automated QA with council-of-agents validation, dynamically generating Playwright tests, reducing manual testing time by 70% while improving coverage to 95%.
PyTorch AWS SageMaker LangChain MCP Playwright

GenAI Engineer

CVS Health Jan 2025 – Feb 2026
  • Built a clinical RAG pipeline using pgvector and Azure AI Foundry, enabling real-time retrieval from 2M+ patient records, with answer accuracy evaluated using RAGAS.
  • Fine-tuned clinical embeddings with QLoRA (Hugging Face) to improve medical case-classification accuracy by 19%, enabling more reliable treatment prioritization.
  • Deployed FastAPI inference microservices on GCP Vertex AI serving 10K+ daily clinical predictions for pharmacy and case-management teams.
  • Implemented MLflow experiment tracking with CI/CD integration, reducing release cycle time by 30%.
RAG pgvector Azure AI Foundry QLoRA RAGAS FastAPI

ML Engineer

VR Digital Solutions Mar 2021 – Jul 2023
  • Built XGBoost-based customer segmentation and churn models on 2M+ records, improving marketing ROI by 17% across multiple client accounts.
  • Deployed YOLO-based defect detection pipeline for product image QA, saving $200K+ in manual inspection costs and reducing defect escape rate by 35%.
  • Engineered ETL workflows using Python, SQL, Pandas, and Airflow to process 3M–8M daily records, improving pipeline throughput by 2.1x.
  • Standardized MLflow experiment tracking and model registry across team, enabling reproducible deployments and cutting release cycle time by 25%.
XGBoost YOLO PyTorch MLflow Airflow

My AI & Data Toolbox

A comprehensive suite of tools and frameworks I use to design, build, and deploy AI systems at scale.

Programming: Python, SQL, R, Java, TypeScript
ML / DL Frameworks: TensorFlow, PyTorch, Scikit-learn, XGBoost, YOLO
Cloud Platforms: AWS, Azure AI Foundry, GCP Vertex AI, OCI
Gen AI & LLMs: LangChain, LangGraph, CrewAI, n8n, RAG, QLoRA, RAGAS, MCP
MLOps & DevOps: Docker, Kubernetes, Terraform, MLflow, Airflow, CI/CD, Observability
Data & Full-Stack: FastAPI, PostgreSQL, pgvector, Redis, FAISS, Pinecone, Pandas, Tableau

Featured Work

Projects That Ship

Agentic AI
Vizard AI
Intelligent AI platform using LangChain to automatically generate business-relevant data visualizations and insights from raw datasets.
Machine Learning
LendSafe
FCRA-compliant loan decision explanation system with fine-tuned IBM Granite 350M for transparent, auditable lending decisions.
Agentic AI
Hamcaller Custom LLM
Custom fine-tuned LLM designed to identify and filter spam calls using advanced NLP techniques to protect users from fraud.
Agentic AI
Stars Yapp
GenAI-powered astrology predictor delivering personalized insights using LLMs and RAG architecture for context-aware predictions.
Data Analysis
F1 Telemetry Analytics
Analytics on Formula 1 telemetry data: modeling driver performance patterns and building predictive models for race outcomes.
Machine Learning
ML Pipeline on AWS SageMaker
End-to-end ML pipeline that predicts phone prices using ensemble methods, deployed as a scalable REST API on AWS SageMaker for production inference.
Status: Production
Platform: AWS SageMaker
Type: REST API

Cloud Certifications

Microsoft Certified: Azure Data Scientist Associate
Credential ID: C2F98ADE9E9ECB9D
Oracle Cloud Infrastructure 2025 Certified Data Science Professional
Oracle Certification 2025

Get In Touch

Let's Connect

I'm currently open to new opportunities and collaborations. Feel free to reach out if you'd like to discuss data science, AI projects, or just want to connect!
