I'm a Healthcare Data Analyst with a rare combination of clinical depth and data engineering breadth — a licensed Pharm.D. with an M.S. in Health Data Science and 3+ years building production-grade analytics systems across payer, PBM, and hospital settings.
I don't just build reports — I build the infrastructure that makes reporting trustworthy. My work sits at the intersection of SQL engineering, ETL architecture, HEDIS/STAR quality measurement, and ML-powered clinical tools — all grounded in hands-on knowledge of how claims, eligibility, and pharmacy data actually flow in the real world.
🏥 Currently: Sr. Healthcare Data Analyst @ Cardinal Health supporting Medicare and Commercial analytics at scale.
┌──────────────────────────────────────────────────────────────────────┐
│         CLINICAL KNOWLEDGE + DATA ENGINEERING + ML/ANALYTICS         │
│                                                                      │
│  ✔ Pharm.D.-trained: I understand NDC/GPI drug logic, PDC            │
│    calculation, and medication adherence at a clinical level         │
│                                                                      │
│  ✔ HEDIS/STAR production experience: denominator/numerator/          │
│    exclusion logic, CMS submission-ready, zero audit defects         │
│                                                                      │
│  ✔ SQL engineering at scale: CTE pipelines, window functions,        │
│    query optimization across 3M+ member datasets                     │
│                                                                      │
│  ✔ ML in production: XGBoost risk model actively used by             │
│    Pharm.D. interns for clinical decision support                    │
│                                                                      │
│  ✔ EDI expertise: end-to-end 837/835/834 validation in Facets        │
│    across enrollment, claims, and payment workflows                  │
└──────────────────────────────────────────────────────────────────────┘
🔬 BRFSS Chronic Disease Risk Engine v2 · Live Production App
Python · XGBoost · SHAP · Streamlit · CDC BRFSS 2019–2021
A multi-disease risk prediction application actively used by Pharm.D. interns at a super-specialty hospital for clinical decision support: a real deployment, not a toy project.
- Predicts patient-level risk for Diabetes, Hypertension, Heart Disease, and Obesity from CDC BRFSS population data
- SHAP explainability surfaces individual risk drivers for clinician interpretability
- Resolved 12 production-level engineering challenges, including age bin misalignment, label inconsistencies across survey years, and Streamlit Cloud deployment failures
- Why it matters: Bridges the gap between population health data and point-of-care clinical workflows
Python · XGBoost · SHAP · CMS PDE + Beneficiary Files · 515K+ records (2015–2023)
An end-to-end ML pipeline built on real CMS public data — not synthetic datasets.
- Engineered 14 predictive features, including polypharmacy indicators (`POLY_5PLUS`, `POLY_10PLUS`), `GENERIC_PERCENT`, `OOP_TOTAL`, and `ESRD_IND`
- AUC: 0.974 | Recall: 0.833 for identifying top-10% high-cost Medicare beneficiaries
- SHAP analysis confirmed `TOTAL_FILLS` as the dominant cost driver (64.4% importance), aligned with real PBM Medication Therapy Management targeting strategies
- Why it matters: Directly actionable for PBM cost containment and Medicare risk stratification programs
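For illustration, here is a minimal Python sketch of how polypharmacy flags and a top-10% cost label could be derived. It is not the project's actual pipeline: the `BeneficiaryYear` record, its field names, and the 5+ and 10+ drug thresholds (inferred only from the feature names `POLY_5PLUS` and `POLY_10PLUS`) are assumptions.

```python
from dataclasses import dataclass


@dataclass
class BeneficiaryYear:
    # Hypothetical record layout, not the CMS file schema.
    bene_id: str
    distinct_drugs: int   # assumed: distinct drugs filled in the year
    total_cost: float     # assumed: total annual drug cost


def polypharmacy_flags(distinct_drugs: int) -> dict:
    """Binary flags for 5+ and 10+ distinct drugs (thresholds are assumed)."""
    return {
        "POLY_5PLUS": int(distinct_drugs >= 5),
        "POLY_10PLUS": int(distinct_drugs >= 10),
    }


def label_top_decile(rows: list) -> dict:
    """Map each bene_id to 1 if it falls in the top 10% by cost, else 0."""
    n_top = max(1, len(rows) // 10)
    ranked = sorted(rows, key=lambda r: r.total_cost, reverse=True)
    top_ids = {r.bene_id for r in ranked[:n_top]}
    return {r.bene_id: int(r.bene_id in top_ids) for r in rows}


if __name__ == "__main__":
    demo = [BeneficiaryYear(f"B{i}", i, 100.0 * i) for i in range(1, 11)]
    print(polypharmacy_flags(demo[5].distinct_drugs))
    print(label_top_decile(demo))
```

A rank-based label like this keeps the positive class at roughly 10% regardless of how skewed the cost distribution is, which is one reason recall is worth reporting alongside AUC.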
Python · TF-IDF · XGBoost · Clinical NLP · 10,000 SOAP Notes
NLP pipeline classifying unstructured clinical notes into 6 disease categories for automated triage support.
- Built a 150+ term medical abbreviation expansion dictionary to normalize clinical shorthand before vectorization
- TF-IDF (20K features + bigrams) + XGBoost achieved 90% accuracy and a macro F1 of 0.64, outperforming the Logistic Regression baseline by 27%
- Validated on an external clinical case dataset for out-of-sample generalization
- Why it matters: Demonstrates applied NLP on the messy, abbreviation-heavy text that defines real clinical documentation
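The abbreviation-expansion step can be sketched roughly as follows. This is an illustrative example only: the five dictionary entries and the `expand_abbreviations` helper are hypothetical stand-ins, not the project's actual 150+ term dictionary or code.

```python
import re

# Hypothetical sample entries; the real dictionary is much larger.
ABBREVIATIONS = {
    "htn": "hypertension",
    "dm": "diabetes mellitus",
    "sob": "shortness of breath",
    "hx": "history",
    "pt": "patient",
}

# Match whole tokens only, longest abbreviations first, ignoring case.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(k) for k in sorted(ABBREVIATIONS, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def expand_abbreviations(note: str) -> str:
    """Replace known clinical abbreviations with their expanded form."""
    return _PATTERN.sub(lambda m: ABBREVIATIONS[m.group(1).lower()], note)


if __name__ == "__main__":
    print(expand_abbreviations("Pt has hx of HTN and DM, c/o SOB."))
```

Expanding before vectorization means "HTN" and "hypertension" map to the same TF-IDF terms instead of splitting their weight across two features. The word-boundary anchors keep longer tokens such as "dmard" from being partially rewritten.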
SQL Server · PostgreSQL · HEDIS · Claims Analytics · EDI Validation
A production-quality SQL portfolio replicating the logic used at real payers and PBMs.
- Diabetic cohort engine: ICD-10 (E10/E11) + NDC drug pathways with full PDC adherence metrics by drug class
- HEDIS quality measures: Complete 30-day all-cause readmission logic with denominator, numerator, and exclusion specifications
- ETL validation framework: 10 automated data quality rules — financial reconciliation, orphan record detection, eligibility overlap checks, CPT/ICD format validation
- Exception reporting dashboards for data governance and pipeline monitoring
- Why it matters: Shows I can build the foundation that HEDIS and STAR reporting actually depend on
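As a rough illustration of the PDC (proportion of days covered) logic mentioned above, here is a minimal sketch written in Python rather than the portfolio's SQL. The function name, the `(fill_date, days_supply)` input shape, and the choice to shift overlapping supply forward are assumptions for this example, not the portfolio's actual implementation.

```python
from datetime import date, timedelta


def proportion_of_days_covered(fills, period_start, period_end):
    """Share of days in [period_start, period_end] with drug on hand.

    fills: iterable of (fill_date, days_supply) tuples for one drug class.
    Early refills are shifted forward so overlapping supply is not lost
    (an assumed convention for this sketch).
    """
    covered = set()
    next_open_day = None  # first day not yet covered by an earlier fill
    for fill_date, days_supply in sorted(fills):
        start = fill_date if next_open_day is None else max(fill_date, next_open_day)
        for offset in range(days_supply):
            day = start + timedelta(days=offset)
            if period_start <= day <= period_end:
                covered.add(day)
        next_open_day = start + timedelta(days=days_supply)
    period_days = (period_end - period_start).days + 1
    return len(covered) / period_days


if __name__ == "__main__":
    fills = [(date(2023, 1, 1), 30), (date(2023, 1, 25), 30)]
    pdc = proportion_of_days_covered(fills, date(2023, 1, 1), date(2023, 3, 31))
    print(f"PDC = {pdc:.3f}")
```

A member is commonly treated as adherent when PDC is at least 0.80. In a SQL version, the same idea is usually expressed with window functions over fills ordered by date.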
| Role | Organization | Key Impact |
|---|---|---|
| Sr. Healthcare Data Analyst | Cardinal Health | 3M+ member pipelines · $2M+ projected savings · 1,800 hrs/yr manual reporting eliminated |
| Healthcare Data Analyst | BCBS | 500K+ member analytics · 18% reduction in reporting errors · 200+ EDI defects resolved |
| Data Analyst Intern | Excelerate | Claims & pharmacy analytics · dashboard development · HIPAA compliance |
| Clinical Data Analyst | ESI Hospital | SQL/Oracle/Excel analytics · EDI validation · UAT support |
| Credential | Institution |
|---|---|
| 🎓 M.S. Medical Informatics (Aug 2025) | Saint Louis University, USA |
| 💊 Doctor of Pharmacy (Pharm.D.) | JNTUH, India |
| 🏅 Certified Health Data Analyst (CHDA) | AHIMA |
| 🔒 HIPAA Privacy & Security Certification | — |
I'm open to Senior Healthcare Data Analyst, Healthcare Analytics Engineer, and Clinical Informatics roles where deep domain knowledge and technical execution both matter.
📧 Saikumarchary1709@gmail.com | 💼 LinkedIn
"Healthcare data is complex by nature. I build systems that make it simple, reliable, and decision-ready."