Pratheek Nistala

23 • Data Scientist • Machine Learning

GitHub • Hyderabad, India

I do data science end-to-end: exploratory analysis, feature engineering, model development, validation, and deployment.

My work centers on credit risk modeling, predictive analytics, and machine learning pipelines. I work with structured and unstructured data, build statistical models, train classifiers and regressors, tune hyperparameters, and evaluate performance.
I am currently expanding into LLM applications and RAG systems, working with embeddings, vector databases, prompt engineering, and retrieval augmentation.

I write documentation and technical posts on methods, implementations, and results. More about my writings here.

Professional Experience
Dun and Bradstreet

Data Science Apprentice (Risk Analysis and Modeling)

June 2025 – Present

• Collaborated with the data science team on the Delinquency Score Model by leading feature engineering efforts and enabling an XGBoost-driven probability-to-score mapping (101 to 670) with Class 1 to 5 risk segmentation, improving model discriminatory power (AUC +4, KS +7).
• Developed scorecard-style models to convert predictive outputs into business-interpretable risk scores.
• Conducted SQL-driven EDA and data quality validation on large datasets (50M+ records) to support modeling workflows.
• Performed performance validation of external vendor data against internal risk models.
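
For readers unfamiliar with scorecard-style outputs, here is a minimal sketch of what a probability-to-score mapping with class segmentation and a KS check can look like. It is an illustration only, not the production model: the linear mapping, the "higher score means lower risk" direction, the equal-width class cut-offs, and every function name are assumptions made for this example.

```python
from bisect import bisect_right

SCORE_MIN, SCORE_MAX = 101, 670  # score range quoted in the bullet above

# Hypothetical equal-width lower bounds for classes 4, 3, 2 and 1.
CLASS_CUTOFFS = [215, 329, 443, 557]


def probability_to_score(p_bad: float) -> int:
    """Map a predicted delinquency probability to the score range.

    Assumption: the mapping is linear and a higher score means lower risk.
    """
    if not 0.0 <= p_bad <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return round(SCORE_MIN + (1.0 - p_bad) * (SCORE_MAX - SCORE_MIN))


def risk_class(score: int) -> int:
    """Return a risk class from 1 (lowest risk) to 5 (highest risk)."""
    return 5 - bisect_right(CLASS_CUTOFFS, score)


def ks_statistic(scores, labels):
    """Largest gap between the cumulative score distributions of
    delinquent (label 1) and non-delinquent (label 0) accounts."""
    n_bad = sum(labels)
    n_good = len(labels) - n_bad
    if n_bad == 0 or n_good == 0:
        raise ValueError("need at least one account of each label")
    pairs = sorted(zip(scores, labels))
    cum_bad = cum_good = 0
    ks = 0.0
    i = 0
    while i < len(pairs):
        j = i
        # Consume all accounts tied at the same score before comparing.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                cum_bad += 1
            else:
                cum_good += 1
            j += 1
        ks = max(ks, abs(cum_bad / n_bad - cum_good / n_good))
        i = j
    return ks
```

In this sketch, a predicted probability of 0.0 maps to 670 (Class 1) and a probability of 1.0 maps to 101 (Class 5). A real scorecard would typically calibrate the mapping and choose class boundaries from observed delinquency rates rather than equal widths.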

Tech Stack

Python
Pandas
Scikit-Learn
XGBoost
Apache Spark
SQL
Jupyter
Docker
Kubernetes
Git
Linux
FastAPI
Google Cloud
Shell Scripting
Streamlit

If you're still reading, you're probably curious about my work.

Let's Chat

I am available on these platforms

"Talk is cheap. Show me the code." — Linus Torvalds
© 2026 Built with ❤️ by Pratheek Nistala.