09 Model Evaluation & Dagshub

Model Evaluation with MLflow & Dagshub: The Real-World Teenage ML Engineer's Playbook
Dagshub: The “GitHub for Data Science” That Actually Gets It
Alright, first up: let’s talk about Dagshub. If you’re tired of losing track of your models, datasets, or just want to flex your ML workflow to your friends (or future employers), Dagshub is your new best friend.
What’s cool about Dagshub?
- It’s like GitHub, but made for data science. You can track your code, data, models, and even all your experiment results.
- It works with MLflow, so you get a slick dashboard to compare how different models perform.
- Collaboration is a breeze. You can literally share your whole ML pipeline with your squad or mentor.
Honestly, once you use it, you’ll wonder how you ever managed with just folders named “final_model_v3_really_final”.
Setting Up the Model Evaluation: No More Guesswork
You know how annoying it is when you can’t remember which model you trained with which data? That’s why we use configuration files. Here’s the deal: all the important paths and filenames go into config.yaml.
```yaml
model_evaluation:
  root_dir: artifacts/model_evaluation
  test_data_path: artifacts/data_transformation/test.csv
  model_path: artifacts/model_trainer/model.joblib
  metric_file_name: artifacts/model_evaluation/metrics.json
```
In Python, we keep things tight with a dataclass. It’s like a cheat code for keeping your configs organized:
```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class ModelEvaluationConfig:
    root_dir: Path
    test_data_path: Path
    model_path: Path
    all_params: dict
    metric_file_name: Path
    target_column: str
    mlflow_uri: str
```
And to make sure we always have the right config, we use a function that grabs everything from the YAML and puts it into this dataclass. No more “where did I save that model?” moments.
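If you haven’t written that function yet, here’s a minimal sketch of how a ConfigurationManager might do it. The constants (CONFIG_FILE_PATH and friends), the read_yaml/create_directories helpers, and the ElasticNet/TARGET_COLUMN keys are assumptions about your project layout, so rename them to match whatever your repo actually has:

```python
from pathlib import Path

from src.Predict_Pipe.constants import CONFIG_FILE_PATH, PARAMS_FILE_PATH, SCHEMA_FILE_PATH  # hypothetical constants module
from src.Predict_Pipe.entity.config_entity import ModelEvaluationConfig  # the dataclass from above
from src.Predict_Pipe.utils.common import read_yaml, create_directories  # assumed helpers


class ConfigurationManager:
    def __init__(self,
                 config_filepath=CONFIG_FILE_PATH,
                 params_filepath=PARAMS_FILE_PATH,
                 schema_filepath=SCHEMA_FILE_PATH):
        # Read the YAML files once so every stage sees the same settings
        self.config = read_yaml(config_filepath)
        self.params = read_yaml(params_filepath)
        self.schema = read_yaml(schema_filepath)
        create_directories([self.config.artifacts_root])

    def get_model_evaluation_config(self) -> ModelEvaluationConfig:
        config = self.config.model_evaluation
        create_directories([config.root_dir])

        # Everything the evaluation stage needs, pulled from YAML into one typed object
        return ModelEvaluationConfig(
            root_dir=Path(config.root_dir),
            test_data_path=Path(config.test_data_path),
            model_path=Path(config.model_path),
            all_params=dict(self.params.ElasticNet),          # assumed params.yaml key
            metric_file_name=Path(config.metric_file_name),
            target_column=self.schema.TARGET_COLUMN.name,     # assumed schema.yaml key
            mlflow_uri="https://dagshub.com/mainiyash2/Predict-pipe.mlflow",  # Dagshub tracking URI = repo URL + .mlflow
        )
```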
The ModelEvaluation Class: Where the Magic Happens
This class is the MVP. It loads your test data and trained model, runs the predictions, calculates the metrics, and logs everything to MLflow (which Dagshub tracks for you).
```python
import os
from pathlib import Path
from urllib.parse import urlparse

import joblib
import mlflow
import mlflow.sklearn
import numpy as np
import pandas as pd
from dotenv import load_dotenv
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

from src.Predict_Pipe.entity.config_entity import ModelEvaluationConfig  # the dataclass from above; adjust to your project layout
from src.Predict_Pipe.utils.common import save_json  # same for your save_json helper

load_dotenv()  # assumes python-dotenv; reads the MLFLOW_* values from your .env


class ModelEvaluation:
    def __init__(self, config: ModelEvaluationConfig):
        self.config = config

    def eval_metrics(self, actual, pred):
        rmse = np.sqrt(mean_squared_error(actual, pred))
        mae = mean_absolute_error(actual, pred)
        r2 = r2_score(actual, pred)
        return rmse, mae, r2

    def log_into_mlflow(self):
        test_data = pd.read_csv(self.config.test_data_path)
        model = joblib.load(self.config.model_path)
        test_x = test_data.drop([self.config.target_column], axis=1)
        test_y = test_data[[self.config.target_column]]

        # Load MLflow credentials from .env
        MLFLOW_TRACKING_URI = os.getenv("MLFLOW_TRACKING_URI")
        MLFLOW_TRACKING_USERNAME = os.getenv("MLFLOW_TRACKING_USERNAME")
        MLFLOW_TRACKING_PASSWORD = os.getenv("MLFLOW_TRACKING_PASSWORD")
        if not all([MLFLOW_TRACKING_URI, MLFLOW_TRACKING_USERNAME, MLFLOW_TRACKING_PASSWORD]):
            raise ValueError("Missing MLflow environment variables. Check your .env file.")

        mlflow.set_tracking_uri(MLFLOW_TRACKING_URI)
        os.environ["MLFLOW_TRACKING_USERNAME"] = MLFLOW_TRACKING_USERNAME
        os.environ["MLFLOW_TRACKING_PASSWORD"] = MLFLOW_TRACKING_PASSWORD
        tracking_url_type_store = urlparse(mlflow.get_tracking_uri()).scheme

        with mlflow.start_run():
            predicted_qualities = model.predict(test_x)
            rmse, mae, r2 = self.eval_metrics(test_y, predicted_qualities)

            # Keep a local copy of the scores, then log everything to MLflow
            scores = {"rmse": rmse, "mae": mae, "r2": r2}
            save_json(path=Path(self.config.metric_file_name), data=scores)
            mlflow.log_params(self.config.all_params)
            mlflow.log_metric("rmse", rmse)
            mlflow.log_metric("r2", r2)
            mlflow.log_metric("mae", mae)

            # The model registry needs a remote tracking server (like Dagshub), not a local file store
            if tracking_url_type_store != "file":
                mlflow.sklearn.log_model(model, "model", registered_model_name="ElasticnetModel")
            else:
                mlflow.sklearn.log_model(model, "model")

        print("Model evaluation and logging to MLflow completed successfully.")
```
What’s Actually Happening?
- Loads your test data and model: no more manual file picking.
- Evaluates predictions: you get RMSE, MAE, and R², so you know if your model is actually any good.
- Saves metrics to a JSON file (via the save_json helper, see the sketch after this list): handy for quick checks or sharing results.
- Logs everything to MLflow: this is where Dagshub comes in. Now you can compare every experiment, see what worked, and flex your best models.
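About that save_json call: it’s just a small helper from the project’s utils module. If yours doesn’t have one yet, a minimal sketch (assuming all you need is pretty-printed JSON) looks like this:

```python
import json
from pathlib import Path


def save_json(path: Path, data: dict):
    """Write a dict to disk as readable, indented JSON."""
    with open(path, "w") as f:
        json.dump(data, f, indent=4)
    print(f"json file saved at: {path}")
```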
How to Run the Pipeline (No More “It Worked On My Laptop”)
Here’s how you set up the pipeline so anyone (even your future self) can run it:
```python
from src.Predict_Pipe.config.configuration import ConfigurationManager
from src.Predict_Pipe.components.model_evaluation import ModelEvaluation


class ModelEvaluationTrainingPipeline:
    def __init__(self):
        pass

    def initiate_model_evaluation(self):
        config = ConfigurationManager()
        model_evaluation_config = config.get_model_evaluation_config()
        model_evaluation = ModelEvaluation(config=model_evaluation_config)
        model_evaluation.log_into_mlflow()
```
And you kick it off in main.py like this:
```python
STAGE_NAME = "Model Evaluation stage"

try:
    logger.info(f">>>>>> stage {STAGE_NAME} started <<<<<<")
    obj = ModelEvaluationTrainingPipeline()
    obj.initiate_model_evaluation()
    logger.info(f">>>>>> stage {STAGE_NAME} completed <<<<<<\n\nx==========x")
except Exception as e:
    logger.exception(e)
    raise e
```
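Assuming the earlier stages (ingestion, validation, transformation, training) in main.py follow this same try/except pattern, running python main.py from the project root walks through all of them end to end and finishes with this evaluation stage, and the logs tell you exactly which stage blew up if something goes wrong.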
Results: What Do You Actually Get?
After running this, you’ll see:
- A metrics.json file with your model’s RMSE, MAE, and R² scores. (Perfect for screenshots or sharing with your mentor!)
- Logs in your terminal and log files, so you know what happened and where.
- All your experiments, metrics, and models tracked on Dagshub’s MLflow dashboard. You can literally compare every run, see which parameters worked best, and download the best model whenever you want. Here’s the link to my repo: https://dagshub.com/mainiyash2/Predict-pipe
Here’s what my metrics.json looks like:

```json
{
    "rmse": 0.6379414257638729,
    "mae": 0.492435366850595,
    "r2": 0.32618194013718615
}
```
On Dagshub:
You get a dashboard with all your runs. It’s like a high score table for your models. You can see which model did best, what parameters you used, and even download the model straight from the browser. These are my results:
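And if you’d rather pull that high score table into a notebook instead of the browser, MLflow’s search_runs can fetch it for you. A minimal sketch, assuming your .env points at the Dagshub tracking server and your runs landed in the default experiment:

```python
import os

import mlflow
from dotenv import load_dotenv

load_dotenv()  # MLFLOW_TRACKING_URI / USERNAME / PASSWORD, same as before
mlflow.set_tracking_uri(os.getenv("MLFLOW_TRACKING_URI"))

# Grab every run in the default experiment, best (lowest) RMSE first
runs = mlflow.search_runs(
    experiment_names=["Default"],
    order_by=["metrics.rmse ASC"],
)

# Show only the columns that matter for a quick leaderboard
print(runs[["run_id", "metrics.rmse", "metrics.mae", "metrics.r2"]].head())
```

Each row is one run, so sorting by metrics.rmse gives you the leaderboard straight away.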
In the next section, we’ll give it an outer look and write a simple Flask app for it.