A Machine Learning meme.

What happens when a teacher trains students on a few math problems and later tests them on those same problems?

Many students will learn them perfectly and perform brilliantly on the test.

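But in machine learning, this approach fails: a model that has merely memorized its training examples tells us nothing about how it will handle data it has never seen.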

We need models that generalize well to new cases once deployed, not models that merely show unusually high accuracy or low RMSE during the training phase.

For this purpose, setting aside a portion of the initial data as test data becomes an essential practice.

Evaluating the model on this test set gives one an estimate of how well the model generalizes to new data.
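To make the idea concrete, here is a minimal sketch in plain Python. The toy data, the `train_test_split` helper, and the two "models" (a lookup table that memorizes the training rows and a simple least-squares line) are all illustrative assumptions, not part of any particular library.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and hold out a fraction of them as test data."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[:-n_test], shuffled[-n_test:]

def rmse(model, rows):
    """Root mean squared error of model(x) against the true y values."""
    return (sum((model(x) - y) ** 2 for x, y in rows) / len(rows)) ** 0.5

def fit_line(rows):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in rows)
             / sum((x - mean_x) ** 2 for x, _ in rows))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

# Hypothetical toy data: y is roughly 2x plus some noise.
noise = random.Random(42)
data = [(x, 2.0 * x + noise.gauss(0, 1)) for x in range(100)]
train, test = train_test_split(data)

# "Student who memorized the answers": exact lookup on seen inputs,
# falls back to the average training answer on anything unseen.
lookup = dict(train)
fallback = sum(y for _, y in train) / len(train)
memorizer = lambda x: lookup.get(x, fallback)

line = fit_line(train)

print("memorizer  train RMSE:", rmse(memorizer, train))
print("memorizer  test  RMSE:", rmse(memorizer, test))
print("line fit   train RMSE:", rmse(line, train))
print("line fit   test  RMSE:", rmse(line, test))
```

The memorizer scores perfectly on the training rows, yet only the held-out test rows reveal that it has learned nothing reusable, while the line fit performs about equally well on both sets.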

But there are a few caveats one needs to sort out:

1. There should be no data leakage: the model must not see any of the test data during training. (That would be cheating, right?)

2. The test data should have a distribution similar to that of the training data.
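Both caveats can be checked roughly before trusting a test score. The sketch below is only an illustration under simple assumptions: `leaked_rows` and `distribution_gap` are hypothetical helper names, the data is made up, exact duplicate rows are the only form of leakage it detects, and comparing means is a crude stand-in for a proper distribution comparison.

```python
import statistics

def leaked_rows(train, test):
    """Return the test rows that also appear verbatim in the training set."""
    seen = set(train)
    return [row for row in test if row in seen]

def distribution_gap(train_values, test_values):
    """Gap between the two means, in units of the training standard deviation."""
    spread = statistics.stdev(train_values)
    gap = abs(statistics.mean(train_values) - statistics.mean(test_values))
    return gap / spread

# Hypothetical (x, y) rows for illustration only.
train = [(1, 2.0), (2, 4.1), (3, 6.2), (4, 7.9), (5, 10.0), (6, 12.1)]
clean_test = [(2.5, 5.0), (4.5, 9.1)]       # unseen rows, similar range
leaky_test = [(3, 6.2), (7, 14.0)]          # first row was also used for training
shifted_test = [(50, 100.0), (60, 120.0)]   # inputs far outside the training range

train_x = [x for x, _ in train]

print("leaks in clean test:  ", leaked_rows(train, clean_test))
print("leaks in leaky test:  ", leaked_rows(train, leaky_test))
print("gap for clean test:   ", distribution_gap(train_x, [x for x, _ in clean_test]))
print("gap for shifted test: ", distribution_gap(train_x, [x for x, _ in shifted_test]))
```

A non-empty leak list or a large gap suggests the test score may not reflect how the model will behave on genuinely new data.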
