The Paradox of Overfitting

Author: Volker Nannen
Supervisors: Peter Grünwald, Rineke Verbrugge
Rijksuniversiteit Groningen, 2003.


A serious problem with most common learning algorithms is overfitting. Overfitting occurs when the models describe the training examples better and better but get worse and worse on other instances of the same phenomenon. This can make the whole learning process worthless. A good way to observe overfitting is to split a set of examples into a training set and a test set and to train models of increasing degree on the training set. Clearly, the higher the degree of the model, the more information the model will contain about the training set. But when we look at the generalization error of the models on the test set, we will usually see that, after an initial phase of improvement, the generalization error suddenly becomes catastrophically bad. For the uninitiated student this takes some effort to accept, since it apparently contradicts the basic empirical truth that more information will not lead to worse predictions. We may well call this the paradox of overfitting, hence the title of this thesis.
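
The train/test experiment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the setup used in the thesis: the models are assumed to be least-squares polynomials, the phenomenon is assumed to be a noisy sine curve, and the names `sample`, `mse`, `error_curves`, the sample sizes, the noise level, and the seed are all hypothetical choices made for the example.

```python
import numpy as np
from numpy.polynomial import Polynomial


def sample(rng, n, noise=0.3):
    """Draw n noisy examples of the assumed phenomenon y = sin(2*pi*x)."""
    x = rng.uniform(0.0, 1.0, n)
    y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, noise, n)
    return x, y


def mse(model, x, y):
    """Mean squared error of a fitted model on the examples (x, y)."""
    return float(np.mean((model(x) - y) ** 2))


def error_curves(max_degree=9, n_train=10, n_test=200, seed=0):
    """Fit polynomials of degree 0..max_degree on a training set and
    return their errors on the training set and on a separate test set."""
    rng = np.random.default_rng(seed)
    x_train, y_train = sample(rng, n_train)
    x_test, y_test = sample(rng, n_test)

    train_err, test_err = [], []
    for degree in range(max_degree + 1):
        # Least-squares polynomial fit, using the training set only.
        model = Polynomial.fit(x_train, y_train, degree)
        train_err.append(mse(model, x_train, y_train))
        test_err.append(mse(model, x_test, y_test))
    return train_err, test_err


train_err, test_err = error_curves()
for degree, (tr, te) in enumerate(zip(train_err, test_err)):
    print(f"degree {degree}: training error {tr:.4g}, test error {te:.4g}")
```

Because polynomials of increasing degree form nested model classes, the training error cannot increase with the degree. The test error carries no such guarantee, and comparing the two printed columns is one way to look for the behaviour described above.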
