2.2 Assessing Model Accuracy

One of the key aims of this book is to introduce the reader to a wide range of statistical learning methods that extend far beyond the standard linear regression approach.

Why is it necessary to introduce so many different statistical learning approaches, rather than just a single best method?

There is no free lunch in statistics: no one method dominates all others over all possible data sets.

On a particular data set, one specific method may work best, but some other method may work better on a similar but different data set.

Hence it is an important task to decide for any given set of data which method produces the best results.

Selecting the best approach can be one of the most challenging parts of performing statistical learning in practice.

In this section, we discuss some of the most important concepts that arise in selecting a statistical learning procedure for a specific data set.

As the book progresses, we will explain how the concepts presented here can be applied in practice.

2.2.1 Measuring the Quality of Fit

Explains the mean squared error (MSE), the most commonly used measure of a model's accuracy in the regression setting. Emphasizes that what matters is generalization: predicting well on previously unseen test data, not merely fitting the training data well.
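
As a minimal sketch of the distinction between training and test MSE, the snippet below computes both for a toy model. The data values and the fitted function `f_hat` are hypothetical, chosen only so the arithmetic is easy to follow; they do not come from the book.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between observed and predicted responses."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)


def f_hat(x):
    """Hypothetical fitted model (an assumption for illustration): predicts 2 * x."""
    return 2 * x


# Hypothetical toy data: the model reproduces the training responses exactly.
x_train, y_train = [1, 2, 3], [2, 4, 6]
# Previously unseen observations, which the model does not fit exactly.
x_test, y_test = [4, 5], [9, 9]

train_mse = mean_squared_error(y_train, [f_hat(x) for x in x_train])
test_mse = mean_squared_error(y_test, [f_hat(x) for x in x_test])

print(train_mse, test_mse)  # prints: 0.0 1.0
```

Here the training MSE is zero while the test MSE is not, which illustrates why a low training MSE alone says little about how a method will perform on new data.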

2.2.2 The Bias-Variance Trade-Off

Covers the trade-off between bias and variance, the two competing components of the expected test error. Examines the U-shaped test MSE curve: as model flexibility increases, variance tends to grow while bias tends to fall, so test error typically first decreases and then increases.
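
For reference, the decomposition usually written for this trade-off is sketched below. The symbols are assumptions, since this overview does not define them: $x_0$ is a test point, $y_0$ its response, $\hat{f}$ the fitted model, and $\epsilon$ the irreducible noise term.

```latex
E\left(y_0 - \hat{f}(x_0)\right)^2
  = \mathrm{Var}\left(\hat{f}(x_0)\right)
  + \left[\mathrm{Bias}\left(\hat{f}(x_0)\right)\right]^2
  + \mathrm{Var}(\epsilon)
```

The first two terms depend on the chosen method and move in opposite directions as flexibility changes; the last term is a floor that no method can go below.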

2.2.3 The Classification Setting

Introduces the error rate, the proportion of observations that are misclassified, which is used to assess accuracy when the response is a discrete class label. Describes the Bayes error rate, the lowest achievable test error rate, which is attained by the Bayes classifier that assigns each observation to its most probable class given its predictor values.
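
The two quantities can be sketched on a toy problem. Everything below is an illustrative assumption: the predictor takes only two values, and the conditional class probabilities are treated as known, which is never the case with real data (it is what makes the Bayes classifier an unattainable benchmark in practice).

```python
def error_rate(y_true, y_pred):
    """Fraction of observations whose predicted class differs from the true class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(y != p for y, p in zip(y_true, y_pred)) / len(y_true)


def bayes_classifier(class_probs):
    """Return the class with the largest conditional probability."""
    return max(class_probs, key=class_probs.get)


def bayes_error_rate(x_probs, cond_probs):
    """Compute 1 - E[max_j Pr(Y = j | X)], averaging over the values of X."""
    return 1 - sum(x_probs[x] * max(cond_probs[x].values()) for x in x_probs)


# Hypothetical distribution of a predictor with two possible values.
x_probs = {"low": 0.5, "high": 0.5}
# Hypothetical (assumed known) conditional probabilities Pr(Y = class | X = x).
cond_probs = {
    "low": {"A": 0.875, "B": 0.125},
    "high": {"A": 0.25, "B": 0.75},
}

# Error rate of some classifier on four hypothetical labelled observations.
observed_error = error_rate(["A", "A", "B", "B"], ["A", "B", "B", "B"])
best_class_low = bayes_classifier(cond_probs["low"])
lowest_error = bayes_error_rate(x_probs, cond_probs)

print(observed_error, best_class_low, lowest_error)  # prints: 0.25 A 0.1875
```

Even the Bayes classifier errs here 18.75% of the time, because the classes overlap at each predictor value.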

