7.4.1 Piecewise Polynomials
Instead of fitting a high-degree polynomial over the entire range of X, piecewise polynomial regression involves fitting separate low-degree polynomials over different regions of X. For example, a piecewise cubic polynomial works by fitting a cubic regression model of the form
\[y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \beta_3 x_i^3 + \epsilon_i\]
where the coefficients \(\beta_0\), \(\beta_1\), \(\beta_2\), and \(\beta_3\) differ in different parts of the range of X. The points where the coefficients change are called knots.
For example, a piecewise cubic with no knots is just a standard cubic polynomial, as in (7.1) with d = 3. A piecewise cubic polynomial with a single knot at a point c takes the form
\[y_i = \begin{cases} \beta_{01} + \beta_{11} x_i + \beta_{21} x_i^2 + \beta_{31} x_i^3 + \epsilon_i & \text{if } x_i < c; \\ \beta_{02} + \beta_{12} x_i + \beta_{22} x_i^2 + \beta_{32} x_i^3 + \epsilon_i & \text{if } x_i \ge c. \end{cases}\]
In other words, we fit two different polynomial functions to the data, one on the subset of the observations with \(x_i < c\), and one on the subset of the observations with \(x_i \ge c\). The first polynomial function has coefficients \(\beta_{01}\), \(\beta_{11}\), \(\beta_{21}\), and \(\beta_{31}\), and the second has coefficients \(\beta_{02}\), \(\beta_{12}\), \(\beta_{22}\), and \(\beta_{32}\). Each of these polynomial functions can be fit using least squares applied to simple functions of the original predictor.

FIGURE 7.3. Various piecewise polynomials are fit to a subset of the Wage data, with a knot at age=50. Top Left: The cubic polynomials are unconstrained. Top Right: The cubic polynomials are constrained to be continuous at age=50. Bottom Left: The cubic polynomials are constrained to be continuous, and to have continuous first and second derivatives. Bottom Right: A linear spline is shown, which is constrained to be continuous.
Using more knots leads to a more flexible piecewise polynomial. In general, if we place K different knots throughout the range of X, then we will end up fitting K + 1 different cubic polynomials. Note that we do not need to use a cubic polynomial. For example, we can instead fit piecewise linear functions. In fact, our piecewise constant functions of Section 7.2 are piecewise polynomials of degree 0!
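The K + 1 separate fits described above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the code behind the figures: the data are synthetic, the knot locations are arbitrary, and the helper names `fit_piecewise_poly` and `predict_piecewise_poly` are hypothetical.

```python
import numpy as np

def fit_piecewise_poly(x, y, knots, degree=3):
    """Fit an unconstrained polynomial of the given degree in each region.

    Returns the sorted knots and a list of K + 1 coefficient arrays
    (highest power first, the ordering used by np.polyfit).
    """
    knots = np.sort(np.asarray(knots, dtype=float))
    # side="right" assigns x == knot to the upper region, matching the
    # "x_i >= c" convention in the text.
    region = np.searchsorted(knots, x, side="right")
    coefs = [np.polyfit(x[region == j], y[region == j], degree)
             for j in range(len(knots) + 1)]
    return knots, coefs

def predict_piecewise_poly(x_new, knots, coefs):
    """Evaluate the fitted piecewise polynomial at new points."""
    x_new = np.atleast_1d(np.asarray(x_new, dtype=float))
    region = np.searchsorted(knots, x_new, side="right")
    out = np.empty_like(x_new)
    for j, c in enumerate(coefs):
        mask = region == j
        out[mask] = np.polyval(c, x_new[mask])
    return out

# Synthetic stand-in for an age/wage-style relationship (an assumption;
# this is not the Wage data).
rng = np.random.default_rng(0)
x = rng.uniform(18, 80, size=300)
y = 60 + 40 * np.sin(x / 15) + rng.normal(scale=5, size=x.size)

# K = 3 knots give K + 1 = 4 separate cubic fits.
knots, cubic = fit_piecewise_poly(x, y, knots=[35, 50, 65], degree=3)
print(len(cubic))

# Degree 0 reduces to a step function: each region's fit is its mean.
_, step = fit_piecewise_poly(x, y, knots=[35, 50, 65], degree=0)
```

Setting `degree=1` in the same call gives the piecewise linear case. None of these fits is constrained to agree at the knots.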
The top left panel of Figure 7.3 shows a piecewise cubic polynomial fit to a subset of the Wage data, with a single knot at age=50. We immediately see a problem: the function is discontinuous and looks ridiculous! Since each polynomial has four parameters, we are using a total of eight degrees of freedom in fitting this piecewise polynomial model.
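The parameter count and the discontinuity can be checked numerically. The sketch below is an illustration under stated assumptions, not the fit shown in Figure 7.3: it uses synthetic data on an arbitrary scale with a knot at c = 5, and `piecewise_cubic_design` is a hypothetical helper. It builds a single design matrix whose columns are the powers 1, x, x², x³ multiplied by the indicators of x < c and x ≥ c, so one least squares fit reproduces the two separate cubic fits.

```python
import numpy as np

def piecewise_cubic_design(x, c):
    """Eight-column design matrix for an unconstrained piecewise cubic
    with a single knot at c."""
    x = np.asarray(x, dtype=float)
    powers = np.vander(x, N=4, increasing=True)  # columns: 1, x, x^2, x^3
    left = (x < c)[:, None]                      # indicator of x_i < c
    # Left-region columns are zero where x_i >= c, and vice versa.
    return np.hstack([powers * left, powers * ~left])

# Synthetic data on an arbitrary scale (an assumption, not the Wage data).
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=200)
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)
c = 5.0

X = piecewise_cubic_design(x, c)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
beta_left, beta_right = beta[:4], beta[4:]

# Four coefficients per region: eight parameters in total.
n_params = X.shape[1]

# Evaluate both cubics at the knot; nothing forces the two values to agree.
at_knot = c ** np.arange(4)
jump = at_knot @ beta_right - at_knot @ beta_left
print(f"parameters: {n_params}, jump at knot: {jump:.3f}")
```

Because the two blocks of columns never overlap, the single fit separates into two independent least squares problems, which is why the fitted curve is free to jump at the knot.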