Way too many ways to calculate \(R^2\).
Arguments
- pred: The predicted values from some model; typically the result of a call to predict().
- obs: The true observed values.
- type: Which of the 8 versions of \(R^2\) to use. See Details.
- na.rm: A logical value indicating whether NA values should be stripped before the computation proceeds.
Details
The types of \(R^2\):
\(R^2_1 = 1 - \sum (y-\hat{y})^2 / \sum (y-\bar{y})^2\)
\(R^2_2 = \sum (\hat{y}-\bar{y})^2 / \sum (y-\bar{y})^2\)
\(R^2_3 = \sum (\hat{y}-\bar{\hat{y}})^2 / \sum (y-\bar{y})^2\)
\(R^2_4 = 1 - \sum (e-\bar{e})^2 / \sum (y-\bar{y})^2\), where \(e = y - \hat{y}\) denotes the residuals
\(R^2_5 = \) squared multiple correlation coefficient between the regressand and the regressors
\(R^2_6 = r_{y,\hat{y}}^2\)
\(R^2_7 = 1 - \sum (y-\hat{y})^2 / \sum y^2\)
\(R^2_8 = \sum \hat{y}^2 / \sum y^2\)
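A minimal sketch of how a function dispatching over these eight definitions might look (the body below is an assumption for illustration, not the package's actual implementation; type 5 is returned as NA because it requires the full design matrix, not just pred and obs):

```r
# Sketch of an R2() computing the eight definitions above.
# Types 1-4 and 6-8 follow the formulas in Details verbatim;
# type 5 (squared multiple correlation with the regressors) cannot
# be computed from pred and obs alone, so NA is returned.
R2 <- function(pred, obs, type = 1, na.rm = FALSE) {
  if (na.rm) {
    keep <- !(is.na(pred) | is.na(obs))
    pred <- pred[keep]
    obs <- obs[keep]
  }
  e <- obs - pred  # residuals
  switch(type,
    1 - sum(e^2) / sum((obs - mean(obs))^2),               # type 1
    sum((pred - mean(obs))^2) / sum((obs - mean(obs))^2),  # type 2
    sum((pred - mean(pred))^2) / sum((obs - mean(obs))^2), # type 3
    1 - sum((e - mean(e))^2) / sum((obs - mean(obs))^2),   # type 4
    NA_real_,                                              # type 5
    cor(obs, pred)^2,                                      # type 6
    1 - sum(e^2) / sum(obs^2),                             # type 7
    sum(pred^2) / sum(obs^2)                               # type 8
  )
}
```

For an ordinary least-squares fit with an intercept, types 1 through 4 and type 6 coincide; the definitions only start to disagree for no-intercept or nonlinear models, which is the point of Kvålseth's cautionary note.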
References
Kvålseth, T. O. (1985). Cautionary note about \(R^2\). The American Statistician, 39(4), 279-285.
Examples
X <- c(1, 2, 3, 4, 5, 6)
Y <- c(15, 37, 52, 59, 83, 92)
m1 <- lm(Y ~ X)
m2 <- lm(Y ~ 0 + X)
m3 <- nls(Y ~ a * X^b, start = c(a = 1, b = 1))
# Table 2 from Kvålseth (1985)
data.frame(
  mod1 = sapply(1:8, R2, pred = predict(m1), obs = Y),
  mod2 = sapply(1:8, R2, pred = predict(m2), obs = Y),
  mod3 = sapply(1:8, R2, pred = 16.3757 * X^0.99, obs = Y)
)
#> mod1 mod2 mod3
#> 1 0.9808189 0.9776853 0.9777219
#> 2 0.9808189 1.0836003 1.0982588
#> 3 0.9808189 1.0829977 1.0982028
#> 4 0.9808189 0.9782880 0.9777779
#> 5 NA NA NA
#> 6 0.9808189 0.9808189 0.9810793
#> 7 0.9966075 0.9960532 0.9960597
#> 8 0.9966075 0.9960532 1.0230915