Way too many ways to calculate \(R^2\).
Arguments
- pred: The predicted values from some model; typically the result of a call to predict().
- obs: The true observed values.
- type: Which of the 8 versions of \(R^2\) to use. See Details.
- na.rm: A logical value indicating whether NA values should be stripped before the computation proceeds.
Details
The types of \(R^2\):
\(R^2_1 = 1 - \sum (y-\hat{y})^2 / \sum (y-\bar{y})^2\)
\(R^2_2 = \sum (\hat{y}-\bar{y})^2 / \sum (y-\bar{y})^2\)
\(R^2_3 = \sum (\hat{y}-\bar{\hat{y}})^2 / \sum (y-\bar{y})^2\)
\(R^2_4 = 1 - \sum (e-\bar{e})^2 / \sum (y-\bar{y})^2\), where \(e = y - \hat{y}\) denotes the residuals
\(R^2_5 = \) squared multiple correlation coefficient between the regressand and the regressors
\(R^2_6 = r_{y,\hat{y}}^2\)
\(R^2_7 = 1 - \sum (y-\hat{y})^2 / \sum y^2\)
\(R^2_8 = \sum \hat{y}^2 / \sum y^2\)
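A minimal sketch of how a function dispatching over these eight definitions might look (the body below is an assumption for illustration, not the package's actual implementation; type 5 is returned as NA because it requires the full design matrix, not just pred and obs):

```r
# Sketch of an R2() computing the eight definitions above.
# Types 1-4 and 6-8 follow the formulas in Details verbatim;
# type 5 (squared multiple correlation with the regressors) cannot
# be computed from pred and obs alone, so NA is returned.
R2 <- function(pred, obs, type = 1, na.rm = FALSE) {
  if (na.rm) {
    keep <- !(is.na(pred) | is.na(obs))
    pred <- pred[keep]
    obs <- obs[keep]
  }
  e <- obs - pred  # residuals
  switch(type,
    1 - sum(e^2) / sum((obs - mean(obs))^2),               # type 1
    sum((pred - mean(obs))^2) / sum((obs - mean(obs))^2),  # type 2
    sum((pred - mean(pred))^2) / sum((obs - mean(obs))^2), # type 3
    1 - sum((e - mean(e))^2) / sum((obs - mean(obs))^2),   # type 4
    NA_real_,                                              # type 5
    cor(obs, pred)^2,                                      # type 6
    1 - sum(e^2) / sum(obs^2),                             # type 7
    sum(pred^2) / sum(obs^2)                               # type 8
  )
}
```

For an ordinary least-squares fit with an intercept, types 1 through 4 and type 6 coincide; the definitions only start to disagree for no-intercept or nonlinear models, which is the point of Kvålseth's cautionary note.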
References
Kvålseth, T. O. (1985). Cautionary note about \(R^2\). The American Statistician, 39(4), 279-285.
Examples
X <- c(1, 2, 3, 4, 5, 6)
Y <- c(15, 37, 52, 59, 83, 92)
m1 <- lm(Y ~ X)
m2 <- lm(Y ~ 0 + X)
m3 <- nls(Y ~ a * X^b, start = c(a = 1, b = 1))
# Table 2 from Kvålseth (1985)
data.frame(
  mod1 = sapply(1:8, R2, pred = predict(m1), obs = Y),
  mod2 = sapply(1:8, R2, pred = predict(m2), obs = Y),
  mod3 = sapply(1:8, R2, pred = 16.3757 * X^0.99, obs = Y)
)
#> mod1 mod2 mod3
#> 1 0.9808189 0.9776853 0.9777219
#> 2 0.9808189 1.0836003 1.0982588
#> 3 0.9808189 1.0829977 1.0982028
#> 4 0.9808189 0.9782880 0.9777779
#> 5 NA NA NA
#> 6 0.9808189 0.9808189 0.9810793
#> 7 0.9966075 0.9960532 0.9960597
#> 8 0.9966075 0.9960532 1.0230915