Tuesday, October 20, 2009

Pseudo R^2

So you built your fantastic logistic regression models, presented your ORs and your coefficients, analyzed the residuals, etc., and the review panel comes back asking "Where are your R-squared values?" Now you could argue that R^2 isn't a natural statistic for logistic regression, but arguing with review boards is difficult and frustrating, so the best thing may be to just give them what they ask for.

There are a number of "Pseudo R-Squared" measures that use the log-likelihood to calculate a fit statistic. One of them, Nagelkerke's, is provided in the Design library by Frank Harrell, in a function called lrm.

You can use this function in pretty much the same manner as glm, but if you already have a number of glm models built and don't want to change them, you can run lrm on the fitted models to get the statistics:

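library(Design)
set.seed(11)
# simulate a binary outcome, one noise predictor and one informative predictor
out = rbinom(1000,1,0.5)
pred1 = rnorm(1000,0,1)
pred2 = rnorm(1000,out,1)
dat = data.frame(o=out,p1=pred1,p2=pred2)
# the existing glm fit
model = glm(o~p1+p2,family=binomial(link=logit),data=dat)
# refit through lrm using the glm object
m2 = lrm(model)
# fit statistics; Nagelkerke's R^2 is the "R2" entry
m2$stats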

The best plan is probably to use the lrm function from the start, but if you're a die-hard glm fan then this is a quick way to satisfy those R-Squared'ers.
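
If you'd rather not load another package at all, Nagelkerke's statistic can also be worked out by hand from the log-likelihoods of the fitted model and an intercept-only model. What follows is only a rough sketch in base R, not what lrm does internally; the variable names (null, cox_snell, max_r2, nagelkerke) are my own, and it assumes a 0/1 outcome as in the simulated data above.

```r
# Same kind of simulated data as above
set.seed(11)
out   <- rbinom(1000, 1, 0.5)
pred1 <- rnorm(1000, 0, 1)
pred2 <- rnorm(1000, out, 1)
dat   <- data.frame(o = out, p1 = pred1, p2 = pred2)

model <- glm(o ~ p1 + p2, family = binomial(link = "logit"), data = dat)
null  <- update(model, . ~ 1)        # intercept-only model

n   <- nobs(model)                   # number of observations used in the fit
ll1 <- as.numeric(logLik(model))     # log-likelihood of the fitted model
ll0 <- as.numeric(logLik(null))      # log-likelihood of the null model

# Cox & Snell R^2, then rescaled by its maximum attainable value
cox_snell  <- 1 - exp(2 * (ll0 - ll1) / n)
max_r2     <- 1 - exp(2 * ll0 / n)
nagelkerke <- cox_snell / max_r2
nagelkerke
```

The rescaling step is the whole point of Nagelkerke's version: Cox & Snell's statistic cannot reach 1 for a binary outcome, so dividing by its maximum puts the result back on a 0 to 1 scale.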

Monday, September 14, 2009

Converting Vectors to Strings

Just a quick one, as I just created the blog, and I have a meeting in ten minutes.

I find it surprisingly difficult to convert a vector to a comma-separated string. Since I work with Sweave and LaTeX a lot, this comes up whenever I want to output the names of a dataset into a paragraph. I thought that paste or cat would do it, but paste with sep alone returns one string per element rather than a single string, and cat doesn't output to the LaTeX file.

The solution is the function toString, which for a plain vector amounts to calling paste with its collapse argument:

> toString(1:10,sep=', ')
[1] "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"
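
For the Sweave use case mentioned above, here is a small sketch of what I mean. The data frame dat and its column names are made up for illustration; the paste call with collapse is shown only to make the equivalence visible.

```r
# A toy data frame; the column names are purely illustrative
dat <- data.frame(o = c(0, 1), p1 = c(0.3, -1.2), p2 = c(1.1, 0.4))

# Column names as a single comma-separated string
nm <- toString(names(dat))
nm

# The same string built with paste and its collapse argument
nm2 <- paste(names(dat), collapse = ", ")
identical(nm, nm2)

# In a Sweave document the string can be dropped into a paragraph with
#   \Sexpr{toString(names(dat))}
```

Here nm is the single string "o, p1, p2", ready to be placed inline in the text.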


Happy coding!