Wednesday, April 22, 2009

Visualization of correlation matrix 2



The plot.corr() function has been updated. It can now:
1. Add a color key and text labels more flexibly.
2. Reorder the variables using PCA or hierarchical clustering.
3. Draw the details more carefully.
4. Other small changes.
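
To illustrate the reordering idea, here is a minimal sketch that reorders a correlation matrix by hierarchical clustering. It is not the code inside plot.corr(); the choice of 1 - r as the dissimilarity and the default complete linkage are assumptions made for this example.

```r
# Sketch: reorder a correlation matrix by hierarchical clustering.
# Assumption: 1 - r is used as the dissimilarity; plot.corr() may differ.
data(mtcars)
r <- cor(mtcars)                  # 11 x 11 correlation matrix of the variables
hc <- hclust(as.dist(1 - r))      # complete linkage is the hclust() default
ord <- hc$order                   # leaf order of the dendrogram
r.ordered <- r[ord, ord]          # apply the same order to rows and columns
print(round(r.ordered[1:4, 1:4], 2))
```

Variables that are strongly positively correlated end up next to each other, so blocks of similar variables become visible in the plot.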

What's more, I found a new way to display correlation matrices: squares with different areas and colors.
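
A rough sketch of the squares idea, using only base graphics. The blue/red colors and the use of cor(mtcars) as input are assumptions for illustration, not necessarily what the shared code does.

```r
# Sketch only: squares whose area is proportional to |r|.
# Assumptions: blue = positive, red = negative; not the author's palette.
data(mtcars)
r <- cor(mtcars)
n <- nrow(r)
side <- as.vector(sqrt(abs(r)))   # side length, so that area = |r|
fill <- ifelse(as.vector(r) > 0, "#3182BD", "#DE2D26")
plot(c(0.5, n + 0.5), c(0.5, n + 0.5), type = "n", axes = FALSE,
     xlab = "", ylab = "", asp = 1)
# as.vector() walks the matrix column by column, so x is the column index
# and y counts the rows from the top.
symbols(rep(1:n, each = n), rep(n:1, n), squares = side,
        inches = FALSE, bg = fill, add = TRUE)
```

Taking the square root before drawing matters: if the side length itself were |r|, the area would scale with r^2 and weak correlations would look weaker than they are.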

See also here. Get the code from my Google Docs here.

Tuesday, March 24, 2009

Comparison of different circle graphs



See my Picasa album here and the code here. Thanks to Bob O'Hara for his advice :)

I found that people's tastes differ, so the input parameters col (fill color) and bg (background color) were added in the new version. What is more, you can now order your variables using PCA (order=TRUE) to get a clearer picture.
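
As a sketch of what PCA-based ordering can look like, the snippet below sorts the variables by their loading on the first principal component. This is one simple possibility assumed for illustration; the ordering actually used when order=TRUE may be computed differently.

```r
# Sketch: order variables by their loading on the first principal component.
# Assumption: this is only one possible PCA ordering, not necessarily
# the one behind order=TRUE.
data(mtcars)
r <- cor(mtcars)
pc1 <- eigen(r)$vectors[, 1]      # first eigenvector of the correlation matrix
ord <- order(pc1)                 # the sign of an eigenvector is arbitrary,
                                  # so the order may come out reversed
r.ordered <- r[ord, ord]
print(colnames(r.ordered))
```

Variables with similar loadings are placed side by side, which tends to gather positively correlated variables into visible blocks.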

Sunday, March 22, 2009

Play Sliding Puzzles on R



The code is shared on my Google Docs. See it here.

Friday, March 13, 2009

Visualization of correlation matrix

  • Color Image
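data(mtcars)
fit = lm(mpg ~ ., mtcars)
## correlation matrix of the estimated coefficients (11 x 11)
cor = summary(fit, correlation = TRUE)$correlation
## reverse the rows and transpose so that image() draws row 1 at the top
cor2 = t(cor[11:1, ])
colors = c("#A50F15", "#DE2D26", "#FB6A4A", "#FCAE91", "#FEE5D9",
"white", "#EFF3FF", "#BDD7E7", "#6BAED6", "#3182BD", "#08519C")
image(1:11, 1:11, cor2, axes = FALSE, ann = FALSE, col = colors)
## label each cell with the correlation in percent
text(rep(1:11, 11), rep(1:11, each = 11), round(100 * cor2))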
  • Ellipses
library(ellipse)
col = colors[as.vector(apply(cor, 2, rank))]
plotcorr(cor, col = col, mar = rep(0, 4))
  • Taiyun's circles (my method)

circle.cor = function(cor, axes = FALSE, xlab = "",
    ylab = "", asp = 1, title = "Taiyun's cor-matrix circles",
    ...) {
    n = nrow(cor)
    par(mar = c(0, 0, 2, 0), bg = "white")
    plot(c(0, n + 0.8), c(0, n + 0.8), axes = axes, xlab = xlab,
        ylab = ylab, asp = asp, type = "n")
    ## add grid
    segments(rep(0.5, n + 1), 0.5 + 0:n, rep(n + 0.5, n + 1),
        0.5 + 0:n, col = "gray")
    segments(0.5 + 0:n, rep(0.5, n + 1), 0.5 + 0:n, rep(n + 0.5,
        n + 1), col = "gray")
    ## define circles' background color:
    ## black for positive correlation coefficients and white for negative
    bg = cor
    bg[cor > 0] = "black"
    bg[cor <= 0] = "white"
    ## plot n*n circles using vectorized code, suggested by Yihui Xie
    symbols(rep(1:n, each = n), rep(n:1, n), add = TRUE, inches = FALSE,
        circles = as.vector(sqrt(abs(cor))/2), bg = as.vector(bg))
    ## row numbers on the left, column numbers on top
    text(rep(0, n), 1:n, n:1, col = "red")
    text(1:n, rep(n + 1, n), 1:n, col = "red")
    title(title)
}
## an example
data(mtcars)
fit = lm(mpg ~ ., mtcars)
cor = summary(fit, correlation = TRUE)$correlation
circle.cor(cor)

The circles with a black background denote positive correlation coefficients, and the area of each circle denotes the absolute value. See more in my Picasa album here.
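
A quick check of the area claim: circle.cor() draws each circle with radius sqrt(|r|)/2, so the area is pi * |r| / 4, which is proportional to |r|. The sample values of r below are made up for the check.

```r
# Check that a radius of sqrt(|r|)/2 gives an area proportional to |r|.
# The values in r are arbitrary examples, not taken from mtcars.
r <- c(-0.9, -0.25, 0.25, 0.9, 1)
radius <- sqrt(abs(r)) / 2
area <- pi * radius^2
# The ratio area / |r| is the same constant, pi / 4, for every value.
stopifnot(all.equal(area / abs(r), rep(pi / 4, length(r))))
# With r = 1 the diameter is exactly 1, so the circle just fills its grid cell.
stopifnot(all.equal(2 * radius[5], 1))
```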

The above three graphs are based on the same data. Dear friends, which gives you more information at first glance?