This page shows how you can analyze multiple variables and the relationships between them using R.
Use the tabyl() function from the janitor package. With the adorn_*() helper functions you can adjust the table to your liking. You can force the table to display the values 'none', 'some' and 'many' in the correct order by making sure the variable is stored as an ordered factor.
library(janitor)
library(dplyr)    # for the pipe (%>%)

mtcars %>%
  tabyl(am, gear, cyl, show_na = FALSE) %>%
  adorn_title("combined") %>%          # both variable names in the title
  adorn_totals("col") %>%              # column totals
  adorn_totals("row") %>%              # row totals
  adorn_percentages("col") %>%         # columnwise; use "row" or "all" for rowwise or total percentages
  adorn_pct_formatting(digits = 1) %>%
  adorn_ns()                           # show the number of cases
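The ordered factor mentioned above is not needed for mtcars, but here is a minimal sketch, assuming a hypothetical character variable amount with the values 'none', 'some' and 'many':

# hypothetical variable, not part of mtcars
amount <- c("some", "none", "many", "none", "some")

# store it as an ordered factor so tables display the levels in this order
amount <- factor(amount, levels = c("none", "some", "many"), ordered = TRUE)
levels(amount)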
For the ordinary linear model, we use lm(). For a clear presentation of the regression table, we use the tidy() function from the broom package.
library(broom)

# multiple linear regression model (no interaction)
# y = qsec, x1 = wt, x2 = cyl
# data set: mtcars
model <- mtcars %>%
  lm(qsec ~ wt + cyl, data = .)

# the regression table
model %>%
  tidy()
For an ANOVA table, we also use the tidy() function.
model %>%
  anova() %>%
  tidy()
To obtain the R-squared, we can use the summary() function.
out <- mtcars %>%
  lm(qsec ~ wt + cyl, data = .) %>%
  summary()

out$r.squared
out$adj.r.squared
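Alternatively, the glance() function from the broom package collects such model-level statistics (including the R-squared and adjusted R-squared) in a one-row data frame; a sketch for the same model:

mtcars %>%
  lm(qsec ~ wt + cyl, data = .) %>%
  glance()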
To make plots with the residuals and/or the predicted values, we can use the modelr package to add residuals and predicted values to the data frame:
library(modelr)
library(ggplot2)   # for the plot

model <- mtcars %>%
  lm(qsec ~ wt + cyl, data = .)

mtcars %>%
  add_predictions(model) %>%
  add_residuals(model) %>%
  ggplot(aes(x = pred, y = resid)) +
  geom_point()
For the linear mixed model, we use the lmer() function from the lme4 package. Note that the output doesn't show p-values or residual degrees of freedom for the fixed effects. This is deliberate: for mixed models there is no generally agreed-upon way to determine the appropriate denominator degrees of freedom.
library(lme4)
library(broom.mixed)   # tidy() methods for mixed models (needed with recent broom versions)

mtcars %>%
  lmer(qsec ~ wt + (1|gear), data = .) %>%
  tidy()

# when we have factors as fixed variables:
mtcars %>%
  lmer(qsec ~ wt + factor(cyl) + (1|gear), data = .) %>%
  anova() %>%
  tidy()
If you want approximate p-values, you can use Satterthwaite's degrees of freedom method, implemented in the lmerTest package. The same method is used in SPSS.
library(lmerTest)

mtcars %>%
  lmer(qsec ~ wt + (1|gear), data = .) %>%
  summary()
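With lmerTest loaded, anova() uses Satterthwaite's method by default as well, so the earlier example with a factor as fixed variable now shows denominator degrees of freedom and p-values; a sketch:

mtcars %>%
  lmer(qsec ~ wt + factor(cyl) + (1|gear), data = .) %>%
  anova()    # F tests with Satterthwaite degrees of freedom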
For a residual plot, we can use similar syntax to that used for the linear model:
model <- mtcars %>%
  lmer(qsec ~ wt + (1|gear), data = .)

mtcars %>%
  add_predictions(model) %>%
  add_residuals(model) %>%
  ggplot(aes(x = pred, y = resid)) +
  geom_point()
For a logistic regression model, we use the glm() function.
mtcars %>%
  glm(am ~ wt, family = binomial, data = .) %>%
  tidy()
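If you prefer odds ratios instead of log-odds, tidy() can exponentiate the coefficients; a sketch (conf.int = TRUE adds confidence intervals):

mtcars %>%
  glm(am ~ wt, family = binomial, data = .) %>%
  tidy(exponentiate = TRUE, conf.int = TRUE)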
The syntax for a Poisson regression is similar:

mtcars %>%
  glm(carb ~ wt, family = poisson, data = .) %>%
  tidy()
For a non-parametric test comparing two or more groups, we use the Kruskal-Wallis test:

airquality %>%
  kruskal.test(Ozone ~ Month, data = .)
For a non-parametric test for repeated measures, we use the Friedman test:

iris %>%
  select(Sepal.Length, Petal.Length) %>%
  as.matrix() %>%
  friedman.test()
For Kendall's tau, we use the kendall.tau() function from the VGAM package:

library(VGAM)

kendall.tau(mtcars$mpg, mtcars$gear)
For Spearman's rho, we use the rcorr() function from the Hmisc package:

library(Hmisc)

rcorr(mtcars$mpg, mtcars$disp, type = "spearman")
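As a sketch of an alternative that needs no extra packages, base R's cor.test() also gives both coefficients together with a significance test:

cor.test(mtcars$mpg, mtcars$gear, method = "kendall")
cor.test(mtcars$mpg, mtcars$disp, method = "spearman")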