This page will show you how you can analyze and visualize 2 variables and their relationships using R.
Use the tabyl()
function from the janitor
package. The first line creates the table. The rest formats the table (giving it a title, adding totals etc…).
library(janitor)
mtcars %>%
tabyl(am, gear, show_na = FALSE) %>%
adorn_title("combined") %>% #both var names in the title
adorn_totals("col") %>% #column totals
adorn_totals("row") %>% #row totals
adorn_percentages("col") %>% #columnwise, rowwise (row), or total percentages (all)
adorn_pct_formatting(digits = 1) %>%
adorn_ns() #show the numbers of cases
For a boxplot use geom_boxplot()
:
# A boxplot comparing different species is having two variables in the aesthetics
iris %>%
ggplot(aes(x = Species, y = Petal.Width)) +
geom_boxplot()
A scatter plot in the ggplot package will be done with the geom geom_point()
. Notice that the aesthetics in the first layer now contain two variables.
mtcars %>%
ggplot(aes(x = cyl, y = mpg)) +
geom_point()
If you want to add a linear regression line, add an extra ‘geom’ (geom_smooth).
mtcars %>%
ggplot(aes(x = cyl, y = mpg)) +
geom_point() +
geom_smooth(method = "lm", se = F)
If you want to visualize different groups in the scatterplot, you can add extra variables to the aesthetics. In addition to x and y, you can use color (the “outside” color of points), fill (the “inside” color of the points), shape (of points) and size.
mtcars %>%
ggplot(aes(x = cyl,
y = mpg,
color = factor(gear)
)
) +
geom_point()
For different subplots, use the layer facet_wrap(~ )
as default. This is another way to introduce a third variable.
mpg %>%
ggplot(aes(x = factor(cyl), y = hwy)) +
geom_boxplot() +
facet_wrap(~ year)
No examples available yet
No function overview yet
No Frequently Asked Questions yet
No resources yet