Content
Before we begin…
These are the packages we will be using in this demonstration. If you
have been following along with the content each week, the
tidyverse
package should already be installed on your
computer. However, this is the first time we have encountered the
effectsize
package. We will be using functions from this
package to calculate the effect size, which we need to report with our
statistical test. So make sure you install it using the
install.packages()
function.
library(tidyverse)
library(effectsize)
Also, something to keep in mind, throughout the next few demonstrations, whenever we analyse data, we will roughly be following this procedure:
- Prepare the data for analysis.
- Visualise the data.
- Run the statistical test.
- Write-up analysis.
The General Linear Model in R
R has built in functions that performs all the statistical tests we will be covering this week. Whenever we run an analysis in R (whether it is a t-test, ANOVA, correlation, regression, etc.), generally, we need to provide the function with two things:
- The formula we wish to test (more on this below).
- The data.frame that holds the data.
All functions take the following template:
function_name(formula,data)
In the next few weeks, we will be covering examples of different analyses. In almost all cases, the function to conduct these analyses follows this template:
Formulas
Whenever we conduct an analysis in R, we need to write the formula for the linear model. In general, this takes the following form:
dependent_variable ~ independent_variable1 + independent_variable2 + ...
The variable on the left of the ~
symbol is the
dependent variable you are trying to predict, while variables on the
right are your independent variables. The variable names are the same as
the column names in your data.frame. Each independent variable in your
model (if there is more than one) is separated by the +
symbol.
Independent-samples t-test
As covered in the lecture series, an independent samples t-test is used to assess the relationship between a categorical IV (with two levels that is between-subjects) and a continuous DV. In this example, we test whether people who self-identify as a ‘cat-person’ are more introverted than those who self-identify as a ‘dog-person’. This is an ‘independent-samples’ t-test because participants in one group are different to participants in the other group.
1. Prepare the data for analysis.
Click here
for details on the Introversion Scale. To keep things simple, we have
only used 5 items from this scale. Note: as specified in the link, low
scores on this scale indicate higher levels of introversion. Therefore,
to be consistent with the name of the measure, scores have been reversed
in the dataset (i.e., high scores corresponds with high introversion).
Also note: sometimes you will need to reverse code the items yourself,
while we don’t have to do this for this data, I have left the code in
(commented out) that would reverse score an Item 1 - to do this, we use
the recode()
function within the mutate()
function.
data1.clean <- data %>%
filter(cat.dog != "both") %>%
filter(cat.dog != "neither") %>%
filter(cat.dog != "") %>%
mutate( #introversion1R = recode(introversion1,`1` = 5,`2` = 4,`3`=3,`4` = 2,`5` = 1),
introvert = introversion2 + introversion5 + introversion7 + introversion8 + introversion10) %>%
select(cat.dog,introvert)
In the code above, we have removed participants that did not specify
identify as either a “dog-person” or “cat-person”. We have also used the
mutate()
function to calculate an overall introversion
score. We have also only selected the variables we need for our analysis
- this step is not strictly necessary, but can help you organise your
data.
2. Visualise the data.
Box Plot
Box plots are good for representing the range and quantiles of your
data. To create a box plot, use geom_boxplot()
. Make sure
to set the correct variables on the x- and y-axes (cat.dog
and introvert
respectively). We also map the ‘fill’
aesthetic to the cat.dog
variable so that the colours are
different.
ggplot(data1.clean,aes(x = cat.dog,y = introvert,fill = cat.dog)) +
geom_boxplot() +
theme_classic()
Violin Plot
Violin plots are good for representing the distributions of scores on
the DV between groups, and can be created using
geom_violin()
. All aesthetics are identical to the box plot
above. If you want to includes visualisations of each individual data
point, you can use geom_jitter()
. You can also visualise
the mean and standard error using the stat_summary()
function.
ggplot(data1.clean,aes(x = cat.dog,y = introvert,fill = cat.dog)) +
geom_violin() +
stat_summary() +
geom_jitter(width = .1,alpha = .2) +
theme_classic()
Bar Graphs
A common way of reporting data from a t-test is a bar graph. Unfortunately, this is a very uninformative way of presenting data (it gives no representation for the spread/variability in the data). We provide the code here so we you can re-create a bar graph if needed, and know how to read one if you see one in a paper, but then we will never speak of this evil again. If you need to plot data with a categorical IV and continuous DV, aim to use a box plot, or better yet a violin plot.
In a bar graph, the height of the bars represent the mean for each group. If you’re lucky, you may get error bars, which usually represents the standard error. In the example below, the error bars represents the standard error of the mean for each group on the introversion scale. Note that we have to calculate the standard error ourselves.
data1.summary <- data1.clean %>%
group_by(cat.dog) %>%
summarise(introvert_mean = mean(introvert,na.rm = TRUE),
introvert_sd = sd(introvert,na.rm = TRUE),
introvert_se = introvert_sd/sqrt(n()))
ggplot(data1.summary,aes(x = cat.dog,y = introvert_mean,fill = cat.dog)) +
geom_col() +
geom_errorbar(aes(ymin = introvert_mean - introvert_se,ymax = introvert_mean + introvert_se),width = .2) +
theme_classic()
3. Run the statistical test.
To run the t-test, we use the function t.test()
. As
mentioned above, the first argument is the formula. In this analysis,
the DV is called introvert
and the IV is called
cat.dog
, so the formula is
introvert ~ cat.dog
. The second argument or the
t.test()
function is the cleaned data.frame, which we’ve
called data1.clean
.
t.test(introvert ~ cat.dog,data1.clean)
##
## Welch Two Sample t-test
##
## data: introvert by cat.dog
## t = 0.97993, df = 19.771, p-value = 0.339
## alternative hypothesis: true difference in means between group cat and group dog is not equal to 0
## 95 percent confidence interval:
## -1.322787 3.663446
## sample estimates:
## mean in group cat mean in group dog
## 13.78571 12.61538
The output from the statistical test gives you a lot of information, but there are a couple bits that are particularly important:
- The t-statistic - this is the value for the test statistic (above this equals 0.98)
- The degrees of freedom (above this equals 19.77)
- The p-value - this tells you whether the test is significant or not (above this equals 0.339)
Can you see where each of these values come from in the output above?
4. Write-up analysis.
In order to report a t-test in APA format, we need the following information:
- The t-statistic and degrees of freedom (df) for the statistical test.
- the p-value from the statistical test.
- The mean on the DV for both groups.
- The standard deviations on the DV for both groups.
- The effect size (Cohen’s d)
The first three items can be obtained from the output of
t.test()
above; however, we need to do some extra things to
get the other two.
First, to get the SDs of the two groups, we need to use a few more
tidyverse functions. As covered in Demonstration 2,
summarise()
can be used to get descriptive statistics by
using functions like mean()
and sd()
to get
the mean and standard deviation of a variable respectively. However, in
order to calculate these statistics separately for different groups, we
need to use the group_by()
function.
group_by()
will group data based on a column, and perform
all subsequent tidyverse functions separately for each group until you
use the ungroup()
function.
data1.summary <- data1.clean %>%
group_by(cat.dog) %>%
summarise(introvert_mean = mean(introvert,na.rm = TRUE),
introvert_sd = sd(introvert,na.rm = TRUE))
data1.summary
## # A tibble: 2 × 3
## cat.dog introvert_mean introvert_sd
## <chr> <dbl> <dbl>
## 1 cat 13.8 4.00
## 2 dog 12.6 3.31
To get the Cohen’s d, we will be using the function
cohens_d()
from the effectsize
package. This
function works very similarly to the t.test()
function
above: we need to specify a formula and a dataset. In return, the
function will calculate our effect size.
cohens_d(introvert ~ cat.dog,data = data1.clean)
## Cohen's d | 95% CI
## -------------------------
## 0.33 | [-0.28, 0.95]
##
## - Estimated using pooled SD.
Once we have all the required information, we can write-up! A few things to note about APA formatting:
- All numerical values are rounded to 2 decimal points, except p-values which are rounded to 3 decimal points.
- All letters indicating stats (e.g., M, SD, t, p, d) should be italicised.
An independent-samples t-test found that there was no significant difference on introversion between cat-people (M = 13.79, SD = 4) and dog-people (M = 12.62, SD = 3.31), t(19.77) = 0.98, p = 0.339, Cohen’s d = 0.33.
One-way Between-Subjects ANOVA (Analysis of Variance)
The one-way ANOVA is used when you have a continuous DV and a categorical IV with more than two groups. Let’s revisit the hypothesis that cat-people are more introverted than dog-people, but what if we did not exclude participants that reported being ‘both’ or ‘neither’? We would then have four groups in the IV; therefore, a one-way ANOVA is the appropriate statistical test.
1. Prepare the data for analysis.
The code below is identical to the code above, except we do not exclude participants who reported “both” or “neither” (these commands have been commented out below).
data3.clean <- data %>%
# filter(cat.dog != "both") %>%
# filter(cat.dog != "neither") %>%
filter(cat.dog != "") %>%
mutate(introvert = introversion2 + introversion5 + introversion7 + introversion8 + introversion10) %>%
select(cat.dog,introvert)
2. Visualise the Data.
Similar to the independent-samples t-test, we can use a violin plot to visualise the distributions for each group.
ggplot(data3.clean,aes(x = cat.dog,y = introvert,fill = cat.dog)) +
geom_violin() +
stat_summary() +
geom_jitter(width = .2,alpha = .2) +
theme_classic()
3. Run the statistical test.
We run the ANOVA using the aov()
function. We can then
pipe it to the summary()
function to get interpretable
output.
aov(introvert ~ cat.dog,data = data3.clean) %>%
summary()
## Df Sum Sq Mean Sq F value Pr(>F)
## cat.dog 3 31.1 10.36 0.863 0.464
## Residuals 84 1008.7 12.01
The table above tells us whether there is a significant difference between any of the groups, but it does not tell us which groups are significantly different. In order to do this, we need to do follow-up t-tests. You only need to conduct follow up comparisons if the ANOVA is significant. Even though we do not find any significant differences above, we will go through the process anyway from demonstrative purposes.
In the code below, we run a t.test similar to what we did above.
However, when calling the data, we use the filter function to only
select participants who are in one of two groups. The |
symbol means ‘or’, so below, we are only analysing the data of
participants who are in the dog ‘or’ cat group.
t.test(introvert ~ cat.dog,data = filter(data3.clean,cat.dog == "dog" | cat.dog == "cat"))
##
## Welch Two Sample t-test
##
## data: introvert by cat.dog
## t = 0.97993, df = 19.771, p-value = 0.339
## alternative hypothesis: true difference in means between group cat and group dog is not equal to 0
## 95 percent confidence interval:
## -1.322787 3.663446
## sample estimates:
## mean in group cat mean in group dog
## 13.78571 12.61538
Now, we are only including participants in the dog or both group:
t.test(introvert ~ cat.dog,data = filter(data3.clean,cat.dog == "dog" | cat.dog == "both"))
##
## Welch Two Sample t-test
##
## data: introvert by cat.dog
## t = 0.82354, df = 60.689, p-value = 0.4134
## alternative hypothesis: true difference in means between group both and group dog is not equal to 0
## 95 percent confidence interval:
## -0.9778595 2.3470903
## sample estimates:
## mean in group both mean in group dog
## 13.30000 12.61538
We continue to repeat this process until we have tested all possible linear combinations:
t.test(introvert ~ cat.dog,data = filter(data3.clean,cat.dog == "dog" | cat.dog == "neither"))
##
## Welch Two Sample t-test
##
## data: introvert by cat.dog
## t = -1.7156, df = 5.8303, p-value = 0.1385
## alternative hypothesis: true difference in means between group dog and group neither is not equal to 0
## 95 percent confidence interval:
## -5.3226197 0.9533889
## sample estimates:
## mean in group dog mean in group neither
## 12.61538 14.80000
t.test(introvert ~ cat.dog,data = filter(data3.clean,cat.dog == "cat" | cat.dog == "both"))
##
## Welch Two Sample t-test
##
## data: introvert by cat.dog
## t = -0.38959, df = 22.665, p-value = 0.7005
## alternative hypothesis: true difference in means between group both and group cat is not equal to 0
## 95 percent confidence interval:
## -3.066906 2.095477
## sample estimates:
## mean in group both mean in group cat
## 13.30000 13.78571
t.test(introvert ~ cat.dog,data = filter(data3.clean,cat.dog == "cat" | cat.dog == "neither"))
##
## Welch Two Sample t-test
##
## data: introvert by cat.dog
## t = -0.64345, df = 11.232, p-value = 0.5329
## alternative hypothesis: true difference in means between group cat and group neither is not equal to 0
## 95 percent confidence interval:
## -4.475049 2.446478
## sample estimates:
## mean in group cat mean in group neither
## 13.78571 14.80000
t.test(introvert ~ cat.dog,data = filter(data3.clean,cat.dog == "both" | cat.dog == "neither"))
##
## Welch Two Sample t-test
##
## data: introvert by cat.dog
## t = -1.134, df = 6.7319, p-value = 0.2955
## alternative hypothesis: true difference in means between group both and group neither is not equal to 0
## 95 percent confidence interval:
## -4.65312 1.65312
## sample estimates:
## mean in group both mean in group neither
## 13.3 14.8
Remember, in your interpretation to adjust the significance level (alpha) to account for family-wise error rate. Since we conducted six comparisons, the significance level should be reduced to 0.008 (that is, .05 divided by 6).
4. Write-up analysis.
In order to report a one-way ANOVA in APA format, we need the following information:
- The F-statistic for the statistical test.
- The Group and Residual degrees-of-freedom.
- The p-value from the statistical test.
- The effect size (measured as eta-squared).
To get the eta-squared value, we can use the
eta_squared()
function from the effectsize
package. You can pipe the output of the aov()
function into
the eta_squared()
function to get the effect size
estimate.
aov(introvert ~ cat.dog,data = data3.clean) %>%
eta_squared()
## # Effect Size for ANOVA
##
## Parameter | Eta2 | 95% CI
## -------------------------------
## cat.dog | 0.03 | [0.00, 1.00]
##
## - One-sided CIs: upper bound fixed at [1.00].
If the ANOVA is significant, then you need to also include the post-hoc comparisons. For this, you need the following information (essentially the same as if you were writing up a t-test):
- The mean on the DV for all groups.
- The standard deviations on the DV for both groups.
- The test-statistics and p-values from the comparisons.
- The effect size for each comparison (measured as Cohen’s d).
The code below will calculate the mean and standard deviation on the dependent variable for all groups. It is identical to the code that does this for the t-test.
data3.clean %>%
group_by(cat.dog) %>%
summarise(dv_mean = mean(introvert,na.rm = TRUE),
dv_sd = sd(introvert,na.rm = TRUE))
## # A tibble: 4 × 3
## cat.dog dv_mean dv_sd
## <chr> <dbl> <dbl>
## 1 both 13.3 3.51
## 2 cat 13.8 4.00
## 3 dog 12.6 3.31
## 4 neither 14.8 2.59
Once you have all this information, the write-up becomes:
A one-way between subjects ANOVA found no significant difference between groups on introversion, F(3, 84) = 0.86, p = 0.464, Eta-squared = 0.03.
Paired-samples t-test
As discussed in the lecture, a paired-samples t-test is like an independent-samples t-test, except rather than assess two separate groups of participants (i.e., a between-subjects design), the paired samples t-test is used when data is collected from the same participant in two separate conditions (i.e., a within-subjects design). A common example of this is testing for significant differences in an outcome variable before and after a treatment or intervention.
In the example below, we test whether mood is influenced by exposure to a cute cat video. Click here to see that video.. We would expect that mood would improve after viewing the cat video.
1. Prepare the data for analysis.
When conducting a statistical test, we follow the same procedure as before. First, we must clean the data. Below, we select the columns we need to run the paired-samples t-test and remove any participants with missing data on either variable.
data2.clean <- data %>%
drop_na(pre.mood) %>%
drop_na(post.mood) %>%
select(student.no,pre.mood,post.mood)
3. Visualise the data.
We can plot the data using a violin plot, the same way we visualised
the data for an independent-samples t-test. However, this does not give
an indication of the change in scores within a participant. One way to
represent this graphically, is to plot the data point for each
participant, and include a line between their scores across the two
conditions. We also adjust the opacity of the points and lines using the
alpha
aesthetic, just so that the graph isn’t a complete
mess.
To create this plot, we first need to re-shape the data using the
gather()
function so that each observation is on a separate
row (rather than each participant). The gather function works by first
specifying a column name that distinguishes between observations made by
the same participant (i.e., between pre.mood
and
post.mood
), then a variable name that corresponds with the
response (i.e., the number corresponding with mood). Then, you specify
the variables that you want to reshape. Note: The last line in the code
that reshapes the data below is only used so that the pre-mood and
post-mood variables are in the right order when we plot the data.
plot.data <- data2.clean %>%
gather(key = "time",value = "mood",pre.mood,post.mood) %>%
mutate(time = factor(time,levels = c("pre.mood","post.mood")))
ggplot(data = plot.data,aes(x = time,y = mood)) +
geom_violin(aes(fill = time)) +
stat_summary() +
geom_point(alpha = .2) +
geom_line(aes(group = student.no),alpha = .2) +
xlab("Exposure Time (Before or After Video)") +
ylab("Mood Rating") +
theme(legend.position = "none") +
theme_classic()
3. Run the statistical test.
We use the same function as with the independent-samples t-test to
run a paired-samples t-test, though the formula looks slightly
different. Rather than have the DV on the left, and the IV on the right,
we specify both variables on the left within a Pair()
function. You could consider this is because both columns technically
contains values of the DV (just at different levels of the IV of time).
On the right side of the ~
symbol, we have a
1
.
t.test(Pair(pre.mood,post.mood) ~ 1,data = data2.clean)
##
## Paired t-test
##
## data: Pair(pre.mood, post.mood)
## t = -5.9739, df = 87, p-value = 4.944e-08
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## -11.706674 -5.861508
## sample estimates:
## mean difference
## -8.784091
Alternatively, if you don’t want to use a formula, you can also use
the following technique, but note that the arguments supplied here are
different to almost all other analyses we will be covering in this
module. The first and second arguments are the columns we want to
compare. In this example, this is the ‘before’ and ‘after’ column. How
we specify this is using the following syntax:
data$pre.mood
. This is short-hand for specifying the
‘pre.mood’ column in the ‘data’ data.frame. We also have to set ‘paired’
to TRUE
so that the function knows it’s a paired-sample
t-test. Note, though, we do not have to supply a data.frame using this
method. So the code looks like:
t.test(data$pre.mood,data$post.mood,paired = TRUE)
##
## Paired t-test
##
## data: data$pre.mood and data$post.mood
## t = -5.9739, df = 87, p-value = 4.944e-08
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## -11.706674 -5.861508
## sample estimates:
## mean difference
## -8.784091
4. Write-up analysis.
Similar to the independent-samples t-test, we require information on the mean and standard deviation for each condition. This code is exactly the same as above.
data2.summary <- data2.clean %>%
summarise(pre.mean = mean(pre.mood),
pre.sd = sd(pre.mood),
post.mean = mean(pre.mood),
post.sd = sd(post.mood))
data2.summary
## pre.mean pre.sd post.mean post.sd
## 1 62.81818 19.34594 62.81818 18.9976
Or if you have not reshaped the data:
data2.summary <- data2.clean %>%
summarise(pre.mean = mean(pre.mood,na.rm = TRUE),
pre.sd = sd(pre.mood,na.rm = TRUE),
pst.mean = mean(post.mood,na.rm = TRUE),
pst.sd = sd(post.mood,na.rm = TRUE))
data2.summary
## pre.mean pre.sd pst.mean pst.sd
## 1 62.81818 19.34594 71.60227 18.9976
We also need the effect size, which is Cohen’s d. Again, we can use
the cohens_d()
function from the effectsize
package.
cohens_d(Pair(pre.mood,post.mood) ~ 1,data = data2.clean)
## Cohen's d | 95% CI
## --------------------------
## -0.64 | [-0.86, -0.41]
A paired-samples t-test found that there was a significant difference on mood before (M = 62.82, SD = 19.35) and after (M = 71.6 , SD = 19) viewing a cute cat video, t(87) = -5.97, p < .001, Cohen’s d = -0.64.