Posted on January 15, 2022 by R in the Lab
Basic design of experiments in R for one-factor and two-factor designs.
You can find all the code, data and results in the GitHub repository for this post: Basic design of experiments.
There is no signal without noise
It never hurts to go back to basics before tackling more complex things. The purpose of this post is to give a brief overview of the basics of design of experiments, how to analyze them, and how to present the results using R and packages like ggplot2 and agricolae. One- and two-factor experiments are included.
The design of experiments (DOE) deals with the planning and performance of tests with the objective of generating data. Statistical analysis of these data will provide objective evidence that will allow the researcher to resolve questions about a given situation, process or phenomenon.
DOE can be applied in both scientific research and industry. Its goal is to generate knowledge and learning in an efficient manner. Ideally, this process can be considered as a cycle in which initial hypotheses are refined as more information is obtained. Thus, we start with an initial hypothesis that we test through experimentation. If the data (our evidence) do not agree with the hypothesis, a new hypothesis is sought to explain the observed discrepancy.
Hypotheses are tentative explanations of the research phenomenon that are stated as propositions or assertions. There are several types of hypotheses.
A datum is a number that is the product of a measurement. A measurement is a process that links abstract concepts with empirical indicators. Of course, measurements could not be made without a measurement instrument, which can be defined as the resource used by the researcher to record information or data.
Experiment

It is a change in the operating conditions of a system or process, made with the objective of measuring the effect of the change on one or more properties of the product or result.
Experimental unit

Piece(s) or sample(s) used to generate a value that is representative of the test result.
Response variable(s)

Through these variable(s) we know the effect or results of each experimental test. They are the product of our measurements once we make changes in the system.
Controllable factors

They are process variables that can be modified and fixed at a given level.
Uncontrollable or noise factors

These are variables that cannot be controlled during the experiment or the normal operation of the process.
Studied factors

These are the variables that are investigated in the experiment to observe how they affect or influence the response variables.
Levels and treatments

The different values assigned to each factor studied in an experimental design are called levels. A combination of levels of all the factors studied is called a treatment or design point. When experimenting with a single factor, each level is a treatment.
Design matrix

It is the arrangement formed by the treatments to be carried out, including the repetitions. The results of each response variable for each treatment and repetition are usually included in this same array. For me this is a concept of great importance, since from this matrix it is possible to perform the different statistical tests directly in R.
Random error

It is the observed variability that cannot be explained by the factors studied. It is the result of the effect of factors not studied and of experimental error.
Experimental error

A component of random error that reflects the experimenter's errors in the planning and execution of the experiment.
When the objective is to study the effect of a single factor on the mean of a response variable, taking into account more than two values of the factor under consideration, it is advisable to perform a completely randomized experiment and analyze the results by analysis of variance (ANOVA). The reason for comparing means by ANOVA and not by Student's t-tests is that the latter increases false positives as the number of means to be compared increases, i.e., we will find differences between means where there are none. For example, comparing five means requires ten pairwise t-tests, and at a significance level of 0.05 the probability of at least one false positive rises to 1 − 0.95^10 ≈ 0.40.
The analysis of variance consists of separating the total variation observed into each of the sources that contribute to it.
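For a balanced one-factor design with $a$ treatments and $n$ replicates per treatment, this decomposition can be written as:

$$SS_T = SS_{\text{Treatments}} + SS_E$$

$$\sum_{i=1}^{a}\sum_{j=1}^{n}\left(y_{ij}-\bar{y}_{\cdot\cdot}\right)^{2} = n\sum_{i=1}^{a}\left(\bar{y}_{i\cdot}-\bar{y}_{\cdot\cdot}\right)^{2} + \sum_{i=1}^{a}\sum_{j=1}^{n}\left(y_{ij}-\bar{y}_{i\cdot}\right)^{2}$$

where $y_{ij}$ is the j-th observation under the i-th treatment, $\bar{y}_{i\cdot}$ is the mean of treatment i, and $\bar{y}_{\cdot\cdot}$ is the grand mean.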
Obtaining a single-factor design is very simple using R commands. Let's say we want to compare the effect of three baking times (35, 45 and 60 min) on the average diameter of cookie batches. For each time we baked ten batches, so we have ten replicates. Note that the runs have to be performed in random order:
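A minimal sketch of this step (set.seed(6), b_time and the name one_fct_dgn come from the post; the run column and the use of sample() to randomize the order are my assumptions):

```r
set.seed(6)

# Three baking times with ten replicates each, shuffled into a random run order
one_fct_dgn <- data.frame(
  run = 1:30,
  b_time = sample(rep(c(35, 45, 60), times = 10))
)

head(one_fct_dgn)
```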
Subsequently, the design can be exported in some other format, such as CSV:
```r
write.csv(one_fct_dgn, "data/one_fct_dgn.csv", row.names = FALSE)
```

The results can be integrated into the design matrix directly; let's simulate the average diameter with a small function:
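A sketch of such a simulation (the names av_diam, set.seed(3) and diameter come from the post; the means and standard deviation below are illustrative assumptions):

```r
# Hypothetical simulation: the mean diameter depends on the baking time
av_diam <- function(time) {
  mean_diam <- switch(as.character(time), "35" = 10, "45" = 14, "60" = 18)
  rnorm(1, mean = mean_diam, sd = 2)
}

set.seed(3)
one_fct_dgn$diameter <- sapply(one_fct_dgn$b_time, av_diam)
```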
Or, once the results have been recorded externally, they can be imported from a CSV file:
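For example, assuming the recorded data live in the repository's data folder (the file name is an assumption):

```r
exp_one_factor <- read.csv("data/exp_one_factor.csv")
```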
Analysis
The aov function is used for the analysis of factorial designs. Beforehand, it is recommended to convert the factor values into the factor class and then perform the analysis. Here I am going to analyze our first design (one_fct_dgn):
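A sketch of both steps (the model object name one_fct_aov is an assumption that I reuse throughout the rest of the post):

```r
# Convert the numeric baking times to the factor class
one_fct_dgn$b_time <- factor(one_fct_dgn$b_time)

# Fit the one-factor model
one_fct_aov <- aov(diameter ~ b_time, data = one_fct_dgn)
```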
Note that aov requires a formula that explicitly specifies the relationship between the response and the factor(s) in the design. Once this has been done, it is possible to obtain an ANOVA table using the anova function:
```r
one_fct_anova <- anova(one_fct_aov)
one_fct_anova
```

```
##           Df Sum Sq Mean Sq F value    Pr(>F)    
## b_time     2 367.51 183.755  44.902 2.587e-09 ***
## Residuals 27 110.49   4.092                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

And export it in the format of your choice:
```r
write.csv(one_fct_anova, "results/anova_one_fct.csv", na = "", row.names = FALSE)
```

Multiple comparison tests
Once significant differences are established between at least two of the means, the next step is to perform a multiple comparison test, which will allow us to define which means are significantly different. There are several methods that can be used, such as the LSD method, Tukey's method, or Duncan's method. Nice functions for these methods can be found in the agricolae package.
LSD
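A sketch with agricolae, passing the fitted model and the factor name (console = TRUE prints the results; the object name one_fct_lsd comes from the post):

```r
library(agricolae)

one_fct_lsd <- LSD.test(one_fct_aov, "b_time", console = TRUE)
```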
Tukey HSD
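Analogously, for Tukey's HSD:

```r
one_fct_hsd <- HSD.test(one_fct_aov, "b_time", console = TRUE)
```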
Duncan
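And for Duncan's test:

```r
one_fct_duncan <- duncan.test(one_fct_aov, "b_time", console = TRUE)
```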
The results are consistent among the three methods. The groups section indicates which means are different: different letters mean significant differences. You can export the results of these methods as a TXT file with the capture.output function. For example, to export the results of the Tukey method:
```r
capture.output(one_fct_hsd, file = "results/hsd_one_fct.txt")
```

Verification of model assumptions
Something that is not very widespread in the scientific literature is the verification of the normality of the residuals, the equality of variances across treatments, and the independence of the recorded observations. Verifying these assumptions helps to validate the differences established as significant.
Shapiro-Wilk test to verify normality
This test is easy to perform:
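A sketch applying the test to the model residuals:

```r
one_fct_shapiro <- shapiro.test(residuals(one_fct_aov))
one_fct_shapiro
```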
Here, if the p-value > 0.05, we can say that the residuals come, approximately, from a normal distribution.
The results of this analysis can be exported with the capture.output function:
```r
capture.output(one_fct_shapiro, file = "results/shapiro_one_fct.txt")
```

Bartlett's test for equality of variances
To perform this test it is possible to use the bartlett.test function:
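A sketch grouping the diameters by baking time (the object name one_fct_bt comes from the post):

```r
one_fct_bt <- bartlett.test(diameter ~ b_time, data = one_fct_dgn)
one_fct_bt
```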
A p-value > 0.05 means that the treatment variances can be considered equal. We can export the results in the same way:
```r
capture.output(one_fct_bt, file = "results/bartlett_one_fct.txt")
```

Independence
To verify independence between observations, residuals are plotted as a function of run order. If the residuals follow a defined pattern, it is a clear indication of lack of independence. For this plot I used the ggplot2 package:
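A sketch of the plot, storing the residuals next to the run order in the design matrix (the run column is the assumption made when building the design above):

```r
library(ggplot2)

# Add the model residuals to the design matrix
one_fct_dgn$residuals <- residuals(one_fct_aov)

ggplot(one_fct_dgn, aes(run, residuals)) +
  geom_point() +
  geom_hline(yintercept = 0, linetype = "dashed") +
  xlab("Run order") +
  ylab("Residuals") +
  theme_classic()
```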
Tests such as the Durbin-Watson test are also available to verify this assumption. Use the durbinWatsonTest function in the car package:
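A sketch (aov models inherit from lm, so the model can be passed directly; the object name one_fct_dw comes from the post):

```r
library(car)

one_fct_dw <- durbinWatsonTest(one_fct_aov)
one_fct_dw
```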
A p-value > 0.05 means that the observations are not autocorrelated. You can also save the results in the usual way:
```r
capture.output(one_fct_dw, file = "results/durbin_watson_one_fct.txt")
```

Display of results
Bar graph
To visualize the results of this type of experiment, bar graphs are usually used, in which the height of each bar represents the magnitude of a mean, together with error bars representing the standard deviation and letters showing the significant differences established by the multiple comparison test:
```r
library(dplyr)

# Means, standard deviations and groups from Tukey HSD
one_fct_means <- one_fct_dgn %>%
  group_by(b_time) %>%
  summarise(
    av_diam = mean(diameter),
    sd_diam = sd(diameter)
  ) %>%
  ungroup() %>%
  arrange(desc(av_diam)) %>%
  mutate(group = unlist(one_fct_hsd$groups["groups"]))

# Bar plot
ggplot(one_fct_means, aes(b_time, av_diam)) +
  geom_col(width = 0.5) +
  geom_errorbar(
    aes(ymin = av_diam - sd_diam, ymax = av_diam + sd_diam),
    width = 0.1
  ) +
  geom_text(aes(label = group, y = av_diam + sd_diam), vjust = -0.5) +
  xlab("Baking time (min)") +
  ylab("Average diameter") +
  theme_classic()
```

Alternative
Although bar graphs are widely used across the scientific literature, they hide the true dispersion of the data for each treatment. One thing I have also noticed is that people tend to judge differences between means by the overlap or separation of the standard deviations, which seems to me more of a bias than an objective criterion.
As an alternative to the bar chart, it is possible to make a graph showing each observation, the mean, and the letters obtained in the multiple comparisons test:
```r
ggplot(one_fct_dgn, aes(b_time, diameter)) +
  geom_point() +
  stat_summary(fun = mean, geom = "crossbar", width = 0.2, color = "blue") +
  geom_text(
    data = one_fct_means,
    aes(label = group, y = av_diam),
    hjust = -3
  ) +
  xlab("Baking time (min)") +
  ylab("Diameter") +
  theme_classic()
```

Two-factor designs
In this type of design the objective is to verify the individual (main) effects and the interaction effect of two factors. Keep the concepts defined above in mind before obtaining and analyzing these designs.
To obtain this kind of design we can use the fac.design function from the DoE.base package. For this we need to specify the number of factors and the number of levels of each factor:
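A sketch assuming four levels for the first factor and three for the second, with three replications (these counts match the degrees of freedom in the ANOVA table below):

```r
library(DoE.base)

two_fct_dgn <- fac.design(
  nlevels = c(4, 3),     # factor 1 with four levels, factor 2 with three
  replications = 3,
  randomize = TRUE
)

head(two_fct_dgn)
```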
For more details about this function, just type ?fac.design in your console. Also note that there is an extra column named "Blocks"; this is no big deal since there is just one block and we won't take it into account further in the analysis. In future posts I will address block designs.
This design can also be exported in the desired format:
```r
write.csv(two_fct_dgn, file = "data/two_fct_dgn.csv", row.names = FALSE)
```

The results can also be integrated directly or imported once recorded in the design matrix. Here, I'm going to import and work with data previously recorded:
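Assuming a CSV file with columns factor1, factor2 and the response res (the column names come from the code below; the file name is an assumption):

```r
exp_two_factor <- read.csv("data/exp_two_factor.csv")
```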
Analysis
To analyze this type of design, the “aov” function is used, specifying each main effect and the interaction effect in the formula:
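A sketch of the conversion to the factor class and the fit with the explicit formula (the model name two_fct_aov reappears later in the post):

```r
# Convert both factors to the factor class
exp_two_factor$factor1 <- factor(exp_two_factor$factor1)
exp_two_factor$factor2 <- factor(exp_two_factor$factor2)

# Full model: both main effects plus the interaction
two_fct_aov <- aov(res ~ factor1 + factor2 + factor1:factor2,
                   data = exp_two_factor)
```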
You can also use the short form Y ~ A*B, which will be displayed in full once you get the ANOVA table:
```r
two_fct_aov <- aov(res ~ factor1 * factor2, data = exp_two_factor)

two_fct_anova <- anova(two_fct_aov)
two_fct_anova
```

```
##                 Df  Sum Sq Mean Sq F value    Pr(>F)    
## factor1          3 2125.11  708.37 24.6628 1.652e-07 ***
## factor2          2 3160.50 1580.25 55.0184 1.086e-09 ***
## factor1:factor2  6  557.06   92.84  3.2324   0.01797 *  
## Residuals       24  689.33   28.72                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

```r
write.csv(two_fct_anova, file = "results/anova_two_fct.csv", na = "")
```

Note that previously I also converted the values of each factor to the factor class.
Multiple comparison tests
Once a significant interaction effect is established, the next step is to establish which means differ. For this, a Tukey test can be used, for which the interaction effect must first be explicitly specified in a separate aov model:
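A sketch refitting the model with the interaction as the only term so that agricolae groups the factor-level combinations (the object names follow the post):

```r
two_fct_aov_2 <- aov(res ~ factor1:factor2, data = exp_two_factor)

two_fct_hsd <- HSD.test(two_fct_aov_2, "factor1:factor2", console = TRUE)
```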
The results of Tukey's test can be exported directly:
```r
capture.output(two_fct_hsd, file = "results/hsd_two_fct.txt")
```

Verification of model assumptions
For the verification of the assumptions it is possible to proceed in a manner similar to that of the single-factor designs.
Shapiro-Wilk test to verify normality
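As before, a sketch applying the test to the model residuals:

```r
two_fct_shapiro <- shapiro.test(residuals(two_fct_aov))
two_fct_shapiro
```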
Bartlett's test for equality of variances
For Bartlett's test I first made an extra column specifying the combination of both factors (the treatments). This column indicates the groups to the bartlett.test function:
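A sketch using interaction() to build the treatment labels (the object name two_fct_bt is an assumption following the post's naming pattern):

```r
# Combination of both factor levels as the treatment label
exp_two_factor$treatment <- interaction(exp_two_factor$factor1,
                                        exp_two_factor$factor2)

two_fct_bt <- bartlett.test(res ~ treatment, data = exp_two_factor)
two_fct_bt
```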
Independence
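A sketch plotting the residuals in observation order and applying the Durbin-Watson test (two_fct_dw follows the post's naming pattern):

```r
# Residuals versus observation order
exp_two_factor$residuals <- residuals(two_fct_aov)

ggplot(exp_two_factor, aes(seq_along(residuals), residuals)) +
  geom_point() +
  geom_hline(yintercept = 0, linetype = "dashed") +
  xlab("Run order") +
  ylab("Residuals") +
  theme_classic()

# Durbin-Watson test for autocorrelation
two_fct_dw <- durbinWatsonTest(two_fct_aov)
two_fct_dw
```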
Display of results
Bar graph
Making this type of graph requires keeping in mind the individual means of each treatment and the interaction effects, if there are any. To calculate them together with the standard deviations we use the group_by and summarise functions from the dplyr package:
```r
# Means, standard deviations and groups from Tukey HSD
two_fct_means <- exp_two_factor %>%
  group_by(factor1, factor2) %>%
  summarise(
    av_res = mean(res),
    sd_res = sd(res)
  ) %>%
  ungroup() %>%
  arrange(desc(av_res)) %>%
  mutate(group = unlist(two_fct_hsd$groups["groups"]))

# Bar plot
ggplot(two_fct_means, aes(factor1, av_res, fill = factor2)) +
  geom_col(width = 0.5, position = position_dodge()) +
  geom_errorbar(
    aes(ymin = av_res - sd_res, ymax = av_res + sd_res),
    width = 0.1,
    position = position_dodge(width = 0.5)
  ) +
  geom_text(
    aes(label = group, y = av_res + sd_res),
    vjust = -0.5,
    position = position_dodge(width = 0.5)
  ) +
  labs(fill = "Factor 2") +
  xlab("Factor 1") +
  ylab("Average response") +
  theme_classic()
```

Alternative
To make the graph showing each observation, the mean, and the letters obtained in the multiple comparison test, it is necessary to use the data.frame with the observations next to the one with the means:
```r
ggplot(exp_two_factor, aes(factor1, res, group = factor2)) +
  geom_point(aes(color = factor2), position = position_dodge(width = 0.9)) +
  stat_summary(
    fun = mean,
    geom = "crossbar",
    width = 0.2,
    color = "blue",
    position = position_dodge(width = 0.9)
  ) +
  geom_text(
    data = two_fct_means,
    aes(label = group, y = av_res),
    vjust = -4,
    size = 3.5,
    position = position_dodge(width = 0.9)
  ) +
  ylim(c(55, 130)) +
  xlab("Factor 1") +
  ylab("Response") +
  labs(color = "Factor 2") +
  theme_classic()
```

Interaction plot
A good way to visualize the interaction effect is through the interaction.plot function:
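A sketch with base R (the axis and legend labels are assumptions matching the previous plots):

```r
with(
  exp_two_factor,
  interaction.plot(
    x.factor = factor1,
    trace.factor = factor2,
    response = res,
    fun = mean,
    xlab = "Factor 1",
    trace.label = "Factor 2",
    ylab = "Mean response"
  )
)
```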
That's it! Thank you very much for visiting this site. I hope you find the content of this post useful. See you soon!
Juan Pablo Carreón Hidalgo 🤓