Title: | Data Sets for the Book "Experimental Design for Laboratory Biologists" |
---|---|
Description: | Contains data sets to accompany the book "Experimental Design for Laboratory Biologists: Maximising Information and Improving Reproducibility" (available mid 2016 from Cambridge University Press). |
Authors: | Stanley E. Lazic |
Maintainer: | Stanley E. Lazic <[email protected]> |
License: | GPL-3 |
Version: | 1.0 |
Built: | 2025-02-07 05:11:51 UTC |
Source: | https://github.com/stanlazic/labstats |
Simulated data for two variables to calculate an assay window between positive and negative control samples.
A data frame with 40 rows and 3 variables:
Values for the first outcome variable.
Values for the second outcome variable.
Indicates if a sample is a positive or negative control.
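One common way to summarise an assay window is the Z'-factor, which compares the separation of the control means with the spread of the controls. The sketch below is illustrative only: the data are simulated, the column names (`outcome1`, `outcome2`, `control`) are hypothetical, and the Z'-factor is an assumed choice of metric rather than necessarily the one used in the book.

```r
# Simulated stand-in for the data set: 20 positive and 20 negative controls.
# Column names and values are hypothetical, not those in the package.
set.seed(1)
d <- data.frame(
  outcome1 = c(rnorm(20, mean = 10, sd = 1), rnorm(20, mean = 2, sd = 1)),
  outcome2 = c(rnorm(20, mean = 10, sd = 3), rnorm(20, mean = 2, sd = 3)),
  control  = rep(c("pos", "neg"), each = 20)
)

# Z'-factor: 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|
z_prime <- function(y, group) {
  m <- as.vector(tapply(y, group, mean))  # group means
  s <- as.vector(tapply(y, group, sd))    # group standard deviations
  1 - 3 * sum(s) / abs(diff(m))
}

zp1 <- z_prime(d$outcome1, d$control)
zp2 <- z_prime(d$outcome2, d$control)
c(outcome1 = zp1, outcome2 = zp2)
```

Values closer to 1 indicate a wider window; the noisier second outcome gives a smaller value here.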
Simulated data to illustrate the effects of blocking versus adjusting for a covariate.
A data frame with 8 rows and 6 variables:
A baseline measurement of body weight for eight rats. The rows of the data frame are sorted according to weight (lightest to heaviest).
Rats are grouped into four blocks based on body weight.
Treatment group when using a randomised block design (RBD).
Treatment group when using a completely randomised design (CRD).
Outcome variable under the RBD.
Outcome variable under the CRD.
The experimental manipulation is a new diet versus a standard control diet, and the outcome is the amount of food eaten on each diet. Rats that weigh more at the beginning of the experiment are expected to eat more food regardless of diet, so it is beneficial to account for this source of variation. This can be done either by blocking or by covariate adjustment, and data for both designs are included. Only one design could be used in a real experiment; here, outcome values are generated for two experiments that share the same baseline body weights.
For the randomised block design the eight rats are ranked according to baseline body weight and grouped into four blocks of two (the two lightest rats form the first block, the next two the second, and so on). Assignment to treatment group is done within blocks.
    # Randomised block design
    summary(aov(y.RBD ~ factor(block) + RBD, data=block.covars))

    # Completely randomised design with weight as a covariate
    summary(aov(y.CRD ~ weight + CRD, data=block.covars))
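The two allocation schemes described above can be sketched as follows. This is a minimal illustration under stated assumptions: the weights are made up (they are not the values in `block.covars`), and the treatment labels `Control` and `Diet` are hypothetical.

```r
set.seed(42)
# Hypothetical baseline weights (g) for eight rats, sorted lightest to heaviest.
weight <- sort(c(251, 270, 263, 298, 305, 282, 290, 311))

# Adjacent pairs of the ranked rats form four blocks of two.
block <- rep(1:4, each = 2)

# Randomised block design: shuffle the two treatments within each block.
RBD <- unlist(lapply(1:4, function(b) sample(c("Control", "Diet"))))

# Completely randomised design: shuffle four of each treatment across all rats.
CRD <- sample(rep(c("Control", "Diet"), each = 4))

design <- data.frame(weight, block, RBD, CRD)
table(design$block, design$RBD)  # one rat per treatment in every block
```

Under the RBD every block contains both treatments by construction, whereas under the CRD the treatment groups may differ in baseline weight by chance, which is why weight is then adjusted for as a covariate.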
Cell count data from a high-throughput screen with six 1536 well plates.
A data frame with 9216 rows and 4 variables:
Name of plate.
Row position of the well.
Column position of the well.
Number of cells in the well.
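A 1536-well plate is laid out as 32 rows by 48 columns, so six plates give 6 x 1536 = 9216 wells. The sketch below shows one way to arrange counts by well position to look for spatial effects. The data are simulated with an artificial edge effect, and the column names (`plate`, `row`, `column`, `cells`) are hypothetical.

```r
set.seed(3)
# Simulated stand-in for one 1536-well plate (32 rows x 48 columns).
plate <- expand.grid(row = 1:32, column = 1:48)
plate$plate <- "plate1"

# Simulate an edge effect: fewer cells in the outermost wells.
edge <- plate$row %in% c(1, 32) | plate$column %in% c(1, 48)
plate$cells <- rpois(nrow(plate), lambda = ifelse(edge, 80, 100))

# Arrange the counts as a row-by-column matrix matching the plate layout.
m <- tapply(plate$cells, list(plate$row, plate$column), sum)
dim(m)

# Compare mean counts in edge wells (TRUE) and interior wells (FALSE).
tapply(plate$cells, edge, mean)
```

The matrix `m` can be passed to a plotting function such as `image()` to display the plate as a heat map.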
Simulated data for two outcome variables.
A data frame with 15 rows and 3 variables:
Sample identification number.
The first outcome variable.
The second outcome variable.
The data contains five samples with three measurements made on each sample; these are "technical replicates" and do not contribute to the sample size. Which of the two outcome variables, A or B, would make a better primary outcome?
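Because technical replicates do not add to the sample size, a typical first step is to average them within each sample. The sketch below does this on simulated data and also computes a within-sample coefficient of variation for each outcome. The column names (`sample`, `A`, `B`) and all values are hypothetical.

```r
set.seed(4)
# Simulated stand-in: five samples, three technical replicates each.
truth <- rnorm(5, mean = 50, sd = 10)        # one true value per sample
d <- data.frame(sample = rep(1:5, each = 3))
d$A <- truth[d$sample] + rnorm(15, sd = 1)   # small measurement error
d$B <- truth[d$sample] + rnorm(15, sd = 8)   # large measurement error

# Average the technical replicates so that n = 5, not 15.
means <- aggregate(cbind(A, B) ~ sample, data = d, FUN = mean)

# Within-sample coefficient of variation (%) for each outcome.
cv <- function(y) 100 * sd(y) / mean(y)
cv_within <- aggregate(cbind(A, B) ~ sample, data = d, FUN = cv)
colMeans(cv_within[, c("A", "B")])
```

An outcome with less variation among technical replicates is measured more precisely, which is one consideration when choosing a primary outcome.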
Sixteen mice from four strains were used; half were assigned to the treated condition and half to the control. The experiment was split into two batches, which were run two months apart.
A data frame with 16 rows and 4 variables:
Four strains of mice were used: NIH, BALB/c, A/J, and 129/Ola.
Presence or absence of diallyl sulphide.
The experiment was conducted in two batches.
Levels of glutathione-S-transferase (Gst).
The purpose of the experiment was to test if diallyl sulphide (DS) affects the activity of the liver enzyme Gst. Four strains of mice were used and the experiment was conducted in two batches, where the housing conditions differed between batches. Both batch and strain are considered blocking variables.
Festing MFW (2014). Randomized block experimental designs can increase the power and reproducibility of laboratory animal experiments. ILAR Journal 55(3):472-476.
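A blocked analysis of a design like this one might look as follows. This is a sketch on simulated data, assuming one treated and one control mouse per strain in each batch; the column names (`strain`, `treatment`, `batch`, `value`) and effect sizes are hypothetical.

```r
set.seed(5)
# Simulated stand-in: 4 strains x 2 treatments x 2 batches = 16 mice.
d <- expand.grid(strain = c("NIH", "BALB/c", "A/J", "129/Ola"),
                 treatment = c("Control", "DS"),
                 batch = c("1", "2"))
d$value <- 500 + 100 * as.integer(d$strain) + 150 * (d$treatment == "DS") +
  80 * (d$batch == "2") + rnorm(nrow(d), sd = 40)

# Batch and strain enter as blocking factors ahead of the treatment term.
fit <- aov(value ~ batch + strain + treatment, data = d)
summary(fit)
```

Including the blocking factors removes batch and strain differences from the residual variation against which the treatment effect is tested.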
Twenty rats were given fluoxetine and tested in the forced swim test (FST).
A data frame with 20 rows and 2 variables:
Dose of fluoxetine (0, 80, 160, or 240 mg/L).
Total immobility time in the FST (seconds).
Twenty rats were randomly assigned to a control condition or one of three doses of fluoxetine, which was administered in the drinking water. The amount of time the rats were immobile in the FST was measured; greater immobility time is taken to indicate a more depressive phenotype.
Lazic SE (2008). Why we should use simpler models if the data allow this: relevance for ANOVA designs in experimental biology. BMC Physiology 8:16.
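With ordered, numeric doses, dose can be modelled either as a factor or as a continuous predictor. The sketch below compares the two on simulated data. It assumes five rats per dose, and the column names (`dose`, `time`) and values are hypothetical.

```r
set.seed(6)
# Simulated stand-in: four doses (mg/L), five rats per dose (an assumption).
d <- data.frame(dose = rep(c(0, 80, 160, 240), each = 5))
d$time <- 200 - 0.3 * d$dose + rnorm(20, sd = 30)

# Dose as a four-level factor: 3 df for the treatment effect.
fit_factor <- lm(time ~ factor(dose), data = d)

# Dose as a continuous predictor: 1 df for a linear trend.
fit_linear <- lm(time ~ dose, data = d)

# Compare the nested models to test for departure from linearity.
anova(fit_linear, fit_factor)
```

If the linear model fits adequately, it uses fewer degrees of freedom to describe the dose effect.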
An experiment measuring the glycogen content in rat livers that uses subsampling.
A data frame with 36 rows and 4 variables:
Glycogen content.
A numeric treatment group indicator (1, 2, 3).
Rat identification number. The numbers are not unique; each treatment has a rat numbered 1 and 2, but these are not the same rats (rats are nested under treatment).
A numeric variable indicating the liver piece. Each liver was divided into three pieces.
Six rats were randomised to three treatment conditions (two per condition). Their livers were divided into three pieces and two measurements were taken on each piece for 6 x 3 x 2 = 36 observations. The rats are the experimental units and there are two levels of subsampling.
Sokal RR and Rohlf FJ (1995). Biometry. WH Freeman and Co., New York, NY.
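Because rat numbers repeat across treatments, a unique rat identifier is needed before the data can be summarised at the level of the experimental unit. The following sketch uses simulated data with the structure described above; the column names and values are hypothetical.

```r
set.seed(7)
# Simulated stand-in: 3 treatments x 2 rats x 3 liver pieces x 2 measurements.
d <- expand.grid(measurement = 1:2, liver = 1:3, rat = 1:2, treatment = 1:3)
d$glycogen <- 130 + 5 * d$treatment + rnorm(nrow(d), sd = 4)

# Combine treatment and rat number into a unique rat identifier.
d$rat_id <- interaction(d$treatment, d$rat, drop = TRUE)
nlevels(d$rat_id)

# Average over liver pieces and measurements: one value per rat.
rat_means <- aggregate(glycogen ~ treatment + rat_id, data = d, FUN = mean)

# Test treatment against rat-to-rat variation (n = 6, not 36).
summary(aov(glycogen ~ factor(treatment), data = rat_means))
```

Analysing all 36 observations as if they were independent would overstate the sample size, since only six rats were randomised.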
Simulated data for one gold standard and four other outcome variables.
A data frame with 20 rows and 5 variables:
Values for the gold standard outcome variable.
An outcome variable that is identical to the gold standard.
An outcome variable created by adding noise to the gold standard.
An outcome variable created by adding noise plus a shift in location.
An outcome variable created by adding noise plus a shift in scale.
The data contains a gold standard plus four other simulated variables that vary from the gold standard in different ways.
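Variables of this kind can be generated and compared with the gold standard as sketched below. The variable names, noise level, and shift sizes are hypothetical and are not those used to create the package data.

```r
set.seed(8)
gold  <- rnorm(20, mean = 100, sd = 15)
noise <- rnorm(20, sd = 5)
d <- data.frame(
  gold     = gold,
  perfect  = gold,                # identical to the gold standard
  noisy    = gold + noise,        # noise only
  location = gold + noise + 20,   # noise plus a constant offset
  rescaled = 1.5 * gold + noise   # noise plus a change in scale
)

# Correlation with the gold standard cannot detect the location shift:
# 'noisy' and 'location' have exactly the same correlation.
round(sapply(d[-1], cor, y = d$gold), 3)

# The mean difference from the gold standard reveals the shifts.
round(sapply(d[-1], function(v) mean(v - d$gold)), 1)
```

This illustrates why a high correlation alone does not establish agreement with a gold standard.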
Repeated measures data set from Casella (2008) testing the effect of high and low dietary calcium on blood pressure.
A data frame with 30 rows and 4 variables:
Patient identification number.
High or low calcium condition.
One of three time points (units unknown).
Blood pressure, the outcome variable.
Ten patients were randomly assigned to a high or low dietary calcium condition (five per group). Blood pressure measurements were taken at three time points.
Casella G (2008). Statistical Design. Springer, New York, NY.
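One simple approach to repeated measures of this kind is a summary-measure analysis: reduce each patient's time series to a single value and compare the groups on that value. The sketch below uses simulated data; the column names (`id`, `diet`, `time`, `bp`) and values are hypothetical.

```r
set.seed(9)
# Simulated stand-in: 10 patients, 3 time points each, 5 patients per diet.
d <- expand.grid(time = 1:3, id = 1:10)
d$diet <- ifelse(d$id <= 5, "High", "Low")
subject_effect <- rnorm(10, sd = 8)   # patient-to-patient variation
d$bp <- 120 + 6 * (d$diet == "Low") + subject_effect[d$id] +
  rnorm(30, sd = 4)

# Collapse the three time points to one mean per patient (n = 10).
per_patient <- aggregate(bp ~ id + diet, data = d, FUN = mean)

# Compare the two diets using patients as the unit of analysis.
t.test(bp ~ diet, data = per_patient, var.equal = TRUE)
```

The test has 8 degrees of freedom, reflecting ten patients rather than thirty observations.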
Repeated measures data set from Kristensen and Hansen (2004) testing the effect of pinacidil on the force of muscle contraction.
A data frame with 210 rows and 4 variables:
Observation number, from 0 to 14. Observations were taken every 30 seconds.
Condition, either Placebo or Pinacidil.
Rat identification number.
Force of muscle contraction, normalised to the first time point.
Seven rats were euthanized and both soleus muscles (lower leg) were extracted from each rat. One muscle from each pair was randomised to the pinacidil condition and the other to the control condition. Fifteen measurements of the force of muscle contraction were taken every thirty seconds on each muscle.
Kristensen M, Hansen T (2004). Statistical analyses of repeated measures in physiological research: a tutorial. Adv Physiol Educ. 28(1-4):2-14.
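Since the two muscles in each pair come from the same rat, the conditions can be compared within rats. The sketch below reduces each muscle's fifteen observations to a mean and applies a paired test. The data are simulated, the simulation does not reproduce the normalisation to the first time point, and the column names are hypothetical.

```r
set.seed(10)
# Simulated stand-in: 7 rats x 2 muscles x 15 observations = 210 rows.
d <- expand.grid(obs = 0:14, condition = c("Placebo", "Pinacidil"), rat = 1:7)
rat_effect <- rnorm(7, sd = 0.05)   # rat-to-rat variation
decline <- ifelse(d$condition == "Pinacidil", 0.03, 0.02)
d$force <- 1 - decline * d$obs + rat_effect[d$rat] +
  rnorm(nrow(d), sd = 0.02)

# One summary value (mean force) per muscle, arranged rat by condition.
m <- tapply(d$force, list(d$rat, d$condition), mean)

# Paired comparison of the two muscles from each rat.
t.test(m[, "Pinacidil"], m[, "Placebo"], paired = TRUE)
```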
Distance travelled by rats in the open field test for a drug and control condition.
A data frame with 282 rows and 4 variables:
Drug or control indicator.
Rat identification number.
Time point of observation, in increments of 15 minutes.
Distance travelled.
Forty-seven rats were randomised to a control or drug condition. Rats were injected with either a drug or saline at time = 0, and locomotor activity in the open field was recorded for 90 minutes. The distance travelled in each 15-minute interval is reported.
A mouse split-plot experiment with two treatments.
A data frame with 24 rows and 5 variables:
Mice were from one of six litters.
Male or Female.
VPA or saline (SAL).
MPEP or saline (SAL).
Locomotor activity.
Six pregnant female mice were randomly assigned to receive an injection of valproic acid (n = 3) or saline (n = 3). The offspring of these mice (n = 24) were then randomly assigned to receive an injection of the glutamate receptor antagonist MPEP (n = 12) or saline (n = 12). There are two levels of randomisation: the pregnant females and their offspring.
This is a subset of the full data set from Mehta et al. (2011), which contained fourteen litters. For illustrative purposes, the litters were selected to give an equal sample size across all conditions. The complete data can be found in the supplementary material of Lazic and Essioux (2013).
Mehta MV, Gandal MJ, Siegel SJ (2011). mGluR5-antagonist mediated reversal of elevated stereotyped, repetitive behaviors in the VPA model of autism. PLoS ONE 6(10):e26077.
Lazic SE, Essioux L (2013). Improving basic and translational science by accounting for litter-to-litter variation in animal models. BMC Neuroscience 14:37.
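A split-plot analysis that respects the two levels of randomisation can be sketched with an `Error()` term in `aov()`. The data below are simulated, assuming four pups per litter with two receiving each injection; the column names (`litter`, `sex`, `vpa`, `mpep`, `activity`) and effect sizes are hypothetical.

```r
set.seed(12)
# Simulated stand-in: six litters, four pups each (24 mice).
d <- expand.grid(mpep = c("SAL", "MPEP"), sex = c("F", "M"), litter = 1:6)
d$litter <- factor(d$litter)

# VPA or saline is applied to the pregnant female, so it is constant
# within a litter: litters 1-3 saline, litters 4-6 VPA.
d$vpa <- factor(ifelse(as.integer(d$litter) <= 3, "SAL", "VPA"))

litter_effect <- rnorm(6, sd = 1)   # litter-to-litter variation
d$activity <- 10 + 2 * (d$vpa == "VPA") - 1 * (d$mpep == "MPEP") +
  litter_effect[as.integer(d$litter)] + rnorm(nrow(d), sd = 1)

# VPA is tested against variation between litters (whole-plot stratum);
# MPEP and the interaction are tested within litters (sub-plot stratum).
fit <- aov(activity ~ vpa * mpep + Error(litter), data = d)
summary(fit)
```

The VPA effect is assessed with six litters as the experimental units, whereas the MPEP effect is assessed with the individual offspring.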