University of Minnesota, Twin Cities School of Statistics Charlie's Home Page Stat 3701 Reproducibility Page
Note added in 2020: This web page is old, written before the R packages knitr and rmarkdown even existed. An updated treatment (more or less), covering those as well as the R function Sweave, is on the Stat 3701 Reproducibility Page. So you probably want to look there.
But for historical reasons we leave the old web page that was always here still here.
This web page provides an illustration or three of Sweave, which is "literate programming" for R or (a newer buzzword) a package for "reproducible research".
Sweave is an R function now available by default. Its author is Friedrich Leisch, and its web site is just a directory, not a web page, wherein one can find the manual, the FAQ, and three papers about it.
The term "literate programming" was coined by Donald Knuth, one of the true geniuses of computing, the author/inventor of TeX, among other things.
The basic idea is that
Some web sites describing this are
The term "reproducible research" is newer, and I don't know who coined it. The basic idea is simple. It's the scientific ideal.
"page pressure" and no room for full explanations.
"page pressure" there!
"supplementary materials" on the internet, either at the journal's or the author's web site. It doesn't matter so long as the material is permanently available. Data, computer programs, whatever should be there.
But even more, the entire analysis should be reproducible. In real science, this is hard. Redoing all the chemistry, or all the field work, or whatever is asking a lot.
But in mathematical and computing sciences, like statistics, reproducibility is perfectly possible. It only takes will and knowledge to do it.
Some web sites describing this are
www.reproducibleresearch.org
This one is just a demo.
The LaTeX (PDF) shows what one can do.
The source file (foo.Rnw) shows how to do it.
This example is a little unusual because it shows a lot of Sweave code literally in the LaTeX portion of the document (so you can see what it looks like), but that's not what you usually want to do.
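For readers who want to see the mechanics before opening foo.Rnw, here is a minimal sketch in R. It writes a tiny Sweave source file to a temporary directory and weaves it into LaTeX. The file name tiny.Rnw, the chunk label, and the file contents are made up for illustration; they are not taken from foo.Rnw.

```r
# Hypothetical miniature Sweave source; NOT the contents of foo.Rnw.
# Text outside <<...>>= ... @ is LaTeX; text inside is R code.
rnw_lines <- c(
  "\\documentclass{article}",
  "\\begin{document}",
  "The mean of the numbers 1 through 10 is \\Sexpr{mean(1:10)}.",
  "<<summary-chunk>>=",
  "x <- 1:10",
  "summary(x)",
  "@",
  "\\end{document}"
)

work_dir <- tempdir()
rnw_file <- file.path(work_dir, "tiny.Rnw")
tex_file <- file.path(work_dir, "tiny.tex")
writeLines(rnw_lines, rnw_file)

# Weave: run the R chunk and write a LaTeX file in which the code,
# its printed output, and the \Sexpr value have been filled in.
utils::Sweave(rnw_file, output = tex_file, quiet = TRUE)

tex_lines <- readLines(tex_file)
cat(tex_lines, sep = "\n")
```

The resulting tiny.tex is ordinary LaTeX. The R code and its output sit in Sinput and Soutput environments, which Sweave.sty defines.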
The following examples are more typical. They don't explain Sweave; they just do a job.
A package vignette is a Sweave file that illustrates the use of the package. Because it is Sweave, it is non-bogus. The code actually works. We know it works because it worked to produce the LaTeX output!
For the official poop on vignettes see the relevant section in the Writing R Extensions book.
The day I heard Robert Gentleman talk about this, I went home and wrote
the following vignette for an (already written) package.
The LaTeX (PDF) shows what one can do.
The source file (demo.Rnw) shows how to do it.
One new bit about vignettes. They have a line like
% \VignetteIndexEntry{MCMC Example}
(taken from this example), which is just a LaTeX comment but is used to tell R that this file is a vignette and what title to give it in the index. So the regular documentation mentions the vignette (click on overview to see it).
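One way to see where the index entry ends up is to ask R for the vignette index of an installed package. This is only a sketch. The package "utils" is used because it ships with R, not because it is the package discussed on this page, and as far as I know the Title column is filled in from each vignette's \VignetteIndexEntry line.

```r
# List the vignettes of one installed package.
# "utils" is an assumption chosen for convenience (it ships with R).
vig <- utils::vignette(package = "utils")

# vig$results is a character matrix; keep the vignette name and its title.
index <- as.data.frame(vig$results[, c("Item", "Title"), drop = FALSE],
                       stringsAsFactors = FALSE)
print(index)
```

Replacing "utils" with the name of any other installed package lists that package's vignettes instead. A package with no vignettes gives an empty table.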
A paper on the theory in Yun Ju Sung's thesis, Monte Carlo Likelihood Inference for Missing Data Models (preprint), contained computing examples. Every number in the paper and every plot was taken (by cut-and-paste, I must admit) from a "supplementary materials" document done in Sweave.
The LaTeX (PDF) shows what one can do.
The source file (examples.Rnw) shows how to do it.
Sweave.sty
In recent versions of R, the way to run Sweave from the command line,
R CMD Sweave foo.Rnw
latex foo
latex foo
(or pdflatex instead of latex), no longer works unless you have the line
export SWEAVE_STYLEPATH_DEFAULT="TRUE"
in your .bashrc file. Then it does work.
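To see what that setting changes, one can weave a small file from inside R with and without the stylepath argument of the RweaveLatex driver (documented in ?RweaveLatex) and compare the \usepackage line that Sweave inserts. This is a sketch under that assumption. The file name stylepath-demo.Rnw and the helper weave_preamble are hypothetical names invented for the illustration.

```r
# Hypothetical miniature Sweave source, used only to inspect the preamble.
rnw_lines <- c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<>>=",
  "1 + 1",
  "@",
  "\\end{document}"
)
work_dir <- tempdir()
rnw_file <- file.path(work_dir, "stylepath-demo.Rnw")
writeLines(rnw_lines, rnw_file)

# Hypothetical helper: weave with the given stylepath setting and
# return the \usepackage line(s) found in the generated LaTeX.
weave_preamble <- function(stylepath) {
  tex_file <- tempfile(tmpdir = work_dir, fileext = ".tex")
  utils::Sweave(rnw_file, output = tex_file, quiet = TRUE,
                stylepath = stylepath)
  grep("usepackage", readLines(tex_file), value = TRUE, fixed = TRUE)
}

plain_line <- weave_preamble(FALSE)  # names Sweave.sty only; LaTeX must find it
full_line  <- weave_preamble(TRUE)   # full path to the copy shipped with R
print(plain_line)
print(full_line)
```

With stylepath = FALSE, LaTeX has to locate Sweave.sty on its own search path, which is why a bare latex foo can fail. As I read ?RweaveLatex, setting the environment variable SWEAVE_STYLEPATH_DEFAULT to "TRUE" simply changes the default of this argument.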
Also the following works without this environment variable set. Inside R do
Sweave("foo.Rnw")
library(tools)
texi2dvi("foo.tex")
with the latter replaced by
texi2dvi("foo.tex", pdf = TRUE)
if you want PDF rather than DVI output. By the way, the texi2dvi command automagically runs latex or pdflatex multiple times until all the cross references are right.