This section is aiming for a happy medium of flexibility for producing training material. The approach is inspired by how Software Carpentry structure their git repositories, however by using the rmarkdown
package we can keep things simple. If you want to go simpler still, rmarkdown
also has a render_site
function which can build a whole website at once, which might be sufficient to your needs.
We will first look at the rmarkdown
package, then how to use this package from a Makefile to automate building of .html files from .Rmd files.
Under the hood, when you press the “Knit” button in RStudio the rmarkdown
package’s render
function is used. We can also use this package from the R console.
rmarkdown::render("foo.Rmd")
The rmarkdown
package in turn uses the knitr
package to actually run R code, and external program pandoc
to create HTML. When creating a PDF, pandoc
in turn uses pdflatex
.
We can run this from a shell using:
Rscript -e 'rmarkdown::render("foo.Rmd")'
A programmer’s natural instinct will be to place this in a “Makefile” so that the make
command can make the HTML document whenever we need to.
Create a file called Makefile
(in RStudio, select “File/New File/Text File”, and save it with the name Makefile
). Enter the following in the file:
foo.html : foo.Rmd
Rscript -e 'rmarkdown::render("foo.Rmd")'
Important: The indentation in a Makefile is important, and it must be a tab character. You can check this from R with the command readLines("Makefile")
– you should see the indent shown as a \t
escape sequence.
We have described how to create the file foo.html
from the input file foo.Rmd
. Multiple input files can be given if necessary, separated by spaces.
Now in a shell, type:
make foo.html
Nothing happens, as the .html
file was created after out last edit of the .Rmd
. Make thinks the file is up to date, and does not run the command.
Edit the .Rmd
file, and again run make foo.html
in the shell. This time the command should run.
If you just type make
on its own, the first rule in the Makefile
is used.
Add some code to foo.Rmd
that deliberately contains an error. What happens when you try to make
it?
Create a second R Markdown file, bar.Rmd
. We could add a second rule to the Makefile to convert this into HTML, but this becomes repetitive. A better option is the use a “pattern rule”:
%.html : %.Rmd
Rscript -e 'rmarkdown::render("$<")'
Note here:
%
symbol in the input and output filenames indicates this is a pattern rule.$<
to insert the input filename into the command. More on variables later.Now to render our second file we can type:
make bar.html
Consider this Makefile:
all : foo.html bar.html
echo All files are now up to date
%.html : %.Rmd
Rscript -e 'rmarkdown::render("$<")'
The all
rule depends on some files, so make
will ensure these are up to date.
The command echo
in the rule just prints out a message (this can even be left out entirely). make
does not check that a file called all
was actually created.
The all
rule is the first rule in the Makefile
, so it is the default rule that is run when you type make
.
Try editing one of your Rmarkdown files and running make
again. Notice only the command to build that file’s corresponding HTML file is run.
It’s common practice to also have a rule to clean away all the files that have been created. This gives a way to test everything works from a blank slate.
all : foo.html bar.html
echo All files are now up to date
clean :
rm -f foo.html bar.html
%.html : %.Rmd
Rscript -e 'rmarkdown::render("$<")'
Now we can type make clean
to clean up, and make clean ; make
will test that everything can be built without errors.
A final element of a Makefile is use of variables. With this, you should be able to understand most of what is going on in Makefiles you encounter in the wild.
In our last Makefile, we typed foo.html bar.html
twice. It would be nice to avoid this repetition. This can be achieved with a variable:
HTML_FILES=foo.html bar.html
all : $(HTML_FILES)
echo All files are now up to date
clean :
rm -f $(HTML_FILES)
%.html : %.Rmd
Rscript -e 'rmarkdown::render("$<")'
Create a file index.Rmd
, with hyperlinks to foo.html
and bar.html
. Recall that hyperlinks can be created in markdown with [link text](destination.html)
.
Adjust your Makefile so this file is also built into an HTML file.
Now you have a website. To test your website, from the “Files” pane in RStudio, click on index.html
and then “View in Web Browser”.
You could make this public for example by using GitHub Pages, or by running your own webserver (such as Apache or nginx).
The discussion above has focussed on R Markdown. In Python, a similar role is played by Jupyter notebooks. These are created in a web-based editor, which can be started by typing jupyter notebook
in the shell. It is also possible to run a multi-user Jupyter Hub on a server. Notebooks are stored in JSON format, and have extension .ipynb
.
Suppose you have created a notebook called my_notebook.ipynb
. Then the following shell command can be used to run the code in it and convert the notebook to HTML:
jupyter nbconvert --execute --to html my_notebook.ipynb
One difference between .Rmd
and .ipynb
files is that .ipynb
files also contain cached output of code chunks. Specifying the --execute
option here ensures all of the code in the notebook is run afresh. Using this command in a Makefile would ensure the build will halt with an error if there are any problems with the code.
make clean ; make
can be used to test all of the code in our tutorial. We will see an error message and the process will abort immediately if anything goes wrong as the HTML files are built. This is also a handy way to test that a training server has all the necessary R or Python packages installed!
You might want to add some further rules, for example:
rmarkdown
or knitr
and then using pandoc
to create a PDF using LaTeX.If someone wants to adapt our tutorial, they can easily do so. Having made the changes they want, the HTML files can be re-built with a single command. Give your notes an appropriate licence, such as a Creative Commons license, so that people know they can do this without having to ask your permission.
Setting up tutorial notes in this way gives us desirable features of open science. We have left nothing out or the code will not run. They are easily reproducable and sharable. They are written plain text formats that anyone can edit, and built with tools that are all freely available. They can serve as a base that anyone who wants to can build on.