R Markdown is an authoring framework for data science. An R Markdown document can execute code side by side with regular text.
Before we get into R Markdown, we’ll touch on the basics of Markdown. The general idea behind Markdown is to write text that is as easy to read as possible and that can be easily converted into HTML (or other file types).
Markdown is both:
A plain text formatting syntax
A software tool that converts the formatted plain text to HTML
Say we want to write a short shopping list under a heading.
The text below is what you’d write in HTML:
<body>
<h3>Today's shopping list:</h3>
<ul>
<li>Milk</li>
<li>Eggs</li>
<li>Cereal</li>
<li>Fruit</li>
</ul>
</body>
Whereas in Markdown:
### Today's shopping list:
* Milk
* Eggs
* Cereal
* Fruit
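As a rough check of the correspondence between the two forms, the Markdown above can be converted to HTML from R. This is only a sketch: it assumes the commonmark package is installed, which is one of several Markdown converters and not necessarily the one used when knitting an R Markdown file.

```r
# Assumption: the 'commonmark' package is installed
# (install.packages("commonmark")).
library(commonmark)

md <- c(
  "### Today's shopping list:",
  "* Milk",
  "* Eggs",
  "* Cereal",
  "* Fruit"
)

# Collapse the lines into one string and convert it to HTML
html <- markdown_html(paste(md, collapse = "\n"))
cat(html)
```

The printed HTML should contain the same h3, ul and li elements as the hand-written version.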
Markdown is a lot quicker to write and is more human readable. Let’s explore some more Markdown syntax together. We’ll do this in an R Markdown file in RStudio since it’s easy to render to HTML from there.
First, open a new R Markdown file in RStudio: click the File tab, then New File, then R Markdown.
Leave the default output as is (HTML), and choose a title for the new R Markdown file or leave it blank. The new document will already contain some text, which demonstrates the basics of R Markdown. You can knit the new document to take a look at the resulting HTML file. Since RStudio knows that this is an R Markdown file, it will show a little Knit button in the source pane. This will generate an HTML file from the R Markdown file.
For our purposes of working through Markdown, we’re going to clear everything out of the file and save it as demo.Rmd.
We can format the text into a header using the # symbol:
# Time to learn some markdown!
## Time to learn some markdown!
### Time to learn some markdown!
#### Time to learn some markdown!
Click Knit or use rmarkdown::render("demo.Rmd") to examine the resulting HTML.
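The render call can also be scripted. The sketch below assumes the rmarkdown package and pandoc are installed (RStudio bundles pandoc), and it writes a throwaway demo.Rmd into a temporary directory so that no existing file is overwritten.

```r
# Assumption: 'rmarkdown' is installed; pandoc is needed for the render step.
library(rmarkdown)

# Write a minimal document into a temporary directory
rmd_path <- file.path(tempdir(), "demo.Rmd")
writeLines(c(
  "# Time to learn some markdown!",
  "## Time to learn some markdown!"
), rmd_path)

if (pandoc_available()) {
  # render() returns the path of the file it produced
  html_path <- render(rmd_path, output_format = "html_document", quiet = TRUE)
  print(file.exists(html_path))
}
```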
The # symbol is Markdown’s syntax for a header. The number of # symbols determines the level of header produced. The equivalent HTML would be:
<h1> Time to learn some markdown!</h1>
<h2> Time to learn some markdown!</h2>
<h3> Time to learn some markdown!</h3>
<h4> Time to learn some markdown!</h4>
* or _ can be used to add emphasis (italics)
** or __ can be used to bold text
They can also be combined together:
*Time* to learn some markdown!
Time to _learn_ some markdown!
Time to learn **some** markdown!
Time to learn some __markdown!__
Time to learn __*some*__ markdown!
Time to **_learn_** some markdown!
Time ***to*** learn some markdown!
___Time___ to learn some markdown!
When rendered, each line reads "Time to learn some markdown!" with the marked word in italics, bold, or both.
For unordered lists, you can use *, - or +:
* a bullet point
- a bullet point
+ still a bullet point
For ordered lists, use a number followed by a dot (e.g. 1.):
1. First item on our numbered list
2. Second item on our numbered list
To create sub-lists, indent the next list evenly by two or four spaces:
1. First item on our numbered list
    * a bullet point
    - a bullet point
    + still a bullet point
2. Second item on our numbered list
    * a bullet point:
        * Now with a sub-list to our sub-list
        * still with a sub-list to our sub-list
    - a bullet point
    + still a bullet point
To create a link, the syntax is [text](link), e.g.:
[The text shown on the page](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet)
This will produce a link whose visible text is "The text shown on the page".
For an image, the syntax is ![](www.image_url_here.png), e.g.:
![](https://imgs.xkcd.com/comics/eternal_flame.gif)
The take home here is that it’s quick and easy to format your text as you write.
We are now going to work with an R Markdown file.
An R Markdown file contains three things: a YAML header, text formatted with Markdown, and code chunks.
---
title: "Writing documents with R Markdown"
date: "24/10/2017"
output: html_document
---
This is the YAML header for this R Markdown document (YAML originally meant Yet Another Markup Language and now stands for YAML Ain't Markup Language). The header is enclosed by two sets of three dashes (---). This block allows you to fine-tune the output of your document. It’s a set of key: value pairs that describes the file that should be built from the R Markdown file. You can adjust the theme, alter the table of contents, choose the type of file(s) to output (e.g. just HTML, or HTML and PDF at the same time), etc.
At the top of our file, set the YAML header to:
---
title: "Your title here"
date: "Todays date"
output:
  html_document
---
We’ll come back to this, once we’ve got a bit more in our document, to really appreciate the control YAML has over the output of a document.
The full list of YAML header options for an HTML document is given in the R Markdown documentation for html_document.
We’ve covered Markdown text above. For now, delete the multiple headers of Time to learn some markdown! and create a header over the body of the text:
## Markdown Basics
Create subheaders for the different topics (bold and italics, lists, hyperlinks, etc.). At the bottom of the document, add a new header:
## Embedding Code
Code chunks are used to render R (and code from other programming languages!) output into a document. A code chunk delimiter looks like:
```{r}
```
All code falls between the triple backtick marks, e.g.:
```{r}
1+1
```
You can write this manually, but within RStudio there’s a little green Insert button in the source pane that will insert a code chunk when clicked. Also, when working in RStudio, a little green arrow appears at the end of the code block; clicking this will run and evaluate the code.
Inside the curly braces, options can be passed to control the output of the code chunk, the name of the chunk, how it appears, whether it’s evaluated, etc. There’s also a whole range of figure options specifically for configuring the appearance of plots within the document.
10 * 4
## [1] 40
10:15
## [1] 10 11 12 13 14 15
rep(c(1, 2, 3), times = 5)
## [1] 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3
If we just want to show code but not run it, we can add the eval=FALSE option:
10 * 4
10:15
rep(c(1, 2, 3), times = 5)
Or if we just want to show the results but no code, add the echo=FALSE option:
## [1] 40
## [1] 10 11 12 13 14 15
## [1] 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3
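To see the three behaviours side by side without opening a document, the chunks can be knitted from a string. This is a sketch assuming the knitr package is installed; the chunk fences are built with strrep() purely so that they can be written inside an R script.

```r
# Assumption: the 'knitr' package is installed.
library(knitr)

fence <- strrep("`", 3)  # three backticks, the chunk delimiter

# One element per line of a small R Markdown document:
# the same code with default options, eval=FALSE and echo=FALSE
src <- c(
  paste0(fence, "{r}"),              "10 * 4", fence, "",
  paste0(fence, "{r, eval=FALSE}"),  "10 * 4", fence, "",
  paste0(fence, "{r, echo=FALSE}"),  "10 * 4", fence
)

# knit(text = ...) returns the knitted Markdown instead of writing a file
out <- knit(text = src, quiet = TRUE)
cat(out)
```

In the printed Markdown, the first chunk should show both code and result, the second only the code, and the third only the result.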
We can display tables from code blocks:
mtcars
##                      mpg cyl  disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4           21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag       21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710          22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
## Valiant             18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
## Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
## Merc 240D           24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
## Merc 230            22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
## Merc 280            19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
## Merc 280C           17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
## Merc 450SE          16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
## Merc 450SL          17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
## Merc 450SLC         15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
## Cadillac Fleetwood  10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
## Lincoln Continental 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
## Chrysler Imperial   14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
## Fiat 128            32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
## Honda Civic         30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
## Toyota Corolla      33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
## Toyota Corona       21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
## Dodge Challenger    15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2
## AMC Javelin         15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2
## Camaro Z28          13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
## Pontiac Firebird    19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
## Fiat X1-9           27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
## Porsche 914-2       26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
## Lotus Europa        30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
## Ford Pantera L      15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
## Ferrari Dino        19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
## Maserati Bora       15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
## Volvo 142E          21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2
The mtcars dataset is built into R, and we’ll use it as an example dataset. By default it prints as a non-interactive table in the output HTML. There are ways to change this if you want, e.g. adding df_print: paged to the YAML header, which is useful if the table is quite large and can’t fit neatly on the page.
We can also create plots:
plot(mtcars$mpg, mtcars$disp)
We can use figure options to customise the output of the plot, e.g.:
fig.align='center' to set the alignment to the middle of the document
fig.height=8 to set the height of the figure
fig.width=8 to set the width of the figure
fig.cap="Fig 1. Miles per gallon vs displacement" to add a caption describing the plot
plot(mtcars$mpg, mtcars$disp)
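Figure options normally go inside the curly braces of a chunk, e.g. {r, fig.align='center', fig.height=8, fig.width=8}. As a sketch of an alternative that this walkthrough does not otherwise use, the same options can be set once as defaults for every chunk with knitr's opts_chunk, typically from a setup chunk near the top of the document (assuming the knitr package is installed).

```r
# Assumption: the 'knitr' package is installed.
library(knitr)

# Defaults for every later chunk; options written in an individual
# chunk header still take precedence over these.
opts_chunk$set(
  fig.align  = "center",
  fig.height = 8,
  fig.width  = 8
)

# Check what is currently set
opts_chunk$get("fig.height")
```

Captions are usually left in the individual chunk headers, since each plot needs its own fig.cap.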
You’ve written your R Markdown file. But how do you want to present it? Maybe you don’t like the default appearance of the web page. R Markdown has a number of in-built themes you could try out to quickly change the look of a web page.
Our YAML header currently looks something like:
---
title: "Your title here"
date: "Todays date"
output:
  html_document
---
We can add a theme like this:
---
title: "Your title here"
date: "Todays date"
output:
  html_document:
    theme: journal
---
Available theme options: default, cerulean, journal, flatly, readable, spacelab, united, cosmo, lumen, paper, sandstone, simplex, yeti.
We can also alter the appearance of code chunks with the highlight option:
---
title: "Your title here"
date: "Todays date"
output:
  html_document:
    theme: journal
    highlight: tango
---
Available highlight options: default, tango, pygments, kate, monochrome, espresso, zenburn, haddock, textmate.
If you want to create your own styling, you can add your own CSS to the document. We’ll come back to that a little later.
Your document has gotten very large and it’s hard to navigate. You can add a table of contents to aid navigation with the toc option:
---
title: "Your title here"
date: "Todays date"
output:
  html_document:
    theme: journal
    highlight: espresso
    toc: true
---
This will produce a web page with a table of contents built from the headers inside our document. There are a few more things we can do with the table of contents:
---
title: "Your title here"
date: "Todays date"
output:
  html_document:
    theme: journal
    highlight: espresso
    toc: true
    toc_depth: 4
    toc_float: true
---
This will now ensure the table of contents floats on the side of the web page and is always accessible, allowing for easy navigation. By default, the table of contents depth is set to 3, so it includes any H1, H2 and H3 headers. Anything below level 3 is not included (e.g. an H4 or H5 header), but by setting the depth to 4, H4 headers are now included in the table of contents. Or we can go the opposite way and insist that anything below an H2 header shouldn’t be included in the table of contents (toc_depth: 2).
However, our floating toc defaults to collapsing down smaller headers. To prevent this behaviour, we can add collapsed: false:
---
title: "Your title here"
date: "Todays date"
output:
  html_document:
    theme: journal
    highlight: espresso
    toc: true
    toc_depth: 4
    toc_float:
      collapsed: false
---
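Because the header is a set of nested key: value pairs, the indentation decides which option belongs to which key. One way to check how a header is being read is to parse it from R. The sketch below assumes the rmarkdown package is installed and uses a hypothetical file name, header_demo.Rmd, written to a temporary directory.

```r
# Assumption: the 'rmarkdown' package is installed.
library(rmarkdown)

# Hypothetical demo file with a nested YAML header
rmd_path <- file.path(tempdir(), "header_demo.Rmd")
writeLines(c(
  "---",
  'title: "Your title here"',
  "output:",
  "  html_document:",
  "    theme: journal",
  "    toc: true",
  "    toc_float:",
  "      collapsed: false",
  "---",
  "",
  "## Markdown Basics"
), rmd_path)

# yaml_front_matter() parses the header into a nested R list
header <- yaml_front_matter(rmd_path)
str(header$output)

# Nested keys become nested list elements
header$output$html_document$toc_float$collapsed
```

If an option ends up at the wrong level of the printed structure, the indentation in the header is the first thing to check.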
Something has come up and now you need to deliver your material in a different form. We’ve been creating web pages from our R Markdown file so far but now you need a PDF instead.
To change the file format, we change from html_document to pdf_document under the output option:
---
title: "Your title here"
date: "Todays date"
output:
  pdf_document:
    theme: journal
    highlight: espresso
    toc: true
    toc_depth: 4
    toc_float: true
---
And when we click the Knit button, we use the dropdown menu and choose Knit to PDF. Our R Markdown file will now produce a PDF file instead.
Wait, an error has occurred. The message we have is unused argument (theme = "journal"). PDF files can’t take the theme option, and this is what’s causing our error. Not all YAML options are universally applicable: each document type has options that are specific to it, so it’s worth looking at the R Markdown website to see what you can do with them.
The YAML options for a PDF document are listed in the R Markdown documentation for pdf_document.
PDF files can’t take the toc_float: true option either. Therefore we should try:
---
title: "Your title here"
date: "Todays date"
output:
  pdf_document:
    highlight: espresso
    toc: true
    toc_depth: 4
---
We also don’t need to produce just a web page or just a PDF file. We can do both at the same time and specify options for each:
---
title: "Your title here"
date: "Todays date"
output:
  pdf_document:
    highlight: espresso
    toc: true
    toc_depth: 4
  html_document:
    toc: false
    highlight: haddock
    theme: journal
---
This will produce a PDF document with a table of contents and a themed web page without one. The code highlighting will be different for each document.
We still have to separately knit to the type of file we want if we use the Knit button. Or we could use the render function, rmarkdown::render("demo.Rmd", output_format = "all"), and this will render all the specified output types.
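Before rendering everything, it can help to confirm which formats a header actually declares. The sketch below assumes the rmarkdown package is installed and uses a hypothetical file, formats_demo.Rmd, in a temporary directory. It only reads the header, so neither pandoc nor LaTeX is needed for this step.

```r
# Assumption: the 'rmarkdown' package is installed.
library(rmarkdown)

# Hypothetical demo file declaring two output formats
rmd_path <- file.path(tempdir(), "formats_demo.Rmd")
writeLines(c(
  "---",
  'title: "Your title here"',
  "output:",
  "  pdf_document:",
  "    toc: true",
  "  html_document:",
  "    theme: journal",
  "---",
  "",
  "Some text."
), rmd_path)

# Lists the formats that output_format = "all" would render
formats <- all_output_formats(rmd_path)
print(formats)
```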
While R Markdown has a few in-built themes that can be set in the YAML header, a stylesheet can be used to further customise the appearance of the document. We aren’t going to cover how CSS works, but if you are familiar with it, you can either write it within the body of the document or create a CSS file that provides the styles you want.
For example, if you add this in the body of your document, under the YAML block:
<style>
p {
  background: #e6f2ff
}
</style>
That applies a light blue background to all the paragraphs in the document. You could go further and add more styling, but rather than sticking it in the body of the document, it’s better to save the styling to a separate CSS file.
We can write the example styling to a simple CSS file like so:
p {
  background: #e6f2ff
}
Save this as style_demo.css in the same directory as the R Markdown file you are working on. Edit your YAML header to:
---
title: "Demo Markdown Document"
output:
  html_document:
    css: style_demo.css
---
The CSS file is passed in through the YAML header. Depending on how much the CSS file affects, you may need to set the theme and highlight to null to ensure no conflicts arise, e.g.:
output:
  html_document:
    theme: null
    highlight: null
    css: style_demo.css
This should change all paragraphs to have a blue background. (Note that if you want to create a table of contents, you cannot set theme: null.)
It’s also possible to target specific parts of the document from the CSS file by adding ids and classes to section headers in the document.
For example, say we want to style a specific part of our document:
.particular_topic_block {
  margin: 2em;
  padding: 2em;
  border: 1px solid red;
  border-radius: 5px;
  background: #ffffe6;
}
Then, next to one of your headers, add:
## Markdown Basics {.particular_topic_block}
This will apply the styling to this part of the document until the next header.
It isn’t pretty if you’ve left in the p styling, but we hope this demonstrates how you can write your material in R Markdown and then apply styling with an external file. With just one line in the YAML header for each document, this styling can be applied to multiple files.
So you’ve got a bunch of R Markdown documents full of material. You could set the YAML header individually for each one of them, or you could create an _output.yaml file within the directory that holds the R Markdown files. All documents located in the same directory will inherit the YAML options defined in this file as defaults. However, options set in a file’s own YAML header will override the options from the _output.yaml file.
In the _output.yaml file, no delimiters are required:
html_document:
  toc: true
  toc_float: true
  toc_depth: 4
  theme: paper
  highlight: espresso
  css: style_demo.css
Now we don’t even need to set the YAML headers for each file.
In summary with R Markdown we can:
Things we haven’t touched on but think are neat: