R Markdown

R Markdown is an authoring framework for data science. An R Markdown document can execute code side by side with regular text.

Markdown Basics

Before we get into R Markdown, we’ll touch on the basics of Markdown. The general idea behind Markdown is to write text that is as easy to read as possible and that can be easily converted into HTML (or other file types).

Markdown is both:

  1. A plain text formatting syntax

  2. A software tool that converts the formatted plain text to HTML

Say we want to write this:

Today’s shopping list:

  • Milk
  • Eggs
  • Cereal
  • Fruit

The text below is what you’d write in HTML:

<body>
  <h3>Today's shopping list:</h3>
    <ul>
      <li>Milk</li>
      <li>Eggs</li>
      <li>Cereal</li>
      <li>Fruit</li>
    </ul>
</body>

Whereas in Markdown:

### Today's shopping list:

* Milk
* Eggs
* Cereal
* Fruit

Markdown is a lot quicker to write and is more human-readable. Let’s explore some more Markdown syntax together. We’ll do this in an R Markdown file in RStudio, since it’s easy to render to HTML from there.

Firstly, open up an R Markdown file in RStudio.

Click the File tab, New File, then R Markdown.

Leave the default output as is (HTML), and choose a title for the new R Markdown file or leave it blank. The new document will already contain some text that demonstrates the basics of R Markdown. Since RStudio knows that this is an R Markdown file, it shows a little Knit button in the source pane. Clicking it generates an HTML file from the R Markdown file, so you can knit the new document to take a look at the result.

For our purposes of working through Markdown, we’re going to clear everything out of the file and save it as demo.Rmd.

Markdown syntax:

We can format the text into a header using the # symbol:

# Time to learn some markdown!
## Time to learn some markdown!
### Time to learn some markdown!
#### Time to learn some markdown!

Click Knit or use rmarkdown::render("demo.Rmd") to examine the resulting HTML.

(Rendered, these appear as four headers of decreasing size, from level 1 down to level 4.)

The # symbol is Markdown’s syntax for a header. The number of #s determines the header level. The equivalent HTML would be:

<h1> Time to learn some markdown!</h1>
<h2> Time to learn some markdown!</h2>
<h3> Time to learn some markdown!</h3>
<h4> Time to learn some markdown!</h4>

Examples of Markdown syntax - bold and italics:

* or _ can be used to note emphasis

** or __ can be used to bold text

They can also be combined together:

*Time* to learn some markdown!

Time to _learn_ some markdown!

Time to learn **some** markdown!

Time to learn some __markdown!__

Time to learn __*some*__ markdown!

Time to **_learn_** some markdown!

Time ***to*** learn some markdown!

___Time___ to learn some markdown!

(Rendered, the first two lines show one word in italics, the next two show one word in bold, and the last four show one word in bold italics.)

Examples of Markdown syntax - ordered and unordered lists:

For unordered lists, you can use: *, - or +:

* a bullet point
- a bullet point
+ still a bullet point
  • a bullet point
  • a bullet point
  • still a bullet point

For ordered lists, use a number followed by a dot, e.g. 1.:

1. First item on our numbered list
2. Second item on our numbered list
  1. First item on our numbered list
  2. Second item on our numbered list

To create sub-lists, indent the nested list evenly by two or four spaces:

1. First item on our numbered list
    * a bullet point
    - a bullet point
    + still a bullet point
  
2. Second item on our numbered list
    * a bullet point:
        * Now with a sub-list to our sub-list
            * still with a sub-list to our sub-list
    - a bullet point
    + still a bullet point
  1. First item on our numbered list
    • a bullet point
    • a bullet point
    • still a bullet point
  2. Second item on our numbered list
    • a bullet point:
      • Now with a sub-list to our sub-list
        • Continuing to sub-list
    • a bullet point
    • still a bullet point

Anatomy of an R Markdown file

We are now going to work with an R Markdown file.

An R Markdown file contains three things:

1. A YAML header (optional) at the top of the document:

---
title: "Writing documents with R Markdown"
date: "24/10/2017"
output: html_document
---

This is the YAML header for this R Markdown document. (YAML originally stood for Yet Another Markup Language and now stands for YAML Ain't Markup Language.) The header is enclosed by two lines of three dashes (---). This block allows you to fine-tune the output of your document. It’s a set of key: value pairs that describes the file that should be built from the R Markdown file. You can adjust the theme, alter the table of contents, choose the type of file(s) to output (e.g. HTML only, or HTML and PDF at the same time), and so on.

At the top of our file, set the YAML header to:

---
title: "Your title here"
date: "Todays date"
output:
  html_document
---

We’ll come back to this, once we’ve got a bit more in our document, to really appreciate the control YAML has over the output of a document.

The full list of YAML header options for an HTML document is given in the R Markdown documentation for html_document.

2. Markdown text

We’ve covered this above. For now, delete the multiple headers of Time to learn some markdown! and create a header over the body of the text:

## Markdown Basics

Create subheaders for the different topics (bold and italics, lists, hyperlinks, etc.). At the bottom of the document, add a new header:

## Embedding Code

3. Code chunks

Code chunks are used to render R (and code from other programming languages!) output into a document. A code chunk delimiter looks like:

```{r}

```

All code falls between the triple backtick marks, e.g.:

```{r}

1+1

```

You can write this manually, but within RStudio there’s a little green Insert button in the source pane that will insert a code chunk when clicked. Also, when working in RStudio, a little green arrow appears at the end of the code block; clicking it will run and evaluate the code.

Inside the curly braces, options can be passed to control the output of the code chunk, the name of the chunk, how it appears, whether it’s evaluated, etc. There’s also a whole range of figure options specifically for configuring the appearance of plots within the document.

10 * 4
## [1] 40
10:15
## [1] 10 11 12 13 14 15
rep(c(1, 2, 3), times = 5)
##  [1] 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3

If we just want to show code but not run it, we can add the eval=FALSE option.

10 * 4

10:15

rep(c(1, 2, 3), times = 5)

Or if we just want to show the results but no code, add the echo=FALSE option:

## [1] 40
## [1] 10 11 12 13 14 15
##  [1] 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3
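
The chunk headers themselves are not visible in the rendered examples above. A minimal sketch of how the two options might be written inside the curly braces follows; the chunk names show-only and results-only are placeholders chosen for illustration.

````markdown
```{r show-only, eval=FALSE}
# Shown in the output document, but not run
10 * 4
```

```{r results-only, echo=FALSE}
# Run, but only the result appears in the output document
10 * 4
```
````

The first word after r is the chunk name, and the options follow it, separated by commas.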

We can display tables from code blocks:

mtcars
##                      mpg cyl  disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4           21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag       21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710          22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
## Valiant             18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
## Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
## Merc 240D           24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
## Merc 230            22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
## Merc 280            19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
## Merc 280C           17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
## Merc 450SE          16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
## Merc 450SL          17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
## Merc 450SLC         15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
## Cadillac Fleetwood  10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
## Lincoln Continental 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
## Chrysler Imperial   14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
## Fiat 128            32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
## Honda Civic         30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
## Toyota Corolla      33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
## Toyota Corona       21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
## Dodge Challenger    15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2
## AMC Javelin         15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2
## Camaro Z28          13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
## Pontiac Firebird    19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
## Fiat X1-9           27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
## Porsche 914-2       26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
## Lotus Europa        30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
## Ford Pantera L      15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
## Ferrari Dino        19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
## Maserati Bora       15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
## Volvo 142E          21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2

The mtcars dataset is built into R, and we’ll use it as an example dataset. Here it prints as a non-interactive table in the output HTML. There are ways to change this if you want, e.g. adding df_print: paged to the YAML header, which is useful if the table is quite large and can’t fit neatly on the page.
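
As a sketch of the df_print option mentioned above, assuming the html_document output used throughout this tutorial (the title is a placeholder), the header could look like:

```yaml
---
title: "Your title here"
output:
  html_document:
    df_print: paged
---
```

With this in place, printing mtcars from a chunk should give a paged table rather than the full console-style listing.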

We can also create plots:

plot(mtcars$mpg, mtcars$disp)

We can use figure options to customise the output of the plot, e.g:

  • fig.align='center' to set the alignment to the middle of the document
  • fig.height=8 to set the height of the figure
  • fig.width=8 to set the width of the figure
  • fig.cap="Fig 1. Miles per gallon vs displacement" to add a caption describing the plot

plot(mtcars$mpg, mtcars$disp)

Fig 1. Miles per gallon vs displacement
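
As a sketch, the four figure options listed above could be combined in a single chunk header like this; the chunk name mpg-vs-disp is a placeholder.

````markdown
```{r mpg-vs-disp, fig.align='center', fig.height=8, fig.width=8, fig.cap="Fig 1. Miles per gallon vs displacement"}
# Scatter plot of miles per gallon against engine displacement
plot(mtcars$mpg, mtcars$disp)
```
````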

More on YAML

Themes

You’ve written your R Markdown file. But how do you want to present it? Maybe you don’t like the default appearance of the web page. R Markdown has a number of in-built themes you could try out to quickly change the look of a web page.

Our YAML header currently looks something like:

---
title: "Your title here"
date: "Todays date"
output: 
  html_document
---

We can add a theme like this:

---
title: "Your title here"
date: "Todays date"
output: 
  html_document:
    theme: journal
---

Other theme options: default, cerulean, journal, flatly, readable, spacelab, united, cosmo, lumen, paper, sandstone, simplex, yeti.

We can also alter the appearance of code chunks with the highlight option:

---
title: "Your title here"
date: "Todays date"
output: 
  html_document:
    theme: journal
    highlight: tango
---

Other highlight options: default, tango, pygments, kate, monochrome, espresso, zenburn, haddock, textmate.

If you want to create your own styling, you can add your own CSS to the document. We’ll come back to that a little later.

Table of contents

Your document has gotten very large and it’s hard to navigate. You can add a table of contents to aid in navigating with toc:

---
title: "Your title here"
date: "Todays date"
output: 
  html_document:
    theme: journal
    highlight: espresso
    toc: true
---

This will produce a web page with a table of contents using the headers inside our document. There are a few more things we can do with the table of contents:

---
title: "Your title here"
date: "Todays date"
output: 
  html_document:
    theme: journal
    highlight: espresso
    toc: true
    toc_depth: 4
    toc_float: true
---

This ensures the table of contents floats on the side of the web page and is always accessible, allowing for easy navigation. By default, the table of contents depth is set to 3, so it includes any H1, H2 and H3 headers. Anything deeper than that (e.g. an H4 or H5 header) is not included, but by setting the depth to 4, H4 headers are included in the table of contents as well. We can also go the opposite way and insist that anything below an H2 header shouldn’t be included in the table of contents (toc_depth: 2).

However, our floating toc defaults to collapsing down smaller headers. To prevent this behaviour, we can set collapsed: false:

---
title: "Your title here"
date: "Todays date"
output: 
  html_document:
    theme: journal
    highlight: espresso
    toc: true
    toc_depth: 4
    toc_float:
      collapsed: false
---

Document output format

Something has come up and now you need to deliver your material in a different form. We’ve been creating web pages from our R Markdown file so far but now you need a PDF instead.

To change the file format, we change from html_document to pdf_document under the output option:

---
title: "Your title here"
date: "Todays date"
output: 
  pdf_document:
    theme: journal
    highlight: espresso
    toc: true
    toc_depth: 4
    toc_float: true
---

And when we click the Knit button, we use the dropdown menu, Knit to PDF. Our R Markdown file will now produce a PDF file instead.

Wait - an error has occurred. The message we get is unused argument (theme = "journal"). PDF files can’t take the theme option, and this is what’s causing our error. Not all YAML options are universally applicable; each document type has its own specific options, so it’s worth looking at the R Markdown website to see what you can do with them.

For the YAML options for a PDF document, see the pdf_document reference in the R Markdown documentation.

PDF files can’t take toc_float: true either. Therefore we should try:

---
title: "Your title here"
date: "Todays date"
output: 
  pdf_document:
    highlight: espresso
    toc: true
    toc_depth: 4
---

We also don’t need to produce just a web page or just a PDF file. We can do both at the same time and specify options for each:

---
title: "Your title here"
date: "Todays date"
output: 
  pdf_document:
    highlight: espresso
    toc: true
    toc_depth: 4
  html_document:
    toc: false
    highlight: haddock
    theme: journal
---

This will produce a pdf document with a table of contents and a themed web page without one. The highlights will be different for each document.

We still have to separately knit to the type of file we want if we use the Knit button. Alternatively, we could use the render function, rmarkdown::render("demo.Rmd", output_format = "all"), which will render all the specified output types.
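
A minimal sketch of the render call, assuming the rmarkdown package and pandoc are available. To keep it self-contained, it writes a throwaway file (render_demo.Rmd, a made-up name) and lists md_document in place of pdf_document so that no LaTeX installation is needed; with demo.Rmd the call has the same shape.

```r
# Build a throwaway R Markdown file so the sketch runs on its own.
# The file name and contents are illustrative, not part of the tutorial.
rmd_path <- file.path(tempdir(), "render_demo.Rmd")
writeLines(c(
  "---",
  "title: \"Render demo\"",
  "output:",
  "  html_document:",
  "    toc: false",
  "  md_document: default",
  "---",
  "",
  "## A header",
  "",
  "Some text."
), rmd_path)

# output_format = "all" renders every format listed under `output:`.
# render() returns the paths of the files it produced.
outputs <- rmarkdown::render(rmd_path, output_format = "all", quiet = TRUE)
print(basename(outputs))
```

Keeping the returned paths in a variable makes it easy to check or open the generated files afterwards.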

Styling and Extending YAML for multiple files

Styling

While R Markdown has a few in-built themes that can be set in the YAML header to customise the appearance of a document, a stylesheet can be used to customise it further. We aren’t going to cover how CSS works, but if you are familiar with it, you can either write it within the body of the document or create a CSS file to provide the styles you want.

For example, if you add this in the body of your document, under the YAML block:

<style>
p {
  background: #e6f2ff
}
</style>

That applies a light blue background to all the paragraphs in the document. You could go further and add more styling, but rather than sticking it in the body of the document, it’s better to save the styling to a separate CSS file.

We can write the example styling to a simple CSS file like so:

p {
  background: #e6f2ff
}

Save this as style_demo.css in the same directory as the R Markdown file you are working on. Edit your YAML header to:

---
title: "Demo Markdown Document"
output:
  html_document:
    css: style_demo.css
---

This passes the file to the document through the YAML header. Depending on how much the CSS file affects, you may need to set theme and highlight to null to ensure no conflicts arise, e.g.:

output:
  html_document:
    theme: null
    highlight: null
    css: style_demo.css

This should give all paragraphs a light blue background. (Note that if you want to create a table of contents, you cannot set theme: null.)

It’s also possible to target specific parts of the document from the CSS file by adding ids and classes to section headers in the document.

For example, say we want to style a specific part of our document:

.particular_topic_block {
    margin: 2em;
    padding: 2em;
    border: 1px solid red;
    border-radius: 5px;
    background: #ffffe6;
}

Then, next to one of your headers, add:


## Markdown Basics {.particular_topic_block}

This will apply the styling to this part of the document until the next header.

It isn’t pretty if you’ve left in the p styling but we hope this demonstrates how you can write your material in R Markdown and then apply styling with an external file. With just one line in the YAML header for each document, this styling can be applied to multiple files.

YAML for multiple documents

So you’ve got a bunch of R Markdown documents full of material. You could set the YAML header individually for each one of them, or you could create an _output.yaml file within the directory that holds the R Markdown files. All documents located in the same directory will inherit the YAML options defined in this file as defaults. However, specifically setting the YAML header of a file will override the options from the _output.yaml file.

In the _output.yaml file, no delimiters are required:

html_document:
  toc: true
  toc_float: true
  toc_depth: 4
  theme: paper
  highlight: espresso
  css: style_demo.css

Now we don’t even need to set the YAML headers for each file.

In summary with R Markdown we can:

  • Write plain text that converts to HTML
  • Easily embed R code into our document
  • Describe how we want our document built with YAML
  • Choose how to style our documents with a CSS file
  • Use a YAML file to control the output of multiple R Markdown files

Extended Topics

Things we haven’t touched on but think are neat:

