The basic syntax of YAML is to use key-value pairs in the format key: value. A YAML code block should be fenced in with --- before and after (you can also use ... to end the YAML block, but this is not very common in R Markdown).
In R, the equivalent structure is a list with a named character vector: list(author = "Malcolm Barrett"). In fact, you can call this list in R Markdown using the metadata object; in this case, metadata$author will return "Malcolm Barrett".
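As a minimal sketch, the YAML block that corresponds to list(author = "Malcolm Barrett") would look something like this, with the --- fences on their own lines:

```yaml
---
author: Malcolm Barrett
---
```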
In YAML, spaces are used to indicate nesting. When we want to specify the output function pdf_document(toc = TRUE), we need to nest it under the output field. We also need to nest toc under pdf_document so that it gets passed to that function correctly.
In R, the equivalent structure is a nested list, each with a name: list(output = list(pdf_document = list(toc = TRUE))). Similarly, you can call this in R Markdown using the metadata object, e.g. metadata$output$pdf_document$toc. The hierarchical structure (which you can see with draw_yml_tree()) looks like this:
└── output:
    └── pdf_document:
        └── toc: true
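Written out as YAML, the nesting for pdf_document(toc = TRUE) can be sketched as follows. The two-space indent is an assumption; any consistent number of spaces expresses the same nesting:

```yaml
---
output:
  pdf_document:
    toc: true
---
```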
Without the extra indents, YAML doesn’t know toc is connected to pdf_document and thinks the value of pdf_document is NULL. YAML in which toc is not indented under pdf_document has a hierarchy that looks like this:
├── output:
│   └── pdf_document: null
└── toc: true
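The mis-nested case can be reconstructed from that hierarchy. This sketch is inferred from the tree rather than copied from the source: toc sits at the top level, so pdf_document is left with no value.

```yaml
---
output:
  pdf_document:
toc: true
---
```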
If you use output functions without additional arguments, the value of output can simply be the name of the function.
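For example, a sketch of the short form, using pdf_document as the output function:

```yaml
---
output: pdf_document
---
```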
However, if you’re specifying more than one output type, you must use the nesting syntax. If you don’t want to include additional arguments, use "default" as the function’s value.
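A sketch with two output types and no extra arguments might look like this. The choice of html_document as the second format is only for illustration:

```yaml
---
output:
  pdf_document: default
  html_document: default
---
```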
Some YAML fields take unnamed vectors as their value. You can specify an element of the vector by adding a new line and - (note that the values do not need to be indented below the field name, such as category).
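As a sketch, a category field holding two values could be written like this, with the values taken from the R equivalent and each one on its own line after a -:

```yaml
---
category:
- R
- Reproducible Research
---
```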
In R, the equivalent structure is a list with a named vector: list(category = c("R", "Reproducible Research")). metadata$category will return c("R", "Reproducible Research"). Another way to specify vectors is to use [] with each object separated by a comma, as in the syntax for c(). This form is equivalent to listing each element on its own line with -.
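A sketch of the bracket form, using the same two values:

```yaml
---
category: [R, Reproducible Research]
---
```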
By default, ymlthis uses the - syntax for vectors.
The - syntax is also used to group elements together. For instance, in the params field for parameterized reports, we group parameter information together by using -. The first line is the name and value of the parameter, while all the lines until the next - are extra information about the parameter. While you can use metadata to call objects in params, params has its own object you can call directly: params$a and params$data will return the values of a and data.
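The grouping described here can be sketched as follows. The block is reconstructed from the parameter names and values that appear in this section (a, data, input), so treat the exact layout as an assumption:

```yaml
---
params:
- a: 1.0
  input: numeric
- data: data.csv
  input: text
---
```

Each - starts a new group: the first line names the parameter and gives its value, and the indented line below it belongs to the same parameter.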
In R, the equivalent structure is a nested list that contains a list of unnamed lists: list(params = list(list(a = 1, input = "numeric"), list(data = "data.csv", input = "text"))). The inner-most lists group items together, e.g. list(a = 1, input = "numeric") groups a and input.
└── params:
    ├── a: 1.0
    └── input: numeric
    ├── data: data.csv
    └── input: text
You may have noticed that strings in YAML don’t always need to be quoted. However, it can be useful to explicitly wrap strings in quotes when they contain special characters like : and @.
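A short sketch of quoting, using made-up values: the title contains a colon followed by a space, and the author value starts with @, both of which can confuse a YAML parser when left unquoted.

```yaml
---
title: "R Markdown: a hypothetical title"
author: "@example_handle"
---
```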
R code can be written as inline expressions `r expr`. yml_code() will capture R code for you and put it in a valid format. R code in params needs to be slightly different: use !r (e.g. !r expr) to call an R object.
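Both forms can be sketched side by side. The field names and the use of Sys.Date() are illustrative choices, and the params field is written here in the plain nested form. The inline expression is wrapped in single quotes because a YAML value cannot start with a bare backtick.

```yaml
---
date: '`r format(Sys.Date())`'
params:
  date: !r Sys.Date()
---
```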
Logical values in YAML are unusual: true/false, yes/no, and on/off are all equivalent to TRUE/FALSE in R. Setting toc to true, yes, or on turns on the table of contents in each case.
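As a sketch, assuming the pdf_document output used earlier, the table of contents could be switched on like this:

```yaml
---
output:
  pdf_document:
    toc: true  # toc: yes and toc: on are read the same way
---
```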
By default, ymlthis uses true/false. If you want to use any of these values literally (e.g. you want a string equal to "yes"), you need to wrap them in quotation marks.
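For instance, in this sketch a hypothetical parameter x keeps the string "yes" rather than being converted to TRUE:

```yaml
---
params:
  x: "yes"
---
```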
NULL can be specified using null or ~. By default, ymlthis uses null. If you want to specify an empty vector, use [], e.g. category: []. For an empty string, just use empty quotation marks ("").
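The three cases can be sketched together. The field names are chosen only for illustration:

```yaml
---
bibliography: null  # NULL; ~ is read the same way
category: []        # empty vector
subtitle: ""        # empty string
---
```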
Where do the YAML fields you use in R Markdown come from? Many YAML fields that we use come from Pandoc or from the rmarkdown package. These both use YAML to specify the build of the document and to pass information to be printed in a template. Pandoc templates can also be customized to add new YAML. The most common sources of YAML fields include Pandoc, the rmarkdown package, output functions (such as rmarkdown::pdf_document()), and custom Pandoc templates.
Because YAML is an extensible approach to metadata, there is often no way to validate that your YAML is correct. YAML will often fail silently if you, for instance, make a typo in the field name or misspecify the nesting between fields. For more information on the fields available in R Markdown and friends, see the YAML Fieldguide.