One or multiple level 1 headings in PDF?

30th October 2019 | by Ted Page

There are two schools of thought regarding whether you should use one or multiple level 1 headings (<H1>s) in a PDF. This article examines both sides of the argument and concludes that sometimes it makes sense to use multiple <H1>s, but that most of the time using only one is better for accessibility.

The argument in a nutshell

This whole discussion boils down to how you tag a document’s title. This article advocates (usually) tagging a document title as an <H1>, with each subsequent main section heading being an <H2>.

On the other side of this debate are those who argue that a document’s title should always be contained in a <Title> tag, with each main section of the document headed by an <H1>. Because <Title> is not a standard structure type in PDF 1.7, it carries no semantics of its own and is role-mapped to a standard type, <P>, which makes it effectively just a block of ordinary text. So what exactly is the reasoning behind this latter approach?
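As a rough illustration of what that mapping means in practice, the sketch below models a role map as a plain Python dictionary. The names `STANDARD_TYPES`, `resolve_role` and `role_map` are invented for this example, and the set of standard types is deliberately abbreviated; a real PDF stores its role map in the structure tree root, and this is only a simplified model of how a custom tag resolves.

```python
# Illustrative model only: a role map is treated as a plain dict from
# custom tag names to the tag names they stand in for.
STANDARD_TYPES = {"P", "H1", "H2", "H3", "H4", "H5", "H6"}  # abbreviated subset


def resolve_role(tag, role_map):
    """Follow role-map entries until a standard structure type is reached."""
    seen = set()
    while tag not in STANDARD_TYPES:
        if tag in seen or tag not in role_map:
            raise ValueError(f"{tag!r} does not resolve to a standard type")
        seen.add(tag)
        tag = role_map[tag]
    return tag


role_map = {"Title": "P"}               # the mapping described above
print(resolve_role("Title", role_map))  # P
```

Under this model, assistive technology that asks "what is this tag?" gets the answer "a paragraph", which is why a title tagged this way does not show up in heading navigation.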

The table of contents argument

One argument against using an <H1> for the document title is that tables of contents (TOCs) are typically built from headings. As titles don’t usually appear in TOCs, the argument goes, you shouldn’t use <H1>s to tag them. However, this is a very weak argument, because authoring tools such as Microsoft Word and InDesign let you leave the <H1> out of a TOC.

Microsoft Word TOCs

In Microsoft Word it is easy to exclude the <H1> when creating a TOC. To do so, in the table of contents dialogue box, click the Options button. Scroll down the list of available styles until you find the headings section, then delete the “1” from the Heading 1 field. When you generate the TOC, it will no longer include the <H1>.

[Image: screenshot of the table of contents options dialogue box with Heading 1 deselected]
Figure 1: Excluding Heading 1 from an MS Word-generated table of contents

InDesign TOCs

In InDesign you have to specify which headings to include in a TOC. If you don’t want the <H1> in the TOC, just don’t include it. It really is that simple.
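Conceptually, both tools are doing the same thing: building the TOC from a chosen set of heading levels. The sketch below is not how Word or InDesign work internally; it is a minimal Python illustration, with a hypothetical `build_toc` function and made-up heading text, of why excluding level 1 is trivial.

```python
def build_toc(headings, levels=(2, 3)):
    """Return indented TOC lines for headings whose level is in `levels`."""
    top = min(levels)
    return ["  " * (level - top) + text
            for level, text in headings
            if level in levels]


# Hypothetical document outline as (heading level, text) pairs.
headings = [
    (1, "Annual report"),
    (2, "Introduction"),
    (3, "Scope"),
    (2, "Findings"),
]

for line in build_toc(headings):
    print(line)
```

The title is still a level 1 heading in the document; it is simply never selected for the TOC.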

Not enough heading levels

Another argument is that in large PDFs you can run out of heading levels. This is true: there are only six (<H1> to <H6>), and in a large, complex document you could arguably benefit from the additional level gained by not using an <H1> for the title.

But this is an editorial call that depends entirely on the particular document in question. Only a relatively small number of PDFs are so complex that they need all six levels for their section headings, and there’s no logical reason why authors of the majority of shorter, simpler documents should be forced to adopt this approach. In most cases you will gain nothing, and you will lose the benefits of the single <H1> approach (see below).
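The arithmetic behind this editorial call is simple enough to sketch. The function names below are hypothetical, and `section_depth` is an assumed way of counting: main sections are depth 1, their subsections depth 2, and so on.

```python
MAX_HEADING_LEVEL = 6  # H1 to H6


def deepest_heading_level(section_depth, title_is_h1=True):
    """Heading level needed for the most deeply nested section.

    section_depth counts nesting below the title: main sections are 1,
    their subsections are 2, and so on.
    """
    return section_depth + 1 if title_is_h1 else section_depth


def fits(section_depth, title_is_h1=True):
    """True if the deepest section still gets a heading level of 6 or less."""
    return deepest_heading_level(section_depth, title_is_h1) <= MAX_HEADING_LEVEL


print(fits(4))                     # True: sections use H2 to H5
print(fits(6))                     # False: the deepest sections need a seventh level
print(fits(6, title_is_h1=False))  # True: sections use H1 to H6
```

Only a document with six levels of nested sections falls into the middle case, which is the one situation where giving up the title’s <H1> buys anything.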

Why the title should be tagged as an H1

A document title is, editorially and semantically, indistinguishable from a heading: it labels or describes the block of content that it heads. Tagging the title as an <H1> tells a screen reader user that this is the top-level label for the block of content that follows, which, in the case of a title, typically means the whole document. It also enables a screen reader user to press “1” from anywhere within the document to navigate to and read the title.
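To make the navigation point concrete, here is a small Python model of a “jump to the next level 1 heading” key. It is only a sketch: `next_heading` and the two tag sequences are invented for illustration, and wrapping round to the start of the document is an assumption (screen readers differ, and the behaviour is often configurable).

```python
def next_heading(tags, position, level):
    """Index of the next tag named H<level> after `position`, wrapping round.

    Returns None if the document has no heading at that level.
    """
    target = f"H{level}"
    count = len(tags)
    for step in range(1, count + 1):
        index = (position + step) % count
        if tags[index] == target:
            return index
    return None


# Hypothetical tag sequences for the same short document.
title_as_h1 = ["H1", "P", "H2", "P", "H2", "P"]
title_as_p = ["P", "P", "H1", "P", "H1", "P"]

print(next_heading(title_as_h1, 3, 1))  # 0: lands on the title
print(next_heading(title_as_p, 3, 1))   # 4: lands on a section heading
```

In the second sequence the title sits at index 0 as ordinary text, so no heading key will ever land on it.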

Those favouring multiple <H1>s as standard argue that the reader can simply navigate to the top of the document (Ctrl + Home) to find the title content, and from its position, deduce that it is the title. It really is hard to see how this approach is better than explicitly tagging it as the document’s overall label.

Alternatively, those favouring multiple <H1>s argue that the reader can interrogate the metadata title to get the same information. This assumes that a screen reader user knows how to do so, which is by no means as safe an assumption as that they will know how to navigate via headings. And even if they do, the metadata title does not always give the same information…

More than one document title?

Take, for example, a GCSE languages exam paper. A Chinese paper will typically have the first half of the document in Simplified Chinese and the second half (usually starting on page 17) repeating the same content in Traditional Chinese. In such a document it makes perfect sense to have two titles, one on page 1 and one on page 17, each of them an <H1>.

The existence of such structures (the above is but one example) demolishes the arguments that, in order to read the title, you can just interrogate the metadata or navigate to the top of the document. In this example, the document has (and can only have) one metadata title, but two completely different “on-the-page” titles, one of which is on page 17.
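A small Python sketch shows the mismatch. Everything in it is hypothetical: the `on_page_titles` function, the tuple layout and the placeholder title text are invented for illustration and are not taken from a real exam paper.

```python
def on_page_titles(tags):
    """Return (page, text) for every H1 in reading order."""
    return [(page, text) for name, page, text in tags if name == "H1"]


# Hypothetical tag list as (tag name, page number, text) triples.
metadata_title = "GCSE Chinese exam paper"  # a document has only one of these
tags = [
    ("H1", 1, "Exam paper (Simplified Chinese)"),
    ("H2", 1, "Section A"),
    ("H1", 17, "Exam paper (Traditional Chinese)"),
    ("H2", 17, "Section A"),
]

for page, text in on_page_titles(tags):
    print(f"page {page}: {text}")
```

Listing the <H1>s finds both titles, including the one on page 17, whereas the single metadata string cannot represent either of them exactly.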

Conclusion

As can be seen, the arguments for requiring all PDFs to use a <Title> tag mapped to <P> for their titles don’t stand up to a great deal of scrutiny. However, it should be noted that when the next version of PDF (version 2.0) eventually finds its way into everyday usage, this argument will change. This is because PDF 2.0 defines <Title> as a standard structure type in its own right, so a title tagged that way no longer has to be mapped to <P>. Until then, most of the time, PDFs with a single <H1> will serve the end user better.