Understanding auto-tagging of PDF documents

Posted by: GrackleDocs on February 22, 2023

When it comes to accessible PDFs, you hear a lot about documents being tagged. Because PDF remediation is time-consuming, it’s tempting to auto-tag a PDF instead. However, auto-tagging won’t make your PDF accessible the way you might think.

Let’s dive in and take a closer look at this.

What’s a tagged PDF?

Let’s start with the basics: what is a tagged PDF? Essentially, a tagged PDF is a PDF that has something called a tags tree, similar to the element structure of HTML, that defines the structure of the document.
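
To make the idea of a tags tree more concrete, here is a minimal sketch in Python. It is only an illustration of the nesting, not how a PDF actually stores its tags, and it doesn’t read a real file. The `Tag` class, the `outline` function, and the sample content are all hypothetical names made up for this example; the tag names (`Document`, `H1`, `P`, `Figure`) are standard PDF structure types.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Tag:
    """One node in a simplified tags tree (hypothetical model, not a PDF parser)."""
    kind: str                                   # structure type, e.g. "H1", "P", "Figure"
    text: str = ""                              # text content or alt text, if any
    children: list[Tag] = field(default_factory=list)


def outline(tag: Tag, depth: int = 0) -> list[str]:
    """Return the tree as indented lines, one per tag, in reading order."""
    label = f"<{tag.kind}>" + (f" {tag.text}" if tag.text else "")
    lines = ["  " * depth + label]
    for child in tag.children:
        lines.extend(outline(child, depth + 1))
    return lines


# Made-up sample document: a heading, a paragraph, and an image with alt text.
doc = Tag("Document", children=[
    Tag("H1", "Annual report"),
    Tag("P", "Intro paragraph."),
    Tag("Figure", "Alt text: company logo"),
])

print("\n".join(outline(doc)))
```

Assistive technology walks a structure like this in order, which is why the type and position of every tag matters.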

Why is tagging a PDF important?

When you properly tag a PDF, you’re providing the document with structure. This structure, in turn, becomes accessibility markup that, when properly applied, helps optimize the reading and usability experience for people using assistive technology, like screen readers.

What does it mean to auto-tag PDF documents?

In Adobe Acrobat, there’s a feature that allows you to auto-tag a document that doesn’t already have a tags tree associated with it. However, Acrobat wasn’t designed to be an accessibility tool; it’s a publishing tool. That means its auto-tagging tools aren’t advanced enough to tag a document so that it is fully accessible.

Can PDF documents be automatically tagged for accessibility?

The short answer is no, you can’t automatically tag PDF documents for accessibility.

While auto-tagging can be used as a starting point, it doesn’t fully remediate your document. You still need to fix it manually to ensure the tags are accurate and to resolve any errors in the document.

For example, Acrobat can’t interpret variations in styling throughout the document, so it may tag differently styled elements with the same structure type even though they play different roles.

The Adobe AutoTag feature is just the start of a longer process to meet any of the accessibility standards, including Section 508, WCAG 2.1, and PDF/UA.

Additionally, Acrobat’s Accessibility Checker only looks for the basic accessibility checkpoints within a document, like:

  • Images having alternative text
  • Heading levels for navigation
  • Document properties
  • And other basic settings

It doesn’t actually certify that files are fully accessible and meet the official standards.

Understanding Adobe Acrobat’s tag tree after auto-tagging

We’ve already flagged that using Acrobat’s auto-tagging feature won’t provide you with an accessible PDF. But why? In theory, it’s providing the structure required for a document to have some navigation. So, why isn’t that enough?

Well, because Acrobat isn’t inherently an accessibility tool, it doesn’t always tag content correctly. It makes assumptions within its auto-tagging, which can leave the document inaccessible.

Text can be lumped in with an image if one sits close to it. You see this a lot with letterhead, where the text within the letterhead is combined with the logo, so the whole thing ends up in a <Figure> tag instead of a <P> tag.

Additionally, some text boxes get tagged as images instead of the text itself being tagged properly. Since Adobe’s AutoTag function doesn’t have optical character recognition (OCR) capabilities, auto-tagging can’t fix this on its own. Someone has to go in and fix it manually.

Another issue relates to headings within the document. Since Acrobat’s AutoTag feature can’t discern what level a heading should be, it just assumes, and its assumptions are typically incorrect. For example, based on size, it may make something an <H3> tag instead of an <H1> tag.
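
A remediator can catch some of these heading mistakes with a simple structural check. The sketch below is a hypothetical helper, not an Acrobat feature: it assumes you already have the tag names from the tags tree in reading order, and it applies the common convention that the first heading is an <H1> and that levels are not skipped on the way down. It can only flag an inconsistent sequence; deciding what level a heading should really be still takes a person.

```python
import re


def heading_problems(tags: list[str]) -> list[str]:
    """Flag a first heading that isn't H1 and any jump of more than one level."""
    problems = []
    previous = 0  # level of the last heading seen; 0 means none yet
    for tag in tags:
        match = re.fullmatch(r"H([1-6])", tag)
        if not match:
            continue  # not a heading tag, so nothing to check
        level = int(match.group(1))
        if previous == 0:
            if level != 1:
                problems.append(f"first heading is {tag}, expected H1")
        elif level > previous + 1:
            problems.append(f"{tag} follows H{previous}, skipping a level")
        previous = level
    return problems


# Made-up tag sequence, as it might come out of an auto-tagged document.
for problem in heading_problems(["H3", "P", "H1", "P", "H3"]):
    print(problem)
```

An empty result only means the sequence is consistent, not that every heading has the right level.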

Lists are another problem area for auto-tagging. For example, even when a list has bullets, the bullets themselves are typically not tagged properly, so assistive technology can’t use them. Someone using assistive technology is then given a string of words without understanding how they relate, because the list tagging is incorrect.
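
For reference, a well-tagged bulleted list nests a list tag (<L>) around list items (<LI>), each holding a label (<Lbl>) for the bullet and a body (<LBody>) for the text. The sketch below checks a simplified version of that rule on a tree written as plain Python dictionaries. The `list_problems` function and the dictionary layout are assumptions made for this example, and the rule is stricter than the PDF specification, which treats <Lbl> as optional and allows a few other children inside <L>.

```python
def list_problems(node: dict, path: str = "") -> list[str]:
    """Walk a simplified tags tree and report badly structured lists."""
    here = f"{path}/{node['kind']}"
    kids = node.get("children", [])
    kinds = [kid["kind"] for kid in kids]
    problems = []
    # Simplified rule (assumption): an <L> holds only <LI> children.
    if node["kind"] == "L" and any(kind != "LI" for kind in kinds):
        problems.append(f"{here}: children should all be LI, found {kinds}")
    # Simplified rule (assumption): every <LI> has a bullet label and a body.
    if node["kind"] == "LI":
        for required in ("Lbl", "LBody"):
            if required not in kinds:
                problems.append(f"{here}: missing {required}")
    for kid in kids:
        problems.extend(list_problems(kid, here))
    return problems


# Made-up example: a stray paragraph inside the list and an untagged bullet.
auto_tagged = {"kind": "L", "children": [
    {"kind": "P"},
    {"kind": "LI", "children": [{"kind": "LBody"}]},
]}

for problem in list_problems(auto_tagged):
    print(problem)
```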

These are just a few of the most common issues that occur with auto-tagging in Adobe Acrobat. But as mentioned earlier, it can be a starting point for remediators when they’re working on a document. It just can’t be the only step.

PDF tagging software

While Adobe Acrobat’s auto-tagging feature won’t give you a fully accessible and compliant PDF, the program itself can help you achieve that outcome with manual remediation.

Using Adobe Acrobat Pro, users can follow various guides to create accessible documents and remediate any issues that come up. They can test for issues using ADScan, which gives users an Accessibility Detail Report and points out any accessibility issues within the document.

Not sure where to begin with tagging? AbleDocs’ ADTraining can provide you with one-on-one or group training opportunities to walk you and your team through making a PDF accessible, including testing and fixing issues that are flagged.

If you’re using Adobe InDesign to create your documents, you can use the MadeToTag plug-in to prepare your InDesign documents for export as accessible and tagged PDF files.
