PDF/A-3 format in the e-invoice

When discussing electronic invoice formats in Saudi Arabia, the term PDF/A-3 keeps coming up as one of the two officially approved formats for issuing and storing invoices. Yet many developers and accountants confuse it with the ordinary PDF file they are used to creating from word-processing software. The difference is fundamental, and it determines whether the invoice is accepted or rejected by the Zakat, Tax and Customs Authority (ZATCA).

This technical guide explains the PDF/A-3 format on its own: what it is, why the Authority chose it for human-readable invoices in business-to-business (B2B) transactions, how it stores the XML copy inside itself, what its long-term archiving properties are, and where it differs from a traditional PDF file. As for the details of the embedding mechanism itself (how the XML file is injected into the file structure at the byte level), we cover those in a separate guide titled “Embedding XML inside PDF/A-3” so that this page stays focused on the format itself.

What exactly is the PDF/A-3 format?

PDF/A is a family of PDF standards designed specifically for long-term archiving; the letter “A” stands for Archival. The family is published by the International Organization for Standardization (ISO) as ISO 19005, and three of its parts matter for this discussion:

  • PDF/A-1: released in 2005 as ISO 19005-1. It is the most restrictive, and it does not allow attaching any files inside the document.
  • PDF/A-2: released in 2011 as ISO 19005-2. It added support for JPEG 2000 compression and transparency, and permitted attaching files on the condition that they themselves be PDF/A-compliant.
  • PDF/A-3: released in 2012 as ISO 19005-3. This is the version the Authority adopted, because it allows embedding any type of file inside the document, including an XML file that is not PDF/A-compliant.

The decisive difference between PDF/A-3 and its predecessors is this last point. The first and second versions either prohibit attachment or restrict it to archival files only. PDF/A-3, on the other hand, opens the door to embedding a structured data file such as XML alongside the visual page. This capability specifically is what made it the logical choice for the hybrid electronic invoice, where you need a human-readable copy and a machine-readable copy in a single file.
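As a compact restatement of the attachment rules described above, the following sketch encodes them in a small lookup table. The names `ATTACHMENT_RULES` and `can_embed_xml` are illustrative only and do not come from any library.

```python
# Attachment rules per PDF/A part, as summarised in the list above.
# The rule labels ("none", "pdfa-only", "any") are illustrative names.
ATTACHMENT_RULES = {
    "PDF/A-1": "none",       # no embedded files at all
    "PDF/A-2": "pdfa-only",  # attachments must themselves be PDF/A
    "PDF/A-3": "any",        # arbitrary file types, including XML
}


def can_embed_xml(part: str) -> bool:
    """Return True if the given PDF/A part may carry an XML attachment."""
    return ATTACHMENT_RULES[part] == "any"


for part in ATTACHMENT_RULES:
    verdict = "XML allowed" if can_embed_xml(part) else "XML not allowed"
    print(part, "->", verdict)
```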

Why did the Authority choose PDF/A-3 for human-readable invoices?

The Saudi e-invoicing regulation requires every invoice to be issued in a structured digital format. The Authority specified two acceptable formats: either a pure XML file, or a PDF/A-3 file that carries the XML copy inside it. The second option is aimed primarily at business-to-business (B2B) transaction invoices, where the receiving party often needs to see the invoice in a comprehensible visual form, not just lines of code.

Imagine a supplier sending its client a pure XML invoice. XML is structured text that systems understand, but it is not readable to the procurement employee who opens the file. When the invoice is sent in PDF/A-3 format, however, the employee sees a clear invoice page showing the supplier’s name, VAT number, line items, amounts, and QR code. At the same time, the client’s accounting system can extract the embedded XML file and process it automatically without human intervention. This is how a single format bridges two seemingly conflicting needs.

The Authority chose this format because it is a well-established international standard recognized in many tax systems around the world. Germany’s ZUGFeRD standard and France’s Factur-X standard adopt the same principle: a PDF/A-3 file carrying embedded XML. This means the software infrastructure for generating and reading these files already exists and is proven, and companies do not need to invent a new local format from scratch.

There is another practical reason behind this choice. An invoice is a legal document that remains binding for years, and it may be requested during a tax audit or a commercial dispute. Had the Authority adopted a free visual format without constraints, it would have been impossible to guarantee that the invoice would be readable a decade later in the same form it was issued. The PDF/A-3 format settles this concern, because it is designed from the ground up to freeze appearance and content together. What you see today is what the auditor will see tomorrow, without depending on a particular software version or a font installed on a device.

This format also balances two interests that may appear opposed. Accounting systems want structured data they can process automatically with speed and accuracy. Employees and auditors want a document they can read with their own eyes and print when needed. Had the Authority imposed pure XML on everyone, it would have overburdened small businesses that lack systems to convert XML into a visual display. And had it imposed a visual PDF only, it would have deprived systems of structured data. The solution is for a single format to carry both sides, so each party benefits from the layer that suits it.

When to use PDF/A-3 and when pure XML is enough

The practical distinction is simple at its core:

  • The standard tax invoice (B2B) is usually sent to the buyer after its clearance by the Authority, which makes the PDF/A-3 format suitable because it carries the visual representation alongside the structured data.
  • The simplified tax invoice (B2C) is handed directly to the consumer and reported to the Authority within 24 hours. Here a receipt carrying a QR code is often enough, without requiring a full PDF/A-3 file.

A rule of thumb: the more the counterparty needs to read the invoice visually and process it automatically at the same time, the more appropriate PDF/A-3 becomes.

The PDF/A family versions at a glance

The archiving standard evolved until it reached support for attachments:

  • PDF/A-1 (2005): basic archiving, no attachment
  • PDF/A-2 (2011): newer features
  • PDF/A-3 (2012): allows attaching files such as XML

The Authority chose PDF/A-3 for its ability to embed an XML file inside the document.

The difference between PDF/A-3 and an ordinary PDF file

The most common mistake among those starting to deal with e-invoicing is assuming that any PDF file serves the purpose. The ordinary PDF file you produce from a word processor or a virtual printer is not accepted by the Authority, even if it looks identical to the invoice in appearance. The technical reasons are clear:

1. The file’s self-containment

An ordinary PDF file may rely on fonts installed on the issuer’s device, on external links, or on elements not saved inside the file itself. Years later the file may be opened on another device, and the fonts disappear or the alignment shifts. PDF/A-3, by contrast, requires the file to be self-contained: all fonts embedded inside it, and all colors defined with device-independent color spaces. The goal is for the file to display after ten years exactly as it displayed the day it was issued.

2. Preventing dynamic content

An ordinary PDF file allows dynamic elements such as embedded JavaScript, multimedia, interactive forms, and encrypted content. These elements pose a risk to archiving, because their behavior may change or stop as software evolves. The PDF/A-3 format explicitly prohibits these elements. The archival invoice must be static, changing nothing and executing no code when opened.

3. The embedded file (XML)

Functionally, this is the most important difference. An ordinary PDF file does not necessarily carry a structured copy of the data suitable for automated processing. A PDF/A-3 invoice, on the other hand, carries an attached XML file identified by a specific relationship, so that any system reading the file knows this attachment is the structured representation of the visual invoice. How the attachment is injected and how its relationship is defined are explained in the dedicated guide “Embedding XML inside PDF/A-3.”

4. Compliance with a strict ISO standard

Any file claiming to be PDF/A-3 must pass validation against the clauses of the ISO 19005-3 standard. There are tools such as veraPDF that inspect the file and confirm its compliance. An ordinary PDF file is not required to pass this validation, and so it may contain violations that do not show visually but strip it of its archival status.

The practical takeaway: PDF/A-3 is not a “prettier kind” of PDF; it is a file governed by constraints that make it valid for legal archiving and a carrier of structured data at the same time. An ordinary PDF file guarantees neither of the two conditions.
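The bans on dynamic content and encryption described above lend themselves to a rough pre-check before a file is sent to a real validator. The sketch below scans the raw bytes for a short, hypothetical list of dictionary keys that usually signal prohibited content. It is only a heuristic under stated assumptions: keys hidden inside compressed object streams will be missed, and text that merely contains these strings will cause false positives, so it is no substitute for validation against ISO 19005-3.

```python
import re

# Hypothetical shortlist of PDF keys that usually signal content PDF/A forbids.
SUSPICIOUS_KEYS = {
    b"/JavaScript": "JavaScript action type",
    b"/JS": "JavaScript source entry",
    b"/Encrypt": "encryption dictionary",
    b"/Launch": "launch action",
}


def scan_for_prohibited(pdf_bytes: bytes) -> list[str]:
    """Return labels for suspicious keys found in the uncompressed bytes."""
    findings = []
    for key, label in SUSPICIOUS_KEYS.items():
        # Match the key as a whole PDF name, not as a prefix of a longer name.
        if re.search(re.escape(key) + rb"(?![A-Za-z0-9])", pdf_bytes):
            findings.append(label)
    return findings


sample = b"1 0 obj << /Type /Catalog /OpenAction << /S /JavaScript /JS (x) >> >> endobj"
print(scan_for_prohibited(sample))
```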

Ordinary PDF versus PDF/A-3 across four criteria

What distinguishes the PDF/A-3 archival version from an ordinary PDF:

Criterion            | Ordinary PDF                    | PDF/A-3
Self-containment     | May rely on external resources  | Self-contained
XML embedding        | Not guaranteed                  | Officially supported
Governing standard   | Not defined for archiving       | ISO 19005-3
Long-term archiving  | Not guaranteed                  | Guaranteed

PDF/A-3 is designed to remain readable and archival for years.

Long-term archiving properties in PDF/A-3

The original purpose of the entire PDF/A family is archiving, that is, ensuring the document remains readable and valid for decades to come. This serves a core requirement in e-invoicing: retaining invoices for long periods for tax-audit purposes. The following are the properties that achieve this.

Full font embedding

The standard requires embedding all fonts used inside the file itself. Relying on a font installed on the operating system is not permitted. If the file is opened on a device that does not have that font, the text must still appear in its correct form. This protects the invoice from “appearance collapse” over time.
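To make the font requirement concrete, here is a minimal sketch that looks for font descriptors lacking an embedded font program. In a PDF, an embedded font appears as a `/FontFile`, `/FontFile2`, or `/FontFile3` entry in the font descriptor dictionary. The function name and the byte-level regular expressions are assumptions made for illustration; the approach only sees uncompressed objects and can be confused by binary stream data, so treat it as a teaching aid rather than a validator.

```python
import re

# Matches "<number> <generation> obj ... endobj" in uncompressed PDF bytes.
OBJ_RE = re.compile(rb"(\d+) \d+ obj(.*?)endobj", re.S)


def unembedded_font_descriptors(pdf_bytes: bytes) -> list[int]:
    """Return object numbers of font descriptors with no font program entry."""
    missing = []
    for match in OBJ_RE.finditer(pdf_bytes):
        number, body = int(match.group(1)), match.group(2)
        is_descriptor = re.search(rb"/Type\s*/FontDescriptor", body)
        has_program = re.search(rb"/FontFile[23]?(?![A-Za-z0-9])", body)
        if is_descriptor and not has_program:
            missing.append(number)
    return missing


sample = (
    b"4 0 obj << /Type /FontDescriptor /FontName /Arial >> endobj\n"
    b"5 0 obj << /Type /FontDescriptor /FontName /ABCDEF+Noto /FontFile2 6 0 R >> endobj\n"
)
print(unembedded_font_descriptors(sample))  # [4]
```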

Device-independent color spaces

Colors in PDF/A-3 are defined in a way that does not depend on a particular screen or a particular printer, by embedding a color profile (ICC Profile). The goal is for the same color to look the same on any display medium, so the red distinguishing a certain amount does not turn into a different color that confuses the reader years later.

Metadata in XMP format

The file carries its metadata in a standard format known as XMP, embedded within the file structure. This data includes information about the document, its creation date, and the PDF/A version it follows. The presence of this data in a unified format makes it easier for archiving systems to index and search files years later.
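As an illustration of how the PDF/A version is declared, the sketch below parses a reduced XMP fragment and reads the `pdfaid:part` and `pdfaid:conformance` properties. The sample packet is hand-written for this example and a real packet carries many more properties. The helper `read_pdfa_claim` is hypothetical, and it only handles the element form of these properties, not the attribute form that some producers write.

```python
import xml.etree.ElementTree as ET

# Illustrative XMP fragment; a real packet contains additional properties.
XMP_SAMPLE = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
      <pdfaid:part>3</pdfaid:part>
      <pdfaid:conformance>B</pdfaid:conformance>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""

PDFAID = "{http://www.aiim.org/pdfa/ns/id/}"


def read_pdfa_claim(xmp_text: str) -> tuple[str, str]:
    """Return the (part, conformance) pair declared in the XMP metadata."""
    root = ET.fromstring(xmp_text)
    part = root.find(f".//{PDFAID}part")
    conformance = root.find(f".//{PDFAID}conformance")
    if part is None or conformance is None:
        raise ValueError("no PDF/A identification found in XMP")
    return part.text.strip(), conformance.text.strip()


print(read_pdfa_claim(XMP_SAMPLE))  # ('3', 'B')
```

Note that this declaration is only a claim made by the file; whether the file actually conforms is a separate question answered by validation.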

Preventing encryption and external restrictions

A PDF/A-3 file may not be encrypted with a password or protected by restrictions that prevent it from being opened in the future. The reasoning is logical: if the encryption key is lost after years, the file becomes useless. Archiving means permanent availability, not withholding.

Stability of the file structure itself

Alongside the above, the standard prohibits any reliance on features that may disappear as the PDF specification evolves. Using properties specific to a particular reader version is not allowed, nor are experimental, undocumented features. Everything in the file must be built on stable elements clearly defined in the specification. The result is a file that any compliant reader opens, today and years from now, without error messages or missing content.

These properties combined make a PDF/A-3 file a stable document unaffected by changes in software environments. This is exactly what an invoice needs when it may be required to be presented before the Authority years after its issuance. And because record-retention requirements in Saudi Arabia extend for years, a format designed for long-term archiving is not a luxury but a necessity that intersects directly with the business’s tax obligation.

Archiving properties within a single file

What makes the file valid for long-term retention:

  • Fonts embedded inside the file
  • ICC color profiles embedded
  • No reliance on external resources or software
  • The invoice XML file embedded inside it

These properties ensure the invoice opens in the same form years later.

The conceptual structure of a PDF/A-3 invoice

To bring the picture closer for the developer, a PDF/A-3 invoice can be viewed as two intertwined layers inside a single file:

  • The visual layer: the PDF pages a human sees, containing the invoice header, seller and buyer data, the line-item table, the totals, and the QR code.
  • The structured layer: an embedded XML file carrying the same data in a structured format that systems understand, according to the invoice data model approved by the Authority.

The two layers must match. Any discrepancy between what the eye sees in the visual layer and what the machine reads in the structured layer is considered a violation. This is why generating the file is not done manually, but through an accounting system that ensures the visual data is derived from the same source that generates the XML.

To illustrate the danger of a mismatch, imagine an invoice showing a total of SAR 1,150 in its visual layer, while the embedded XML file carries a value of SAR 1,510 due to a programming error in the template. The employee reading the page sees one number, and the other party’s system records a different number, and a gap arises that may turn into a dispute or a violation on audit. This is exactly what the matching rule prevents: a single data source feeding both layers together, so there is no room for them to diverge.

This principle explains why it is always recommended that the accounting system generate both layers in a single synchronized operation. When the display page is generated from one source, and the XML file is generated from another source or in a separate later step, the likelihood of divergence between them increases. Simultaneous generation from a unified data model, by contrast, ensures that the number printed is the number sent, letter for letter.
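The single-source principle can be sketched as follows: one `Invoice` object feeds both the text printed on the page and the XML, and a simple check compares the two totals. This is a minimal sketch under stated assumptions. The element names `Invoice`, `ID`, and `PayableAmount` are simplified placeholders rather than the Authority's UBL-based schema, and every function name here is hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal
import xml.etree.ElementTree as ET


@dataclass(frozen=True)
class Invoice:
    number: str
    total: Decimal  # payable amount in SAR


def render_visual_total(inv: Invoice) -> str:
    """Text the visual layer would print for the total."""
    return f"Total: SAR {inv.total:,.2f}"


def render_xml(inv: Invoice) -> str:
    """Simplified XML; element names are placeholders, not the real schema."""
    root = ET.Element("Invoice")
    ET.SubElement(root, "ID").text = inv.number
    ET.SubElement(root, "PayableAmount", currencyID="SAR").text = f"{inv.total:.2f}"
    return ET.tostring(root, encoding="unicode")


def layers_match(visual: str, xml_text: str) -> bool:
    """Compare the total shown on the page with the total carried in the XML."""
    xml_total = Decimal(ET.fromstring(xml_text).findtext("PayableAmount"))
    visual_total = Decimal(visual.split("SAR")[1].replace(",", "").strip())
    return xml_total == visual_total


inv = Invoice(number="INV-001", total=Decimal("1150.00"))
print(render_visual_total(inv))  # Total: SAR 1,150.00
print(layers_match(render_visual_total(inv), render_xml(inv)))  # True
```

Because both renderers read the same `inv.total`, the SAR 1,150 versus SAR 1,510 mismatch described above cannot arise from the data itself, only from a bug in one of the renderers, which the final check would catch.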

Below is a highly simplified illustration of how a PDF/A-3 file is structured internally at the object level. It is for educational purposes only and does not represent an actual file byte for byte:

%PDF-1.7
% Simplified structure of a PDF/A-3 file carrying an invoice

1 0 obj
  << /Type /Catalog
     /Pages 2 0 R
     /Metadata 8 0 R        % XMP metadata
     /Names 9 0 R           % name tree of embedded files
  >>
endobj

2 0 obj
  << /Type /Pages /Kids [3 0 R] /Count 1 >>   % the visual layer
endobj

7 0 obj
  << /Type /EmbeddedFile
     /Subtype /text#2Fxml          % the embedded XML file (the structured layer)
     /Params << /Size 4096 >>
  >>
stream
  ... invoice content in XML format ...
endstream
endobj

Object number 7 in the example above is the embedded XML file, and it is the heart of the difference between an ordinary PDF and a PDF/A-3 in the invoicing context. Note its type EmbeddedFile and its subtype text/xml. As for the way this object is linked to the document and declared as an alternative representation of the invoice, that is the subject of the separate embedding guide.
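The subtype is written as `/text#2Fxml` rather than `/text/xml` because a slash is a delimiter in PDF syntax, so inside a name it must be escaped as `#` followed by its two-digit hexadecimal code. The following sketch shows that escaping rule; the helper name `pdf_name` is an assumption made for illustration.

```python
def pdf_name(value: str) -> str:
    """Write a string as a PDF name object, escaping special bytes as #XX."""
    delimiters = b"()<>[]{}/%#"  # PDF delimiter characters plus '#' itself
    out = ["/"]
    for byte in value.encode("utf-8"):
        # Escape whitespace, control and non-ASCII bytes, and delimiters.
        if byte < 0x21 or byte > 0x7E or byte in delimiters:
            out.append(f"#{byte:02X}")
        else:
            out.append(chr(byte))
    return "".join(out)


print(pdf_name("text/xml"))      # /text#2Fxml
print(pdf_name("EmbeddedFile"))  # /EmbeddedFile
```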

Verifying the file’s compliance with the standard

Before you can be confident that your file is genuinely a valid PDF/A-3, it must be run through a validation tool. The best-known open tool is veraPDF, the reference adopted by the PDF Association for verifying PDF/A file conformance. The tool inspects the file and produces a report of every violation.

Here is an example of invoking the tool from the command line to verify a file’s compliance with PDF/A-3B level:

# Verify an invoice file against PDF/A-3 level B
verapdf --flavour 3b invoice-2026-001.pdf

# Output the report in a machine-processable format
verapdf --flavour 3b --format mrr invoice-2026-001.pdf > report.xml

Level “B” (short for Basic) ensures the file is reliably renderable visually. There is a higher level known as “A” (short for Accessible) that adds requirements related to accessibility and logical text structure. In the invoicing context, level B is usually enough, because the goal is display stability and carrying structured data, not screen reading for the visually impaired.
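For teams that want to run this check from a build or test script, the command shown earlier can be wrapped in a few lines of Python. This is a sketch that assumes the `verapdf` executable is installed and on the PATH. It uses only the `--flavour` and `--format` options already shown, and it returns the report text without interpreting it, since the report layout is not covered in this guide. The function names are hypothetical.

```python
import shutil
import subprocess


def build_verapdf_command(pdf_path: str, flavour: str = "3b",
                          report_format: str = "mrr") -> list[str]:
    """Build the same command line shown above as an argument list."""
    return ["verapdf", "--flavour", flavour, "--format", report_format, pdf_path]


def run_verapdf(pdf_path: str) -> str:
    """Run veraPDF and return its report text; requires verapdf on PATH."""
    if shutil.which("verapdf") is None:
        raise RuntimeError("verapdf executable not found on PATH")
    result = subprocess.run(build_verapdf_command(pdf_path),
                            capture_output=True, text=True, check=False)
    return result.stdout


print(build_verapdf_command("invoice-2026-001.pdf"))
```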

If the report shows violations, the common causes are: a font not fully embedded, a device-dependent color space, or the presence of prohibited content such as JavaScript. Addressing these violations is a condition for the invoice’s acceptance, and so the accounting system handles them automatically instead of leaving them to the developer manually.

It is useful for the developer to understand why these violations recur specifically. An unembedded font usually arises when the invoice is generated from a template that assumes a certain font exists on the operating system without actually embedding it in the file. A device-dependent color space appears when colors are specified directly in RGB values without linking them to an independent color profile. Prohibited content, meanwhile, sometimes creeps in from PDF-generation libraries that add interactive layers by default without the developer noticing.

These fine details are what make manual generation fraught with risk in a production environment. The file may look perfectly sound when opened, and print flawlessly, yet fail validation over a violation invisible to the eye. And because the Authority accepts only a compliant file, any silent flaw of this kind may disrupt the entire invoicing cycle. That is why it is preferable to rely on a tested system that guarantees compliance on every invoice, rather than on local generation code reviewed manually.

PDF/A-3 in relation to international standards for the hybrid invoice

The Authority’s choice of the PDF/A-3 format did not come from nowhere; it rests on a well-established international practice known as the “hybrid invoice.” The core idea is that a single file carries two layers: a visual representation read by a human, and a structured representation processed by a machine. European tax systems preceded Saudi Arabia in adopting this approach, which makes the software infrastructure for generating these files mature and proven.

The two most prominent international standards that adopt PDF/A-3 as a container for the hybrid invoice:

  • ZUGFeRD: the German standard that embeds an XML file inside a PDF/A-3 file, used widely in business transactions within Germany.
  • Factur-X: the French standard compatible with ZUGFeRD, adopting the same principle: a PDF/A-3 file carrying embedded XML with a unified data model.

The common denominator among these standards is the separation of the “container” from the “content.” The container is always a PDF/A-3 file because it guarantees long-term archiving and display stability. The structured content, meanwhile, differs in its data model from one tax system to another. Saudi Arabia adopts its own data model built on the UBL 2.1 standard, while the European standards adopt other models. But the container is the same in both cases, which is what makes a developer coming from a European background find themselves on familiar ground when dealing with the Authority’s requirements.

The practical lesson for the developer here is twofold. First, that open ecosystem tools (PDF/A-3 generation libraries and validation tools such as veraPDF) are usable in the Saudi context, because the container is globally standard. Second, that what actually changes is the structure of the embedded XML file and its data model, not the container file format. This is why the embedding mechanism itself (explained in a separate guide) remains similar across these standards, while what is placed inside the attachment differs.

Performance and file-size considerations when generating PDF/A-3

Compliance with PDF/A-3 requirements has a direct impact on file size and generation performance, a point developers often overlook until they run into it in production. The first reason is full font embedding. An ordinary PDF file can make do with a reference to a font installed on the system, whereas PDF/A-3 requires the font to be carried inside the file itself, which adds a fixed overhead to every invoice.

To reduce this impact without violating the standard, mature systems resort to a technique called “font subsetting.” Instead of embedding the entire font with thousands of glyphs, only the set of glyphs actually used in the invoice is embedded: the digits and the Arabic and Latin letters appearing in the line items. This preserves the self-containment condition and keeps the file size reasonable.
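The first step of subsetting, finding out which characters an invoice actually uses, can be sketched in a few lines. This is a simplification under stated assumptions: real subsetting operates on glyphs rather than characters, and for Arabic text the shaping step maps each letter to contextual forms and ligatures, so a font library has to perform the actual character-to-glyph mapping. The function name is hypothetical.

```python
def characters_used(texts: list[str]) -> set[str]:
    """Collect the distinct non-space characters across all invoice strings."""
    return {ch for text in texts for ch in text if not ch.isspace()}


invoice_strings = ["فاتورة ضريبية", "Total: 1150.00 SAR"]
used = characters_used(invoice_strings)
print(len(used), "distinct characters would need glyphs in the subset")
```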

The second factor is the embedded XML file, whose size grows with the number of invoice line items. An invoice with three line items produces a small XML file, while an invoice with hundreds of line items produces a much larger one that adds to the total size. This matters for systems that generate thousands of invoices daily, because the impact accumulates in storage and processing time.

On the performance level, generating PDF/A-3 is slightly slower than generating an ordinary PDF, because the system performs additional steps: embedding fonts, writing the metadata in XMP format, attaching and linking the XML file, and sometimes internal conformance validation before outputting the file. In a single operation the difference is imperceptible, but it becomes significant in bulk generation. This is why well-designed systems separate invoice generation from sending, and process generation in background queues instead of blocking the user’s request.
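The queue-based separation described above can be sketched with the standard library alone. Here `generate_pdfa3` is a placeholder for the real generator and simply returns a hypothetical file name. For demonstration the caller waits for all jobs to finish, whereas a real service would return to the user immediately and let the workers drain the queue.

```python
import queue
import threading


def generate_pdfa3(invoice_id: str) -> str:
    """Placeholder for the real generator; returns a hypothetical file name."""
    return f"{invoice_id}.pdf"


def worker(jobs: queue.Queue, results: dict, lock: threading.Lock) -> None:
    """Take invoice ids from the queue until a None sentinel arrives."""
    while True:
        invoice_id = jobs.get()
        if invoice_id is None:  # sentinel: no more work for this worker
            jobs.task_done()
            break
        path = generate_pdfa3(invoice_id)
        with lock:
            results[invoice_id] = path
        jobs.task_done()


def generate_in_background(invoice_ids: list[str], workers: int = 2) -> dict:
    """Generate invoices on worker threads and return id -> file name."""
    jobs: queue.Queue = queue.Queue()
    results: dict = {}
    lock = threading.Lock()
    threads = [threading.Thread(target=worker, args=(jobs, results, lock))
               for _ in range(workers)]
    for thread in threads:
        thread.start()
    for invoice_id in invoice_ids:
        jobs.put(invoice_id)
    for _ in threads:
        jobs.put(None)  # one sentinel per worker
    jobs.join()
    for thread in threads:
        thread.join()
    return results


print(generate_in_background(["INV-001", "INV-002", "INV-003"]))
```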

The practical conclusion is that these considerations are not obstacles but well-understood engineering details with known solutions. A tested accounting system handles them by default: it subsets fonts, compresses what can be compressed without violating the standard, and processes generation efficiently. The developer building the solution themselves needs to pay attention to them early, before they turn into a performance problem in production.

How does Qoyod handle the PDF/A-3 format?

Qoyod’s e-invoicing software automatically generates invoices in the format required by the Authority, so you need not build the file structure manually or configure fonts and color spaces yourself. When issuing a business-transaction invoice, the system produces a file that carries the visual layer and the structured layer together, signs it with the cryptographic stamp, and adds the unique identifier (UUID) and the QR code in line with the requirements of the second phase of e-invoicing.

Qoyod also manages the cryptographic stamp certificate (the CSID) automatically, and stores the hash chain for each invoice to guarantee the integrity of the sequence. All of this happens in the background, so the user sees nothing but a ready invoice compliant with the Authority’s requirements.

Certain responsibilities remain with the business itself, foremost among them registering the CSID certificate with the Authority: Qoyod guides you through the steps but does not carry out the registration on your behalf. Likewise, filing the tax return and paying the tax are done directly through the Authority’s portal, not from within the software.

The advantage of relying on an integrated accounting system here is that you avoid the fine technical errors that strip the PDF/A-3 status from the file. The software ensures fonts are embedded, prevents dynamic content, generates the metadata in the correct format, and links the embedded XML file to the visual layer properly. In short, it turns a complex technical requirement into an issue-invoice button.

Frequently asked questions about the PDF/A-3 format

Is a PDF/A-3 file the same as a PDF file but with a different name?

No. The PDF/A-3 format is a version governed by the ISO 19005-3 standard that requires embedding fonts, prohibits dynamic content and encryption, and permits carrying an embedded XML file. An ordinary PDF file does not necessarily comply with any of these conditions, and therefore is not accepted as a substitute for it by the Authority.

Why does a business invoice need PDF/A-3 while pure XML is sometimes enough?

Because the receiving party in business transactions usually needs to see the invoice visually and process it automatically at once. The PDF/A-3 format combines both layers in a single file: a human-readable page and an XML file for systems. Pure XML, on the other hand, is enough when no visual representation is required.

What is the difference between the PDF/A-3B level and the PDF/A-3A level?

Level B guarantees the stability of the file’s visual display. Level A adds accessibility requirements and logical text structure for screen readers. In invoices, level B is usually enough, because the aim is appearance stability and carrying structured data.

How do I make sure the invoice my system generated genuinely conforms to the standard?

Run the file through a validation tool such as veraPDF at the required level. The tool produces a report of every violation. Common errors are an unembedded font, a device-dependent color space, or prohibited content. With an approved accounting system, these points are handled automatically.

Where is the structured invoice data stored inside a PDF/A-3 file?

It is stored in an XML file embedded inside the document structure as an object of type EmbeddedFile, linked to the document by a specific relationship that identifies it as an alternative representation of the visual invoice. The details of this embedding mechanism are laid out in the guide “Embedding XML inside PDF/A-3.”

Do I need to generate a PDF/A-3 file manually as a developer?

It is not advisable in a production environment. Manual generation is prone to subtle violations that strip the file of its PDF/A-3 status. It is better to rely on an accounting system that generates the file in compliance with the Authority’s requirements and handles font embedding, the cryptographic signature, and linking the embedded XML file automatically.

A practical takeaway

The PDF/A-3 format is not a cosmetic luxury, but a calculated technical choice by the Authority to combine human reading and machine processing in a single file archivable for decades. It is a version of the PDF/A family issued by ISO, distinguished by its ability to embed an XML file inside itself, and by strict archiving properties that make it valid for tax review years later.

The difference between it and an ordinary PDF file is not in appearance, but in self-containment, the prevention of dynamic content, the carrying of structured data, and passing validation against the standard. If you want to go deeper into the other technical representations of the invoice, see the guide on the XML invoice and the guide on e-invoice structure. The mechanism of injecting the XML file inside a PDF/A-3 file at the structural level has its own guide, “Embedding XML inside PDF/A-3,” so that this guide stays focused on the format alone.
