Hashing in the E-Invoice: How ZATCA Guarantees Invoice Integrity

When your business moves to Phase Two of e-invoicing, the invoice changes from a visual document into a signed, sequenced digital document. At the heart of this sequence lies a single technology that guarantees no invoice has been touched after it was issued: Hashing. This technical guide explains the concept of hashing as used by the Zakat, Tax and Customs Authority (ZATCA), its mathematical properties, what exactly gets hashed, and how it differs from encryption and the digital signature, with practical code examples.

This content is aimed at developers and technical-integration specialists who build or integrate invoicing systems with the Fatoora platform. If you are looking for the full picture of e-invoicing requirements, start from the Qoyod e-invoicing guide, then come back here for the technical detail. Many developers reach the integration stage understanding the functional invoice requirements, yet stumble at the cryptographic layer. Hashing is the cornerstone of this layer, and without understanding it, it is hard to diagnose why the platform rejects invoices.

What is a hash function?

A hash function is a mathematical algorithm that takes an input of any size (text, a file, a message) and produces a fixed-length output called the hash value or digital fingerprint (digest). Whatever the size of the input, the length of the output stays fixed. For example, the SHA-256 function always produces a 256-bit fingerprint (64 characters in hexadecimal), whether the input is a single word or a hundred-line invoice.

The idea is simple at its core. You pass the invoice data through the function and get a short string that represents that data uniquely. Any change in the original data, even by a single character, produces an entirely different fingerprint. It is precisely this property that makes hashing an ideal tool for verifying document integrity.

The fingerprint can be likened to a human fingerprint. A fingerprint does not carry a picture of the person and does not tell you their height or colour, yet it identifies them uniquely, and the slightest difference changes its shape. Likewise, the invoice fingerprint does not reveal its content, but it represents it uniquely: if two fingerprints match, the two documents are, for all practical purposes, identical, and if the fingerprints differ, the two documents are certain to differ, even if only by a single character. This analogy shows why hashing is suited to verifying both identity and integrity together.

It is useful to distinguish between general hash functions and cryptographic hash functions. The former are used in data structures such as hash tables to speed up searching, and are not required to resist attacks. The latter, which are what concern us in e-invoicing, are designed specifically to resist forgery and collision attempts, and they alone are approved for security and compliance applications.

Let us illustrate with a concrete example. If we hash the text "invoice" and then hash the text "invoicе" (with a visually similar but different final letter), we will get two entirely different fingerprints even though the difference is a single character. The values below are truncated for illustration:

SHA-256("invoice")  = 7d3a9f2c8b1e4f6a0d5c2b8e9a1f4d7c...
SHA-256("invoicе") = e2b8c1a4f9d6037e5a8b2c1d4e7f0a93...

Notice that the two fingerprints share no pattern at all, even though the two inputs are almost completely alike. This behaviour is not a coincidence; it is a direct result of the mathematical properties that hash functions were designed around. To understand why the Authority relies on this technology specifically, we must examine its three properties.
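To reproduce this comparison yourself, a minimal Python sketch follows. It assumes the second string ends with the Cyrillic letter "е" (U+0435), which looks like the Latin "e" but is encoded as different bytes. The helper name `sha256_hex` is hypothetical.

```python
import hashlib

latin = "invoice"            # ends with Latin "e" (U+0065)
lookalike = "invoic\u0435"   # assumed: ends with Cyrillic "е" (U+0435)


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of the text's UTF-8 bytes as hex."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


digest_latin = sha256_hex(latin)
digest_lookalike = sha256_hex(lookalike)

print(digest_latin)
print(digest_lookalike)
print("both 64 hex characters:", len(digest_latin) == len(digest_lookalike) == 64)
print("digests equal:", digest_latin == digest_lookalike)
```

Running it prints the two full digests, so they can be compared character by character.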

The three properties of hash functions

[Figure: the three properties of a hash function. One-way: it cannot be reversed to recover the original. Deterministic: the same input gives the same fingerprint. Avalanche effect: a tiny change alters the fingerprint entirely. These properties make any tampering with the invoice instantly detectable.]

Hashing derives its value in e-invoicing from three fundamental mathematical properties. Understanding these properties is a condition for understanding why the Authority relies on it, and each property serves a specific function within the Phase Two system.

1. One-way (Pre-image Resistance)

A hash function works in one direction only. You can turn invoice data into a fingerprint easily, but it is practically impossible to reverse the operation and recover the original data from the fingerprint alone. There is no inverse function that takes the fingerprint and returns the invoice to you.

The reason is that hashing is a lossy operation. The function reduces an input that may run to thousands of characters into a fixed 256 bits, so enough information is lost to make reconstruction impossible. The only theoretical way to find an input matching a given fingerprint is blind trial of an enormous number of possibilities, which exceeds available computing power. This property means that publishing an invoice fingerprint reveals nothing about its actual content.

2. Deterministic

When the same input is passed through the same function, you always get the same fingerprint, at any time, on any machine, and in any programming language. This property is essential for verification. When your business sends an invoice and its fingerprint to the Fatoora platform, the platform can recompute the fingerprint from the received data and compare it with the sent fingerprint. A match of the two values means the data arrived intact.

If the function produced a different value each time, verification would be entirely impossible. Determinism is what makes the fingerprint a stable reference relied upon across different parties and systems, and it is an indispensable condition in any distributed verification protocol.

3. Avalanche Effect

Any small change in the input, even by a single bit, causes a drastic and unpredictable change in the resulting fingerprint. On average, roughly half the output bits change. This means it is impossible to predict how a small edit will affect the fingerprint, and impossible to craft a calculated edit that keeps the fingerprint unchanged.

In the invoice context, if a party tried to change the amount from SAR 1,000 to SAR 10,000, the invoice fingerprint would change entirely. And because the fingerprint is recorded and bound to the invoice chain, the tampering is exposed immediately at any verification. The avalanche effect is what prevents a tamperer from making cosmetic changes to an invoice without leaving a clear trace.
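The avalanche effect can be observed directly by counting how many output bits differ between the fingerprints of two nearly identical inputs. The sketch below is illustrative only: the XML fragments are hypothetical, not real invoice fields, and `differing_bits` is a helper introduced for this example.

```python
import hashlib


def sha256_bytes(text: str) -> bytes:
    """Return the raw 32-byte SHA-256 digest of the text's UTF-8 bytes."""
    return hashlib.sha256(text.encode("utf-8")).digest()


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length digests differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Hypothetical fragments; only the amount differs (1,000 versus 10,000).
original = "<Amount currency='SAR'>1000.00</Amount>"
tampered = "<Amount currency='SAR'>10000.00</Amount>"

changed = differing_bits(sha256_bytes(original), sha256_bytes(tampered))
print(f"{changed} of 256 output bits differ")
```

For a well-behaved hash function the count should land near half of the 256 bits, although the exact figure depends on the inputs.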

A fourth supporting property: Collision Resistance

Alongside the three properties, cryptographic hash functions are characterised by collision resistance. A collision means an attacker finds two different inputs that produce the same fingerprint. If this were easily possible, someone could replace an invoice with a forged one carrying the same fingerprint, and the forgery would pass undetected.

Functions such as SHA-256 are designed so that finding a deliberate collision exceeds realistic computing power. For this reason, modern systems avoid old, weak functions such as MD5 and SHA-1, which have proven vulnerable to collision attacks, and rely on a strong modern function. This resistance is what makes the fingerprint reliable evidence of a document’s identity.

Why does the Authority rely on hashing in e-invoicing?

The Authority chose hashing because it achieves two fundamental goals in Phase Two: proving document integrity and tamper-evidence. Let us detail each of them, as they explain why the fingerprint became a mandatory field in every invoice.

Proving document integrity

When your business issues a Phase Two invoice, a hash fingerprint is computed for it from its content in its legal form. This fingerprint is attached to the invoice and sent within the data to the Fatoora platform. Any party, whether the Authority, the buyer, or an auditor, can later recompute the fingerprint from the copy of the invoice in their possession and compare it with the original fingerprint. A match proves the invoice has not changed since the moment it was issued.

This model shifts trust from reliance on good faith to reliance on mathematical proof. No one has to take it on faith that the invoice has not been altered; they can verify it themselves by recomputing the fingerprint. This is a fundamental shift in the nature of the accounting document.

The added advantage is that hashing is fast and computationally light. Computing the fingerprint of a full invoice takes a fraction of a second, which allows millions of invoices to be processed daily without burdening the systems. And because the fingerprint is short and fixed-length, it is far easier to store, send, and compare than the full invoice. This efficiency is a practical condition for applying verification at a broad national scale.

Detecting tampering through the invoice chain

Hashing does not stop at a single invoice. The Authority requires each invoice to carry the fingerprint of the invoice preceding it in a field known as the Previous Invoice Hash (PIH). This links invoices into a connected chain, each link depending on the one before it. This linkage is what makes the business’s invoice ledger a record that admits neither deletion nor hidden retroactive insertion.

[Figure: the hash chain across invoices. Invoice 1 has its own hash, invoice 2 carries hash 1, and invoice 3 carries hash 2. Editing any invoice breaks the entire chain.]

If someone tried to edit an old invoice in the middle of the chain, its fingerprint would change and would no longer match the Previous Invoice Hash (PIH) stored in the following invoice. The match breaks and the whole chain is exposed. To hide the tampering, the tamperer would have to recompute the fingerprints of all subsequent invoices and re-sign them, which is practically impossible because the invoices have already been sent and recorded with the Authority. This is the same principle that underlies blockchain technology, and it gives the invoice ledger the property of immutability.
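The tamper-detection behaviour described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the Authority's specification: invoices are plain strings rather than UBL XML, each record's hash covers its body together with its PIH, and `GENESIS_PIH` is a made-up placeholder for the default first-invoice value. All function names are hypothetical.

```python
import base64
import hashlib


def invoice_hash(payload: str) -> str:
    """Base64-encoded SHA-256 of the payload's UTF-8 bytes."""
    digest = hashlib.sha256(payload.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


# Placeholder only; NOT the Authority's actual default value.
GENESIS_PIH = invoice_hash("hypothetical-genesis-seed")


def build_chain(bodies):
    """Build records in which each one stores the hash of its predecessor."""
    chain, pih = [], GENESIS_PIH
    for body in bodies:
        record = {"body": body, "pih": pih, "hash": invoice_hash(body + pih)}
        chain.append(record)
        pih = record["hash"]
    return chain


def first_broken_link(chain):
    """Return the index of the first record failing verification, or None."""
    expected_pih = GENESIS_PIH
    for index, record in enumerate(chain):
        if record["pih"] != expected_pih:
            return index
        if invoice_hash(record["body"] + record["pih"]) != record["hash"]:
            return index
        expected_pih = record["hash"]
    return None


chain = build_chain(["invoice 1: SAR 1,000", "invoice 2: SAR 250", "invoice 3: SAR 90"])
print(first_broken_link(chain))  # None: the untouched chain verifies

chain[1]["body"] = "invoice 2: SAR 2,500"  # edit a middle invoice after the fact
print(first_broken_link(chain))  # 1: the stored hash no longer matches the body

# Recomputing invoice 2's own hash only moves the break to invoice 3's PIH.
chain[1]["hash"] = invoice_hash(chain[1]["body"] + chain[1]["pih"])
print(first_broken_link(chain))  # 2
```

The last step shows why a tamperer would have to rewrite every later invoice: fixing one link always breaks the next.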

For a deeper treatment of the Previous Invoice Hash field and how it is computed for the first invoice in the chain, we have dedicated a separate technical guide to the PIH field within this section. As for the specific algorithm the Authority adopts for computing the fingerprints (SHA-256), it too has a detailed guide explaining its structure, output length, and the reasons for choosing it. The central idea here is that the first invoice in the chain has no previous invoice, so it is given a specific default value in the PIH field, and the chain then starts from it.

What exactly gets hashed?

Here lies a subtle point developers often get wrong. The Authority does not hash the invoice as you see it visually (PDF or image); it hashes the legal form of the invoice in XML according to the UBL 2.1 standard. This form is the official document, and the visual representation is merely a display of it. Any attempt to compute the fingerprint from a PDF file or from the visible text will fail, because the document recognised by the Authority is the structured XML file.

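But before hashing, the invoice passes through a crucial step called canonicalization. An XML file can be written in multiple ways that convey the same meaning: different attribute order, single or double quotes around attribute values, extra whitespace inside tags, self-closing or paired empty elements, differences in encoding or line endings. None of these differences changes the meaning of the invoice, but they change the bytes, and therefore change the resulting fingerprint. Whitespace between elements, such as indentation and blank lines, is by default treated as content and preserved, so it must not change once the fingerprint has been computed.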

To solve this problem, a standard canonicalization algorithm (C14N) is applied that converts the XML file into a unified, unambiguous form before computing the fingerprint. This way, the issuer and the recipient get the same fingerprint from the same invoice, regardless of formatting differences in how the file is written. Canonicalization is not an optional detail; it is a fundamental condition for fingerprints to match across systems.
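The effect of canonicalization can be seen with Python's standard library. The sketch below uses `xml.etree.ElementTree.canonicalize` (Python 3.8+), which implements C14N 2.0. That is an assumption made for illustration; the canonicalization variant required by the Authority should be confirmed against its specification. The two XML strings are hypothetical and far simpler than a UBL 2.1 invoice.

```python
import hashlib
import xml.etree.ElementTree as ET

# Two serializations of the same data: attribute order, quote style, spacing
# inside the start tag, and the empty-element form all differ.
xml_a = '<Invoice id="1" currency="SAR"><Note/></Invoice>'
xml_b = "<Invoice   currency='SAR'  id='1'><Note></Note></Invoice>"


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of the text's UTF-8 bytes as hex."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


canon_a = ET.canonicalize(xml_a)
canon_b = ET.canonicalize(xml_b)

print(sha256_hex(xml_a) == sha256_hex(xml_b))      # False: raw bytes differ
print(canon_a == canon_b)                          # True: one canonical form
print(sha256_hex(canon_a) == sha256_hex(canon_b))  # True: fingerprints match
```

Hashing the raw strings gives two unrelated fingerprints, while hashing the canonical forms gives one.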

[Figure: the invoice’s journey to the fingerprint. UBL 2.1 file, then C14N canonicalization, then SHA-256, then a fixed 256-bit fingerprint. Canonicalization guarantees a fixed fingerprint for the same invoice.]

The practical steps for computing an invoice fingerprint according to the Authority’s requirements proceed as follows:

1. Generate the invoice in XML according to the UBL 2.1 standard
2. Apply canonicalization (C14N) to unify the file format
3. Compute the fingerprint with the SHA-256 algorithm on the canonicalized output
4. Encode the fingerprint in Base64 to insert it into the fields
5. Bind the fingerprint to the invoice chain via the PIH field

Here is an illustrative example in Python that computes the hash of text content with the SHA-256 algorithm and then encodes it in Base64, which is the essence of what happens to the canonicalized XML file:

import hashlib
import base64

# Invoice content in XML after canonicalization
canonical_xml = "<Invoice>...</Invoice>".encode("utf-8")

# Compute the hash with the SHA-256 algorithm
digest = hashlib.sha256(canonical_xml).digest()

# Encode in Base64 to insert into the invoice
invoice_hash = base64.b64encode(digest).decode("utf-8")

print(invoice_hash)
# Prints a 44-character Base64 string (the 32-byte digest, Base64-encoded)

And because many invoicing systems run in JavaScript on the back end, here is the same equivalent:

const crypto = require("crypto");

// Invoice content in XML after canonicalization
const canonicalXml = "<Invoice>...</Invoice>";

// Compute the hash with the SHA-256 algorithm, then encode it in Base64
const invoiceHash = crypto
  .createHash("sha256")
  .update(canonicalXml, "utf8")
  .digest("base64");

console.log(invoiceHash);

And for anyone verifying manually during development, the hexadecimal fingerprint of a canonicalized file can be computed directly from the command line:

openssl dgst -sha256 invoice_canonical.xml
# Output (illustrative): SHA256(invoice_canonical.xml)= 5fececb6ffc86f38d952786c6d696c79...

An important note for developers: do not compute the fingerprint on the raw XML file directly. Canonicalization is a mandatory step, and omitting it is the most common cause of an invoice being rejected by the Fatoora platform due to a fingerprint mismatch. The very XML file from which the fingerprint is computed is the subject of the e-invoice XML guide, while the mandatory fields and their order are explained in the e-invoice structure guide, and it is advisable to review them side by side with this guide.

Hashing vs encryption vs the digital signature

Many people confuse these three concepts because they all belong to cryptography, yet they serve entirely different purposes. Distinguishing between them is essential for understanding the Phase Two system and for diagnosing errors during integration.

The fundamental differences can be summarised as follows:

Hashing: A one-way operation that needs no key. Its purpose is to prove data integrity and detect any change to it. It cannot be reversed, and the input cannot be recovered from it. It is what guarantees the invoice has not been altered. And the output is fixed-length regardless of the input size.

Encryption: A reversible operation that needs a key. Its purpose is to hide data from unauthorised parties (confidentiality). Encryption can be undone and the original data recovered with the correct key. Encryption hides content, whereas hashing does not hide it but stamps it. It is important to note that the invoice in Phase Two is not encrypted, but signed and stamped, because the goal is proof, not concealment. The invoice is a tax document that must be readable to the concerned parties, not hidden from them.

Digital Signature: Builds on hashing and adds to it the element of authenticity. First the document fingerprint is computed, then this fingerprint is signed with the issuer’s private key. The result is a signature that proves two things together: that the document has not changed (integrity, from hashing), and that it was actually issued by the holder of the private key (authenticity). The digital signature in Phase Two is known as the Cryptographic Stamp and relies on a CSID certificate issued by the Authority.

The bottom line is that hashing is the foundational building block. The digital signature uses it internally, and the whole system is built on it. Every digital signature begins with computing a hash fingerprint, and no valid signature can be created without a valid hash before it. The details of the digital signature, the cryptographic stamp, and the CSID certificate are set out in two separate guides within this section, one for the digital-signature specification and the other for the cryptographic-stamp specification.

Where does the fingerprint appear in the invoice?

After computing the fingerprint in Base64, it appears in two main places within the Phase Two system. The first is the QR Code on the simplified invoice (B2C), where the fingerprint is embedded within the code’s data so that any party can verify by scanning the code. The second is the internal fields of the XML file, where the current invoice’s fingerprint is recorded along with the PIH field for the previous invoice’s fingerprint.

This threefold binding between the fingerprint, the signature, and the code is what turns the invoice from a visual document into a digital document that can be verified automatically by any party at any time. Thanks to it, the Authority can inspect millions of invoices automatically and confirm their integrity without human intervention.

Alongside the fingerprint, the invoice carries other identifiers that complete the tracking system, most notably the invoice’s Unique Identifier (UUID) and the sequential Invoice Counter (ICV). The relationship between these identifiers and the hash fingerprint is close, as they all work together to prove the sequence of invoices and the absence of any gaps or deletions. For anyone wanting to explore the sequential invoice counter and how it is numbered, we have dedicated the Invoice Counter (ICV) guide within this section.

How does verification work at the receiving party?

Verifying invoice integrity is a straightforward operation that leverages both determinism and one-wayness together. When a party receives the invoice (the platform, the buyer, or the auditor), it carries out the following steps:

  • It takes the received XML file and applies to it the same canonicalization step the issuer applied.
  • It computes the SHA-256 fingerprint on the canonicalized output.
  • It compares the fingerprint it computed with the fingerprint attached in the invoice.
  • If they match, the invoice is intact and untouched. If they differ, a change has occurred to the invoice and the document is rejected.
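The verification steps above can be sketched as a small Python function. The sketch makes several simplifying assumptions: the standard library's C14N 2.0 implementation stands in for the canonicalization the Authority requires, the attached fingerprint is passed in separately rather than read from the XML, and the names `compute_fingerprint` and `is_intact` are hypothetical.

```python
import base64
import hashlib
import hmac
import xml.etree.ElementTree as ET


def compute_fingerprint(xml_text: str) -> str:
    """Canonicalize, hash with SHA-256, and Base64-encode the digest."""
    canonical = ET.canonicalize(xml_text)  # stand-in for the required C14N step
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


def is_intact(received_xml: str, attached_fingerprint: str) -> bool:
    """Recompute the fingerprint and compare it with the attached one."""
    recomputed = compute_fingerprint(received_xml)
    return hmac.compare_digest(recomputed, attached_fingerprint)


issued = '<Invoice id="7"><Total>1000.00</Total></Invoice>'
fingerprint = compute_fingerprint(issued)

print(is_intact(issued, fingerprint))  # True: nothing changed
print(is_intact(issued.replace("1000.00", "10000.00"), fingerprint))  # False
```

Because both sides canonicalize first, a copy of the invoice that differs only in attribute order or quote style would still verify.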

The beauty of this process is that the recipient does not need a secret key or a secure side channel to check integrity. All it needs is the invoice itself, the declared fingerprint, and the declared hash algorithm. This is what makes integrity verification available to any party at any time. The fingerprint alone does not prove who issued the invoice; that comes from the digital signature added on top of it, which extends verification to authenticity and provides the basis for non-repudiation in digital documents.

The default value for the first invoice in the chain

The first invoice your business issues has no previous invoice to rely on. In this case, the PIH field is filled with a specific default value representing the start of the chain, which is the hash of a standard value defined by the Authority. From that point, each subsequent invoice begins to carry the fingerprint of its predecessor, and the chain grows invoice after invoice. An error in this default value breaks the chain from its very start, so it is considered a key checkpoint on first integration.

How does Qoyod help you with hashing and e-invoicing?

The technical complexity we explained above, from canonicalization to computing the fingerprint to linking the chain, happens entirely behind the scenes in Qoyod without any intervention from you. You do not need to write a single line of code nor to understand the SHA-256 algorithm to issue a compliant invoice.

  • Automatic XML generation and canonicalization: Qoyod generates the invoice in UBL 2.1 format and applies standard canonicalization before computing the fingerprint, so there is no room for the whitespace or encoding errors that get an invoice rejected.
  • Computing the fingerprint and linking the chain: Qoyod computes each invoice’s fingerprint and automatically links it to the previous invoice’s fingerprint via the PIH field, and keeps the complete invoice chain for verification and auditing.
  • Managing the CSID certificate and the cryptographic stamp: Qoyod handles the cryptographic-stamp certificate issued by the Authority and signs every invoice automatically with no technical burden on you.
  • Integration with the Fatoora platform: Qoyod sends standard invoices (B2B) for real-time clearance and simplified invoices (B2C) for reporting within 24 hours, generating the QR code that embeds the fingerprint.

Common technical errors when dealing with the fingerprint

From real integration with the Fatoora platform, a set of errors recur that lead to invoice rejection. Watching out for them saves hours of debugging when building an invoicing system:

  • Neglecting canonicalization: Computing the fingerprint on raw XML without applying C14N. The most common cause of a fingerprint mismatch.
  • Encoding differences: Using a non-UTF-8 encoding when converting text to bytes before hashing, which changes the resulting fingerprint.
  • Mixing up output formats: Sending the fingerprint in hexadecimal (hex) while the platform expects it in Base64, or vice versa.
  • Breaking the PIH chain: Inserting the wrong Previous Invoice Hash or neglecting the default value for the first invoice in the chain.
  • Editing the XML after computing: Any edit to the file after computing the fingerprint, even adding a blank line, invalidates the match and the computation must be redone.
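Two of the errors above, encoding differences and mixed-up output formats, are easy to demonstrate. The Python sketch below uses a hypothetical canonicalized string containing Arabic text; the variable names are illustrative, and which textual form a given field expects should be confirmed against the Authority's specification.

```python
import base64
import hashlib

# Hypothetical canonicalized content with non-ASCII characters.
canonical_text = "<Invoice><Note>فاتورة</Note></Invoice>"

digest = hashlib.sha256(canonical_text.encode("utf-8")).digest()  # 32 raw bytes

as_hex = digest.hex()                                  # 64 characters
as_base64 = base64.b64encode(digest).decode("ascii")   # 44 characters

# Hex and Base64 are two spellings of the same 32 bytes.
print(bytes.fromhex(as_hex) == base64.b64decode(as_base64))  # True

# Base64 of the hex *text* is a different, longer string (88 characters).
base64_of_hex_text = base64.b64encode(as_hex.encode("ascii")).decode("ascii")
print(len(as_hex), len(as_base64), len(base64_of_hex_text))  # 64 44 88

# Hashing the same text under another encoding gives another fingerprint.
wrong_encoding_hex = hashlib.sha256(canonical_text.encode("utf-16")).hexdigest()
print(wrong_encoding_hex == as_hex)  # False
```

Checking the length of the string being sent is a quick way to spot which of these forms a system is actually producing.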

Frequently Asked Questions

Can invoice data be recovered from the hash fingerprint?

No. Hashing is a one-way operation by nature. The fingerprint represents the data uniquely but does not carry it, so there is no practical way to reverse the operation and recover the invoice content from the fingerprint alone.

What is the difference between hashing and encryption in the e-invoice?

Hashing is a one-way operation that proves invoice integrity and detects tampering without hiding its content. Encryption is a reversible operation that hides content from unauthorised parties. The invoice in Phase Two is stamped and signed, not encrypted, because the goal is proof, not concealment.

Which algorithm does the Authority adopt for computing the fingerprint?

The Authority adopts the SHA-256 algorithm, which produces a fixed-length 256-bit fingerprint from any input. It has a separate technical guide within this section explaining its structure, output length, and the reasons for choosing it.

Why must canonicalization be applied before computing the fingerprint?

Because an XML file can be written in multiple formatting ways (attribute order, quote style, whitespace inside tags, line endings) that convey the same meaning but change the bytes and therefore the fingerprint. Canonicalization unifies the format before hashing so that the issuer and the recipient get the same fingerprint.

What is the relationship between hashing and the invoice chain?

Each invoice carries the fingerprint of the previous invoice in the PIH field, so invoices are linked in a chain where each link depends on the one before it. Editing any invoice breaks this linkage and exposes the tampering immediately.

Do I need to program these operations myself when using Qoyod?

No. Qoyod handles generating the XML and canonicalizing it, computing the fingerprint, linking the chain, signing, and integrating with the Fatoora platform automatically. You do not need to write any code nor to understand the details of the algorithm.
