
How Pulling the Docker Image Digest Out of Hiding Improves Source Code Auditability

A Docker digest is a cryptographic hash, most commonly a SHA-256 hash. You can consider this a unique fingerprint for each Docker image version. But why should you care, and how does it help with security? Let’s break it down in simple terms.

What is a Cryptographic Hash?

A cryptographic hash function, such as SHA-256 (Secure Hash Algorithm 256-bit), processes input data (in this case, the contents of your Docker image) and generates a fixed-size string of characters, often referred to as a hash or digest. This output string appears random but is uniquely determined by the input data.

To visualize this, consider the following key points:

  • Fixed Size: Regardless of the size of the input data, the output hash will always be of a fixed length. For SHA-256, this length is 256 bits, usually written as 64 hexadecimal characters.
  • Uniqueness: Even a tiny change in the input (e.g., altering a single byte in one of the layers) results in a completely different hash/digest. Two different inputs producing the same SHA-256 hash is computationally infeasible, so in practice each version of the Docker image has its own hash/digest.
  • Deterministic: The same input will always produce the same hash/digest, allowing for consistent verification.

Think of it like a digital fingerprint: just as your fingerprint uniquely identifies you, the SHA-256 hash uniquely identifies the contents of your Docker image. If even a single byte in the image changes, the SHA-256 hash changes with it, so a matching digest tells you that you are working with exactly the image contents you expect. This is what protects the integrity of the image.
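The three properties above can be checked with a few lines of Python. This is a minimal sketch using the standard library's hashlib; the byte strings are hypothetical stand-ins for image content (real layers are tar archives), and the helper name sha256_digest is made up for this example.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return a digest in the 'sha256:<hex>' form that Docker displays."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


# Hypothetical stand-ins for image content.
original = b"example layer contents"
modified = b"example layer contentz"  # only the last byte differs

first = sha256_digest(original)
second = sha256_digest(original)
changed = sha256_digest(modified)
hex_part = first.split(":", 1)[1]

print(first == second)   # True: deterministic, same input gives same digest
print(first == changed)  # False: a one-byte change gives a different digest
print(len(hex_part))     # 64: always 64 hex characters (256 bits)
```

The same function works on input of any size, which is why a multi-gigabyte image and a one-line file both end up with a 64-character digest.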

Where Does the Digest Come From?

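When you build a Docker image, Docker hashes each layer and the image configuration with a cryptographic hash function, typically SHA-256. The digest you pull by is the hash of the image manifest, a small JSON document that lists the configuration and every layer by its own hash, so a change to any file in any layer produces a different digest. Note that this digest (shown under RepoDigests) is a different value from the local image Id, and it is only recorded once the image has been pushed to or pulled from a registry.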
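To make the "one digest covers every layer" idea concrete: in registry image formats, a manifest lists the image configuration and each layer by its own digest, and the image digest is the hash of that manifest. The sketch below imitates that structure. It is a simplified illustration, not the real manifest schema, and the function names, field layout, and sample bytes are all assumptions made for this example.

```python
import hashlib
import json
from typing import List


def sha256_digest(data: bytes) -> str:
    return "sha256:" + hashlib.sha256(data).hexdigest()


def build_manifest(config: bytes, layers: List[bytes]) -> bytes:
    """Serialize a simplified, manifest-like document (hypothetical layout)."""
    manifest = {
        "config": {"digest": sha256_digest(config), "size": len(config)},
        "layers": [
            {"digest": sha256_digest(layer), "size": len(layer)}
            for layer in layers
        ],
    }
    # Sorted keys and fixed separators give stable bytes for the same content.
    return json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode()


# Hypothetical image content.
config = b'{"cmd": ["python3"]}'
layers = [b"base layer bytes", b"application layer bytes"]
image_digest = sha256_digest(build_manifest(config, layers))

# Change one layer and rebuild the manifest.
patched_layers = [layers[0], b"application layer bytes, patched"]
patched_digest = sha256_digest(build_manifest(config, patched_layers))

print(image_digest != patched_digest)  # True: one changed layer changes the image digest
```

Because the manifest only stores hashes and sizes, it stays small, yet any change to any layer still propagates up into the single digest you pin.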

Why is This Important for Security?

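Referencing your Docker images by digest instead of by tag alone (like latest) ensures that you always pull the exact same image. A tag is a mutable pointer that the publisher can move to a new image at any time, whereas a digest always refers to the same content. This prevents issues where an updated image introduces bugs or vulnerabilities without your knowledge.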

Technically, you can deploy the same, unchanged source code at two different times and still end up running on a different parent image, because the tag resolved to a newer image in between. This is a potential problem if bugs or vulnerabilities were introduced into the parent Docker image since your software supply chain last pulled it. With a tag alone, nothing in your repository records that change, so your team cannot audit it from source control. With a pinned digest, every parent image update shows up as a reviewable change to the digest in your source code.
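One way to act on this is to check whether the parent images in a Dockerfile are pinned by digest. The following is a rough sketch, not a full Dockerfile parser: the function names are hypothetical, it only looks at FROM lines, and it ignores cases such as FROM scratch, build arguments, or references to earlier build stages.

```python
import re
from typing import List

# A reference counts as pinned if it ends in "@sha256:" plus 64 hex characters.
PINNED = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_pinned(reference: str) -> bool:
    return PINNED.search(reference) is not None


def base_image_references(dockerfile_text: str) -> List[str]:
    """Collect the image reference from each FROM line (simplified parsing)."""
    references = []
    for line in dockerfile_text.splitlines():
        tokens = line.split()
        if tokens and tokens[0].upper() == "FROM":
            # Skip flags such as --platform=...; the next token is the image.
            args = [t for t in tokens[1:] if not t.startswith("--")]
            if args:
                references.append(args[0])
    return references


# Hypothetical Dockerfile with one pinned and one floating parent image.
dockerfile = """\
FROM python:3.9-alpine@sha256:84006fe377327d69aa1b7576ad5f9c2081bca2ce5d67fa17baf25cdf152d37ad AS build
FROM python:3.9-alpine AS runtime
"""

unpinned = [ref for ref in base_image_references(dockerfile) if not is_pinned(ref)]
print(unpinned)  # ['python:3.9-alpine']
```

A check like this could run in CI so that a floating tag is flagged before it reaches a deployment.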

Here’s how to validate that the digest is correct and hasn't been tampered with:

  • Pull the Image by Digest - You can pull an image directly by its digest using the following command:
# minimum required form
docker pull <image>@sha256:<digest>

# optional: include the tag for readability (the image is still selected by the digest)
docker pull <image>:<tag>@sha256:<digest>

# NOTE
# the full digest value includes the hash algorithm as a prefix,
# so <digest> above stands for the hexadecimal part only
# e.g. sha256:84006fe377327d69aa1b7576ad5f9c2081bca2ce5d67fa17baf25cdf152d37ad
  • Inspect the Image Digest - After pulling the image, you can verify its digest:
# minimum required form
docker inspect <image>@sha256:<digest>

# optional: include the tag for readability
docker inspect <image>:<tag>@sha256:<digest>

# example
docker inspect python:3.9-alpine@sha256:84006fe377327d69aa1b7576ad5f9c2081bca2ce5d67fa17baf25cdf152d37ad
  • Compare Digests - Compare the digest you pulled with the one you expect. If they match, you know the image is exactly what you intended to use.
    • Look for RepoDigests in the output to validate:
[
    {
        "Id": "sha256:ae876e42c70993be1619770990412615df3752b0a89a8ef6889fb798cbc4bdf4",
        "RepoTags": [
            "python:3.9-alpine"
        ],
        "RepoDigests": [
            "python@sha256:84006fe377327d69aa1b7576ad5f9c2081bca2ce5d67fa17baf25cdf152d37ad"
        ],
...
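
The comparison step can also be scripted instead of read by eye. Below is a minimal sketch that parses docker inspect output and checks RepoDigests for an expected digest. The function name is hypothetical, and the sample JSON contains only the fields shown above rather than a full docker inspect result.

```python
import json


def has_expected_digest(inspect_output: str, expected_digest: str) -> bool:
    """Return True if any RepoDigests entry carries the expected digest."""
    images = json.loads(inspect_output)  # docker inspect prints a JSON array
    for image in images:
        for repo_digest in image.get("RepoDigests", []):
            # Entries look like "<repository>@sha256:<hex>".
            _, _, digest = repo_digest.partition("@")
            if digest == expected_digest:
                return True
    return False


# Trimmed sample containing only the fields from the output above.
SAMPLE_OUTPUT = """
[
    {
        "Id": "sha256:ae876e42c70993be1619770990412615df3752b0a89a8ef6889fb798cbc4bdf4",
        "RepoTags": ["python:3.9-alpine"],
        "RepoDigests": [
            "python@sha256:84006fe377327d69aa1b7576ad5f9c2081bca2ce5d67fa17baf25cdf152d37ad"
        ]
    }
]
"""

EXPECTED = "sha256:84006fe377327d69aa1b7576ad5f9c2081bca2ce5d67fa17baf25cdf152d37ad"
print(has_expected_digest(SAMPLE_OUTPUT, EXPECTED))  # True
```

In a real pipeline the input string would come from running docker inspect (for example via subprocess.run with capture_output=True), and a False result would fail the build. Note that the check compares against RepoDigests, not Id, since those are different values.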

Great! My Team Has No Time To Maintain This Though

Agreed! This is where signal.fyi comes in. signal.fyi is a GitHub Marketplace App that maintains the parent Docker image digest for you. Born from the fallout of the Log4Shell event, its flagship product checks daily that your Docker images are up-to-date, without the manual overhead.
