How to secure your software supply chain with Artifact Binary Provenance

Mike Long
Published January 13, 2022 in technology

At Kosli, we use Artifact Binary Provenance as the foundation for our audit trails. Artifact Binary Provenance is a fancy term, but the idea behind it is quite simple: it means we can identify exactly which software we have running in production. Let’s take a closer look 👀

How should we identify software?

There are many ways to identify software. In our industry we’ve tried different approaches to version numbers, such as semantic versioning and release names. These are human-centered approaches that involve applying a name to a specific piece of software.

This approach is called version labeling.

The downside of this approach is that it is fallible. Any label can be applied to any software package, so it’s easy to see how mistakes can be made. For example, the version number could be incorrectly bumped, or errors in copying and distributing software could cause a misapplication of identity.

Version labeling also creates a security threat. A malicious actor could label compromised software in a way that makes a system believe it is running qualified software.

For compliance and security reasons we need a more reliable approach.

Content Addressable Storage

In high-security environments we need a tamper-evident identity scheme. Put plainly, if the software changes we want it to have a different identity.

Luckily, this is a solved problem in computer science. The solution is Content Addressable Storage.

The mechanism is simple: instead of using a label to define software identity, you use the cryptographic hash of the software itself.

This means that if a single byte in the software changes it will have a different identity.
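As a minimal sketch of the idea, the snippet below computes a SHA-256 digest over an artifact's bytes and shows that changing one byte yields a different identity. The function names and the sample byte strings are illustrative assumptions, not part of any Kosli tooling.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def fingerprint_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large artifacts need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Illustrative stand-ins for a real binary and a modified copy of it.
original = b"example artifact contents"
tampered = b"example artifact contentz"  # the last byte differs

print(fingerprint(original))
print(fingerprint(tampered))
print(fingerprint(original) == fingerprint(tampered))  # False
```

The identity is derived from the content alone, so renaming or relabeling the file has no effect on it.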

Can’t I just use the git commit SHA to identify the software?

Git commits define a content addressable snapshot of the source code (and its history). If you are distributing the source repo as your artifact this could be a valid method of identity.

However, in most cases software is not distributed as source but as binaries (typically produced through compilation, packaging, or Docker image builds). This translation process is often non-reproducible or nondeterministic, so there is no verifiable link from source to binary. In other words, a binary package could be labeled with a source commit that it was not actually built from.

For this reason we use a Secure Hash Algorithm (SHA) to identify the binary.
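To make the point concrete, here is a small sketch of a nondeterministic packaging step. It uses gzip as a stand-in for a real build, on the assumption that the only difference between two builds is an embedded timestamp (the `mtime` argument needs Python 3.8 or later). The source bytes are a made-up placeholder.

```python
import gzip
import hashlib

# Stand-in for the source at one fixed commit.
source = b"print('hello')\n"

# "Build" the same source twice. Only the timestamp stored in the
# gzip header differs between the two outputs.
build_a = gzip.compress(source, mtime=1)
build_b = gzip.compress(source, mtime=2)

digest_a = hashlib.sha256(build_a).hexdigest()
digest_b = hashlib.sha256(build_b).hexdigest()

print(digest_a == digest_b)  # False
print(gzip.decompress(build_a) == gzip.decompress(build_b))  # True
```

Both packages come from identical source, yet they are different binaries with different digests, which is why the commit alone cannot identify what is actually running.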

Storing the provenance

Now that we have a method for identifying software, wouldn’t it be great if we could look this up on demand from our DevOps tools?

A compliance System of Record provides a secure database to store claims about that identity (we have a strong opinion on what that should be 😇). When we create a binary in our secure CI build process, we store its identity information in a journal.

[Figure: Artifact Binary Provenance process in Kosli]

As each binary progresses through the value stream you can record evidence against it such as:

  • Source commit
  • Build URL
  • Test results
  • Security analysis
  • Deployments
  • Approvals
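The list above can be sketched as an append-only journal keyed by the binary's digest. This is a hypothetical in-memory illustration only: the `Journal` class, the evidence kinds, and the example values are assumptions and do not represent Kosli's API or data model.

```python
import hashlib
from collections import defaultdict


class Journal:
    """Hypothetical in-memory journal: evidence is appended per artifact digest."""

    def __init__(self):
        self._entries = defaultdict(list)

    def record(self, digest, kind, value):
        """Append one piece of evidence against the artifact with this digest."""
        self._entries[digest].append((kind, value))

    def evidence_for(self, digest):
        """Return a copy of all evidence recorded for this digest, in order."""
        return list(self._entries.get(digest, []))


# Placeholder bytes standing in for a binary produced by CI.
artifact = b"binary built by CI"
digest = hashlib.sha256(artifact).hexdigest()

journal = Journal()
journal.record(digest, "source_commit", "<commit sha>")  # placeholder value
journal.record(digest, "build_url", "https://ci.example.com/builds/1")  # made-up URL
journal.record(digest, "test_results", "passed")

for kind, value in journal.evidence_for(digest):
    print(kind, value)
```

Because every entry hangs off the digest, evidence cannot drift onto a different binary that happens to share a version label.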

This information is as easy to look up as it is to store. Our deployment processes can perform risk controls to ensure deployments are based on known, approved binaries and verified processes. This is why we believe Artifact Binary Provenance is the basis for any compliance-based DevOps approach. It makes it impossible to qualify one piece of software and deploy another.
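A deployment-time risk control of this kind might look roughly like the following. It is a sketch under stated assumptions: the required evidence kinds, the `may_deploy` function, and the shape of the recorded data are all hypothetical, chosen only to show a lookup by digest before a deployment proceeds.

```python
import hashlib

# Hypothetical policy: evidence kinds that must exist before deployment.
REQUIRED_EVIDENCE = {"test_results", "security_analysis", "approval"}


def may_deploy(artifact: bytes, recorded: dict) -> bool:
    """Allow deployment only if this exact binary has all required evidence.

    `recorded` maps an artifact digest to the set of evidence kinds
    stored against it.
    """
    digest = hashlib.sha256(artifact).hexdigest()
    evidence = recorded.get(digest, set())
    return REQUIRED_EVIDENCE <= evidence


# Placeholder bytes standing in for a qualified binary.
approved = b"qualified build"
recorded = {
    hashlib.sha256(approved).hexdigest(): {
        "test_results",
        "security_analysis",
        "approval",
    }
}

print(may_deploy(approved, recorded))  # True
print(may_deploy(b"qualified build (modified)", recorded))  # False
```

The modified binary is rejected even if it carries the same version label, because the check is made against its content hash.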

What about the humans?

Does this mean SemVer is dead? That you shouldn’t use git SHAs to identify your software? Not at all!

These are very useful ways for humans to navigate identity through version control and CI systems. However, since they are fallible, we still need the primary key of identity to be the content-addressable hash, with the labels linked to it. Labels are for humans and SHAs are for machines.
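One way to picture the relationship is a label index that points at digests, with the digest remaining the key that machines verify. The label names and helper functions below are illustrative assumptions.

```python
import hashlib

# Placeholder bytes standing in for a released binary.
artifact = b"release build"
digest = hashlib.sha256(artifact).hexdigest()

# Human-friendly labels (made-up examples) that all point at one digest.
labels = {"v1.4.2": digest, "latest": digest}


def resolve(label: str) -> str:
    """Translate a human label into the digest it refers to."""
    return labels[label]


def matches_label(candidate: bytes, label: str) -> bool:
    """Verify a binary against a label by comparing content hashes."""
    return hashlib.sha256(candidate).hexdigest() == resolve(label)


print(matches_label(artifact, "v1.4.2"))  # True
print(matches_label(b"some other build", "v1.4.2"))  # False
```

Humans search and talk in labels, while any automated check resolves the label to a digest and compares content.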
