All artifacts in Fossil are identified by a unique hash, currently using the SHA3 algorithm by default, but historically using the SHA1 algorithm. Therefore, there are two full-length hash formats used by Fossil:
Algorithm | Raw Bits | Hex ASCII Bytes |
---|---|---|
SHA3-256 | 256 | 64 |
SHA1 | 160 | 40 |
There are many types of artifacts in Fossil: commits (a.k.a. check-ins), tickets, ticket comments, wiki articles, forum postings, file data belonging to check-ins, etc. (More info...).
There is a loose hierarchy of terms used instead of “hash” in various parts of the Fossil UI, terms we try to use consistently, though we have not always succeeded. We cover each of those terms in the sections below.
Names
Several Fossil interfaces accept a wide variety of check-in names: commit artifact hashes, ISO8601 date strings, branch names, etc.
Artifact hashes are names, but not all names are artifact hashes. We use the broader term to refer to the whole class of options, and we use the specific terms when we mean one particular type of name.
Versions
When an artifact hash refers to a specific commit, Fossil sometimes calls it a “VERSION,” a “commit ID,” or a “check-in ID.” This is a specific type of artifact hash, distinct from, let us say, a wiki article artifact hash.
We may eventually settle on one of these terms, but all three are currently in common use within Fossil’s docs, UI, and programming interfaces.
A unique prefix of a VERSION hash is itself a VERSION. That is, if your repository has exactly one commit artifact with a hash prefix of “abc123”, then that is a valid version string as long as it remains unambiguous.
UUIDs: An Unfortunate Historical Artifact
Historically, Fossil incorrectly used the term “UUID” where it should use the term “artifact hash” instead. There are two primary problems with miscalling Fossil artifact hashes UUIDs:
UUIDs are always 128 bits in length — 32 hex ASCII bytes — making them shorter than any actual Fossil artifact hash.
Artifact hashes are necessarily highly pseudorandom blobs, but only version 4 UUIDs are pseudorandom in the same way. Other UUID types have non-random meanings for certain subgroups of the bits, restrictions that Fossil artifact hashes do not meet.
Therefore, no Fossil hash can ever be a proper UUID.
Nevertheless, there are several places in Fossil where we still use the term UUID, primarily for backwards compatibility:
Repository DB Schema
Almost all of these uses flow from the blob.uuid
table column. This is
a key lookup column in the most important persistent Fossil DB table, so
it influences broad swaths of the Fossil internals.
Someday we may rename this column and those it has influenced (e.g.
purgeitem.uuid
, shun.uuid
, and ticket.tkt_uuid
) by making Fossil
detect the outdated schema and silently upgrade it, coincident with
updating all of the SQL in Fossil that refers to these columns. Until
then, Fossil will continue to have “UUID” all through its internals.
In order to avoid needless terminology conflicts, Fossil code that
refers to these misnamed columns also uses some variant of “UUID.” For
example, C code that refers to SQL result data on blob.uuid
usually
calls the variable zUuid
. Another example is the internal function
uuid_to_rid()
. Until and unless we decide to rename these DB columns,
we will keep these associated internal identifiers unchanged.
You may have local SQL code that digs into the repository DB using these
column names. If so, be warned: we are not inclined to consider
existence of such code sufficient reason to avoid renaming the columns.
The Fossil repository DB schema is not considered an external user
interface, and internal interfaces are subject to change at any time. We
suggest switching to a more stable API: the JSON API, /timeline.rss
,
TH1, etc.
There are also some temporary tables that misuse “UUID” in this way.
(description.uuid
, timeline.uuid
, xmark.uuid
, etc.) There’s a good
chance we’ll fix these before we fix the on-disk DB schema since no
other code can depend on them.
TH1 Scripting Interfaces
Some TH1 interfaces use “UUID” where they actually mean some
kind of hash. For example, the $tkt_uuid
variable, available via TH1
when customizing Fossil’s ticket system.
Because this is considered a public programming interface, we are
unwilling to unilaterally rename such TH1 variables, even though they
are “wrong.” For now, we are simply documenting the misuse. Later, we
may provide a parallel interface — e.g. $tkt_hash
in this case — and
drop mention of the old interface from the documentation, but still
support it.
JSON API Parameters and Outputs
The JSON API frequently misuses the term “UUID” in the same sort of way, most commonly in artifact and timeline APIs. As with the prior case, we can’t fix these without breaking code that uses the JSON API as originally designed, so our solutions are the same: document the misuse here for now, then possibly provide a backwards-compatible fix later.
manifest.uuid
If you have the manifest
setting enabled, Fossil writes a file
called manifest.uuid
at the root of the check-out tree containing the
commit hash for the current checked-out version. Because this is a
public interface, we are unwilling to rename the file for correctness.