Essays and Analysis about All Things Data

A datom is a finite, discrete sequence of bits (i.e., a bit pattern) that is:

  1. stored on and retrieved from digital media as a single holistic object by a software application, and
  2. governed by one or more specifications that
    • specify what constitutes a valid bit pattern, and
    • explain how the bit pattern is to be interpreted.

The term “datom” corresponds to the term “token” in the sense of “an atomic object in parsing”.

UTF-8, UTF-16, MP3, and JPEG are examples of “governing specifications” that define which bit patterns are valid for a datom and how those bit patterns are to be interpreted.
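To make the role of a governing specification concrete, here is a minimal sketch in Python. It assumes that a Python codec name can stand in for a governing specification, and the helper name `interpret` is hypothetical, introduced only for illustration. The same bit pattern yields different values under different specifications, and a pattern the specification does not allow is rejected.

```python
def interpret(bits: bytes, spec: str):
    """Return the interpretation of `bits` under `spec`, or None if invalid.

    `spec` is assumed to be a Python codec name standing in for a
    governing specification such as UTF-8 or UTF-16.
    """
    try:
        # Strict decoding raises an error for bit patterns the codec rejects.
        return bits.decode(spec)
    except UnicodeDecodeError:
        return None


bits = b"\x68\x69"

as_utf8 = interpret(bits, "utf-8")       # two characters: "hi"
as_utf16 = interpret(bits, "utf-16-le")  # one character: U+6968
invalid = interpret(b"\xff", "utf-8")    # 0xFF alone is not valid UTF-8
```

The bytes never change; only the specification applied to them does, which is why a datom is defined by its bit pattern together with its governing specification.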

The bit pattern may represent, for example, a string of lexical characters, a number, or a binary object.  The value of a cell in a relational table is a datom; the value of an XML attribute is a datom; and the content of an XML element that only contains character data is a datom.
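As a rough illustration of the XML cases, the following Python sketch uses the standard-library `xml.etree.ElementTree` module to pull out an attribute value and the character content of an element. The sample document and the variable names are made up for this example, and encoding the extracted text as UTF-8 is an assumption about the governing specification.

```python
import xml.etree.ElementTree as ET

# A made-up document: one attribute and one element holding only character data.
root = ET.fromstring('<person id="7"><name>Ada</name></person>')

attribute_value = root.get("id")        # value of an XML attribute
element_text = root.find("name").text   # content of a character-data-only element

# Each value, taken as a bit pattern under UTF-8, would be one datom.
attribute_bits = attribute_value.encode("utf-8")
element_bits = element_text.encode("utf-8")
```

The `person` element itself is not a datom in this reading, because its content is another element rather than character data alone.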

Datoms are typically associated with a data type, such as STRING, CHAR, INTEGER, or LONG, just as programming-language variables and database fields are assigned data types.

Datoms are almost always associated with lexical, natural-language names or labels.
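The pieces described so far (a bit pattern, a governing specification, a data type, and a name) can be gathered into one small record. The sketch below is an illustration only: the class and its field names are hypothetical and are not taken from the Datomic Data Model, and a Python codec name again stands in for the governing specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datom:
    name: str       # natural-language label, e.g. a column or attribute name
    data_type: str  # e.g. "STRING" or "INTEGER"
    spec: str       # governing specification (assumed: a Python codec name)
    bits: bytes     # the stored bit pattern

    def value(self):
        """Interpret the bit pattern under the spec, then apply the data type."""
        text = self.bits.decode(self.spec)
        return int(text) if self.data_type == "INTEGER" else text


age = Datom(name="age", data_type="INTEGER", spec="utf-8", bits=b"42")
city = Datom(name="city", data_type="STRING", spec="utf-8",
             bits="Zürich".encode("utf-8"))
```

Calling `age.value()` gives the integer 42, while `city.value()` gives the string "Zürich"; the name tells a reader what the value means, and the data type tells software how to treat it.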

Datoms are the fundamental, atomic component of the Datomic Data Model. “Datom” and “datomic” are neologisms formed by combining “data” with “atom” and “atomic”.

The Datomic Data Model is a graph-based data model that generalizes all data structures and serves as the foundation for the Data Mapping Specification Language (DAMSL).