Component 1: The Schema

The shape of the domain as a small category.

1. The Primitive Type Category

We start at the bottom. Every column in every table has a type, drawn from a fixed collection of primitives. These types come with a natural subtyping structure (integers embed into longs, dates refine to timestamps), and the lattice structure propagates upward into the rest of the framework.

Definition 1.1 : Type

Let Type be the category whose objects are primitive types:

Ob(Type) = {String, Integer, Long, Double, Boolean, Date, Timestamp, Binary, GeoPoint, GeoShape, TimeSeries, Array(τ), Struct({l_i : τ_i}), Attachment, Marking}

Morphisms are subtype inclusions and coercions: if τ1 is a subtype of τ2, there is a unique morphism τ1 → τ2. The familiar chains Integer → Long → Double and Date → Timestamp are examples. Because there is at most one morphism between any two objects, Type is a thin category.

The subtype ordering makes Type a bounded lattice: ⊤ = String (every type coerces to string) and ⊥ = Void (the empty type, which is the initial object; it is adjoined here and does not appear in the list of primitives above).

Proposition 1.2 : Type is a finite lattice

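Type has all finite meets ∧ and joins ∨. The meet τ1 ∧ τ2 is the greatest common subtype; the join τ1 ∨ τ2 is the least common supertype. Finiteness alone does not guarantee these: what is needed is that every pair of types has a least common supertype, and then, because the type system is finite and has a bottom element, all meets exist as well. In practice, the join is what matters: when you merge two columns, the result type is τ1 ∨ τ2.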
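The join can be computed directly from the subtype order. Below is a minimal Python sketch, assuming a small illustrative fragment of the order: the chains Integer, Long, Double and Date, Timestamp come from the text, while the exact edge table, the placement of Boolean, and the function names are assumptions made for the example.

```python
# Direct coercions (type -> its immediate supertypes).
# Illustrative fragment only, not the framework's full type table.
DIRECT_SUPERTYPE = {
    "Void": ["Integer", "Date", "Boolean"],
    "Integer": ["Long"],
    "Long": ["Double"],
    "Double": ["String"],
    "Date": ["Timestamp"],
    "Timestamp": ["String"],
    "Boolean": ["String"],
    "String": [],
}


def supertypes(t):
    """All types reachable from t by a chain of coercions, including t itself."""
    seen, stack = set(), [t]
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(DIRECT_SUPERTYPE[cur])
    return seen


def is_subtype(a, b):
    """True when a coercion a -> b exists (reflexive and transitive)."""
    return b in supertypes(a)


def join(a, b):
    """Least common supertype: the common supertype that sits below all the others."""
    common = supertypes(a) & supertypes(b)
    least = [c for c in common if all(is_subtype(c, d) for d in common)]
    if len(least) != 1:
        raise ValueError(f"no unique join for {a} and {b}")
    return least[0]


merged = join("Integer", "Date")  # the only common supertype is the top, String
```

Merging an Integer column with a Long column stays numeric (the join is Long), while merging Integer with Date falls all the way to String in this fragment.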

Corollary 1.3 : Unique coercion

For any chain τ1 → τ2 → … → τk, the composite is unique. Thinness kills ambiguity: there is never more than one way to coerce along a chain.

Remark 1.4

Type compatibility is a preorder, coercion chains compose by transitivity, and you can use a value of type τ1 wherever τ2 is expected as long as τ1 → τ2 exists. Nothing deep, but getting it right once means we never have to think about it again.

2. Schemas as Functors

A schema is just a list of columns with their types, and the categorical way to say this is surprisingly clean.

Definition 2.1 : Schema

A schema is a functor S : n → Type, where n = {1, 2, …, n} is the discrete category on n objects. For each k ∈ n, S(k) = τk gives the type of column k. Equivalently:

S ≅ τ1 × τ2 × … × τn

A named schema is a pair (S, name) where name : n → Label gives each column a human-readable label.

Definition 2.2 : Schema Category Sch

The category Sch has:

Schema morphisms capture every structural transformation of tabular data: drop columns, rename them, cast types. Sch sits inside [FinSet^op, Type] as a full subcategory.

Example 2.3

A "Person" schema: S : 3 → Type with S(1) = String, S(2) = Integer, S(3) = Date, named (name, age, dob). In product form: String × Integer × Date.
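Because the indexing category is discrete, a schema carries no more information than an ordered tuple of types, and a named schema adds a parallel tuple of labels. A minimal Python sketch of the Person example follows; the class name NamedSchema and its methods are hypothetical, and types are represented as plain strings for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NamedSchema:
    types: tuple  # (S(1), ..., S(n))
    names: tuple  # (name(1), ..., name(n))

    def __post_init__(self):
        # Both maps are defined on the same index set {1, ..., n}.
        if len(self.types) != len(self.names):
            raise ValueError("every column needs exactly one label")

    def S(self, k):
        """Type of column k, 1-indexed as in the text."""
        return self.types[k - 1]

    def product_form(self):
        """The schema written as a product of its column types."""
        return " × ".join(self.types)


person = NamedSchema(
    types=("String", "Integer", "Date"),
    names=("name", "age", "dob"),
)
```

Here person.S(2) returns the type of the age column, and person.product_form() reproduces the product form given in the example.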

3. The Ontology Schema Category

Now we assemble the central object. The idea: entity types become objects, relationship types become morphisms, and the whole thing lives in a single small category.

Definition 3.1 : OntSch

The ontology schema category OntSch is a small category where:

Multiple morphisms between the same pair of objects are allowed (a person can be both an employee of and a shareholder in the same company), so OntSch is really a finite directed multigraph, viewed as a category.

Example 3.2: Multigraph structure
Person --employs--> Company
Person --shareholder--> Company
Person --orders--> Order --contains--> Product

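Four distinct link types, two of which (employs and shareholder) share both source and target. Note that a sequence like Person, Company, Product is not a path: links compose only when the target of one is the source of the next, and no link here runs from Company to Product.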

Remark 3.3

The triple (S_O, pk_O, title_O) is doing all the work: S_O gives the table shape, pk_O identifies the uniqueness constraint, and title_O picks the column used for display in UIs. That is everything you need to fully specify an entity type.

4. The Property Functor and Foreign Keys

Each entity type carries a schema; that much is clear. But link types do not induce maps between schemas in any natural way. The right structure is a span.

Definition 4.1 : Property Functor

The property functor S : Ob(OntSch) → Ob(Sch) sends each entity type O to its property schema S_O. Note that S is only defined on objects, so despite the name it is a function on objects (equivalently, a functor out of the discrete category on Ob(OntSch)). It does not extend to morphisms, because a link L : O_i → O_j tells you nothing about how the columns of O_i and O_j relate. What the link does specify is which columns participate in the join.

Definition 4.2 : Foreign Key Specification

For a link type L : O_i → O_j, the foreign key is a span in Sch:

S(O_i) <--π1-- FK_L --π2--> S(O_j)

FK_L is the key schema; π1 picks out the foreign key column(s) from the source; π2 embeds the primary key column(s) of the target.

For n:1 links, FK_L is just the primary key type of O_j, living as a column in S(O_i). For m:n links, FK_L is a separate join schema, potentially backed by its own dataset.
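At the level of rows, the two legs of the span become key-extraction functions, and traversing the link amounts to pairing rows whose extracted keys agree. The following Python sketch illustrates this for an n:1 link; the function name traverse, the column names, and the sample rows are all hypothetical.

```python
def traverse(source_rows, target_rows, fk_cols, pk_cols):
    """Pair each source row with the target rows whose primary key
    equals the source row's foreign key (the two legs of the span)."""
    # Index target rows by their primary key value.
    index = {}
    for t in target_rows:
        index.setdefault(tuple(t[c] for c in pk_cols), []).append(t)
    # Look up each source row's foreign key value in that index.
    return [
        (s, t)
        for s in source_rows
        for t in index.get(tuple(s[c] for c in fk_cols), [])
    ]


# Hypothetical data: the n:1 case, where the key lives as a column on the source.
persons = [
    {"person_id": 1, "name": "Ada", "company_id": 10},
    {"person_id": 2, "name": "Bo", "company_id": 10},
    {"person_id": 3, "name": "Cy", "company_id": 99},  # dangling key, no match
]
companies = [
    {"company_id": 10, "title": "Acme"},
    {"company_id": 20, "title": "Globex"},
]

pairs = traverse(persons, companies, ["company_id"], ["company_id"])
matched = [(s["name"], t["title"]) for s, t in pairs]
```

For an m:n link the same function could be applied twice, once from the source to the join dataset and once from the join dataset to the target.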

5. OntSch as a Quiver

If we strip OntSch of composition and identities, keeping only the raw graph of entity types and link types, we get a quiver. The free category on this quiver recovers composition and gives us something extra: a universal property that pins down multi-hop traversal.

Definition 5.1 : The Ontology Quiver

Let Q = (V, E, s, t) where V = Ob(OntSch), E = link types, and s, t assign source and target. The free category Path(Q) has objects V and morphisms = composable paths of link types, with composition by concatenation and the empty path as identity.

Theorem 5.2 : Universal property of Path(Q)

For any category C and graph morphism F : Q → U(C) (where U forgets composition), there exists a unique functor F̄ : Path(Q) → C extending F.

In plain terms: once you decide what each entity type and each link type mean in some target category, all multi-hop traversals are forced. You don't get to choose how two-hop paths behave; the universal property decides for you. Link traversal "just works."
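A small Python sketch can make the forced extension concrete, assuming the target category is sets and functions and each link is interpreted as a lookup function. The link names placed_by and located_in, the sample data, and the helper extend are hypothetical, and paths are written in traversal order.

```python
from functools import reduce


def extend(edge_map):
    """Given the interpretation of single links, return the induced
    interpretation of paths (lists of link names in traversal order)."""
    def on_path(path):
        def composite(x):
            # Apply each link's function in turn; the empty path leaves x unchanged.
            return reduce(lambda acc, link: edge_map[link](acc), path, x)
        return composite
    return on_path


# Hypothetical n:1 links, each interpreted as a function on identifiers.
placed_by = {101: "c1", 102: "c1", 103: "c2"}   # Order -> Customer
located_in = {"c1": "EU", "c2": "US"}           # Customer -> Region

F = {"placed_by": placed_by.__getitem__, "located_in": located_in.__getitem__}
F_bar = extend(F)

region_of_order = F_bar(["placed_by", "located_in"])(103)
unchanged = F_bar([])(101)
```

Nothing about the two-hop behaviour was chosen separately: once the two single links are fixed, the value on the path is determined, and the empty path acts as the identity.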

Corollary 5.3

Morphisms in Path(Q) are sequences L1 ∘ L2 ∘ … ∘ Lk with matching endpoints. Identity is the empty path. Composition is concatenation, nothing more.
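As a rough illustration, paths can be modelled as tuples of link names with an endpoint check on composition. The sketch below assumes the links of Example 3.2 point from left to right as written; the Path class and its method names are hypothetical, and paths are stored in traversal order.

```python
from dataclasses import dataclass

# Link name -> (source, target); direction assumed from Example 3.2.
LINKS = {
    "employs": ("Person", "Company"),
    "shareholder": ("Person", "Company"),
    "orders": ("Person", "Order"),
    "contains": ("Order", "Product"),
}


@dataclass(frozen=True)
class Path:
    source: str
    target: str
    links: tuple = ()

    @staticmethod
    def identity(obj):
        """The empty path at an entity type."""
        return Path(obj, obj, ())

    @staticmethod
    def of(link):
        """The one-step path consisting of a single link."""
        s, t = LINKS[link]
        return Path(s, t, (link,))

    def then(self, other):
        """Concatenate two paths; endpoints must match."""
        if self.target != other.source:
            raise ValueError(f"cannot compose: {self.target} != {other.source}")
        return Path(self.source, other.target, self.links + other.links)


two_hop = Path.of("orders").then(Path.of("contains"))
```

Attempting Path.of("employs").then(Path.of("contains")) raises an error, since Company is not the source of contains.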

6. Cardinality as Enrichment

The 1:1, 1:n, n:1, m:n annotations on links are more than database decoration: they constrain the shape of any presheaf inhabiting the schema, and the right language for this is enrichment.

Definition 6.1 : Cardinality constraints

For L : O_i → O_j with card(L):

Proposition 6.2 : Enrichment over cardinality

We can view OntSch as enriched over the lattice 1:1 ≤ n:1 ≤ m:n (and 1:1 ≤ 1:n ≤ m:n). The cardinality determines the implementation: n:1 as a foreign key column, m:n as a join table, 1:1 as a subset isomorphism. These are structural consequences of the enrichment, not choices an engineer makes.

Example 6.3

An n:1 link Order → Customer stores a customer_id column in the Order table. The reverse 1:n is computed (the set of orders per customer). An m:n link Student ↔ Course requires a join table with student_id and course_id. None of this is surprising, but notice that the category theory tells you which implementation is forced.
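To see how an annotation constrains the data, the following Python sketch computes the tightest cardinality class that a given set of link instances satisfies. It assumes the convention of the example above, where n:1 means each source has at most one target; the function name and sample identifiers are hypothetical.

```python
from collections import Counter


def cardinality(pairs):
    """Tightest class (1:1, n:1, 1:n or m:n) satisfied by a set of
    (source, target) link instances."""
    pairs = set(pairs)
    out_degree = Counter(s for s, _ in pairs)  # targets per source
    in_degree = Counter(t for _, t in pairs)   # sources per target
    one_target_per_source = all(n == 1 for n in out_degree.values())
    one_source_per_target = all(n == 1 for n in in_degree.values())
    if one_target_per_source and one_source_per_target:
        return "1:1"
    if one_target_per_source:
        return "n:1"
    if one_source_per_target:
        return "1:n"
    return "m:n"


order_customer = {("o1", "c1"), ("o2", "c1"), ("o3", "c2")}
student_course = {("s1", "math"), ("s1", "art"), ("s2", "math")}

forward = cardinality(order_customer)
reverse = cardinality({(c, o) for o, c in order_customer})
enrolment = cardinality(student_course)
```

The Order to Customer data fits in a single foreign key column because each order has one customer, while the Student and Course data cannot, which is why it needs a join table.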

7. The Schema as a Sketch

Ehresmann's notion of a sketch gives one last way to think about the schema: a sketch is a category with distinguished cones and cocones, and a model is a functor preserving them. The punchline is that models of OntSch-as-sketch are exactly the valid database states.

Definition 7.1 : OntSch as a sketch

The sketch T = (OntSch, L, C) consists of:

A model of this sketch is a functor M : OntSch → Set that preserves the distinguished cones and cocones. The category of models is exactly the full subcategory of presheaves satisfying all referential integrity and cardinality constraints.

Theorem 7.2 : Models = valid presheaves

Mod(T) is equivalent to the full subcategory of [OntSch^op, Set] consisting of presheaves that send distinguished limit cones to limits and distinguished colimit cocones to colimits. Referential integrity and cardinality constraints are not ad hoc checks: they are the limit/colimit preservation conditions.
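In operational terms, checking that a candidate state is a model comes down to checking constraints on the tables. The sketch below is a deliberately simplified Python illustration covering only two conditions, primary key uniqueness and foreign key resolution; it does not implement general cone preservation, and the function name, table layout, and sample data are hypothetical.

```python
def violations(tables, primary_keys, foreign_keys):
    """List the ways a candidate state fails two basic constraints:
    unique primary keys and foreign keys that resolve to a target row."""
    problems = []
    # Primary key uniqueness within each table.
    for table, pk in primary_keys.items():
        seen = set()
        for row in tables[table]:
            if row[pk] in seen:
                problems.append(f"duplicate key {row[pk]!r} in {table}")
            seen.add(row[pk])
    # Referential integrity: every foreign key value must exist in the target.
    for (src, col), tgt in foreign_keys.items():
        valid = {row[primary_keys[tgt]] for row in tables[tgt]}
        for row in tables[src]:
            if row[col] not in valid:
                problems.append(f"{src}.{col}={row[col]!r} has no match in {tgt}")
    return problems


tables = {
    "Customer": [{"customer_id": "c1"}],
    "Order": [
        {"order_id": 1, "customer_id": "c1"},
        {"order_id": 2, "customer_id": "c9"},  # points at a missing customer
    ],
}
primary_keys = {"Customer": "customer_id", "Order": "order_id"}
foreign_keys = {("Order", "customer_id"): "Customer"}

report = violations(tables, primary_keys, foreign_keys)
```

A state with an empty report passes these two checks; under the assumptions of this sketch, the second order above is what keeps the sample state from being valid.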

Remark 7.3

This answers a question that sounds simple but is surprisingly hard to make precise: "What is a valid database state?" Answer: a model of the sketch. The sketch formulation is strictly more expressive than ER diagrams, since it can encode arbitrary limit/colimit conditions, not just keys and cardinalities.


Summary

The schema OntSch is a small category carrying: a property functor S on objects, cardinality annotations on morphisms, and foreign key specifications as spans. Its free path category generates multi-hop traversals by universal property. As a sketch, it determines exactly which presheaves are valid database states. There is no data here, only structure, and from this structure the entire relational backbone follows.

Next: Component 2: The Presheaf, pouring data into this skeleton.