Component 1: The Schema

The shape of the domain as a small category.

1. The Primitive Type Category

We start at the bottom. Every column in every table has a type, drawn from a fixed collection of primitives. These types come with a natural subtyping structure (integers embed into longs, dates into timestamps), and this ordering propagates upward into the rest of the framework.

Definition 1.1 : Type

Let Type be the category whose objects are primitive types:

Ob(Type) = {String, Integer, Long, Double, Boolean, Date, Timestamp, Binary, GeoPoint, GeoShape, TimeSeries, Array(τ), Struct({l_i : τ_i}), Attachment, Marking}

Morphisms are subtype inclusions and coercions: if τ₁ is a subtype of τ₂, there is a unique morphism τ₁ → τ₂. The familiar chains Integer → Long → Double and Date → Timestamp are examples. Because there is at most one morphism between any two objects, Type is a thin category.

The subtype ordering makes Type a bounded lattice: ⊤ = String (every type coerces to string) and ⊥ = Void (the empty type, adjoined to the list above as the initial object).

Proposition 1.2 : Type is a finite lattice

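Type has all finite meets ∧ and joins ∨. The meet τ₁ ∧ τ₂ is the greatest common subtype; the join τ₁ ∨ τ₂ is the least common supertype. Every pair has at least one common supertype (⊤) and one common subtype (⊥); that a least, respectively greatest, one exists is a property of the chosen subtype order, which finiteness alone does not guarantee. In practice, the join is what matters: when you merge two columns, the result type is τ₁ ∨ τ₂.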
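To make the join concrete, here is a minimal Python sketch. It assumes a small fragment of Type (the two chains named above plus Boolean, bounded by Void and String); the edge table and the function names are illustrative and not part of the framework.

```python
# Direct subtype edges for a small fragment of Type (an assumption of this
# sketch; Array, Struct and the remaining primitives are left out).
EDGES = {
    "Void": {"Integer", "Date", "Boolean"},
    "Integer": {"Long"},
    "Long": {"Double"},
    "Double": {"String"},
    "Date": {"Timestamp"},
    "Timestamp": {"String"},
    "Boolean": {"String"},
    "String": set(),
}

def supertypes(t):
    """All types reachable from t, including t (reflexive-transitive closure)."""
    seen, stack = {t}, [t]
    while stack:
        for nxt in EDGES[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def is_subtype(a, b):
    """True when a morphism a -> b exists in the thin category."""
    return b in supertypes(a)

def join(a, b):
    """Least common supertype: the common supertype lying below all others."""
    common = supertypes(a) & supertypes(b)
    least = [c for c in common if all(is_subtype(c, d) for d in common)]
    if len(least) != 1:
        raise ValueError("the order is not a lattice at this pair")
    return least[0]

merged = join("Integer", "Double")   # result type when merging two columns
```

Merging an Integer column with a Date column falls back to the top element, since String is their only common supertype in this fragment.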

Corollary 1.3 : Unique coercion

For any chain τ₁ → τ₂ → … → τ_k, the composite is unique. Thinness kills ambiguity: there is never more than one way to coerce along a chain.

Remark 1.4

Type compatibility is a preorder, coercion chains compose by transitivity, and you can use a value of type τ₁ wherever τ₂ is expected as long as τ₁ → τ₂ exists. Nothing deep, but getting it right once means we never have to think about it again.

2. Schemas as Functors

A schema is just a list of columns with their types, and the categorical way to say this is surprisingly clean.

Definition 2.1 : Schema

A schema is a functor S : n → Type, where n = {1, 2, …, n} is the discrete category on n objects. For each k ∈ n, S(k) = τ_k gives the type of column k. Equivalently:

S ≅ τ₁ × τ₂ × … × τ_n

A named schema is a pair (S, name) where name : n → Label gives each column a human-readable label.

Definition 2.2 : Schema Category Sch

The category Sch has:

Objects: Named schemas (S, name).
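Morphisms σ : S₁ → S₂, for S₁ : m → Type and S₂ : n → Type: a function f : n → m on column indices (each target column names the source column it is read from) together with a family of type coercions S₁(f(k)) → S₂(k) in Type, one for each k ∈ n.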

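Schema morphisms capture the structural transformations of tabular data: drop, reorder, or duplicate columns, rename them, cast types. Forgetting the labels, Sch is the free finite-product completion of Type: objects are finite families of types, and a morphism reads each target column from one source column.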

Example 2.3

A "Person" schema: S : 3 → Type with S(1) = String, S(2) = Integer, S(3) = Date, named (name, age, dob). In product form: String × Integer × Date.
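A schema morphism out of this Person schema can be sketched in a few lines of Python. The target schema, the coercion table, and the function name below are hypothetical; the index map sends each target column to the source column it is read from, and columns are counted from 0 here where the text counts from 1.

```python
from datetime import date

# Coercions keyed by (source type, target type); only those this example needs.
COERCE = {
    ("String", "String"): lambda v: v,
    ("Integer", "Long"): lambda v: int(v),
    ("Integer", "String"): lambda v: str(v),
    ("Date", "String"): lambda v: v.isoformat(),
}

def apply_morphism(source_types, target_types, f, row):
    """Transport one row of the source schema along a schema morphism.

    f[k] is the source column feeding target column k.
    """
    out = []
    for k, target_type in enumerate(target_types):
        src = f[k]
        cast = COERCE[(source_types[src], target_type)]
        out.append(cast(row[src]))
    return tuple(out)

person = ("String", "Integer", "Date")    # (name, age, dob)
summary = ("String", "String")            # hypothetical target: (dob, name)
row = ("Ada", 36, date(1815, 12, 10))

# Drop age, reorder the remaining columns, cast dob from Date to String.
result = apply_morphism(person, summary, [2, 0], row)
```

The single morphism drops a column, reorders the rest, and casts a type, which is the combination of transformations the definition is meant to cover.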

3. The Ontology Schema Category

Now we assemble the central object. The idea: entity types become objects, relationship types become morphisms, and the whole thing lives in a single small category.

Definition 3.1 : OntSch

The ontology schema category OntSch is a small category where:

Objects: Entity types O₁, …, O_n. Each O is a triple (S_O, pk_O, title_O): its property schema, primary key column, and display property.
Morphisms: Link types L : O_i → O_j, each carrying a cardinality card(L) ∈ {1:1, 1:n, n:1, m:n} and a foreign key specification.

Multiple morphisms between the same pair of objects are allowed (a person can be both an employee of and a shareholder in the same company), so the generating data of OntSch is a finite directed multigraph; Section 5 makes precise how it is viewed as a category.

Example 3.2: Multigraph structure

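Person --employs--> Company

Person --shareholder--> Company

Person --orders--> Order --contains--> Product

Four distinct link types; employs and shareholder are parallel, sharing both source and target. The only composable two-hop path is Person → Order → Product. A path like Person to Company to Product does not exist: links compose only when the target of one is the source of the next, and no link leaves Company.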
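The multigraph in this example can be written down directly. The sketch below is a minimal Python encoding that reads each line of the example left to right as source, link, target (an assumption about arrow direction); the class and helper names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LinkType:
    name: str
    source: str
    target: str

# Link types of the example, read left to right (assumed direction).
LINKS = [
    LinkType("employs", "Person", "Company"),
    LinkType("shareholder", "Person", "Company"),
    LinkType("orders", "Person", "Order"),
    LinkType("contains", "Order", "Product"),
]
BY_NAME = {link.name: link for link in LINKS}

def parallel(source, target):
    """Names of all link types between the same ordered pair of entity types."""
    return [l.name for l in LINKS if (l.source, l.target) == (source, target)]

def composable(first, second):
    """Two links compose only when the first ends where the second starts."""
    return first.target == second.source

between_person_and_company = parallel("Person", "Company")
```

Keeping links in a list instead of a dictionary keyed by (source, target) is what allows parallel edges such as employs and shareholder.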

Remark 3.3

The triple (S_O, pk_O, title_O) is doing all the work: S_O gives the table shape, pk_O names the column that carries the uniqueness constraint, and title_O picks the column used for display in UIs. That is everything you need to fully specify an entity type.

4. The Property Functor and Foreign Keys

Each entity type carries a schema; that much is clear. But link types do not induce maps between schemas in any natural way. The right structure is a span.

Definition 4.1 : Property Functor

The property functor S : Ob(OntSch) → Ob(Sch) sends each entity type O to its property schema S_O. Despite the name, S is only a function on objects: it does not extend to morphisms, because a link L : O_i → O_j tells you nothing about how the remaining columns of O_i and O_j relate. What the link does specify is which columns participate in the join.

Definition 4.2 : Foreign Key Specification

For a link type L : O_i → O_j, the foreign key is a span in Sch:

S(O_i) <--π₁-- FK_L --π₂--> S(O_j)

FK_L is the key schema; π₁ picks out the foreign key column(s) from the source; π₂ embeds the primary key column(s) of the target.

For n:1 links, FK_L is just the primary key type of O_j, living as a column in S(O_i). For m:n links, FK_L is a separate join schema, potentially backed by its own dataset.
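The two cases can be sketched on rows. The Python below is illustrative only: the column names, the sample rows, and the cardinalities assigned to the two links (n:1 for employment, m:n for shareholding) are assumptions made for the example.

```python
people = [
    {"person_id": "p1", "name": "Ada", "employer_id": "c1"},
    {"person_id": "p2", "name": "Bob", "employer_id": "c2"},
]
companies = [
    {"company_id": "c1", "title": "Acme"},
    {"company_id": "c2", "title": "Globex"},
]
# Rows of a separate join schema for the m:n case: (person_id, company_id).
holdings = [("p1", "c1"), ("p1", "c2"), ("p2", "c2")]

def join_n_to_1(sources, targets, fk, pk):
    """Match each source row to the target whose primary key equals its foreign key."""
    index = {t[pk]: t for t in targets}
    return [(s, index[s[fk]]) for s in sources if s[fk] in index]

def join_m_to_n(sources, targets, pairs, src_pk, tgt_pk):
    """Resolve each row of the join table to a (source row, target row) pair."""
    src_index = {s[src_pk]: s for s in sources}
    tgt_index = {t[tgt_pk]: t for t in targets}
    return [(src_index[a], tgt_index[b]) for a, b in pairs
            if a in src_index and b in tgt_index]

employed = [(s["name"], t["title"])
            for s, t in join_n_to_1(people, companies, "employer_id", "company_id")]
held = [(s["name"], t["title"])
        for s, t in join_m_to_n(people, companies, holdings, "person_id", "company_id")]
```

In the n:1 case the key lives inside the source rows; in the m:n case neither table stores it, so the pairs have to come from a third dataset.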

5. OntSch as a Quiver

If we strip OntSch of composition and identities, keeping only the raw graph of entity types and link types, we get a quiver. The free category on this quiver recovers composition and gives us something extra: a universal property that pins down multi-hop traversal.

Definition 5.1 : The Ontology Quiver

Let Q = (V, E, s, t) where V = Ob(OntSch), E = link types, and s, t assign source and target. The free category Path(Q) has objects V and morphisms = composable paths of link types, with composition by concatenation and the empty path as identity.

Theorem 5.2 : Universal property of Path(Q)

For any category C and graph morphism F : Q → U(C) (where U forgets composition), there exists a unique functor F̃ : Path(Q) → C extending F.

In plain terms: once you decide what each entity type and each link type mean in some target category, all multi-hop traversals are forced. You don't get to choose how two-hop paths behave; the universal property decides for you. Link traversal "just works."

Corollary 5.3

Morphisms in Path(Q) are sequences L_k ∘ … ∘ L₂ ∘ L₁ with matching endpoints, t(L_i) = s(L_{i+1}), where L₁ is traversed first. Identity is the empty path. Composition is concatenation, nothing more.
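A small Python sketch shows both halves: paths with concatenation, and the extension of an edge-by-edge interpretation to whole paths. The target category chosen here is sets with relations, each edge being interpreted as a mapping from an element to a set of elements; that choice, the sample data, and all names are assumptions of the sketch.

```python
# Quiver edges: name -> (source, target). Hypothetical fragment of the ontology.
EDGES = {"orders": ("Person", "Order"), "contains": ("Order", "Product")}

def identity(obj):
    """The empty path at obj."""
    return (obj, ())

def target(path):
    start, edges = path
    return EDGES[edges[-1]][1] if edges else start

def compose(p, q):
    """Traverse p, then q. Defined only when p ends where q starts."""
    if target(p) != q[0]:
        raise ValueError("paths are not composable")
    return (p[0], p[1] + q[1])

# Interpretation of each edge (the graph morphism F): element -> set of elements.
ORDERS = {"ada": {"o1", "o2"}, "bob": {"o3"}}
CONTAINS = {"o1": {"widget"}, "o2": {"widget", "gear"}, "o3": {"gear"}}

def extend(interp):
    """Induced action on paths: apply the edge interpretations in traversal order."""
    def on_path(path, xs):
        current = set(xs)
        for edge in path[1]:
            current = set().union(*(interp[edge].get(x, set()) for x in current))
        return current
    return on_path

traverse = extend({"orders": ORDERS, "contains": CONTAINS})
two_hop = compose(("Person", ("orders",)), ("Order", ("contains",)))
products_for_ada = traverse(two_hop, {"ada"})
```

Nothing about the two-hop traversal was specified separately: once the two edges are interpreted, the behaviour of the composite is determined, and the empty path acts as the identity.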

6. Cardinality as Enrichment

The 1:1, 1:n, n:1, m:n annotations on links are more than database decoration: they constrain the shape of any presheaf inhabiting the schema, and the right language for this is enrichment.

Definition 6.1 : Cardinality constraints

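For L : O_i → O_j with card(L), and Φ an instance of the schema (assigning a set Φ(O) of rows to each entity type O and a relation Φ(L) between Φ(O_i) and Φ(O_j) to each link):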

1:1 : Φ(L) is a bijection between subsets.
n:1 : Φ(L) is a (total or partial) function: each source has at most one target.
1:n : Dual: each target has at most one source.
m:n : A span Φ(O_i) ← R → Φ(O_j), with R a relation.
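The four constraints above can be checked mechanically on a link's data. The following Python sketch represents Φ(L) as a set of (source, target) pairs; the function name and that representation are assumptions of the example.

```python
from collections import Counter

def satisfies(card, pairs):
    """Check a set of (source, target) pairs against a cardinality annotation."""
    pairs = set(pairs)
    per_source = Counter(s for s, _ in pairs)
    per_target = Counter(t for _, t in pairs)
    # Each source is related to at most one target.
    one_target = all(n <= 1 for n in per_source.values())
    # Each target is related to at most one source.
    one_source = all(n <= 1 for n in per_target.values())
    return {
        "1:1": one_target and one_source,
        "n:1": one_target,
        "1:n": one_source,
        "m:n": True,   # an arbitrary relation, no constraint
    }[card]

order_to_customer = {("o1", "c1"), ("o2", "c1")}
ok_as_n_to_1 = satisfies("n:1", order_to_customer)
ok_as_1_to_n = satisfies("1:n", order_to_customer)
```

The same pairs pass as n:1 and fail as 1:n, because two orders share one customer.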

Proposition 6.2 : Enrichment over cardinality

We can view OntSch as enriched over the lattice 1:1 ≤ n:1 ≤ m:n (and 1:1 ≤ 1:n ≤ m:n). The cardinality determines the implementation: n:1 as a foreign key column, m:n as a join table, 1:1 as a subset isomorphism. These are structural consequences of the enrichment, not choices an engineer makes.
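The ordering on cardinalities is a four-element diamond, and one way to compute with it is to encode each annotation as a pair of flags. The encoding below is an assumption of this sketch, chosen so that the componentwise order reproduces 1:1 ≤ n:1 ≤ m:n and 1:1 ≤ 1:n ≤ m:n.

```python
# (many sources per target allowed, many targets per source allowed)
FLAGS = {
    "1:1": (False, False),
    "n:1": (True, False),
    "1:n": (False, True),
    "m:n": (True, True),
}
NAMES = {flags: name for name, flags in FLAGS.items()}

def leq(a, b):
    """a <= b when b permits everything a permits."""
    return all(x <= y for x, y in zip(FLAGS[a], FLAGS[b]))

def join(a, b):
    """Least cardinality above both a and b (componentwise or)."""
    return NAMES[tuple(x or y for x, y in zip(FLAGS[a], FLAGS[b]))]

incomparable = not leq("n:1", "1:n") and not leq("1:n", "n:1")
top = join("n:1", "1:n")
```

The two middle elements are incomparable, and their join is the top of the diamond.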

Example 6.3

An n:1 link Order → Customer stores a customer_id column in the Order table. The reverse 1:n is computed (the set of orders per customer). An m:n link Student ↔ Course requires a join table with student_id and course_id. None of this is surprising, but notice that the category theory tells you which implementation is forced.
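The stored and computed directions of the Order and Customer link can be sketched as follows. The rows and the helper name are hypothetical; only the customer_id column comes from the example.

```python
from collections import defaultdict

# The n:1 direction is stored: every order row carries a customer_id.
orders = [
    {"order_id": "o1", "customer_id": "c1"},
    {"order_id": "o2", "customer_id": "c1"},
    {"order_id": "o3", "customer_id": "c2"},
]

def reverse_link(rows, pk, fk):
    """Compute the 1:n direction by grouping primary keys under each foreign key."""
    grouped = defaultdict(set)
    for row in rows:
        grouped[row[fk]].add(row[pk])
    return dict(grouped)

orders_by_customer = reverse_link(orders, "order_id", "customer_id")
```

Nothing is stored on the customer side; the set of orders per customer is derived from the foreign key column on demand.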

7. The Schema as a Sketch

Ehresmann's notion of a sketch gives one last way to think about the schema: a sketch is a category with distinguished cones and cocones, and a model is a functor preserving them. The punchline is that models of OntSch-as-sketch are exactly the valid database states.

Definition 7.1 : OntSch as a sketch

The sketch T = (OntSch, L, C) consists of:

The underlying category OntSch.
L: distinguished limit cones, encoding uniqueness from 1:1 and n:1 links (pullbacks ensuring each foreign key points to exactly one target).
C: distinguished colimit cocones, encoding existence of join tables for m:n links.

A model of this sketch is a functor M : OntSch → Set that preserves the distinguished cones and cocones. The category of models is exactly the full subcategory of presheaves satisfying all referential integrity and cardinality constraints.

Theorem 7.2 : Models = valid presheaves

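Mod(T) is equivalent to the full subcategory of [OntSch, Set], the category in which models M : OntSch → Set live (these are the presheaves of this framework, taken covariantly so that a link acts from source to target), consisting of those functors that send distinguished limit cones to limits and distinguished colimit cocones to colimits. Referential integrity and cardinality constraints are not ad hoc checks: they are the limit/colimit preservation conditions.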
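In concrete terms, testing whether a candidate state is a model amounts to checking the constraints the cones are said to encode. The Python sketch below does not compute limits or colimits; it checks referential integrity and cardinality directly, under an assumed representation (tables as sets of primary keys, links as sets of key pairs) with hypothetical names.

```python
def violations(tables, links):
    """List referential-integrity and cardinality violations of a candidate state.

    tables: entity type name -> set of primary keys present
    links:  iterable of (name, source, target, card, pairs)
    """
    problems = []
    for name, source, target, card, pairs in links:
        pairs = sorted(set(pairs))
        # Referential integrity: both ends of every pair must exist.
        for s, t in pairs:
            if s not in tables[source] or t not in tables[target]:
                problems.append((name, "dangling", (s, t)))
        sources = [s for s, _ in pairs]
        targets = [t for _, t in pairs]
        # Cardinality: repeated sources or targets where only one is allowed.
        too_many_targets = card in ("1:1", "n:1") and len(set(sources)) < len(sources)
        too_many_sources = card in ("1:1", "1:n") and len(set(targets)) < len(targets)
        if too_many_targets or too_many_sources:
            problems.append((name, "cardinality", card))
    return problems

tables = {"Order": {"o1", "o2"}, "Customer": {"c1"}}
good = [("placed_by", "Order", "Customer", "n:1", {("o1", "c1"), ("o2", "c1")})]
bad = [("placed_by", "Order", "Customer", "n:1", {("o1", "c1"), ("o1", "c9")})]

good_report = violations(tables, good)
bad_report = violations(tables, bad)
```

A state is accepted exactly when the report is empty; the second state fails twice, once for a key that points nowhere and once for an order with two customers.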

Remark 7.3

This answers a question that sounds simple but is surprisingly hard to make precise: "What is a valid database state?" Answer: a model of the sketch. The sketch formulation is strictly more expressive than ER diagrams, since it can encode arbitrary limit/colimit conditions, not just keys and cardinalities.

Summary

The schema OntSch is a small category carrying: a property functor S on objects, cardinality annotations on morphisms, and foreign key specifications as spans. Its free path category generates multi-hop traversals by universal property. As a sketch, it determines exactly which presheaves are valid database states. There is no data here, only structure, and from this structure the entire relational backbone follows.

Next: Component 2: The Presheaf, pouring data into this skeleton.