Skip to content

Implementation is a relation C implements A — concrete structure C implements abstract specification A — asserting that there exists a retrieve function R: States(C) → States(A) such that R maps every valid concrete state to a valid abstract state and every concrete operation of C is a refinement of the corresponding abstract operation of A under R. The defining structure: implementation crosses the abstraction boundary from specification to realization; it is not just refinement (which relates two specifications at the same level) and not just satisfaction (which is a logical truth relation). Implementation provides a structural witness — the retrieve function — that shows how the concrete representation realizes the abstract one. The retrieve function is the mathematical certificate of implementation.
Table of contents

Implementing

Formal definition

Implementation is a relation C implements AC \text{ implements } A between a concrete structure CC and an abstract specification AA:

C implements Aiff  R:States(C)States(A) (retrieve function) s.t. Conds 1–3C \text{ implements } A \quad\text{iff}\quad \exists\; R : \mathrm{States}(C) \to \mathrm{States}(A) \text{ (retrieve function) s.t. Conds 1–3}

where:

  • AA is the abstract specification — a description of a data type or system in terms of abstract values, abstract state, and abstract operations; AA specifies what the system does, not how it does it; AA’s state space may be mathematically idealized (e.g., abstract stacks as mathematical sequences) and its operations may be specified axiomatically or as abstract functions
  • CC is the concrete realization — a data structure, program, or system that provides actual computational representations of AA’s abstract values and actual implementations of AA’s operations; CC’s state space uses concrete representations (arrays, linked lists, hash tables) and its operations are executable procedures
  • R:States(C)States(A)R : \mathrm{States}(C) \to \mathrm{States}(A) is the retrieve function (or abstraction function) — a function mapping each concrete state cStates(C)c \in \mathrm{States}(C) to the abstract state R(c)States(A)R(c) \in \mathrm{States}(A) that cc represents; RR is the mathematical certificate of implementation: it shows exactly how the concrete representation relates to the abstract specification

Three conditions for RR to witness a valid implementation:

  1. Representation invariant coverage: every state cStates(C)c \in \mathrm{States}(C) that satisfies the concrete representation invariant invC\mathrm{inv}_C has R(c)States(A)R(c) \in \mathrm{States}(A) satisfying the abstract invariant invA\mathrm{inv}_A. The concrete representation must encode only valid abstract states — no concrete state outside the representation invariant’s scope is required.

  2. Operation commutativity: for each abstract operation fA:States(A)States(A)f_A : \mathrm{States}(A) \to \mathrm{States}(A) and its corresponding concrete operation fC:States(C)States(C)f_C : \mathrm{States}(C) \to \mathrm{States}(C):

R(fC(c))=fA(R(c))R(f_C(c)) = f_A(R(c))

Executing the concrete operation and then abstracting (via RR) gives the same result as abstracting first and then executing the abstract operation. The concrete operation faithfully implements the abstract one through RR.

  1. Initialization: the concrete initial state c0c_0 corresponds to the abstract initial state: R(c0)=a0R(c_0) = a_0 where a0a_0 is the abstract specification’s initial state. The implementation starts in the right abstract state.

The retrieve function is not required to be injective: multiple concrete states may represent the same abstract state (e.g., two different memory layouts representing the same list). This is the essential flexibility that makes implementation more than isomorphism — the concrete representation may be more redundant, more detailed, or differently organized than the abstract one.

Hoare: proof of correctness of data representations

C.A.R. Hoare (Proof of Correctness of Data Representations, 1972, Acta Informatica) gave the first formal treatment of data abstraction and implementation:

Hoare’s paper establishes the following framework for proving that a concrete data representation correctly implements an abstract specification:

An abstract type is defined by:

  • An abstract state space AA (e.g., mathematical sequences for a stack)
  • Abstract operations with pre/postconditions in terms of AA (e.g., push(a,s)=a::s\mathrm{push}(a, s) = a :: s; pop(a::s)=(a,s)\mathrm{pop}(a :: s) = (a, s))
  • An abstract invariant IAAI_A \subseteq A (conditions that always hold)

A concrete representation is defined by:

  • A concrete state space CC (e.g., arrays with a top pointer for a stack)
  • Concrete operations (executable procedures)
  • A representation invariant ICCI_C \subseteq C (conditions the array representation always satisfies)
  • A retrieve function R:ICIAR : I_C \to I_A mapping valid concrete states to abstract states

Proof obligations for the implementation:

  1. ICI_C is established by the concrete initialization: the initial concrete state satisfies ICI_C
  2. ICI_C is maintained by each concrete operation: if cICc \in I_C before, then fC(c)ICf_C(c) \in I_C after
  3. Each concrete operation fCf_C correctly implements fAf_A via RR: the Hoare triple {IC(c)R(c)=a}  fC  {R(c)=fA(a)}\{I_C(c) \wedge R(c) = a\}\; f_C \;\{R(c') = f_A(a)\} holds

Hoare’s framework established that correctness of data representations is provable by finding a retrieve function — a mathematical object that witnesses the correspondence between abstract and concrete. This was the foundation of all subsequent work on data abstraction verification.

Liskov and Guttag: abstract data types and abstraction functions

Barbara Liskov and John Guttag (Abstraction and Specification in Program Development, 1986) developed abstract data types (ADTs) as the programming language realization of Hoare’s abstract/concrete distinction:

An ADT specification consists of:

  • A type name TT
  • An overview describing the abstract value the type represents
  • Operations with pre/postconditions specified in terms of the abstract value
  • An abstraction function AF:repTAF : \mathrm{rep} \to T mapping concrete representations to abstract values

The representation invariant RI:repBool\mathrm{RI} : \mathrm{rep} \to \mathrm{Bool} specifies which concrete states are valid representations of abstract values. Only states satisfying RI\mathrm{RI} are subject to the abstraction function.

The abstraction function is partial: it is defined only on states satisfying RI\mathrm{RI}. Outside RI\mathrm{RI}, the concrete state has no abstract meaning — it is an invalid intermediate state. Operations must maintain RI\mathrm{RI} (implementation invariant) and must commute with AFAF (implementation correctness).

Liskov and Guttag introduce benevolent side effects — operations that appear to modify the concrete representation but leave AF(rep)AF(\mathrm{rep}) unchanged (e.g., a caching operation). Benevolent side effects are permitted: they change the concrete state while preserving the abstract value. This formalizes the implementation principle that multiple concrete states may represent the same abstract value.

Reynolds and abstraction in type systems

John Reynolds (Types, Abstraction and Parametric Polymorphism, 1983) showed that the retrieve function / abstraction function has a deep connection to the semantics of polymorphic types:

A representation independence theorem: if a module MM with abstract type TT uses representation CC internally and exports TT via operations with the abstraction function AF:CTAF : C \to T, then any two implementations C1C_1 and C2C_2 with abstraction functions AF1AF_1 and AF2AF_2 that satisfy the same abstract specification are related by a logical relation — the relation induced by the abstraction function. No client code can distinguish C1C_1 from C2C_2, because the type system ensures all client accesses go through the abstract operations.

This is the abstraction theorem for modules: information hiding (encapsulation of the concrete representation) is enforced by the type system, and the retrieve function is the mathematical content of what is being hidden. The module’s abstract interface exposes only TT; the concrete representation CC is inaccessible to clients.

Reynolds’ paper connects data abstraction to relational parametricity: the abstraction function AFAF defines a binary relation on values of type TT, and well-typed programs respect this relation — they cannot distinguish elements of TT that are AFAF-related.

Open questions

  • Whether the retrieve function R:States(C)States(A)R : \mathrm{States}(C) \to \mathrm{States}(A) is required to be a surjection (every abstract state has a concrete representation) or only a partial surjection (some abstract states may be unrepresentable), and whether this choice affects the theory of data refinement for non-terminating systems.
  • Whether implementation in the sense of the retrieve function corresponds to a morphism of sheaves in Sh(T,J)\mathbf{Sh}(T, J) — whether the retrieve function RR is a natural transformation between the presheaves representing CC and AA, making implementation a morphism in the category of sheaves.
  • Whether Liskov’s behavioral subtyping (the Liskov Substitution Principle) and Hoare’s data refinement are the same concept at different levels — whether behavioral subtyping is implementation relative to the behavioral specification, and whether the retrieve function in data refinement corresponds to the simulation relation in behavioral subtyping.
  • Whether benevolent side effects (changing the concrete representation while preserving the abstract value) are the implementation-theoretic analogue of the saturation nucleus’s closure — whether applying the nucleus to a concrete state produces a canonically normal form with the same abstract value, making nucleus application the “canonical retrieve” operation.

Relations

Abstract specification
Relational universe
Ast
Concrete realization
Relational universe
Date created
Date modified
Defines
Implementing
Output
Relational universe morphism
Related
Refines, satisfies, extends, data abstraction, abstract data type, retrieve function