Jonathan. Frech’s WebBlog

The humble sum should neither necessitate indirection nor be unbounded across package bound­aries (#250)

Jonathan Frech

Routinely, I am irked by the effort required in most languages to express al­ge­bra­ic data types — the epitome of information itself. An inert datum is too often occluded by hardware-inspired lingo forcing one to adapt to rigid register-fitted types. Leastways the C lineage supports bytestrings together with their concatenation which, under the previous constraints, are equatable to product types. Though whilst control flow neatly factors through such a type, the categorical dual consistently requires an iffy cast or virtual semantics alltogether: the humble sum is quietly ignored in favor of its dual in whose hulking shadow it sometimes desintegrates out of existence. A far cry from Haskell’s ef­fort­less type alternative “|”, C at least supports byte-based type superpositioning. Yet although Go inherits “struct”, the fameless yet in no way inferior “union” is — to me unreasonably — absent.

Consequently, when I im­ple­ment­ed miniKanren in Go back in March of 2021⁠¹, I was flabber­gasted to find a gaping lan­guage definition hole where I had expected a sum type. Toying with the idea to one-hot encode S-expressions via

$$\sum_i\alpha_i\hookrightarrow\prod_i\alpha_i,\quad x_k\mapsto(0_1,\dots,0_{k-1},x_k,0_{k+1},\dots,0_n),$$$$\sum_i\alpha_i\hookrightarrow\prod_i\alpha_i,\quad x_k\mapsto(0_1,\dots,0_{k-1},x_k,0_{k+1},\dots,0_n),$$

I quickly dismissed it in hopes of a more elegant idiomatic Go approach. What I found was an in­ter­pre­ta­tion of a sum type which conflates object action, dynamic dispatch, undefined control flow and unbounded sums $\sum_{i\in\Omega}\alpha_i$$\sum_{i\in\Omega}\alpha_i$ into a briddle mess of Go lines which can be viewed as indeed im­ple­ment­ing a type coproduct. Needless to say, I am dissatisfied with Go’s negligence in this regard.

-=-

Whereas Haskell’s concise syntax for al­ge­bra­ic data types allows for an S-Expression type to be cleanly defined in two lines, Go’s convoluted interface repurposing requires five lines when shrunk:

-- Haskell
data Term = Symbol String
          | Pair Term Term
// Go
type Term interface { isTerm() }

type Symbol struct { Data string }
func (_ Symbol) isTerm() {}

type Pair struct { Car, Cdr Term }
func (_ Pair) isTerm() {}

Go’s support for discriminating the above is likewise a lengthy endeavour:

-- Haskell
flatten :: Term -> [String]
flatten (Symbol sym) = pure sym
flatten (Pair t s)   = flatten t ++ flatten s
// Go
func flatten(t Term) []string {
    switch x := t.(type) {
        case Symbol:
            return []string{x.Data}
        case Pair:
            return append(flatten(x.Car), flatten(x.Cdr)...)
        default:
            panic("unexpected summand")
    }
}

Ignoring the raw line or byte count, Go seems to actively revolt when working a tad more abstractly than with neatly byte-packed data; writ­ing these fundamental type manipulations by hand is not a pleasant ex­pe­ri­ence. Auspiciously, the above is implementable — if at a high cost: Go’s interfaces were built to allow packages func­tion­al­i­ty exports. Hence appending to or redefining a struct’s receiver methods across package bound­aries is strictly forbidden, yet im­ple­ment­ing an interface actively encouraged. And therein lies the problem: any struct from any source is allowed to im­ple­ment isTerm() as a receiver method and will be seen as supporting the func­tion­al­i­ty required by Term — when viewed as an interface. Coming back to the sum type view, this means func flatten(Term) []string can be made to panic by nothing more than injecting another summand, from any package:

// Go
type ExtraPackatorial struct {}
func (_ ExtraPackatorial) isTerm() {}
func panics() { flatten(&ExtraPackatorial{}) }

All this leads me to propose an extension to the Go lan­guage: interface admission clauses. In action they may look like the following:

// wishful Go
type Term interface admits (Symbol; Pair) { isTerm() }

Taylored to allow bound­ed sum type im­ple­men­ta­tion, admits restricts im­ple­men­ta­tion of an interface such that on­ly the types admitted by said interface may do so. Consequently, ExtraPackatorial’s attempt to fake itself into the sum type would be foiled by the compiler which respects Term’s request for an exclusive implementability.

Even though admits should be used rarely to not hinder package’s effectiveness and further support duck typing, it would close the gaping hole described in this post. Clearly, not every interface is meant to be freely extended and I think specifying structs by name which are granted permission to im­ple­ment an interface would be a valuable feature, adding currently lacking control to define a universally useful type — the sum type.


[1]Jonathan Frech: Going for a miniKanren im­ple­men­ta­tion.. 2021-03-31, self-published.