Metaphors We Learn to Program By

2021-11-29

I’m working through Metaphors We Live By, an exploration of the role of metaphor in human cognition and language. The central theme of the book is:

[..] human thought processes are largely metaphorical. [..] Metaphors as linguistic expressions are possible precisely because there are metaphors in a person’s conceptual system.

They provide examples of conceptual metaphors which ground our experience– ARGUMENT IS WAR, HAPPY IS UP, THEORIES ARE BUILDINGS– and examples from everyday language to support the theory. They note:

Most of our fundamental concepts are organized in terms of one or more spatialization metaphors.

And in particular, CONTAINERS and PERSONIFICATION fundamentally ground our experience. Containers because:

We are physical beings, bounded and set off from the rest of the world by the surface of our skins, and we experience the rest of the world as outside us. Each of us is a container, with a bounding surface and an in-out orientation. We project our own in-out orientation onto other physical objects that are bounded by surfaces. Thus we also view them as containers with an inside and an outside. [..] But even where there is no natural physical boundary that can be viewed as defining a container, we impose boundaries—marking off territory so that it has an inside and a bounding surface—whether a wall, a fence, or an abstract line or plane. There are few human instincts more basic than territoriality.

This primacy of CONTAINERS and PERSONIFICATION reminded me of my earlier post’s “little man” analogy– for me, telling a little man how to run around your computer moving variously shaped objects in & out of containers captures a curiously large amount of what this activity we call ‘programming’ is about.

The basic physical metaphors 📦🏃‍♂️

Most common data structures obviously fit into the CONTAINER metaphor because they’re, well, containers. I find it useful to imagine them spatially:

A map or dictionary is an old school wall of post-office boxes with labels on the drawers. The labels on the boxes are the keys, and whatever is contained within the box is the value.¹
A linked list is a row of boxes on a conveyer belt which can only go forward. (Doubly linked lists let me convey the boxes forward & backwards.)
A binary tree is a box with two drawers, which opens into a box with two drawers, which opens into…
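
A minimal Python sketch of these three pictures may help make the spatial sense concrete. The names (`post_office`, `Box`, `Cabinet`, `ride_belt`, `open_all`) are hypothetical and chosen only to mirror the metaphors; this is an illustration under those assumptions, not a recommended implementation.

```python
from dataclasses import dataclass
from typing import List, Optional

# Wall of post-office boxes: the label on the drawer is the key,
# whatever sits inside the box is the value.
post_office = {"box 12": "letter", "box 40": "parcel"}


@dataclass
class Box:
    """One box on the conveyor belt; `next` is the box that follows it."""
    contents: str
    next: Optional["Box"] = None


def ride_belt(first: Optional[Box]) -> List[str]:
    """Walk along the belt from the first box, collecting what each one holds."""
    seen = []
    box = first
    while box is not None:
        seen.append(box.contents)
        box = box.next  # a singly linked list only moves forward
    return seen


@dataclass
class Cabinet:
    """A box with two drawers, each of which may open onto another cabinet."""
    label: str
    left: Optional["Cabinet"] = None
    right: Optional["Cabinet"] = None


def open_all(cabinet: Optional[Cabinet]) -> List[str]:
    """Open the left drawer, read this cabinet's label, then open the right drawer."""
    if cabinet is None:
        return []
    return open_all(cabinet.left) + [cabinet.label] + open_all(cabinet.right)


belt = Box("a", Box("b", Box("c")))
tree = Cabinet("m", Cabinet("d"), Cabinet("t"))
```

Here `ride_belt(belt)` visits the boxes in the order they sit on the belt, and `open_all(tree)` opens the drawers left to right.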

Of course I’m not saying I visualize a conveyer belt when walking a linked list. What I want to emphasize is the sense in which reading code can translate into a physical, kinetic feeling. I posit that the degree to which I can effectively use a data structure is related to how strong this kinetic sense of it is.

Another metaphor I typically employ to help with that kinetic sense of things is one in which there is a “person moving about space”. For example:

Database cursors represent a person gathering a stack of records.
The instruction pointer represents a person reading the code line-by-line & interpreting it.
A loop counter or iterable represents a person holding a place in a rolodex, list, or other structure, perhaps by pointing with their index finger.
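
The "person holding a place" can be sketched in Python as follows. The `rolodex` data and the `cards` table are made up for illustration, and the in-memory SQLite database simply stands in for whatever database a real cursor would come from.

```python
import sqlite3

rolodex = ["Ada", "Grace", "Linus"]

# A loop counter is the index finger: it marks exactly one card at a time.
finger = 0
visited = []
while finger < len(rolodex):
    visited.append(rolodex[finger])
    finger += 1  # move the finger to the next card

# An iterator is the same person, but the finger is kept out of sight.
reader = iter(rolodex)
first_card = next(reader)
second_card = next(reader)

# A database cursor gathers records in stacks of whatever size we ask for,
# remembering where it stopped between requests.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cards (name TEXT)")
conn.executemany("INSERT INTO cards VALUES (?)", [(name,) for name in rolodex])
cursor = conn.execute("SELECT name FROM cards ORDER BY name")
stack = cursor.fetchmany(2)  # the first two records
rest = cursor.fetchall()     # everything after the cursor's current place
conn.close()
```

In all three cases the state being tracked is just "where am I?", which is the part that feels natural to personify when stepping through code in a debugger.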

All of these are people systematically scanning over some contained information, keeping track of their place, and perhaps manipulating whatever has been put within their grasp. There’s no obvious reason to conceive of these as a person– it’s just an abstract arrow moving thru space and/or time– but it feels natural to personify it, especially when debugging, so I can step inside their shoes and tell the little person where s/he made a wrong turn.

… think about how much of your daily coding can be encompassed by those two metaphors!

Non-physical metaphors 📝

There are programming constructs which, to me, don’t have obvious physical analogs– interfaces, types, protocols, recursion– but I want to highlight language syntax. Unlike the others, syntax must be learned before most programming environments can be used at all, yet it can also be a significant stumbling block. Scratch and other block-based programming environments are explicitly designed to overcome this difficulty with syntax– and they succeed by replacing it with a CONTAINER metaphor! Replacing formal language– acquired only with great effort well after our physical senses– in this way seems a stroke of pedagogical genius.

Metaphors of delegation 🗣

Beyond the scale of toy programs, it becomes necessary to employ abstractions that reduce the immediately apparent complexity of the problem at hand. Something interesting happens as one proceeds thru the different ways of doing this, from imperative to object-oriented, functional, & relational:

A procedure is a command to a wily subordinate: “go do this”. Despite their subordinate status, you don’t get a lot of control over what else they do while they’re gone.
We send messages (commands or requests)² to an object, which may contain multitudes. Even tho we have no sense of how many multitudes are contained, “best practices” assure us the object will only transform that which it contains.
A pure function is an exchange with an equal. I give the function some information, it gives me back other information which relates to what I gave it in a predictable way. They never leave the room, so there’s no possibility of a sorcerer’s apprentice situation like the others.
A database query is a request – perhaps even a supplication?– to an all-knowing oracle. They will respond with what they can at a time of their choosing. I’m at their mercy.
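
To make the four relationships concrete, here is one small task (put some numbers in order) delegated in each style. This is a sketch under stated assumptions: all names (`tidy_up`, `Shelf`, `tidied`, `ask_oracle`) are hypothetical, and an in-memory SQLite table stands in for the oracle.

```python
import sqlite3


def tidy_up(items):
    """Procedure: "go do this". It rearranges the caller's own list in place."""
    items.sort()


class Shelf:
    """Object: we send it messages, and it only transforms what it contains."""

    def __init__(self, items):
        self._items = list(items)  # private copy; the caller's list is left alone

    def tidy(self):
        self._items.sort()

    def contents(self):
        return tuple(self._items)


def tidied(items):
    """Pure function: information in, related information out, nothing else touched."""
    return sorted(items)


def ask_oracle(items):
    """Query: we state what we want and the database decides how to produce it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ledger (n INTEGER)")
    conn.executemany("INSERT INTO ledger VALUES (?)", [(n,) for n in items])
    rows = conn.execute("SELECT n FROM ledger ORDER BY n").fetchall()
    conn.close()
    return [n for (n,) in rows]


original = [3, 1, 2]
pure_result = tidied(original)        # original is still [3, 1, 2]
shelf = Shelf(original)
shelf.tidy()                          # only the shelf's own copy changes
oracle_result = ask_oracle(original)  # we never say how to sort
untouched = list(original)            # snapshot before the procedure runs
tidy_up(original)                     # the procedure changes our list under us
```

Only the procedure reaches back into the caller's data; the query is the only one where the ordering strategy is entirely out of the caller's hands.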

Notice how the balance of power shifts from the programmer to the abstraction? We also seem to move away from a tangible physical paradigm here– the delegatee’s person-hood transforms. In procedures there is a “little person” who will finish the work & return to us. If we have to fix something we can always follow the little person thru their steps. By the time we get to relational the problem becomes one of forming unambiguous questions to a literal genie. Legends warn of the perils of this situation!

Nevertheless, fighting the instinct to control is necessary, or you could end up with an ad-hoc, unqueryable, bug-ridden³ ~~database~~ stack of containers. Unless your application’s scale or complexity requires turning the database inside out, it should probably be just a thin shell around a battle-hardened embedded database. “You are either building a database or a compiler”⁴, and you should be very certain you can’t re-use parts of ones that already work.
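
As a rough illustration of the "thin shell" idea, the sketch below wraps Python's built-in `sqlite3` module. The `Inventory` class, its `items` table, and its method names are all hypothetical; the point is only that storage, uniqueness, and querying are delegated to the embedded database rather than rebuilt from dictionaries and lists.

```python
import sqlite3


class Inventory:
    """Hypothetical application object: a thin shell over an embedded SQLite database."""

    def __init__(self, path=":memory:"):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS items ("
            "name TEXT PRIMARY KEY, qty INTEGER NOT NULL)"
        )

    def put(self, name, qty):
        # The primary key plus INSERT OR REPLACE gives us "one row per name"
        # without any hand-written bookkeeping.
        with self._db:  # commits on success, rolls back on error
            self._db.execute(
                "INSERT OR REPLACE INTO items VALUES (?, ?)", (name, qty)
            )

    def low_stock(self, threshold):
        # Filtering and ordering are questions for the database, not loops of our own.
        rows = self._db.execute(
            "SELECT name FROM items WHERE qty < ? ORDER BY name", (threshold,)
        )
        return [name for (name,) in rows]


inventory = Inventory()
inventory.put("bolts", 2)
inventory.put("nuts", 50)
inventory.put("washers", 1)
running_low = inventory.low_stock(5)
```

New questions about the data become new queries rather than new container-walking code.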

Questions of pedagogy 🎓

In an earlier post I claimed: “We employ the industrial [metaphors] too often– builder, adaptor, factory. We need more pre/post-industrial ones– they’re more powerful.” Despite having spent most of my time developing in the object-oriented and imperative paradigms, Out of the Tar Pit, Can programming be liberated… and others have convinced me that, as an industry, we ought to be employing the functional & relational paradigms to a greater degree than we currently do. This belief sits uncomfortably with my current inability to see how they’re rooted in the CONTAINER or PERSONIFICATION paradigms. I would love to hear your opinions on this– in closing, I’m led to two major questions, which seem to be in tension with one another:

Could we do more to make programming an accessible activity to the 99% of people who don’t program, by putting greater emphasis on intuitive physical analogs?
- Were physical metaphors like the ones described useful to you when learning to program? Are they still useful to you?
- How do your physical conceptions of common programming constructs differ from what I’ve described?
Are the embodied, imperative metaphors we’ve inherited the best ones to be teaching?
- Can the functional and relational paradigms be well-represented in physical terms, giving the reader a kinetic sense of the code? Or do they imply an internalized-but-forgotten “little person”?
- Could a richer development of PERSONIFICATION be part of this?

Would love to hear from you on Twitter.

¹ Obviously “maps” and “dictionaries” are also real physical things which analogize well to this data structure, but what they contain is significantly more abstract than a postbox. In particular, it bothers me that the contained value exists in a lower spatial dimension.
² There are a lot of ways in which objects straddle paradigms here. Alan Kay says he was thinking of “cells” and a biological paradigm, which has a kind of PERSONIFICATION to it. Yet of course cells and the more static “objects” are also CONTAINERS. Then there is the question of tense– with late-binding, the locus of control is inside the object so the tense is somewhat tentative. Without that, a method call differs little from an order, in which the locus of control stays outside the object.
³ Nod to Greenspun’s tenth rule.
⁴ I love this maxim but I’ve lost the citation for it! Please tell me if you know.