Hofstadter's MIU in Thue

[2025-08-11 Mon]

← Back to root

This post is about programming. In particular, I use a laborious, esoteric language (esolang) to generate random strings of a symbolic system given in Douglas Hofstadter's excellent book Gödel, Escher, Bach.

Introduction to the MIU system

My friends and I are reading Gödel, Escher, Bach. In chapter 1, "The MU-puzzle", Hofstadter presents a formal system of symbols; effectively, a set of all valid strings created by the rules below. x and y are placeholders for strings.

  1. MI is a valid string.
  2. If xI is valid, xIU is valid.
  3. If Mx is valid, Mxx is valid.
  4. If xIIIy is valid, xUy is valid.
  5. If xUUy is valid, xy is valid.

The following are all valid strings:

Hofstadter invites the reader to ponder whether MU is valid. This post is not about that question.

Introduction to Thue

Thue is an esoteric programming language named after logician Axel Thue.

Links:

For now, we'll ignore Thue's ability to do input and output. That omission leaves us with a programming system that simply takes in a string and outputs a string.

Thue is nondeterministic; that is, it acts randomly. In an instance of goofing off, I wondered whether I could write a program in Thue that generates a random valid string of the MIU system. (Thue is Turing complete, so it's necessarily possible to write such a program. I wondered whether I could do it.)

Let's look at some simple programs.

a::=Hello!
::=
a

The rule

a::=Hello!

specifies that any occurrence of "a" in the current string can be replaced by "Hello!".

The line containing ::= alone is not a rule and ends the list of rules.

Text that comes after the end of the rules is the initial string. Thue takes that string and applies a random applicable substitution. When no more substitutions apply, computation ends and the current string is returned.

Here's another program:

a::=Hello!
a::=Goodbye?
::=
a

This program will either return "Hello!" or "Goodbye?". It can't do both, since once a substitution is made, there are no a's left in the current string.

Another one:

?::=intergalactic
?::=planetary
::=
? ?

This will return one of four possible values:

(Keep it goin, keep it goin, keep it goin, full steam . . .)

Another dimension; another dimension; another dimension; another dimension; another dimension! Another dimension? Another dimension! Another dimension. Another dimension; another dimension; another dimension; another dimension.

First forays

Rules (1), (3), (4) translate pretty easily into Thue since they involve replacements of predetermined, finite strings.

So I wanted to build up to rule (2) first. Doubling an arbitrary string sounded hard to me, so I started by doubling a string I know. Our goal will be to turn MIII, which looks like a string of MIU, into MIIIIII. (That's six I's. We're doubling the part after M.)

_M::=@M+
+II::=.I+I
+I_::=.I@
.I::=II
::=
_MIII_

Note that _MIII_ has its beginning and end marked with underscores. This is common in Thue, since substitution rules don't allow matching against the beginning or end of the string.

This program runs like this:

Remember: Thue runs nondeterministically, so those aren't fixed phases. It's up to us, the programmers, to restrict the possible actions using the form of the string.

There are definitely simpler ways to do this, but this works! Nice.

Let's try it with two symbols:

_M::=@M+
+II::=.I+I
+IU::=.I+U
+UI::=.U+I
+UU::=.U+U
+I_::=.I@
+U_::=.U@
.I::=II
.U::=UU
::=
_MIUIIUU_

That's a lot of rules. Again, there are certainly more parsimonious ways to do this, but it works.

This is far from our eventual goal: Mx → Mxx. When we feed this program MIUIIUU, the actual rule (2) of MIU would yield MIUIIUUIUIIUU. Instead, this program yields MIIUUIIIIUUUU. We're doubling letters in place rather than appending a new copy of the string.

I decided to handle this problem by duplicating letters in place, since we know how to do that, then moving them to the end of the string. We'll have to get the order right. Let's only worry about movement for now, not copying:

>I_::=_I_
>II::=I>I
>IU::=U>I
>U_::=_U_
>UI::=I>U
>UU::=U>U
::=
_M>I>UUI_

This program runs like this:

This program yields _MUI_I_U_. Some unsightly underscores are left over, but the order of characters is correct.

Now copy letters at the same time:

II_::=I%I#I_
IU_::=I%U#U_
UI_::=U%I#I_
UU_::=U%U#U_
II%I::=%I>I%I
II%U::=%I>I%U
IU%I::=%U>U%I
IU%U::=%U>U%U
UI%I::=%I>I%I
UI%U::=%I>I%U
UU%I::=%U>U%I
UU%U::=%U>U%U
_I%I::=%I>I%I
_I%U::=%I>I%U
_U%I::=%U>U%I
_U%U::=%U>U%U
>I_::=#I_
>I%I::=%I>I
>I#I::=#I#I
>I%U::=%U>I
>I#U::=#I#U
>U_::=#U_
>U%I::=%I>U
>U#I::=#U#I
>U%U::=%U>U
>U#U::=#U#U
::=
_IUUU_
_UIII_

Ouch. The number of rules exploded. But it's not so bad, really! Let's imbue those arcane glyphs with meaning.

The rules are still a lot to take in, but know: this program works kind of like the previous two combined. Action moves from the end of the string to the beginning. Along the way, characters are copied; one is marked ready to move toward the end of the string, and the other is marked as copied. Then all the characters move and take stalwart seats at the end of the string.

And it works! This program outputs these two lines, which developed independently (since no substitution works across a line break):

%I%U%U#I#U#U_
%U%I%I#U#I#I_

Ignoring the % and # symbols, this is exactly what we want. With that, it's time.

Generating random, valid strings of MIU in Thue

_::=

I_::=IU_

I_::=I2#<2
U_::=U2#<2
I2::=2%I>I
U2::=2%U>U

M2::=M>2
>2%I::=I>2
>2%U::=U>2
>2#I::=I>2
>2#U::=U>2
>2#<2::=_

>I%I::=%I>I
>I%U::=%U>I
>U%I::=%I>U
>U%U::=%U>U
>I#::=#I#
>U#::=#U#

I_::=(I*
U_::=(U*
(III::=(U
(UU::=(
I(::=(I
U(::=(U
M(::=M)
)I::=I)
)U::=U)
)*::=_

::=

MI_

There are 28 substitution rules in this program. I find it rather pretty. I will give only a light explanation of its functioning.

The initial string is MI_. An underscore marks its end.

Because the underscore is restored by each rule, any conceivable sequence of rules can run (with nebulous probabilities). It's possible, between rules, for the final underscore to disappear (the rule _::= reads "underscore becomes nothing"), in which case nothing can be done and the computation ends. The string we get results from applying some sequence of MIU rules to MI. Therefore, the result is a random (according to some unknown distribution) valid string of MIU.

Thank you (and book club!) for coming with me on this adventure. Here are 15 unique strings of MIU:

MIIIIIIIIU
MIIII
MUIUIU
MIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIU
MIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIUIU
MIIIIU
MIUIUIUIUIUIUIUIU
MIIUIIUIIUIIUIIUIIUIIUIIUIIUIIUIIUIIUIIUIIUIIUIIU
MIUIU
MIUIUIUIU
MI
MIU
MIIU
MIIUIIU
MII

Code

I ran these programs using a Thue interpreter I wrote in Common Lisp. It's available on Codeberg. Using it will be a bit annoying because it depends on other Common Lisp packages of mine that aren't on Quicklisp.

The programs are available on Codeberg.