Fun with Symbolic Computation: Solving Einstein’s Zebra Riddle
Skip the algorithms: just define the problem and let Solvers do the rest!
Algorithms
"Algorithm": the word originates from the 9th-century writings of Muhammad ibn Musa al-Khwarizmi, a Persian mathematician at the Baghdad House of Wisdom. The original Arabic text is lost, but a 12th-century Latin translation exists: "Algoritmi de numero Indorum" (Al-Khwarizmi on the Hindu Art of Reckoning). The mathematics within it was so influential that his latinized name "Algoritmi" and all its variants entered the lexicon.
Al-Khwarizmi described systematic, step-by-step procedural and numerical methods for solving mathematical problems. These methods were foundational for centuries, and in the 17th century a new field of symbolic computation emerged.
Symbolic Computation
Symbolic computation manipulates mathematical symbols and expressions directly to obtain exact solutions. Procedural and numerical methods work on a numeric representation of the problem space and thus may lose precision. Symbolic methods work on the abstract concepts themselves, allowing derivation, simplification, and solution of problems with underlying abstract mathematical structures.
Symbolic computation took off in the '60s with MIT's Macsyma, and in the '80s with the University of Waterloo's Maple and Wolfram's Mathematica.
This casual chronicle will discuss how symbolic computation can be used for solving logic problems. Specifically, I will use the Python SymPy package to solve a logic puzzle known as The Zebra Puzzle.
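To make the exact-versus-approximate distinction concrete, here is a minimal sketch comparing floating-point arithmetic with SymPy's exact objects. The variable names are mine and purely illustrative.

```python
import math
from sympy import Rational, sqrt

# Numerical: the square root is rounded to a double, so squaring it
# does not give back exactly 2.
numeric = math.sqrt(2) ** 2

# Symbolic: sqrt(2) is kept as an exact object, so squaring it is exactly 2.
symbolic = sqrt(2) ** 2

print(numeric == 2)   # False
print(symbolic == 2)  # True

# Exact rational arithmetic: three thirds add up to exactly 1.
third_sum = Rational(1, 3) + Rational(1, 3) + Rational(1, 3)
print(third_sum)      # 1
```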
Constraint Satisfaction Problems
The Zebra Puzzle is also known as Einstein's Riddle, and is a Constraint Satisfaction Problem (CSP). SymPy can solve these problems with its implementation of constraints, boolean logic, and methods to determine whether values and constraints can be satisfied.
Formal Definition:
Let P be a Logic Constraint Problem with variables V = {x1, x2, ..., xn} and constraints C = {c1, c2, ..., cm}.
P is said to be satisfiable if there exists an assignment σ: V → D (where D is the domain of the variables) such that:
∀c ∈ C, c(σ(x1), σ(x2), ..., σ(xn)) = TRUE
In other words, the assignment σ satisfies all the constraints c in C.
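The definition can be read almost literally as a brute-force search: try every assignment σ from the variables to the domain and keep the first one for which every constraint holds. The sketch below is only an illustration of the definition, not how SymPy solves the problem; the function name is_satisfiable and the toy variables, domain, and constraints are my own assumptions.

```python
from itertools import product

def is_satisfiable(variables, domain, constraints):
    """Return the first assignment that satisfies every constraint, or None."""
    for values in product(domain, repeat=len(variables)):
        sigma = dict(zip(variables, values))    # one candidate assignment V -> D
        if all(c(sigma) for c in constraints):  # every c in C must be TRUE
            return sigma
    return None

# Toy problem (hypothetical): two variables over the domain {0, 1, 2}
V = ["x1", "x2"]
D = [0, 1, 2]
C = [
    lambda s: s["x1"] != s["x2"],      # the values differ
    lambda s: s["x1"] + s["x2"] == 3,  # the values sum to 3
]

model = is_satisfiable(V, D, C)
print(model)
```

The search space grows as |D|^n, which is why the rest of this post hands the work to a SAT-based solver instead.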
Einstein’s Zebra Riddle Logic Constraint Problem:
The Zebra LCP has variables and constraints. Variables are Houses, Owners, Drinks, Pets, Cigarettes, and Colours. The constraints are two-fold: 1) overall uniqueness and completeness and 2) individual constraints. The overall constraints require that all houses are identified with unique attributes, and that all attributes are used. The individual constraints are specific to entities and their attributes (e.g. coffee is served in the green house).
The problem:
The general steps to solve logic constraint problems are:
1) Define values and variables (attributes and entities)
2) Create overall constraints
3) Create individual constraints
4) Determine satisfiability
5) If solutions exist, output formatted results; otherwise, indicate no solution
Symbolic Logic and SymPy
Let's go through those steps with some symbolic notation and the corresponding SymPy code.
1) Define Values and Variables:
from sympy import symbols
# Create symbols for each value attribute
colours_s = blue, green, ivory, red, yellow = symbols('blue green ivory red yellow')
nationalities_s = Englishman, Japanese, Norwegian, Spaniard, Ukrainian = symbols('Englishman Japanese Norwegian Spaniard Ukrainian')
pets_s = dog, fox, horse, snails, zebra = symbols('dog fox horse snails zebra')
drinks_s = coffee, juice, milk, tea, water = symbols('coffee juice milk tea water')
cigarettes_s = Chesterfields, Kools, LuckyStrike, OldGold, Parliaments = symbols('Chesterfields Kools LuckyStrike OldGold Parliaments')
# Create variables for each entity
n_houses = 5
house_colours = symbols(f'house_colour:{n_houses}')
house_nationalities = symbols(f'house_nationality:{n_houses}')
house_pets = symbols(f'house_pet:{n_houses}')
house_drinks = symbols(f'house_drink:{n_houses}')
house_cigarettes = symbols(f'house_cigarette:{n_houses}')
This creates SymPy symbols for the values, and one symbol per house for each attribute (house_colour0 through house_colour4, and so on).
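As a quick check of what the range syntax in symbols() produces, here is a minimal sketch using only the colour attribute:

```python
from sympy import symbols

n_houses = 5
# 'house_colour:5' expands to house_colour0 ... house_colour4
house_colours = symbols(f'house_colour:{n_houses}')

print(house_colours)
# (house_colour0, house_colour1, house_colour2, house_colour3, house_colour4)

# A space-separated string gives one symbol per name, returned as a tuple
colours_s = symbols('blue green ivory red yellow')
print(len(colours_s))  # 5
```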
2) Create overall constraints
The overall constraints ensure that every attribute has a value and every value is assigned to an attribute: coverage. Another overall constraint ensures that each of the five houses is painted a different colour, and that their inhabitants are of different nationalities, own different pets, drink different beverages, and smoke different brands of cigarettes: uniqueness.
For coverage, the first house colour can be blue or green or any other colour.
The constraint can be expressed in SymPy as a boolean expression:
Xor(*[Eq(house_colours[0], blue),
      Eq(house_colours[0], green),
      Eq(house_colours[0], ivory),
      Eq(house_colours[0], red),
      Eq(house_colours[0], yellow)])
So to define coverage constraints for all colours on all houses:
from sympy import Eq, Xor, symbols

colours_s = blue, green, ivory, red, yellow = symbols('blue green ivory red yellow')
n_houses = 5
house_colours = symbols(f'house_colour:{n_houses}')
constraints = []
for i in range(n_houses):
    # coverage: house i takes its colour from colours_s
    constraints.append(Xor(*[Eq(house_colours[i], colour)
                             for colour in colours_s]))
Likewise for the other attributes the same coverage constraints can be defined.
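One way to repeat the coverage pattern for every attribute is to loop over a small table of attribute names and values. This is a sketch under my own naming: the attributes dictionary and the coverage list are illustrative assumptions, not code from the notebook.

```python
from sympy import Eq, Xor, symbols

n_houses = 5

# Hypothetical lookup table: attribute name -> its five possible values
attributes = {
    'colour': 'blue green ivory red yellow',
    'nationality': 'Englishman Japanese Norwegian Spaniard Ukrainian',
    'pet': 'dog fox horse snails zebra',
    'drink': 'coffee juice milk tea water',
    'cigarette': 'Chesterfields Kools LuckyStrike OldGold Parliaments',
}

coverage = []
for name, value_names in attributes.items():
    values = symbols(value_names)
    house_vars = symbols(f'house_{name}:{n_houses}')
    for var in house_vars:
        # one coverage constraint per house and attribute
        coverage.append(Xor(*[Eq(var, value) for value in values]))

print(len(coverage))  # 25: five attributes times five houses
```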
For the uniqueness constraints, it would be nice if there was some sort of "distinct" method in SymPy, but it doesn't appear to be implemented. So here is a custom Distinct creator using boolean logic:
from sympy import And, Eq, Not

def Distinct(variables, values):
    constraints = []
    for i in range(len(variables)):
        for j in range(i+1, len(variables)):
            for value in values:
                # no two variables may both take the same value
                constraints.append(Not(And(Eq(variables[i], value),
                                           Eq(variables[j], value))))
    return And(*constraints)
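You might ask: why not use Exclusive Or? The problem is that Xor across N arguments does not mean "exactly one is true". Xor returns True if an odd number of its arguments are True, and False if an even number are True. The coverage constraints still work alongside Distinct: each value can belong to at most one house, so at most five equalities per attribute can hold, and since each of the five houses needs an odd number (at least one), every house gets exactly one.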
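The parity behaviour is easy to see with plain truth values. A minimal sketch (the variable names are mine):

```python
from sympy import Xor

one_true = Xor(True, False, False)
two_true = Xor(True, True, False)
three_true = Xor(True, True, True)

print(one_true)    # True
print(two_true)    # False
print(three_true)  # True: an odd count, even though more than one argument is True
```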
3) Create individual constraints
Individual constraints are a little more succinct. One of the Zebra Puzzle constraints is "Coffee is served in the green house". Here Xor can be used: for each house, And() pairs the Eq() for coffee with the Eq() for green, and Xor() combines the five pairings. Here is the SymPy code:
Xor(*[And(Eq(house_drinks[i], coffee), Eq(house_colours[i], green))
      for i in range(n_houses)])
The houses have an order to them, left to right numbered 0 -> 4.
One of the Zebra puzzle's constraints is "Milk is the drink in the middle house". With the middle house being house #2 we have a simple symbolic equation and SymPy code:
Eq(house_drinks[2], milk)
Some of the constraints have positional definitions. "The green house is immediately to the right of the ivory house".
Or(*[And(Eq(house_colours[i], ivory),
         Eq(house_colours[i+1], green))
     for i in range(n_houses-1)])
And another kind of constraint in the Zebra puzzle combines positional with other attributes: "Kools are smoked in the house next to the house where the horse is kept"
Or(*[And(Eq(house_cigarettes[i], Kools), Eq(house_pets[i+1], horse))
     | And(Eq(house_cigarettes[i+1], Kools), Eq(house_pets[i], horse))
     for i in range(n_houses-1)])
4) Determine satisfiability
Once all the overall and individual constraints are defined, combine them into a single collective constraint and pass it to the satisfiable function.
sympy.logic.inference.satisfiable(expr, algorithm=None, all_models=False, minimal=False, use_lra_theory=False)
Checks the satisfiability of a propositional sentence, return model(s) when it succeeds.
By setting all_models to True, when expr is satisfiable it returns a generator of models. However, if expr is not satisfiable then it returns a generator containing the single element False.
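from sympy import And
from sympy.logic.inference import satisfiable
# distinct_constraints is assumed to be a list of Distinct(...) expressions, one per attribute
all_constraints = constraints + distinct_constraints
# Solve the puzzle
solutions = satisfiable(And(*all_constraints), all_models=True)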
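To see the whole pipeline at a small scale, here is a toy version with two houses and two colours. It is only a sketch of the same technique; the reduced problem and the names h0, h1, and models are my own assumptions.

```python
from sympy import And, Eq, Not, Xor, symbols
from sympy.logic.inference import satisfiable

red, blue = symbols('red blue')
h0, h1 = symbols('house_colour:2')

constraints = [
    Xor(Eq(h0, red), Eq(h0, blue)),       # coverage for house 0
    Xor(Eq(h1, red), Eq(h1, blue)),       # coverage for house 1
    Not(And(Eq(h0, red), Eq(h1, red))),   # uniqueness: red used at most once
    Not(And(Eq(h0, blue), Eq(h1, blue))), # uniqueness: blue used at most once
    Eq(h0, red),                          # individual clue: house 0 is red
]

# all_models=True returns a generator; an unsatisfiable problem yields False,
# so falsy entries are filtered out here.
models = [m for m in satisfiable(And(*constraints), all_models=True) if m]

print(len(models))
for model in models:
    print(model)
```

Each model is a dictionary that maps every Eq(...) atom to a truth value, which is what the output step then has to decode.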
5) If solutions exist, output formatted results; otherwise, indicate no solution
The final step is processing the solution(s): check the generator object and for each solution found, display the results. We can also confirm that there is only one solution:
n_found = 0
for model in solutions:
    if not model:
        # an unsatisfiable expression yields the single element False
        break
    n_found += 1
    print_result(model)
if n_found:
    print(f"{n_found} solution(s) found.")
else:
    print("No solution found.")
Gathering up the solutions and using Pandas to display the result:
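A possible shape for that step is sketched below. The helper model_to_frame is hypothetical and not taken from the notebook; it assumes each model key has the form Eq(house_<attribute><index>, value) with a single-digit house index, as in the variables defined earlier, and it keeps only the atoms that are True.

```python
import pandas as pd
from sympy import Eq, symbols

def model_to_frame(model):
    """Turn a {Eq(house_var, value): bool} model into one row per house."""
    rows = {}
    for atom, is_true in model.items():
        if not is_true:
            continue
        house_var, value = atom.args                # Eq(house_colour0, red)
        name = str(house_var)                       # 'house_colour0'
        attribute, index = name[:-1], int(name[-1]) # single-digit index assumed
        column = attribute.replace('house_', '')    # 'colour'
        rows.setdefault(index, {})[column] = str(value)
    return pd.DataFrame.from_dict(rows, orient='index').sort_index()

# Hand-built fragment of a model, for illustration only
red, blue, tea = symbols('red blue tea')
h0 = symbols('house_colour0')
d0 = symbols('house_drink0')
sample_model = {Eq(h0, red): True, Eq(h0, blue): False, Eq(d0, tea): True}

frame = model_to_frame(sample_model)
print(frame)
```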
The puzzle is called The Zebra Puzzle because of the original question "Who owns the zebra?" (the Japanese homeowner). Another original question: "Who drinks water?" (the Norwegian homeowner).
Thanks for reading! Hope you enjoyed the walkthrough. You can check out the code as a Jupyter notebook in my repo: https://github.com/carlek/sympy-fun
References
https://en.wikipedia.org/wiki/Zebra_Puzzle
https://docs.sympy.org/latest/index.html
https://www.proquest.com/docview/1953695580
https://www.researchgate.net/publication/341189675_Is_Einstein%27s_Puzzle_Over-Specified
Written by
carl ek