Solving the Zebra Puzzle with AC-3

KahKah
May 15, 2024·
12 min read

The first time I heard of the arc consistency #3 (AC-3) algorithm during a mentoring session on the Zebra Puzzle exercise in Exercism's Elixir track (thanks afraik!). I was avoiding permutating through the combinations and had came up with a solution that used mutation to get to the solution. At first, it seemed hard and I had difficulty relating the algorithm to the puzzle, especially as the examples often seemed abstract. So, I chose to leave it, intending to revisit at a later time. That was a year ago. Now in 2024, I've been doing Exercism's 48in24 and the exercise for week 17 happens to be Zebra Puzzle! So I took the opportunity to revisit AC-3 in the puzzle. This is how I've applied it.

What's the zebra puzzle?

In the puzzle, you are given a set of clues from which you need to answer a couple of questions. Here's the puzzle from Wikipedia:

  1. There are five houses.
  2. The Englishman lives in the red house.
  3. The Spaniard owns the dog.
  4. Coffee is drunk in the green house.
  5. The Ukrainian drinks tea.
  6. The green house is immediately to the right of the ivory house.
  7. The Old Gold smoker owns snails.
  8. Kools are smoked in the yellow house.
  9. Milk is drunk in the middle house.
  10. The Norwegian lives in the first house.
  11. The man who smokes Chesterfields lives in the house next to the man with the fox.
  12. Kools are smoked in the house next to the house where the horse is kept.
  13. The Lucky Strike smoker drinks orange juice.
  14. The Japanese smokes Parliaments.
  15. The Norwegian lives next to the blue house.

Now, who drinks water? Who owns the zebra?

In the interest of clarity, it must be added that each of the five houses is painted a different color, and their inhabitants are of different national extractions, own different pets, drink different beverages and smoke different brands of American cigarets [sic]. One other thing: in statement 6, right means your right.

— Life International, December 17, 1962

Applying AC-3

In AC-3, a set of variables has a range of possible values. The range of possible values is the variable's domain. The aim is to find the values for the variables that satisfy a set of constraints. The clues can be easily seen as the constraints, but what are the variables and domains in the Zebra Puzzle?

Since there are 5 houses, we can number them 1-5. If we treat each trait (e.g. "red", "green", "orange juice", "Englishman", etc.) as a variable, then the house numbers are their domain. All traits will start with the same domain:

TypeVariable (House trait)Domain (House #)
ColorRed1, 2, 3, 4, 5
Green1, 2, 3, 4, 5
Ivory1, 2, 3, 4, 5
Yellow1, 2, 3, 4, 5
Blue1, 2, 3, 4, 5
SmokesKools1, 2, 3, 4, 5
Old gold1, 2, 3, 4, 5
Chesterfields1, 2, 3, 4, 5
Lucky strike1, 2, 3, 4, 5
Parliament1, 2, 3, 4, 5
DrinkCoffee1, 2, 3, 4, 5
Tea1, 2, 3, 4, 5
Milk1, 2, 3, 4, 5
Orange Juice1, 2, 3, 4, 5
Water1, 2, 3, 4, 5
NationalityEnglishman1, 2, 3, 4, 5
Spaniard1, 2, 3, 4, 5
Ukrainian1, 2, 3, 4, 5
Norwegian1, 2, 3, 4, 5
Japanese1, 2, 3, 4, 5
PetDog1, 2, 3, 4, 5
Fox1, 2, 3, 4, 5
Horse1, 2, 3, 4, 5
Zebra1, 2, 3, 4, 5
Snails1, 2, 3, 4, 5

The puzzle is viewed as a directed graph, with the traits as the nodes. The relationship between the nodes are the arcs. For example, consider the clue coffee is drunk in the green house. This can be represented in a graph as:

There are two arcs (coffee = green and green = coffee) going in each direction as the graph is directed. This is important as the algorithm will later go through each arc and remove inconsistent values from the "left hand's" domain (the variable of the left hand side of the = in the relationship).

For the green house is immediately to the right of the ivory house, we can use the convention adding one to the house number to get the house to the right:

So far, the arcs have been going from one node to another. It is also possible for an arc to start and end at the same node. For example, consider the clue milk is drunk in the middle house. The middle house is 3, which has the relationship milk = 3. Milk is on the left hand side but no other variable is on the right. Therefore, the arc works solely on the milk domain:

Milk = 3 arc starts and ends on the milk node

Arcs beyond the 15 clues

There are a couple of additional constraints to consider beyond the 15 clues. As stated in the paragraph at the bottom, no two houses have the same color and their inhabitants have different nationalities, pets, drink different drinks and smoke different cigarettes. If we consider the colors, then there must be arcs that say two colors are not equal. For example, red ≠ green, red ≠ ivory, red ≠ yellow, red ≠ blue, etc.

No two colors can be the same

The other arc comes from the fact that each house must also have one of the colors. In other words, if we consider the colors alone, one of values must be 1, another must be 2, etc. For example, consider the following state:

ColorDomain
Red3, 4, 5
Green4, 5
Ivory3, 4
Yellow1, 3, 4, 5
Blue2

The value 1 appears in only yellow's domain. Since house 1 must have a color and each house has a different color, house 1 must be yellow.

Based on this, there must be an arc that:

  1. Checks the other (red, green, ivory and blue) domains if there is a value that only yellow has.
  2. If there is, remove the other values from yellow's domain.

Running the algorithm

The algorithm starts by adding all the arcs to the queue. It will then go through each arc in the queue and:

  1. Check the values from the "left hand" domain for consistency. A value is consistent if it can satisfy the relationship. Values that can't are removed from the domain.
  2. If any values were removed, any arcs that depend on the left hand variable are added back to the queue (if they aren't already in the queue). An arc depends on a variable if the variable appears on the "right hand side" of the relationship. For example, for the arc green = ivory + 1, green depends on ivory.

The process repeats until the queue is empty.

For example, consider the arcs for the rule coffee is drunk in the green house. The arcs are coffee = green and green = coffee. If both their domains were [1, 2, 3, 4, 5] like at the very beginning, for coffee = green, it would go through coffee's domain and check if it is also in green's domain. Similarly, it will check if each value in green's domain is in coffee's domain when it processes green = coffee. Since both have the exact same values, neither arc removes any values.

The process continues processing arcs from the queue. Eventually, it encounters green = ivory + 1 from the green house is immediately to the right of the ivory house. If both of their domains are still [1, 2, 3, 4, 5], it will remove 1 from green's domain because the only way for the relationship to hold is if ivory = 0 but ivory's domain doesn't have 0. Since a value has been removed from green's domain, it ensure that any arcs (for example, coffee = green) where green appear is on the right side is added to the queue.

Eventually, it will re-run the coffee = green arc. This time green's domain is [2, 3, 4, 5], so it will remove 1 from coffee's domain and enqueue the arcs where coffee is on the right.

This example is also illustrated below:

Process for removing inconsistent values

The process keeps running until there is nothing left in the queue.

Implementing this in Java

The following snippet shows how I implement this in Java for 48in24:

enum Variable {
  RED,
  GREEN,
  IVORY,
  ...
}

static class Arc {
  Arc(Variable leftVariable, Collection<Variable> rightVariables, Checker checker) {
        ...
  }


  boolean check(Map<Variable, Collection<Integer>> possibilities) {
    Collection<Integer> leftDomain = possibilities.get(leftVariable);
    Collection<Integer> combinedRight =
      rightVariables.stream()
        .flatMap(v -> possibilities.get(v).stream())
        .collect(Collectors.toSet());
    return checker.check(leftDomain, combinedRight);
  }
}

@FunctionalInterface
interface Checker {
  boolean check(Collection<Integer> leftDomain, Collection<Integer> rightDomains);
}
...

List<Arc> arcs =
  new ArrayList<>(
    List.of(
      // 2. The Englishman lives in the red house.
      sameHouse(Variable.ENGLISHMAN, Variable.RED),
      sameHouse(Variable.RED, Variable.ENGLISHMAN),

      // 3. The Spaniard owns the dog.
      sameHouse(Variable.SPANIARD, Variable.DOG),
      sameHouse(Variable.DOG, Variable.SPANIARD),

      ...
    )
  );
...
static Arc sameHouse(Variable leftVariable, Variable rightVariable) {
  return new Arc(
      leftVariable,
      rightVariable,
      (leftDomain, rightDomain) -> leftDomain.retainAll(rightDomain));
}
...
Map<Variable, Collection<Integer>> possibleValues = new HashMap<>();
for (Variable variable : Variable.values()) {
  possibleValues.put(variable, new HashSet<>(List.of(1, 2, 3, 4, 5)));
}

ArrayDeque<Arc> workQueue = new ArrayDeque<>(arcs);
while (!workQueue.isEmpty()) {
  Arc arc = workQueue.poll();
  if (arc.check(possibleValues)) {
    if (domainByVariable.get(arc.leftVariable).isEmpty()) {
      throw new IllegalStateException("A domain is empty!");
    }

    // If a change was made, ensure that any arcs that are interested in the changed domain are
    // enqueued. Keep in mind, the arc may also still be coming up in the queue.
    List<Arc> toEnqueue = arcInterests.getOrDefault(arc.leftVariable, Collections.emptyList());
      for (Arc requiredArc : toEnqueue) {
        if (!workQueue.contains(requiredArc)) {
          workQueue.add(requiredArc);
        }
     }
  }
}
...

The variables are represented as the Variable enum and are mapped to their domain (possibleValues). The arcs are represented by the Arc class. Its check method finds the values currently in the domain and asks the Checker to remove values that can be eliminated and returns whether anything was changed. The arcs are actually made in static factory methods, like sameHouse, which made it much easier to create to create similar arcs.

The while loop at the bottom is where the algorithm runs through the queue of arcs. An exception is thrown if a domain becomes empty as there must always be at least one value. If any values were removed from the left hand domain, any other arcs that depend on the left hand variable are re-queued to be check again later.

Eliminating more values

The first time I ran this algorithm, it eliminated a lot of values and was able to reach an answer for who drinks water? However, it didn't reach the solution for who owns the zebra? Printing out the state revealed it hadn't fully solved the puzzle:

TypeVariableDomain
ColorRed3, 5
Green4, 5
Ivory3, 4
Yellow1
Blue2
SmokeKools1
Old gold3, 4, 5
Chesterfields2, 3, 4, 5
Lucky Strike2, 4, 5
Parliament2, 3, 4, 5
DrinkCoffee4, 5
Tea2, 4, 5
Milk3
Orange juice2, 4, 5
Water1
NationalityEnglishman3, 5
Spaniard3, 4, 5
Ukrainian2, 4, 5
Norwegian1
Japanese2, 3, 4, 5
PetDog3, 4, 5
Fox1, 3, 4, 5
Horse2
Zebra1, 3, 4, 5
Snails3, 4, 5

Clearly, it hasn't fully solved the puzzle and there were some further values to remove. Running through the arcs again wouldn't remove further values. The next step is to go through the variables still with multiple values in the domain. For each value, test to see what happens when the variable takes on that value. If it results in a conflict, then it can't be that value and the value can be removed from the domain. The follow snippet does this:

static Collection<Variable> getUnsolvedVariables(
  Map<Variable, Collection<Integer>> possibleValues) {

  return Stream.of(Variable.values())
        .filter(key -> possibleValues.get(key).size() > 1)
        .toList();
}

static boolean isSolved(Map<Variable, Collection<Integer>> possibleValues) {

  return possibleValues.values().stream().allMatch(values -> values.size() == 1);
}

Iterator<Variable> variableIterator = getUnsolvedVariables(possibleValues).iterator();
while (!solved && variableIterator.hasNext()) {
  Variable testVariable = variableIterator.next();
  List<Integer> tryValues = List.copyOf(possibleValues.get(testVariable));

  if (tryValues.size() > 1) {
    Iterator<Integer> valueIterator = tryValues.iterator();
    while (!solved && valueIterator.hasNext()) {
      Integer testValue = valueIterator.next();

      Map<Variable, Collection<Integer>> testCopy = makeCopy(possibleValues);
      Collection<Arc> testArcs = new ArrayList<>(arcs);
      testArcs.add(knownHouse(testVariable, testValue));

      try {
        reduce(testCopy, testArcs);

        if (isSolved(testCopy)) {
          solved = true;
          possibleValues = testCopy;
        }
      } catch (IllegalStateException e) {
        possibleValues.get(testVariable).remove(testValue);
        reduce(possibleValues, arcs);
        solved = isSolved(possibleValues);
      }
    }
  }
}

This starts by getting the list of variables with multiple values in their domain (getUnsolvedVariables). The values are tested in the while loops. The outer loop iterates over the unsolved variables and the inner loop iterates over the domain. The outer loop needs to recheck if there are multiple values in the domain because the loops will be removing values from the domain and it is possible that they become solved in the process.

The inner loop makes a copy of the possible values and list of arcs so that the test values can be tested without effecting the current state. An arc is added (via knownHouse) to set the variable to the test value. The reduce method runs the earlier loop from the earlier listing, to go through the arcs and remove values from the domains.

The reduce method throws an exception when it detects a conflict. If this happens, then the variable can't be that value and is removed from the domain. If a value is removed, it then runs through the arcs again in case the removing the value allows even more values to be removed.

Conclusion

The cool thing about this approach is that it solve the puzzle by eliminating values that don't meet the criteria. Kind of like trying to apply logic to solve the puzzle instead of trying out every single permutation until the solution is found.

If you'd like to see the full implementation, it is avalaible here, or in Javascript, V or Scala 2.

(Cover image created using Image Creator)

Update 29 July 2024

Exercism has recently uploaded a video about solving this puzzle on Youtube. Be sure to check it out!

23
Subscribe to my newsletter

Read articles from Kah directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Kah
Kah

Thanks to some friends, I started programming in high school with QBASIC. Nowadays, I'm a software engineer working mostly in Java and active on Exercism. On weekends, you can find me outside, at a parkrun, tending to the veggie patch or the six backyard chooks.