Close

Solving the Water Jug puzzle with SQL

The Water Jugs are a classic puzzle going back centuries with many variations. While the specific numbers involved may change from one telling to the next the general form is one small jug (cup, barrel, bucket, etc.) and one large one. Each with an integer capacity of units (liters, gallons, ounces, etc.) The goal of the puzzle is to use only those two containers and a supply of water (or other liquid) to measure out some specific integer amount.

Although not specified the two values are usually coprime, also known as relatively prime, that is the two numbers have only 1 as a common divisor. Furthermore, the target number is some value between 1 and one less than the larger of the two jugs. In some variations the solution may be to solve for all of those values instead of just one of them.

An example may help illustrate. I’ll use what is probably the most famous example of the puzzle – that found in the movie Die Hard 3. In the movie Samuel L. Jackson and Bruce Willis are tasked with using a 3 gallon jug and a 5 gallon jug to measure out 4 gallons of water from a fountain.

To solve the puzzle, they filled the large jug and poured it into the small jug which is then emptied and refilled with the remaining 2 gallons, thus still having 1 gallon of left over capacity and the large jug is empty. The large jug is then refilled and one gallon is poured into the smaller jug. At this point the 3-gallon jug is full, and the 5-gallon jug has 4 gallons left in it. Getting 4 gallons was the goal, thus they solved puzzle.

If the numerical properties stated above are held, (i.e. coprime integer values with integer target less than largest jug) then the method above can be used to solve for any jugs and any target with those properties. That is, using one jug as the filler, and the other as the receiver, you can always use that same pouring jug and then emptying the receiving jug iteratively until you reach the target value you’re looking for. As an added bonus, it doesn’t matter which jug you use as the pouring jug and which is the receiving jug as long as you keep the same roles for the duration of the loops.

Those are strong claims but we know they are true due to Bézout’s identity.

Let a and b be integers with greatest common divisor d. Then there exist integers x and y such that ax + by = d. More generally, the integers of the form ax + by are exactly the multiples of d.

https://en.wikipedia.org/wiki/Bézout’s_identity

Since the water jug puzzle uses coprime values for a and b, the common divisor is 1. And, since all integers are multiples of 1, then all integers of the form ax + by will be solutions. I’m sure there are examples of the puzzle that do not use relatively prime values for the jug volumes; but in those cases, the number of possible target values will be limited. Using the relatively prime values allows for solutions for any value 1-N where n is the size of the larger jug. N itself is not used though because it’s a trivial solution. You simply fill the larger jug and you’re done in one step. So 1-N where N is less than the larger jug’s capacity is the typical range of interesting solutions, skipping the case where N equals the volume of the smaller jug for the same reason of being a trivial solution in one step.

So, the math checks out, but how do we actually use this in SQL? Below was my first attempt. The values “small” and “big” referred to the jug capacities while “s” and “b” referred to the actual amount held at the end of any one step.

WITH
    params AS(SELECT 3 small, 5 big, 4 target FROM DUAL),
    steps (step, s, b)
    AS
        (SELECT 0, 0, 0 FROM DUAL
         UNION ALL
         SELECT step + 1,
                CASE
                    WHEN s = 0 AND small + b >= big THEN small - (big - b)
                    WHEN s = 0 AND small + b < big THEN 0
                    WHEN s != 0 AND s + b >= big THEN s - (big - b)
                    WHEN s != 0 AND s + b < big THEN 0
                END s,
                CASE
                    WHEN s = 0 AND small + b >= big THEN 0
                    WHEN s = 0 AND small + b < big THEN b + small
                    WHEN s != 0 AND s + b >= big THEN 0
                    WHEN s != 0 AND s + b < big THEN b + s
                END b
           FROM steps, params
          WHERE target NOT IN (s, b))
SELECT *
  FROM steps;

      STEP          S          B
---------- ---------- ----------
         0          0          0
         1          0          3
         2          1          0
         3          0          1
         4          0          4

The flow of the query is, start at Step-0 with both jugs empty. In my query I use the opposite method to that found in Die Hard 3. I use the small jug as the pouring jug and the big jug as the receiving jug. When the small jug is empty, I didn’t include a distinct step to refill it. That’s why, when the s=0, the “small” is used instead of “s” for pouring into the big jug. Because “small” is the capacity value, when s=0, the jug is empty, but using the capacity value is equivalent to filling it up again. When s != 0, the math is all the same, but I use whatever amount is left over in the small jug instead of refilling it to capacity.

Since the pouring only goes one way (i.e. we don’t alternate pouring from small to big and then from big to small,) and we always fill as much as we can from the pouring jug – each step yields only one possible new value for each jug.

Reading the results above we see it worked but took a few extra steps than our movie heroes needed. First the small jug was filled and then poured into the big jug, leaving (0,3) as our (small,big) state. Then, since the small jug was empty again, we refilled and poured 2 gallons into the big jug, which we then emptied. Thus leaving 1 gallon in the small jug and nothing in the big jug. Next, we pour the 1 gallon into the big jug yielding (0,1). With the small jug empty again, refill it and pour all of it into the big jug which brings its total to 4 gallons, our target, so we are done.

This solution worked, but was obviously inferior to the movie solution because it took more steps. So, I changed the steps subquery to use the big jug as the pouring jug and the small one as the receiver. I distinguish this from the original by renaming it big_steps indicating these are steps taken when the big jug is the pouring source.

WITH
    params AS(SELECT 3 small, 5 big, 4 target FROM DUAL),
    big_steps (step, s, b)
    AS
        (SELECT 0, 0, 0 FROM DUAL
         UNION ALL
         SELECT step + 1,
                CASE
                    WHEN b = 0 THEN 0
                    WHEN b != 0 AND s + b >= small THEN 0
                    WHEN b != 0 AND s + b < small THEN b + s
                END s,
                CASE
                    WHEN b = 0 THEN big - (small - s)                   
                    WHEN b != 0 AND s + b >= small THEN b - (small - s)
                    WHEN b != 0 AND s + b < small THEN 0
                END b
           FROM big_steps, params
          WHERE target NOT IN (s, b))
SELECT *
  FROM big_steps;

      STEP          S          B
---------- ---------- ----------
         0          0          0
         1          0          2
         2          2          0
         3          0          4

We we have our solution with 4 gallons again. The results follow the same logic found in the movie. Checking for other target values, 1 is best achieved using the small jug, 2 and 4 are better using the larger jug. 3 is also better using the small jug but it’s the trivial case of one fill and done.

You might note there are fewer WHEN clauses in the CASE statements. This is because it wasn’t necessary to check for over filling when the big jug is full. Since the big jug is, by definition, bigger than the small jug, we already know that anytime we refill the big jug it will always exceed the capacity of the small jug regardless of how much it already holds.

With two competing solutions the next step was combining them and only returning the solution that worked best. One method to do this is: I can query both small and big steps and count the number of steps in each, then only report those that have the lowest maximum step count. In the event of a tie, I’ll print both sets of results.

WITH
    params AS(SELECT 3 small, 5 big, 4 target FROM DUAL),
    small_steps (step, s, b)
    AS
        (SELECT 0, 0, 0 FROM DUAL
         UNION ALL
         SELECT step + 1,
                CASE
                    WHEN s = 0 AND small + b >= big THEN small - (big - b)
                    WHEN s = 0 AND small + b < big THEN 0
                    WHEN s != 0 AND s + b >= big THEN s - (big - b)
                    WHEN s != 0 AND s + b < big THEN 0
                END s,
                CASE
                    WHEN s = 0 AND small + b >= big THEN 0
                    WHEN s = 0 AND small + b < big THEN b + small
                    WHEN s != 0 AND s + b >= big THEN 0
                    WHEN s != 0 AND s + b < big THEN b + s
                END b
           FROM small_steps, params
          WHERE target NOT IN (s, b)),
    big_steps (step, s, b)
    AS
        (SELECT 0, 0, 0 FROM DUAL
         UNION ALL
         SELECT step + 1,
                CASE
                    WHEN b = 0 THEN 0
                    WHEN b != 0 AND s + b >= small THEN 0
                    WHEN b != 0 AND s + b < small THEN b + s
                END s,
                CASE
                    WHEN b = 0 THEN big - (small - s)
                    WHEN b != 0 AND s + b >= small THEN b - (small - s)
                    WHEN b != 0 AND s + b < small THEN 0
                END b
           FROM big_steps, params
          WHERE target NOT IN (s, b))
  SELECT pour, step, s, b
    FROM (SELECT MIN(maxpour) OVER() minall, maxpour, pour, step, s, b
            FROM (SELECT MAX(step) OVER (PARTITION BY pour) maxpour,
                         pour,
                         step,
                         s,
                         b
                    FROM (SELECT 'big' pour, b.*
                            FROM big_steps b
                          UNION ALL
                          SELECT 'small' pour, s.*
                            FROM small_steps s)))
   WHERE minall = maxpour
ORDER BY pour, step;

POUR        STEP          S          B
----- ---------- ---------- ----------
big            0          0          0
big            1          0          2
big            2          2          0
big            3          0          4

At this point I could have declared my little project done; but it was kind of nagging at me that I was hiding the refill action inside the math. I thought it made the query harder to understand and awkward to read the output. So I added a condition for that and it actually made the case statements shrink because I no longer needed to have separate checks for empty or not where I substituted the capacity for the actual value. By including the refill action as distinct step the other conditions became simpler.

WITH
    params AS(SELECT 3 small, 5 big, 4 target FROM DUAL),
    small_steps (step, s, b)
    AS
        (SELECT 0, 0, 0 FROM DUAL
         UNION ALL
         SELECT step + 1,
                CASE
                    WHEN s = 0 THEN small
                    WHEN s + b >= big THEN s - (big - b)
                    WHEN s + b < big THEN 0
                END s,
                CASE
                    WHEN s = 0 THEN b
                    WHEN s + b >= big THEN 0
                    WHEN s + b < big THEN b + s
                END b
           FROM small_steps, params
          WHERE target NOT IN (s, b)),
    big_steps (step, s, b)
    AS
        (SELECT 0, 0, 0 FROM DUAL
         UNION ALL
         SELECT step + 1,
                CASE
                    WHEN b = 0 THEN s
                    WHEN s + b >= small THEN 0
                    WHEN s + b < small THEN b + s
                END s,
                CASE
                    WHEN b = 0 THEN big
                    WHEN s + b >= small THEN b - (small - s)
                    WHEN s + b < small THEN 0
                END b
           FROM big_steps, params
          WHERE target NOT IN (s, b))
  SELECT pour, step, s, b
    FROM (SELECT MIN(maxpour) OVER() minall, maxpour, pour, step, s, b
            FROM (SELECT MAX(step) OVER (PARTITION BY pour) maxpour,
                         pour,
                         step,
                         s,
                         b
                    FROM (SELECT 'big' pour, b.*
                            FROM big_steps b
                          UNION ALL
                          SELECT 'small' pour, s.*
                            FROM small_steps s)))
   WHERE minall = maxpour
ORDER BY pour, step;

POUR        STEP          S          B
----- ---------- ---------- ----------
big            0          0          0
big            1          0          5
big            2          0          2
big            3          2          0
big            4          2          5
big            5          0          4

Now we get the same results but we can see the big jug getting filled up twice in the process.

And finally, I wanted the SQL to not just spit out a bunch of numbers. I thought it would be more helpful if it returned instructions, in words, that indicated what was happening during each step.

WITH
    params AS(SELECT 3 small, 5 big, 4 target FROM DUAL),
    small_steps (step, s, b, solution)
    AS
        (SELECT 0, 0, 0, 'Initial state, both jugs are empty' FROM DUAL
         UNION ALL
         SELECT step + 1,
                CASE
                    WHEN s = 0 THEN small
                    WHEN s + b >= big THEN s - (big - b)
                    WHEN s + b < big THEN 0
                END s,
                CASE
                    WHEN s = 0 THEN b
                    WHEN s + b >= big THEN 0
                    WHEN s + b < big THEN b + s
                END b,
                CASE
                    WHEN s = 0
                    THEN
                        'The Small jug is empty, fill it.'
                    WHEN s + b >= big
                    THEN
                        'Pour the Small jug into the Big jug until the Big jug is full, then empty the Big jug'
                    WHEN s + b < big
                    THEN
                        'Pour all of the Small jug into the Big jug'
                END solution
           FROM small_steps, params
          WHERE target NOT IN (s, b)),
    big_steps (step, s, b, solution)
    AS
        (SELECT 0, 0, 0, 'Initial state, both jugs are empty' FROM DUAL
         UNION ALL
         SELECT step + 1,
                CASE
                    WHEN b = 0 THEN s
                    WHEN s + b >= small THEN 0
                    WHEN s + b < small THEN b + s
                END s,
                CASE
                    WHEN b = 0 THEN big
                    WHEN s + b >= small THEN b - (small - s)
                    WHEN s + b < small THEN 0
                END b,
                CASE
                    WHEN b = 0
                    THEN
                        'The Big jug is empty, fill it.'
                    WHEN s + b >= small
                    THEN
                        'Pour the Big jug into the Small jug until the Small jug is full, then empty the Small jug'
                    WHEN s + b < small
                    THEN
                        'Pour all of the Big jug into the Small jug'
                END solution
           FROM big_steps, params
          WHERE target NOT IN (s, b))
  SELECT pour, step, s, b, solution
    FROM (SELECT MIN(maxpour) OVER() minall, maxpour, pour, step, s, b, solution
            FROM (SELECT MAX(step) OVER (PARTITION BY pour) maxpour,
                         pour,
                         step,
                         s,
                         b,
                         solution
                    FROM (SELECT 'big' pour, b.*
                            FROM big_steps b
                          UNION ALL
                          SELECT 'small' pour, s.*
                            FROM small_steps s)))
   WHERE minall = maxpour
ORDER BY pour, step;

POUR  STEP S B SOLUTION                                                                                 
----- ---- - - -----------------------------------------------------------------------------------------
big     0  0 0 Initial state, both jugs are empty                                                       
big     1  0 5 The Big jug is empty, fill it.                                                           
big     2  0 2 Pour the Big jug into the Small jug until the Small jug is full, then empty the Small jug
big     3  2 0 Pour all of the Big jug into the Small jug                                               
big     4  2 5 The Big jug is empty, fill it.                                                           
big     5  0 4 Pour the Big jug into the Small jug until the Small jug is full, then empty the Small jug

That was fun. I hope you enjoyed reading it as much as I did writing it. Questions and comments, as always, are welcome.