Close

Large Number Combinatoric calculations in PL/SQL

Calculating factorials quickly explode into large results. The SQL and PL/SQL numeric types can’t hold the extremely large values generated by factorial computation of even relatively small numbers.

Since several numeric types are subtypes, they inherit the limits of their parent type. All of the types will eventually overflow when the results get too big. Some of them will lose precision before reaching their overflow limits.

First let’s look at a simple implementation of the factorial function

CREATE OR REPLACE FUNCTION factorial(p_n IN INTEGER)
    RETURN NUMBER
    DETERMINISTIC
IS
    x   NUMBER;
BEGIN
    IF p_n < 0
    THEN
        RAISE VALUE_ERROR;
    END IF;

    x := 1;

    FOR i IN 2 .. p_n
    LOOP
        x := x * i;
    END LOOP;

    RETURN x;
END factorial;

Changing the data type of v_temp you can test each of them with increasingly larger values until they no longer return numbers. I’ve already found the limits for each type and subtype.

PLS_INTEGER, BINARY_INTEGER and SIMPLE_INTEGER are limited to 12!
12! = 479001600

INTEGER and DECIMAL are limited to 33!
33!= 8683317618811886495518194401280000000

BINARY_FLOAT will allow you to hold up to 34! but it loses precision beginning with 14! So 13! (6227020800) is the limit for accurate results.

BINARY_DOUBLE similarly will hold larger results, up to 83! but loses precision after 21! (51090942171709440000)

NUMBER will hold up to 83! but loses precision after 40! (815915283247897734345611269596115894272000000000)

Since NUMBER has the greatest capacity, it is what I use for these functions, choosing accuracy over possible performance gains from using other types.

Some of the numbers above may seem astronomically large and unreasonable for day-to-day calculations. But, combinations with something as mundane as a standard deck of playing cards can yield such large numbers with intermediate values when performing the calculations.

For example, the number of possible 5-card hands from a standard deck is calculated with combination formula \displaystyle nCk = {n \choose k} =\frac{n!}{k!(n-k)!} = \frac{52!}{5! 47!}

Both 52! and 47! are too big to be calculated with complete accuracy if the factorials are fully resolved.

However, 52! = 52*51*50*49*48*47!, so we can cancel the 47! from the numerator and denominator and thus simplify our calculation to \displaystyle\frac{52*51*50*49*48}{5!}. This calculation is well within the capabilities of the NUMBER type, yielding 2598960 possible distinct hands of 5 cards.

SQL> select 52*51*50*49*48/factorial(5) from dual;

   52*51*50*49*48/FACTORIAL(5)
______________________________
                       2598960

Implementing factorial division involves a number of possibilities. The numerator and denominator could be equal, the numerator could be greater than the denominator or vice versa and certain values (0 and 1) are already known to be 1 thus simplifying the calculation.

CREATE OR REPLACE FUNCTION factorial_division(p_numerator     IN INTEGER,
                                              p_denominator   IN INTEGER)
    RETURN NUMBER
    DETERMINISTIC
IS
    x   NUMBER;
BEGIN
    -- These don't make sense, don't even try
    IF p_numerator < 0 OR p_denominator < 0
    THEN
        RAISE VALUE_ERROR;
    END IF;

    CASE
        WHEN p_numerator = p_denominator
        THEN
            -- If n = d then n!=d! so the quotient is 1
            x := 1;
        WHEN p_denominator < 2
        THEN
            -- If the denominator is 0 or 1 then d! = 1 so the quotient is n!
            x := factorial(p_numerator);
        WHEN p_numerator < 2
        THEN
            -- If the numerator is 0 or 1 then n! = 1 so the quotient is 1/d!
            x := 1 / factorial(p_denominator);
        WHEN p_numerator > p_denominator
        THEN
            -- If the numerator is greater than the denominator
            -- then the denominator will be cancelled out
            -- so, only multiply the elements of the factorial greater than the denominator
            x := p_denominator + 1;

            FOR i IN p_denominator + 2 .. p_numerator
            LOOP
                x := x * i;
            END LOOP;
        ELSE
            -- p_numerator < p_denominator
            --
            -- If the numerator is less than the denominator
            -- then the numerator will be cancelled out
            -- so, only multiply the elements of the factorial greater than the numerator
            -- then divide 1 by that product
            x := p_numerator + 1;

            FOR i IN p_numerator + 2 .. p_denominator
            LOOP
                x := x * i;
            END LOOP;

            x := 1 / x;
    END CASE;

    RETURN x;
END factorial_division;

So calculating the result using the functions above:

SQL> select factorial_division(52,47)/factorial(5) from dual;

   FACTORIAL_DIVISION(52,47)/FACTORIAL(5)
_________________________________________
                                  2598960

This however does presuppose you always know whether k! is larger than (n-k)! so you can maximize the negation. Rather than guessing which might be better, we can examine all of the options and execute the division with the greatest amount of cancellation. Putting all of the above logic into a single function then looks like this…

CREATE OR REPLACE FUNCTION combination_cnt(p_n IN INTEGER, p_k IN INTEGER)
    RETURN NUMBER
    DETERMINISTIC
IS
BEGIN
    IF p_n < 0 OR p_k < 0
    THEN
        RAISE VALUE_ERROR;
    END IF;

    RETURN CASE
               WHEN p_k > p_n
               THEN
                   0
               WHEN p_k = 0 OR p_k = p_n
               THEN
                   1
               WHEN p_n - p_k > p_k
               THEN
                   factorial_division(p_n, p_n - p_k) / factorial(p_k)
               ELSE
                   factorial_division(p_n, p_k) / factorial(p_n - p_k)
           END;
END combination_cnt;

Thus calculating the number of 5-card hands is simply:

SQL> select combination_cnt(52,5) from dual;

   COMBINATION_CNT(52,5)
________________________
                 2598960

With these tools we can tackle some types of calculations that could otherwise overwhelm the numerical limits of the NUMBER type. They don’t eliminate all possible overflow conditions but they will help.

These functions can be used to help calculate more complicated formulas in probability and combinatorics.

Thank you for reading and I hope you find these functions useful. Questions and comments, as always, are welcome.