Close

Implementing xoshiro256** in PL/SQL

A long time ago I ported the Mersenne Twister pseudo-random number generator to pl/sql. That algorithm, and my port, are showing their age so I started looking at other generators. A newer approach is found in the xoshiro/xoroshiro family by David Blackman and Sebastiano Vigna found here.

I’ve chosen xoshiro256** as one of their “all-purpose, rock-solid
generators”
. Like the Mersenne Twister, the original source is written in c and utilizes numerous bit manipulations to generate new values from the prior ones.

While PL/SQL has the BITAND function, it doesn’t have any others that directly apply to numeric types. It would have been possible to use RAW values and the UTL_RAW package, but I found it easier to keep true to the original implementation by creating my own bit-wise OR, XOR, shift, and rotation functions.

Implementing 64-bit integer subtype (almost)

Another complexity is that c has the uint64_t type. That is, a 64-bit unsigned integer. While PL/SQL NUMBER types can hold such values, their is no direct correlation. Ideally I would create an integer subtype constrained with a RANGE clause of 0..18446744073709551615 (2^64-1). However, there is no legal syntax to support that definition. As of 23ai, the RANGE clause only applies to subtypes of PLS_INTEGER which is, itself, limited to signed 32-bit values. So I settled for a 20-digit integer using:

SUBTYPE uint64_t IS NUMBER(20, 0)

Then, in my calculations I apply a 64-bit mask to ensure overflows are handled in a way that mimics the c uint64_t.

Bit-wise operators

As mentioned above, some of the c syntax has no corresponding pl/sql operator or function. In particular:

  • | – OR
  • ^ – XOR
  • << – Shift left
  • >> – Shift right

Fortunately, BITAND and arithmetic is sufficient to create OR and XOR. Multiplying or dividing by powers of 2 will replicate bit shifts left or right respectively.

    FUNCTION bitor(p_a IN uint64_t, p_b IN uint64_t)
        RETURN uint64_t
    IS
    BEGIN
        RETURN BITAND(p_a + p_b - BITAND(p_a, p_b), c_64bitmask);
    END bitor;

    FUNCTION bitxor(p_a IN uint64_t, p_b IN uint64_t)
        RETURN uint64_t
    IS
    BEGIN
        RETURN BITAND((p_a + p_b) - (BITAND(p_a, p_b) * 2), c_64bitmask);
    END bitxor;

    FUNCTION shl64(p_n IN uint64_t, p_bits IN bits_t)
        RETURN uint64_t
    IS
    BEGIN
        -- Apply 64-bit mask to ensure results are within proper range
        RETURN CASE p_bits WHEN 0 THEN p_n ELSE BITAND(p_n * (2 ** p_bits), c_64bitmask) END;
    END shl64;

    FUNCTION shr64(p_n IN uint64_t, p_bits IN bits_t)
        RETURN uint64_t
    IS
    BEGIN
        -- Apply 64-bit mask to ensure results are within proper range
        RETURN CASE p_bits WHEN 0 THEN p_n ELSE BITAND(p_n / (2 ** p_bits), c_64bitmask) END;
    END shr64;

It wasn’t technically necessary to use a CASE for shifts of 0 bits, but it seemed more natural to capture that special case rather than force the extra math steps to produce a result we already knew. This is merely a preference on my part and both shl64 and shr64 could be simplified to let 0 run the same as any other shift value. Also of note, both of the shift functions apply the 64-bit mask to ensure there the results always fall in the intended 0-18446744073709551615 range.

The last special bit operation is rotation, the xoshiro code itself includes the rotl logic which can then be converted to PL/SQL using the functions above:

    FUNCTION rotl(p_x uint64_t, p_k IN bits_t)
        RETURN uint64_t
    IS
    -- static inline uint64_t rotl(const uint64_t x, int k) {
    --  return (x << k) | (x >> (64 - k));
    -- }
    BEGIN
        RETURN CASE WHEN p_k = 0 THEN p_x ELSE bitor(shl64(p_x, p_k), shr64(p_x, 64 - p_k)) END;
    END rotl;

Pulling it all together

With the bit operations set, implementing the functions was fairly straight forward. There are multiple arrays used within the original c code which I originally implemented as varrays, but it felt artificial to keep adjusting the varray indexes to be 1-based vs the 0-based arrays. So, instead, I decided to use an associative arrays with indexes limited to 0,1,2,3. The full package follows:

CREATE OR REPLACE PACKAGE xoshiro256
IS
    -- Based on xoshiro256** by by David Blackman and Sebastiano Vigna
    -- original  https://prng.di.unimi.it/xoshiro256starstar.c


    --                    .///.
    --                   (0 o)
    ---------------0000--(_)--0000---------------
    --
    --  Sean D. Stuber
    --  sean.stuber@gmail.com
    --
    --             oooO      Oooo
    --------------(   )-----(   )---------------
    --             \ (       ) /
    --              \_)     (_/


    -- It would be nice to define an explicit RANGE 0.. 18446744073709551615  for the subtype
    -- but as of 23ai, only subtypes of PLS_INTEGER may have a range.
    SUBTYPE uint64_t IS NUMBER(20, 0);

    TYPE seed_tab IS TABLE OF uint64_t;

    PROCEDURE set_state(p_seed IN seed_tab);

    FUNCTION get_next
        RETURN uint64_t;

    PROCEDURE jump;

    PROCEDURE long_jump;
END;
CREATE OR REPLACE PACKAGE BODY sds.xoshiro256
IS
    -- TO_NUMBER('ffffffffffffffff', 'xxxxxxxxxxxxxxxx') = 18446744073709551615 = 2^64-1
    c_64bitmask   CONSTANT uint64_t := POWER(2, 64) - 1; -- 18446744073709551615

    SUBTYPE bits_t IS PLS_INTEGER RANGE 0 .. 63;

    SUBTYPE array_index_t IS PLS_INTEGER RANGE 0 .. 3;

    TYPE state_array_t IS TABLE OF uint64_t
        INDEX BY array_index_t;

    g_state                 state_array_t;

    FUNCTION bitor(p_a IN uint64_t, p_b IN uint64_t)
        RETURN uint64_t
    IS
    BEGIN
        RETURN BITAND(p_a + p_b - BITAND(p_a, p_b), c_64bitmask);
    END bitor;

    FUNCTION bitxor(p_a IN uint64_t, p_b IN uint64_t)
        RETURN uint64_t
    IS
    BEGIN
        RETURN BITAND((p_a + p_b) - (BITAND(p_a, p_b) * 2), c_64bitmask);
    END bitxor;

    FUNCTION shl64(p_n IN uint64_t, p_bits IN bits_t)
        RETURN uint64_t
    IS
    BEGIN
        -- Apply 64-bit mask to ensure results are within proper range
        RETURN CASE p_bits WHEN 0 THEN p_n ELSE BITAND(p_n * (2 ** p_bits), c_64bitmask) END;
    END shl64;

    FUNCTION shr64(p_n IN uint64_t, p_bits IN bits_t)
        RETURN uint64_t
    IS
    BEGIN
        -- Apply 64-bit mask to ensure results are within proper range
        RETURN CASE p_bits WHEN 0 THEN p_n ELSE BITAND(p_n / (2 ** p_bits), c_64bitmask) END;
    END shr64;

    FUNCTION rotl(p_x uint64_t, p_k IN bits_t)
        RETURN uint64_t
    IS
    -- static inline uint64_t rotl(const uint64_t x, int k) {
    --  return (x << k) | (x >> (64 - k));
    -- }
    BEGIN
        RETURN CASE WHEN p_k = 0 THEN p_x ELSE bitor(shl64(p_x, p_k), shr64(p_x, 64 - p_k)) END;
    END rotl;

    FUNCTION get_next
        RETURN uint64_t
    IS
        --  uint64_t next(void) {
        --   const uint64_t result = rotl(s[1] * 5, 7) * 9;
        --
        --   const uint64_t t = s[1] << 17;
        --
        --   s[2] ^= s[0];
        --   s[3] ^= s[1];
        --   s[1] ^= s[2];
        --   s[0] ^= s[3];
        --
        --   s[2] ^= t;
        --
        --   s[3] = rotl(s[3], 45);
        --
        --   return result;
        --  }

        v_result   uint64_t := BITAND(rotl(BITAND(g_state(1) * 5, c_64bitmask), 7) * 9, c_64bitmask);

        v_t        uint64_t := shl64(g_state(1), 17);
    BEGIN
        g_state(2) := bitxor(g_state(2), g_state(0));
        g_state(3) := bitxor(g_state(3), g_state(1));
        g_state(1) := bitxor(g_state(1), g_state(2));
        g_state(0) := bitxor(g_state(0), g_state(3));

        g_state(2) := bitxor(g_state(2), v_t);

        g_state(3) := rotl(g_state(3), 45);
        RETURN v_result;
    END get_next;

    PROCEDURE jump
    IS
        --   /* This is the jump function for the generator. It is equivalent
        --     to 2^128 calls to next(); it can be used to generate 2^128
        --     non-overlapping subsequences for parallel computations. */
        --
        --  void jump(void) {
        --   static const uint64_t JUMP[] = { 0x180ec6d33cfd0aba, 0xd5a61266f0c9392c, 0xa9582618e03fc9aa, 0x39abdc4529b1661c };
        --
        --   uint64_t s0 = 0;
        --   uint64_t s1 = 0;
        --   uint64_t s2 = 0;
        --   uint64_t s3 = 0;
        --   for(int i = 0; i < sizeof JUMP / sizeof *JUMP; i++)
        --    for(int b = 0; b < 64; b++) {
        --     if (JUMP[i] & UINT64_C(1) << b) {
        --      s0 ^= s[0];
        --      s1 ^= s[1];
        --      s2 ^= s[2];
        --      s3 ^= s[3];
        --     }
        --     next();
        --    }
        --
        --   s[0] = s0;
        --   s[1] = s1;
        --   s[2] = s2;
        --   s[3] = s3;
        --  }

        v_jump   CONSTANT state_array_t
            := state_array_t(0   => TO_NUMBER('180ec6d33cfd0aba', 'xxxxxxxxxxxxxxxx'),
                            1   => TO_NUMBER('d5a61266f0c9392c', 'xxxxxxxxxxxxxxxx'),
                            2   => TO_NUMBER('a9582618e03fc9aa', 'xxxxxxxxxxxxxxxx'),
                            3   => TO_NUMBER('39abdc4529b1661c', 'xxxxxxxxxxxxxxxx')) ;

        v_s0              uint64_t := 0;
        v_s1              uint64_t := 0;
        v_s2              uint64_t := 0;
        v_s3              uint64_t := 0;
        v_dummy           uint64_t;
    BEGIN
        FOR i IN 0 .. 3
        LOOP
            FOR b IN 0 .. 63
            LOOP
                IF BITAND(v_jump(i), shl64(1, b)) != 0
                THEN
                    v_s0 := bitxor(v_s0, g_state(0));
                    v_s1 := bitxor(v_s1, g_state(1));
                    v_s2 := bitxor(v_s2, g_state(2));
                    v_s3 := bitxor(v_s3, g_state(3));
                END IF;

                v_dummy := get_next;
            END LOOP;
        END LOOP;

        g_state(0) := v_s0;
        g_state(1) := v_s1;
        g_state(2) := v_s2;
        g_state(3) := v_s3;
    END jump;

    PROCEDURE long_jump
    IS
        --  /* This is the long-jump function for the generator. It is equivalent to
        --     2^192 calls to next(); it can be used to generate 2^64 starting points,
        --     from each of which jump() will generate 2^64 non-overlapping
        --     subsequences for parallel distributed computations. */
        --
        --  void long_jump(void) {
        --   static const uint64_t LONG_JUMP[] = { 0x76e15d3efefdcbbf, 0xc5004e441c522fb3, 0x77710069854ee241, 0x39109bb02acbe635 };
        --
        --   uint64_t s0 = 0;
        --   uint64_t s1 = 0;
        --   uint64_t s2 = 0;
        --   uint64_t s3 = 0;
        --   for(int i = 0; i < sizeof LONG_JUMP / sizeof *LONG_JUMP; i++)
        --    for(int b = 0; b < 64; b++) {
        --     if (LONG_JUMP[i] & UINT64_C(1) << b) {
        --      s0 ^= s[0];
        --      s1 ^= s[1];
        --      s2 ^= s[2];
        --      s3 ^= s[3];
        --     }
        --     next();
        --    }
        --
        --   s[0] = s0;
        --   s[1] = s1;
        --   s[2] = s2;
        --   s[3] = s3;
        --  }

        v_longjump   CONSTANT state_array_t
            := state_array_t(0   => TO_NUMBER('76e15d3efefdcbbf', 'xxxxxxxxxxxxxxxx'),
                            1   => TO_NUMBER('c5004e441c522fb3', 'xxxxxxxxxxxxxxxx'),
                            2   => TO_NUMBER('77710069854ee241', 'xxxxxxxxxxxxxxxx'),
                            3   => TO_NUMBER('39109bb02acbe635', 'xxxxxxxxxxxxxxxx')) ;

        v_s0                  uint64_t := 0;
        v_s1                  uint64_t := 0;
        v_s2                  uint64_t := 0;
        v_s3                  uint64_t := 0;
        v_dummy               uint64_t;
    BEGIN
        FOR i IN 0 .. 3
        LOOP
            FOR b IN 0 .. 63
            LOOP
                IF BITAND(v_longjump(i), shl64(1, b)) != 0
                THEN
                    v_s0 := bitxor(v_s0, g_state(0));
                    v_s1 := bitxor(v_s1, g_state(1));
                    v_s2 := bitxor(v_s2, g_state(2));
                    v_s3 := bitxor(v_s3, g_state(3));
                END IF;

                v_dummy := get_next;
            END LOOP;
        END LOOP;

        g_state(0) := v_s0;
        g_state(1) := v_s1;
        g_state(2) := v_s2;
        g_state(3) := v_s3;
    END long_jump;

    PROCEDURE set_state(p_seed IN seed_tab)
    IS
    BEGIN
        g_state := state_array_t();

        FOR i IN 0 .. 3
        LOOP
            g_state(i) := p_seed(i + 1);
        END LOOP;
    END set_state;
END;

Picking a set of good seed values help build better distributions in the generated sequence. For the purposes of illustration though, I’ll use a simple 12345 in each of the seed array elements and then generate 10 values with get_next, then jump and generate 10 more and finally do a long_jump and generate a final set of 10. This simple framework allowed me to verify my logic against the original doing the same steps.

DECLARE
    x   xoshiro256.uint64_t;
BEGIN
    xoshiro256.set_state(xoshiro256.seed_tab(12345, 12345, 12345, 12345));

    FOR i IN 0 .. 9
    LOOP
        x := xoshiro256.get_next;
        DBMS_OUTPUT.put_line(TO_CHAR(i, '9') || ' ' || LPAD(TO_CHAR(x), 20) || ' ' || TO_CHAR(x, 'xxxxxxxxxxxxxxxx'));
    END LOOP;

    DBMS_OUTPUT.put_line('--- jump ---');
    xoshiro256.jump;

    FOR i IN 0 .. 9
    LOOP
        x := xoshiro256.get_next;
        DBMS_OUTPUT.put_line(TO_CHAR(i, '9') || ' ' || LPAD(TO_CHAR(x), 20) || ' ' || TO_CHAR(x, 'xxxxxxxxxxxxxxxx'));
    END LOOP;

    DBMS_OUTPUT.put_line('--- long jump ---');
    xoshiro256.long_jump;

    FOR i IN 0 .. 9
    LOOP
        x := xoshiro256.get_next;
        DBMS_OUTPUT.put_line(TO_CHAR(i, '9') || ' ' || LPAD(TO_CHAR(x), 20) || ' ' || TO_CHAR(x, 'xxxxxxxxxxxxxxxx'));
    END LOOP;
END;

This produced the following sequences…

0             71107200           43d0280
1             71107200           43d0280
2        9320162918400       87a05000000
3        9320234025600       87a093d0280
4 12773345438245847175  b1440a0000000087
5 12768617581213858983  b1333e0a010f3ca7
6  8945543092777141728  7c24f40c1fee45e0
7   470016407425146078   685d56eab2e5cde
8  3493524090943047400  307b7da6b6ee06e8
9  5886979323815290452  51b2c08124146654
--- jump ---
0  4581861990845984958  3f960b34763d30be
1  6555207914207083891  5af8c6ddf3247d73
2  1155739510168040853  100a02f60c733595
3 17756807916997691290  f66cdab88d7dc79a
4 18120637618276044033  fb796fe814ec8d01
5  3601654045701872973  31fba545ade89d4d
6  1200605674697995402  10a9688003b1348a
7  2838098080646629052  2762f32adea906bc
8 13449286687953663012  baa576fc3dde3424
9  5139522055796585030  47534014530a7246
--- long jump ---
0   308148744041885595   446c3a66a996f9b
1  1534996402541018053  154d671868f5bbc5
2  2575119349696104759  23bca967acdf0937
3  1204256562158620046  10b660f67de5cd8e
4  2485655289488877646  227ed250f40e284e
5  2705313921121906326  258b34ad8a63ee96
6 14047681829789752210  c2f364324653db92
7  8712759958851592880  78e9f1053898a2b0
8 16323705335360483007  e28973840c4f6ebf
9  2581999948191718881  23d51b45da0b5de1

This was a fun little exercise. Eventually I’d like to expand on it to implement other variations in the xoshiro/xoroshiro suite as well as include support for splitmix64 as a way to help start with better seed values.

Questions and comments, as always, are welcome.

Leave a Reply