CHICKEN's numeric tower: part 4

In this instalment of the blog series, we'll take a look at how string->number and number->string are implemented for bignums.

Performing base conversions

Performing calculations is all nice and useful, but you eventually want to print the results back to the user. And of course the user needs a way to enter such large numbers into the system. So, converting between numbers and strings is an essential operation. The Scheme standard requires support for conversion between bases 2, 8, 10 and 16, but many practical implementations support conversion between arbitrary bases, and why not? It doesn't really require more effort.

Converting strings to numbers

Let's start with the simpler of the two directions: string->number. The naive way of converting a string in base n to a number is to scan the string from left to right (high to low), adding the digit to the result and multiplying the result by n:

/* "result" is a pre-allocated, zeroed out bignum of the right size */
while (*str != '0') /* Assuming NUL-terminated string */
{
  int digit = hex_char_to_digit((int)*str++);
  bignum_destructive_scale_up(result, radix);
  bignum_destructive_add(result, digit);
}

This is very simple and elegant, but also quite slow. The Scheme48 code also checks for invalid characters in this loop, while in CHICKEN, the reader performs this check. So, when converting a digit stream to a bignum, we already know the digit stream contains only valid digits.

A simple improvement can be made to this algorithm: we can avoid traversing the entire bignum for every string digit. Instead, we collect multiple digits in a register until we fill up a halfword. Then, we scale up the bignum, adding the collected halfword:

do {
  C_word big_digit = 0;  /* Collected digit */
  C_word factor = radix; /* Multiplication factor for bignum */

  /* Keep collecting from str while factor fits a half-digit */
  while (str < str_end && C_fitsinbignumhalfdigitp(factor)) {
    str_digit = hex_char_to_digit((int)*str++);
    factor *= radix;
    big_digit = radix * big_digit + str_digit;
  }

  /* Scaling up with carry avoids traversing bignum twice */
  big_digit = bignum_digits_destructive_scale_up_with_carry(
                digits, last_digit, factor / radix, big_digit);

  if (big_digit) { /* If there was a carry, increment size of bignum */
    (*last_digit++) = big_digit;
  }
} while (str < str_end);

Remember the previous post in this series? Now you know where the design behind bignum_digits_destructive_scale_up_with_carry comes from: we can scale up the bignum with an initial "carry" value that's our collected digit. The return value is the resulting carry (if any), so we know when to increase the bignum's size by moving the last_digit pointer. This pointer makes it easier to detect the final length of the bignum, which can be hard to predict precisely. We can't predict the exact size, but we can calculate the maximum size. Because we allocate this maximum size, this moving of the end pointer is safe.

If the string's base is a power of two, we can perform an even better optimisation: We don't need to multiply or add to the bignum, we can just write straight to the bignum's digits!

int radix_shift = C_ilen(radix) - 1;  /* Integer length (the power of two) */
C_uword big_digit = 0;       /* The current bignum digit being constructed */
int n = 0;                /* The number of bits read so far into big_digit */

/* Read from least to most significant digit.  This is much easier! */
while (str_end > str_start) {
  str_digit = hex_char_to_digit((int)*--str_end);

  big_digit |= (C_uword)str_digit << n;
  n += radix_shift;                             /* Processed n bits so far */

  if (n >= C_BIGNUM_DIGIT_LENGTH) {             /* Filled up the digit? */
    n -= C_BIGNUM_DIGIT_LENGTH;                 /* Number of remainder bits */
    *digits++ = big_digit;
    big_digit = str_digit >> (radix_shift - n); /* Keep only the remainder */
  }
}
/* If radix isn't an exact divisor of digit length, write final remainder */
if (n > 0) *digits++ = big_digit;

From my benchmarks it looks like CHICKEN's string->number implementation is among the fastest of the popular Scheme implementations for power of two bases, due to this bit banging loop.

Converting numbers to strings

The naive way of converting a number to a string in base n is to do the opposite of converting a string in base n to a number: we repeatedly divide by the target base and prepend this number to the string.

char *characters = "0123456789abcdef";

/* This fills the "buf" array *back to front*, so index starts at the
 * end of "buf".  counter represents # of characters written.
 */
do {
  digit = bignum_destructive_scale_down(working_copy, radix);
  *index-- = characters[digit];

  /* If we reached the current string's length, reallocate in
   * increments of BIGNUM_STR_BLOCK_SIZE.
   */
  if (++counter == len) {
   char *newbuf = C_malloc(len + BIGNUM_STR_BLOCK_SIZE);
   if (newbuf == NULL) return ERR;

   C_memcpy(newbuf + BIGNUM_STR_BLOCK_SIZE, buf, len);
   C_free(buf);
   buf = newbuf;
   index = newbuf + BIGNUM_STR_BLOCK_SIZE - 1;
   len += BIGNUM_STR_BLOCK_SIZE;
} while(bignum_length(working_copy) > 0);

This is the original version as provided by Scheme48. Again, it's very short and clean. It operates on a copy of the bignum, which it destructively scales down by dividing it by radix. The remainder digit is written to the string after conversion to a character. Many implementations use this algorithm, but it can be improved pretty easily, in basically the same way we improved the reverse operation. Instead of dividing by the radix on ever loop iteration, you can chop off a big lump. This is the remainder of dividing by a large number. Then, you divide this remainder in a loop while emitting string digits until you hit zero, then repeat until the bignum is zero.

Another improvement over the Scheme48 code is that you can pre-calculate a (pessimistic) upper bound on the number of digits, so you can avoid the reallocation (which implies a memory copy). For powers of two, this can be done precisely. For other radixes you can shorten the buffer only once at the end, and only if it turns out to be necessary.

This is really very simple, except for the finishing up part, where we shorten the buffer:

int steps;
C_uword base;         /* This is the "lump" we cut off every time (divisor) */
C_uword *scan = start + C_bignum_size(bignum); /* Start scanning at the end */

/* Calculate the largest power of radix (string base) that fits a halfdigit.
 * If radix is 10, steps = log10(2^halfdigit_bits), base = 10^steps
 */
for(steps = 0, base = radix; C_fitsinbignumhalfdigitp(base); base *= radix)
  steps++;

base /= radix; /* Back down: we always overshoot in the loop by one step */

while (scan > start) {
  /* Divide by base. This chops "steps" string digits off of the bignum */
  big_digit = bignum_digits_destructive_scale_down(start, scan, base);

  if (*(scan-1) == 0) scan--; /* Adjust if we exhausted the highest digit */

  for(i = 0; i < steps && index >= buf; ++i) {      /* Emit string digits */
    C_uword tmp = big_digit / radix;
    *index-- = characters[big_digit - (tmp*radix)]; /* big_digit % radix */
    big_digit = tmp;
  }
}

/* Move index onto first nonzero digit.  We're writing a bignum
   here: it can't consist of only zeroes. */
while(*++index == '0');

if (negp) *--index = '-';

/* Shorten with distance between start and index. */
if (buf != index) {
  i = C_header_size(string) - (index - buf);
  C_memmove(buf, index, i); /* Move start of number to beginning. */
  C_block_header(string) = C_STRING_TYPE | i; /* Mutate strlength. */
}

Finally, if the radix is a power of two, we can do a straight bit-to-bit extraction like we did with string->number:

int radix_shift = C_ilen(radix) - 1;    /* Integer length (the power of two) */
int radix_mask = radix - 1;              /* Bitmask of N-1 ones (radix = 2ᴺ) */

/* Again, we go from least significant to most significant digit */
while (scan < end) {
  C_uword big_digit = *scan++;
  int big_digit_len = C_BIGNUM_DIGIT_LENGTH;

  while(big_digit_len > 0 && index >= buf) {
    int radix_digit = big_digit & radix_mask;    /* Extract one string digit */
    *index-- = characters[radix_digit]; /* Write it (as character) to string */
    big_digit >>= radix_shift;                            /* Drop this digit */
    big_digit_len -= radix_shift;
  }
}

Unfortunately, there are some caveats that make this slightly trickier than you would guess. The above code is simplified, and only works for some radixes. If your radix is 2ⁿ, and your base digit size is 2ᵐ bits, then this code works if m is a multiple of n. Otherwise, you'll need to take care of overlaps. Octal numbers are the biggest problem here, because they're 3 bits per string digit and the bignum digit sizes of 32 or 64 bits don't divide cleanly by 3.

Getting this right complicates the algorithm enough to make it slightly too hairy to present here (there's some more shifting and if checks involved). If you're interested how to handle this, you can always study the CHICKEN sources.

Divide and conquer

Because of the many divisions, number->string is much slower than string->number. Luckily, we can speed up the former by relying once again on a recursive divide and conquer style algorithm.

This requires you to know the string's expected size in the target base, and will divide the number by half that. For example, if you wish to convert the number 12345678 to a decimal string, you can decide to split it in two. If you had a perfect guess of the string's length (which is 8), you can split the expected string in two halves by dividing the number by 10⁴, or 10000, giving us the quotient 1234 and the remainder 5678. These can recursively be converted to a string and finally appended together. Note that if you have a pessimistic upper limit of the final string length, it'll be slower, but will still produce a correct result. The code for this is quite straightforward:

(define (integer->string/recursive n base expected-string-size)
  (let*-values (((halfsize) (fxshr (fx+ expected-string-size 1) 1))
                ((b^M/2) (integer-power base halfsize))
                ((hi lo) (quotient&remainder n b^M/2))
                ((strhi) (number->string hi base))
                ((strlo) (number->string (abs lo) base)))
    (string-append strhi
                   ;; Fix up any leading zeroes that were stripped from strlo
                   (make-string (fx- halfsize (string-length strlo)) #\0)
                   strlo)))

Because the number 120034, when split in two, generates 120 and 34, we need the make-string call to add back the leading zeroes, otherwise we would get 12034 as the output. This can be omitted if you have a more low-level number->string implementation which doesn't truncate leading zeroes.

While I was researching this, I found out about a technique called the "scaled remainder tree" algorithm. This algorithm is supposedly even faster than the simple recursive algorithm I just showed you. Unfortunately, I was unable to wrap my head around it. Maybe you will have better luck!

Reading list

There doesn't seem to be that much information on how to efficiently perform base conversion, even though there are quite a few clever implementation techniques out there. If you spend the time searching, you'll be sure to find some gems.

The y-cruncher page on radix conversion is full of ideas. Y-cruncher is a program to calculate digits of pi by efficiently using multiple CPU cores. This page has several algorithms, including recursive string splitting and scaled remainder tree. Unfortunately, the program itself is proprietary so you can't study its implementations of these algorithms.
Modern Computer Arithmetic, which has quickly become my favourite bignum textbook, has an interesting alternative string to number technique based on iterative multiplication of the input string. I still want to try that out some day. It also hints at scaled remainder tree for number to string conversion, but doesn't explain how it works.
Division-Free Binary-to-Decimal Conversion by Cyril Bouvier and Paul Zimmermann explains a number to string conversion based on the scaled remainder tree. Unfortunately, this paper only confused me.
It looks like Gambit has a recursive divide-and-conquer implementation of string->number that's slightly different from ours. Their recursive number->string implementation is rather interesting too.

More magic

Cautionary tales from a programmer

About this blog

CHICKEN's numeric tower: part 4 Posted on 2016-10-18

Performing base conversions

Converting strings to numbers

Converting numbers to strings

Divide and conquer

Reading list