Saturday, April 2, 2022

[SOLVED] C++ Fastest numerical string to long parsing

Issue

Here is what I came up with.

  • len is guaranteed to have meaningful value (positive and true size of the char array)
  • s is long unsigned number as a string without null-termination (received from 3rd party lib) typically 11-12 symbols e.g. "123456789000"
  • running on x86 linux

I am not C++ dev, could you help make it faster?

    inline uint64_t strtol(char* s, int len)
     {
         uint64_t val = 0;
         for (int i = 0; i < len; i++)
         {
             char c = *(s + i) - '0';
             val = val * 10 + c;
         }
         return val;
     };

Solution

You might want to have a look at loop unrolling. When the body of a loop is short enough, checking the loop condition every iteration might be relatively expensive.

A specific and interesting way of implementing loop unrolling is called Duff's device: https://en.wikipedia.org/wiki/Duff%27s_device

Here's the version for your function:

inline uint64_t strtol_duff(char* s, int len)
{
    uint64_t val = 0;
    int n = (len + 7) / 8;
    int i = 0;
    switch (len % 8) {
    case 0: do {
                 val = val * 10 + (*(s + i++) - '0');
    case 7:      val = val * 10 + (*(s + i++) - '0');
    case 6:      val = val * 10 + (*(s + i++) - '0');
    case 5:      val = val * 10 + (*(s + i++) - '0');
    case 4:      val = val * 10 + (*(s + i++) - '0');
    case 3:      val = val * 10 + (*(s + i++) - '0');
    case 2:      val = val * 10 + (*(s + i++) - '0');
    case 1:      val = val * 10 + (*(s + i++) - '0');
    } while (--n > 0);
    }
    return val;
};

To be honest, in your case I believe you will not see a huge benefit because the loop's body is not that tiny. It's all very much system dependent and requires experimentation (like most optimizations). Good compiler optimizers might unroll the loop automatically if it is actually beneficial.

But it's worth to try.



Answered By - wohlstad
Answer Checked By - David Marino (WPSolving Volunteer)