Issue
I'm trying to write code that divides a uint64_t by another uint64_t and rounds the result to the nearest integer. The code should be as fast as possible and work for all inputs (e.g. I would prefer it not to have any conditionals).
My current solution looks like this:
static inline uint64_t divide_with_rounding(uint64_t n, uint64_t d)
{
    uint64_t a = n / d;
    uint64_t r = n % d;
    return a + (r >= d - (d / 2));
}
gcc optimizes the division + modulo quite nicely, and the / 2 as well. But I wonder if there's a shorter and nicer solution.
E.g. something like this:
static inline uint64_t divide_with_rounding(uint64_t n, uint64_t d)
{
    return (n + d / 2) / d;
}
However, that one has the disadvantage that n + d / 2 can wrap around for large n: divide_with_rounding(UINT64_MAX, 1000) produces 0.
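To make the wrap-around concrete, here is a minimal sketch that puts the two variants from the question side by side. The names round_div_rem and round_div_naive are hypothetical, chosen only for this illustration, and d is assumed to be nonzero.

```c
#include <stdint.h>

/* Remainder-based rounding, as in the question's current solution.
   No intermediate value can exceed n, so nothing overflows (d != 0). */
static inline uint64_t round_div_rem(uint64_t n, uint64_t d)
{
    return n / d + (n % d >= d - d / 2);
}

/* Naive form: n + d / 2 is computed modulo 2^64, so it wraps
   around whenever n > UINT64_MAX - d / 2. */
static inline uint64_t round_div_naive(uint64_t n, uint64_t d)
{
    return (n + d / 2) / d;
}
```

For UINT64_MAX and 1000, the sum wraps to 499, so the naive form returns 0, while the remainder-based form returns 18446744073709552.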
Solution
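Mathematically, round(n/d) = ⌊(n + d/2)/d⌋. From the property of the floor function ⌊x + m⌋ = ⌊x⌋ + m for any integer m, we get ⌊(n + d/2)/d⌋ = ⌊(n − d/2)/d + 1⌋ = ⌊(n − d/2)/d⌋ + 1.
In case d is even, d/2 is exact in integer arithmetic and this identity can be used directly. In case d is odd, d = 2k + 1, C's d/2 truncates to k, so subtracting k would round a remainder of exactly k up (e.g. n = 1, d = 3 would give 1 instead of 0). Subtracting d - d/2 instead (that is k + 1 for odd d, and still d/2 for even d) gives the right result in both cases, because (n - (d - d/2))/d + 1 equals (n + d/2)/d in integer arithmetic. Therefore you can just use
uint64_t h = d - d/2;
if (n >= h)
    return (n - h)/d + 1;
else
    return (n + d/2)/d;
This avoids the situation where n + d/2 overflows: the addition is only performed when n < d - d/2, so the sum is always less than d.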
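The branch-based approach can be sketched as a complete function and checked against the remainder-based code from the question. This is a sketch under assumptions: the names round_div_branch and count_mismatches are hypothetical, d must be nonzero, and the amount subtracted is d - d/2 (the half rounded up) so that the truncation of d/2 for odd d does not cause a remainder just below one half to be rounded up.

```c
#include <stdint.h>

/* Rounded division that only adds d / 2 when the sum cannot overflow. */
static inline uint64_t round_div_branch(uint64_t n, uint64_t d)
{
    uint64_t up = d - d / 2;          /* d / 2 rounded up */
    if (n >= up)
        return (n - up) / d + 1;      /* same value as (n + d/2) / d, no overflow */
    return (n + d / 2) / d;           /* here n + d/2 < d, so no overflow */
}

/* Counts disagreements with the remainder-based form
   for every n in [0, limit] and d in [1, limit]. */
static unsigned count_mismatches(uint64_t limit)
{
    unsigned bad = 0;
    for (uint64_t d = 1; d <= limit; d++)
        for (uint64_t n = 0; n <= limit; n++)
            bad += round_div_branch(n, d) != n / d + (n % d >= d - d / 2);
    return bad;
}
```

An exhaustive comparison over a small range, such as count_mismatches(200), is a cheap way to catch off-by-one errors around odd divisors before trusting the function on large inputs.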
However, if d is not a compile-time constant, it might be faster to do a 128-by-64-bit division when the branch misprediction cost is high. In MSVC (x64 only, with the _udiv128 intrinsic declared in <immintrin.h>) you can do it like this:
uint64_t nH = 0, nL = n, rem = 0;
nL += d/2;
nH += nL < n; // { nH, nL } += d/2
return _udiv128(nH, nL, d, &rem); // { nH, nL } / d
and in compilers with an __int128 type, like GCC, ICC, and Clang, just use it directly:
__int128 N = n;
N += d/2;
return N/d;
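Wrapped into a complete function, the __int128 variant might look like the sketch below. The function name round_div_u128 is hypothetical, the unsigned 128-bit type is used here instead of the signed one (an assumption made for clarity; both hold the sum exactly), and the __SIZEOF_INT128__ guard assumes a GCC- or Clang-compatible compiler on a 64-bit target. As before, d must be nonzero.

```c
#include <stdint.h>

#ifdef __SIZEOF_INT128__   /* defined by GCC/Clang when __int128 is available */
/* Rounded division done in 128 bits, so n + d / 2 cannot wrap. */
static inline uint64_t round_div_u128(uint64_t n, uint64_t d)
{
    unsigned __int128 wide = n;   /* widen before adding */
    wide += d / 2;                /* at most 2^64 + 2^63, fits easily */
    return (uint64_t)(wide / d);  /* the quotient always fits in 64 bits */
}
#endif
```

The quotient fits in 64 bits because for d = 1 nothing is added, and for d >= 2 the sum is divided by at least 2. Whether this beats the branch or the remainder-based version depends on how the compiler lowers the 128-bit division, so it is worth measuring on the target.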
Answered By - phuclv Answer Checked By - Robin (WPSolving Admin)