digitalmars.D - Modulo Bug?

David (4/4) Aug 11 2012 -1 % 16 = -1

Peter Alexander (13/18) Aug 11 2012 From the language spec:

David (3/20) Aug 11 2012 Thanks! I thought modulo should alawys yield the same ... seems like I

bearophile (4/6) Aug 11 2012 It's C design that's wrong.

Norbert Nemec (8/12) Aug 11 2012 And it's the processor design that makes it inefficient to correct it

bearophile (8/15) Aug 11 2012 I see. Thank you for the explanations. So the situation isn't
Thiez (7/22) Aug 11 2012 A few extra instructions (a CMOV followed by ADD should suffice,

bearophile (5/9) Aug 11 2012 Maybe because C99 requires that kind of modulus, and D tries to
jerro (12/17) Aug 11 2012 The price would really be quite insignificant since IDIV takes

David (8/18) Aug 12 2012 My implementation of py_mod:

Thiez (6/14) Aug 12 2012 It seems to work when b is a power of 2, but not for other

=?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= (7/29) Aug 12 2012 Keep in mind that not all CPUs have cmov, so you end up having to branch...

Thiez (22/59) Aug 12 2012 I think the following trick would work even on the 80386 (386+

monarch_dodra (8/14) Sep 07 2012 According to TDPL regarding integral math: "Anything that

bearophile (6/7) Sep 07 2012 Or some people have fulfilled this need since years (D1) writing

Walter Bright (2/7) Sep 07 2012 Because then C code translated to D will have subtle, silent and unexpec...

bearophile (11/13) Sep 07 2012 I agree. So the solution is to introduce a new Phobos function or

David Nadlinger (3/8) Aug 11 2012 http://en.wikipedia.org/wiki/Modulo_operation
bearophile (8/13) Aug 11 2012 It's a localized but important design bug of languages like C

Caligo (4/10) Aug 11 2012 I've asked for this before:
bearophile (36/40) Aug 12 2012 Better to show an example, reduced from a larger program. This

Steven Schveighoffer (15/19) Sep 07 2012 In case you are still looking for an expression which mods always positi...

David <d dav1d.de> writes:

-1 % 16 = -1

Shouldn't that be 15? It seems like the sign is ignored for the modulo.

Is this a bug or intended behaviour? The Python implementation returns 
here, as expected, 15.

Aug 11 2012

"Peter Alexander" <peter.alexander.au gmail.com> writes:

On Saturday, 11 August 2012 at 13:48:16 UTC, David wrote:
 -1 % 16 = -1

 Shouldn't that be 15? It seems like the sign is ignored for the 
 modulo.

 Is this a bug or intended behaviour? The Python implementation 
 returns here, as expected, 15.

 From the language spec:

"For integral operands of the / and % operators, the quotient 
rounds towards zero and the remainder has the same sign as the 
dividend."

http://dlang.org/expression.html

In your case, the dividend is -1, so the remainder has the same 
sign (-ve). The quotient rounds towards zero, so in this case the 
quotient is zero, so the remainder must be -1.

Different programming languages handle it differently. In C and 
C++ it is implementation defined! (C++11 makes it the same as in 
D)

See: http://en.wikipedia.org/wiki/Modulo_operation

Aug 11 2012

David <d dav1d.de> writes:

Am 11.08.2012 16:00, schrieb Peter Alexander:
 On Saturday, 11 August 2012 at 13:48:16 UTC, David wrote:
 -1 % 16 = -1

 Shouldn't that be 15? It seems like the sign is ignored for the modulo.

 Is this a bug or intended behaviour? The Python implementation returns
 here, as expected, 15.

  From the language spec:

 "For integral operands of the / and % operators, the quotient rounds
 towards zero and the remainder has the same sign as the dividend."

 http://dlang.org/expression.html

 In your case, the dividend is -1, so the remainder has the same sign
 (-ve). The quotient rounds towards zero, so in this case the quotient is
 zero, so the remainder must be -1.

 Different programming languages handle it differently. In C and C++ it
 is implementation defined! (C++11 makes it the same as in D)

 See: http://en.wikipedia.org/wiki/Modulo_operation


Thanks! I thought modulo should alawys yield the same ... seems like I 
was wrong ;)

Aug 11 2012

"bearophile" <bearophileHUGS lycos.com> writes:

David:

 Thanks! I thought modulo should alawys yield the same ... seems 
 like I was wrong ;)

It's C design that's wrong.

Bye,
bearophile

Aug 11 2012

Norbert Nemec <Norbert Nemec-online.de> writes:

On 11.08.2012 18:13, bearophile wrote:
 David:

 Thanks! I thought modulo should alawys yield the same ... seems like I
 was wrong ;)

 It's C design that's wrong.

And it's the processor design that makes it inefficient to correct it 
nowadays.

Python's definition of modulo is far more useful than C's. Implemented 
in machine code, however, it takes several additional commands because 
the integer division is hardwired to the C definition. I guess that 
hardware integer division in processors became popular only when C was 
already widely in use.

Aug 11 2012

"bearophile" <bearophileHUGS lycos.com> writes:

Norbert Nemec:

 And it's the processor design that makes it inefficient to 
 correct it nowadays.

 Python's definition of modulo is far more useful than C's. 
 Implemented in machine code, however, it takes several 
 additional commands because the integer division is hardwired 
 to the C definition. I guess that hardware integer division in 
 processors became popular only when C was already widely in use.

I see. Thank you for the explanations. So the situation isn't 
going to change soon.

Ada language offers both kind of modulus. I think std.math of 
Phobos should offer a mod() function that acts like the Python 
modulus.

Bye,
bearophile

Aug 11 2012

"Thiez" <thiezz gmail.com> writes:

On Saturday, 11 August 2012 at 17:15:21 UTC, Norbert Nemec wrote:
 On 11.08.2012 18:13, bearophile wrote:
 David:

 Thanks! I thought modulo should alawys yield the same ... 
 seems like I
 was wrong ;)

 It's C design that's wrong.

 And it's the processor design that makes it inefficient to 
 correct it nowadays.

 Python's definition of modulo is far more useful than C's. 
 Implemented in machine code, however, it takes several 
 additional commands because the integer division is hardwired 
 to the C definition. I guess that hardware integer division in 
 processors became popular only when C was already widely in use.

A few extra instructions (a CMOV followed by ADD should suffice, 
yes?) seems like a small price to pay if it can prevent bugs. Why 
hasn't the Python-modulo been made the default back when D was 
designed? The ever-so-slightly more efficient C-modulo could be 
provided in a library. Of course it's way too late to change it 
now...

Aug 11 2012

"bearophile" <bearophileHUGS lycos.com> writes:

Thiez:

 A few extra instructions (a CMOV followed by ADD should 
 suffice, yes?) seems like a small price to pay if it can 
 prevent bugs. Why hasn't the Python-modulo been made the 
 default back when D was designed?

Maybe because C99 requires that kind of modulus, and D tries to 
act as C99 where possible. It's sad.

Bye,
bearophile

Aug 11 2012

"jerro" <a a.com> writes:

 A few extra instructions (a CMOV followed by ADD should 
 suffice, yes?) seems like a small price to pay if it can 
 prevent bugs.

The price would really be quite insignificant since IDIV takes 
tens of cycles and the additional work needed to make module 
behave intuitively would be just a few cycles. Besides, modulo of 
unsigned integers would still work the same, so you could just 
use that when you cared about the difference in performance. So 
the only situation where you couldn't avoid performance penalty 
would be when you needed to calculate modulo of signed integers 
and you actually wanted the current behavior. I think that is a 
pretty rare situtation.

 Why hasn't the Python-modulo been made the default back when D 
 was designed?

Probably because this is one of the design goals of D: Where D 
code looks the same as C code, have it either behave the same or 
issue an error.

Aug 11 2012

David <d dav1d.de> writes:

 Besides, modulo of unsigned integers would
 still work the same, so you could just use that when you cared about the
 difference in performance. So the only situation where you couldn't
 avoid performance penalty would be when you needed to calculate modulo
 of signed integers and you actually wanted the current behavior. I think
 that is a pretty rare situtation.

My implementation of py_mod:

int py_mod(int a, int b) {
     return cast(uint)a % b;
}

So a simple cast from signed to unsigned does exactly what I need 
(Prf_Jakob showed me that in IRC)

How much does a cast cost? I mean is that even an extra asm instruction?

 Why hasn't the Python-modulo been made the default back when D was
 designed?

 Probably because this is one of the design goals of D: Where D code
 looks the same as C code, have it either behave the same or issue an error.


Unfortunatly.

Aug 12 2012

"Thiez" <thiezz gmail.com> writes:

On Sunday, 12 August 2012 at 13:11:36 UTC, David wrote:
 My implementation of py_mod:

 int py_mod(int a, int b) {
     return cast(uint)a % b;
 }

 So a simple cast from signed to unsigned does exactly what I 
 need (Prf_Jakob showed me that in IRC)

It seems to work when b is a power of 2, but not for other 
numbers. Example: -3 % 7 = 4, but cast(uint)-3 = 4294967293, and 
4294967293 % 7 = 1.

 How much does a cast cost? I mean is that even an extra asm 
 instruction?

Casting ints to uints and back is free because the bits can 
remain the same (only the interpretation changes).

Aug 12 2012

=?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:

On 11-08-2012 20:54, Thiez wrote:
 On Saturday, 11 August 2012 at 17:15:21 UTC, Norbert Nemec wrote:
 On 11.08.2012 18:13, bearophile wrote:
 David:

 Thanks! I thought modulo should alawys yield the same ... seems like I
 was wrong ;)

 It's C design that's wrong.

 And it's the processor design that makes it inefficient to correct it
 nowadays.

 Python's definition of modulo is far more useful than C's. Implemented
 in machine code, however, it takes several additional commands because
 the integer division is hardwired to the C definition. I guess that
 hardware integer division in processors became popular only when C was
 already widely in use.

 A few extra instructions (a CMOV followed by ADD should suffice, yes?)
 seems like a small price to pay if it can prevent bugs. Why hasn't the
 Python-modulo been made the default back when D was designed? The
 ever-so-slightly more efficient C-modulo could be provided in a library.
 Of course it's way too late to change it now...

Keep in mind that not all CPUs have cmov, so you end up having to branch 
based on whether the CPU has it or not.

-- 
Alex Rønne Petersen
alex lycus.org
http://lycus.org

Aug 12 2012

"Thiez" <thiezz gmail.com> writes:

On Sunday, 12 August 2012 at 08:43:53 UTC, Alex Rønne Petersen 
wrote:
 On 11-08-2012 20:54, Thiez wrote:
 On Saturday, 11 August 2012 at 17:15:21 UTC, Norbert Nemec 
 wrote:
 On 11.08.2012 18:13, bearophile wrote:
 David:

 Thanks! I thought modulo should alawys yield the same ... 
 seems like I
 was wrong ;)

 It's C design that's wrong.

 And it's the processor design that makes it inefficient to 
 correct it
 nowadays.

 Python's definition of modulo is far more useful than C's. 
 Implemented
 in machine code, however, it takes several additional 
 commands because
 the integer division is hardwired to the C definition. I 
 guess that
 hardware integer division in processors became popular only 
 when C was
 already widely in use.

 A few extra instructions (a CMOV followed by ADD should 
 suffice, yes?)
 seems like a small price to pay if it can prevent bugs. Why 
 hasn't the
 Python-modulo been made the default back when D was designed? 
 The
 ever-so-slightly more efficient C-modulo could be provided in 
 a library.
 Of course it's way too late to change it now...

 Keep in mind that not all CPUs have cmov, so you end up having 
 to branch based on whether the CPU has it or not.

I think the following trick would work even on the 80386 (386+ 
should have the SETcc instructions):
assume divident in AEX, divisor in EBX or ECX (not EDX, 
obviously). I'll assume it's in EBX.
  CDQ // sign extend EAX to EDX:EAX
  IDIV EBX // signed division
  XOR EAX,EAX // clear EAX
  BT EDX,31 // check if sign-bit of remainder is set
  SETNC AL // AL = (EDX negative ? 0 : 1)
  DEC AEX // EAX = (EDX negative ? 0xFFFF : 0)
  AND EAX,EBX // EAX = (EDX negative ? EBX : 0)
  ADD EDX,EAX // EAX = the correct modulo result

Obviously it's less appealing than a CMOV, but it should work (in 
theory, I didn't test it...) and perhaps someone smarter could 
make it even shorter. Apart form the IDIV, all instructions 
should take only a single clock cycle on most x86s as far as I'm 
aware.

As for other CPU architectures, perhaps those actually have a 
'sane' modulo operator, which would make the hypothetical 
presence or absence of a CMOV irrelevant :)

Aug 12 2012

"monarch_dodra" <monarchdodra gmail.com> writes:

On Saturday, 11 August 2012 at 18:54:09 UTC, Thiez wrote:
 A few extra instructions (a CMOV followed by ADD should 
 suffice, yes?) seems like a small price to pay if it can 
 prevent bugs. Why hasn't the Python-modulo been made the 
 default back when D was designed? The ever-so-slightly more 
 efficient C-modulo could be provided in a library. Of course 
 it's way too late to change it now...

According to TDPL regarding integral math: "Anything that 
compiles in both C and D WILL produce the same results".

Anything less would make porting code a nightmare.
=>It is absolutely not conceivable to change "%" semantics'.

However, D *could* provide the "Mathematician's Modulo" operator, 
either as a library functionality, or as a new operator.

I guess there hasn't been "a strong need" yet.

Sep 07 2012

"bearophile" <bearophileHUGS lycos.com> writes:

monarch_dodra:

 I guess there hasn't been "a strong need" yet.

Or some people have fulfilled this need since years (D1) writing
their own function.

Regarding an operator, %% ? :-)

Bye,
bearophile

Sep 07 2012

Walter Bright <newshound2 digitalmars.com> writes:

On 8/11/2012 11:54 AM, Thiez wrote:
 A few extra instructions (a CMOV followed by ADD should suffice, yes?) seems
 like a small price to pay if it can prevent bugs. Why hasn't the Python-modulo
 been made the default back when D was designed? The ever-so-slightly more
 efficient C-modulo could be provided in a library. Of course it's way too late
 to change it now...


Because then C code translated to D will have subtle, silent and unexpected
bugs.

Sep 07 2012

"bearophile" <bearophileHUGS lycos.com> writes:

Walter Bright:

 Because then C code translated to D will have subtle, silent 
 and unexpected bugs.

I agree. So the solution is to introduce a new Phobos function or 
new built-in operator that works more correctly with negative 
numbers too.

A new Phobos function named mod() is a very simple solution.

While a new infix operator has the advantage of being more 
natural to use, this allows programmers to better remember to use 
it instead of the C %, so the maybe a built-in operator avoids 
some bugs compared to the function.

Bye,
bearophile

Sep 07 2012

"David Nadlinger" <see klickverbot.at> writes:

On Saturday, 11 August 2012 at 13:48:16 UTC, David wrote:
 -1 % 16 = -1

 Shouldn't that be 15? It seems like the sign is ignored for the 
 modulo.

 Is this a bug or intended behaviour? The Python implementation 
 returns here, as expected, 15.

http://en.wikipedia.org/wiki/Modulo_operation

David

Aug 11 2012

"bearophile" <bearophileHUGS lycos.com> writes:

David:

 -1 % 16 = -1

 Shouldn't that be 15? It seems like the sign is ignored for the 
 modulo.

 Is this a bug or intended behaviour? The Python implementation 
 returns here, as expected, 15.

It's a localized but important design bug of languages like C 
that D has carried over for backward compatibility reasons. D % 
has caused bugs in my code. I suggest to not use the built-in % 
on numbers that can be negative, unless you really know what you 
are doing.

Bye,
bearophile

Aug 11 2012

Caligo <iteronvexor gmail.com> writes:

I've asked for this before:
http://www.digitalmars.com/d/archives/digitalmars/D/Could_we_have_mod_in_std.math_152977.html


On Sat, Aug 11, 2012 at 11:12 AM, bearophile <bearophileHUGS lycos.com>
wrote:
 It's a localized but important design bug of languages like C that D has
 carried over for backward compatibility reasons. D % has caused bugs in my
 code. I suggest to not use the built-in % on numbers that can be negative,
 unless you really know what you are doing.

 Bye,
 bearophile

Aug 11 2012

"bearophile" <bearophileHUGS lycos.com> writes:

 It's a localized but important design bug of languages like C 
 that D has carried over for backward compatibility reasons. D % 
 has caused bugs in my code. I suggest to not use the built-in % 
 on numbers that can be negative,

Better to show an example, reduced from a larger program. This
little function has to look for the next true value in a circular
array 'data', before or after a given starting index, according
to a int 'movement' variable that is allowed to be only +1 or -1:


This is Python code, it correctly performs the wrap-around in all
cases:


def find_next(data, index, movement):
      assert movement == 1 or movement == -1
      while True:
          index = (index + movement) % len(data)
          if data[index]:
              break
      return index


data = [False, False, False, False, True, False]




The D code doesn't work, because of the C %:


size_t findNext(in bool[] data, size_t index, in int movement)
pure nothrow in {
      assert(movement == 1 || movement == -1);
} body {
      do {
          index = (index + movement) % data.length;
      } while (!data[index]);
      return index;
}

void main() {
      import std.stdio;
      //              0      1      2      3      4     5
      const data = [false, false, false, false, true, false];

      writeln(findNext(data, 5, -1)); // OK, prints 4
      writeln(findNext(data, 0, -1)); // not OK
}


Bye,
bearophile

Aug 12 2012

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sat, 11 Aug 2012 09:48:15 -0400, David <d dav1d.de> wrote:

 -1 % 16 = -1

 Shouldn't that be 15? It seems like the sign is ignored for the modulo.

 Is this a bug or intended behaviour? The Python implementation returns  
 here, as expected, 15.

In case you are still looking for an expression which mods always positive  
(even for non-powers of 2)

(a % b + b) % b

Obviously, b must be positive.

The direct translation of this to assembly isn't the most efficient way.

Technically, in assembly we could check for the sign bit, and only add b  
again (without doing the second mod) if the value was negative.  But I'm  
not sure you can write an expression that causes that, maybe:

a % b < 0 ? a % b + b : a % b

Certainly not as appealing.  To get a direct translation from the above  
assembly, it would be something like:

typeof(a) x;

auto expr = (x = a % b) < 0 ? x + b : x

-Steve

Sep 07 2012

D Programming

C/C++ Programming

Other

digitalmars.D - Modulo Bug?