www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Modulo Bug?

reply David <d dav1d.de> writes:
-1 % 16 = -1

Shouldn't that be 15? It seems like the sign is ignored for the modulo.

Is this a bug or intended behaviour? The Python implementation returns 
here, as expected, 15.
Aug 11 2012
next sibling parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Saturday, 11 August 2012 at 13:48:16 UTC, David wrote:
 -1 % 16 = -1

 Shouldn't that be 15? It seems like the sign is ignored for the 
 modulo.

 Is this a bug or intended behaviour? The Python implementation 
 returns here, as expected, 15.

From the language spec: "For integral operands of the / and % operators, the quotient rounds towards zero and the remainder has the same sign as the dividend." http://dlang.org/expression.html In your case, the dividend is -1, so the remainder has the same sign (-ve). The quotient rounds towards zero, so in this case the quotient is zero, so the remainder must be -1. Different programming languages handle it differently. In C and C++ it is implementation defined! (C++11 makes it the same as in D) See: http://en.wikipedia.org/wiki/Modulo_operation
Aug 11 2012
parent reply David <d dav1d.de> writes:
Am 11.08.2012 16:00, schrieb Peter Alexander:
 On Saturday, 11 August 2012 at 13:48:16 UTC, David wrote:
 -1 % 16 = -1

 Shouldn't that be 15? It seems like the sign is ignored for the modulo.

 Is this a bug or intended behaviour? The Python implementation returns
 here, as expected, 15.

From the language spec: "For integral operands of the / and % operators, the quotient rounds towards zero and the remainder has the same sign as the dividend." http://dlang.org/expression.html In your case, the dividend is -1, so the remainder has the same sign (-ve). The quotient rounds towards zero, so in this case the quotient is zero, so the remainder must be -1. Different programming languages handle it differently. In C and C++ it is implementation defined! (C++11 makes it the same as in D) See: http://en.wikipedia.org/wiki/Modulo_operation

Thanks! I thought modulo should alawys yield the same ... seems like I was wrong ;)
Aug 11 2012
parent reply Norbert Nemec <Norbert Nemec-online.de> writes:
On 11.08.2012 18:13, bearophile wrote:
 David:

 Thanks! I thought modulo should alawys yield the same ... seems like I
 was wrong ;)

It's C design that's wrong.

And it's the processor design that makes it inefficient to correct it nowadays. Python's definition of modulo is far more useful than C's. Implemented in machine code, however, it takes several additional commands because the integer division is hardwired to the C definition. I guess that hardware integer division in processors became popular only when C was already widely in use.
Aug 11 2012
next sibling parent =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 11-08-2012 20:54, Thiez wrote:
 On Saturday, 11 August 2012 at 17:15:21 UTC, Norbert Nemec wrote:
 On 11.08.2012 18:13, bearophile wrote:
 David:

 Thanks! I thought modulo should alawys yield the same ... seems like I
 was wrong ;)

It's C design that's wrong.

And it's the processor design that makes it inefficient to correct it nowadays. Python's definition of modulo is far more useful than C's. Implemented in machine code, however, it takes several additional commands because the integer division is hardwired to the C definition. I guess that hardware integer division in processors became popular only when C was already widely in use.

A few extra instructions (a CMOV followed by ADD should suffice, yes?) seems like a small price to pay if it can prevent bugs. Why hasn't the Python-modulo been made the default back when D was designed? The ever-so-slightly more efficient C-modulo could be provided in a library. Of course it's way too late to change it now...

Keep in mind that not all CPUs have cmov, so you end up having to branch based on whether the CPU has it or not. -- Alex Rønne Petersen alex lycus.org http://lycus.org
Aug 12 2012
prev sibling next sibling parent David <d dav1d.de> writes:
 Besides, modulo of unsigned integers would
 still work the same, so you could just use that when you cared about the
 difference in performance. So the only situation where you couldn't
 avoid performance penalty would be when you needed to calculate modulo
 of signed integers and you actually wanted the current behavior. I think
 that is a pretty rare situtation.

My implementation of py_mod: int py_mod(int a, int b) { return cast(uint)a % b; } So a simple cast from signed to unsigned does exactly what I need (Prf_Jakob showed me that in IRC) How much does a cast cost? I mean is that even an extra asm instruction?
 Why hasn't the Python-modulo been made the default back when D was
 designed?

Probably because this is one of the design goals of D: Where D code looks the same as C code, have it either behave the same or issue an error.

Unfortunatly.
Aug 12 2012
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 8/11/2012 11:54 AM, Thiez wrote:
 A few extra instructions (a CMOV followed by ADD should suffice, yes?) seems
 like a small price to pay if it can prevent bugs. Why hasn't the Python-modulo
 been made the default back when D was designed? The ever-so-slightly more
 efficient C-modulo could be provided in a library. Of course it's way too late
 to change it now...

Because then C code translated to D will have subtle, silent and unexpected bugs.
Sep 07 2012
prev sibling next sibling parent "David Nadlinger" <see klickverbot.at> writes:
On Saturday, 11 August 2012 at 13:48:16 UTC, David wrote:
 -1 % 16 = -1

 Shouldn't that be 15? It seems like the sign is ignored for the 
 modulo.

 Is this a bug or intended behaviour? The Python implementation 
 returns here, as expected, 15.

http://en.wikipedia.org/wiki/Modulo_operation David
Aug 11 2012
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
David:

 -1 % 16 = -1

 Shouldn't that be 15? It seems like the sign is ignored for the 
 modulo.

 Is this a bug or intended behaviour? The Python implementation 
 returns here, as expected, 15.

It's a localized but important design bug of languages like C that D has carried over for backward compatibility reasons. D % has caused bugs in my code. I suggest to not use the built-in % on numbers that can be negative, unless you really know what you are doing. Bye, bearophile
Aug 11 2012
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
David:

 Thanks! I thought modulo should alawys yield the same ... seems 
 like I was wrong ;)

It's C design that's wrong. Bye, bearophile
Aug 11 2012
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Norbert Nemec:

 And it's the processor design that makes it inefficient to 
 correct it nowadays.

 Python's definition of modulo is far more useful than C's. 
 Implemented in machine code, however, it takes several 
 additional commands because the integer division is hardwired 
 to the C definition. I guess that hardware integer division in 
 processors became popular only when C was already widely in use.

I see. Thank you for the explanations. So the situation isn't going to change soon. Ada language offers both kind of modulus. I think std.math of Phobos should offer a mod() function that acts like the Python modulus. Bye, bearophile
Aug 11 2012
prev sibling next sibling parent "Thiez" <thiezz gmail.com> writes:
On Saturday, 11 August 2012 at 17:15:21 UTC, Norbert Nemec wrote:
 On 11.08.2012 18:13, bearophile wrote:
 David:

 Thanks! I thought modulo should alawys yield the same ... 
 seems like I
 was wrong ;)

It's C design that's wrong.

And it's the processor design that makes it inefficient to correct it nowadays. Python's definition of modulo is far more useful than C's. Implemented in machine code, however, it takes several additional commands because the integer division is hardwired to the C definition. I guess that hardware integer division in processors became popular only when C was already widely in use.

A few extra instructions (a CMOV followed by ADD should suffice, yes?) seems like a small price to pay if it can prevent bugs. Why hasn't the Python-modulo been made the default back when D was designed? The ever-so-slightly more efficient C-modulo could be provided in a library. Of course it's way too late to change it now...
Aug 11 2012
prev sibling next sibling parent Caligo <iteronvexor gmail.com> writes:
--f46d043c06e634a13e04c7023bcb
Content-Type: text/plain; charset=ISO-8859-1

I've asked for this before:
http://www.digitalmars.com/d/archives/digitalmars/D/Could_we_have_mod_in_std.math_152977.html


On Sat, Aug 11, 2012 at 11:12 AM, bearophile <bearophileHUGS lycos.com>
wrote:
 It's a localized but important design bug of languages like C that D has
 carried over for backward compatibility reasons. D % has caused bugs in my
 code. I suggest to not use the built-in % on numbers that can be negative,
 unless you really know what you are doing.

 Bye,
 bearophile

--f46d043c06e634a13e04c7023bcb Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable I&#39;ve asked for this before:=A0=A0 <a href=3D"http://www.digitalmars.com= /d/archives/digitalmars/D/Could_we_have_mod_in_std.math_152977.html">http:/= /www.digitalmars.com/d/archives/digitalmars/D/Could_we_have_mod_in_std.math= _152977.html</a><br> <br><br>On Sat, Aug 11, 2012 at 11:12 AM, bearophile &lt;<a href=3D"mailto:= bearophileHUGS lycos.com">bearophileHUGS lycos.com</a>&gt; wrote:<br>&gt; <= br>&gt; It&#39;s a localized but important design bug of languages like C t= hat D has<br> &gt; carried over for backward compatibility reasons. D % has caused bugs i= n my<br>&gt; code. I suggest to not use the built-in % on numbers that can = be negative,<br>&gt; unless you really know what you are doing.<br>&gt;<br> &gt; Bye,<br>&gt; bearophile<br><br> --f46d043c06e634a13e04c7023bcb--
Aug 11 2012
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Thiez:

 A few extra instructions (a CMOV followed by ADD should 
 suffice, yes?) seems like a small price to pay if it can 
 prevent bugs. Why hasn't the Python-modulo been made the 
 default back when D was designed?

Maybe because C99 requires that kind of modulus, and D tries to act as C99 where possible. It's sad. Bye, bearophile
Aug 11 2012
prev sibling next sibling parent "jerro" <a a.com> writes:
 A few extra instructions (a CMOV followed by ADD should 
 suffice, yes?) seems like a small price to pay if it can 
 prevent bugs.

The price would really be quite insignificant since IDIV takes tens of cycles and the additional work needed to make module behave intuitively would be just a few cycles. Besides, modulo of unsigned integers would still work the same, so you could just use that when you cared about the difference in performance. So the only situation where you couldn't avoid performance penalty would be when you needed to calculate modulo of signed integers and you actually wanted the current behavior. I think that is a pretty rare situtation.
 Why hasn't the Python-modulo been made the default back when D 
 was designed?

Probably because this is one of the design goals of D: Where D code looks the same as C code, have it either behave the same or issue an error.
Aug 11 2012
prev sibling next sibling parent "Thiez" <thiezz gmail.com> writes:
On Sunday, 12 August 2012 at 08:43:53 UTC, Alex Rønne Petersen 
wrote:
 On 11-08-2012 20:54, Thiez wrote:
 On Saturday, 11 August 2012 at 17:15:21 UTC, Norbert Nemec 
 wrote:
 On 11.08.2012 18:13, bearophile wrote:
 David:

 Thanks! I thought modulo should alawys yield the same ... 
 seems like I
 was wrong ;)

It's C design that's wrong.

And it's the processor design that makes it inefficient to correct it nowadays. Python's definition of modulo is far more useful than C's. Implemented in machine code, however, it takes several additional commands because the integer division is hardwired to the C definition. I guess that hardware integer division in processors became popular only when C was already widely in use.

A few extra instructions (a CMOV followed by ADD should suffice, yes?) seems like a small price to pay if it can prevent bugs. Why hasn't the Python-modulo been made the default back when D was designed? The ever-so-slightly more efficient C-modulo could be provided in a library. Of course it's way too late to change it now...

Keep in mind that not all CPUs have cmov, so you end up having to branch based on whether the CPU has it or not.

I think the following trick would work even on the 80386 (386+ should have the SETcc instructions): assume divident in AEX, divisor in EBX or ECX (not EDX, obviously). I'll assume it's in EBX. CDQ // sign extend EAX to EDX:EAX IDIV EBX // signed division XOR EAX,EAX // clear EAX BT EDX,31 // check if sign-bit of remainder is set SETNC AL // AL = (EDX negative ? 0 : 1) DEC AEX // EAX = (EDX negative ? 0xFFFF : 0) AND EAX,EBX // EAX = (EDX negative ? EBX : 0) ADD EDX,EAX // EAX = the correct modulo result Obviously it's less appealing than a CMOV, but it should work (in theory, I didn't test it...) and perhaps someone smarter could make it even shorter. Apart form the IDIV, all instructions should take only a single clock cycle on most x86s as far as I'm aware. As for other CPU architectures, perhaps those actually have a 'sane' modulo operator, which would make the hypothetical presence or absence of a CMOV irrelevant :)
Aug 12 2012
prev sibling next sibling parent "Thiez" <thiezz gmail.com> writes:
On Sunday, 12 August 2012 at 13:11:36 UTC, David wrote:
 My implementation of py_mod:

 int py_mod(int a, int b) {
     return cast(uint)a % b;
 }

 So a simple cast from signed to unsigned does exactly what I 
 need (Prf_Jakob showed me that in IRC)

It seems to work when b is a power of 2, but not for other numbers. Example: -3 % 7 = 4, but cast(uint)-3 = 4294967293, and 4294967293 % 7 = 1.
 How much does a cast cost? I mean is that even an extra asm 
 instruction?

Casting ints to uints and back is free because the bits can remain the same (only the interpretation changes).
Aug 12 2012
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
 It's a localized but important design bug of languages like C 
 that D has carried over for backward compatibility reasons. D % 
 has caused bugs in my code. I suggest to not use the built-in % 
 on numbers that can be negative,

Better to show an example, reduced from a larger program. This little function has to look for the next true value in a circular array 'data', before or after a given starting index, according to a int 'movement' variable that is allowed to be only +1 or -1: This is Python code, it correctly performs the wrap-around in all cases: def find_next(data, index, movement): assert movement == 1 or movement == -1 while True: index = (index + movement) % len(data) if data[index]: break return index # 0 1 2 3 4 5 data = [False, False, False, False, True, False] print find_next(data, 5, -1) # OK, prints 4 print find_next(data, 0, -1) # OK, prints 4 The D code doesn't work, because of the C %: size_t findNext(in bool[] data, size_t index, in int movement) pure nothrow in { assert(movement == 1 || movement == -1); } body { do { index = (index + movement) % data.length; } while (!data[index]); return index; } void main() { import std.stdio; // 0 1 2 3 4 5 const data = [false, false, false, false, true, false]; writeln(findNext(data, 5, -1)); // OK, prints 4 writeln(findNext(data, 0, -1)); // not OK } Bye, bearophile
Aug 12 2012
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 11 Aug 2012 09:48:15 -0400, David <d dav1d.de> wrote:

 -1 % 16 = -1

 Shouldn't that be 15? It seems like the sign is ignored for the modulo.

 Is this a bug or intended behaviour? The Python implementation returns  
 here, as expected, 15.

In case you are still looking for an expression which mods always positive (even for non-powers of 2) (a % b + b) % b Obviously, b must be positive. The direct translation of this to assembly isn't the most efficient way. Technically, in assembly we could check for the sign bit, and only add b again (without doing the second mod) if the value was negative. But I'm not sure you can write an expression that causes that, maybe: a % b < 0 ? a % b + b : a % b Certainly not as appealing. To get a direct translation from the above assembly, it would be something like: typeof(a) x; auto expr = (x = a % b) < 0 ? x + b : x -Steve
Sep 07 2012
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Saturday, 11 August 2012 at 18:54:09 UTC, Thiez wrote:
 A few extra instructions (a CMOV followed by ADD should 
 suffice, yes?) seems like a small price to pay if it can 
 prevent bugs. Why hasn't the Python-modulo been made the 
 default back when D was designed? The ever-so-slightly more 
 efficient C-modulo could be provided in a library. Of course 
 it's way too late to change it now...

According to TDPL regarding integral math: "Anything that compiles in both C and D WILL produce the same results". Anything less would make porting code a nightmare. =>It is absolutely not conceivable to change "%" semantics'. However, D *could* provide the "Mathematician's Modulo" operator, either as a library functionality, or as a new operator. I guess there hasn't been "a strong need" yet.
Sep 07 2012
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
monarch_dodra:

 I guess there hasn't been "a strong need" yet.

Or some people have fulfilled this need since years (D1) writing their own function. Regarding an operator, %% ? :-) Bye, bearophile
Sep 07 2012
prev sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Walter Bright:

 Because then C code translated to D will have subtle, silent 
 and unexpected bugs.

I agree. So the solution is to introduce a new Phobos function or new built-in operator that works more correctly with negative numbers too. A new Phobos function named mod() is a very simple solution. While a new infix operator has the advantage of being more natural to use, this allows programmers to better remember to use it instead of the C %, so the maybe a built-in operator avoids some bugs compared to the function. Bye, bearophile
Sep 07 2012