
digitalmars.D - value range propagation for logical OR

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Consider:

byte c = a | b;

Say you already know min_a, max_a, min_b, and max_b. How do you compute 
min_c and max_c? I thought of it for a bit and it seems quite tricky.


Thanks,

Andrei
Apr 10 2010
next sibling parent reply Justin Spahr-Summers <Justin.SpahrSummers gmail.com> writes:
On Sat, 10 Apr 2010 12:01:45 -0500, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> wrote:
 
 Consider:
 
 byte c = a | b;
 
 Say you already know min_a, max_a, min_b, and max_b. How do you compute 
 min_c and max_c? I thought of it for a bit and it seems quite tricky.
 
 
 Thanks,
 
 Andrei

It seems like it would use the lowest minimum and the highest maximum of the two... that's what I personally would find most natural, given the semantics of bitwise OR, but maybe I'm just way off base.
Apr 10 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 04/10/2010 12:22 PM, Justin Spahr-Summers wrote:
 On Sat, 10 Apr 2010 12:01:45 -0500, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org>  wrote:
 Consider:

 byte c = a | b;

 Say you already know min_a, max_a, min_b, and max_b. How do you compute
 min_c and max_c? I thought of it for a bit and it seems quite tricky.


 Thanks,

 Andrei

It seems like it would use the lowest minimum and the highest maximum of the two... that's what I personally would find most natural, given the semantics of bitwise OR, but maybe I'm just way off base.

I thought the same, but consider: a = 4; b = 3. Then the maximum value is 7, larger than both.

Andrei
Apr 10 2010
prev sibling next sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Sat, Apr 10, 2010 at 12:01:45PM -0500, Andrei Alexandrescu wrote:
 byte c = a | b;
 
 Say you already know min_a, max_a, min_b, and max_b. How do you compute 
 min_c and max_c? I thought of it for a bit and it seems quite tricky.

min_c seems easy enough:

    min_c = max(min_a, min_b);

When you or two values together, the result must be >= both of them, and this does the trick there.

max_c is a bit harder. The obvious solution of:

    max_c = max(max_a, max_b);

fails because of 0b1100 | 0b0011... Looking at this example, let's try:

    max_c = max_a + max_b;

It covers that case, but is too big for 0b1111 | 0b1111. This might work:

    int numbits(number) { return the number of bits required to hold that number; }

    if ( numbits(max_a + max_b) > max(numbits(max_a), numbits(max_b)) )
        max_c = 2 ^^ (max(numbits(max_a), numbits(max_b))) - 1;
    else
        max_c = max_a + max_b;

The biggest we can possibly get is the sum of the two numbers. But, if it carries over to more bits than either option, we truncate it - OR never gives more bits than the two arguments. So we assume the max is the worst case of all ones or'd with all ones.

I think this would work, but might not be the best we can possibly do. It is the best I can think of though.

-- 
Adam D. Ruppe
http://arsdnet.net
Apr 10 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
(By the way I meant _bitwise_ in the title.)

On 04/10/2010 12:38 PM, Adam D. Ruppe wrote:
 On Sat, Apr 10, 2010 at 12:01:45PM -0500, Andrei Alexandrescu wrote:
 byte c = a | b;

 Say you already know min_a, max_a, min_b, and max_b. How do you compute
 min_c and max_c? I thought of it for a bit and it seems quite tricky.

min_c seems easy enough:

     min_c = max(min_a, min_b);

 When you or two values together, the result must be >= both of them, and this does the trick there.

Yah, that works for unsigned types. For signed types, it's tricky (as tricky as max).
 max_c is a bit harder. The obvious solution of:

 max_c = max(max_a, max_b);

 Fails because of 0b1100 | 0b0011... looking at this example, let's try:

 max_c = max_a + max_b;

 It covers that case, but is too big for 0b1111 | 0b1111.

Yah, sum is a bound for max, but not the exact max. This is because bitwise OR is like a sum without carry.
 This might work:

 int numbits(number) { return the number of bits required to hold that number; }

 if ( numbits(max_a + max_b) > max(numbits(max_a), numbits(max_b)) )
      max_c = 2 ^^ (max(numbits(max_a), numbits(max_b))) - 1;
 else
      max_c = max_a + max_b;


 The biggest we can possibly get is the sum of the two numbers. But, if it
 carries over to more bits than either option, we truncate it - OR never gives
 more bits than the two arguments. So we assume the max is the worst case of
 all ones or'd with all ones.

 I think this would work, but might not be the best we can possibly do. It is
 the best I can think of though.

Even on the branch that doesn't use the sum, that's still just a bound. Consider: a = 8; b = 4. Then max(a|b) is 12, but computed with your method is 15.

Andrei
Apr 10 2010
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Adam D. Ruppe:

 My feeling is to go ahead and cast to unsigned, then do the calculation. For
 me at least, when doing bitwise ops, I assume it is unsigned anyway.

Some time ago I even suggested disallowing bitwise ops when one or both operands are signed...

Bye,
bearophile
Apr 10 2010
parent bearophile <bearophileHUGS lycos.com> writes:
Adam D. Ruppe:
 I understand the position, but I don't advocate it just because I think it
 would be annoying and not give much of a benefit.

I have never filed a bug report in Bugzilla about this because I think it wouldn't give enough benefit.

Thank you for your answer,
bye,
bearophile
Apr 10 2010
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Adam D. Ruppe:

 Oh no! Eyeballing again showed a problem.. I should have put parens in my
 asserts:

 a | b < f(a,b)   !=   (a|b) < f(a,b)

Bugs are gold nuggets! Yes, the precedence of bitwise operators is low; it's an error-prone part of C/C++/D that causes frequent bugs in programs written by everybody. I add extra parentheses when I use bitwise operators. Unfortunately I don't see a simple way to remove this significant source of bugs from the D2 language.

When you switch on full warnings, GCC warns you about a few possible similar errors, suggesting you add parentheses to remove some ambiguity (for the eyes of the programmers). This is a small example in C:

#include "stdio.h"
#include "stdlib.h"
int main() {
    int a = atoi("10");
    int b = atoi("20");
    int c = atoi("30");
    printf("%u\n", a|b <= c);
    return 0;
}

If you compile it with GCC (I am using gcc 4.4.1):

...>gcc -Wall test.c -o test
test.c: In function 'main':
test.c:9: warning: suggest parentheses around comparison in operand of '|'

You always use -Wall (and other warnings) when you write C code. So in this case the C language+compiler is able to catch a bug like yours :-)

Bye,
bearophile
Apr 10 2010
parent reply bearophile <bearophileHUGS lycos.com> writes:
I have filed a "report":
http://d.puremagic.com/issues/show_bug.cgi?id=4077

Bye,
bearophile
Apr 10 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 04/10/2010 09:55 PM, bearophile wrote:
 I have filed a "report":
 http://d.puremagic.com/issues/show_bug.cgi?id=4077

Perfect, thanks. Your contribution of bug reports is much appreciated.

Andrei
Apr 10 2010
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 04/10/2010 06:58 PM, Adam D. Ruppe wrote:
 On Sat, Apr 10, 2010 at 07:38:07PM -0400, Adam D. Ruppe wrote:
 I *might* have it, but I'm not 100% confident in my test program.

I think my test program is telling me something: a non-zero min_c subtracts from max_c. If I take my brute-force, empirically calculated max_c and compare it to what my max_c function returns, it doesn't always match if the minimums change.

This might be a bug; it is unintuitive that it would do this, but it actually makes sense. Consider if min == max: you only have one value to work with.

Let max_a = 4, min_a = 0, max_b = 4. You can get 7 out of it, since 3 < 4, so it is a possible number and:

    100 | 011 == 7

But, if min_b == 4, the only possible value for b is 4. And:

    100 | 100 == 4.

Increasing min_b does indeed decrease the max you can get. Nice fact, but I've spent too much time on this already, so I'll call it done with this:

My max_c included some unnecessary code: my reusable mask obsoletes all of my special case code! So, we can bring the program down to three lines:

    import std.intrinsic;

    uint numbits(uint a) { return a == 0 ? a : (bsr(a) + 1); }

    uint max_c(uint a, uint b) {
        return a | ((1 << numbits(a&b)) - 1) | b;
    }

It passes my test, though since I bugged it up the first time, I might have done it again.

What results does it yield with my main() test harness?

Andrei
Apr 10 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 04/10/2010 08:58 PM, Adam D. Ruppe wrote:
 On Sat, Apr 10, 2010 at 08:42:54PM -0500, Andrei Alexandrescu wrote:
 What results does it yield with my main() test harness?

total=100000000; equal=14585400 (14.5854%)

I think that's a perfect result. Here's why: the equality comparison checks two specific values, (a, b), but the maxOr functions return the max for the interval ([0, a], [0, b]). It wouldn't always be equal, since there are other values in there that can reach the max.

Yah, it's clear that we're not looking anywhere near 100%. I'm a bit sad we don't have an adequate baseline, but the fact that two distinct methods obtained the same result is encouraging.
 For example, a = 4, b = 4. 100|100 == 100, but that range also include 4|3,
 100 | 011 == 111, which is the max.


 Your maxOR function gives the same percentage, for probably the same reason.
 Though, mine runs ~7x faster on my box.

Then you haven't wasted your day! On to the next task: compute minOR and maxOR for _signed_ values. It turns out minOR is also nontrivial... When we're done, we can pass the functions to Walter so he can integrate them within his compiler. Right now the compiler uses a wrong algorithm.

Andrei
Apr 10 2010
parent Ali Çehreli <acehreli yahoo.com> writes:
Adam D. Ruppe wrote:

 I am pretty convinced that to get the
 perfect solution, the max_c function needs to know min_a and min_b too.

Absolutely! :) I had just started writing an e-mail about it.

Andrei Alexandrescu wrote:
 On to the next task: compute minOR and maxOR for _signed_ values. It
 turns out minOR is also nontrivial...


How about computing minOR for unsigned values first? ;) The proposed solution of max(min_a, min_b) is correct only for a couple of corner cases. Ali
Apr 10 2010
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 04/10/2010 12:01 PM, Andrei Alexandrescu wrote:
 Consider:

 byte c = a | b;

 Say you already know min_a, max_a, min_b, and max_b. How do you compute
 min_c and max_c? I thought of it for a bit and it seems quite tricky.


 Thanks,

 Andrei

I think this would work:

uint maxOR(uint maxa, uint maxb) {
    if (maxa < maxb) return maxOR(maxb, maxa);
    uint candidate = 0;
    foreach (i, bit; byBitDescending(maxa)) {
        if (bit) continue;
        auto t = candidate | (1 << (31 - i));
        if (t <= maxb) candidate = t;
    }
    return maxa | candidate;
}

The idea is to find the candidate that would fill as many zeros in maxa as possible. But this is inefficient. I'm wondering whether a simple bitwise manipulation expression could do the trick.

Andrei
Apr 10 2010
next sibling parent reply Rainer Deyke <rainerd eldwood.com> writes:
On 4/10/2010 11:52, Andrei Alexandrescu wrote:
 I think this would work:
 
 uint maxOR(uint maxa, uint maxb) {
     if (maxa < maxb) return maxOR(maxb, maxa);
     uint candidate = 0;
     foreach (i, bit; byBitDescending(maxa)) {
         if (bit) continue;
         auto t = candidate | (1 << (31 - i));
         if (t <= maxb) candidate = t;
     }
     return maxa | candidate;
 }

This looks wrong. Your function, if I understand it correctly, flips all the bits in 'maxb' (excluding the leading zeros). If 'maxb' is exactly 1 less than a power of 2, then 'candidate' will be 0. Now consider a in [0, 16], b in [0, 15]. Your function will produce 16, but the real maximum is 31.

For maximum accuracy, you have to consider the minima as well as the maxima when calculating the new maximum. With 'a' in [16, 16] and 'b' in [16, 16], the new range can only be [16, 16]. With 'a' in [0, 16] and 'b' in [0, 16], the correct new range is [0, 31].

-- 
Rainer Deyke - rainerd eldwood.com
Apr 10 2010
next sibling parent Don <nospam nospam.com> writes:
Rainer Deyke wrote:
 On 4/10/2010 11:52, Andrei Alexandrescu wrote:
 I think this would work:

 uint maxOR(uint maxa, uint maxb) {
     if (maxa < maxb) return maxOR(maxb, maxa);
     uint candidate = 0;
     foreach (i, bit; byBitDescending(maxa)) {
         if (bit) continue;
         auto t = candidate | (1 << (31 - i));
         if (t <= maxb) candidate = t;
     }
     return maxa | candidate;
 }

 This looks wrong. Your function, if I understand it correctly, flips all
 the bits in 'maxb' (excluding the leading zeros). If 'maxb' is exactly 1
 less than a power of 2, then 'candidate' will be 0. Now consider a in
 [0, 16], b in [0, 15]. Your function will produce 16, but the real
 maximum is 31.

 For maximum accuracy, you have to consider the minima as well as the
 maxima when calculating the new maximum. With 'a' in [16, 16] and 'b'
 in [16, 16], the new range can only be [16, 16]. With 'a' in [0, 16]
 and 'b' in [0, 16], the correct new range is [0, 31].

It's a very interesting puzzle. There are some pretty complex cases. Consider (all numbers binary) a in [1010, 1100] and b in [101,101]. The maximum value is when a=1011, b=101, so max is 1111.
Apr 10 2010
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 04/10/2010 01:55 PM, Rainer Deyke wrote:
 On 4/10/2010 11:52, Andrei Alexandrescu wrote:
 I think this would work:

 uint maxOR(uint maxa, uint maxb) {
      if (maxa<  maxb) return maxOR(maxb, maxa);
      uint candidate = 0;
      foreach (i, bit; byBitDescending(maxa)) {
          if (bit) continue;
          auto t = candidate | (1<<  (31 - i));
          if (t<= maxb) candidate = t;
      }
      return maxa | candidate;
 }

This looks wrong. Your function, if I understand it correctly, flips all the bits in 'maxb' (excluding the leading zeros).

It tries to set, from the most significant position down, the bits in which maxa is zero, and keeps each one set if the resulting number doesn't exceed maxb.
 If 'maxb' is
 exactly 1 less than a power of 2, then 'candidate' will be 0.  Now
 consider a in [0, 16], b in [0, 15].  Your function will produce 16, but
 the real maximum is 31.

If maxb is 2^^N-1, then it will fill all the zero bits of maxa.
 For maximum accuracy, you have to consider the minima as well as the
 maxima when calculating the new maximum.  With 'a' in [16, 16] and 'b'
 in [16, 16], the new range can only be [16, 16].  With 'a' in [0, 16]
 and 'b' in [0, 16], the correct new range is [0, 31].

The minimum for unsigned numbers is very simple: min(mina, minb). Andrei
Apr 10 2010
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 04/10/2010 04:21 PM, Adam D. Ruppe wrote:
 On Sat, Apr 10, 2010 at 04:05:06PM -0500, Andrei Alexandrescu wrote:
 The minimum for unsigned numbers is very simple: min(mina, minb).

mina = 1
minb = 0

min(1, 0) == 0

But, 1|0 == 1.

max(mina, minb) looks to work though. Consider that ORing can only set bits - it can never take a one and spit out a zero. Therefore, x|y >= x && x|y >= y, which only works with the max(mina, minb) function.

Sorry, I meant max of the two minima. Andrei
Apr 10 2010
prev sibling parent reply "Jérôme M. Berger" <jeberger free.fr> writes:

Andrei Alexandrescu wrote:
 On 04/10/2010 01:55 PM, Rainer Deyke wrote:
 On 4/10/2010 11:52, Andrei Alexandrescu wrote:
 I think this would work:

 uint maxOR(uint maxa, uint maxb) {
      if (maxa<  maxb) return maxOR(maxb, maxa);
      uint candidate =3D 0;
      foreach (i, bit; byBitDescending(maxa)) {
          if (bit) continue;
          auto t =3D candidate | (1<<  (31 - i));
          if (t<=3D maxb) candidate =3D t;
      }
      return maxa | candidate;
 }

This looks wrong. Your function, if I understand it correctly, flips all the bits in 'maxb' (excluding the leading zeros).

 It tries to set, from the most significant position down, the bits in
 which maxa is zero, and keeps each one set if the resulting number
 doesn't exceed maxb.

int highbit (uint n)
{
   auto result = 0;
   if (n & 0xFFFF0000) { result += 16; n = n >> 16; }
   if (n & 0xFF00)     { result += 8;  n = n >> 8; }
   if (n & 0xF0)       { result += 4;  n = n >> 4; }
   if (n & 0xC)        { result += 2;  n = n >> 2; }
   if (n & 0x2)        { result += 1;  n = n >> 1; }
   if (n & 0x1)        { result += 1; }
   return result;
}

		Jerome
-- 
mailto:jeberger free.fr
http://jeberger.free.fr
Jabber: jeberger jabber.fr
Apr 10 2010
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 04/10/2010 04:50 PM, "Jérôme M. Berger" wrote:
 Andrei Alexandrescu wrote:
 On 04/10/2010 01:55 PM, Rainer Deyke wrote:
 On 4/10/2010 11:52, Andrei Alexandrescu wrote:
 I think this would work:

 uint maxOR(uint maxa, uint maxb) {
       if (maxa<   maxb) return maxOR(maxb, maxa);
       uint candidate = 0;
       foreach (i, bit; byBitDescending(maxa)) {
           if (bit) continue;
           auto t = candidate | (1<<   (31 - i));
           if (t<= maxb) candidate = t;
       }
       return maxa | candidate;
 }

This looks wrong. Your function, if I understand it correctly, flips all the bits in 'maxb' (excluding the leading zeros).

 It tries to set, from the most significant position down, the bits in
 which maxa is zero, and keeps each one set if the resulting number
 doesn't exceed maxb.

 int highbit (uint n)
 {
    auto result = 0;
    if (n & 0xFFFF0000) { result += 16; n = n >> 16; }
    if (n & 0xFF00)     { result += 8;  n = n >> 8; }
    if (n & 0xF0)       { result += 4;  n = n >> 4; }
    if (n & 0xC)        { result += 2;  n = n >> 2; }
    if (n & 0x2)        { result += 1;  n = n >> 1; }
    if (n & 0x1)        { result += 1; }
    return result;
 }

 		Jerome

Interesting. I don't get how your version is equivalent to mine, and I don't get how it actually works, but it seems to. Could you please explain?

The program below compiles, runs, and prints

total=100000000; equal=3850865 (3.85087%)

Here it is:

import std.contracts, std.stdio;

uint maxOR(uint maxa, uint maxb) {
    return maxa | ((1 << highbit(maxb)) - 1);
}

uint highbit(uint n) {
    auto result = 0;
    if (n & 0xFFFF0000) { result += 16; n = n >> 16; }
    if (n & 0xFF00)     { result += 8;  n = n >> 8; }
    if (n & 0xF0)       { result += 4;  n = n >> 4; }
    if (n & 0xC)        { result += 2;  n = n >> 2; }
    if (n & 0x2)        { result += 1;  n = n >> 1; }
    if (n & 0x1)        { result += 1; }
    return result;
}

void main() {
    ulong total, equal;
    uint N = 10000;
    foreach (a; 0 .. N) {
        foreach (b; 0 .. N) {
            auto c = maxOR(a, b);
            enforce((a|b) <= c);
            if ((a|b) == c) equal++;
            total++;
        }
    }
    writeln("total=", total, "; equal=", equal,
            " (", equal * 100. / total, "%)");
}

However, your method is not the tightest; mine seems tighter. When I replaced maxOR with this:

uint maxOR2(uint maxa, uint maxb) {
    if (maxa < maxb) return maxOR(maxb, maxa);
    uint candidate = 0;
    uint mask = 1u << 31;
    for (uint i = 0; i < 32; ++i, mask >>= 1) {
        if (maxa & mask) continue;
        auto t = candidate | (1 << (31 - i));
        if (t <= maxb) candidate = t;
    }
    return maxa | candidate;
}

I got:

total=100000000; equal=9822516 (9.82252%)

Besides, I verified that my method returns numbers <= yours.

Andrei
Apr 10 2010
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 04/10/2010 05:33 PM, Andrei Alexandrescu wrote:
[snip]

One more interesting detail. I simplified the routine to:

uint maxOR(uint maxa, uint maxb) {
     uint candidate = 0;
     uint mask = 1u << 31;
     for (; mask; mask >>= 1) {
         if (maxa & mask) continue;
         auto t = candidate | mask;
         if (t <= maxb) candidate = t;
     }
     return maxa | candidate;
}

Among other things I remove the test whether maxa >= maxb at the 
beginning of the function. I now obtain:

total=100000000; equal=14585400 (14.5854%)

so this function is better than all others so far. But I don't 
understand why it beats this one:

uint maxOR(uint maxa, uint maxb) {
     if (maxa < maxb) return maxOR(maxb, maxa); // added
     uint candidate = 0;
     uint mask = 1u << 31;
     for (; mask; mask >>= 1) {
         if (maxa & mask) continue;
         auto t = candidate | mask;
         if (t <= maxb) candidate = t;
     }
     return maxa | candidate;
}


Andrei
Apr 10 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 04/10/2010 05:43 PM, Andrei Alexandrescu wrote:
 On 04/10/2010 05:33 PM, Andrei Alexandrescu wrote:
 [snip]

 One more interesting detail.

Never mind that. I mistakenly used maxOR instead of maxOR2. To summarize, the best result is

total=100000000; equal=14585400 (14.5854%)

with the function:

uint maxOR(uint maxa, uint maxb) {
    uint candidate = 0;
    uint mask = 1u << 31;
    for (; mask; mask >>= 1) {
        if (maxa & mask) continue;
        auto t = candidate | mask;
        if (t <= maxb) candidate = t;
    }
    return maxa | candidate;
}

Inserting a test and a recursive call at the top keeps the result but actually slows down execution a bit.

Andrei
Apr 10 2010
prev sibling parent reply "Jérôme M. Berger" <jeberger free.fr> writes:

Andrei Alexandrescu wrote:
 Interesting. I don't get how your version is equivalent to mine, and I
 don't get how it actually works, but it seems to. Could you please explain?

[...] tired. Moreover it doesn't work: it should be:

   maxa | ((1 << (highbit(maxb) + 1)) - 1)

which still isn't as tight as yours but at least works :) Between 0 and 64, it is optimal for 92% of cases whereas yours is optimal for over 98% of cases.
 void main() {
     ulong total, equal;
     uint N = 10000;
     foreach (a; 0 .. N) {
         foreach (b; 0 .. N) {
             auto c = maxOR(a, b);
             enforce((a|b) <= c);
             if ((a|b) == c) equal++;
             total++;
         }
     }
     writeln("total=", total, "; equal=", equal,
             " (", equal * 100. / total, "%)");
 }

Note that it is easy to get 100% accuracy with this test: just define maxOR to be maxA|maxB :) I used the following C++ code for my tests:

int main (int, char**)
{
    int count = 0;
    int countTight = 0;

    for (uint32_t minA = 0; minA < 64; ++minA)
    for (uint32_t maxA = minA; maxA < 64; ++maxA)
    for (uint32_t minB = 0; minB < 64; ++minB)
    for (uint32_t maxB = minB; maxB < 64; ++maxB)
    {
        bool reached = false;
        uint32_t max = maxOr (minA, minB, maxA, maxB);
        for (uint32_t a = minA; a <= maxA; ++a)
        for (uint32_t b = minB; b <= maxB; ++b)
        {
            if ((a|b) > max) {
                printf ("minA=%x\n" "maxA=%x\n" "minB=%x\n" "maxB=%x\n"
                        "a=%x\n" "b=%x\n" "a|b=%x\n" "maxOr=%x\n",
                        minA, maxA, minB, maxB, a, b, a|b, max);
                exit (1);
            }
            if ((a|b) == max)
                reached = true;
        }
        if (reached)
            ++countTight;
        ++count;
    }

    printf ("Total: %d\n", count);
    printf ("Tight: %d (%g%%)\n", countTight, countTight * 100. / count);
    printf ("Loose: %d (%g%%)\n", (count - countTight),
            (count - countTight) * 100. / count);
    return 0;
}

OTOH, my solution was slightly faster (4.98s to your 5.34s).

		Jerome
-- 
mailto:jeberger free.fr
http://jeberger.free.fr
Jabber: jeberger jabber.fr
Apr 11 2010
next sibling parent reply "Jérôme M. Berger" <jeberger free.fr> writes:

	OK, I found a solution that is optimal in over 99% of the cases
(99.195% to be precise), while still being fast (because it avoids
looping over the bits):

=====================8<------------------------------
uint32_t maxOr (uint32_t    minA, uint32_t minB,
                uint32_t    maxA, uint32_t maxB)
{
   if (maxA == 0) return maxB;
   if (maxB == 0) return maxA;

   uint32_t a = maxA ^ minA;
   if (a != 0)
      a = ((1 << highbit (a)) - 1);

   uint32_t b = maxB ^ minB;
   if (b != 0)
      b = ((1 << highbit (b)) - 1);

   uint32_t t = maxA & maxB;
   if (t != 0)
      t = ((1 << highbit (t)) - 1);

   return min (maxA|maxB|a|b, maxA|maxB|t);
}
------------------------------>8=====================

Mostly it's black magic ;) and it's **much** simpler than the first
version that reached those performances. The explanation is in the
rest of the message.

I will only talk about the ``a`` side of the problem here (i.e.
assume ``maxB==minB``). The ``b`` side is symmetrical.

The idea is to build a value that is between minA and maxA and will
set as many bits as possible when or'ed with maxB:

- The first thing we do is to notice that minA and maxA have the
  following structure (in binary): ``minA=A0x`` and ``maxA=A1y``
  where the ``A`` part is identical. Any number between minA and
  maxA will therefore be of the form ``a=Az``. A very conservative
  estimate tells us that if we set ``z`` to all ones, then
  ``maxB|maxA|z`` will be greater than ``maxB|a`` for all ``a``.
  This is computed by ``(1 << highbit (maxA ^ minA)) - 1``;

- Another way to look at the problem is to say that a ``0`` bit in
  maxA cannot be set unless some higher ``1`` bit is unset. For
  example if maxA is ``0b10`` then if we want to set bit 0, we need
  to unset bit 1 (which will give us ``0b01``). So by doing this we
  get a lower value for ``a|maxB`` unless this bit was already set
  in maxB. The expression ``maxA & maxB`` gives us the bits that we
  can unset. Therefore only bits that are less significant than the
  high bit of ``maxA & maxB`` may be set. This is stored into the
  variable ``t``;

- Either method works, but neither is perfect. By taking the
  minimum of the two results, we get the best of both worlds
  (although it still isn't perfect).

		Jerome
-- 
mailto:jeberger free.fr
http://jeberger.free.fr
Jabber: jeberger jabber.fr
Apr 11 2010
next sibling parent reply Ali Çehreli <acehreli yahoo.com> writes:
Jérôme M. Berger wrote:

 The idea is to build a value that is  between minA and maxA and will
 set as many bits as possible when or'ed with maxB:

The assumption that maxB would be the value that produces the maximum a|b is not correct. A lower valued b may fill more gaps in the bit representation of what is calculated from min_a and max_a.

Your function failed for me with the following values:

           min_a 00000000000000000000000011001000 000000c8        200
           max_a 00000000000000000000001100001111 0000030f        783
           min_b 00000000000000000000000001000101 00000045         69
           max_b 00000000000000000000001001100001 00000261        609
      calculated 00000000000000000000001001100101 00000265        613
WRONG! empirical 00000000000000000000001111111111 000003ff       1023
       emp_max_a 00000000000000000000000110011110 0000019e        414
       emp_max_b 00000000000000000000001001100001 00000261        609

Please see my test code elsewhere in the same thread: :)

http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=108851

Ali
Apr 11 2010
parent reply "Jérôme M. Berger" <jeberger free.fr> writes:

Ali Çehreli wrote:
 Jérôme M. Berger wrote:
  The idea is to build a value that is between minA and maxA and will
  set as many bits as possible when or'ed with maxB:

 The assumption that maxB would be the value that produces the maximum
 a|b is not correct. A lower valued b may fill more gaps in the bit
 representation of what is calculated from min_a and max_a.

 Your function failed for me with the following values:

            min_a 00000000000000000000000011001000 000000c8        200
            max_a 00000000000000000000001100001111 0000030f        783
            min_b 00000000000000000000000001000101 00000045         69
            max_b 00000000000000000000001001100001 00000261        609
       calculated 00000000000000000000001001100101 00000265        613
 WRONG! empirical 00000000000000000000001111111111 000003ff       1023
        emp_max_a 00000000000000000000000110011110 0000019e        414
        emp_max_b 00000000000000000000001001100001 00000261        609

 Please see my test code elsewhere in the same thread: :)

That can't be the output of my function! Since the return value from my
function includes maxA and maxB, at least all bits that are set in
either of those should be set in the output.

	I've run my code with those input and the result is 3ff as
expected... (See attached source file).

		Jerome
-- 
mailto:jeberger free.fr
http://jeberger.free.fr
Jabber: jeberger jabber.fr

Attached: maxor.cc

#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int highbit (uint32_t n)
{
   assert (n != 0);
   int result = 0;
   if (n & 0xFFFF0000) { result += 16; n >>= 16; }
   if (n & 0xFF00)     { result += 8;  n >>= 8; }
   if (n & 0xF0)       { result += 4;  n >>= 4; }
   if (n & 0xC)        { result += 2;  n >>= 2; }
   if (n & 0x2)        { result += 1;  n >>= 1; }
   return result;
}

inline uint32_t min (uint32_t a, uint32_t b)
{
   return (a < b) ? a : b;
}

uint32_t maxOr (uint32_t minA, uint32_t minB,
                uint32_t maxA, uint32_t maxB)
{
   assert (minA <= maxA);
   assert (minB <= maxB);

   if (maxA == 0) return maxB;
   if (maxB == 0) return maxA;

   uint32_t a = maxA ^ minA;
   if (a != 0)
      a = ((1 << highbit (a)) - 1);

   uint32_t b = maxB ^ minB;
   if (b != 0)
      b = ((1 << highbit (b)) - 1);

   uint32_t t = maxA & maxB;
   if (t != 0)
      t = ((1 << highbit (t)) - 1);

   return min (maxA|maxB|a|b, maxA|maxB|t);
}

int main (int, char**)
{
   uint32_t minA = 200;
   uint32_t maxA = 783;
   uint32_t minB = 69;
   uint32_t maxB = 609;

   printf ("minA=%x\n" "maxA=%x\n" "minB=%x\n" "maxB=%x\n" "maxOr=%x\n",
           minA, maxA, minB, maxB,
           maxOr (minA, minB, maxA, maxB));
   return 0;
}
Apr 11 2010
parent reply Ali Çehreli <acehreli yahoo.com> writes:
Jérôme M. Berger wrote:
 Ali Çehreli wrote:
  Jérôme M. Berger wrote:

 The idea is to build a value that is  between minA and maxA and will
 set as many bits as possible when or'ed with maxB:

  a|b is not correct. A lower valued b may fill more gaps in the bit
  representation of what is calculated from min_a and max_a.

  Your function failed for me with the following values:

             min_a 00000000000000000000000011001000 000000c8        200
             max_a 00000000000000000000001100001111 0000030f        783
             min_b 00000000000000000000000001000101 00000045         69
             max_b 00000000000000000000001001100001 00000261        609
        calculated 00000000000000000000001001100101 00000265        613
  WRONG! empirical 00000000000000000000001111111111 000003ff       1023
         emp_max_a 00000000000000000000000110011110 0000019e        414
         emp_max_b 00000000000000000000001001100001 00000261        609

  Please see my test code elsewhere in the same thread: :)

 That can't be the output of my function!

My mistake: It was your function but the highbit() that I used was not correct. The one I used was returning the *value* of the highest bit.
 Since the return value from my function includes maxA and
 maxB, at least all bits that are set in either of those should be
 set in the output.

 	I've run my code with those input and the result is 3ff as
 expected... (See attached source file).

Perhaps my test code is wrong, but I can't find my error. :/

Your function reports 0b_111 for this set of values:

min_a = 5;
max_a = 6;
min_b = 4;
max_b = 4;

But the maximum value of a|b is 6.

Ali
Apr 11 2010
parent reply =?UTF-8?B?IkrDqXLDtG1lIE0uIEJlcmdlciI=?= <jeberger free.fr> writes:
Ali Çehreli wrote:
 Jérôme M. Berger wrote:
 Ali Çehreli wrote:
 � wrote:

 The idea is to build a value that is between minA and maxA and will
 set as many bits as possible when or'ed with maxB:

 a|b is not correct. A lower valued b may fill more gaps in the bit
 representation of what is calculated from min_a and max_a.

 Your function failed for me with the following values:

            min_a 00000000000000000000000011001000 000000c8        200
            max_a 00000000000000000000001100001111 0000030f        783
            min_b 00000000000000000000000001000101 00000045         69
            max_b 00000000000000000000001001100001 00000261        609
       calculated 00000000000000000000001001100101 00000265        613
 WRONG! empirical 00000000000000000000001111111111 000003ff       1023
        emp_max_a 00000000000000000000000110011110 0000019e        414
        emp_max_b 00000000000000000000001001100001 00000261        609

 Please see my test code elsewhere in the same thread: :)

function!

My mistake: It was your function but the highbit() that I used was not
correct. The one I used was returning the *value* of the highest bit.
 Since the return value from my function includes maxA and
 maxB, at least all bits that are set in either of those should be
 set in the output.

     I've run my code with those input and the result is 3ff as
 expected... (See attached source file).

Perhaps my test code is wrong; but I can't find my error. :/

 Your function reports 0b_111 for these set of values:

         min_a = 5;
         max_a = 6;
         min_b = 4;
         max_b = 4;

 But the maximum value of a|b is 6.

optimal result for over 99% of the cases but give a slightly higher
value for the rest. I'll post a new version that's 100% precise in a
few minutes.

		Jerome
-- 
mailto:jeberger free.fr
http://jeberger.free.fr
Jabber: jeberger jabber.fr
Apr 12 2010
parent =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
Jérôme M. Berger wrote:

 Your function reports 0b_111 for these set of values:

         min_a = 5;
         max_a = 6;
         min_b = 4;
         max_b = 4;

 But the maximum value of a|b is 6.

optimal result for over 99% of the cases but give a slightly higher value for the rest.

Now I see my problem: I took Andrei's puzzle literally. The implication of _compiler_implementation_ was too subtle for me. That's the reason why none of my "correctly inaccurate" solutions has been posted to this thread. :) Ali
Apr 12 2010
prev sibling next sibling parent reply =?ISO-8859-1?Q?=22J=E9r=F4me_M=2E_Berger=22?= <jeberger free.fr> writes:
Jérôme M. Berger wrote:
 	OK, I found a solution that is optimal in over 99% of the cases
 (99.195% to be precise), while still being fast (because it avoids
 looping over the bits):

0..299 without checking, just for the timing. On my computer (Athlon
X2 64 2GHz), my function took 50s and Andrei's took 266s. The
difference should be even bigger for larger numbers.

		Jerome
-- 
mailto:jeberger free.fr
http://jeberger.free.fr
Jabber: jeberger jabber.fr

[attachment: maxor.cc]

#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define N 300

// Comment the following line to get Andrei's implementation
#define MINE

// Comment the following line for exhaustive checking (slow! don't
// forget to reduce N)
#define BENCH

int highbit (uint32_t n)
{
   assert (n != 0);
   int result = 0;
   if (n & 0xFFFF0000) { result += 16; n >>= 16; }
   if (n & 0xFF00)     { result +=  8; n >>=  8; }
   if (n & 0xF0)       { result +=  4; n >>=  4; }
   if (n & 0xC)        { result +=  2; n >>=  2; }
   if (n & 0x2)        { result +=  1; n >>=  1; }
   return result;
}

inline uint32_t min (uint32_t a, uint32_t b) { return (a<b) ? a : b; }

#ifdef MINE
uint32_t maxOr (uint32_t minA, uint32_t minB, uint32_t maxA, uint32_t maxB)
{
   assert (minA <= maxA);
   assert (minB <= maxB);

   if (maxA == 0) return maxB;
   if (maxB == 0) return maxA;

   uint32_t a = maxA ^ minA;
   if (a != 0) a = ((1 << highbit (a)) - 1);
   uint32_t b = maxB ^ minB;
   if (b != 0) b = ((1 << highbit (b)) - 1);
   uint32_t t = maxA & maxB;
   if (t != 0) t = ((1 << highbit (t)) - 1);

   return min (maxA|maxB|a|b, maxA|maxB|t);
}
#else // Andrei's
uint32_t maxOr (uint32_t, uint32_t, uint32_t maxa, uint32_t maxb)
{
   uint32_t candidate = 0;
   uint32_t mask = 1u << 31;
   for (; mask; mask >>= 1) {
      if (maxa & mask) continue;
      uint32_t t = candidate | mask;
      if (t <= maxb) candidate = t;
   }
   return maxa | candidate;
}
#endif

int main (int, char**)
{
   int count = 0;
   int countTight = 0;
   for (uint32_t minA = 0; minA < N; ++minA)
      for (uint32_t maxA = minA; maxA < N; ++maxA)
         for (uint32_t minB = 0; minB < N; ++minB)
            for (uint32_t maxB = minB; maxB < N; ++maxB) {
               bool reached = false;
               uint32_t max = maxOr (minA, minB, maxA, maxB);
#ifdef BENCH
               count += max;
#else // Check
               for (uint32_t a = minA; a <= maxA; ++a)
                  for (uint32_t b = minB; b <= maxB; ++b) {
                     if ((a|b) > max) {
                        printf ("minA=%x\nmaxA=%x\nminB=%x\nmaxB=%x\n"
                                "a=%x\nb=%x\na|b=%x\nmaxOr=%x\n",
                                minA, maxA, minB, maxB, a, b, a|b, max);
                        exit (1);
                     }
                     if ((a|b) == max) reached = true;
                  }
               if (reached) ++countTight;
               ++count;
#endif
            }
#ifdef BENCH
   printf ("Result: %d (this number doesn't mean anything, it is only here\n"
           "to make sure the compiler doesn't optimize everything away...)\n",
           count);
#else
   printf ("Total: %d\n", count);
   printf ("Tight: %d (%g%%)\n", countTight, countTight * 100. / count);
   printf ("Loose: %d (%g%%)\n", (count - countTight),
           (count - countTight) * 100. / count);
#endif
   return 0;
}
Apr 11 2010
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
Jérôme M. Berger Wrote:

 Jérôme M. Berger wrote:
 	OK, I found a solution that is optimal in over 99% of the cases
 (99.195% to be precise), while still being fast (because it avoids
 looping over the bits):
 

0..299 without checking, just for the timing. On my computer (Athlon X2 64 2GHz), my function took 50s and Andrei's took 266s. The difference should be even bigger for larger numbers.

Can I ask, what is the point of having a non-exact solution (unless you are just doing this for fun)? We are talking about range propagation, a function of the compiler, not a function of the compiled program. Therefore, slower but more exact functions should be preferred, since they only affect the compile time. Note that you will only need to do this range propagation on lines that "or" two values together, and something that reduces the runtime of the compiler by 216 seconds, but only when compiling enough code to have over 8 billion 'or' operations in it (300^4), I don't think is really that important. Give me the exact solution that prevents me from having to cast when the compiler can prove it, even if the runtime of the compiler is slightly slower. Just my $.02 -Steve
Apr 11 2010
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 04/11/2010 09:18 PM, Adam D. Ruppe wrote:
 On Sun, Apr 11, 2010 at 10:00:41PM -0400, Steven Schveighoffer wrote:
 We are talking about range propagation, a function of the compiler, not a
function of the compiled program.  Therefore, slower but more exact functions
should be preferred, since they only affect the compile time.

I agree with this. While my solution has a certain beauty to it and is fast, your solution accurately covers all the bases, whereas mine fails with min> 1. I say we go with your's.

Agreed. Steve's solution is the best we have for unsigned numbers. Andrei
Apr 12 2010
parent reply =?ISO-8859-1?Q?=22J=E9r=F4me_M=2E_Berger=22?= <jeberger free.fr> writes:
Andrei Alexandrescu wrote:
 On 04/11/2010 09:18 PM, Adam D. Ruppe wrote:
 On Sun, Apr 11, 2010 at 10:00:41PM -0400, Steven Schveighoffer wrote:
 We are talking about range propagation, a function of the compiler,
 not a function of the compiled program.  Therefore, slower but more
 exact functions should be preferred, since they only affect the
 compile time.

I agree with this. While my solution has a certain beauty to it and is
fast, your solution accurately covers all the bases, whereas mine
fails with min > 1. I say we go with your's.

Agreed. Steve's solution is the best we have for unsigned numbers.

Andrei

		Jerome
-- 
mailto:jeberger free.fr
http://jeberger.free.fr
Jabber: jeberger jabber.fr
Apr 12 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 04/12/2010 12:57 PM, "Jérôme M. Berger" wrote:
 Andrei Alexandrescu wrote:
 On 04/11/2010 09:18 PM, Adam D. Ruppe wrote:
 On Sun, Apr 11, 2010 at 10:00:41PM -0400, Steven Schveighoffer wrote:
 We are talking about range propagation, a function of the compiler,
 not a function of the compiled program.  Therefore, slower but more
 exact functions should be preferred, since they only affect the
 compile time.

I agree with this. While my solution has a certain beauty to it and is fast, your solution accurately covers all the bases, whereas mine fails with min> 1. I say we go with your's.

Agreed. Steve's solution is the best we have for unsigned numbers. Andrei

Jerome

Looks great, thanks Jerome! Now, how about that case in which some or all of the ranges involved include negative values? A simple approach is to define them in terms of ranges for unsigned numbers. Here are the cases I identified: a) If both ranges cross zero, then: minOR == setmsb(min( minOR(nomsb(min_a), nomsb(-1), nomsb(min_b), nomsb(max_b)), minOR(nomsb(min_a), nomsb(max_a), nomsb(min_b), nomsb(-1)))) maxOR == maxOR(0, max_a, 0, max_b) b) If both ranges are negative, then: minOR == setmsb(minOR(nomsb(max_a), nomsb(min_a), nomsb(max_b), nomsb(min_b))) maxOR == setmsb(maxOR(nomsb(max_a), nomsb(min_a), nomsb(max_b), nomsb(min_b))) c) If a crosses zero and b doesn't: minOR == setmsb(minOR(nomsb(min_a), nomsb(-1), min_b, max_b)) maxOR == maxOR(0, max_a, min_b, max_b) The primitive nomsb clears off the sign bit in a number, and setmsb sets it. Makes sense? Andrei
Apr 16 2010
parent Don <nospam nospam.com> writes:
Andrei Alexandrescu wrote:
 On 04/12/2010 12:57 PM, "Jérôme M. Berger" wrote:
 Andrei Alexandrescu wrote:
 On 04/11/2010 09:18 PM, Adam D. Ruppe wrote:
 On Sun, Apr 11, 2010 at 10:00:41PM -0400, Steven Schveighoffer wrote:
 We are talking about range propagation, a function of the compiler,
 not a function of the compiled program.  Therefore, slower but more
 exact functions should be preferred, since they only affect the
 compile time.

I agree with this. While my solution has a certain beauty to it and is fast, your solution accurately covers all the bases, whereas mine fails with min> 1. I say we go with your's.

Agreed. Steve's solution is the best we have for unsigned numbers. Andrei

Jerome

Looks great, thanks Jerome! Now, how about that case in which some or all of the ranges involved include negative values? A simple approach is to define them in terms of ranges for unsigned numbers. Here are the cases I identified: a) If both ranges cross zero, then: minOR == setmsb(min( minOR(nomsb(min_a), nomsb(-1), nomsb(min_b), nomsb(max_b)), minOR(nomsb(min_a), nomsb(max_a), nomsb(min_b), nomsb(-1)))) maxOR == maxOR(0, max_a, 0, max_b) b) If both ranges are negative, then: minOR == setmsb(minOR(nomsb(max_a), nomsb(min_a), nomsb(max_b), nomsb(min_b))) maxOR == setmsb(maxOR(nomsb(max_a), nomsb(min_a), nomsb(max_b), nomsb(min_b))) c) If a crosses zero and b doesn't: minOR == setmsb(minOR(nomsb(min_a), nomsb(-1), min_b, max_b)) maxOR == maxOR(0, max_a, min_b, max_b) The primitive nomsb clears off the sign bit in a number, and setmsb sets it. Makes sense? Andrei

A comment: The way DMD currently does range propagation (storing the
range as union{long, ulong}) is wrong. Currently, when mixed
signed/unsigned operations happen, it gives very pessimistic results.
The sign bit should be stored separately:

sign  value
  1   1xxxxx   Negative, long.min..-1
  1   0xxxxx   (Never occurs)
  0   0xxxxx   0..long.max
  0   1xxxxx   long.max+1..ulong.max

I'm not sure if that makes this case a little more, or a little less
complicated. But this defines the problem a bit better.
Apr 20 2010
prev sibling parent reply =?UTF-8?B?IkrDqXLDtG1lIE0uIEJlcmdlciI=?= <jeberger free.fr> writes:
Steven Schveighoffer wrote:
 Jérôme M. Berger Wrote:

 Jérôme M. Berger wrote:
 	OK, I found a solution that is optimal in over 99% of the cases
 (99.195% to be precise), while still being fast (because it avoids
 looping over the bits):

0..299 without checking, just for the timing. On my computer (Athlon X2 64 2GHz), my function took 50s and Andrei's took 266s. The difference should be even bigger for larger numbers.

 Can I ask, what is the point of having a non-exact solution (unless
 you are just doing this for fun)?

 We are talking about range propagation, a function of the compiler,
 not a function of the compiled program. Therefore, slower but more
 exact functions should be preferred, since they only affect the
 compile time. Note that you will only need to do this range
 propagation on lines that "or" two values together, and something
 that reduces the runtime of the compiler by 216 seconds, but only
 when compiling enough code to have over 8 billion 'or' operations in
 it (300^4), I don't think is really that important. Give me the exact
 solution that prevents me from having to cast when the compiler can
 prove it, even if the runtime of the compiler is slightly slower.

speed of the D compiler to that of the Go compiler...

	We are talking about range propagation, a function of the
compiler, not a function of the compiled program. Since we can't get
a 100% accurate representation of the possible values anyway (even
yours might leave holes in the middle after all), then it might be
better to choose a faster, slightly less precise algorithm if the
difference is not too great. That's the difference between a compiler
and a full-fledged static code analysis and program prover.

	Anyway, the point is moot, I have a new version of my algorithm
with 100% precision and high speed.

		Jerome
-- 
mailto:jeberger free.fr
http://jeberger.free.fr
Jabber: jeberger jabber.fr
Apr 12 2010
next sibling parent reply =?UTF-8?B?IkrDqXLDtG1lIE0uIEJlcmdlciI=?= <jeberger free.fr> writes:
Steven Schveighoffer wrote:
 On Mon, 12 Apr 2010 13:45:14 -0400, Jérôme M. Berger
 <jeberger free.fr> wrote:

     We are talking about range propagation, a function of the compiler,
 not a function of the compiled program. Since we can't get a 100%
 accurate representation of the possible values anyway (even yours
 might leave holes in the middle after all), then it might be better
 to choose a faster, slightly less precise algorithm if the
 difference is not too great. That's the difference between a
 compiler and a full-fledged static code analysis and program prover.

 When we're talking about the difference between O(1) and O(lgn), I'll
 take accuracy over speed in my compiler any day. The solution should be
 100% accurate for the problem statement.  If the problem statement is
 not what we need, then we need a new problem statement :)  Solving the
 problem statement for 99% of values is not good enough.

a minimal loss of accuracy, which will you take? Especially if the accuracy loss is less than is lost elsewhere (due to holes in the ranges).
     Anyway, the point is moot, I have a new version of my algorithm
 with 100% precision and high speed.

Yes, I'm still trying to understand how it works :)

		Jerome
-- 
mailto:jeberger free.fr
http://jeberger.free.fr
Jabber: jeberger jabber.fr
Apr 12 2010
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
Jérôme M. Berger Wrote:

 Steven Schveighoffer wrote:
 When we're talking about the difference between O(1) and O(lgn), I'll
 take accuracy over speed in my compiler any day.

a minimal loss of accuracy, which will you take? Especially if the accuracy loss is less than is lost elsewhere (due to holes in the ranges).

Really? You rebuilt the compiler with your range propagation algorithm and verified that it adds 10 seconds versus an accurate one that adds 55s? How much time did the compiler spend to compile? I'd hazard to guess that a code base that adds 10s worth of your algorithm takes at least a few hours to compile. Is 55s that bad at that point? Again, if it takes the compiler an extra insignificant amount of time to be more accurate, I'm all for accuracy over speed when you get down to that level of insignificance. I'd say the point of pain has to be at least 10% of the compile time before it makes any sizable difference. -Steve
Apr 12 2010
parent reply =?UTF-8?B?IkrDqXLDtG1lIE0uIEJlcmdlciI=?= <jeberger free.fr> writes:
Steven Schveighoffer wrote:
 Jérôme M. Berger Wrote:

 Steven Schveighoffer wrote:
 When we're talking about the difference between O(1) and O(lgn), I'll
 take accuracy over speed in my compiler any day.

a minimal loss of accuracy, which will you take? Especially if the accuracy loss is less than is lost elsewhere (due to holes in the ranges).

 Really? You rebuilt the compiler with your range propagation
 algorithm and verified that it adds 10 seconds versus an accurate one
 that adds 55s? How much time did the compiler spend to compile? I'd
 hazard to guess that a code base that adds 10s worth of your
 algorithm takes at least a few hours to compile. Is 55s that bad at
 that point?

 Again, if it takes the compiler an extra insignificant amount of time
 to be more accurate, I'm all for accuracy over speed when you get
 down to that level of insignificance. I'd say the point of pain has
 to be at least 10% of the compile time before it makes any sizable
 difference.

times slower just because it brings extra precision which you may not
really need, then you will wind up with a compiler that is 5 to 6
times slower than it needs to be. Sure the difference on *one*
function is not great in absolute terms, but if you make the same
choice for *all* functions, then where do you go?

		Jerome
-- 
mailto:jeberger free.fr
http://jeberger.free.fr
Jabber: jeberger jabber.fr
Apr 13 2010
parent =?UTF-8?B?IkrDqXLDtG1lIE0uIEJlcmdlciI=?= <jeberger free.fr> writes:
Steven Schveighoffer wrote:
 In my opinion?  Yes, slower compilers that make code easier to write
 are better.  I don't spend lots of time compiling, I spend it writing
 code.  And I don't need to babysit the compiler, it goes off and does
 its thing.

 Performance is only important in the end result.  I'm not saying I
 want my compiler to be slow, but I want it to be accurate and useful
 more than I want it to be quick.  In this case, there is a quick and
 fully accurate solution, so it doesn't matter.  But, for instance, if
 the compiler could do a full analysis to check if variables escape
 their scope, and that makes the compiler 5x slower, then I'd rather
 have the compiler verify my work instead of quickly producing
 memory-corrupting code.

of this kind of tools, but I don't want to have to put up with their
slowness every time I compile something. Having a quick
write-compile-test cycle is very important IMO. Then when I have a
rough outline of the program, I can run another tool (or add a
command line option) to enable extra checking (and then it doesn't
matter if the tool takes the whole night to run and I get the results
the next morning when I get to work).

		Jerome
-- 
mailto:jeberger free.fr
http://jeberger.free.fr
Jabber: jeberger jabber.fr
Apr 13 2010
prev sibling next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Fawzi Mohamed:
 I guess that I am missing something obvious, so I don't see the reason  
 for range propagation, but maybe I am not the only one, so thanks for  
 an explanation...

Range propagation can improve the situation a little, so it's not bad. But it can't replace proper integral overflows. You need integral overflows at runtime. Bye, bearophile
Apr 13 2010
prev sibling parent reply Don <nospam nospam.com> writes:
Fawzi Mohamed wrote:
 
 On 12-apr-10, at 21:40, Steven Schveighoffer wrote:
 
 On Mon, 12 Apr 2010 13:45:14 -0400, Jérôme M. Berger 
 <jeberger free.fr> wrote:

 Steven Schveighoffer wrote:
 Jérôme M. Berger Wrote:

 Jérôme M. Berger wrote:
     OK, I found a solution that is optimal in over 99% of the cases
 (99.195% to be precise), while still being fast (because it avoids
 looping over the bits):

pairs in 0..299 without checking, just for the timing. On my computer (Athlon X2 64 2GHz), my function took 50s and Andrei's took 266s. The difference should be even bigger for larger numbers.

Can I ask, what is the point of having a non-exact solution (unless you are just doing this for fun)? We are talking about range propagation, a function of the compiler, not a function of the compiled program. Therefore, slower but more exact functions should be preferred, since they only affect the compile time. Note that you will only need to do this range propagation on lines that "or" two values together, and something that reduces the runtime of the compiler by 216 seconds, but only when compiling enough code to have over 8 billion 'or' operations in it (300^4), I don't think is really that important. Give me the exact solution that prevents me from having to cast when the compiler can prove it, even if the runtime of the compiler is slightly slower.

speed of the D compiler to that of the Go compiler...

My point was simply that the amount of time saved is relative to the size of the program being compiled. If we assume conservatively that every other line in a program has bitwise or in it, then you are talking about a 16 billion line program, meaning the 216 seconds you save is insignificant compared to the total compile time. My point was that your fast solution that is inaccurate is not preferable because nobody notices the speed, they just notice the inaccuracy.
     We are talking about range propagation, a function of the compiler,
 not a function of the compiled program. Since we can't get a 100%
 accurate representation of the possible values anyway (even yours
 might leave holes in the middle after all), then it might be better
 to choose a faster, slightly less precise algorithm if the
 difference is not too great. That's the difference between a
 compiler and a full-fledged static code analysis an program prover.

When we're talking about the difference between O(1) and O(lgn), I'll take accuracy over speed in my compiler any day. The solution should be 100% accurate for the problem statement. If the problem statement is not what we need, then we need a new problem statement :) Solving the problem statement for 99% of values is not good enough.
     Anyway, the point is moot, I have a new version of my algorithm
 with 100% precision and high speed.

Yes, I'm still trying to understand how it works :)

Sorry for the probably stupid question, but I don't understand much
the need of all this range propagation: in the compiler, either you
have a constant (and then you have the value at compile time, you
don't need any range propagation, and you can compare with the
value), or you have a runtime value.

It's been part of DMD2 for a while now. It allows you to do things like: ubyte lowbits(int x) { return x & 7; } without an explicit cast. The compiler knows that x&7 can safely fit inside a single byte. Whereas ((x&7) << 12) | (x&3); does not fit, and requires an explicit cast.
Apr 13 2010
parent =?UTF-8?B?IkrDqXLDtG1lIE0uIEJlcmdlciI=?= <jeberger free.fr> writes:
Fawzi Mohamed wrote:

 On 13-apr-10, at 12:02, Don wrote:
 It's been part of DMD2 for a while now. It allows you to do things like:

 ubyte lowbits(int x)
 {
    return x & 7;
 }

 without an explicit cast. The compiler knows that x&7 can safely fit
 inside a single byte.  Whereas   ((x&7) << 12) | (x&3);
 does not fit, and requires an explicit cast.

 ah ok I understand, I thought that was treated like x & cast(ubyte)7,
 and so as comparison of the compile time value with the ranges of ubyte
 (no range propagation needed).
 But I can understand why it is treated as cast(ubyte)((cast(int)x) & 7)
 as it is probably easier for the compiler, as it upcasts by default.

 Thanks for the explanation.

don't need the cast to "int" in your second example, but you still
need range propagation in the first...

		Jerome
-- 
mailto:jeberger free.fr
http://jeberger.free.fr
Jabber: jeberger jabber.fr
Apr 13 2010
prev sibling parent reply =?ISO-8859-1?Q?=22J=E9r=F4me_M=2E_Berger=22?= <jeberger free.fr> writes:
	Here's a fast 100% precise solution:

===============================8<------------------------------
uint32_t maxOr (uint32_t minA, uint32_t minB,
                uint32_t maxA, uint32_t maxB)
{
   assert (minA <= maxA);
   assert (minB <= maxB);

   if (maxA == 0) return maxB;
   if (maxB == 0) return maxA;

   uint32_t a = maxA ^ minA;
   if (a != 0) {
      a = ((1 << (highbit (a)+1)) - 1) & maxA & maxB;
      if (a != 0)
         a = (1 << highbit (a)) - 1;
   }

   uint32_t b = maxB ^ minB;
   if (b != 0) {
      b = ((1 << (highbit (b)+1)) - 1) & maxA & maxB;
      if (b != 0)
         b = (1 << highbit (b)) - 1;
   }

   return maxA|maxB|a|b;
}
------------------------------>8===============================

		Jerome
-- 
mailto:jeberger free.fr
http://jeberger.free.fr
Jabber: jeberger jabber.fr
Apr 12 2010
next sibling parent reply =?UTF-8?B?IkrDqXLDtG1lIE0uIEJlcmdlciI=?= <jeberger free.fr> writes:
Steven Schveighoffer wrote:
 Fails for test case:

 minA = 4, maxA = 4, minB = 4, maxB = 6 (outputs 7, accurate result is 6).

combinations in the [0, 63] range, so if there are mistakes they have
to be outside that range...

		Jerome
-- 
mailto:jeberger free.fr
http://jeberger.free.fr
Jabber: jeberger jabber.fr
Apr 12 2010
next sibling parent =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
Jérôme M. Berger wrote:
 Steven Schveighoffer wrote:
 Fails for test case:

 minA = 4, maxA = 4, minB = 4, maxB = 6 (outputs 7, accurate result is 6).

combinations in the [0, 63] range, so if there are mistakes they have to be outside that range... Jerome

I confirm. Jerome's code passes my randomized test. Ali
Apr 12 2010
prev sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
Jérôme M. Berger Wrote:

 Steven Schveighoffer wrote:
 Fails for test case:
 
 minA = 4, maxA = 4, minB = 4, maxB = 6 (outputs 7, accurate result is 6).
 

combinations in the [0, 63] range, so if there are mistakes they have to be outside that range...

I'll have to double check, I thought I copied your code verbatim (I did switch around the arguments to be minA, maxA, minB, maxB to fit my test harness, and I changed the uint_32_t to uint). I'll check tomorrow at work where the computer I used to test is. -Steve
Apr 12 2010
next sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
Steven Schveighoffer Wrote:

 I'll have to double check, I thought I copied your code verbatim (I did switch
around the arguments to be minA, maxA, minB, maxB to fit my test harness, and I
changed the uint_32_t to uint).  I'll check tomorrow at work where the computer
I used to test is.
 

I think I see the issue, I was using an older version of your highbit, at some point you changed it, but didn't repeat it in this solution, so I searched for it on the newsgroup and found your first version. That differs from later versions you posted, but I can't find where you identified there was a bug. I'll test again tomorrow, but it looks like that probably accounts for the problem. The version I was using: http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=108788 The later version: http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=108855 It looks like in that second post that Ali had the same problem I did. Next time, you should include your whole code each time, especially parts you changed. -Steve
Apr 12 2010
parent =?UTF-8?B?IkrDqXLDtG1lIE0uIEJlcmdlciI=?= <jeberger free.fr> writes:
Steven Schveighoffer wrote:
 Steven Schveighoffer Wrote:

 I'll have to double check, I thought I copied your code verbatim (I
 did switch around the arguments to be minA, maxA, minB, maxB to fit
 my test harness, and I changed the uint_32_t to uint). I'll check
 tomorrow at work where the computer I used to test is.

 I think I see the issue, I was using an older version of your
 highbit, at some point you changed it, but didn't repeat it in this
 solution, so I searched for it on the newsgroup and found your first
 version. That differs from later versions you posted, but I can't
 find where you identified there was a bug.

 I'll test again tomorrow, but it looks like that probably accounts
 for the problem.

 The version I was using:
 http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=108788

 The later version:
 http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=108855

 It looks like in that second post that Ali had the same problem I did.

 Next time, you should include your whole code each time, especially
 parts you changed.

for the C<->D translation). Sorry about that.

	Note that there is no bug: both versions are correct. The first
one uses 1-based indexes, while the second uses 0-based indexes and
processes 0 differently.

		Jerome
-- 
mailto:jeberger free.fr
http://jeberger.free.fr
Jabber: jeberger jabber.fr
Apr 13 2010
prev sibling next sibling parent Don <nospam nospam.com> writes:
Steven Schveighoffer wrote:
 On Mon, 12 Apr 2010 22:37:00 -0400, Steven Schveighoffer 
 <schveiguy yahoo.com> wrote:
 
 Jérôme M. Berger Wrote:

 Steven Schveighoffer wrote:
 Fails for test case:

 minA = 4, maxA = 4, minB = 4, maxB = 6 (outputs 7, accurate result
 is 6).

combinations in the [0, 63] range, so if there are mistakes they have
to be outside that range...

I'll have to double check, I thought I copied your code verbatim (I did switch around the arguments to be minA, maxA, minB, maxB to fit my test harness, and I changed the uint_32_t to uint). I'll check tomorrow at work where the computer I used to test is.

I confirm, with the updated highbit, your solution works.

Also, inspired by your solution (and using your highbit function,
which is very neat BTW) I came up with a one-liner. Enjoy :)

uint maxor(uint mina, uint maxa, uint minb, uint maxb)
{
   return maxa | maxb | ((1 << highbit(((1 <<
      (max(highbit(mina ^ maxa), highbit(minb ^ maxb)) + 1)) - 1)
      & maxa & maxb)) - 1);
}

Anyone want to do the equivalent minor?  I've spent way too much time
on this :)

-Steve

Awesome stuff, guys. I think that's worthy of inclusion as an exercise in Knuth's Volume 4, fascile 1. Suggest it to someone at ACCU? (Knuth has highbit(x) but calls it lambda(x). He notes the result that highbit(x)==highbit(y) iff x^y <= x&y).
Apr 13 2010
prev sibling next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Steven Schveighoffer:
 I never would have thought of doing a binary search for the high bit, it  
 is definitely a novel idea to me :)

You can learn something from this page (it can also be useful for the
translation to C++):
http://graphics.stanford.edu/~seander/bithacks.html

Bye,
bearophile
Apr 13 2010
prev sibling next sibling parent reply Clemens <eriatarka84 gmail.com> writes:
Adam Ruppe Wrote:

 Jerome's highbit function is the same as std.intrinsic.bsr. I wonder
 which is faster?
 
 I ran a test, and for 100 million iterations (1..10000000), the
 intrinsic beat out his function by a mere 300 milliseconds on my box.
 - highbit ran in an average of 1897 ms and bsr did the same in an
 average of 1534.
 
 Recompiling with -inline -O -release cuts the raw numbers about in
 half, but keeps about the same difference, leading me to think
 overhead amounts for a fair amount of the percentage instead of actual
 implementation. The new averages are 1134 and 853.

That's strange. Looking at src/backend/cod4.c, function cdbscan, in the dmd sources, bsr seems to be implemented in terms of the bsr opcode [1] (which I guess is the reason it's an intrinsic in the first place). I would have expected this to be much, much faster than a user function. Anyone care enough to check the generated assembly? [1] http://www.itis.mn.it/linux/quarta/x86/bsr.htm -- Clemens
Apr 13 2010
parent reply Don <nospam nospam.com> writes:
Adam D. Ruppe wrote:
 On Tue, Apr 13, 2010 at 11:10:24AM -0400, Clemens wrote:
 That's strange. Looking at src/backend/cod4.c, function cdbscan, in the dmd
sources, bsr seems to be implemented in terms of the bsr opcode [1] (which I
guess is the reason it's an intrinsic in the first place). I would have
expected this to be much, much faster than a user function. Anyone care enough
to check the generated assembly?

The opcode is fairly slow anyway (as far as opcodes go) - odds are the implementation inside the processor is similar to Jerome's method, and the main savings come from it loading fewer bytes into the pipeline. I remember a line from a blog, IIRC it was the author of the C++ FQA writing it, saying hardware and software are pretty much the same thing - moving an instruction to hardware doesn't mean it will be any faster, since it is the same algorithm, just done in processor microcode instead of user opcodes.

inlining max().
Apr 13 2010
parent Don <nospam nospam.com> writes:
Don wrote:
 Adam D. Ruppe wrote:
 On Tue, Apr 13, 2010 at 11:10:24AM -0400, Clemens wrote:
 That's strange. Looking at src/backend/cod4.c, function cdbscan, in 
 the dmd sources, bsr seems to be implemented in terms of the bsr 
 opcode [1] (which I guess is the reason it's an intrinsic in the 
 first place). I would have expected this to be much, much faster than 
 a user function. Anyone care enough to check the generated assembly?

The opcode is fairly slow anyway (as far as opcodes go) - odds are the implementation inside the processor is similar to Jerome's method, and the main savings come from it loading fewer bytes into the pipeline. I remember a line from a blog, IIRC it was the author of the C++ FQA writing it, saying hardware and software are pretty much the same thing - moving an instruction to hardware doesn't mean it will be any faster, since it is the same algorithm, just done in processor microcode instead of user opcodes.

inlining max().

Specifically, bsr is 7 uops on AMD, 1 uop on Intel since the original Pentium. AMD's performance is shameful. And bsr() is supported in the compiler; in fact DMC uses it extensively, which is why it's included in DMD!
Apr 13 2010
prev sibling parent "Jérôme M. Berger" <jeberger free.fr> writes:

Steven Schveighoffer wrote:
 using your highbit function, which is very neat BTW

which comes up regularly in programming discussions and newsgroups...

 uint maxor(uint mina, uint maxa, uint minb, uint maxb)
 {
     return maxa | maxb | ((1 << highbit(((1 << (max(highbit(mina ^
 maxa), highbit(minb ^ maxb)) + 1)) - 1) & maxa & maxb)) - 1);
 }

	Jerome
--
mailto:jeberger free.fr
http://jeberger.free.fr
Jabber: jeberger jabber.fr
Apr 13 2010
prev sibling parent reply Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
is it just me, or does this fail for

minA = 0
minB = 0
maxA = 0x80_00_00_00
maxB = 0x80_00_00_00

I'm pretty sure you need to handle 1 << 32 specially
Dec 04 2010
parent Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 12/04/2010 01:21 PM, Ellery Newcomer wrote:
 is it just me, or does this fail for

 minA = 0
 minB = 0
 maxA = 0x80_00_00_00
 maxB = 0x80_00_00_00

 I'm pretty sure you need to handle 1<<  32 specially

i'm thinking this is better:

uint32_t maxOr (uint32_t minA, uint32_t minB,
                uint32_t maxA, uint32_t maxB)
{
    assert (minA <= maxA);
    assert (minB <= maxB);

    if (maxA == 0) return maxB;
    if (maxB == 0) return maxA;

    uint32_t a = maxA ^ minA;
    if (a != 0) {
        a |= (1 << highbit (a)) - 1;
        a = a & maxA & maxB;
        if (a != 0)
            a = (1 << highbit (a)) - 1;
    }

    uint32_t b = maxB ^ minB;
    if (b != 0) {
        b |= (1 << highbit (b)) - 1;
        b = b & maxA & maxB;
        if (b != 0)
            b = (1 << highbit (b)) - 1;
    }

    return maxA|maxB|a|b;
}
Dec 04 2010
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Tue, Apr 13, 2010 at 10:52:14AM -0400, Steven Schveighoffer wrote:
 Just a note, this code should be runnable in C++, because the compiler is  
 written in C++.  Is bsr available there?

I just assumed since it was in D that it was probably in DMC too, but I
can't find it in the docs, so maybe not. Well, there's always inline asm :D

But 300 milliseconds over 100 million iterations is negligible anyway.

-- 
Adam D. Ruppe
http://arsdnet.net
Apr 13 2010
prev sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Tue, Apr 13, 2010 at 11:10:24AM -0400, Clemens wrote:
 That's strange. Looking at src/backend/cod4.c, function cdbscan, in the dmd
sources, bsr seems to be implemented in terms of the bsr opcode [1] (which I
guess is the reason it's an intrinsic in the first place). I would have
expected this to be much, much faster than a user function. Anyone care enough
to check the generated assembly?

The opcode is fairly slow anyway (as far as opcodes go) - odds are the
implementation inside the processor is similar to Jerome's method, and
the main savings come from it loading fewer bytes into the pipeline.

I remember a line from a blog, IIRC it was the author of the C++ FQA
writing it, saying hardware and software are pretty much the same thing -
moving an instruction to hardware doesn't mean it will be any faster,
since it is the same algorithm, just done in processor microcode instead
of user opcodes.

-- 
Adam D. Ruppe
http://arsdnet.net
Apr 13 2010
prev sibling next sibling parent Adam Ruppe <destructionator gmail.com> writes:
Jerome's highbit function is the same as std.intrinsic.bsr. I wonder
which is faster?

I ran a test, and for 100 million iterations (1..10000000), the
intrinsic beat out his function by a mere 300 milliseconds on my box.
- highbit ran in an average of 1897 ms and bsr did the same in an
average of 1534.

Recompiling with -inline -O -release cuts the raw numbers about in
half, but keeps about the same difference, leading me to think
overhead amounts for a fair amount of the percentage instead of actual
implementation. The new averages are 1134 and 853.

I've gotta say, your implementation of the function is better than I
would have done though. Without bsr, I probably would have looped,
shifted, and tested each individual bit...

On 4/13/10, Steven Schveighoffer <schveiguy yahoo.com> wrote:
 On Mon, 12 Apr 2010 22:37:00 -0400, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:

 Jérôme M. Berger Wrote:

 Steven Schveighoffer wrote:
 Fails for test case:

 minA = 4, maxA = 4, minB = 4, maxB = 6 (outputs 7, accurate result is
 6).

combinations in the [0, 63] range, so if there are mistakes they
have to be outside that range...

I'll have to double check, I thought I copied your code verbatim (I did
switch around the arguments to be minA, maxA, minB, maxB to fit my test
harness, and I changed the uint_32_t to uint). I'll check tomorrow at
work where the computer I used to test is.

I confirm, with the updated highbit, your solution works.
Also, inspired by your solution (and using your highbit function, which is
 very neat BTW) I came up with a one-liner.  Enjoy :)

 uint maxor(uint mina, uint maxa, uint minb, uint maxb)
 {
      return maxa | maxb | ((1 << highbit(((1 << (max(highbit(mina ^ maxa),
 highbit(minb ^ maxb)) + 1)) - 1) & maxa & maxb)) - 1);
 }

 Anyone want to do the equivalent minor?  I've spent way too much time on
 this :)

 -Steve

Apr 13 2010
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 13 Apr 2010 09:56:34 -0400, Adam Ruppe <destructionator gmail.com>  
wrote:

 Jerome's highbit function is the same as std.intrinsic.bsr. I wonder
 which is faster?

Just a note, this code should be runnable in C++, because the compiler is written in C++. Is bsr available there?
 Recompiling with -inline -O -release cuts the raw numbers about in
 half, but keeps about the same difference, leading me to think
 overhead amounts for a fair amount of the percentage instead of actual
 implementation. The new averages are 1134 and 853.

Probably the inlining accounts for the savings here. Calling a function is rather expensive.
 I've gotta say, your implementation of the function is better than I
 would have done though. Without bsr, I probably would have looped,
 shifted, and tested each individual bit...

I never would have thought of doing a binary search for the high bit, it
is definitely a novel idea to me :)

-Steve
Apr 13 2010
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Sat, Apr 10, 2010 at 04:05:06PM -0500, Andrei Alexandrescu wrote:
 The minimum for unsigned numbers is very simple: min(mina, minb).

mina = 1
minb = 0

min(1, 0) == 0

But, 1|0 == 1

max(mina, minb) looks to work though. Consider that ORing can only set
bits - it can never take a one and spit out a zero. Therefore,
x|y >= x && x|y >= y, which only holds for the max(mina, minb) bound.

-- 
Adam D. Ruppe
http://arsdnet.net
Apr 10 2010
prev sibling next sibling parent PBa <spam goto-hell.com> writes:
Am 10.04.2010, 23:17 Uhr, schrieb Don <nospam nospam.com>:

 Rainer Deyke wrote:
 On 4/10/2010 11:52, Andrei Alexandrescu wrote:
 I think this would work:

 uint maxOR(uint maxa, uint maxb) {
     if (maxa < maxb) return maxOR(maxb, maxa);
     uint candidate = 0;
     foreach (i, bit; byBitDescending(maxa)) {
         if (bit) continue;
         auto t = candidate | (1 << (31 - i));
         if (t <= maxb) candidate = t;
     }
     return maxa | candidate;
 }

all the bits in 'maxb' (excluding the leading zeros). If 'maxb' is
exactly 1 less than a power of 2, then 'candidate' will be 0.

Now consider a in [0, 16], b in [0, 15]. Your function will produce 16,
but the real maximum is 31.

For maximum accuracy, you have to consider the minima as well as the
maxima when calculating the new maximum. With 'a' in [16, 16] and 'b' in
[16, 16], the new range can only be [16, 16]. With 'a' in [0, 16] and
'b' in [0, 16], the correct new range is [0, 31].

It's a very interesting puzzle. There are some pretty complex cases. Consider (all numbers binary) a in [1010, 1100] and b in [101,101]. The maximum value is when a=1011, b=101, so max is 1111.

How about this?

import std.stdio, std.conv, std.intrinsic;

void main(string[] args)
{
    int total = 0, match = 0,
        from = to!(int)(args[1]), to = to!(int)(args[2]);
    uint res, resBF;

    for(int i = from; i < to; ++i) {
        for(int j = i; j < to; ++j) {
            for(int k = from; k < to; ++k) {
                for(int m = k; m < to; ++m) {
                    ++total;
                    res = maxOr(i, j, k, m);
                    resBF = maxOrBF(i, j, k, m);
                    if(res == resBF)
                        ++match;
                    //else writefln("mismatch: minA %s, maxA %s, minB %s, maxB %s, maxOr %s, maxOrBF %s", i, j, k, m, res, resBF);
                }
            }
        }
    }
    writefln("total %s, match %s", total, match);
}

uint max(uint a, uint b) { return (a > b) ? a : b; }

uint maxOr(uint minA, uint maxA, uint minB, uint maxB)
{
    uint result = minA | maxA | minB | maxB,
         diff = max(maxA - minA, maxB - minB);
    if(diff) {
        result |= (1 << (bsr(diff) + 1)) - 1;
    }
    return result;
}

/* compute result using brute force */
uint maxOrBF(uint minA, uint maxA, uint minB, uint maxB)
{
    return bitsSetInIntervallBF(minA, maxA) | bitsSetInIntervallBF(minB, maxB);
}

uint bitsSetInIntervallBF(uint min, uint max)
{
    uint result = 0;
    for(; min <= max; ++min) {
        result |= min;
    }
    return result;
}
Apr 09 2010
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Sun, Apr 11, 2010 at 10:00:41PM -0400, Steven Schveighoffer wrote:
 We are talking about range propagation, a function of the compiler, not a
function of the compiled program.  Therefore, slower but more exact functions
should be preferred, since they only affect the compile time.

I agree with this. While my solution has a certain beauty to it and is
fast, your solution accurately covers all the bases, whereas mine fails
with min > 1. I say we go with yours.

-- 
Adam D. Ruppe
http://arsdnet.net
Apr 11 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 12 Apr 2010 13:45:14 -0400, Jérôme M. Berger <jeberger free.fr>  
wrote:

 Steven Schveighoffer wrote:
 Jérôme M. Berger Wrote:

 Jérôme M. Berger wrote:
 	OK, I found a solution that is optimal in over 99% of the cases
 (99.195% to be precise), while still being fast (because it avoids
 looping over the bits):

0..299 without checking, just for the timing. On my computer (Athlon X2 64 2GHz), my function took 50s and Andrei's took 266s. The difference should be even bigger for larger numbers.

Can I ask, what is the point of having a non-exact solution (unless you
are just doing this for fun)?

We are talking about range propagation, a function of the compiler, not a
function of the compiled program.  Therefore, slower but more exact
functions should be preferred, since they only affect the compile time.

Note that you will only need to do this range propagation on lines that
"or" two values together, and something that reduces the runtime of the
compiler by 216 seconds, but only when compiling enough code to have over
8 billion 'or' operations in it (300^4), I don't think is really that
important.  Give me the exact solution that prevents me from having to
cast when the compiler can prove it, even if the runtime of the compiler
is slightly slower.

speed of the D compiler to that of the Go compiler...

My point was simply that the amount of time saved is relative to the size of the program being compiled. If we assume conservatively that every other line in a program has bitwise or in it, then you are talking about a 16 billion line program, meaning the 216 seconds you save is insignificant compared to the total compile time. My point was that your fast solution that is inaccurate is not preferable because nobody notices the speed, they just notice the inaccuracy.
 	We are talking about range propagation, a function of the compiler,
 not a function of the compiled program. Since we can't get a 100%
 accurate representation of the possible values anyway (even yours
 might leave holes in the middle after all), then it might be better
 to choose a faster, slightly less precise algorithm if the
 difference is not too great. That's the difference between a
 compiler and a full-fledged static code analysis and program prover.

When we're talking about the difference between O(1) and O(lgn), I'll take accuracy over speed in my compiler any day. The solution should be 100% accurate for the problem statement. If the problem statement is not what we need, then we need a new problem statement :) Solving the problem statement for 99% of values is not good enough.
 	Anyway, the point is moot, I have a new version of my algorithm
 with 100% precision and high speed.

Yes, I'm still trying to understand how it works :) -Steve
Apr 12 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 12 Apr 2010 13:56:23 -0400, Jérôme M. Berger <jeberger free.fr>  
wrote:

 	Here's a fast 100% precise solution:

 ==============================8<------------------------------
 uint32_t maxOr (uint32_t minA, uint32_t minB,
                 uint32_t maxA, uint32_t maxB)
 {
    assert (minA <= maxA);
    assert (minB <= maxB);

    if (maxA == 0) return maxB;
    if (maxB == 0) return maxA;

    uint32_t a = maxA ^ minA;
    if (a != 0) {
       a = ((1 << (highbit (a)+1)) - 1) & maxA & maxB;
       if (a != 0)
          a = (1 << highbit (a)) - 1;
    }

    uint32_t b = maxB ^ minB;
    if (b != 0) {
       b = ((1 << (highbit (b)+1)) - 1) & maxA & maxB;
       if (b != 0)
          b = (1 << highbit (b)) - 1;
    }

    return maxA|maxB|a|b;
 }
 ------------------------------>8==============================

Fails for test case:

minA = 4, maxA = 4, minB = 4, maxB = 6 (outputs 7, accurate result is 6).

I'm not quite sure what you are trying to do, but I think you are on the
right path to getting a faster solution.

-Steve
Apr 12 2010
prev sibling next sibling parent Fawzi Mohamed <fawzi gmx.ch> writes:
On 12-apr-10, at 21:40, Steven Schveighoffer wrote:

 On Mon, 12 Apr 2010 13:45:14 -0400, Jérôme M. Berger
 <jeberger free.fr> wrote:

 Steven Schveighoffer wrote:
 Jérôme M. Berger Wrote:

 Jérôme M. Berger wrote:
 	OK, I found a solution that is optimal in over 99% of the cases
 (99.195% to be precise), while still being fast (because it avoids
 looping over the bits):

 in 0..299 without checking, just for the timing. On my computer (Athlon
 X2 64 2GHz), my function took 50s and Andrei's took 266s. The
 difference should be even bigger for larger numbers.

 Can I ask, what is the point of having a non-exact solution (unless
 you are just doing this for fun)?

 We are talking about range propagation, a function of the compiler,
 not a function of the compiled program. Therefore, slower but more
 exact functions should be preferred, since they only affect the
 compile time. Note that you will only need to do this range
 propagation on lines that "or" two values together, and something
 that reduces the runtime of the compiler by 216 seconds, but only
 when compiling enough code to have over 8 billion 'or' operations in
 it (300^4), I don't think is really that important. Give me the
 exact solution that prevents me from having to cast when the
 compiler can prove it, even if the runtime of the compiler is
 slightly slower.

 speed of the D compiler to that of the Go compiler...

 My point was simply that the amount of time saved is relative to the
 size of the program being compiled. If we assume conservatively
 that every other line in a program has bitwise or in it, then you
 are talking about a 16 billion line program, meaning the 216 seconds
 you save is insignificant compared to the total compile time. My
 point was that your fast solution that is inaccurate is not
 preferable because nobody notices the speed, they just notice the
 inaccuracy.

 	We are talking about range propagation, a function of the compiler,
 not a function of the compiled program. Since we can't get a 100%
 accurate representation of the possible values anyway (even yours
 might leave holes in the middle after all), then it might be better
 to choose a faster, slightly less precise algorithm if the
 difference is not too great. That's the difference between a
 compiler and a full-fledged static code analysis and program prover.

 When we're talking about the difference between O(1) and O(lgn),
 I'll take accuracy over speed in my compiler any day. The solution
 should be 100% accurate for the problem statement. If the problem
 statement is not what we need, then we need a new problem
 statement :) Solving the problem statement for 99% of values is not
 good enough.

 	Anyway, the point is moot, I have a new version of my algorithm
 with 100% precision and high speed.

 Yes, I'm still trying to understand how it works :)

Sorry for the probably stupid question, but I don't understand much
the need of all this range propagation: in the compiler either you
have a constant (and then you have the value at compile time, and you
don't need any range propagation, you can compare with the value), or
you have a runtime value.

Do you want to explicitly add in the compiler the support for more
limited runtime values?  Otherwise the range of a runtime value is a
priori the whole possible range, and thus any rule based on range
propagation might be expressed as a static type based rule (as done in
C).

You can gain something: for example you can know that summing 4 shorts
you will never overflow an int, is this where you want to go?  What
kind of bugs are you trying to avoid?  Or is it simply having the max
and min properties defined?

Nobody commented on my proposal, but there I see the potential for
bugs: (1u-2)+1UL is 4294967296, and this happens also at runtime if
you have, for example, a function returning size_t and another
returning uint, and you combine them without thinking much and then
you switch to 64 bits...

There I see a problem, but this you see without any special range
propagation, just thinking that subtraction or negation of unsigned
types is modulo 2^bit size, and thus cannot be then changed to another
size without explicit cast.

Maybe in some cases using enums range propagation might spare a cast,
but is it really an improvement?

I guess that I am missing something obvious, so I don't see the reason
for range propagation, but maybe I am not the only one, so thanks for
an explanation...

Fawzi
Apr 13 2010
prev sibling next sibling parent Fawzi Mohamed <fawzi gmx.ch> writes:


On 13-apr-10, at 12:01, bearophile wrote:

 Fawzi Mohamed:
 I guess that I am missing something obvious, so I don't see the  
 reason
 for range propagation, but maybe I am not the only one, so thanks for
 an explanation...

Range propagation can improve the situation a little, so it's not bad. But it can't replace proper integral overflows. You need integral overflows at runtime.

integral overflows are helpful only if you have automatic conversion to
a larger type, but that breaks the compile time knowledge of the size of
such an integer, so that you have to assume that it might need to be
pushed to the heap.
Yes you might use some tricks (like using size_t.sizeof*8-1 bits, or a
pointer to spare some place), but I don't think that D wants to go that
way for the basic integers...
 Bye,
 bearophile

Apr 13 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 12 Apr 2010 22:37:00 -0400, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 Jérôme M. Berger Wrote:

 Steven Schveighoffer wrote:
 Fails for test case:

 minA = 4, maxA = 4, minB = 4, maxB = 6 (outputs 7, accurate result is
 6).

combinations in the [0, 63] range, so if there are mistakes they have to
be outside that range...

I'll have to double check, I thought I copied your code verbatim (I did switch around the arguments to be minA, maxA, minB, maxB to fit my test harness, and I changed the uint_32_t to uint). I'll check tomorrow at work where the computer I used to test is.

I confirm, with the updated highbit, your solution works.

Also, inspired by your solution (and using your highbit function, which
is very neat BTW) I came up with a one-liner.  Enjoy :)

uint maxor(uint mina, uint maxa, uint minb, uint maxb)
{
     return maxa | maxb | ((1 << highbit(((1 << (max(highbit(mina ^ maxa),
highbit(minb ^ maxb)) + 1)) - 1) & maxa & maxb)) - 1);
}

Anyone want to do the equivalent minor?  I've spent way too much time on
this :)

-Steve
Apr 13 2010
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 13 Apr 2010 13:37:13 -0400, Jérôme M. Berger <jeberger free.fr>  
wrote:

 Steven Schveighoffer wrote:
 Jérôme M. Berger Wrote:

 Steven Schveighoffer wrote:
 When we're talking about the difference between O(1) and O(lgn), I'll
 take accuracy over speed in my compiler any day.

a minimal loss of accuracy, which will you take? Especially if the accuracy loss is less than is lost elsewhere (due to holes in the ranges).

Really?  You rebuilt the compiler with your range propagation algorithm
and verified that it adds 10 seconds versus an accurate one that adds
55s?  How much time did the compiler spend to compile?  I'd hazard to
guess that a code base that adds 10s worth of your algorithm takes at
least a few hours to compile.  Is 55s that bad at that point?

Again, if it takes the compiler an extra insignificant amount of time to
be more accurate, I'm all for accuracy over speed when you get down to
that level of insignificance.  I'd say the point of pain has to be at
least 10% of the compile time before it makes any sizable difference.

times slower just because it brings extra precision which you may not really need, then you will wind up with a compiler that is 5 to 6 times slower than it needs to be. Sure the difference on *one* function is not great in absolute terms, but if you make the same choice for *all* functions, then where do you go?

In my opinion?  Yes, slower compilers that make code easier to write are
better.  I don't spend lots of time compiling, I spend it writing code.
And I don't need to babysit the compiler, it goes off and does its thing.
Performance is only important in the end result.

I'm not saying I want my compiler to be slow, but I want it to be
accurate and useful more than I want it to be quick.  In this case,
there is a quick and fully accurate solution, so it doesn't matter.
But, for instance, if the compiler could do a full analysis to check if
variables escape their scope, and that makes the compiler 5x slower,
then I'd rather have the compiler verify my work instead of quickly
producing memory-corrupting code.

-Steve
Apr 13 2010
prev sibling next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:
 byte c = a | b;
 Say you already know min_a, max_a, min_b, and max_b. How do you compute 
 min_c and max_c? I thought of it for a bit and it seems quite tricky.

This is Interval arithmetic. If you are able to find a book about it...
(or a library for some programming language that implements it).

Bye,
bearophile
Apr 10 2010
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Sat, Apr 10, 2010 at 12:39:22PM -0500, Andrei Alexandrescu wrote:
 Yah, that works for unsigned types. For signed types, it's tricky (as 
 tricky as max).

My feeling is to go ahead and cast to unsigned, then do the calculation. For me at least, when doing bitwise ops, I assume it is unsigned anyway.
 Even on the branch that doesn't use the sum, that's still just a bound. 
 Consider:
 
 a = 8;
 b = 4;
 
 Then max(a|b) is 12, but computed with your method is 15.

Yeah, you're right. Bah. Maybe:

max_a = max(a, b) | (2 ^^ numbits(min(a, b)) - 1);

8 | 7 == 15;

Same deal, still wrong. Maybe we can throw in one more thing:

let f(a, b) = the above;
max_c = min(f(a, b), a | b);

Let me test it... it seems to work. Here's the D program I used to
brute-force the test:

===
import std.intrinsic; // for bsr()

uint max(uint a, uint b) { return (a >= b) ? a : b; }
uint min(uint a, uint b) { return (a >= b) ? b : a; }

uint numbits(uint a) { return a == 0 ? a : bsr(a) + 1; }

unittest {
	assert(numbits(0) == 0);
	assert(numbits(1) == 1);
	assert(numbits(2) == 2);
	assert(numbits(3) == 2);
	assert(numbits(4) == 3);
	assert(numbits(5) == 3);
}

uint f(uint a, uint b) {
	return max(a, b) | (1 << numbits(min(a, b)) - 1);
}

uint max_a(uint a, uint b) {
	return min(f(a, b), a | b);
}

void main() {
	for(uint a = 0; a <= 256; a++)
	for(uint b = 0; b <= 256; b++)
		assert(a|b <= max_a(a, b));
}
=======

I compiled with the unittests, and it closed without tripping any
asserts. Eyeballing the output looks right too.

-- 
Adam D. Ruppe
http://arsdnet.net
Apr 10 2010
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Sat, Apr 10, 2010 at 02:57:59PM -0400, Adam D. Ruppe wrote:
 Let me test it... it seems to work. Here's the D program I used to brute-force
 the test:

The main() function there isn't an ideal test. Here's a better one:

void main() {
	for(uint max_a = 0; max_a < 100; max_a++)
	for(uint max_b = 0; max_b < 100; max_b++)
	for(uint a = 0; a <= max_a; a++)
	for(uint b = 0; b <= max_b; b++)
		assert(a|b <= max_c(max_a, max_b));
}
// note that I renamed uint max_a to max_c.

We try a whole load of max_a and max_b and test every possible value. It
still runs without throwing.

My original function for min_c works too:

uint min_c(uint a, uint b) { return max(a, b); }

And you can brute force test it all:

void main() {
	for(uint min_a = 0; min_a < 50; min_a++)
	for(uint min_b = 0; min_b < 50; min_b++)
	for(uint max_a = min_a; max_a < 100; max_a++)
	for(uint max_b = min_b; max_b < 100; max_b++)
	for(uint a = min_a; a <= max_a; a++)
	for(uint b = min_b; b <= max_b; b++) {
		assert(a|b <= max_c(max_a, max_b));
		assert(a|b >= min_c(min_a, min_b));
	}
}

This, as you can expect, takes a while to run, but does so without
throwing. I'm pretty confident in the functions now.

-- 
Adam D. Ruppe
http://arsdnet.net
Apr 10 2010
prev sibling next sibling parent reply BCS <none anon.com> writes:
Hello Andrei,

 Consider:
 
 byte c = a | b;
 
 Say you already know min_a, max_a, min_b, and max_b. How do you
 compute min_c and max_c? I thought of it for a bit and it seems quite
 tricky.
 

x_bit_max == "the largest bit in x_max ^ x_min"
x_oth_max == x_bit_max - 1;

c_max = max(a_bit_max | b_oth_max, b_bit_max | a_oth_max)
      | (a_max & ~(a_bit_max | a_oth_max));

-- 
... <IXOYE><
Apr 10 2010
parent reply BCS <none anon.com> writes:
Hello BCS,

 Hello Andrei,
 
 Consider:
 
 byte c = a | b;
 
 Say you already know min_a, max_a, min_b, and max_b. How do you
 compute min_c and max_c? I thought of it for a bit and it seems quite
 tricky.
 

x_oth_max == x_bit_max - 1;

// or make that
c_min = a_max & ~(a_bit_max | a_oth_max);
c_max = max(a_bit_max | b_oth_max, b_bit_max | a_oth_max) | c_min;
 c_max = max(a_bit_max | b_oth_max, b_bit_max | a_oth_max) | (a_max &
 ~(a_bit_max | a_oth_max));
 

-- ... <IXOYE><
Apr 10 2010
parent BCS <none anon.com> writes:
Hello BCS,

Scratch all that, I'm clueless. :b
 
-- 
... <IXOYE><
Apr 10 2010
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Sat, Apr 10, 2010 at 02:55:36PM -0400, bearophile wrote:
 Time ago I have even suggested to disallow bitwise ops when one or both
operands are signed...

I understand the position, but I don't advocate it just because I think it
would be annoying and not give much of a benefit. Consider the following:

for(int a = -10; a < 10; a++)
    writeln(a&1 ? "odd" : "even");

I'd just be annoyed if I had to cast a to uint just to do that. Note that it
works on negative numbers too, so it isn't strictly wrong to do it on signed
ints. I use this when doing stripes on outputted tables and stuff like that.

a % 2 does the same thing, so this specific case isn't a big deal, but I
still tend to type &1 rather than %2 out of habit (this kind of thing used to
make a real speed difference back in the DOS days when I learned it all!)

-- 
Adam D. Ruppe
http://arsdnet.net
Apr 10 2010
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Sat, Apr 10, 2010 at 02:57:59PM -0400, Adam D. Ruppe wrote:
 Let me test it... it seems to work. Here's the D program I used to brute-force
 the test:

Oh no! Eyeballing again showed a problem... I should have put parens in my
asserts:

a | b < f(a,b)  !=  (a|b) < f(a,b)

That breaks it. Back to the drawing board.

-- 
Adam D. Ruppe
http://arsdnet.net
Apr 10 2010
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Sat, Apr 10, 2010 at 04:02:05PM -0400, Adam D. Ruppe wrote:
 That breaks it. Back to the drawing board.

I *might* have it, but I'm not 100% confident in my test program. Here's my
implementation:

====
import std.intrinsic;

uint max(uint a, uint b) { return (a >= b) ? a : b; }
uint min(uint a, uint b) { return (a >= b) ? b : a; }
uint numbits(uint a) { return a == 0 ? a : (bsr(a) + 1); }

uint max_c(uint a, uint b) {
    uint mn = min(a, b);
    uint mx = max(a, b);

    if(mn == 0)
        return mx;

    // if there is a bit we can reuse, it grants us everything before it too
    uint reusable = mn & mx;
    if(reusable)
        return (mx | ((1 << numbits(reusable)) - 1)) | mn;
    else
        return mx | mn;
}

uint min_c(uint min_a, uint min_b) { return max(min_a, min_b); }
=====

I can't believe I spent all day on this...

-- 
Adam D. Ruppe
http://arsdnet.net
Apr 10 2010
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Sat, Apr 10, 2010 at 07:38:07PM -0400, Adam D. Ruppe wrote:
 I *might* have it, but I'm not 100% confident in my test program.

I think my test program is telling me something: a non-zero min_c subtracts
from max_c. If I take my brute-force empirically calculated max_c and compare
it to what my max_c function returns, it doesn't always match if the minimums
change.

This might be a bug; it is unintuitive that it would do this, but it actually
makes sense. Consider if min == max: you only have one value to work with.

let max_a = 4, min_a = 0; max_b = 4

You can get 7 out of it, since 3 < 4, so it is a possible number and:

100 | 011 == 7

But, if min_b == 4, the only possible value for b is 4. And:

100 | 100 == 4.

Increasing min_b does indeed decrease the max you can get. Nice fact, but
I've spent too much time on this already, so I'll call it done with this:

My max_c included some unnecessary code: my reusable mask obsoletes all of my
special case code! So, we can bring the program down to three lines:

import std.intrinsic;
uint numbits(uint a) { return a == 0 ? a : (bsr(a) + 1); }
uint max_c(uint a, uint b) { return a | ((1 << numbits(a&b)) - 1) | b; }

It passes my test, though since I bugged it up the first time, I might have
done it again.

-- 
Adam D. Ruppe
http://arsdnet.net
Apr 10 2010
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Sat, Apr 10, 2010 at 08:42:54PM -0500, Andrei Alexandrescu wrote:
 What results does it yield with my main() test harness?

total=100000000; equal=14585400 (14.5854%)

I think that's a perfect result. Here's why: the equality comparison checks
two specific values, (a, b), but the maxOr functions return the max for the
interval ([0, a], [0, b]). It wouldn't always be equal, since there are other
values in there that can reach the max.

For example, a = 4, b = 4. 100|100 == 100, but that range also includes 4|3,
100 | 011 == 111, which is the max.

Your maxOR function gives the same percentage, for probably the same reason.
Though, mine runs ~7x faster on my box.

-- 
Adam D. Ruppe
http://arsdnet.net
Apr 10 2010
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Sat, Apr 10, 2010 at 09:21:07PM -0500, Andrei Alexandrescu wrote:
 Yah, it's clear that we're not looking anywhere near 100%. I'm a bit sad 
 we don't have an adequate baseline, but the fact that two distinct 
 methods obtained the same result is encouraging.

My brute-force program was buggy, but I think I fixed it. Here's the main():

========
uint min_a = 0;
uint min_b = 0;

for(uint max_a = min_a; max_a <= 256; max_a++)
for(uint max_b = min_b; max_b <= 256; max_b++) {
    uint real_max = 0;
    for(uint a = min_a; a <= max_a; a++)
    for(uint b = min_b; b <= max_b; b++) {
        assert((a|b) <= max_c(max_a, max_b),
            format("%b | %b !<= %b (from %b %b)",
                a, b, max_c(max_a, max_b), max_a, max_b));
        assert((a|b) >= min_c(min_a, min_b));
        if((a | b) > real_max)
            real_max = (a|b);
    }

    assert(max_c(max_a, max_b) == real_max,
        format("[%b, %b]. expected: %d, got %d (min == %d)",
            max_a, max_b, max_c(max_a, max_b), real_max,
            min_c(min_a, min_b)));
}
writeln("success");
=======

With so many nested loops, it is slow as dirt, but what it does is try every
value in the given range and record the maximum number encountered. If it
doesn't match at any step, it throws. But, it works for me. I spent a chunk
of the day eyeballing the output too, to ensure it is right (and to figure
out the pattern that led to the final function).

Now, the thing is, if you change min_a or min_b, it breaks the real_max;
like I said in the other message, if min == max, for example, you get the
case where 4|4 isn't the max, since 4|3 isn't available.

I'm not sure how to fix that, but I am pretty convinced that to get the
perfect solution, the max_c function needs to know min_a and min_b too.
Still, this solution works very well in the min_a = min_b = 0 case, so it is
at least a decent bound.
 On to the next task: compute minOR and maxOR for _signed_ values. It 
 turns out minOR its also nontrivial...

Nontrivial is an understatement. I started asking if it can even be done
without assuming a size... I do think you can, by assuming the sizes are
equal, whatever they are, but it still is not easy. I still like the idea of
just casting it. How often would it cause problems in real code anyway?

The only starting point I have is: if either max_a or max_b is negative, the
result of the OR is going to be negative, since that sign bit is one, and we
can't change a one to a zero in any OR operation. So, the max value will be
positive iff both values are possibly positive.

But taking that one statement to working code is too hard for my blood.
 When we're done, we can pass the functions to Walter so he can integrate 
 them within his compiler. Right now the compiler uses a wrong algorithm.

Indeed. We're getting there, but still a ways to go.

-- 
Adam D. Ruppe
http://arsdnet.net
Apr 10 2010
prev sibling parent Fawzi Mohamed <fawzi gmx.ch> writes:
On 13-apr-10, at 12:02, Don wrote:

 Fawzi Mohamed wrote:
 On 12-apr-10, at 21:40, Steven Schveighoffer wrote:
 On Mon, 12 Apr 2010 13:45:14 -0400, Jérôme M. Berger wrote:

 Steven Schveighoffer wrote:
 Jérôme M. Berger Wrote:

 Jérôme M. Berger wrote:
    OK, I found a solution that is optimal in over 99% of the cases
 (99.195% to be precise), while still being fast (because it avoids
 looping over the bits):

pairs in 0..299 without checking, just for the timing. On my computer
(Athlon X2 64 2GHz), my function took 50s and Andrei's took 266s. The
difference should be even bigger for larger numbers.

 Can I ask, what is the point of having a non-exact solution (unless you
 are just doing this for fun)? We are talking about range propagation, a
 function of the compiler, not a function of the compiled program.
 Therefore, slower but more exact functions should be preferred, since
 they only affect the compile time. Note that you will only need to do
 this range propagation on lines that "or" two values together, and
 something that reduces the runtime of the compiler by 216 seconds, but
 only when compiling enough code to have over 8 billion 'or' operations
 in it (300^4), I don't think is really that important. Give me the
 exact solution that prevents me from having to cast when the compiler
 can prove it, even if the runtime of the compiler is slightly slower.

speed of the D compiler to that of the Go compiler...

 My point was simply that the amount of time saved is relative to the
 size of the program being compiled. If we assume conservatively that
 every other line in a program has bitwise or in it, then you are
 talking about a 16 billion line program, meaning the 216 seconds you
 save is insignificant compared to the total compile time. My point was
 that your fast solution that is inaccurate is not preferable because
 nobody notices the speed, they just notice the inaccuracy.

    We are talking about range propagation, a function of the compiler,
 not a function of the compiled program. Since we can't get a 100%
 accurate representation of the possible values anyway (even yours might
 leave holes in the middle after all), then it might be better to choose
 a faster, slightly less precise algorithm if the difference is not too
 great. That's the difference between a compiler and a full-fledged
 static code analysis and program prover.

 When we're talking about the difference between O(1) and O(lg n), I'll
 take accuracy over speed in my compiler any day. The solution should be
 100% accurate for the problem statement. If the problem statement is
 not what we need, then we need a new problem statement :) Solving the
 problem statement for 99% of values is not good enough.

    Anyway, the point is moot, I have a new version of my algorithm
 with 100% precision and high speed.

 Yes, I'm still trying to understand how it works :)

 the need of all this range propagation, in the compiler either you have
 a constant (and then you have the value at compile time, and you don't
 need any range propagation, you can compare with the value), or you
 have a runtime value.

 It's been part of DMD2 for a while now. It allows you to do things
 like:

 ubyte lowbits(int x) { return x & 7; }

 without an explicit cast. The compiler knows that x&7 can safely fit
 inside a single byte. Whereas ((x&7) << 12) | (x&3); does not fit, and
 requires an explicit cast.

ah ok, I understand, I thought that was treated like x & cast(ubyte)7, and
so as comparison of the compile-time value with the ranges of ubyte (no
range propagation needed).
But I can understand why it is treated as cast(ubyte)((cast(int)x) & 7),
as it is probably easier for the compiler, as it upcasts by default.
Thanks for the explanation.
Apr 13 2010