digitalmars.D.learn - BitArray shift left/right confusion.

Bastiaan Veelo (23/23) Dec 27 2017 I suppose the following is not a bug, but confusing it is:

Biotronic (35/58) Dec 27 2017 BitArray is apparently a mess. As you've pointed out it prints

Bastiaan Veelo (5/6) Dec 28 2017 Thanks for your confirmation, digging and reporting issues
Jakub =?UTF-8?B?xYFhYmFq?= (44/61) Feb 01 2018 I was looking at the issues you reported and was pretty surprised

Bastiaan Veelo (7/17) Feb 03 2018 The same confusion existed amongst the authos, that pull was done

Bastiaan Veelo <Bastiaan Veelo.net> writes:

I suppose the following is not a bug, but confusing it is:

```
void main()
{
     import std.stdio;
     import std.bitmanip;
     BitArray ba = [1, 1, 1, 1, 1, 1, 1, 1];
     writeln(ba);	// [1, 1, 1, 1, 1, 1, 1, 1]
     ba >>= 4;		// right shift
     writeln(ba);	// [1, 1, 1, 1, 0, 0, 0, 0] bits shifted left
}```

I suppose this is because the array is printed left-to-right, 
whereas the bits in a byte are typically ordered right-to-left. I 
suppose I should interpret the bits in the array to increase in 
significance with increasing index (little endian) and that 
right-shift means a shift towards less significance (which is to 
the right in big endian).

The documentation of <<= and >>= [1] however just talks about 
left and right, without defining left and right or clarifying 
that the directions are reversed from how the array is printed.

Is there something I have missed?

[1] 
https://dlang.org/phobos/std_bitmanip.html#.BitArray.opOpAssign.2

Dec 27 2017

Biotronic <simen.kjaras gmail.com> writes:

On Wednesday, 27 December 2017 at 18:08:19 UTC, Bastiaan Veelo 
wrote:
 I suppose the following is not a bug, but confusing it is:

 ```
 void main()
 {
     import std.stdio;
     import std.bitmanip;
     BitArray ba = [1, 1, 1, 1, 1, 1, 1, 1];
     writeln(ba);	// [1, 1, 1, 1, 1, 1, 1, 1]
     ba >>= 4;		// right shift
     writeln(ba);	// [1, 1, 1, 1, 0, 0, 0, 0] bits shifted left
 }```

 I suppose this is because the array is printed left-to-right, 
 whereas the bits in a byte are typically ordered right-to-left. 
 I suppose I should interpret the bits in the array to increase 
 in significance with increasing index (little endian) and that 
 right-shift means a shift towards less significance (which is 
 to the right in big endian).

 The documentation of <<= and >>= [1] however just talks about 
 left and right, without defining left and right or clarifying 
 that the directions are reversed from how the array is printed.

 Is there something I have missed?

 [1] 
 https://dlang.org/phobos/std_bitmanip.html#.BitArray.opOpAssign.2

BitArray is apparently a mess. As you've pointed out it prints 
the bits in the wrong order. I won't mince words here, since D 
has binary literals on the form 0b10011110. Put that in a 
BitArray and print it with the format string "0b%b", and you'll 
get 0b01111001. While it may have been intentional, it's bug 
prone and confusing, and so definitely a bug.

It also fucks up royally when it has an exact multiple of 32 bits 
in its buffer, overwriting the last word with 0s when you try and 
shift it in any way.

It also doesn't remove set bits outside of its covered area when 
cast to size_t[]. That is, if I do 
cast(size_t[])(BitArray([1,1,1,1,1,1,1,1]) << 4), the result will 
be something like [4080], which corresponds to 
[0b0000_1111_1111_0000].

Lastly (and this is pointed out explicitly in the documentation, 
but still smells if you ask me), it will overwrite bits in the 
words it covers, even if it does not cover those exact bits.

The first two are definitely bugs. The last two are results of 
the intended use case for BitArray, I believe. The documentation 
doesn't explicitly point this out, but it seems BitArray is 
intended to give a bit-by-bit view of an area of memory that is 
actually some other type. Something like this:

struct S {
    int n;
    float f;
}

void foo(ref S s) {
     import std.bitmanip;
     auto a = BitArray((&s)[0..1], S.sizeof);
     a[7] = true; // Actually sets the corresponding bit in s.
}

--
   Biotronic

Dec 27 2017

Bastiaan Veelo <Bastiaan Veelo.net> writes:

On Wednesday, 27 December 2017 at 20:45:49 UTC, Biotronic wrote:
 BitArray is apparently a mess.

Thanks for your confirmation, digging and reporting issues
https://issues.dlang.org/show_bug.cgi?id=18133 and
https://issues.dlang.org/show_bug.cgi?id=18134

(Turns out we both live in the same country).

Dec 28 2017

Jakub =?UTF-8?B?xYFhYmFq?= <uaaabbjjkl gmail.com> writes:

I was looking at the issues you reported and was pretty surprised 
how some of the features work. So I'm going to revive this thread 
with an explanation.

On Wednesday, 27 December 2017 at 20:45:49 UTC, Biotronic wrote:
 On Wednesday, 27 December 2017 at 18:08:19 UTC, Bastiaan Veelo 
 wrote:
 I suppose the following is not a bug, but confusing it is:

 ```
 void main()
 {
     import std.stdio;
     import std.bitmanip;
     BitArray ba = [1, 1, 1, 1, 1, 1, 1, 1];
     writeln(ba);	// [1, 1, 1, 1, 1, 1, 1, 1]
     ba >>= 4;		// right shift
     writeln(ba);	// [1, 1, 1, 1, 0, 0, 0, 0] bits shifted left
 }```

 I suppose this is because the array is printed left-to-right, 
 whereas the bits in a byte are typically ordered right-to-left.


BitArray stores its data in a `size_t[]` array - on 64-bit it 
consists of 8-byte units, each unit having its beginning at the 
least significant bit. So it works as you would probably expect:

bool[128] bits;
auto ba = BitArray(bits); // one bool corresponds to one bit in 
this constructor
ba[0] = 1;
ba[127] = 1;

Now, the exact representation of the array in this BitArray is as 
follows:
index:  | 0          | 1          |
bit no: | 64 ... 1 0 | 63 ... 1 0 |
values: | 0  ... 0 1 | 1  ... 0 0 |

You can get this array using cast:
size_t[] exact = cast(size_t[]) ba;
assert(exact[0] == 1);
assert(exact[1] == size_t(1) << 63);

Printing works in the human, i.e. logical, way:
writeln(ba); // [1, 0, ..., 0, ..., 0, 1]

BitArray has also a constructor taking a raw representation of 
the underlying array:
size_t val = 0b1111_0000;
auto ba2 = BitArray([val], 8);
writeln(ba2); // [0, 0, 0, 0, 1, 1, 1, 1]

It may be surprising, as it's opposite order of the binary 
literal. But this works as intended because it's supposed to be 
the inversion of casting:
assert(cast(size_t[]) ba2 == [val]);
It also works like that e.g. in Java's BitSet and C++'s bitset.

On the other hand, shifting operators are equally confusing for 
me, as they are for you - they really work in the other way 
around! I thought this is a very weird bug, but I found this pull 
request: https://github.com/dlang/phobos/pull/2844, which states 
this is the intended behaviour.

I don't know if there is anybody that would expect this - it's 
inconsistent with any other bitset implementation I know from 
other languages, as well as with what logic suggests. I'm curious 
if the authors changed their minds in this regard and there is 
any chance for that to be rewritten?

And besides, it's pretty bugged, indeed.

Feb 01 2018

Bastiaan Veelo <Bastiaan Veelo.net> writes:

On Friday, 2 February 2018 at 00:33:03 UTC, Jakub Łabaj wrote:
 On the other hand, shifting operators are equally confusing for 
 me, as they are for you - they really work in the other way 
 around! I thought this is a very weird bug, but I found this 
 pull request: https://github.com/dlang/phobos/pull/2844, which 
 states this is the intended behaviour.

 I don't know if there is anybody that would expect this - it's 
 inconsistent with any other bitset implementation I know from 
 other languages, as well as with what logic suggests. I'm 
 curious if the authors changed their minds in this regard and 
 there is any chance for that to be rewritten?

The same confusion existed amongst the authos, that pull was done 
in reaction to 
https://github.com/dlang/phobos/pull/2797#discussion-diff-22456052.

So the shift direction seems to be as intended (by Martin Nowak 
at least) but the docs should be clear in what order the bits are 
printed and indexed. I guess.

Feb 03 2018

D Programming

C/C++ Programming

Other

digitalmars.D.learn - BitArray shift left/right confusion.