digitalmars.D.learn - BitArray/BitFields

digitalmars.D.learn - BitArray/BitFields - Review

Era Scarecrow (60/60) Jul 28 2012 BitFields:

Jonathan M Davis (7/11) Jul 28 2012 I'm not particularly familar with BitArray, so I don't know what its opC...
Dmitry Olshansky (15/55) Jul 28 2012 Great! But I strongly suggest to repost it in d.D newsgroup as it has

Jonathan M Davis (23/29) Jul 28 2012 I would point out that while hasSlicing doesn't currently check for it, ...

Dmitry Olshansky (5/16) Jul 28 2012 BitArray isn't and shouldn't be a range. It's a container.

Jonathan M Davis (5/22) Jul 28 2012 Probably nothing. I'm not all that familiar with BitArray. If it's not a...

Era Scarecrow (4/9) Jul 28 2012 Which is one of the main reasons I am hesitant to have a

Jonathan M Davis (12/22) Jul 28 2012 Well, it seems that my comment was somewhat misplaced in that BitArray i...

Era Scarecrow (32/45) Jul 28 2012 An array suggests you can control the size of the array, however

Era Scarecrow (7/8) Jul 28 2012 My filter/formatter got me again...
Dmitry Olshansky (21/63) Jul 28 2012 A strange jump to conclusions? Why can't array know it's size in bits

Era Scarecrow (18/57) Jul 28 2012 Knowing the size based on if it's compact or the array.length *

Dmitry Olshansky (28/41) Jul 28 2012 Mhm? Not getting you it at all. What's locking it ?

Era Scarecrow (20/46) Jul 28 2012 I'd rather not rely on the pointer at all as a reference for if

Dmitry Olshansky (35/89) Jul 28 2012 No problem, I redesigned a bit:

Era Scarecrow (28/105) Jul 28 2012 2^63, or drop 8 so it takes up no more than 2^55 bytes of memory.

Dmitry Olshansky (54/113) Jul 28 2012 OK. Now back to the biggest question:

Era Scarecrow (26/86) Jul 29 2012 If it was a previous slice, then it never chances outside and
Era Scarecrow (23/28) Jul 29 2012 I've had a thought. Why not make it fully reference and at the

Dmitry Olshansky (5/31) Jul 29 2012 Hm... and small array optimizations go where? BitArraySlice won't be a

Era Scarecrow (13/23) Jul 29 2012 I was thinking about that; If need be the slice can point to

Dmitry Olshansky (11/33) Jul 29 2012 I have simpler suggestion. Maybe doing it as 2 containers:

Era Scarecrow (10/18) Jul 29 2012 As long as you only use one or the other then it sounds great;

Dmitry Olshansky (9/29) Jul 29 2012 Sounds a lot like mixin template. I'd try to invent some small set of

Era Scarecrow (5/18) Jul 29 2012 Pretty much... syntactical sugar, just add tea (T) I made a

Era Scarecrow (10/18) Jul 29 2012 Another thought coming to mind, is to have the BitArray with a

Era Scarecrow (66/68) Jul 30 2012 A question regarding templates. A template with different

Dmitry Olshansky (6/50) Jul 30 2012 Not sure what you would like to accomplish here.

Era Scarecrow (45/51) Jul 30 2012 Than an example...

Era Scarecrow (2/3) Jul 30 2012 http://d.puremagic.com/issues/show_bug.cgi?id=8475
Dmitry Olshansky (10/30) Jul 30 2012 You can go for simpler separation:

Era Scarecrow (8/19) Jul 30 2012 I'll give it a try, it may very well be what I need; although I

Era Scarecrow (21/28) Jul 30 2012 Tests with 'in' doesn't help, only ref and catching a temporary

Dmitry Olshansky (5/32) Jul 30 2012 By default XT is deduced as X!(current value of smth).XT

Era Scarecrow (25/28) Jul 30 2012 Doesn't really fix it...

Philippe Sigaud (37/58) Jul 30 2012 Correct. They have no reason, in general, too even generate the same cod...

Era Scarecrow (17/45) Jul 30 2012 Anything of template X! previous post of mine shows sort of an

Philippe Sigaud (22/51) Jul 30 2012 No. Use a 3-params template or a tuple:

Era Scarecrow (16/43) Jul 30 2012 I don't know how many arguments it will have (depends on how

Era Scarecrow (33/38) Dec 31 2012 As BitArray is coming back up and my resumed work I'll comment a

Dmitry Olshansky (68/103) Jan 02 2013 Personally I believe that if we introduce a slice of a BitArray as a

Era Scarecrow (74/166) Jan 02 2013 Could do that I suppose, but then either exact length or

Dmitry Olshansky (30/201) Jan 02 2013 Hm, I'd think that having Slice type be:

Era Scarecrow (26/65) Jan 03 2013 I know we've had this discussion before. So let's assume you do

Dmitry Olshansky (31/102) Jan 03 2013 No surprise here. Basically container != range over it. It all flows

Era Scarecrow (23/111) Jan 03 2013 Hmmm.. shouldn't try to make it as close to an actual array as

Dmitry Olshansky (28/76) Jan 03 2013 Appending a slice *to* BitArray is perfectly fine as in fact any range

Era Scarecrow (19/63) Jan 03 2013 So the slice is a non-allocating view of data some data. I can

Era Scarecrow (6/10) Jan 06 2013 Nearly done. Got a working LZW, multi-level huffman working,

Dmitry Olshansky (4/14) Jan 06 2013 Sure thing! I'd love to check it out.

Era Scarecrow (5/13) Jan 06 2013 Here you go. It's about 16k currently (5k for LZW, 10k for MLH).

Dmitry Olshansky (5/18) Jan 07 2013 Thanks, unfortunately I got busy with my own stuff ... bookmarked for

Era Scarecrow (27/39) Jul 28 2012 I was thinking of not bothering Andrei or Walter while i fixed

Dmitry Olshansky (12/48) Jul 28 2012 Obviously that was me (blackwhale) ;)

bearophile (13/15) Jul 28 2012 Good, that's useful.

Era Scarecrow (17/30) Jul 28 2012 4123 - fixed

bearophile (27/41) Jul 29 2012 Era Scarecrow:

Era Scarecrow (23/63) Jul 29 2012 If it has to determine between compact and non-compact, then I

bearophile (37/59) Jul 29 2012 Maybe it's better to consider two use cases: dynamic length

Don Clugston (4/12) Jul 31 2012 See bug 4717. On x86, even on 32 bits you can get close to 1 cycle per

Era Scarecrow (7/7) Aug 02 2012 Currently since I've been concentrating more on bitfields and

David Nadlinger (4/6) Aug 02 2012 Hm? You can just close pull requests (button at the bottom of the

Era Scarecrow (7/12) Aug 02 2012 Most of this doesn't make sense to me still; just one of those
Era Scarecrow (5/7) Aug 03 2012 So I've removed the pull, but the auto tester is still pulling

Era Scarecrow (24/31) Aug 02 2012 versions of the code, that get compiled conditionally according
Era Scarecrow (44/51) Aug 02 2012 How would you tell them apart? If I know how you may want to call
Era Scarecrow (56/56) Jan 09 2013 Having insomnia has left me with a certain gifted curse. :P

Era Scarecrow (23/37) Jan 09 2013 Hmm as I'm thinking about it more, a mixin for BitString may be

Era Scarecrow (18/20) Jan 09 2013 I'm having troubles trying to understand why this won't work.

Dmitry Olshansky (19/37) Jan 10 2013 Overall direction is good (I mean separation of ranges/containers). I'll...

Era Scarecrow (40/45) Jan 10 2013 That was the idea, yes.

Dmitry Olshansky (5/47) Jan 10 2013 Sounds good and seductively simple.

Era Scarecrow (25/28) Jan 10 2013 Why not? I don't see a reason why not personally. Umm maybe I

Era Scarecrow (13/13) Jan 13 2013 Well some minor update. From the looks of how it is going to be

Era Scarecrow (45/45) Jan 16 2013 Well got a few curious problems. Slicing doesn't seem it wants

Dmitry Olshansky (17/59) Jan 17 2013 Well, the slice was invalidated by shrinking an array. I don't expect

Era Scarecrow (18/48) Jan 17 2013 It just seems like shrinking/expanding the array should still

Dmitry Olshansky (6/20) Jan 17 2013 I'm thinking that a opSlice of stack-allocated must be @system and a

H. S. Teoh (7/9) Jan 17 2013 [...]
Era Scarecrow (10/12) Jan 17 2013 That's just a small part of the problem. With the new design,

Dmitry Olshansky (21/31) Jan 18 2013 Slices should have a pointer and length so they can check range (in

Era Scarecrow (14/29) Jan 18 2013 So write an alternate opIndex...

Dmitry Olshansky (15/41) Jan 19 2013 opBinaryAssign --> opOpAssign. opSlice/opSlice assign.

Era Scarecrow (143/155) Jan 19 2013 Yes I did when I could; The slice versions can only forwards

Era Scarecrow (14/14) Aug 06 2014 (Moved from: What have I missed?)

H. S. Teoh via Digitalmars-d-learn (11/27) Aug 06 2014 Since this is about contributing to Phobos, probably a better place to

Era Scarecrow (12/19) Aug 06 2014 Yeah posted in my 'what have i missed?' as well...

Era Scarecrow (3/9) Aug 06 2014 Looks like that pull is just the bitfields... but the actual
H. S. Teoh via Digitalmars-d-learn (17/38) Aug 06 2014 Hold on a sec... that's a pull for your own fork of Phobos. You need to

Era Scarecrow (17/30) Aug 06 2014 I did submit it against the original Phobos, and the auto tester

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

BitFields:
https://github.com/rtcvb32/phobos/commit/729c96e2f20ddd0c05a1afb488030b5c4db3e012

new BitArray Overhead functions/templates:
https://github.com/rtcvb32/phobos/commit/55aaf82a39f3901f03f5f1ac26fd175c5b2cd167

BitArray:
https://github.com/rtcvb32/phobos/commit/568c0d1966208afe70cedcaddc3bb880dc3d481d

  I'd like opinions of the code I've submitted, and certain 
shortcomings that are present that appropriate changes would be 
needed to fix. Only one of these has been a problem during 
debugging and that's due to references. (All three patches must 
be present to work)

BitArray:
1) Slices use references (sometimes) to previous bitarray. Since 
it's a slice (like an array) this could be expected.
Solutions:
  a) Drop the union, make all functions  safe (compared to 
 trusted), and they are all ref/slices
  b) Leave as sometimes ref, sometimes not; Use .dup when you NEED 
to ensure different unique copies.
  c) Apply COW (Copy-On-Write) where a known slice type (canExpand 
is false) when a change would apply, it replaces the present 
slice into it's own unique array.
  d) All slices become new dup'ed allocated arrays (Most expensive 
of them all)

2) opCast disabled: Anyone using opCast their code will break, 
however depending on it's usage that may not be an issue. If 
anyone can recommend alternative opCast(s) that won't prevent 
normal use of casting, then they will be reimplemented.

3) dim - An artifact of the previous BitArray, and I think needs 
to be removed as it has limited use.

4) Certain operators (opCat, opCat_r) I can't replace with 
opBinary/opBinary_r.

5) Examples of additional use cases/construction that it 
currently fails and should work.

6) Added rawWrite (for primitives), usefulness testing and 
opinions.

7) opApply, toHash, opCmp, opEqual tested in various environments 
(Like associative arrays) for problems, along with any possible 
const issues.

BitFields:
  I've concentrated on adding default values. A set of functions 
previously present string functions 'myXXX' were present, why 
were they present to begin with? Was it a limitation of what was 
available at the time? Or was it to make compiling faster so it 
didn't rely on other library functions? Or were the string 
functions not available at the time? They might be removed once 
this is truly answered.

//currently should work
struct A
{
   mixin(bitfields!(
     uint, "x",    2,
     int,  "y",    3,
     uint, "z=1",  2,
     bool, "flag=true", 1));
}
A obj;
assert(obj.x == 0);
assert(obj.z == 1);    //default
assert(obj.flag == 1); //true/false should also work

Jul 28 2012

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday, July 28, 2012 22:41:48 Era Scarecrow wrote:
 2) opCast disabled: Anyone using opCast their code will break,
 however depending on it's usage that may not be an issue. If
 anyone can recommend alternative opCast(s) that won't prevent
 normal use of casting, then they will be reimplemented.

I'm not particularly familar with BitArray, so I don't know what its opCast 
would convert to, but opCast will only convert to the types that you let it, 
so if you need to constrain what types it works with, then use template 
constraints to do that. If, on the other hand, opCast just doesn't make any 
sense at all, then just don't have it.

- Jonathan M Davis

Jul 28 2012

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 29-Jul-12 00:41, Era Scarecrow wrote:
 BitFields:
 https://github.com/rtcvb32/phobos/commit/729c96e2f20ddd0c05a1afb488030b5c4db3e012


 new BitArray Overhead functions/templates:
 https://github.com/rtcvb32/phobos/commit/55aaf82a39f3901f03f5f1ac26fd175c5b2cd167


 BitArray:
 https://github.com/rtcvb32/phobos/commit/568c0d1966208afe70cedcaddc3bb880dc3d481d


   I'd like opinions of the code I've submitted, and certain shortcomings
 that are present that appropriate changes would be needed to fix. Only
 one of these has been a problem during debugging and that's due to
 references. (All three patches must be present to work)

Great! But I strongly suggest to repost it in d.D newsgroup as it has
wider audience and is more appropriate for reviews.

 BitArray:
 1) Slices use references (sometimes) to previous bitarray. Since it's a
 slice (like an array) this could be expected.
 Solutions:
   a) Drop the union, make all functions  safe (compared to  trusted),
 and they are all ref/slices
   b) Leave as sometimes ref, sometimes not; Use .dup when you NEED to
 ensure different unique copies.
   c) Apply COW (Copy-On-Write) where a known slice type (canExpand is
 false) when a change would apply, it replaces the present slice into
 it's own unique array.
   d) All slices become new dup'ed allocated arrays (Most expensive of
 them all)

My solution is make slices different type e.g. BitSlice.
It's always reference to the original BitArray. Array itself still can 
be either compact or not. All operations with slice go to array (and 
still check if they can use word-aligned speed up).

How does it sound?

 2) opCast disabled: Anyone using opCast their code will break, however
 depending on it's usage that may not be an issue. If anyone can
 recommend alternative opCast(s) that won't prevent normal use of
 casting, then they will be reimplemented.

 3) dim - An artifact of the previous BitArray, and I think needs to be
 removed as it has limited use.

 4) Certain operators (opCat, opCat_r) I can't replace with
 opBinary/opBinary_r.

opCat isn't it just operator "~" and "~=" ? You can use opOpAssign for 
"~=" IRC.

 5) Examples of additional use cases/construction that it currently fails
 and should work.

 6) Added rawWrite (for primitives), usefulness testing and opinions.

 7) opApply, toHash, opCmp, opEqual tested in various environments (Like
 associative arrays) for problems, along with any possible const issues.

 BitFields:
   I've concentrated on adding default values. A set of functions
 previously present string functions 'myXXX' were present, why were they
 present to begin with?

I suspect to!string wasn't CTFEable. Regardless you can catch a word 
from Andrei Alexandrescu on the main D newsgroup who (I think) came up 
with it in the first place.


-- 
Dmitry Olshansky

Jul 28 2012

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday, July 29, 2012 00:58:59 Dmitry Olshansky wrote:
 My solution is make slices different type e.g. BitSlice.
 It's always reference to the original BitArray. Array itself still can
 be either compact or not. All operations with slice go to array (and
 still check if they can use word-aligned speed up).
 
 How does it sound?

I would point out that while hasSlicing doesn't currently check for it, if 
opSlice doesn't return a type which is assignable to the original range, it 
won't work in a lot of Phobos functions. I keep meaning to bring that up for 
discussion in the main newsgroup. I'd argue that hasSlicing really be changed 
from

    enum bool hasSlicing = !isNarrowString!R && is(typeof(
    (inout int _dummy=0)
    {
        R r = void;
        auto s = r[1 .. 2];
        static assert(isInputRange!(typeof(s)));
    }));

to something like

    enum bool hasSlicing = !isNarrowString!R && is(typeof(
    (inout int _dummy=0)
    {
        R r = void;
        auto s = r[1 .. 2];
        static assert(isInputRange!(typeof(s)));
        r = s;
    }));

- Jonathan M Davis

Jul 28 2012

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 29-Jul-12 01:07, Jonathan M Davis wrote:
 On Sunday, July 29, 2012 00:58:59 Dmitry Olshansky wrote:
 My solution is make slices different type e.g. BitSlice.
 It's always reference to the original BitArray. Array itself still can
 be either compact or not. All operations with slice go to array (and
 still check if they can use word-aligned speed up).

 How does it sound?

 I would point out that while hasSlicing doesn't currently check for it, if
 opSlice doesn't return a type which is assignable to the original range, it
 won't work in a lot of Phobos functions. I keep meaning to bring that up for
 discussion in the main newsgroup.

BitArray isn't and shouldn't be a range. It's a container.
So BitSlice is a range over BitArray. What am I missing?



-- 
Dmitry Olshansky

Jul 28 2012

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday, July 29, 2012 01:19:49 Dmitry Olshansky wrote:
 On 29-Jul-12 01:07, Jonathan M Davis wrote:
 On Sunday, July 29, 2012 00:58:59 Dmitry Olshansky wrote:
 My solution is make slices different type e.g. BitSlice.
 It's always reference to the original BitArray. Array itself still can
 be either compact or not. All operations with slice go to array (and
 still check if they can use word-aligned speed up).
 
 How does it sound?

 
 I would point out that while hasSlicing doesn't currently check for it, if
 opSlice doesn't return a type which is assignable to the original range,
 it
 won't work in a lot of Phobos functions. I keep meaning to bring that up
 for discussion in the main newsgroup.

 
 BitArray isn't and shouldn't be a range. It's a container.
 So BitSlice is a range over BitArray. What am I missing?

Probably nothing. I'm not all that familiar with BitArray. If it's not a 
range, it's not a problem. I assumed that it was and was just pointing out a 
potential problem based on that assumption.

- Jonathan M Davis

Jul 28 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Saturday, 28 July 2012 at 21:07:31 UTC, Jonathan M Davis wrote:
 I would point out that while hasSlicing doesn't currently check 
 for it, if opSlice doesn't return a type which is assignable to 
 the original range, it won't work in a lot of Phobos functions. 
 I keep meaning to bring that up for discussion in the main 
 newsgroup. I'd argue that hasSlicing really be changed

  Which is one of the main reasons I am hesitant to have a 
separate range struct (of BitSlice)... Hmmm maybe we should keep 
it here for a little while, and talk it over a bit more.

Jul 28 2012

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday, July 28, 2012 23:23:30 Era Scarecrow wrote:
 On Saturday, 28 July 2012 at 21:07:31 UTC, Jonathan M Davis wrote:
 I would point out that while hasSlicing doesn't currently check
 for it, if opSlice doesn't return a type which is assignable to
 the original range, it won't work in a lot of Phobos functions.
 I keep meaning to bring that up for discussion in the main
 newsgroup. I'd argue that hasSlicing really be changed

 
   Which is one of the main reasons I am hesitant to have a
 separate range struct (of BitSlice)... Hmmm maybe we should keep
 it here for a little while, and talk it over a bit more.

Well, it seems that my comment was somewhat misplaced in that BitArray isn't a 
range but a container. My comment was directed at opSlice on ranges. Container 
types should return whatever range type is appropriate. That will usually be a 
struct of some variety. I'd have to study BitArray and your changes to 
determine what the appropriate type was here, but it sounds like you were 
talking about duping the contents with opSlice, which would be highly 
irregular. Normally, ranges over containers are light wrappers which can 
iterate through and potentially alter the elements in the container, and 
duping up an array as the range type doesn't work like that. But depending on 
what BitArray is doing exactly, it might be appropriate. I don't know.

- Jonathan M Davis

Jul 28 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Saturday, 28 July 2012 at 21:30:21 UTC, Jonathan M Davis wrote:

 Well, it seems that my comment was somewhat misplaced in that 
 BitArray isn't a range but a container. My comment was directed 
 at opSlice on ranges. Container types should return whatever 
 range type is appropriate. That will usually be a struct of 
 some variety. I'd have to study BitArray and your changes to 
 determine what the appropriate type was here, but it sounds 
 like you were talking about duping the contents with opSlice, 
 which would be highly irregular. Normally, ranges over 
 containers are light wrappers which can iterate through and 
 potentially alter the elements in the container, and duping up 
 an array as the range type doesn't work like that. But 
 depending on what BitArray is doing exactly, it might be 
 appropriate. I don't know.

  An array suggests you can control the size of the array, however 
if you are forced to work in 32increments (or more if 64bit), 
that makes it highly hard to control without making it a slice to 
begin with. Consider without slices (range control)

  BitArray ba; //size 32/64bit

  ba.length = 42; //life, the universe and everything
  assert(ba.length == 42); //fails, can't set length except by 
increments of 32/64

  ba ~= true; //what's the size now? Or are all appending 
operators invalid unless it's a slice?

  foreach(b; ba) {
    //fails, can't go over range.
  }

  auto slice = ba[]; //converts to BitSlice

  foreach(b; slice) {
     //works
  }

  BitSlice func(BitArray ba);
  BitSlice func(BitSlice ba); //two versions of function needed? 
If convertable between the two, then you only lose the 
begining/ending length offsets.

  T func(T)(T ba) if (is(T : BitArray)) // correct? Potential loss 
of information.

  BitSlice x = ba[1 .. 10];
  BitArray y = func(x); //range lost, beginning/ending at max 
regardless of size



  Honestly that doesn't add up to what I call an 'array'. It 
should look like an array, act like an array, and besides not 
being able to specify the array as separate from the entire 
BitArray (Ie const(int)) it's otherwise should not be 
distinguishable from an array.

Jul 28 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Saturday, 28 July 2012 at 21:39:56 UTC, Era Scarecrow wrote:
 BitArray (Ie const(int)) it's otherwise should not be

  My filter/formatter got me again...

   Honestly that doesn't add up to what I call an 'array'. It 
should look like an array, act like an array, and besides not 
being able to specify the array as separate from the entire 
BitArray (ie: const(int)[]) it's otherwise should be 
indistinguishable from an array template-wise.

Jul 28 2012

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 29-Jul-12 01:39, Era Scarecrow wrote:
 On Saturday, 28 July 2012 at 21:30:21 UTC, Jonathan M Davis wrote:

 Well, it seems that my comment was somewhat misplaced in that BitArray
 isn't a range but a container. My comment was directed at opSlice on
 ranges. Container types should return whatever range type is
 appropriate. That will usually be a struct of some variety. I'd have
 to study BitArray and your changes to determine what the appropriate
 type was here, but it sounds like you were talking about duping the
 contents with opSlice, which would be highly irregular. Normally,
 ranges over containers are light wrappers which can iterate through
 and potentially alter the elements in the container, and duping up an
 array as the range type doesn't work like that. But depending on what
 BitArray is doing exactly, it might be appropriate. I don't know.

   An array suggests you can control the size of the array, however if
 you are forced to work in 32increments (or more if 64bit), that makes it
 highly hard to control without making it a slice to begin with. Consider
 without slices (range control)

A strange jump to conclusions? Why can't array know it's size in bits 
and capacity in words?

   BitArray ba; //size 32/64bit

   ba.length = 42; //life, the universe and everything
   assert(ba.length == 42); //fails, can't set length except by
 increments of 32/64

   ba ~= true; //what's the size now? Or are all appending operators
 invalid unless it's a slice?

43 ? Why should it use 32/64 bit increments?

   foreach(b; ba) {
     //fails, can't go over range.
   }

works as foreach takes [] when needed automagically :)
   auto slice = ba[]; //converts to BitSlice

   foreach(b; slice) {
      //works
   }

   BitSlice func(BitArray ba);
   BitSlice func(BitSlice ba); //two versions of function needed? If
 convertable between the two, then you only lose the begining/ending
 length offsets.

Just make it a template if it works with any range.

   T func(T)(T ba) if (is(T : BitArray)) // correct? Potential loss of
 information.

T func(R)(R ba)
  if(isRandomAccessRange!R && is(Unqual!(ElementType!R) == bool))

Would be far more general. As a bonus it will work with built-in arrays 
of bool or anything compatible.

   BitSlice x = ba[1 .. 10];
   BitArray y = func(x); //range lost, beginning/ending at max regardless
 of size

Certain functions would need to take a BitArray if they need to change 
it size, append, prepand etc.

   Honestly that doesn't add up to what I call an 'array'. It should look
 like an array, act like an array, and besides not being able to specify
 the array as separate from the entire BitArray (Ie const(int)) it's
 otherwise should not be distinguishable from an array.

It's damn hard to emulate built-in array. And I'd rather not to see this 
emulation half-working.

And in say C++, std::vector (and most arrays in other languages I know 
about) is not sliceble at all or slices are different type, or they do 
copy the whole thing. So I'm not sure what kind of array you are used 
to.  D's built-in arrays are rare beasts with their own issues, even 
with the fair deal of built-in language/runtime support they have.

-- 
Dmitry Olshansky

Jul 28 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Saturday, 28 July 2012 at 21:49:45 UTC, Dmitry Olshansky wrote:
 On 29-Jul-12 01:39, Era Scarecrow wrote:
 An array suggests you can control the size of the array, 
 however if you are forced to work in 32increments (or more if 
 64bit), that makes it highly hard to control without making it 
 a slice to begin with. Consider without slices (range control)

 A strange jump to conclusions? Why can't array know it's size 
 in bits and capacity in words?

  Knowing the size based on if it's compact or the array.length * 
bits would be easy, but to do slices you drop the 
beginning/ending reference pointers, locking it at 32/64bits. If 
you keep that information, why not just include slicing along 
with the rest of it?

  BitArray ba; //size 32/64bit

  ba.length = 42; //life, the universe and everything
  assert(ba.length == 42); //fails, can't set length except by
 increments of 32/64

  ba ~= true; //what's the size now? Or are all appending 
 operators
 invalid unless it's a slice?

 43 ? Why should it use 32/64 bit increments?

  See above.

 works as foreach takes [] when needed automagically :)

  I thought about that after I submitted it, so my mistake on that.

 Just make it a template if it works with any range.

 T func(R)(R ba)
  if(isRandomAccessRange!R && is(Unqual!(ElementType!R) == bool))

 Would be far more general. As a bonus it will work with 
 built-in arrays of bool or anything compatible.

  True, but you'd still need to specifically convert the array to 
a slice first. So you'd call it func(ba[]), or func(ba.range) or 
func(ba[0 .. $]). Kinda seems silly to require the extra bit 
there.

  BitSlice x = ba[1 .. 10];
  BitArray y = func(x); //range lost, beginning/ending at max 
 regardless
 of size

 Certain functions would need to take a BitArray if they need to 
 change it size, append, prepand etc.

  See first reply.

 It's damn hard to emulate built-in array. And I'd rather not to 
 see this emulation half-working.

 And in say C++, std::vector (and most arrays in other languages 
 I know about) is not sliceble at all or slices are different 
 type, or they do copy the whole thing. So I'm not sure what 
 kind of array you are used to. D's built-in arrays are rare 
 beasts with their own issues, even with the fair deal of 
 built-in language/runtime support they have.

  Obviously it can't emulate it perfectly, but doing all the basic 
operations should be acceptable. I don't know all of what arrays 
require, but opIndex and length are the largest part to my 
opinion, which I guess is just randomAccessRange (or whatever 
it's called)

Jul 28 2012

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 29-Jul-12 02:08, Era Scarecrow wrote:
 On Saturday, 28 July 2012 at 21:49:45 UTC, Dmitry Olshansky wrote:
 On 29-Jul-12 01:39, Era Scarecrow wrote:
 An array suggests you can control the size of the array, however if
 you are forced to work in 32increments (or more if 64bit), that makes
 it highly hard to control without making it a slice to begin with.
 Consider without slices (range control)

 A strange jump to conclusions? Why can't array know it's size in bits
 and capacity in words?

   Knowing the size based on if it's compact or the array.length * bits
 would be easy, but to do slices you drop the beginning/ending reference
 pointers, locking it at 32/64bits.

Mhm? Not getting you it at all. What's locking it ?

What's wrong with:
struct BitArray
{

union {
	struct {
		size_t[] data;
		ulong length;	//in bits

//length in words would be just
// (length + size_t.sizeof*4) / // (size_t.sizeof*8);
// as it's a power of 2 it should get optimized

	}
	size_t[2+ulong.sizeof/size_t.sizeof] compact;

// and say 1 bit is used for isCompact would be great if it could use // 
the least significant bit of pointer from data[], dunno if it's easy enough
}

}

then Range would be:

struct Range{
	BitArray* array;
	ulong start, end;
}

to mark [start..end] piece of it?

If you keep that information, why not
 just include slicing along with the rest of it?

I'm not seeing slicing easily integrated just because length is known.

-- 
Dmitry Olshansky

Jul 28 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Saturday, 28 July 2012 at 22:19:05 UTC, Dmitry Olshansky wrote:
 Mhm? Not getting you it at all. What's locking it ?

 What's wrong with:
 struct BitArray
 {

 union {
 	struct {
 		size_t[] data;
 		ulong length;	//in bits

 //length in words would be just
 // (length + size_t.sizeof*4) / // (size_t.sizeof*8);
 // as it's a power of 2 it should get optimized

 	}
 	size_t[2+ulong.sizeof/size_t.sizeof] compact;

 // and say 1 bit is used for isCompact would be great if it 
 could use // the least significant bit of pointer from data[], 
 dunno if it's easy enough

  I'd rather not rely on the pointer at all as a reference for if 
it's compact. If the array is changed later to a template that 
could accept bytes, then where does that leave you? When the 
slice doesn't need to know more of the beginning, it can re-slice 
it to a lesser more appropriate size. It may very well soon 
contain a third entry in the union for main operations and use 
the larger size_t just for canUseBulk (Actually that sounds 
better to me).

  Besides, separating the slice suggests 'length' is not part of 
the union struct. Sure it's convertible to [0 .. length], but 
it's no longer convertible back to BitArray (Unless you make a 
whole new array and conversion/copying which takes a lot of 
time). If you do that, why not do BitSlice from the beginning 
which holds BitArray, but then if you do that why make them 
separate at all? It doesn't make sense to me. I can see making a 
separate range for the front/popFront/empty if you really wanted 
to use those, but it wouldn't change how it would work.

 then Range would be:

 struct Range{
 	BitArray* array;
 	ulong start, end;
 }

 to mark [start..end] piece of it?

 If you keep that information, why not just include slicing 
 along with the rest of it?

 I'm not seeing slicing easily integrated just because length is 
 known.

  Beginning and ending are known, so slicing can be added at no 
extra cost.

Jul 28 2012

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 29-Jul-12 02:31, Era Scarecrow wrote:
 On Saturday, 28 July 2012 at 22:19:05 UTC, Dmitry Olshansky wrote:
 Mhm? Not getting you it at all. What's locking it ?

 What's wrong with:
 struct BitArray
 {

 union {
     struct {
         size_t[] data;
         ulong length;    //in bits

 //length in words would be just
 // (length + size_t.sizeof*4) / // (size_t.sizeof*8);
 // as it's a power of 2 it should get optimized

     }
     size_t[2+ulong.sizeof/size_t.sizeof] compact;

 // and say 1 bit is used for isCompact would be great if it could use
 // the least significant bit of pointer from data[], dunno if it's
 easy enough

   I'd rather not rely on the pointer at all as a reference for if it's
 compact.

No problem, I redesigned a bit:

struct BitArray
{

union {
     struct {
         ulong length;    //in bits, the top bit used as isCompact
         size_t[] data;
     }
     struct{
	ushort smallLength; //here as well
	ubyte[size_t.sizeof*2+ulong.sizeof-2] compact;
     }
}

Up 2^63 of bits would occupy 2^60 of bytes should be good enough for the 
years to come.
  I'm not seeing it as a problem as at very least processors have to 
actually be able to address more then 42 (or was it 48?) bits that they 
can now. (see AMD or Intel or ARM datasheets on 64 bit chips)

If the array is changed later to a template that could accept
 bytes, then where does that leave you?
 When the slice doesn't need to
 know more of the beginning,

???
Slice can start at some random offset

  it can re-slice it to a lesser more
 appropriate size.

no idea what's you are talking about here.

  It may very well soon contain a third entry in the
 union for main operations and use the larger size_t just for canUseBulk
 (Actually that sounds better to me).

Again I'm having hard time to get what follows from where. Can you 
expand on this point please?
Preferably with some code as it was far easier to reason about.


   Besides, separating the slice suggests 'length' is not part of the
 union struct.

How? All containers know their length.

Sure it's convertible to [0 .. length], but it's no longer
 convertible back to BitArray (Unless you make a whole new array and
 conversion/copying which takes a lot of time).

Yes it's not convertible. And it's no problem and is intended. Otherwise 
like you said it doesn't make a lot of sense.

If you do that, why not
 do BitSlice from the beginning which holds BitArray, but then if you do
 that why make them separate at all?
 It doesn't make sense to me. I can
 see making a separate range for the front/popFront/empty if you really
 wanted to use those, but it wouldn't change how it would work.

 then Range would be:

 struct Range{
     BitArray* array;
     ulong start, end;
 }

 to mark [start..end] piece of it?

 If you keep that information, why not just include slicing along with
 the rest of it?

 I'm not seeing slicing easily integrated just because length is known.

   Beginning and ending are known, so slicing can be added at no extra cost.

Begin takes extra ulong I'm not sure it is needed in the container itself.

-- 
Dmitry Olshansky

Jul 28 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Saturday, 28 July 2012 at 23:02:17 UTC, Dmitry Olshansky wrote:
 On 29-Jul-12 02:31, Era Scarecrow wrote:
 On Saturday, 28 July 2012 at 22:19:05 UTC, Dmitry Olshansky 
 wrote:
 Mhm? Not getting you it at all. What's locking it ?

 What's wrong with:
 struct BitArray
 {

 union {
    struct {
        size_t[] data;
        ulong length;    //in bits

 //length in words would be just
 // (length + size_t.sizeof*4) / // (size_t.sizeof*8);
 // as it's a power of 2 it should get optimized

    }
    size_t[2+ulong.sizeof/size_t.sizeof] compact;

 // and say 1 bit is used for isCompact would be great if it 
 could use
 // the least significant bit of pointer from data[], dunno if 
 it's easy enough

 I'd rather not rely on the pointer at all as a reference for 
 if it's compact.

 No problem, I redesigned a bit:

 struct BitArray
 {

 union {
     struct {
         ulong length;    //in bits, the top bit used as 
 isCompact
         size_t[] data;
     }
     struct{
 	ushort smallLength; //here as well
 	ubyte[size_t.sizeof*2+ulong.sizeof-2] compact;
     }
 }

 Up 2^63 of bits would occupy 2^60 of bytes should be good 
 enough for the years to come.
  I'm not seeing it as a problem as at very least processors 
 have to actually be able to address more then 42 (or was it 
 48?) bits that they can now. (see AMD or Intel or ARM 
 datasheets on 64 bit chips)

2^63, or drop 8 so it takes up no more than 2^55 bytes of memory. 
Yes that's quite a large chunk.

  Curiously enough the current structure is already smaller than 
the proposed unions.

 If the array is changed later to a template that could accept 
 bytes, then where does that leave you? When the slice doesn't 
 need to know more of the beginning,


 ???
 Slice can start at some random offset

  Yes they can, but so can a normal array (as they are pretty much 
the same)

  it can re-slice it to a lesser more appropriate size.

 no idea what's you are talking about here.

  As for requested compact to hold the starting/ending offsets at 
1 byte each, obviously slicing at say 1024 to 2048 of a 10,000 
length bit array, that won't work (0-127). If it's larger than 
size_t in bits, it will subtract that size and internally change 
the pointer of the array it actually points to, until it can fit 
that new value.

start/end slice: [32 .. 64]
equals: size_t array[1 .. 2]

 It may very well soon contain a third entry in the union for 
 main operations and use the larger size_t just for canUseBulk 
 (Actually that sounds better to me).

 Again I'm having hard time to get what follows from where. Can 
 you expand on this point please? Preferably with some code as 
 it was far easier to reason about.

A question earlier was brought up about endienness for the patch, 
that is not part of what it can handle, so:

ba[100] = 1; //then save the array
assert(ba[100]); //fails on opposite endienness when loaded

HOWEVER! if individual bits are handled in a byte array 
reference, then the above would succeed, even with the larger 
size_t ones all working with canUseBulk.

 Besides, separating the slice suggests 'length' is not part of 
 the union struct.

 How? All containers know their length.

  Yes they do, they also know where they start, thereby giving 
start/ending, which allows slicing. So why separate slicing?

 Sure it's convertible to [0 .. length], but it's no longer 
 convertible back to BitArray (Unless you make a whole new 
 array and conversion/copying which takes a lot of time).

 Yes it's not convertible. And it's no problem and is intended. 
 Otherwise like you said it doesn't make a lot of sense.

 If you do that, why not do BitSlice from the beginning which 
 holds BitArray, but then if you do that why make them 
 separate at all? It doesn't make sense to me. I can see 
 making a separate range for the front/popFront/empty if you 
 really wanted to use those, but it wouldn't change how it 
 would work.



 Begin takes extra ulong I'm not sure it is needed in the 
 container itself.

  No it doesn't. bitfields are used to handle the beginning/ending 
offsets, equaling a total of 14 bits. It's the most confusing 
part (aside from [offset + sliceoffset + i[ formulas). canExpand 
and isCompact fill the other two bits.

Jul 28 2012

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 29-Jul-12 03:16, Era Scarecrow wrote:
 On Saturday, 28 July 2012 at 23:02:17 UTC, Dmitry Olshansky wrote:
 On 29-Jul-12 02:31, Era Scarecrow wrote:
 On Saturday, 28 July 2012 at 22:19:05 UTC, Dmitry Olshansky wrote:
 Mhm? Not getting you it at all. What's locking it ?

 What's wrong with:
 struct BitArray
 {

 union {
    struct {
        size_t[] data;
        ulong length;    //in bits

 //length in words would be just
 // (length + size_t.sizeof*4) / // (size_t.sizeof*8);
 // as it's a power of 2 it should get optimized

    }
    size_t[2+ulong.sizeof/size_t.sizeof] compact;

 // and say 1 bit is used for isCompact would be great if it could use
 // the least significant bit of pointer from data[], dunno if it's
 easy enough

 I'd rather not rely on the pointer at all as a reference for if it's
 compact.

 No problem, I redesigned a bit:

 struct BitArray
 {

 union {
     struct {
         ulong length;    //in bits, the top bit used as isCompact
         size_t[] data;
     }
     struct{
     ushort smallLength; //here as well
     ubyte[size_t.sizeof*2+ulong.sizeof-2] compact;
     }
 }

 Up 2^63 of bits would occupy 2^60 of bytes should be good enough for
 the years to come.
  I'm not seeing it as a problem as at very least processors have to
 actually be able to address more then 42 (or was it 48?) bits that
 they can now. (see AMD or Intel or ARM datasheets on 64 bit chips)

 2^63, or drop 8 so it takes up no more than 2^55 bytes of memory. Yes
 that's quite a large chunk.

   Curiously enough the current structure is already smaller than the
 proposed unions.

OK. Now back to the biggest question:
Slices use references (sometimes) to previous bitarray. Since it's a 
slice (like an array) this could be expected.

The "sometimes" part is not god enough! The problem is that small string 
optimization (or small array optimization) doesn't work with reference 
semantics. You may dig up discussion on proposal for small string 
optimization for char[] and string that was shot down on these grounds.
Bulit-in like semantics call for simple ptr/ptr+length struct and you 
can't get that can you?.

E.g.

void mess_with(BitArray ba)
{
ba ~= true;
ba ~= false;
  //does it change anything outside this function? *maybe*
ba[0] ^= true;
ba[1] ^= true;
auto x = ba; //even if used ref BitArray ba, x *maybe* references ba
//is x now a reference or a copy? *maybe*

}

Thus BitArray either becomes value type (that may come as unexpected, 
see your option c/d). Or it becomes full reference type as in option a 
(though it sucks).
Sorry but b is no go, as it makes code unpredictable I'd rather take my 
idea with separate ranges then b.

Solutions:
  a) Drop the union, make all functions  safe (compared to  trusted), 
and they are all ref/slices
  b) Leave as sometimes ref, sometimes not; Use .dup when you NEED to 
ensure different unique copies.
  c) Apply COW (Copy-On-Write) where a known slice type (canExpand is 
false) when a change would apply, it replaces the present slice into 
it's own unique array.
  d) All slices become new dup'ed allocated arrays (Most expensive of 
them all)

Because of d, I proposed that slices be separate type that is always a 
reference to original. Then only coping arrays themselves involves dup.

 Begin takes extra ulong I'm not sure it is needed in the container
 itself.

   No it doesn't. bitfields are used to handle the beginning/ending
 offsets, equaling a total of 14 bits. It's the most confusing part
 (aside from [offset + sliceoffset + i[ formulas). canExpand and
 isCompact fill the other two bits.

OK, finally I think I understand how your current version works. It's 
clever but leaves you in half-way value-reference state.

To point out few problems with unification of slice type and BitArray:

- Every time a small slice is taken say [1..30] it becomes a small 
array(?) thus it is a value type and doesn't affect original array

- No idea if a = b make a copy will make programmers nervous, esp if one 
is writing a library function as in this case there is no assumption 
that fits all

- Same thing but more, does this work?
	auto c = ba; //if ba is compact
	ba ~= true;
	assert(c.length = ba.length-1);
	//c should remain with the same length unaffected (at least like built 
in arrays)

-- 
Dmitry Olshansky

Jul 28 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Sunday, 29 July 2012 at 06:27:32 UTC, Dmitry Olshansky wrote:
 OK. Now back to the biggest question:
 Slices use references (sometimes) to previous bitarray. Since 
 it's a slice (like an array) this could be expected.

 The "sometimes" part is not god enough! The problem is that 
 small string optimization (or small array optimization) doesn't 
 work with reference semantics. You may dig up discussion on 
 proposal for small string optimization for char[] and string 
 that was shot down on these grounds.

 Bulit-in like semantics call for simple ptr/ptr+length struct 
 and you can't get that can you?.>
 E.g.

 void mess_with(BitArray ba)
 {
 ba ~= true;
 ba ~= false;
  //does it change anything outside this function? *maybe*

If it was a previous slice, then it never chances outside and 
appending would automatically allocate new memory.

 ba[0] ^= true;
 ba[1] ^= true;
 auto x = ba; //even if used ref BitArray ba, x *maybe* 
 references ba
 //is x now a reference or a copy? *maybe*

 Thus BitArray either becomes value type (that may come as 
 unexpected, see your option c/d). Or it becomes full reference 
 type as in option a (though it sucks).

 Sorry but b is no go, as it makes code unpredictable I'd rather 
 take my idea with separate ranges then b.

 Solutions:
  a) Drop the union, make all functions  safe (compared to 
  trusted), and they are all ref/slices
  b) Leave as sometimes ref, sometimes not; Use .dup when you 
 NEED to ensure different unique copies.
  c) Apply COW (Copy-On-Write) where a known slice type 
 (canExpand is false) when a change would apply, it replaces the 
 present slice into it's own unique array.
  d) All slices become new dup'ed allocated arrays (Most 
 expensive of them all)

  Agreed, B is a bad choice. I'm going to suggest c myself, as 
changes would only propagate to a new true copy, otherwise it's 
read-only as a slice.

 Because of d, I proposed that slices be separate type that is 
 always a reference to original. Then only coping arrays 
 themselves involves dup.

  Mmmm... i think you're right...

 No it doesn't. bitfields are used to handle the 
 beginning/ending offsets, equaling a total of 14 bits. It's 
 the most confusing part (aside from [offset + sliceoffset + i] 
 formulas). canExpand and isCompact fill the other two bits.

 OK, finally I think I understand how your current version 
 works. It's clever but leaves you in half-way value-reference 
 state.

  I didn't want to originally use the 1-byte per 
startBit/endingBit, but it does actually work when you have the 
functions working.

 To point out few problems with unification of slice type and 
 BitArray:

 - Every time a small slice is taken say [1..30] it becomes a 
 small array(?) thus it is a value type and doesn't affect 
 original array

  No, slices of compact arrays remain compact arrays. slices of 
larger arrays are reference until length is set, dupped, or 
appended to (forcing a resize/dup)

 - No idea if a = b make a copy will make programmers nervous, 
 esp if one is writing a library function as in this case there 
 is no assumption that fits all

  If it's compact, it's an auto separate copy/valuetype. If it's 
not compact, it's a reference to the original.

 - Same thing but more, does this work?
 	auto c = ba; //if ba is compact
 	ba ~= true;

  ba may or may not be a normal/compact at this point

 	assert(c.length = ba.length-1);
 	//c should remain with the same length unaffected (at least 
 like built in arrays)

  Most likely it would be compact afterwards (Even if it allocated 
memory).

  I guess after going this over, two things need to happen. First 
it needs to be non-reference type, I don't want to go option d 
route, so I'll go for c instead. On a copy I'll add a postblit 
that makes the 'copy' now a slice which will create a new array 
when it changes in any way. Only question is when you pass the 
BitArray to a function as you showed, it would still think it's 
the original, unless the postblit/opAssign (or similar) can 
handle that. Hmmm time for some testing.

Jul 29 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Sunday, 29 July 2012 at 06:27:32 UTC, Dmitry Olshansky wrote:
 Thus BitArray either becomes value type (that may come as 
 unexpected, see your option c/d). Or it becomes full reference 
 type as in option a (though it sucks). Sorry but b is no go, as 
 it makes code unpredictable I'd rather take my idea with 
 separate ranges then b.

  I've had a thought. Why not make it fully reference and at the 
same time not? Consider making a slice which is the real 
BitArray, then you have an outer container that when you want 
value semantics you use it and it would make appropriate copies 
every time it needed to (including when first put in the 
container). It would be implicitly convertible to a slice. So...

BitArraySlice {
   //real bitarray stuff and pointer ref
}

BitArray { //value semantics wrapper?
   BitArraySlice ba;
   alias ba this;

   //include wrapper versions of binary operators

   this(this) {
   }
}

  Doing it this way the arguments being passed as reference slices 
would ensure unneeded copies wouldn't be made. On the other hand 
a smaller portion of the code would need to be updated. Along 
with that a slice could be set as COW allowing quite a bit of 
flexibility.

  Thoughts?

Jul 29 2012

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 29-Jul-12 12:55, Era Scarecrow wrote:
 On Sunday, 29 July 2012 at 06:27:32 UTC, Dmitry Olshansky wrote:
 Thus BitArray either becomes value type (that may come as unexpected,
 see your option c/d). Or it becomes full reference type as in option a
 (though it sucks). Sorry but b is no go, as it makes code
 unpredictable I'd rather take my idea with separate ranges then b.

   I've had a thought. Why not make it fully reference and at the same
 time not? Consider making a slice which is the real BitArray, then you
 have an outer container that when you want value semantics you use it
 and it would make appropriate copies every time it needed to (including
 when first put in the container). It would be implicitly convertible to
 a slice. So...

 BitArraySlice {
    //real bitarray stuff and pointer ref
 }

 BitArray { //value semantics wrapper?
    BitArraySlice ba;
    alias ba this;

    //include wrapper versions of binary operators

    this(this) {
    }
 }

   Doing it this way the arguments being passed as reference slices would
 ensure unneeded copies wouldn't be made. On the other hand a smaller
 portion of the code would need to be updated. Along with that a slice
 could be set as COW allowing quite a bit of flexibility.

   Thoughts?

Hm... and small array optimizations go where? BitArraySlice won't be a 
full reference type if it does it.

-- 
Dmitry Olshansky

Jul 29 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Sunday, 29 July 2012 at 09:03:08 UTC, Dmitry Olshansky wrote:
 On 29-Jul-12 12:55, Era Scarecrow wrote:
 Doing it this way the arguments being passed as reference 
 slices would ensure unneeded copies wouldn't be made. On the 
 other hand a smaller portion of the code would need to be 
 updated. Along with that a slice could be set as COW allowing 
 quite a bit of flexibility.

  Thoughts?

 Hm... and small array optimizations go where? BitArraySlice 
 won't be a full reference type if it does it.

  I was thinking about that; If need be the slice can point to 
another slice which holds the actual small array (Kinda a long 
workaround). The way to small array optimization to work would be 
if BitArray generated it, since when you go to slicing you either 
reference the original data in the slice, or you could drop the 
small array optimization; although a loss it would simplify code 
until you come back to it once the code is working the way it 
should.

  Another idea is to convert the struct into a class instead, 
which would allow some options, but won't be a perfect fit, and 
inheritance could be used; which would solve some of the small 
array optimization, but at what cost?

Jul 29 2012

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 29-Jul-12 13:18, Era Scarecrow wrote:
 On Sunday, 29 July 2012 at 09:03:08 UTC, Dmitry Olshansky wrote:
 On 29-Jul-12 12:55, Era Scarecrow wrote:
 Doing it this way the arguments being passed as reference slices
 would ensure unneeded copies wouldn't be made. On the other hand a
 smaller portion of the code would need to be updated. Along with that
 a slice could be set as COW allowing quite a bit of flexibility.

  Thoughts?

 Hm... and small array optimizations go where? BitArraySlice won't be a
 full reference type if it does it.

   I was thinking about that; If need be the slice can point to another
 slice which holds the actual small array (Kinda a long workaround). The
 way to small array optimization to work would be if BitArray generated
 it, since when you go to slicing you either reference the original data
 in the slice, or you could drop the small array optimization; although a
 loss it would simplify code until you come back to it once the code is
 working the way it should.

   Another idea is to convert the struct into a class instead, which
 would allow some options, but won't be a perfect fit, and inheritance
 could be used; which would solve some of the small array optimization,
 but at what cost?

I have simpler suggestion. Maybe doing it as 2 containers:

BitSet is a plain value type with small array optimization (what you 
called BitArray in last 2 posts) and no slicing (only as opSlice() i.e. 
as a whole). Best used for small things under < 1000 bits (as even dup 
will be cheap enough).

BitArray is class or struct that has reference smenatics and drops small 
array optimization in favor of having simple & fast slicing
Best used for big arrays with no less then 100-200 bits.


-- 
Dmitry Olshansky

Jul 29 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Sunday, 29 July 2012 at 09:30:06 UTC, Dmitry Olshansky wrote:
 I have simpler suggestion. Maybe doing it as 2 containers:

 BitSet is a plain value type with small array optimization 
 (what you called BitArray in last 2 posts) and no slicing (only 
 as opSlice() I.e. as a whole). Best used for small things under 
 < 1000 bits (as even dup will be cheap enough).

 BitArray is class or struct that has reference smenatics and 
 drops small array optimization in favor of having simple & fast 
 slicing Best used for big arrays with no less then 100-200 bits.

  As long as you only use one or the other then it sounds great; 
I'm sure a conversion/constructor would be available to convert 
from one to another. That does seem like the best solution, but 
also feels like we're building it twice (Little code re-use).

  Might be a third option: A template struct that represents a 
BitArray type (with 90% of the code); you then insert a structure 
with key functions that would handle the value/reference & 
duplication part.

  I'll meditate on it for a few days and see if anything pops up.

Jul 29 2012

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 29-Jul-12 14:28, Era Scarecrow wrote:
 On Sunday, 29 July 2012 at 09:30:06 UTC, Dmitry Olshansky wrote:
 I have simpler suggestion. Maybe doing it as 2 containers:

 BitSet is a plain value type with small array optimization (what you
 called BitArray in last 2 posts) and no slicing (only as opSlice()
 I.e. as a whole). Best used for small things under < 1000 bits (as
 even dup will be cheap enough).

 BitArray is class or struct that has reference smenatics and drops
 small array optimization in favor of having simple & fast slicing Best
 used for big arrays with no less then 100-200 bits.

   As long as you only use one or the other then it sounds great; I'm
 sure a conversion/constructor would be available to convert from one to
 another. That does seem like the best solution, but also feels like
 we're building it twice (Little code re-use).

Yup.

   Might be a third option: A template struct that represents a BitArray
 type (with 90% of the code); you then insert a structure with key
 functions that would handle the value/reference & duplication part.

Sounds a lot like mixin template. I'd try to invent some small set of 
primitive operations and auto-magically construct the others from it.

   I'll meditate on it for a few days and see if anything pops up.

And just to keep your mind busy :) how about adding SIMD to speed it all up:
http://dlang.org/simd.html
It's kind of new & cool thing but not very well documented.

-- 
Dmitry Olshansky

Jul 29 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Sunday, 29 July 2012 at 10:34:50 UTC, Dmitry Olshansky wrote:
 Might be a third option: A template struct that represents a 
 BitArray type (with 90% of the code); you then insert a 
 structure with key functions that would handle the 
 value/reference & duplication part.

 Sounds a lot like mixin template. I'd try to invent some small 
 set of primitive operations and auto-magically construct the 
 others from it.

  Pretty much...  syntactical sugar, just add tea (T) I made a 
funny! :D

 I'll meditate on it for a few days and see if anything pops up.

 And just to keep your mind busy :) how about adding SIMD to 
 speed it all up:
 http://dlang.org/simd.html
 It's kind of new & cool thing but not very well documented.

  Mmm I'll look it over. If it can work for larger bulk operations 
I don't see why not.

Jul 29 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Sunday, 29 July 2012 at 09:30:06 UTC, Dmitry Olshansky wrote:
 I have simpler suggestion. Maybe doing it as 2 containers:

 BitSet is a plain value type with small array optimization 
 (what you called BitArray in last 2 posts) and no slicing (only 
 as opSlice() I.e. as a whole). Best used for small things under 
 < 1000 bits (as even dup will be cheap enough).

 BitArray is class or struct that has reference semantics and 
 drops small array optimization in favor of having simple & fast 
 slicing Best used for big arrays with no less then 100-200 bits.

  Another thought coming to mind, is to have the BitArray with a 
few different modes it could operate in. One of the modes could 
be 'reference' which would make slices and them all work exactly 
as you would expect, while with referencing off it would act as a 
normal value type struct, since the mode/flags would modify how 
the postblit and assignment operators worked, along with the 
length. (Reference wouldn't use small array optimization).

  But having them statically separated by name/type seems much 
more likely to be safer in the long run with reliable results.

Jul 29 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Sunday, 29 July 2012 at 12:39:13 UTC, Era Scarecrow wrote:
  But having them statically separated by name/type seems much 
 more likely to be safer in the long run with reliable results.

  A question regarding templates. A template with different 
parameters is completely incompatible correct? So...

struct X(T) {
}

alias X!bool XB;
alias X!int XI;

void func(XB xb) {

}

func(XB()); //works
func(XI()); //should fail to compile

Same if it was built differently? so...

template X(bool something) //'bool something' should be 
statically checkable
   struct XY {
     static if (something) {
       //conditional functions or code
     }
     XY makeX(){
       //XY type is only currently formed template version correct?
       return XY();
     }
   }
}

//against above func, with these two...
alias X!(false).XY XB;
alias X!(true).XY XI;

Course if there's a way to avoid asking about the inner XY, I 
wonder how I would do that.


  Now if all that is correct, say I want to make two functions 
that both use X, but are not compatible, but template functions 
will allow it. So...

void tempFunc(T, U)(T t, U u)
if(
   //What do i enter to check they are both template X or struct 
XY? As signatures will be different...
)
body{
   //now acts as a middle-man to be compatible for said operation
}


  Finally, I came across an anomaly and may just be a bug. What 
about a postblits?


T func2(T)(T x)  safe pure {
   return T();
}

struct XY {
   this(this)  safe pure {} //safe pure added so func can call it
   void func(XY x)  safe pure {
     XY y = x; //postblits called on all three lines
     func2(x);
     func2(y);
   }
}

template X(bool something) {
struct XY {
     this(this)  safe pure {}
     void func(XY x)  safe pure {
       XY y = x; //Error: see below
       func2(x);
       func2(y);
     }
   }
}

alias X!(true).XY Xtrue;

pure function 'func' cannot call impure function '__cpctor'
safe function 'func' cannot call system function '__cpctor'

Jul 30 2012

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 30-Jul-12 23:50, Era Scarecrow wrote:
 On Sunday, 29 July 2012 at 12:39:13 UTC, Era Scarecrow wrote:
  But having them statically separated by name/type seems much more
 likely to be safer in the long run with reliable results.

   A question regarding templates. A template with different parameters
 is completely incompatible correct? So...

Yeah, like twin brothers the have different IDs :)

 struct X(T) {
 }

 alias X!bool XB;
 alias X!int XI;

 void func(XB xb) {

 }

 func(XB()); //works
 func(XI()); //should fail to compile

 Same if it was built differently? so...

 template X(bool something) //'bool something' should be statically
 checkable
    struct XY {
      static if (something) {
        //conditional functions or code
      }
      XY makeX(){
        //XY type is only currently formed template version correct?
        return XY();
      }
    }
 }

 //against above func, with these two...
 alias X!(false).XY XB;
 alias X!(true).XY XI;

 Course if there's a way to avoid asking about the inner XY, I wonder how
 I would do that.


   Now if all that is correct, say I want to make two functions that both
 use X, but are not compatible, but template functions will allow it. So...

 void tempFunc(T, U)(T t, U u)
 if(
    //What do i enter to check they are both template X or struct XY? As
 signatures will be different...
 )
 body{
    //now acts as a middle-man to be compatible for said operation
 }

Not sure what you would like to accomplish here.

 alias X!(true).XY Xtrue;

 pure function 'func' cannot call impure function '__cpctor'
 safe function 'func' cannot call system function '__cpctor'

Yeah, it's a bug. File it please.

-- 
Dmitry Olshansky

Jul 30 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Monday, 30 July 2012 at 20:19:51 UTC, Dmitry Olshansky wrote:
 Not sure what you would like to accomplish here.

Than an example...

struct BitArray { //assume template...
   ref BitArray opSliceAssign(T)(const T ba, int start, int end)
   if (
    //if T is type bitArray but only a different change regarding 
parameters
    //we can use canUseBulk and other speedups. Otherwise a range 
is required.
    //checking all possibilities seems silly, but possible
   )
   body {
     //slice specific speedup and code
     return this;
   }
   int opEquals(T)(const T ba) const
   //if constraint as above
   body {}
}

Code wise; If BitSet is always value semantics and BitArray is 
always reference (a previous post) and slices/groups work 
basically the same; then...

BitSet bs = BitSet(100);
BitArray ba = BitArray(100);

  bs[0 .. 10] = ba[0 .. 10]; //should work
  assert(bs[0 .. 10] == ba[0 .. 10]); //should also be possible to 
work.

 alias X!(true).XY Xtrue;

 pure function 'func' cannot call impure function '__cpctor'
 safe function 'func' cannot call system function '__cpctor'

 Yeah, it's a bug. File it please.

Alright... Considered a major (Maybe even blocking).

  Reminds me. Due to how rvalue/lvalues work, assuming I need to 
const reference something but I don't care if it's a temporary or 
not, would I then do...

struct X {
   int opCmp(const X x) const { //catch temporary
     return opCmp(x); //becomes reference, as it's now named 'x'
   }
   int opCmp(const ref X x) const {
     //do work here
     return 0;
   }
}

X x;
assert(x == x);   //ref
assert(x == X()); //temporary

  course if the 'in' keyword is the key i will have to test that 
to make sure.

Jul 30 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Monday, 30 July 2012 at 21:03:39 UTC, Era Scarecrow wrote:
 Alright... Considered a major (Maybe even blocking).

http://d.puremagic.com/issues/show_bug.cgi?id=8475

Jul 30 2012

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 31-Jul-12 01:03, Era Scarecrow wrote:
 On Monday, 30 July 2012 at 20:19:51 UTC, Dmitry Olshansky wrote:
 Not sure what you would like to accomplish here.

 Than an example...

You can go for simpler separation:

struct BitArray{
//() - is an empty template spec
	ref BitArrayopSliceAssign()(const BitArray ba, int start, int end)
{
//two bit array can try balk mode etc.

   Reminds me. Due to how rvalue/lvalues work, assuming I need to const
 reference something but I don't care if it's a temporary or not, would I
 then do...

 struct X {
    int opCmp(const X x) const { //catch temporary
      return opCmp(x); //becomes reference, as it's now named 'x'
    }
    int opCmp(const ref X x) const {
      //do work here
      return 0;
    }
 }

 X x;
 assert(x == x);   //ref
 assert(x == X()); //temporary

   course if the 'in' keyword is the key i will have to test that to make
 sure.

in == scope const not sure what scope buys here but couldn't hurt.

-- 
Dmitry Olshansky

Jul 30 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Monday, 30 July 2012 at 21:56:20 UTC, Dmitry Olshansky wrote:
 You can go for simpler separation:

 struct BitArray{
 //() - is an empty template spec
 	ref BitArrayopSliceAssign()(const BitArray ba, int start, int 
 end)
 {
 //two bit array can try balk mode etc.

  I'll give it a try, it may very well be what I need; although I 
wouldn't have thought you could declare it like that. I have much 
to learn in the ways of the template.

  course if the 'in' keyword is the key i will have to test 
 that to make sure.

 in == scope const not sure what scope buys here but couldn't 
 hurt.

  If it can avoid making a new copy, then it likely would help. 
I'll need to test it. I actually am not quite sure how scope 
works as an argument; it isn't really covered in TDPL from what I 
recall.

Jul 30 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Monday, 30 July 2012 at 22:23:46 UTC, Era Scarecrow wrote:
 On Monday, 30 July 2012 at 21:56:20 UTC, Dmitry Olshansky wrote:
 in == scope const not sure what scope buys here but couldn't 
 hurt.

 If it can avoid making a new copy, then it likely would help. 
 I'll need to test it. I actually am not quite sure how scope 
 works as an argument; it isn't really covered in TDPL from what 
 I recall.

  Tests with 'in' doesn't help, only ref and catching a temporary 
seem to prevent postblits.



Also the empty template doesn't work, somehow I'm not surprised...

template X(bool something) {
  struct XT {
    void func()(XT x){
         writeln("XT template called: typeID:", typeid(x));
    }
  }
}

alias X!true.XT A;
alias X!false.XT B;

A a, b;
B ba;

a.func(b); //passes
a.func(ba);

Error: template test.X!(true).XT.func does not match any function 
template declaration
Error: template test.X!(true).XT.func() cannot deduce template 
function from argument types !()(XT)

Jul 30 2012

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 31-Jul-12 02:40, Era Scarecrow wrote:
 On Monday, 30 July 2012 at 22:23:46 UTC, Era Scarecrow wrote:
 On Monday, 30 July 2012 at 21:56:20 UTC, Dmitry Olshansky wrote:
 in == scope const not sure what scope buys here but couldn't hurt.

 If it can avoid making a new copy, then it likely would help. I'll
 need to test it. I actually am not quite sure how scope works as an
 argument; it isn't really covered in TDPL from what I recall.

   Tests with 'in' doesn't help, only ref and catching a temporary seem
 to prevent postblits.



 Also the empty template doesn't work, somehow I'm not surprised...

 template X(bool something) {
   struct XT {

Fixed :

     void func(bool smth)(X!(smth).XT x){

By default XT is deduced as X!(current value of smth).XT

          writeln("XT template called: typeID:", typeid(x));
     }
   }
 }

 alias X!true.XT A;
 alias X!false.XT B;

 A a, b;
 B ba;

 a.func(b); //passes
 a.func(ba);

 Error: template test.X!(true).XT.func does not match any function
 template declaration
 Error: template test.X!(true).XT.func() cannot deduce template function
 from argument types !()(XT)


-- 
Dmitry Olshansky

Jul 30 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Monday, 30 July 2012 at 22:44:21 UTC, Dmitry Olshansky wrote:
 Fixed :

    void func(bool smth)(X!(smth).XT x){

 By default XT is deduced as X!(current value of smth).XT

Doesn't really fix it...

     a.func(b); //65 - doesn't match declaration.
     a.func(ba); //66
     //other template tests not worth putting, 67-70
     a.func!true(b); //71 - explicitly stated
     a.func!false(ba); //72
     a.func!false(b); //73 - Backwards on purpose for test
     a.func!true(ba); //74

(65): Error: template test.X!(true).XT.func does not match any 
function template declaration
(11): Error: template test.X!(true).XT.func(bool smth) cannot 
deduce template function from argument types !()(XT)

(66): Error: template test.X!(true).XT.func does not match any 
function template declaration
(11): Error: template test.X!(true).XT.func(bool smth) cannot 
deduce template function from argument types !()(XT)

(73): Error: function test.X!(true).XT.func!(false).func (XT x) 
is not callable using argument types (XT)
(73): Error: cannot implicitly convert expression (b) of type XT 
to XT
(74): Error: function test.X!(true).XT.func!(true).func (XT x) is 
not callable using argument types (XT)
(74): Error: cannot implicitly convert expression (ba) of type XT 
to XT

Jul 30 2012

Philippe Sigaud <philippe.sigaud gmail.com> writes:

On Mon, Jul 30, 2012 at 9:50 PM, Era Scarecrow <rtcvb32 yahoo.com> wrote:

  A question regarding templates. A template with different parameters is
 completely incompatible correct?

Correct. They have no reason, in general, too even generate the same code:

template Chameleon(T)
{
    static if (is(T == struct))
        enum Chameleon = "I'm a string!"; // Chamelon!(SomeStruct) is
a string value
    else static if (is(T == class))
        void Chameleon(int i) { return i+1;} // Chameleon!(SomeClass)
is a function
    else
        alias T[int] Chameleon; // Here Chameleon is a type
}

The same for you struct S(T) { ... }. Inside the braces, anything can
happen (mwwaahhha hhaa!), depending on T.

 So...

 struct X(T) {
 }

 alias X!bool XB;
 alias X!int XI;

 void func(XB xb) {

 }

 func(XB()); //works
 func(XI()); //should fail to compile

Yes.

  Now if all that is correct, say I want to make two functions that both use
 X, but are not compatible, but template functions will allow it. So...

I'm not sure I understand what you're trying to do. Do you mean you
want a function that accept X!(T)s for any T, and not any other type?

struct X(T) {}

void func(T)(X!T x)
{}

void main()
{
    X!bool b;
    X!int i;
    func(b);
    func(i);
}

 void tempFunc(T, U)(T t, U u)
 if(
   //What do i enter to check they are both template X or struct XY? As
 signatures will be different...
 )
 body{
   //now acts as a middle-man to be compatible for said operation
 }

Here you want a constraint that checks that U and T are both
X!(SomeType) or an XY?

void tempFunc(T,U)(T t, U u) if (is(T a == X!(SomeType), SomeType) &&
is(U a == X!(SomeType), SomeType)
                                                || is(T == XY) && is(U == XY))
{
...
}


Is that what you need?

Jul 30 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Monday, 30 July 2012 at 20:48:26 UTC, Philippe Sigaud wrote:
 Now if all that is correct, say I want to make two functions 
 that both use X, but are not compatible, but template 
 functions will allow it. So...

 I'm not sure I understand what you're trying to do. Do you mean 
 you want a function that accept X!(T)s for any T, and not any 
 other type?

  Anything of template X! previous post of mine shows sort of an 
example.

 struct X(T) {}

 void func(T)(X!T x)
 {}

 void main()
 {
     X!bool b;
     X!int i;
     func(b);
     func(i);
 }

  Hmmm i do think that seems right... but if it contains multiple 
parameters, then...?

template X(x1, x2, x3) {
  struct XT {}
}

 void func(T)(X!T x) {}

Will this still work?

 Here you want a constraint that checks that U and T are both
 X!(SomeType) or an XY?

  As before, of Template X, and struct XY, but T (or sometype) 
doesn't matter.

 void tempFunc(T,U)(T t, U u) if (is(T a == X!(SomeType), 
 SomeType) &&
 is(U a == X!(SomeType), SomeType)
                                                 || is(T == XY) 
 && is(U == XY))
 {
 ...
 }
 Is that what you need?

  I want to say no... But I haven't actually tested it for my use 
cases. Leaving it unconstrained (without checking) to make it 
work is a disaster waiting to happen: I want T and U to both be 
of the same template (X in this case) but not the same 
instantiation arguments.

  I hope I'm not repeating myself too much.

Jul 30 2012

Philippe Sigaud <philippe.sigaud gmail.com> writes:

On Mon, Jul 30, 2012 at 11:19 PM, Era Scarecrow <rtcvb32 yahoo.com> wrote:

 void func(T)(X!T x)
 {}

 void main()
 {
     X!bool b;
     X!int i;
     func(b);
     func(i);
 }


  Hmmm i do think that seems right... but if it contains multiple parameters,
 then...?

 template X(x1, x2, x3) {
  struct XT {}
 }

 void func(T)(X!T x) {}

 Will this still work?

No. Use a 3-params template or a tuple:

void func(A,B,C)(X!(A,B,C) x) {}

or

void func(Ts...)(X!(Ts) x) {}

or even, more general:

void func(T)(T x) if (is(T t = X!(SomeTypes), SomeTypes...)), because
template constrains are much more powerful and general than fiddling
arguments types.

There was a discussion between Simen Kjaeraas and me on the subject
not long ago. Kenji Hara extended the functionnality of is( , T...)
and did a pull request. I'm not sure it's in 2.060, the request was
done a week ago.


 void tempFunc(T,U)(T t, U u) if (is(T a == X!(SomeType), SomeType) &&
 is(U a == X!(SomeType), SomeType)
                                                 || is(T == XY) && is(U ==
 XY))
 {
 ...
 }
 Is that what you need?


  I want to say no... But I haven't actually tested it for my use cases.
 Leaving it unconstrained (without checking) to make it work is a disaster
 waiting to happen: I want T and U to both be of the same template (X in this
 case) but not the same instantiation arguments.

Said like this, it's much more clear for me :)

Note that in the above code, SomeType is a fresh variable in each is()
expression. So, even though I used the same name (shouldn't have done
that), T and U can have different type parameters.

*shameless plug*
Btw, I wrote a tutorial on templates, you can find it here:

https://github.com/PhilippeSigaud/D-templates-tutorial/blob/master/dtemplates.pdf

(click on 'view raw', that should download the pdf)

That should shed some light on the matters at hand.

Jul 30 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Tuesday, 31 July 2012 at 05:27:58 UTC, Philippe Sigaud wrote:
 No. Use a 3-params template or a tuple:

 void func(A,B,C)(X!(A,B,C) x) {}
 or
 void func(Ts...)(X!(Ts) x) {}

  I don't know how many arguments it will have (depends on how 
many options I give it), and I honestly don't; It should be 
invisible for the end user as long as we can get the the 
constraints properly accepted.

 or even, more general:

 void func(T)(T x) if (is(T t = X!(SomeTypes), SomeTypes...)), 
 because template constrains are much more powerful and general 
 than fiddling arguments types.

  That doesn't make sense; but I don't have a full grasp of the 
templates (continued below).

 There was a discussion between Simen Kjaeraas and me on the 
 subject not long ago. Kenji Hara extended the functionnality of 
 is( , T...) and did a pull request. I'm not sure it's in 2.060, 
 the request was done a week ago.

  so the (, T...) is shorthand for testing the template type with 
multiple arguments. Nice. So multiple arguments would be...

  struct X (T, U) {}

  if(is(T t = X!(bool, bool), (int, int), (int, bool)))  //?

 I want to say no... But I haven't actually tested it for my 
 use cases. Leaving it unconstrained (without checking) to make 
 it work is a disaster waiting to happen: I want T and U to 
 both be of the same template (X in this case) but not the same 
 instantiation arguments.

 Said like this, it's much more clear for me :)

 Note that in the above code, SomeType is a fresh variable in 
 each is() expression. So, even though I used the same name 
 (shouldn't have done that), T and U can have different type 
 parameters.

 *shameless plug*
 Btw, I wrote a tutorial on templates, you can find it here:

 https://github.com/PhilippeSigaud/D-templates-tutorial/blob/master/dtemplates.pdf

 (click on 'view raw', that should download the pdf)
 That should shed some light on the matters at hand.

  Indeed... I remember glancing through it, but never really 
needed it badly; Now it seems I do. At least it's not C++ and 
it's templates. I'll be reading up on them starting tonight (and 
printing most of it); It is after all the hardest part (and most 
powerful part) of the language :P

Jul 30 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

  As BitArray is coming back up and my resumed work I'll comment a 
few questions and suggestions and go from there. I'll polish up 
the code and try to resubmit it.

On Saturday, 28 July 2012 at 21:07:31 UTC, Jonathan M Davis wrote:
 I would point out that while hasSlicing doesn't currently check 
 for it, if opSlice doesn't return a type which is assignable to 
 the original range, it won't work in a lot of Phobos functions. 
 I keep meaning to bring that up for discussion in the main 
 newsgroup. I'd argue that hasSlicing really be changed from

  Hmmm.. Well Since the 'sometimes ref' as discussed won't seem to 
work well from before I'm considering to put the following two 
rules into effect.
1) If it's an allocated block, it stays allocated.
2) If you dup a BitArray and it's small enough to be compact, it 
converts.
3) If it's a small/compact array and you give out a slice, it 
converts the original block to dynamic before returning the new 
slice.

  With 1 & 2, reserved and length work a little differently, more 
ref friendly.
  With 3 the only issue comes up with if it's const, or a local 
variable.

  const BitArray x = BitArray(32);
  func(x);

  Would the function get passed the original compact buffer? Copy 
of x or a dup?
  May not be so much an issue. However...

  BitArray x = BitArray(32);
  const y = x[]; //slice

  Now if y points to the actual compact buffer and we do the 
following...

  x.length = 256; //allocates if it was compact

  Although y will have const access the data instead of a compact 
array it's a pointer to an array and broken. If during the x call 
it reallocated then y would work fine no matter what in that 
case. If x can't reallocate it, then the issue remains but is 
much smaller than before, most likely a dup would be safest.

  Thoughts? Questions? Ideas?

Dec 31 2012

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

12/31/2012 9:35 PM, Era Scarecrow пишет:
   As BitArray is coming back up and my resumed work I'll comment a few
 questions and suggestions and go from there. I'll polish up the code and
 try to resubmit it.

Yay!

 On Saturday, 28 July 2012 at 21:07:31 UTC, Jonathan M Davis wrote:
 I would point out that while hasSlicing doesn't currently check for
 it, if opSlice doesn't return a type which is assignable to the
 original range, it won't work in a lot of Phobos functions. I keep
 meaning to bring that up for discussion in the main newsgroup. I'd
 argue that hasSlicing really be changed from

   Hmmm.. Well Since the 'sometimes ref' as discussed won't seem to work
 well from before I'm considering to put the following two rules into
 effect.
 1) If it's an allocated block, it stays allocated.
 2) If you dup a BitArray and it's small enough to be compact, it converts.
 3) If it's a small/compact array and you give out a slice, it converts
 the original block to dynamic before returning the new slice.

   With 1 & 2, reserved and length work a little differently, more ref
 friendly.
   With 3 the only issue comes up with if it's const, or a local variable.

Personally I believe that if we introduce a slice of a BitArray as a 
separate range type that represents a view of array (but can't be 
appended to, etc. it's just a RA range with slicing).

   const BitArray x = BitArray(32);
   func(x);

Then here x is passed either by ref or a copy (depending on func).

   Would the function get passed the original compact buffer? Copy of x
 or a dup?
   May not be so much an issue. However...

   BitArray x = BitArray(32);
   const y = x[]; //slice

Then y has type BitArraySlice and references x.

   Now if y points to the actual compact buffer and we do the following...

   x.length = 256; //allocates if it was compact

   Although y will have const access the data instead of a compact array
 it's a pointer to an array and broken.

Then IMO that should be a slice invalidation (akin to c++ iterator 
invalidation). Depending on how slice is implemented it could avoid 
being invalidated in this case but in some others it surely will (e.g. 
shrinking).

 If during the x call it

You mean x[] call i.e. opSlice() call?
Seems like an overkill and semantically bad as I totally expect that the 
following to be true for any Array-like container:
auto cont = ...;
auto range = cont[];
range[0] = 1;
assert(cont[0] == 1);

Also:
foreach(v; x) --> foreach(v; x[]) if x defines opSlice

however since BitArray defines opApply it's called instead so this one 
is probably OK.


 reallocated then y would work fine no matter what in that case. If x
 can't reallocate it, then the issue remains but is much smaller than
 before, most likely a dup would be safest.

   Thoughts? Questions? Ideas?

The link to the latest code would be cool ;)

I'm thinking that the main problem is trying to mimic built-in arrays.

They are not ordinary containers in that they easily combine slice and 
container that is (in fact) partially hidden in run-time & GC 
implementation.

The fact that combination of language features and run-time support 
makes it easy shouldn't encourage other to go this path in the library.

For one thing this way a small string optimization is not achievable 
because of the many slices that have to reference the data packed in 
_small_ chunk. Then whenever original array changes (and becomes not 
small) any other slice to it have to somehow reflect this change. And 
it's hard if slices internally are exactly the same as original (if the 
same type).

...

Now that I tested the current behavior that we shouldn't break a lot. 
Clearly current BitArray is old and rusty D1 design.

     BitArray a, b, c;
     //the interesting thing is that BitArray(32) doesn't construct 
BitArray with length 32 but some explosive garbage !!

     c.length = 512;
     a.length = 32;
     assert(a.length == 32);
     b = a;
     b[0] = true;
//this one is bad as it means both a & b has to reference the same data
     assert(a[0] == 1);
     b ~= c;
     b[1] = 1;
     //the next fails for me (it also may not fail depending on c's length)
     assert(a[1] == 1);


Currently I think that D container design have to be extended to include 
2 kinds of containers:

  -small containers, values type a-la C++ STL these all use small 
container optimization (as they can cleanly)
  -big heavy-duty ones, these have reference semantics and tuned for 
large sets of values

Then the proposed way out is to make 2 containers:
- current BitArray to always have reference semantics (both asserts in 
the above always pass);
- and BitString or BitVector to be small value semantic array that is 
always copied/duped on assign, these should be passed by ref or by 
range/slice.

That's was lengthy but what do you think? And it seems like we'd need to 
bring this last topic to general D NG.

-- 
Dmitry Olshansky

Jan 02 2013

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Wednesday, 2 January 2013 at 21:00:38 UTC, Dmitry Olshansky 
wrote:
 12/31/2012 9:35 PM, Era Scarecrow пишет:

 Personally I believe that if we introduce a slice of a BitArray 
 as a separate range type that represents a view of array (but 
 can't be appended to, etc. it's just a RA range with slicing).

  Could do that I suppose, but then either exact length or 
append/alignment optimizations may go out the window if we remove 
some of those as options. To support those making slices part of 
the main type was easy enough. But the other issues still have to 
be addressed (that's mentioned below)

  const BitArray x = BitArray(32);
  func(x);

 Then here x is passed either by ref or a copy (depending on 
 func).

  yeah... assuming func returned a struct of...

  struct S {
    BitArray x;
    //other stuff
  }

  const(BitArray) func(const BitArray s);

  in the case it's small string optimization and the original 
string gets allocated OR gets passed outside the original 
function that made the BitArray. So small string if possible, 
convert to ref and slice it, or dup it. not 100% ref in all cases 
but pretty close.

  Would the function get passed the original compact buffer? 
 Copy of x or a dup?  May not be so much an issue. However...

  BitArray x = BitArray(32);
  const y = x[]; //slice

 Then y has type BitArraySlice and references x.

  Agreed, although I'm still iffy about a separate slice as a 
range. Could do the range just to give it limitations, but seems 
like a headache to do less and extra code to manage.

  Current code I'm writing converts (if possible) the BitArray to 
a allocated type, then passes that slice to y.

  Now if y points to the actual compact buffer and we do the 
 following...

  x.length = 256; //allocates if it was compact

  Although y will have const access the data instead of a 
 compact array it's a pointer to an array and broken.

 Then IMO that should be a slice invalidation (akin to c++ 
 iterator invalidation). Depending on how slice is implemented 
 it could avoid being invalidated in this case but in some 
 others it surely will (e.g. shrinking).

  Maybe, with the small string optimization (no allocation) it 
isn't so much an easy option. That's what this is about. Say it 
uses the compact (fixed string) and in opSlice...

BitArray {
   bool isCompact;
   bool canExpand; //represents owner for slices or compact.
   union {
     size_t[] normal;
     size_t[2] compact;
   }
   BitArray opSlice() {
     BitArray ba = this;
     ba.canExpand = false;
     if (isCompact) {
       ba.isCompact = false;
       ba.normal = cast(size_t[]) this.compact[0 .. $];
     }

     return ba;
   }
}

  This is where the allocation could cause a problem as the slice 
could be referring to stack memory at which all the data could be 
duds. Worse if you overwrite that data suddenly the allocated 
memory could point to something else! I think it's safe to say 
that's a bad idea.

 If during the x call it

 You mean x[] call i.e. opSlice() call?

  well both this(this) and opSlice() would be called at different 
times so they would have different meanings (copying vs slicing).

 Seems like an overkill and semantically bad as I totally expect 
 that the following to be true for any Array-like container:
 auto cont = ...;
 auto range = cont[];
 range[0] = 1;
 assert(cont[0] == 1);

 Also:
 foreach(v; x) --> foreach(v; x[]) if x defines opSlice

 however since BitArray defines opApply it's called instead so 
 this one is probably OK.

  Does that mean we might be able to drop opApply? Hmmm... Depends 
on if we have enough implemented so the range accepts it. I'll 
glance over std.range.

 reallocated then y would work fine no matter what in that 
 case. If x can't reallocate it, then the issue remains but is 
 much smaller than before, most likely a dup would be safest.

  Thoughts? Questions? Ideas?

 The link to the latest code would be cool ;)

  
https://github.com/rtcvb32/phobos/blob/BitArray-Updates/std/bitmanip.d

  There's a merge issue due to having tried to keep the bitfields 
and bitarray separated; That seems to have made more problems 
than solutions. Working on it but may have to wait until I've 
read my Git book when it arrives before I understand this all.

 I'm thinking that the main problem is trying to mimic built-in 
 arrays.

  Not really, it's trying to do small string optimization. I 
suppose if this was done as a class instead most of these 
problems could/would go away.

 They are not ordinary containers in that they easily combine 
 slice and container that is (in fact) partially hidden in 
 run-time & GC implementation.

 The fact that combination of language features and run-time 
 support makes it easy shouldn't encourage other to go this path 
 in the library.

 For one thing this way a small string optimization is not 
 achievable because of the many slices that have to reference 
 the data packed in _small_ chunk. Then whenever original array 
 changes (and becomes not small) any other slice to it have to 
 somehow reflect this change. And it's hard if slices internally 
 are exactly the same as original (if the same type).

  Really depends on the use case. I'm guessing there's enough call 
for the 64bit small packed bitarray that justifies it, otherwise 
it would be better to throw it out. As you mention later, 
separating them does seem like a better idea.

 ...

 Now that I tested the current behavior that we shouldn't break 
 a lot. Clearly current BitArray is old and rusty D1 design.

     BitArray a, b, c;
     //the interesting thing is that BitArray(32) doesn't 
 construct BitArray with length 32 but some explosive garbage !!

     c.length = 512;
     a.length = 32;
     assert(a.length == 32);
     b = a;
     b[0] = true;
 //this one is bad as it means both a & b has to reference the 
 same data
     assert(a[0] == 1);
     b ~= c;
     b[1] = 1;
     //the next fails for me (it also may not fail depending on 
 c's length)
     assert(a[1] == 1);

  My code tests that the last two asserts fail, course, a being 
under 64bits remains with small string optimization. Making it 
slice... seems I'll add this to my list of tests and get it 
fixed. Even if i reserved the space it still failed the last one. 
Mmm...

 Currently I think that D container design have to be extended 
 to include 2 kinds of containers:

  -small containers, values type a-la C++ STL these all use 
 small container optimization (as they can cleanly)
  -big heavy-duty ones, these have reference semantics and tuned 
 for large sets of values

 Then the proposed way out is to make 2 containers:
 - current BitArray to always have reference semantics (both 
 asserts in the above always pass);
 - and BitString or BitVector to be small value semantic array 
 that is always copied/duped on assign, these should be passed 
 by ref or by range/slice.

 That's was lengthy but what do you think? And it seems like 
 we'd need to bring this last topic to general D NG.

  hmmm.. Either would need some code duplication, or template to 
turn off/swap small string optimization. But separating them does 
seem like a good idea.

Jan 02 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

1/3/2013 6:05 AM, Era Scarecrow пишет:
 On Wednesday, 2 January 2013 at 21:00:38 UTC, Dmitry Olshansky wrote:
 12/31/2012 9:35 PM, Era Scarecrow пишет:

 Personally I believe that if we introduce a slice of a BitArray as a
 separate range type that represents a view of array (but can't be
 appended to, etc. it's just a RA range with slicing).

   Could do that I suppose, but then either exact length or
 append/alignment optimizations may go out the window if we remove some
 of those as options. To support those making slices part of the main
 type was easy enough. But the other issues still have to be addressed
 (that's mentioned below)

  const BitArray x = BitArray(32);
  func(x);

 Then here x is passed either by ref or a copy (depending on func).

   yeah... assuming func returned a struct of...

   struct S {
     BitArray x;
     //other stuff
   }

   const(BitArray) func(const BitArray s);

   in the case it's small string optimization and the original string
 gets allocated OR gets passed outside the original function that made
 the BitArray. So small string if possible, convert to ref and slice it,
 or dup it. not 100% ref in all cases but pretty close.

  Would the function get passed the original compact buffer? Copy of x
 or a dup?  May not be so much an issue. However...

  BitArray x = BitArray(32);
  const y = x[]; //slice

 Then y has type BitArraySlice and references x.

   Agreed, although I'm still iffy about a separate slice as a range.
 Could do the range just to give it limitations, but seems like a
 headache to do less and extra code to manage.

   Current code I'm writing converts (if possible) the BitArray to a
 allocated type, then passes that slice to y.

  Now if y points to the actual compact buffer and we do the following...

  x.length = 256; //allocates if it was compact

  Although y will have const access the data instead of a compact
 array it's a pointer to an array and broken.

 Then IMO that should be a slice invalidation (akin to c++ iterator
 invalidation). Depending on how slice is implemented it could avoid
 being invalidated in this case but in some others it surely will (e.g.
 shrinking).

   Maybe, with the small string optimization (no allocation) it isn't so
 much an easy option. That's what this is about. Say it uses the compact
 (fixed string) and in opSlice...

 BitArray {
    bool isCompact;
    bool canExpand; //represents owner for slices or compact.
    union {
      size_t[] normal;
      size_t[2] compact;
    }
    BitArray opSlice() {
      BitArray ba = this;
      ba.canExpand = false;
      if (isCompact) {
        ba.isCompact = false;
        ba.normal = cast(size_t[]) this.compact[0 .. $];
      }

      return ba;
    }
 }

   This is where the allocation could cause a problem as the slice could
 be referring to stack memory at which all the data could be duds. Worse
 if you overwrite that data suddenly the allocated memory could point to
 something else! I think it's safe to say that's a bad idea.

Hm, I'd think that having Slice type be:

BitArraySlice{
BitArray* bp;
size_t start, end;
// all else forwards to the pointed array
}
should work avoiding the most of code duplication. With any luck inliner 
will expand all of this wrapping. To enable bulk mode operations:

void opOpSliceAssign(BitArraySlice rhs)
{//has all the data to do it
    bulkOp(bp, start, end, rhs.bp, rhs.start, rhs.end);
}

then it's a question of bit of refactoring & tweaking of array bulks ops.

Then the only problem left is slice invalidation and original array 
going out of scope.

 If during the x call it

 You mean x[] call i.e. opSlice() call?

   well both this(this) and opSlice() would be called at different times
 so they would have different meanings (copying vs slicing).

 Seems like an overkill and semantically bad as I totally expect that
 the following to be true for any Array-like container:
 auto cont = ...;
 auto range = cont[];
 range[0] = 1;
 assert(cont[0] == 1);

 Also:
 foreach(v; x) --> foreach(v; x[]) if x defines opSlice

 however since BitArray defines opApply it's called instead so this one
 is probably OK.

   Does that mean we might be able to drop opApply? Hmmm... Depends on if
 we have enough implemented so the range accepts it. I'll glance over
 std.range.

Yes, it's got to be a range to work like this and then opApply is useless.
 reallocated then y would work fine no matter what in that case. If x
 can't reallocate it, then the issue remains but is much smaller than
 before, most likely a dup would be safest.

  Thoughts? Questions? Ideas?

 The link to the latest code would be cool ;)


 https://github.com/rtcvb32/phobos/blob/BitArray-Updates/std/bitmanip.d

   There's a merge issue due to having tried to keep the bitfields and
 bitarray separated; That seems to have made more problems than
 solutions. Working on it but may have to wait until I've read my Git
 book when it arrives before I understand this all.

 I'm thinking that the main problem is trying to mimic built-in arrays.

   Not really, it's trying to do small string optimization. I suppose if
 this was done as a class instead most of these problems could/would go
 away.

 They are not ordinary containers in that they easily combine slice and
 container that is (in fact) partially hidden in run-time & GC
 implementation.

 The fact that combination of language features and run-time support
 makes it easy shouldn't encourage other to go this path in the library.

 For one thing this way a small string optimization is not achievable
 because of the many slices that have to reference the data packed in
 _small_ chunk. Then whenever original array changes (and becomes not
 small) any other slice to it have to somehow reflect this change. And
 it's hard if slices internally are exactly the same as original (if
 the same type).

   Really depends on the use case. I'm guessing there's enough call for
 the 64bit small packed bitarray that justifies it, otherwise it would be
 better to throw it out. As you mention later, separating them does seem
 like a better idea.

I've meant the fact that it has to preserve current std lib behavior 
*and* has small string optimization that makes it overly painful and 
partially defeats the optimization (by requiring pointless compact->big 
conversions on slice etc.).

 ...

 Now that I tested the current behavior that we shouldn't break a lot.
 Clearly current BitArray is old and rusty D1 design.

     BitArray a, b, c;
     //the interesting thing is that BitArray(32) doesn't construct
 BitArray with length 32 but some explosive garbage !!

     c.length = 512;
     a.length = 32;
     assert(a.length == 32);
     b = a;
     b[0] = true;
 //this one is bad as it means both a & b has to reference the same data
     assert(a[0] == 1);
     b ~= c;
     b[1] = 1;
     //the next fails for me (it also may not fail depending on c's
 length)
     assert(a[1] == 1);

   My code tests that the last two asserts fail, course, a being under
 64bits remains with small string optimization. Making it slice... seems
 I'll add this to my list of tests and get it fixed. Even if i reserved
 the space it still failed the last one. Mmm...

Right, since it's _a_ that's tested and in current design it shares data 
with b unless that got relocated (b ~= c can relocate b).
 Currently I think that D container design have to be extended to
 include 2 kinds of containers:

  -small containers, values type a-la C++ STL these all use small
 container optimization (as they can cleanly)
  -big heavy-duty ones, these have reference semantics and tuned for
 large sets of values

 Then the proposed way out is to make 2 containers:
 - current BitArray to always have reference semantics (both asserts in
 the above always pass);
 - and BitString or BitVector to be small value semantic array that is
 always copied/duped on assign, these should be passed by ref or by
 range/slice.

 That's was lengthy but what do you think? And it seems like we'd need
 to bring this last topic to general D NG.

   hmmm.. Either would need some code duplication, or template to turn
 off/swap small string optimization. But separating them does seem like a
 good idea.

I'd expect a fair chunk of code to be split off into free functions, 
then both containers would make use of them.

Now note that there is also std.container Array!bool that has reference 
semantics and got to be doing bit packing... it could potentially take 
place of big BitArray mentioned.
-- 
Dmitry Olshansky

Jan 02 2013

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Thursday, 3 January 2013 at 07:57:46 UTC, Dmitry Olshansky 
wrote:
 1/3/2013 6:05 AM, Era Scarecrow wrote:

 Hm, I'd think that having Slice type be:

 BitArraySlice{
 BitArray* bp;
 size_t start, end;
 // all else forwards to the pointed array
 }
 should work avoiding the most of code duplication. With any 
 luck inliner will expand all of this wrapping. To enable bulk 
 mode operations:

 void opOpSliceAssign(BitArraySlice rhs)
 {//has all the data to do it
    bulkOp(bp, start, end, rhs.bp, rhs.start, rhs.end);
 }

 then it's a question of bit of refactoring & tweaking of array 
 bulks ops.

 Then the only problem left is slice invalidation and original 
 array going out of scope.

  I know we've had this discussion before. So let's assume you do 
slicing this way. So what happens when...

  BitArray ba = [0,1,1,0];

  ba = ba[1 .. $-1]; // ba is not a BitArraySlice

  Suddenly it won't work and slicing is only a range and can only 
be used in foreach. Correct? You could do appending but likely it 
would require two functions in the original code (one for a 
BitArray and one for a slice). It it supports a opAssign to 
handle that now you have to take care of all opAssign's as well 
(like C++, once you handle one portion of it and opCast, you 
handle them all).

 Does that mean we might be able to drop opApply? Hmmm... 
 Depends on if we have enough implemented so the range accepts 
 it. I'll glance over std.range.

 Yes, it's got to be a range to work like this and then opApply 
 is useless.

  Maybe. Foreach if I remember right worked three different ways, 
and selected whichever one worked. opApply, front/pop/empty, and 
array (opIndex access). If enough is implemented that it 
recognizes it as an array then opApply can be removed.

  A quick tests shows that currently isn't the case.

 Really depends on the use case. I'm guessing there's enough 
 call for the 64bit small packed bitarray that justifies it, 
 otherwise it would be better to throw it out. As you mention 
 later, separating them does seem like a better idea.

 I've meant the fact that it has to preserve current std lib 
 behavior *and* has small string optimization that makes it 
 overly painful and partially defeats the optimization (by 
 requiring pointless compact->big conversions on slice etc.).

  We could always 'dup' on slice instead, but then you can't use 
references on it, at which point it's a value type. Hmmm. Being a 
pure value type then a separate range makes more sense to use.

 hmmm.. Either would need some code duplication, or template to 
 turn off/swap small string optimization. But separating them 
 does seem like a good idea.

 I'd expect a fair chunk of code to be split off into free 
 functions, then both containers would make use of them.

 Now note that there is also std.container Array!bool that has 
 reference semantics and got to be doing bit packing... it could 
 potentially take place of big BitArray mentioned.

  I'm aware of the container version, and overall if it can just 
replace BitArray as it is, then we might consider only worrying 
about a compact version, at which point it would become a 
template... Might start seeing BitArray!1024  in places (or 
BitString!1024 ?).

Jan 03 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

1/3/2013 2:20 PM, Era Scarecrow пишет:
 On Thursday, 3 January 2013 at 07:57:46 UTC, Dmitry Olshansky wrote:
 1/3/2013 6:05 AM, Era Scarecrow wrote:

 Hm, I'd think that having Slice type be:

 BitArraySlice{
 BitArray* bp;
 size_t start, end;
 // all else forwards to the pointed array
 }
 should work avoiding the most of code duplication. With any luck
 inliner will expand all of this wrapping. To enable bulk mode operations:

 void opOpSliceAssign(BitArraySlice rhs)
 {//has all the data to do it
    bulkOp(bp, start, end, rhs.bp, rhs.start, rhs.end);
 }

 then it's a question of bit of refactoring & tweaking of array bulks ops.

 Then the only problem left is slice invalidation and original array
 going out of scope.

   I know we've had this discussion before. So let's assume you do
 slicing this way. So what happens when...

   BitArray ba = [0,1,1,0];

   ba = ba[1 .. $-1]; // ba is not a BitArraySlice

   Suddenly it won't work and slicing is only a range and can only be
 used in foreach.

No surprise here. Basically container != range over it. It all flows 
from there. Range doesn't have append nor it owns any data.
About foreach see below. Yes, the BitArray container won't pass 
hasSlicing trait, but it shouldn't as it's meant for ranges.
(so in a sense my comment about trying to mimic built-in array applies)

And slice can have opIndex and opSlice being a random access range of bool.

To help people convert one to the other (exactly where they need it) we 
can add .dup(?) for the slice that allocates a BitArray and assigns 
values from range.

 Correct? You could do appending but likely it would
 require two functions in the original code (one for a BitArray and one
 for a slice).

Again I don't think there is a need to append to a slice. Create an 
array out of slice (or in fact any range of bool) then append.

It it supports a opAssign to handle that now you have to
 take care of all opAssign's as well (like C++, once you handle one
 portion of it and opCast, you handle them all).

Yes, I'm afraid that for optimal speed they both have to be implemented.
The version for BitArray can just forward to the one for slice over the 
whole of it. Or there could be a common template function that handles 
all of bit-ops over [a, b) portions of BitArrays. Everything else then 
forwards to it.

 Does that mean we might be able to drop opApply? Hmmm... Depends on
 if we have enough implemented so the range accepts it. I'll glance
 over std.range.

 Yes, it's got to be a range to work like this and then opApply is
 useless.

   Maybe. Foreach if I remember right worked three different ways, and
 selected whichever one worked. opApply, front/pop/empty, and array
 (opIndex access). If enough is implemented that it recognizes it as an
 array then opApply can be removed.

There are 2 ways: range or opApply. How built-in arrays are done is up 
to compiler, there is no way to tap into it from user level.

There is an extra rule however, it is the following. If the object in 
question is not a range and doesn't have opApply but has opSlice() then 
it slices it first, then again it checks the 2 ways on this slice - as 
range or opApply.
   A quick tests shows that currently isn't the case.

 Really depends on the use case. I'm guessing there's enough call for
 the 64bit small packed bitarray that justifies it, otherwise it would
 be better to throw it out. As you mention later, separating them does
 seem like a better idea.

 I've meant the fact that it has to preserve current std lib behavior
 *and* has small string optimization that makes it overly painful and
 partially defeats the optimization (by requiring pointless
 compact->big conversions on slice etc.).

   We could always 'dup' on slice instead, but then you can't use
 references on it, at which point it's a value type.

Dup on slice is very bad. Slice in container world (I'm looking at 
std.container and this not likely to change) is a view of the same data, 
thus it's *not* a new container on its own.

Hmmm. Being a pure
 value type then a separate range makes more sense to use.

 hmmm.. Either would need some code duplication, or template to turn
 off/swap small string optimization. But separating them does seem
 like a good idea.

 I'd expect a fair chunk of code to be split off into free functions,
 then both containers would make use of them.

 Now note that there is also std.container Array!bool that has
 reference semantics and got to be doing bit packing... it could
 potentially take place of big BitArray mentioned.

   I'm aware of the container version, and overall if it can just replace
 BitArray as it is, then we might consider only worrying about a compact
 version, at which point it would become a template... Might start seeing
 BitArray!1024  in places (or BitString!1024 ?).

I'm not sure if it will gain that much to have templated fixed size bit 
array over one that just has small optimization. Could be a lot but 
somehow I don't expect much, that needs some measurement.

-- 
Dmitry Olshansky

Jan 03 2013

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Thursday, 3 January 2013 at 15:48:50 UTC, Dmitry Olshansky 
wrote:
 1/3/2013 2:20 PM, Era Scarecrow wrote:
 Suddenly it won't work and slicing is only a range and can 
 only be used in foreach.

 No surprise here. Basically container != range over it. It all 
 flows from there. Range doesn't have append nor it owns any 
 data.
 About foreach see below. Yes, the BitArray container won't pass 
 hasSlicing trait, but it shouldn't as it's meant for ranges. 
 (so in a sense my comment about trying to mimic built-in array 
 applies)

 And slice can have opIndex and opSlice being a random access 
 range of bool.

  Hmmm.. shouldn't try to make it as close to an actual array as 
possible. I feel like I'm missing something very very simple to 
making it all correct.

 To help people convert one to the other (exactly where they 
 need it) we can add .dup(?) for the slice that allocates a 
 BitArray and assigns values from range.

 Correct? You could do appending but likely it would require 
 two functions in the original code (one for a BitArray and one 
 for a slice).

 Again I don't think there is a need to append to a slice. 
 Create an array out of slice (or in fact any range of bool) 
 then append.

  I want to say you're wrong, I'm thinking heavily in the field 
where you would be doing compression or encryption where random 
access and appending could be needed at any step. I can see 
creating a simple table of the compressed sequences (Say 
huffman), and then appending the slices which then greatly 
simplifies the interface. I can also see where if you append to a 
slice it automatically making a new block of memory would be 
useful as well, as you know you're going to start fresh. I can 
make a short test set of functions and post them if you like, 
doing ranges & slices otherwise might not allow such diversity 
(simplicity?) in the code.

 If it supports a opAssign to handle that now you have to take 
 care of all opAssign's as well (like C++, once you handle one 
 portion of it and opCast, you handle them all).

 Yes, I'm afraid that for optimal speed they both have to be 
 implemented. The version for BitArray can just forward to the 
 one for slice over the whole of it. Or there could be a common 
 template function that handles all of bit-ops over [a, b) 
 portions of BitArrays. Everything else then forwards to it.

 Does that mean we might be able to drop opApply? Hmmm... 
 Depends on if we have enough implemented so the range 
 accepts it. I'll glance over std.range.

 Yes, it's got to be a range to work like this and then 
 opApply is useless.

 Maybe. Foreach if I remember right worked three different 
 ways, and selected whichever one worked. opApply, 
 front/pop/empty, and array (opIndex access). If enough is 
 implemented that it recognizes it as an array then opApply can 
 be removed.

 There are 2 ways: range or opApply. How built-in arrays are 
 done is up to compiler, there is no way to tap into it from 
 user level.

 There is an extra rule however, it is the following. If the 
 object in question is not a range and doesn't have opApply but 
 has opSlice() then it slices it first, then again it checks the 
 2 ways on this slice - as range or opApply.

  I didn't know that; I know it attempted a slice but not that the 
slice has to be a range. Likely that needs to be emphasized in 
the documentation.

  A quick tests shows that currently isn't the case.

 Really depends on the use case. I'm guessing there's enough 
 call for the 64bit small packed bitarray that justifies it, 
 otherwise it would be better to throw it out. As you mention 
 later, separating them does seem like a better idea.

 I've meant the fact that it has to preserve current std lib 
 behavior *and* has small string optimization that makes it 
 overly painful and partially defeats the optimization (by 
 requiring pointless compact->big conversions on slice etc.).

 We could always 'dup' on slice instead, but then you can't use 
 references on it, at which point it's a value type.

 Dup on slice is very bad. Slice in container world (I'm looking 
 at std.container and this not likely to change) is a view of 
 the same data, thus it's *not* a new container on its own.

 Hmmm. Being a pure value type then a separate range makes more 
 sense to use.

 hmmm.. Either would need some code duplication, or template 
 to turn off/swap small string optimization. But separating 
 them does seem like a good idea.

 I'd expect a fair chunk of code to be split off into free 
 functions, then both containers would make use of them.

 Now note that there is also std.container Array!bool that has 
 reference semantics and got to be doing bit packing... it 
 could potentially take place of big BitArray mentioned.

 I'm aware of the container version, and overall if it can just 
 replace BitArray as it is, then we might consider only 
 worrying about a compact version, at which point it would 
 become a template... Might start seeing BitArray!1024  in 
 places (or BitString!1024 ?).

 I'm not sure if it will gain that much to have templated fixed 
 size bit array over one that just has small optimization. Could 
 be a lot but somehow I don't expect much, that needs some 
 measurement.

  I agree, but if it's going to be a value type that's heavy on 
small string optimization with as little referencing as possible, 
then it has to know what size to depend on, since resizing beyond 
the max may not be an option if we take that part out.

Jan 03 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

04-Jan-2013 00:11, Era Scarecrow пишет:
 On Thursday, 3 January 2013 at 15:48:50 UTC, Dmitry Olshansky wrote:
 1/3/2013 2:20 PM, Era Scarecrow wrote:
 Suddenly it won't work and slicing is only a range and can only be
 used in foreach.

 No surprise here. Basically container != range over it. It all flows
 from there. Range doesn't have append nor it owns any data.
 About foreach see below. Yes, the BitArray container won't pass
 hasSlicing trait, but it shouldn't as it's meant for ranges. (so in a
 sense my comment about trying to mimic built-in array applies)

 And slice can have opIndex and opSlice being a random access range of
 bool.

   Hmmm.. shouldn't try to make it as close to an actual array as
 possible. I feel like I'm missing something very very simple to making
 it all correct.

[snip]
 Again I don't think there is a need to append to a slice. Create an
 array out of slice (or in fact any range of bool) then append.

   I want to say you're wrong, I'm thinking heavily in the field where
 you would be doing compression or encryption where random access and
 appending could be needed at any step. I can see creating a simple table
 of the compressed sequences (Say huffman), and then appending the slices
 which then greatly simplifies the interface.

Appending a slice *to* BitArray is perfectly fine as in fact any range 
of bool (or bit if you like). Any array-like or string-like container 
has to support appending a range of element type (and insertion for that 
matter). The table you mention is then an array of BitArray that is 
sliced (getting a range) and these slices are appended to some other 
BitArray. BitArrays in this table could also be appended to just fine 
(so you construct sequences out of others easily).

The key point is that *appending a range to container* is fine (and 
idiomatic) and *slicing is O(1), fast and non-allocating*.

I can also see where if you
 append to a slice it automatically making a new block of memory would be
 useful as well, as you know you're going to start fresh.

Maybe but I can see it being a nasty surprise in a tight loop.

 I can make a
 short test set of functions and post them if you like, doing ranges &
 slices otherwise might not allow such diversity (simplicity?) in the 

code.

Would be great to put them together as we have implementation details 
sorted out but have trouble with usage patterns.

 There is an extra rule however, it is the following. If the object in
 question is not a range and doesn't have opApply but has opSlice()
 then it slices it first, then again it checks the 2 ways on this slice
 - as range or opApply.

   I didn't know that; I know it attempted a slice but not that the slice
 has to be a range. Likely that needs to be emphasized in the documentation.

That's pretty much how std.container tries to work. There is little 
documentation on the subject though, same note IRC is in TDPL book 
(highly recommend that one if you don't have it already).

[snip]

 I'm aware of the container version, and overall if it can just
 replace BitArray as it is, then we might consider only worrying about
 a compact version, at which point it would become a template... Might
 start seeing BitArray!1024  in places (or BitString!1024 ?).

 I'm not sure if it will gain that much to have templated fixed size
 bit array over one that just has small optimization. Could be a lot
 but somehow I don't expect much, that needs some measurement.

   I agree, but if it's going to be a value type that's heavy on small
 string optimization with as little referencing as possible, then it has
 to know what size to depend on, since resizing beyond the max may not be
 an option if we take that part out.

I meant it's still resizing up to any point, just that it does dup's 
underlying array on copy (if not small, if small then postblit does 
nothing). It's questionable to dup on copy but the idea is that it won't 
be often.

If you mean to make type that doesn't resize beyond some maximum (so 
it's always "small") then indeed having fixed size as template parameter 
makes more sense.

-- 
Dmitry Olshansky

Jan 03 2013

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Thursday, 3 January 2013 at 21:15:19 UTC, Dmitry Olshansky 
wrote:
 04-Jan-2013 00:11, Era Scarecrow wrote:

 Appending a slice *to* BitArray is perfectly fine as in fact 
 any range of bool (or bit if you like). Any array-like or 
 string-like container has to support appending a range of 
 element type (and insertion for that matter). The table you 
 mention is then an array of BitArray that is sliced (getting a 
 range) and these slices are appended to some other BitArray. 
 BitArrays in this table could also be appended to just fine (so 
 you construct sequences out of others easily).

 The key point is that *appending a range to container* is fine 
 (and idiomatic) and *slicing is O(1), fast and non-allocating*.

  So the slice is a non-allocating view of data some data. I can 
see adding more limitations for that then.

 I can also see where if you append to a slice it automatically 
 making a new block of memory would be useful as well, as you 
 know you're going to start fresh.

 Maybe but I can see it being a nasty surprise in a tight loop.

  Only on the first iteration, unless you do a LOT of slicing & 
appending on non-compact blocks, but if you're appending to a 
slice in normal arrays it will dup it anyways (if it doesn't own 
the following memory) so the behavior doesn't really change.

 I can make a short test set of functions and post them if you 
 like, doing ranges & slices otherwise might not allow such 
 diversity (simplicity?) in the code.

 Would be great to put them together as we have implementation 
 details sorted out but have trouble with usage patterns.

  K, I'll likely re-work my multi-level huffman algorithmn and a 
LZW compression, although the LZW wouldn't make use of the more 
exotic features. Got half the LZW written, but I likely won't get 
this done for a few days.

 There is an extra rule however, it is the following. If the 
 object in question is not a range and doesn't have opApply 
 but has opSlice() then it slices it first, then again it 
 checks the 2 ways on this slice - as range or opApply.

 I didn't know that; I know it attempted a slice but not that 
 the slice has to be a range. Likely that needs to be 
 emphasized in the documentation.

 That's pretty much how std.container tries to work. There is 
 little documentation on the subject though, same note IRC is in 
 TDPL book (highly recommend that one if you don't have it 
 already).

  Got a copy, in fact this is one of the first prints. I've likely 
read this book a half dozen times to get a full feel of the 
language (other than templates which I've learned elsewhere)

 I agree, but if it's going to be a value type that's heavy on 
 small string optimization with as little referencing as 
 possible, then it has to know what size to depend on, since 
 resizing beyond the Max may not be an option if we take that 
 part out.

 I meant it's still resizing up to any point, just that it does 
 dup's underlying array on copy (if not small, if small then 
 postblit does nothing). It's questionable to dup on copy but 
 the idea is that it won't be often.

 If you mean to make type that doesn't resize beyond some 
 maximum (so it's always "small") then indeed having fixed size 
 as template parameter makes more sense.

  That was the idea. A true value type with no hidden allocation. 
There are many cases for it, but I think having the BitArray 
which can allocate has a larger audience than a fixed size. Won't 
know for sure though, maybe it's the other way around.

Jan 03 2013

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Thursday, 3 January 2013 at 21:45:24 UTC, Era Scarecrow wrote:
 K, I'll likely re-work my multi-level huffman algorithmn and a 
 LZW compression, although the LZW wouldn't make use of the more 
 exotic features. Got half the LZW written, but I likely won't 
 get this done for a few days.

  Nearly done. Got a working LZW, multi-level huffman working, 
only cleanup/refactoring to do, and a way to save/load the tree 
structure for external storage.

  Do I assume you would want me to post it? (Even incomplete and 
lacking a good portion of documentation/unittests)

Jan 06 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

07-Jan-2013 00:51, Era Scarecrow пишет:
 On Thursday, 3 January 2013 at 21:45:24 UTC, Era Scarecrow wrote:
 K, I'll likely re-work my multi-level huffman algorithmn and a LZW
 compression, although the LZW wouldn't make use of the more exotic
 features. Got half the LZW written, but I likely won't get this done
 for a few days.

   Nearly done. Got a working LZW, multi-level huffman working, only
 cleanup/refactoring to do, and a way to save/load the tree structure for
 external storage.

   Do I assume you would want me to post it? (Even incomplete and lacking
 a good portion of documentation/unittests)

Sure thing! I'd love to check it out.

-- 
Dmitry Olshansky

Jan 06 2013

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Sunday, 6 January 2013 at 21:14:08 UTC, Dmitry Olshansky wrote:
 07-Jan-2013 00:51, Era Scarecrow wrote:

 Nearly done. Got a working LZW, multi-level huffman working, 
 only cleanup/refactoring to do, and a way to save/load the 
 tree structure for external storage.

 Do I assume you would want me to post it? (Even incomplete and 
 lacking a good portion of documentation/unittests)

 Sure thing! I'd love to check it out.

  Here you go. It's about 16k currently (5k for LZW, 10k for MLH). 
I write on the fly so some of it isn't as thought out as it could 
be.

  https://github.com/rtcvb32/Side-Projects/blob/master/mlh.d

Jan 06 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

07-Jan-2013 01:54, Era Scarecrow пишет:
 On Sunday, 6 January 2013 at 21:14:08 UTC, Dmitry Olshansky wrote:
 07-Jan-2013 00:51, Era Scarecrow wrote:

 Nearly done. Got a working LZW, multi-level huffman working, only
 cleanup/refactoring to do, and a way to save/load the tree structure
 for external storage.

 Do I assume you would want me to post it? (Even incomplete and
 lacking a good portion of documentation/unittests)

 Sure thing! I'd love to check it out.

   Here you go. It's about 16k currently (5k for LZW, 10k for MLH). I
 write on the fly so some of it isn't as thought out as it could be.

   https://github.com/rtcvb32/Side-Projects/blob/master/mlh.d

Thanks, unfortunately I got busy with my own stuff ... bookmarked for 
review.

-- 
Dmitry Olshansky

Jan 07 2013

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Saturday, 28 July 2012 at 20:59:15 UTC, Dmitry Olshansky wrote:
 Great! But I strongly suggest to repost it in d.D newsgroup as 
 it has wider audience and is more appropriate for reviews.

  I was thinking of not bothering Andrei or Walter while i fixed 
other issues on it before bringing more major ones forward (which 
is why i brought it to D.learn), but re-posting it there is easy 
enough

 My solution is make slices different type e.g. BitSlice. It's 
 always reference to the original BitArray. Array itself still 
 can be either compact or not. All operations with slice go to 
 array (and still check if they can use word-aligned speed up).

 How does it sound?

  Someone else also suggested making the slice different (Meaning 
a new Bitarray always is at max compact size). Since we need at 
least a single flag to determine if the array is compact or array 
reference, there would be a larger loss which the current 
offset/maxOffset handle quite nicely filling in that space. 
Making BitSlice separate suggests BitArray would be the bare 
minimum (canUseBulk, length, and index operations) while bitSlice 
would handle all others. But if you do that, either BitSlice 
would have to be inherently convertible to BitArray if you want 
to pass it to a function that expects a BitArray. I'm hesitant to 
do it, but if there's a strong enough argument for it I wouldn't 
refuse it.

 opCat isn't it just operator "~" and "~=" ? You can use 
 opOpAssign for "~=" IRC.

  Yes, I think opAssign is used for ~= already, but doing the 
following requires opCat and opCat_r.

  BitArray a; //filled with something
  auto x = true ~ a;
  auto y = a ~ true;

  I just can't get it to accept it (for whatever reason). Compiler 
bug?

 I suspect to!string wasn't CTFEable. Regardless you can catch a 
 word from Andrei Alexandrescu on the main D newsgroup who (I 
 think) came up with it in the first place.

  If we can use the std.string functions I'll replace them all :) 
(As I was just following the theme since they were present).

  I'll move this to the D group and see what we get for this.

Jul 28 2012

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 29-Jul-12 01:20, Era Scarecrow wrote:
 On Saturday, 28 July 2012 at 20:59:15 UTC, Dmitry Olshansky wrote:
 Great! But I strongly suggest to repost it in d.D newsgroup as it has
 wider audience and is more appropriate for reviews.

   I was thinking of not bothering Andrei or Walter while i fixed other
 issues on it before bringing more major ones forward (which is why i
 brought it to D.learn), but re-posting it there is easy enough

 My solution is make slices different type e.g. BitSlice. It's always
 reference to the original BitArray. Array itself still can be either
 compact or not. All operations with slice go to array (and still check
 if they can use word-aligned speed up).

 How does it sound?

   Someone else also suggested making the slice different (Meaning a new
 Bitarray always is at max compact size).

Obviously that was me (blackwhale) ;)

Since we need at least a single
 flag to determine if the array is compact or array reference, there
 would be a larger loss which the current offset/maxOffset handle quite
 nicely filling in that space.

Not sure I followed you - what with offset/maxOffset ? You mean less 
space available for compact version?

  Making BitSlice separate suggests BitArray
 would be the bare minimum (canUseBulk, length, and index operations)
 while bitSlice would handle all others. But if you do that, either
 BitSlice would have to be inherently convertible to BitArray if you want
 to pass it to a function that expects a BitArray.

Take a look at std.container from what I see there Array does have a 
separate range type.

I'm hesitant to do it,
 but if there's a strong enough argument for it I wouldn't refuse it.

 opCat isn't it just operator "~" and "~=" ? You can use opOpAssign for
 "~=" IRC.

   Yes, I think opAssign is used for ~= already, but doing the following
 requires opCat and opCat_r.

   BitArray a; //filled with something
   auto x = true ~ a;
   auto y = a ~ true;

   I just can't get it to accept it (for whatever reason). Compiler bug?

Hm opBinaryRight with if (op == "~") should work.

 I suspect to!string wasn't CTFEable. Regardless you can catch a word
 from Andrei Alexandrescu on the main D newsgroup who (I think) came up
 with it in the first place.

   If we can use the std.string functions I'll replace them all :) (As I
 was just following the theme since they were present).

   I'll move this to the D group and see what we get for this.


-- 
Dmitry Olshansky

Jul 28 2012

"bearophile" <bearophileHUGS lycos.com> writes:

Era Scarecrow:

 BitFields:
  I've concentrated on adding default values.

Good, that's useful.


Regarding BitArray I have written some ER:

http://d.puremagic.com/issues/show_bug.cgi?id=4123
http://d.puremagic.com/issues/show_bug.cgi?id=4124
http://d.puremagic.com/issues/show_bug.cgi?id=4717
http://d.puremagic.com/issues/show_bug.cgi?id=7487
http://d.puremagic.com/issues/show_bug.cgi?id=7488
http://d.puremagic.com/issues/show_bug.cgi?id=7490

Related:
http://d.puremagic.com/issues/show_bug.cgi?id=6697

Bye,
bearophile

Jul 28 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Saturday, 28 July 2012 at 22:43:22 UTC, bearophile wrote:
 Era Scarecrow:

 BitFields:
 I've concentrated on adding default values.

 Good, that's useful.


 Regarding BitArray I have written some ER:

 http://d.puremagic.com/issues/show_bug.cgi?id=4123
 http://d.puremagic.com/issues/show_bug.cgi?id=4124
 http://d.puremagic.com/issues/show_bug.cgi?id=4717
 http://d.puremagic.com/issues/show_bug.cgi?id=7487
 http://d.puremagic.com/issues/show_bug.cgi?id=7488
 http://d.puremagic.com/issues/show_bug.cgi?id=7490

4123 - fixed
4124 - Set/reset all bits ie:
   BitArray ba = BitArray(100);
   ba[] = true;
Others haven't been done yet.

4717 - Same as 4124, none others done yet.
7487 - Done. When prepending it extends the array to size_t extra 
and slowly backtracks until it needs to do it again. To go 
further, have to disable re-slicing of the allocated array.

Or perhaps a 'start at end' during constructor stage. Hmmm..

7488 - Done, union used and is 'compact' version (by default or 
resized and can fit)

7490 - haven't touched yet. :(

 Related:
 http://d.puremagic.com/issues/show_bug.cgi?id=6697

  Not so sure. Could make a multi-dimentional one that returns 
slices to various sections, but  that's iffy. I'd need an example 
of how you'd use it with BitArray before I could build a solution.

Jul 28 2012

"bearophile" <bearophileHUGS lycos.com> writes:

Era Scarecrow:

As you guess I have had to use many arrays of bits in my programs 
:-)

 4124 - Set/reset all bits ie:
   BitArray ba = BitArray(100);
   ba[] = true;
 Others haven't been done yet.

Among those 4124 suggestions the most important, and very simple 
to implement, are:
- set n-th bit (this can be a little more efficient than bs[n]=1;)
- reset n-th bit (this can be a little more efficient than 
bs[n]=1;)

Another commonly needed operation is a very fast bit count. There 
are very refined algorithms to do this.


 7487 - Done. When prepending it extends the array to size_t 
 extra and slowly backtracks until it needs to do it again.

Leaving a bit of "capacity" at the beginning is a good idea.


 7488 - Done, union used and is 'compact' version (by default or 
 resized and can fit)

Good, this makes BitArray usable in many other situations :-)


 Related:
 http://d.puremagic.com/issues/show_bug.cgi?id=6697

  Not so sure. Could make a multi-dimentional one that returns 
 slices to various sections, but  that's iffy. I'd need an 
 example of how you'd use it with BitArray before I could build 
 a solution.

I have needed many times a fast BitMatrix, but I see it as a data 
structure distinct from BitArray, so don't worry about making 
them inter-operate.

For me a BitMatrix must be as fast as possible, even if this 
causes some waste of memory (see my implementation in the attach 
of issue 6697) and I use it for medium and large bit matrices. 
Generally I don't to count the set bits in a bit matrix. So the 
two data structures need different capabilities and they are 
better implemented in different ways (this means a FastBitMatrix 
is not an array of a BitArray!).

Taking a row of a FastBitMatrix and turning it into a a BitArray 
sounds cute, but generally I don't need to do this operation, so 
I don't care about this feature.

Bye,
bearophile

Jul 29 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Sunday, 29 July 2012 at 14:41:48 UTC, bearophile wrote:
 As you guess I have had to use many arrays of bits in my 
 programs :-)

  I'll do what I can :)

 4124 - Set/reset all bits ie:
  BitArray ba = BitArray(100);
  ba[] = true;
 Others haven't been done yet.

 Among those 4124 suggestions the most important, and very 
 simple to implement, are:
 - set n-th bit (this can be a little more efficient than 
 bs[n]=1;)
 - reset n-th bit (this can be a little more efficient than 
 bs[n]=1;)

  If it has to determine between compact and non-compact, then I 
really don't see how any speed improvement can be made, and if 
it's optimized it may be inlined which would be about as fast as 
you could get.

 Another commonly needed operation is a very fast bit count. 
 There are very refined algorithms to do this.

  Likely similar to the hamming weight table mentioned in TDPL. 
Combined with the canUseBulk I think I could make it fairly fast.

 7487 - Done. When prepending it extends the array to size_t 
 extra and slowly backtracks until it needs to do it again.

 Leaving a bit of "capacity" at the beginning is a good idea.

  With how it's made for speed and slices, it had to :P

 7488 - Done, union used and is 'compact' version (by default 
 or resized and can fit)

 Good, this makes BitArray usable in many other situations :-)

  Yes, currently a debate of it's partial ref/value issue is 
coming up. As it is it's usable, and if you want to be sure it's 
unique you can always dup it, otherwise as a previous suggestion 
I'll either make two array types for either full reference vs 
value, or take his separate slices (reference) idea.

 Related:
 http://d.puremagic.com/issues/show_bug.cgi?id=6697

 Not so sure. Could make a multi-dimentional one that returns 
 slices to various sections, but  that's iffy. I'd need an 
 example of how you'd use it with BitArray before I could build 
 a solution.

 I have needed many times a fast BitMatrix, but I see it as a 
 data structure distinct from BitArray, so don't worry about 
 making them inter-operate.

  A Matrix is just a multidimensional array right? I'll have to 
read up on it a bit; doing simple division (Likely whole rows of 
32bit at a time) would do that job, but expanding width several 
times (compared to height) would be quite a bit slower; Although 
not too horrible.

 For me a BitMatrix must be as fast as possible, even if this 
 causes some waste of memory (see my implementation in the 
 attach of issue 6697) and I use it for medium and large bit 
 matrices. Generally I don't to count the set bits in a bit 
 matrix. So the two data structures need different capabilities 
 and they are better implemented in different ways (this means a 
 FastBitMatrix is not an array of a BitArray!).

  more likely bool array/matrix.. Although if you heavily use 
specific lines it could convert to bool[], and then back again, 
but if you go all over the place it would be worse than just 
using bool[].

 Taking a row of a FastBitMatrix and turning it into a a 
 BitArray sounds cute, but generally I don't need to do this 
 operation, so I don't care about this feature.

Jul 29 2012

"bearophile" <bearophileHUGS lycos.com> writes:

Era Scarecrow:

  If it has to determine between compact and non-compact, then I 
 really don't see how any speed improvement can be made,

Maybe it's better to consider two use cases: dynamic length 
allocated on the heap (or allocated with a given allocator, often 
a stack-styled allocator), and fixed-size allocated in place, so 
there's no need for that run-time test.


 and if it's optimized it may be inlined which would be about as 
 fast as you could get.

I think that  "a[x] = 1;"  is slower than  "a.set(x);"  even if 
the array a is allocated in place and everything is inlined. This 
is probably the most important operation an array of bits has to 
support. If this is missing, I have to use my own implementation 
still.


  Likely similar to the hamming weight table mentioned in TDPL. 
 Combined with the canUseBulk I think I could make it fairly 
 fast.

There is lot of literature about implementing this operation 
efficiently. For the first implementation a moderately fast (and 
short) code is probably enough. Later faster versions of this 
operation will go in Phobos, coming from papers.


  Yes, currently a debate of it's partial ref/value issue is 
 coming up. As it is it's usable, and if you want to be sure 
 it's unique you can always dup it, otherwise as a previous 
 suggestion I'll either make two array types for either full 
 reference vs value, or take his separate slices (reference) 
 idea.

The idea of making two different types (for the dynamic and 
static versions) sounds nice.

There is a third use case, that is bit arrays that start small 
and can grow, and rarely grow bigger than one or two words. This 
asks for a dynamic-length bit array type that allocates locally 
only when it's small (so it needs to perform run-time test of the 
tag. Too bad the D GC doesn't support tagged pointers!). But 
probably this use case is not common enough to require a third 
type of array. So I suggest to not worry about this case now, and 
to focus on the other two more common ones.


  A Matrix is just a multidimensional array right?

Well, yes, but there are many ways to implement arrays and nD 
arrays. Sometimes the implementations differ.


 I'll have to read up on it a bit; doing simple division (Likely 
 whole rows of 32bit at a time) would do that job,

Take a look at my implementation in Bugzilla :-) It's another 
type, the third. The 1D and 2D cases are the most common.


 but expanding width several times (compared to height) would be 
 quite a bit slower; Although not too horrible.

A 2D array of bits probably doesn't need to change its size after 
creation, this is not a common use case.


  more likely bool array/matrix.. Although if you heavily use 
 specific lines it could convert to bool[], and then back again, 
 but if you go all over the place it would be worse than just 
 using bool[].

It's not so simple :-) A bool[] causes many more cache misses, if 
the matrix is not tiny. A bool[] or bool[][] is a good idea only 
if the matrix is very small, or if you don't have a bit matrix 
library and you don't want to write lot of code. Converting to 
bool[] the current line is not a good idea.

Bye,
bearophile

Jul 29 2012

Don Clugston <dac nospam.com> writes:

On 29/07/12 23:36, bearophile wrote:
 Era Scarecrow:

 Another commonly needed operation is a very fast bit count. There 



are very refined algorithms to do this.
  Likely similar to the hamming weight table mentioned in TDPL.
 Combined with the canUseBulk I think I could make it fairly fast.

 There is lot of literature about implementing this operation
 efficiently. For the first implementation a moderately fast (and short)
 code is probably enough. Later faster versions of this operation will go
 in Phobos, coming from papers.

See bug 4717. On x86, even on 32 bits you can get close to 1 cycle per 
byte, on 64 bits you can do much better.

Jul 31 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

  Currently since I've been concentrating more on bitfields and 
BitArray needs more work, the two separate patches and trees are 
out of sync and apparently the auto-tester brings them up as a 
fail. How do I remove one of the pulls so I can concentrate on 
the one set of features and continue the others later?

  (Course I could just update what I have for BitArray so it works 
as a value-type for now too...)

Aug 02 2012

"David Nadlinger" <see klickverbot.at> writes:

On Thursday, 2 August 2012 at 11:11:58 UTC, Era Scarecrow wrote:
 How do I remove one of the pulls so I can concentrate on the 
 one set of features and continue the others later?

Hm? You can just close pull requests (button at the bottom of the 
page), and reopen them later.

David

Aug 02 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Thursday, 2 August 2012 at 11:26:42 UTC, David Nadlinger wrote:
 On Thursday, 2 August 2012 at 11:11:58 UTC, Era Scarecrow wrote:
 How do I remove one of the pulls so I can concentrate on the 
 one set of features and continue the others later?

 Hm? You can just close pull requests (button at the bottom of 
 the Page), and reopen them later.

  Most of this doesn't make sense to me still; just one of those 
things I'll learn as I use it over the next few years. Anyways 
when the bitfields are as updated as we're going to make it, I'll 
resume work on BitArray code.

  In the meantime I guess I'll check on the auto-tester once it 
refreshes.

Aug 02 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Thursday, 2 August 2012 at 11:26:42 UTC, David Nadlinger wrote:
 Hm? You can just close pull requests (button at the bottom of 
 the Page), and reopen them later.

  So I've removed the pull, but the auto tester is still pulling 
both code sources, so it's failing automatically because of it. 
Do I close the bitfields pull too? Or do I open another/new pull 
for it?

Aug 03 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

Following from issue list:

bearophile_hugs 2010-10-04 18:16:44 PDT
Another possible feature is to make std.bitmanip.bitfields 
generate two

versions of the code, that get compiled conditionally according 
to the CPU
endianess:

 static if (std.system.endian == std.system.Endian.BigEndian) {
    ...
 } else {
     ...
 }

Having it endian specific brings up a couple questions.

  Which endian is 'correct'? Resolutions could be...
   a) Both are correct (Leave as is)
   b) big/little is correct and should always convert to the other 
(which one?)
   c) Specify correct type as a flag, then swaps for the other in 
generated code

  Second, which would be the best approach?
   a) swap bytes in a temporary, modify the temporary, replace 
storage? (In every function)
   b) use current endianness, then prepping saving/loading have it 
swap? (very unlikely)
   c) use ubyte array and use shifting on each byte (small Endian, 
also allows for odd byte sizes, not very likely, but possible)

  I know an instruction for the Pentium system (686+?) bswap will 
swap the appropriate bytes as a conversion and takes a single 
cycle, using assembler that can be incorporated assuming it's 
ready for it (course than the template would be updated more 
likely and not the bitfield template).

Aug 02 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

repost from issue list:

Austin Hastings 2010-09-24 18:06:53 PDT
 Also, I think I'm going to request that repeated bitfield 
 definitions be allowed if they are identical - I'd like to 
 redeclare "opcode" rather than "".

How would you tell them apart? If I know how you may want to call 
them, I may
be able to make something. I can understand with registers, but 
still need some
way to work with them. Perhaps as a set then?

 So I would like the bitmanip code to permit redeclaration of 
 bitfields that are identical in all respects.

 That is, obviously the names are the same, but the field width, 
 offset, and type representation has to be the same as well.

Maybe....?

struct S {
     mixin(bitfields!(
         uint, "opcode", 4,
         uint, "register", 4,
         uint, "register", 4,
         uint, "register", 4
     ));
}

and using the registers would have function signature like...

struct Register {
   uint register_1;
   uint register_2;
   uint register_3;

}

//setters, likely can't be  propery
void register(uint reg1, uint reg2, uint reg3);
void register(uint[] register ...); //maybe?
void register(Register register);

//getter ??
Register register() const;

Or perhaps...

struct S {
     mixin(bitfields!(
         uint, "opcode", 4,
         uint, "reg1", 4,
         uint, "reg2", 4,
         uint, "reg3", 4
     ));
     mixin(sharedNameSet(
         "nameForGetterAndSetter",
         "struct name for returning/passing",
         "reg1", "reg2", "reg3" //named variables as a set
     ));

     //nameForGetterAndSetter's would be added here, perhaps as 
above.
}

Aug 02 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

  Having insomnia has left me with a certain gifted curse. :P

  I've been thinking, and i think i've written BitArray all wrong, 
in order to better structure it i would break it up into four 
something parts. The following would remove the complication of 
'isCompact' and the union, the code can be  safe (rather than 
 trusted).

  BitString, this is the main of BitArray, handles all basic 
operations but doesn't manage memory of any kind.

  BitArrayRange, which is a reference forwarding for a BitString 
including various range functions.

  Then we come to memory management. This can be done simply as 
wrappers.

  BitArray - Memory management and dynamically 
allocates/reallocates as needed
  BitArrayFixed - Template that includes a fixed size of memory 
for small memory optimization.

It goes something like... (API not yet decided fully, but you 
should get the idea)

BitString {
   //bitfields for start/ending
   size_t[]* bulkRaw; //pointer to the array?
   inout(ubyte[]) noEndianBulkRaw() inout {
     return cast(inout(ubyte[])) bulkRaw;
   }
   this(size[]* raw) {
     bulkRaw = raw;
   }

   //binaryOperators and opIndex, etc
   //length modifier, but no memory re-allocation allowed
}

//Template with bool isConst?
//BitStringRange(bool isConst) {
//  static if (isConst) { const(BitString)* array; }
//  else                { BitString* array; }

BitStringRange {
   BitString* array;
   ulong start, end;
   //forwarding operators for start/end
   //range functions
}

BitArray {
   BitString array;
   size_t[] rawMemory;
   alias array this;
   //include length modifiers and dynamic allocation
   void init() {}
   //opCast can convert to BitArrayFixed?
}

BitArrayFixed(int fixedSize) {
   BitString array;
   size_t[fixedSize] rawMemory;
   alias array this;

   void init(){ array = BitString(rawMemory); }
   //opCast can convert to BitArray?
}

  Thoughts and comments?

Jan 09 2013

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Wednesday, 9 January 2013 at 20:00:59 UTC, Era Scarecrow wrote:
 BitString {
   //bitfields for start/ending
   size_t[]* bulkRaw; //pointer to the array?
   //binaryOperators and opIndex, etc
   //length modifier, but no memory re-allocation allowed
 }

 BitArray {
   BitString array;
   size_t[] rawMemory;
   alias array this;
   //include length modifiers and dynamic allocation
   void init() {}
   //opCast can convert to BitArrayFixed?
 }

  Hmm as I'm thinking about it more, a mixin for BitString may be 
better..

//    'size_t[] bulkRaw' required by enclosing struct/class
   mixin template BitString() {
     struct BitString {
     }
   }

   BitArray {
     mixin BitString;
     size_t[] bulkRaw;

     BitString array;
     alias array this;

     //length setter, handling allocation/reallocation
   }

   //since BitArray and BitArrayFixed may have different 
pointer/setups?
   //Or would either work together?
   BitArrayRange(BS, bool isConst) {
     BS* bitString;
   }

   If anyone wants to talk live with me, I tend to be on YIM, so 
rtcvb32 or Era_Scarecrow.

Jan 09 2013

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Wednesday, 9 January 2013 at 21:16:37 UTC, Era Scarecrow wrote:
  Hmm as I'm thinking about it more, a mixin for BitString may 
 be better..

  I'm having troubles trying to understand why this won't work. 
Anyone clarify?



mixin template BitString()
{
     struct BitString {
         size_t opIndex(int i) {
//Error: this for bulkRaw needs to be type BitArray not type 
BitString
             return bulkRaw[i];
         }
     }
}

struct BitArray {
     mixin BitString;
     size_t[] bulkRaw;
     BitString ba;
}

Jan 09 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

10-Jan-2013 01:47, Era Scarecrow пишет:
 On Wednesday, 9 January 2013 at 21:16:37 UTC, Era Scarecrow wrote:
  Hmm as I'm thinking about it more, a mixin for BitString may be better..


Overall direction is good (I mean separation of ranges/containers). I'll 
post more comments once I think it through.The minor detail is that 
BitString sounds like a individual container whereas it's a "binary view 
of packed bools".

   I'm having troubles trying to understand why this won't work. Anyone
 clarify?



 mixin template BitString()
 {
      struct BitString {
          size_t opIndex(int i) {
 //Error: this for bulkRaw needs to be type BitArray not type BitString
              return bulkRaw[i];
          }
      }
 }

You want to mixin a definition of a inner struct BitString into each bit 
array?
Or include all of functions of BitString to be mixed into BitArray?

If the latter then:

  mixin template BitString()
  {
       size_t opIndex(int i) {
               return bulkRaw[i];
           }
//....
  }

i.e. no struct it's just a template.

 struct BitArray {
      mixin BitString;
      size_t[] bulkRaw;
      BitString ba;
 }


-- 
Dmitry Olshansky

Jan 10 2013

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Thursday, 10 January 2013 at 20:11:38 UTC, Dmitry Olshansky 
wrote:
 You want to mixin a definition of a inner struct BitString into 
 each bit array?

  That was the idea, yes.

 Or include all of functions of BitString to be mixed into 
 BitArray?

 If the latter then:

  Well it seems like it should work, that it should include a 
silent pointer to the original BitArray (container) similar like 
delegates do for functions. Odd that it doesn't.  Bug?

  Seems with the problem on that, I'll do a different approach.

  BitArray(StorageType) {
    StorageType memory;

    //now we do memory.length =, and memory.rawBulk[] for direct 
access, may
    //add opIndex removing the need to know rawBulk at all
  }

  StorageType in this case should contain 'rawBulk', and the 
storage type handles the memory allocation, giving access to 
members no different than an array. I have two storage types i've 
written that hopefully handle all cases:

//storage types for BitArrays
struct Dynamic {
   size_t[] rawBulk;
   alias rawBulk this;
}

struct Fixed(int maxBits) {
   size_t[maxBits / (size_t.sizeof * 8)] rawBulk;
   int _length;    //fixed sizes are usually going to be rather 
small

   int length()  property const  safe pure nothrow {
       return _length;
   }
   void length(int setSize)  property const  safe pure nothrow {
       assert(setSize < rawBulk.length);
       _length = setSize;
   }
   void reserve(size_t resSize) {} //fixed
}

  Now when modifying memory referencing just treat memory as an 
array, access the raw data via rawBulk (likely will remove named 
access and go with opIndex). This removes the need for several 
workarounds I have currently. while trying to handle 
compact/allocated memory.

Jan 10 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

11-Jan-2013 00:33, Era Scarecrow пишет:
 On Thursday, 10 January 2013 at 20:11:38 UTC, Dmitry Olshansky wrote:
 You want to mixin a definition of a inner struct BitString into each
 bit array?

   That was the idea, yes.

 Or include all of functions of BitString to be mixed into BitArray?

 If the latter then:

   Well it seems like it should work, that it should include a silent
 pointer to the original BitArray (container) similar like delegates do
 for functions. Odd that it doesn't.  Bug?

It's only for classes AFAIK. Inner structs of structs (LOL) don't have it.

   Seems with the problem on that, I'll do a different approach.

   BitArray(StorageType) {
     StorageType memory;

     //now we do memory.length =, and memory.rawBulk[] for direct access,
 may
     //add opIndex removing the need to know rawBulk at all
   }

   StorageType in this case should contain 'rawBulk', and the storage
 type handles the memory allocation, giving access to members no
 different than an array. I have two storage types i've written that
 hopefully handle all cases:

 //storage types for BitArrays
 struct Dynamic {
    size_t[] rawBulk;
    alias rawBulk this;
 }

 struct Fixed(int maxBits) {
    size_t[maxBits / (size_t.sizeof * 8)] rawBulk;
    int _length;    //fixed sizes are usually going to be rather small

    int length()  property const  safe pure nothrow {
        return _length;
    }
    void length(int setSize)  property const  safe pure nothrow {
        assert(setSize < rawBulk.length);
        _length = setSize;
    }
    void reserve(size_t resSize) {} //fixed
 }

   Now when modifying memory referencing just treat memory as an array,
 access the raw data via rawBulk (likely will remove named access and go
 with opIndex). This removes the need for several workarounds I have
 currently. while trying to handle compact/allocated memory.

Sounds good and seductively simple.

-- 
Dmitry Olshansky

Jan 10 2013

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Thursday, 10 January 2013 at 21:37:22 UTC, Dmitry Olshansky 
wrote:
 It's only for classes AFAIK. Inner structs of structs (LOL) 
 don't have it.

  Why not? I don't see a reason why not personally. Umm maybe I 
do... the outer struct becomes referenced to S2, so...

  struct S {
    int x;
    struct S2 {
      int getX(){return x;}
    }
    S2 s2;
  }

  S.S2 func(){
    S s;
    return s.s2;
  }

  auto x = func();
  writeln(x.getX()); //prints garbage? Or access violation.

  If func had used new on s, then it would have worked... Yeah I 
see the problem now. The only way that would work is if S2 is 
never allowed to be passed outside a function, and on copying the 
new hidden pointer is updated to the new location. That makes 
sense. Maybe something walter will consider.


 Sounds good and seductively simple.

  Yeah I hope so. We'll see, that's the design I'll go for. I'll 
let you know how it goes; Should at least simplify the code and 
remove the need for bitfields.

Jan 10 2013

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

  Well some minor update. From the looks of how it is going to be 
set up, it is almost entirely a template. Consider you have 
BitArray!(Fixed!(1024)), vs BitArray!(Dynamic), in order for one 
to copy to the other template functions are needed. Override one 
of them, and now i have to define it to handle all slice and 
BitArrays as template functions too. not horrible, but has it's 
downsides.

  Also as for ranges/slices; Some need their contents const while 
others don't care. So..

//slice is also a range
struct BitArraySlice(BA, bool isConst){}


  Got a lot of work to do, so I wouldn't expect anything for a 
couple weeks.

Jan 13 2013

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

  Well got a few curious problems. Slicing doesn't seem it wants 
to work as a separate type and can cause problems.

  Let's take an example. Say our slice is..

   struct BitArraySlice {
     BitArray* ba;
     ulong start, end;
   }

  Now how much does it depend on the bitarray that it's pointing 
to? If it is a wrapper then we have a problem with range checks 
which should be legal.

   BitArray x = BitArray(100); //100 elements

   auto sl = x[50 .. 100];
   x.length = 50;
   sl[0] = true; //out of range! BitArray valid from 0-49, not 
50-100

  That much is sorta easy to fix with a separate opIndex (and 
fixed sizes), but it's possible to re-slice the dynamic array to 
make it smaller. So even if we have opIndex allow out of ranges...

   struct BitArray {
     size_t[] store; //storage
     ubyte start, end;
   }

   BitArray x = BitArray(100); //100 elements
   auto sl = x[50 .. 100];

   //likely does x.store[] = x.store[0 .. 2];
   //due to compact 1 byte offsets to determine end of bitarray.
   x.length = 50;

   sl[0] = true; //ok, 0-64 valid on 32bit machines
   sl[32] = true; //out of range! 82 past not within 0-63


  So let's take the slice and give it the address of the storage 
instead, other than it could point to a local variable it will 
work; But now I have basically two definitions of bitarray, one 
that can be a range/slice while the other that manages the memory 
and binary operators.

   struct BitArraySlice {
     //same as BitArray now, what advantage does this give?
     size_t[] store;
     ulong start, end;
   }

  Seems like making the slices a separate entity is going to cause 
more problems (Not that they can't be solved but the advantages 
seem smaller); Plus there's issues trying to get immutable/idup 
working.

  Thoughts? Feedback? I'm about ready to drop this and resume my 
previous version and enhance it with recent experiences.

Jan 16 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

17-Jan-2013 07:53, Era Scarecrow пишет:
   Well got a few curious problems. Slicing doesn't seem it wants to work
 as a separate type and can cause problems.

   Let's take an example. Say our slice is..

    struct BitArraySlice {
      BitArray* ba;
      ulong start, end;
    }

   Now how much does it depend on the bitarray that it's pointing to? If
 it is a wrapper then we have a problem with range checks which should be
 legal.

    BitArray x = BitArray(100); //100 elements

    auto sl = x[50 .. 100];
    x.length = 50;

Well, the slice was invalidated by shrinking an array. I don't expect 
the below to work.

    sl[0] = true; //out of range! BitArray valid from 0-49, not 50-100

The good part is that error can be detected easily. In c++ for instance 
it typically isn't.

The harder problem is what to do when the original BitArray goes out of 
scope. I guess in case of storage allocated on heap it should work but 
if on the stack it shouldn't (and can't).

   That much is sorta easy to fix with a separate opIndex (and fixed
 sizes), but it's possible to re-slice the dynamic array to make it
 smaller. So even if we have opIndex allow out of ranges...

    struct BitArray {
      size_t[] store; //storage
      ubyte start, end;
    }

    BitArray x = BitArray(100); //100 elements
    auto sl = x[50 .. 100];

    //likely does x.store[] = x.store[0 .. 2];
    //due to compact 1 byte offsets to determine end of bitarray.
    x.length = 50;

    sl[0] = true; //ok, 0-64 valid on 32bit machines
    sl[32] = true; //out of range! 82 past not within 0-63

That's why it's far better to allow slices to be invalidated depending 
on the length parameter of BitArray pointed by. Compact array do weird 
things in the previous version too.

   So let's take the slice and give it the address of the storage
 instead, other than it could point to a local variable it will work; But
 now I have basically two definitions of bitarray, one that can be a
 range/slice while the other that manages the memory and binary operators.

    struct BitArraySlice {
      //same as BitArray now, what advantage does this give?
      size_t[] store;
      ulong start, end;
    }

   Seems like making the slices a separate entity is going to cause more
 problems (Not that they can't be solved but the advantages seem
 smaller); Plus there's issues trying to get immutable/idup working.

Slices are ranges and thus are converted to array via std.array.array, 
nothing to invent or re-implement.

   Thoughts? Feedback? I'm about ready to drop this and resume my
 previous version and enhance it with recent experiences.

Feel free to do it how you see it. It's just that I think the semantics 
of the previous version can't be improved to a consistent state.

-- 
Dmitry Olshansky

Jan 17 2013

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Thursday, 17 January 2013 at 18:12:28 UTC, Dmitry Olshansky 
wrote:
 Well, the slice was invalidated by shrinking an array. I don't 
 expect the below to work.

  It just seems like shrinking/expanding the array should still 
work, perhaps I'm thinking of them too much like a normal array 
slice. If this is acceptable behavior then okay, but the 
increased complexity may not be worth it.

 The good part is that error can be detected easily. In c++ for 
 instance it typically isn't.


 The harder problem is what to do when the original BitArray 
 goes out of scope. I guess in case of storage allocated on heap 
 it should work but if on the stack it shouldn't (and can't).

  Passing back the BitArray in either form is valid, as dynamic 
stays valid and a full binary copy stays valid. The slices on the 
other hand won't be valid unless it's on the heap.


   sl[0] = true; //ok, 0-64 valid on 32bit machines
   sl[32] = true; //out of range! 82 not within 0-63

 That's why it's far better to allow slices to be invalidated 
 depending on the length parameter of BitArray pointed by. 
 Compact array do weird things in the previous version too.

  Hmmm...

 So let's take the slice and give it the address of the storage 
 instead, other than it could point to a local variable it will 
 work; But now I have basically two definitions of bitarray, 
 one that can be a range/slice while the other that manages the 
 memory and binary operators.

 Seems like making the slices a separate entity is going to 
 cause more problems (Not that they can't be solved but the 
 advantages seem smaller); Plus there's issues trying to get 
 immutable/idup working.

 Slices are ranges and thus are converted to array via 
 std.array.array, nothing to invent or re-implement.

  Thoughts? Feedback? I'm about ready to drop this and resume 
 my previous version and enhance it with recent experiences.

 Feel free to do it how you see it. It's just that I think the 
 semantics of the previous version can't be improved to a 
 consistent state.

  Sure they can, drop the compact part of it and leave it fully 
dynamic (or vise-verse). But that would make some people unhappy; 
On the other hand we could go hybrid where I take the new storage 
implementations and allow template versions based on needs, 
remove the compact related stuff, but then it's almost back to 
the separate slice design. Hmmm...

  Should the slices be 'use at your own risk' when you tamper with 
the original BitArray? How else should it be?

Jan 17 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

17-Jan-2013 22:34, Era Scarecrow пишет:
  Thoughts? Feedback? I'm about ready to drop this and resume my
 previous version and enhance it with recent experiences.

 Feel free to do it how you see it. It's just that I think the
 semantics of the previous version can't be improved to a consistent
 state.

   Sure they can, drop the compact part of it and leave it fully dynamic
 (or vise-verse). But that would make some people unhappy; On the other
 hand we could go hybrid where I take the new storage implementations and
 allow template versions based on needs, remove the compact related
 stuff, but then it's almost back to the separate slice design. Hmmm...

   Should the slices be 'use at your own risk' when you tamper with the
 original BitArray? How else should it be?

I'm thinking that a opSlice of stack-allocated must be  system and a 
heap allocated can be  safe.

Possibly there are better ways to handle this.

-- 
Dmitry Olshansky

Jan 17 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Thu, Jan 17, 2013 at 11:36:33PM +0400, Dmitry Olshansky wrote:
[...]
 I'm thinking that a opSlice of stack-allocated must be  system and a
 heap allocated can be  safe.

[...]

Related: http://d.puremagic.com/issues/show_bug.cgi?id=8838


T

-- 
Who told you to swim in Crocodile Lake without life insurance??

Jan 17 2013

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Thursday, 17 January 2013 at 19:36:34 UTC, Dmitry Olshansky 
wrote:
 I'm thinking that a opSlice of stack-allocated must be  system 
 and a heap allocated can be  safe.

  That's just a small part of the problem. With the new design, 
90% of it can be safe; Just the actual slice buffer when you 
request it (with Fixed storage) would be  system, and slices 
(having a pointer). But  safe code can't call  system code which 
means that the current design (convert to slices before 
operations) it all has to all be  system.

  Guess once it's debugged the entire thing can be  trusted and 
only certain things can be marked  system instead.

Jan 17 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

18-Jan-2013 01:49, Era Scarecrow пишет:
 On Thursday, 17 January 2013 at 19:36:34 UTC, Dmitry Olshansky wrote:
 I'm thinking that a opSlice of stack-allocated must be  system and a
 heap allocated can be  safe.

   That's just a small part of the problem. With the new design, 90% of
 it can be safe; Just the actual slice buffer when you request it (with
 Fixed storage) would be  system,and slices (having a pointer)

Slices should have a pointer and length so they can check range (in 
debug builds), alternatively just a pointer to storage that nows the range.

  But
  safe code can't call  system code which means that the current design
 (convert to slices before operations) it all has to all be  system.

I think slice ops could be made  safe if the primitives for bulk 
transfer and opIndex in the Storage are  trusted.

If we just mark all of opSlice on Fixed-sized one as  system then it 
should be good enough.In other words (I'm lost in terminology):

FixedBitArray!(1024) fixed;
auto slice = fixed[0..100]; //this is  system
slice[10..20] = slice[20..30]; //this is  safe
foreach(bool x; slice[0..10])  //this is  safe
{... }

The idea is to make code responsible for the "root" of unsafety to be 
 system everything else then can be  safe or  trusted(where compiler 
can't see it).

   Guess once it's debugged the entire thing can be  trusted and only
 certain things can be marked  system instead.

I think that the most of slice ops can be  safe/ trusted. It's the 
process of taking a slice out of stack-based container that is not 
 safe. The one taking a slice out of GC-based one is  safe.

-- 
Dmitry Olshansky

Jan 18 2013

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Friday, 18 January 2013 at 16:42:04 UTC, Dmitry Olshansky 
wrote:
 Slices should have a pointer and length so they can check range 
 (in debug builds), alternatively just a pointer to storage that 
 nows the range.

  So write an alternate opIndex...

 I think slice ops could be made  safe if the primitives for 
 bulk transfer and opIndex in the Storage are  trusted.

 If we just mark all of opSlice on Fixed-sized one as  system 
 then it should be good enough. In other words (I'm lost in 
 terminology):

<snip>

 The idea is to make code responsible for the "root" of unsafety 
 to be  system everything else then can be  safe or 
  trusted(where compiler can't see it).

  Hmmm okay... Somehow this seemed so much more complex than it's 
put here.

 I think that the most of slice ops can be  safe/ trusted. It's 
 the process of taking a slice out of stack-based container that 
 is not  safe. The one taking a slice out of GC-based one is 
  safe.

  And depending on if it is fixed or dynamic will determine if the 
slice is  system or  safe... Makes sense. That sets me straight.


  So for the slice/range, what all should it support? I've tried 
to have it support (or foward) opBinary, opBinaryAssign, 
comparing, slicing, copying, as well as inputRange, 
bidirectional, opIndex & opIndex assign, opDollar and length. If 
it should support a much smaller sub-set I'll need to know; Could 
greatly affect how it is written.

Jan 18 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

19-Jan-2013 11:14, Era Scarecrow пишет:
 On Friday, 18 January 2013 at 16:42:04 UTC, Dmitry Olshansky wrote:
 Slices should have a pointer and length so they can check range (in
 debug builds), alternatively just a pointer to storage that nows the
 range.

   So write an alternate opIndex...

 I think slice ops could be made  safe if the primitives for bulk
 transfer and opIndex in the Storage are  trusted.

 If we just mark all of opSlice on Fixed-sized one as  system then it
 should be good enough. In other words (I'm lost in terminology):

 <snip>

 The idea is to make code responsible for the "root" of unsafety to be
  system everything else then can be  safe or  trusted(where compiler
 can't see it).

   Hmmm okay... Somehow this seemed so much more complex than it's put here.

 I think that the most of slice ops can be  safe/ trusted. It's the
 process of taking a slice out of stack-based container that is not
  safe. The one taking a slice out of GC-based one is  safe.

   And depending on if it is fixed or dynamic will determine if the slice
 is  system or  safe... Makes sense. That sets me straight.


   So for the slice/range, what all should it support? I've tried to have
 it support (or foward) opBinary, opBinaryAssign, comparing, slicing,
 copying, as well as inputRange, bidirectional, opIndex & opIndex assign,
 opDollar and length. If it should support a much smaller sub-set I'll
 need to know; Could greatly affect how it is written.


opBinaryAssign --> opOpAssign. opSlice/opSlice assign.
In any case it seems to me that (I think you already did this trick 
before) you can reuse a lot of code by making an operator string a 
template parameter (for functions that implement  =, |=, &=, ^=, and 
therefore ^, |, &) and use string mixins.

Other then this - don't forget the '~' _unary_ operator as it makes a 
lot of sense for bit-vectors. Now since slices don't support '~' 
concatenation it won't look so bad. Worst case:

auto ba = BitArray(...);
auopt slice = ba[...];
ba ~= ~slice;

And it's not much of problem.
-- 
Dmitry Olshansky

Jan 19 2013

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Saturday, 19 January 2013 at 09:07:03 UTC, Dmitry Olshansky 
wrote:
 opBinaryAssign --> opOpAssign. opSlice/opSlice assign. In any 
 case it seems to me that (I think you already did this trick 
 before) you can reuse a lot of code by making an operator 
 string a template parameter (for functions that implement  =, 
 |=, &=, ^=, and therefore ^, |, &) and use string mixins.

  Yes I did when I could; The slice versions can only forwards 
them to the bitarray one.

 Other then this - don't forget the '~' _unary_ operator as it 
 makes a lot of sense for bit-vectors. Now since slices don't 
 support '~' concatenation it won't look so bad. Worst case:

 auto ba = BitArray(...);
 auopt slice = ba[...];
 ba ~= ~slice;

  Only works if you create a new version and append it; And all 
opBinary operations make duplicates due to their nature. So the 
above couldn't work if slice is a separate type, and instead:

   auto slice = ba[...];   //slice type of BitArraySlice
   auto bc ~= ~slice;      //ba/bc type of BitArray

   ba = (ba ~ (~slice))[]; //type of BitArraySlice


 And it's not much of problem.

  Hmmm.. Well as mentioned I don't think a separate slice/range 
type is the best design or as practical; I have it half working 
but one problem I'm really having is trying to make sense of the 
errors after I converted to use a pointer. It's confusing since 
it was compiling before and is an annoyance more-so now; And 
suddenly can't deduce the type when it should be able to, and if 
I explicitly call it with the type it doesn't help any.


[quote]
Error: template 
bitmanip.BitArray!(Fixed!(1024)).BitArray.opEquals(alias 
SL)(const SL rhs) if (hasMember!(SL, "isBitArray") || 
hasMember!(SL, "isBitArraySlice")) forward reference to template 
opEquals(alias SL)(const SL rhs) if (hasMember!(SL, "isBitArray") 
|| hasMember!(SL, "isBitArraySlice"))
Error: template 
bitmanip.BitArray!(Fixed!(1024)).BitArray.opEquals cannot deduce 
template function from argument types 
!()(BitArraySlice!(BitArray!(Fixed!(1024)), 
true),const(ulong),const(ulong))

Error: cannot implicitly convert expression 
((*this.slice).toString(this.start, this.end)) of type string to 
string
[/quote]

[code]
   //isBitArray and isBitArraySlice are enums in the structs
   //almost all functions templated to handle fixed/dynamic
   //types as compatible. Plus constness for range(s).

   //auto ref would be nice/preferred in a lot in these functions..
   //Otherwise copying is required unless i do more 
duplication/code forwarding

   //if opSlice is  system, then almost all these are  system, or 
 trusted

   //slice forwarding
   bool opEquals(SL)(const SL rhs) const
   if (hasMember!(SL, "isBitArray") || hasMember!(SL, 
"isBitArraySlice"))
   {
     //opSlice always returns BitArraySlice
     return opEquals(rhs[], 0, length);
   }

   bool opEquals(SL)(const SL rhs, ulong sliceStart, ulong 
sliceEnd) const
   if (hasMember!(SL, "isBitArraySlice"))
   {
     if ((sliceEnd-sliceStart) != rhs.length)
       return false;

       return opCmp(rhs, sliceStart, sliceEnd) == 0;
   }

   //slice forwarding
   int opCmp(SL)(const SL rhs) const
   if (hasMember!(SL, "isBitArray"))
   {
     //opSlice always returns BitArraySlice
     return opCmp(rhs[], 0, length);
   }

   int opCmp(SL)(const SL rhs, ulong sliceStart, ulong sliceEnd) 
const
   if (hasMember!(SL, "isBitArraySlice"));
[/code]



  A small consideration was coming to mind in cases where storage 
management if you only needed to prepend a few bits or append 
just after for a few bits but don't need to do a full 
reallocation. In that case perhaps a before/after allocated 
memory would serve nice.

   size_t beforeArray, afterArray;
   int beforeUsed, afterUsed; //course would likely be bitfield or 
ubyte

  Now if we use that it comes to mind that although it wouldn't be 
perfect storage for small compact arrays (as was the original 
complaint(s)), you could however rely on them just as much as the 
main array and the first size_t bits (*2 if you prepend/append to 
the array a lot. It would also be useful to append past the end 
of a sliced area without tampering with data outside it's access 
area until it's forced to resize.


  Fixed BitArraySizes seem far more wanted where you don't want to 
do extra memory allocation, so using reference counting wouldn't 
work, but with dynamic arrays you could. So if you did all the 
slices to a particular bit array could always point to the proper 
memory.

  If you did do that, prepending to an array would be a nightmare 
unless you could offset it with a virtual offset to represent the 
actual beginning. So...

   //start in the middle
   struct STORE {
     //virtualOffset instead?
     ulong startOffset = ulong.max / 2;
     size_t[] store;
   }

   //in slice/Bitarray, hope that's right
   RefCounted!(STORE) store;
   ulong start, end;

  With these three you can safely do dynamic slices while all 
slices point to the same data; in cases where there's only one 
reference store, start, end & startOffset can freely be changed.

  Assume you slice. So..

   BitArray ba = BitArray(100);
   auto sl = ba[10 .. 20];

   //assume startOffset says say 1024, start, 1024 & end = 1124.
   //slice holds start 1034, end 1044.

  Now if we have it prepend a whole lot.

   foreach(i; 0 .. 100)
     ba = true ~ ba;

  It had to be resized, so if it gave it say 512 more bits then...
   //now startOffset = 512.

   slice[0] = true; //start-startOffset = actual bit offset in 
store[]
   assert(ba[110]); //was set by the slice

  Full reference could be preserved, although if you prepend to a 
slice vs the main array then either you overlap data or need some 
temporary storage until duping/reallocating is required, or just 
dup/allocate new memory. Or have
a separate end/before arrays which won't force reallocation but 
could make for interesting overlapping as they wouldn't all 
always point to the same thing.


  The interesting problems that crop up again.


  Another thought, is what if we give an ID to fixed arrays so 
they know if they are pointing to proper data or junk; This would 
likely mostly be for the known fixed array slices. So..

   size_t ID; //in both slice and fixed array.
   size_t[1024] fixedSize;

  Now add a variant to ensure the ID's match on each and every 
access. If the ID gets corrupted the data is corrupted too, 
although not a guarantee it's pretty darn close if the ID is 
random.


  But as you mentioned making it  system then we could have it 
silently convert the BitArray from fixed to dynamic; And make the 
slice  system where normally it's  safe (although immediately 
duping the memory location after it's using the dynamic form 
would be safe).


  Reminds me... Should initialization/blit dup or 
reference/preserve the memory? I'd think it would point to the 
old memory unless explicitly told to (or going from const to 
non-const versions).

Jan 19 2013

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

(Moved from: What have I missed?)

  If this is the wrong place to ask these questions I apologize, 
getting back into this takes some work.


  So I guess I need to ask: Should I try to resume work on the 
BitManip library? (So far it seems none of my code has been 
integrated to phobos)

  Assuming I do, should I try to fix lots of small bugs and make 
smaller pulls or should I try to do several at once? When I 
re-wrote the BitArray I personally think it is an overall 
improvement in many ways, and being a complete re-write you can't 
just do bug #xxxx and then bug #xxxx and then bug... etc etc.

  Also to ask, how many people tried out the rewrite I proposed, 
and do they think it was actually an improvement for ease of use, 
speed, fewer bugs/issues, etc?

Aug 06 2014

"H. S. Teoh via Digitalmars-d-learn" <digitalmars-d-learn puremagic.com> writes:

On Thu, Aug 07, 2014 at 01:10:06AM +0000, Era Scarecrow via Digitalmars-d-learn
wrote:
 (Moved from: What have I missed?)
 
  If this is the wrong place to ask these questions I apologize,
  getting back into this takes some work.

Since this is about contributing to Phobos, probably a better place to
ask is on the main D forum.


  So I guess I need to ask: Should I try to resume work on the BitManip
  library? (So far it seems none of my code has been integrated to
  phobos)

Do you have a pull request? Which one is it?


  Assuming I do, should I try to fix lots of small bugs and make
  smaller pulls or should I try to do several at once? When I re-wrote
  the BitArray I personally think it is an overall improvement in many
  ways, and being a complete re-write you can't just do bug #xxxx and
  then bug #xxxx and then bug... etc etc.
 
  Also to ask, how many people tried out the rewrite I proposed, and do
  they think it was actually an improvement for ease of use, speed,
  fewer bugs/issues, etc?

There have been a few fixes to std.bitmanip recently, including a
current PR for adding attributes to some of the functions. Are these
your PRs, or are you proposing something totally new? If it's something
totally new, where can we get the code?


T

-- 
"Maybe" is a strange word.  When mom or dad says it it means "yes", but when my
big brothers say it it means "no"! -- PJ jr.

Aug 06 2014

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Thursday, 7 August 2014 at 01:51:46 UTC, H. S. Teoh via 
Digitalmars-d-learn wrote:
 Since this is about contributing to Phobos, probably a better 
 place to ask is on the main D forum.

  Yeah posted in my 'what have i missed?' as well...

 Do you have a pull request? Which one is it?

https://github.com/rtcvb32/phobos/pull/1

 From the looks of things it's 4 commits that are merged into a 
single request i think...


 There have been a few fixes to std.bitmanip recently, including 
 a current PR for adding attributes to some of the functions. 
 Are these your PRs, or are you proposing something totally new? 
 If it's something totally new, where can we get the code?

  Glancing at the current code, none of my stuff got in. I do a 
large number of fixes, so likely a lot of those changes would get 
thrown away or integrated...

  Knock yourself out... the pull/1 above holds most of the commits 
and diff/changes to note of.

  https://github.com/rtcvb32/phobos/blob/master/std/bitmanip.d

Aug 06 2014

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Thursday, 7 August 2014 at 02:04:13 UTC, Era Scarecrow wrote:
 On Thursday, 7 August 2014 at 01:51:46 UTC, H. S. Teoh via 
 Digitalmars-d-learn wrote:
 Do you have a pull request? Which one is it?

 https://github.com/rtcvb32/phobos/pull/1

 From the looks of things it's 4 commits that are merged into a 
 single request i think...

  Looks like that pull is just the bitfields... but the actual 
bitmanip one i listed has the whole source.

Aug 06 2014

"H. S. Teoh via Digitalmars-d-learn" <digitalmars-d-learn puremagic.com> writes:

On Thu, Aug 07, 2014 at 02:04:12AM +0000, Era Scarecrow via Digitalmars-d-learn
wrote:
 On Thursday, 7 August 2014 at 01:51:46 UTC, H. S. Teoh via
 Digitalmars-d-learn wrote:
Since this is about contributing to Phobos, probably a better place to ask
is on the main D forum.

 
  Yeah posted in my 'what have i missed?' as well...
 
Do you have a pull request? Which one is it?

 
 https://github.com/rtcvb32/phobos/pull/1
 
 From the looks of things it's 4 commits that are merged into a single
 request i think...

Hold on a sec... that's a pull for your own fork of Phobos. You need to
submit a pull to the main Phobos repo in order to get it reviewed and
merged. :-)


There have been a few fixes to std.bitmanip recently, including a
current PR for adding attributes to some of the functions. Are these
your PRs, or are you proposing something totally new? If it's
something totally new, where can we get the code?

 
 Glancing at the current code, none of my stuff got in. I do a large
 number of fixes, so likely a lot of those changes would get thrown
 away or integrated...

[...]

Well, no wonder, your pull was submitted against your local fork, not to
the main Phobos repo, so nobody knew about it (except Dmitry,
apparently). It would help to submit pulls against the main Phobos repo.
:-)

Also, looks like you made major changes to the code... if it's a
complete rewrite, I'd say submit it as a single pull. If it's a
collection of many fixes, it's probably better to submit separate pulls
so that the easy fixes will get in first while we work out the wrinkles
on the more complicated fixes.


T

-- 
Любишь кататься - люби и саночки возить.

Aug 06 2014

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Thursday, 7 August 2014 at 02:12:20 UTC, H. S. Teoh via 
Digitalmars-d-learn wrote:
 Hold on a sec... that's a pull for your own fork of Phobos. You 
 need to submit a pull to the main Phobos repo in order to get 
 it reviewed and merged. :-)


 Well, no wonder, your pull was submitted against your local 
 fork, not to the main Phobos repo, so nobody knew about it 
 (except Dmitry, apparently). It would help to submit pulls 
 against the main  Phobos repo.
 :-)

 Also, looks like you made major changes to the code... if it's 
 a complete rewrite, I'd say submit it as a single pull. If it's 
 a collection of many fixes, it's probably better to submit 
 separate pulls so that the easy fixes will get in first while 
 we work out the  wrinkles on the more complicated fixes.

  I did submit it against the original Phobos, and the auto tester 
picked it up. For something like 3-4 weeks it tried and kept 
failing over and over again because it couldn't merge it.

  The reason? It was something like 13 unresolved newlines that 
didn't match up... or something that I have no idea how I could 
fix because whitespace is invisible.

  Later Walter or Andrei (I forget who) complained when they 
looked at it and wanted me to break it apart into smaller more 
indivdual bug fixes (since it's hard to tell what i changed as a 
single blob/replacement) but as a complete re-write and I don't 
know if I COULD do that...

  Trust me. I tried. It failed. Eventually I just pulled the 
request and gave up at that time... Right about the time of the 
Dconf 2013 I got burned out and got depressed (Probably major 
changes in meds).

Aug 06 2014

D Programming

C/C++ Programming

Other

digitalmars.D.learn - BitArray/BitFields - Review