www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - BitArray/BitFields - Review

reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
BitFields:
https://github.com/rtcvb32/phobos/commit/729c96e2f20ddd0c05a1afb488030b5c4db3e012

new BitArray Overhead functions/templates:
https://github.com/rtcvb32/phobos/commit/55aaf82a39f3901f03f5f1ac26fd175c5b2cd167

BitArray:
https://github.com/rtcvb32/phobos/commit/568c0d1966208afe70cedcaddc3bb880dc3d481d

  I'd like opinions of the code I've submitted, and certain 
shortcomings that are present that appropriate changes would be 
needed to fix. Only one of these has been a problem during 
debugging and that's due to references. (All three patches must 
be present to work)

BitArray:
1) Slices use references (sometimes) to previous bitarray. Since 
it's a slice (like an array) this could be expected.
Solutions:
  a) Drop the union, make all functions  safe (compared to 
 trusted), and they are all ref/slices
  b) Leave as sometimes ref, sometimes not; Use .dup when you NEED 
to ensure different unique copies.
  c) Apply COW (Copy-On-Write) where a known slice type (canExpand 
is false) when a change would apply, it replaces the present 
slice into it's own unique array.
  d) All slices become new dup'ed allocated arrays (Most expensive 
of them all)

2) opCast disabled: Anyone using opCast their code will break, 
however depending on it's usage that may not be an issue. If 
anyone can recommend alternative opCast(s) that won't prevent 
normal use of casting, then they will be reimplemented.

3) dim - An artifact of the previous BitArray, and I think needs 
to be removed as it has limited use.

4) Certain operators (opCat, opCat_r) I can't replace with 
opBinary/opBinary_r.

5) Examples of additional use cases/construction that it 
currently fails and should work.

6) Added rawWrite (for primitives), usefulness testing and 
opinions.

7) opApply, toHash, opCmp, opEqual tested in various environments 
(Like associative arrays) for problems, along with any possible 
const issues.

BitFields:
  I've concentrated on adding default values. A set of functions 
previously present string functions 'myXXX' were present, why 
were they present to begin with? Was it a limitation of what was 
available at the time? Or was it to make compiling faster so it 
didn't rely on other library functions? Or were the string 
functions not available at the time? They might be removed once 
this is truly answered.

//currently should work
struct A
{
   mixin(bitfields!(
     uint, "x",    2,
     int,  "y",    3,
     uint, "z=1",  2,
     bool, "flag=true", 1));
}
A obj;
assert(obj.x == 0);
assert(obj.z == 1);    //default
assert(obj.flag == 1); //true/false should also work
Jul 28 2012
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, July 28, 2012 22:41:48 Era Scarecrow wrote:
 2) opCast disabled: Anyone using opCast their code will break,
 however depending on it's usage that may not be an issue. If
 anyone can recommend alternative opCast(s) that won't prevent
 normal use of casting, then they will be reimplemented.

I'm not particularly familar with BitArray, so I don't know what its opCast would convert to, but opCast will only convert to the types that you let it, so if you need to constrain what types it works with, then use template constraints to do that. If, on the other hand, opCast just doesn't make any sense at all, then just don't have it. - Jonathan M Davis
Jul 28 2012
prev sibling next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 29-Jul-12 00:41, Era Scarecrow wrote:
 BitFields:
 https://github.com/rtcvb32/phobos/commit/729c96e2f20ddd0c05a1afb488030b5c4db3e012


 new BitArray Overhead functions/templates:
 https://github.com/rtcvb32/phobos/commit/55aaf82a39f3901f03f5f1ac26fd175c5b2cd167


 BitArray:
 https://github.com/rtcvb32/phobos/commit/568c0d1966208afe70cedcaddc3bb880dc3d481d


   I'd like opinions of the code I've submitted, and certain shortcomings
 that are present that appropriate changes would be needed to fix. Only
 one of these has been a problem during debugging and that's due to
 references. (All three patches must be present to work)

Great! But I strongly suggest to repost it in d.D newsgroup as it has wider audience and is more appropriate for reviews.
 BitArray:
 1) Slices use references (sometimes) to previous bitarray. Since it's a
 slice (like an array) this could be expected.
 Solutions:
   a) Drop the union, make all functions  safe (compared to  trusted),
 and they are all ref/slices
   b) Leave as sometimes ref, sometimes not; Use .dup when you NEED to
 ensure different unique copies.
   c) Apply COW (Copy-On-Write) where a known slice type (canExpand is
 false) when a change would apply, it replaces the present slice into
 it's own unique array.
   d) All slices become new dup'ed allocated arrays (Most expensive of
 them all)

My solution is make slices different type e.g. BitSlice. It's always reference to the original BitArray. Array itself still can be either compact or not. All operations with slice go to array (and still check if they can use word-aligned speed up). How does it sound?
 2) opCast disabled: Anyone using opCast their code will break, however
 depending on it's usage that may not be an issue. If anyone can
 recommend alternative opCast(s) that won't prevent normal use of
 casting, then they will be reimplemented.

 3) dim - An artifact of the previous BitArray, and I think needs to be
 removed as it has limited use.

 4) Certain operators (opCat, opCat_r) I can't replace with
 opBinary/opBinary_r.

opCat isn't it just operator "~" and "~=" ? You can use opOpAssign for "~=" IRC.
 5) Examples of additional use cases/construction that it currently fails
 and should work.

 6) Added rawWrite (for primitives), usefulness testing and opinions.

 7) opApply, toHash, opCmp, opEqual tested in various environments (Like
 associative arrays) for problems, along with any possible const issues.

 BitFields:
   I've concentrated on adding default values. A set of functions
 previously present string functions 'myXXX' were present, why were they
 present to begin with?

I suspect to!string wasn't CTFEable. Regardless you can catch a word from Andrei Alexandrescu on the main D newsgroup who (I think) came up with it in the first place. -- Dmitry Olshansky
Jul 28 2012
next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, July 29, 2012 00:58:59 Dmitry Olshansky wrote:
 My solution is make slices different type e.g. BitSlice.
 It's always reference to the original BitArray. Array itself still can
 be either compact or not. All operations with slice go to array (and
 still check if they can use word-aligned speed up).
 
 How does it sound?

I would point out that while hasSlicing doesn't currently check for it, if opSlice doesn't return a type which is assignable to the original range, it won't work in a lot of Phobos functions. I keep meaning to bring that up for discussion in the main newsgroup. I'd argue that hasSlicing really be changed from enum bool hasSlicing = !isNarrowString!R && is(typeof( (inout int _dummy=0) { R r = void; auto s = r[1 .. 2]; static assert(isInputRange!(typeof(s))); })); to something like enum bool hasSlicing = !isNarrowString!R && is(typeof( (inout int _dummy=0) { R r = void; auto s = r[1 .. 2]; static assert(isInputRange!(typeof(s))); r = s; })); - Jonathan M Davis
Jul 28 2012
next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 29-Jul-12 01:07, Jonathan M Davis wrote:
 On Sunday, July 29, 2012 00:58:59 Dmitry Olshansky wrote:
 My solution is make slices different type e.g. BitSlice.
 It's always reference to the original BitArray. Array itself still can
 be either compact or not. All operations with slice go to array (and
 still check if they can use word-aligned speed up).

 How does it sound?

I would point out that while hasSlicing doesn't currently check for it, if opSlice doesn't return a type which is assignable to the original range, it won't work in a lot of Phobos functions. I keep meaning to bring that up for discussion in the main newsgroup.

BitArray isn't and shouldn't be a range. It's a container. So BitSlice is a range over BitArray. What am I missing? -- Dmitry Olshansky
Jul 28 2012
parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, July 29, 2012 01:19:49 Dmitry Olshansky wrote:
 On 29-Jul-12 01:07, Jonathan M Davis wrote:
 On Sunday, July 29, 2012 00:58:59 Dmitry Olshansky wrote:
 My solution is make slices different type e.g. BitSlice.
 It's always reference to the original BitArray. Array itself still can
 be either compact or not. All operations with slice go to array (and
 still check if they can use word-aligned speed up).
 
 How does it sound?

I would point out that while hasSlicing doesn't currently check for it, if opSlice doesn't return a type which is assignable to the original range, it won't work in a lot of Phobos functions. I keep meaning to bring that up for discussion in the main newsgroup.

BitArray isn't and shouldn't be a range. It's a container. So BitSlice is a range over BitArray. What am I missing?

Probably nothing. I'm not all that familiar with BitArray. If it's not a range, it's not a problem. I assumed that it was and was just pointing out a potential problem based on that assumption. - Jonathan M Davis
Jul 28 2012
prev sibling next sibling parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Saturday, 28 July 2012 at 21:07:31 UTC, Jonathan M Davis wrote:
 I would point out that while hasSlicing doesn't currently check 
 for it, if opSlice doesn't return a type which is assignable to 
 the original range, it won't work in a lot of Phobos functions. 
 I keep meaning to bring that up for discussion in the main 
 newsgroup. I'd argue that hasSlicing really be changed

Which is one of the main reasons I am hesitant to have a separate range struct (of BitSlice)... Hmmm maybe we should keep it here for a little while, and talk it over a bit more.
Jul 28 2012
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, July 28, 2012 23:23:30 Era Scarecrow wrote:
 On Saturday, 28 July 2012 at 21:07:31 UTC, Jonathan M Davis wrote:
 I would point out that while hasSlicing doesn't currently check
 for it, if opSlice doesn't return a type which is assignable to
 the original range, it won't work in a lot of Phobos functions.
 I keep meaning to bring that up for discussion in the main
 newsgroup. I'd argue that hasSlicing really be changed

Which is one of the main reasons I am hesitant to have a separate range struct (of BitSlice)... Hmmm maybe we should keep it here for a little while, and talk it over a bit more.

Well, it seems that my comment was somewhat misplaced in that BitArray isn't a range but a container. My comment was directed at opSlice on ranges. Container types should return whatever range type is appropriate. That will usually be a struct of some variety. I'd have to study BitArray and your changes to determine what the appropriate type was here, but it sounds like you were talking about duping the contents with opSlice, which would be highly irregular. Normally, ranges over containers are light wrappers which can iterate through and potentially alter the elements in the container, and duping up an array as the range type doesn't work like that. But depending on what BitArray is doing exactly, it might be appropriate. I don't know. - Jonathan M Davis
Jul 28 2012
parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Saturday, 28 July 2012 at 21:30:21 UTC, Jonathan M Davis wrote:

 Well, it seems that my comment was somewhat misplaced in that 
 BitArray isn't a range but a container. My comment was directed 
 at opSlice on ranges. Container types should return whatever 
 range type is appropriate. That will usually be a struct of 
 some variety. I'd have to study BitArray and your changes to 
 determine what the appropriate type was here, but it sounds 
 like you were talking about duping the contents with opSlice, 
 which would be highly irregular. Normally, ranges over 
 containers are light wrappers which can iterate through and 
 potentially alter the elements in the container, and duping up 
 an array as the range type doesn't work like that. But 
 depending on what BitArray is doing exactly, it might be 
 appropriate. I don't know.

An array suggests you can control the size of the array, however if you are forced to work in 32increments (or more if 64bit), that makes it highly hard to control without making it a slice to begin with. Consider without slices (range control) BitArray ba; //size 32/64bit ba.length = 42; //life, the universe and everything assert(ba.length == 42); //fails, can't set length except by increments of 32/64 ba ~= true; //what's the size now? Or are all appending operators invalid unless it's a slice? foreach(b; ba) { //fails, can't go over range. } auto slice = ba[]; //converts to BitSlice foreach(b; slice) { //works } BitSlice func(BitArray ba); BitSlice func(BitSlice ba); //two versions of function needed? If convertable between the two, then you only lose the begining/ending length offsets. T func(T)(T ba) if (is(T : BitArray)) // correct? Potential loss of information. BitSlice x = ba[1 .. 10]; BitArray y = func(x); //range lost, beginning/ending at max regardless of size Honestly that doesn't add up to what I call an 'array'. It should look like an array, act like an array, and besides not being able to specify the array as separate from the entire BitArray (Ie const(int)) it's otherwise should not be distinguishable from an array.
Jul 28 2012
next sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Saturday, 28 July 2012 at 21:39:56 UTC, Era Scarecrow wrote:
 BitArray (Ie const(int)) it's otherwise should not be

My filter/formatter got me again... Honestly that doesn't add up to what I call an 'array'. It should look like an array, act like an array, and besides not being able to specify the array as separate from the entire BitArray (ie: const(int)[]) it's otherwise should be indistinguishable from an array template-wise.
Jul 28 2012
prev sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 29-Jul-12 01:39, Era Scarecrow wrote:
 On Saturday, 28 July 2012 at 21:30:21 UTC, Jonathan M Davis wrote:

 Well, it seems that my comment was somewhat misplaced in that BitArray
 isn't a range but a container. My comment was directed at opSlice on
 ranges. Container types should return whatever range type is
 appropriate. That will usually be a struct of some variety. I'd have
 to study BitArray and your changes to determine what the appropriate
 type was here, but it sounds like you were talking about duping the
 contents with opSlice, which would be highly irregular. Normally,
 ranges over containers are light wrappers which can iterate through
 and potentially alter the elements in the container, and duping up an
 array as the range type doesn't work like that. But depending on what
 BitArray is doing exactly, it might be appropriate. I don't know.

An array suggests you can control the size of the array, however if you are forced to work in 32increments (or more if 64bit), that makes it highly hard to control without making it a slice to begin with. Consider without slices (range control)

A strange jump to conclusions? Why can't array know it's size in bits and capacity in words?
   BitArray ba; //size 32/64bit

   ba.length = 42; //life, the universe and everything
   assert(ba.length == 42); //fails, can't set length except by
 increments of 32/64

   ba ~= true; //what's the size now? Or are all appending operators
 invalid unless it's a slice?

43 ? Why should it use 32/64 bit increments?
   foreach(b; ba) {
     //fails, can't go over range.
   }

works as foreach takes [] when needed automagically :)
   auto slice = ba[]; //converts to BitSlice

   foreach(b; slice) {
      //works
   }

   BitSlice func(BitArray ba);
   BitSlice func(BitSlice ba); //two versions of function needed? If
 convertable between the two, then you only lose the begining/ending
 length offsets.

Just make it a template if it works with any range.
   T func(T)(T ba) if (is(T : BitArray)) // correct? Potential loss of
 information.

T func(R)(R ba) if(isRandomAccessRange!R && is(Unqual!(ElementType!R) == bool)) Would be far more general. As a bonus it will work with built-in arrays of bool or anything compatible.
   BitSlice x = ba[1 .. 10];
   BitArray y = func(x); //range lost, beginning/ending at max regardless
 of size

Certain functions would need to take a BitArray if they need to change it size, append, prepand etc.
   Honestly that doesn't add up to what I call an 'array'. It should look
 like an array, act like an array, and besides not being able to specify
 the array as separate from the entire BitArray (Ie const(int)) it's
 otherwise should not be distinguishable from an array.

It's damn hard to emulate built-in array. And I'd rather not to see this emulation half-working. And in say C++, std::vector (and most arrays in other languages I know about) is not sliceble at all or slices are different type, or they do copy the whole thing. So I'm not sure what kind of array you are used to. D's built-in arrays are rare beasts with their own issues, even with the fair deal of built-in language/runtime support they have. -- Dmitry Olshansky
Jul 28 2012
parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Saturday, 28 July 2012 at 21:49:45 UTC, Dmitry Olshansky wrote:
 On 29-Jul-12 01:39, Era Scarecrow wrote:
 An array suggests you can control the size of the array, 
 however if you are forced to work in 32increments (or more if 
 64bit), that makes it highly hard to control without making it 
 a slice to begin with. Consider without slices (range control)

A strange jump to conclusions? Why can't array know it's size in bits and capacity in words?

Knowing the size based on if it's compact or the array.length * bits would be easy, but to do slices you drop the beginning/ending reference pointers, locking it at 32/64bits. If you keep that information, why not just include slicing along with the rest of it?
  BitArray ba; //size 32/64bit

  ba.length = 42; //life, the universe and everything
  assert(ba.length == 42); //fails, can't set length except by
 increments of 32/64

  ba ~= true; //what's the size now? Or are all appending 
 operators
 invalid unless it's a slice?

43 ? Why should it use 32/64 bit increments?

See above.
 works as foreach takes [] when needed automagically :)

I thought about that after I submitted it, so my mistake on that.
 Just make it a template if it works with any range.

 T func(R)(R ba)
  if(isRandomAccessRange!R && is(Unqual!(ElementType!R) == bool))

 Would be far more general. As a bonus it will work with 
 built-in arrays of bool or anything compatible.

True, but you'd still need to specifically convert the array to a slice first. So you'd call it func(ba[]), or func(ba.range) or func(ba[0 .. $]). Kinda seems silly to require the extra bit there.
  BitSlice x = ba[1 .. 10];
  BitArray y = func(x); //range lost, beginning/ending at max 
 regardless
 of size

Certain functions would need to take a BitArray if they need to change it size, append, prepand etc.

See first reply.
 It's damn hard to emulate built-in array. And I'd rather not to 
 see this emulation half-working.

 And in say C++, std::vector (and most arrays in other languages 
 I know about) is not sliceble at all or slices are different 
 type, or they do copy the whole thing. So I'm not sure what 
 kind of array you are used to. D's built-in arrays are rare 
 beasts with their own issues, even with the fair deal of 
 built-in language/runtime support they have.

Obviously it can't emulate it perfectly, but doing all the basic operations should be acceptable. I don't know all of what arrays require, but opIndex and length are the largest part to my opinion, which I guess is just randomAccessRange (or whatever it's called)
Jul 28 2012
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 29-Jul-12 02:08, Era Scarecrow wrote:
 On Saturday, 28 July 2012 at 21:49:45 UTC, Dmitry Olshansky wrote:
 On 29-Jul-12 01:39, Era Scarecrow wrote:
 An array suggests you can control the size of the array, however if
 you are forced to work in 32increments (or more if 64bit), that makes
 it highly hard to control without making it a slice to begin with.
 Consider without slices (range control)

A strange jump to conclusions? Why can't array know it's size in bits and capacity in words?

Knowing the size based on if it's compact or the array.length * bits would be easy, but to do slices you drop the beginning/ending reference pointers, locking it at 32/64bits.

Mhm? Not getting you it at all. What's locking it ? What's wrong with: struct BitArray { union { struct { size_t[] data; ulong length; //in bits //length in words would be just // (length + size_t.sizeof*4) / // (size_t.sizeof*8); // as it's a power of 2 it should get optimized } size_t[2+ulong.sizeof/size_t.sizeof] compact; // and say 1 bit is used for isCompact would be great if it could use // the least significant bit of pointer from data[], dunno if it's easy enough } } then Range would be: struct Range{ BitArray* array; ulong start, end; } to mark [start..end] piece of it? If you keep that information, why not
 just include slicing along with the rest of it?

I'm not seeing slicing easily integrated just because length is known. -- Dmitry Olshansky
Jul 28 2012
parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Saturday, 28 July 2012 at 22:19:05 UTC, Dmitry Olshansky wrote:
 Mhm? Not getting you it at all. What's locking it ?

 What's wrong with:
 struct BitArray
 {

 union {
 	struct {
 		size_t[] data;
 		ulong length;	//in bits

 //length in words would be just
 // (length + size_t.sizeof*4) / // (size_t.sizeof*8);
 // as it's a power of 2 it should get optimized

 	}
 	size_t[2+ulong.sizeof/size_t.sizeof] compact;

 // and say 1 bit is used for isCompact would be great if it 
 could use // the least significant bit of pointer from data[], 
 dunno if it's easy enough

I'd rather not rely on the pointer at all as a reference for if it's compact. If the array is changed later to a template that could accept bytes, then where does that leave you? When the slice doesn't need to know more of the beginning, it can re-slice it to a lesser more appropriate size. It may very well soon contain a third entry in the union for main operations and use the larger size_t just for canUseBulk (Actually that sounds better to me). Besides, separating the slice suggests 'length' is not part of the union struct. Sure it's convertible to [0 .. length], but it's no longer convertible back to BitArray (Unless you make a whole new array and conversion/copying which takes a lot of time). If you do that, why not do BitSlice from the beginning which holds BitArray, but then if you do that why make them separate at all? It doesn't make sense to me. I can see making a separate range for the front/popFront/empty if you really wanted to use those, but it wouldn't change how it would work.
 then Range would be:

 struct Range{
 	BitArray* array;
 	ulong start, end;
 }

 to mark [start..end] piece of it?

 If you keep that information, why not just include slicing 
 along with the rest of it?

I'm not seeing slicing easily integrated just because length is known.

Beginning and ending are known, so slicing can be added at no extra cost.
Jul 28 2012
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 29-Jul-12 02:31, Era Scarecrow wrote:
 On Saturday, 28 July 2012 at 22:19:05 UTC, Dmitry Olshansky wrote:
 Mhm? Not getting you it at all. What's locking it ?

 What's wrong with:
 struct BitArray
 {

 union {
     struct {
         size_t[] data;
         ulong length;    //in bits

 //length in words would be just
 // (length + size_t.sizeof*4) / // (size_t.sizeof*8);
 // as it's a power of 2 it should get optimized

     }
     size_t[2+ulong.sizeof/size_t.sizeof] compact;

 // and say 1 bit is used for isCompact would be great if it could use
 // the least significant bit of pointer from data[], dunno if it's
 easy enough

I'd rather not rely on the pointer at all as a reference for if it's compact.

No problem, I redesigned a bit: struct BitArray { union { struct { ulong length; //in bits, the top bit used as isCompact size_t[] data; } struct{ ushort smallLength; //here as well ubyte[size_t.sizeof*2+ulong.sizeof-2] compact; } } Up 2^63 of bits would occupy 2^60 of bytes should be good enough for the years to come. I'm not seeing it as a problem as at very least processors have to actually be able to address more then 42 (or was it 48?) bits that they can now. (see AMD or Intel or ARM datasheets on 64 bit chips) If the array is changed later to a template that could accept
 bytes, then where does that leave you?
 When the slice doesn't need to
 know more of the beginning,

??? Slice can start at some random offset it can re-slice it to a lesser more
 appropriate size.

no idea what's you are talking about here. It may very well soon contain a third entry in the
 union for main operations and use the larger size_t just for canUseBulk
 (Actually that sounds better to me).

Again I'm having hard time to get what follows from where. Can you expand on this point please? Preferably with some code as it was far easier to reason about.
   Besides, separating the slice suggests 'length' is not part of the
 union struct.

How? All containers know their length. Sure it's convertible to [0 .. length], but it's no longer
 convertible back to BitArray (Unless you make a whole new array and
 conversion/copying which takes a lot of time).

Yes it's not convertible. And it's no problem and is intended. Otherwise like you said it doesn't make a lot of sense.
If you do that, why not
 do BitSlice from the beginning which holds BitArray, but then if you do
 that why make them separate at all?
 It doesn't make sense to me. I can
 see making a separate range for the front/popFront/empty if you really
 wanted to use those, but it wouldn't change how it would work.

 then Range would be:

 struct Range{
     BitArray* array;
     ulong start, end;
 }

 to mark [start..end] piece of it?

 If you keep that information, why not just include slicing along with
 the rest of it?

I'm not seeing slicing easily integrated just because length is known.

Beginning and ending are known, so slicing can be added at no extra cost.

Begin takes extra ulong I'm not sure it is needed in the container itself. -- Dmitry Olshansky
Jul 28 2012
parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Saturday, 28 July 2012 at 23:02:17 UTC, Dmitry Olshansky wrote:
 On 29-Jul-12 02:31, Era Scarecrow wrote:
 On Saturday, 28 July 2012 at 22:19:05 UTC, Dmitry Olshansky 
 wrote:
 Mhm? Not getting you it at all. What's locking it ?

 What's wrong with:
 struct BitArray
 {

 union {
    struct {
        size_t[] data;
        ulong length;    //in bits

 //length in words would be just
 // (length + size_t.sizeof*4) / // (size_t.sizeof*8);
 // as it's a power of 2 it should get optimized

    }
    size_t[2+ulong.sizeof/size_t.sizeof] compact;

 // and say 1 bit is used for isCompact would be great if it 
 could use
 // the least significant bit of pointer from data[], dunno if 
 it's easy enough

I'd rather not rely on the pointer at all as a reference for if it's compact.

No problem, I redesigned a bit: struct BitArray { union { struct { ulong length; //in bits, the top bit used as isCompact size_t[] data; } struct{ ushort smallLength; //here as well ubyte[size_t.sizeof*2+ulong.sizeof-2] compact; } } Up 2^63 of bits would occupy 2^60 of bytes should be good enough for the years to come. I'm not seeing it as a problem as at very least processors have to actually be able to address more then 42 (or was it 48?) bits that they can now. (see AMD or Intel or ARM datasheets on 64 bit chips)

2^63, or drop 8 so it takes up no more than 2^55 bytes of memory. Yes that's quite a large chunk. Curiously enough the current structure is already smaller than the proposed unions.
 If the array is changed later to a template that could accept 
 bytes, then where does that leave you? When the slice doesn't 
 need to know more of the beginning,


 ???
 Slice can start at some random offset

Yes they can, but so can a normal array (as they are pretty much the same)
  it can re-slice it to a lesser more appropriate size.

no idea what's you are talking about here.

As for requested compact to hold the starting/ending offsets at 1 byte each, obviously slicing at say 1024 to 2048 of a 10,000 length bit array, that won't work (0-127). If it's larger than size_t in bits, it will subtract that size and internally change the pointer of the array it actually points to, until it can fit that new value. start/end slice: [32 .. 64] equals: size_t array[1 .. 2]
 It may very well soon contain a third entry in the union for 
 main operations and use the larger size_t just for canUseBulk 
 (Actually that sounds better to me).

Again I'm having hard time to get what follows from where. Can you expand on this point please? Preferably with some code as it was far easier to reason about.

A question earlier was brought up about endienness for the patch, that is not part of what it can handle, so: ba[100] = 1; //then save the array assert(ba[100]); //fails on opposite endienness when loaded HOWEVER! if individual bits are handled in a byte array reference, then the above would succeed, even with the larger size_t ones all working with canUseBulk.
 Besides, separating the slice suggests 'length' is not part of 
 the union struct.

How? All containers know their length.

Yes they do, they also know where they start, thereby giving start/ending, which allows slicing. So why separate slicing?
 Sure it's convertible to [0 .. length], but it's no longer 
 convertible back to BitArray (Unless you make a whole new 
 array and conversion/copying which takes a lot of time).

Yes it's not convertible. And it's no problem and is intended. Otherwise like you said it doesn't make a lot of sense.

 If you do that, why not do BitSlice from the beginning which 
 holds BitArray, but then if you do that why make them 
 separate at all? It doesn't make sense to me. I can see 
 making a separate range for the front/popFront/empty if you 
 really wanted to use those, but it wouldn't change how it 
 would work.



 Begin takes extra ulong I'm not sure it is needed in the 
 container itself.

No it doesn't. bitfields are used to handle the beginning/ending offsets, equaling a total of 14 bits. It's the most confusing part (aside from [offset + sliceoffset + i[ formulas). canExpand and isCompact fill the other two bits.
Jul 28 2012
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 29-Jul-12 03:16, Era Scarecrow wrote:
 On Saturday, 28 July 2012 at 23:02:17 UTC, Dmitry Olshansky wrote:
 On 29-Jul-12 02:31, Era Scarecrow wrote:
 On Saturday, 28 July 2012 at 22:19:05 UTC, Dmitry Olshansky wrote:
 Mhm? Not getting you it at all. What's locking it ?

 What's wrong with:
 struct BitArray
 {

 union {
    struct {
        size_t[] data;
        ulong length;    //in bits

 //length in words would be just
 // (length + size_t.sizeof*4) / // (size_t.sizeof*8);
 // as it's a power of 2 it should get optimized

    }
    size_t[2+ulong.sizeof/size_t.sizeof] compact;

 // and say 1 bit is used for isCompact would be great if it could use
 // the least significant bit of pointer from data[], dunno if it's
 easy enough

I'd rather not rely on the pointer at all as a reference for if it's compact.

No problem, I redesigned a bit: struct BitArray { union { struct { ulong length; //in bits, the top bit used as isCompact size_t[] data; } struct{ ushort smallLength; //here as well ubyte[size_t.sizeof*2+ulong.sizeof-2] compact; } } Up 2^63 of bits would occupy 2^60 of bytes should be good enough for the years to come. I'm not seeing it as a problem as at very least processors have to actually be able to address more then 42 (or was it 48?) bits that they can now. (see AMD or Intel or ARM datasheets on 64 bit chips)

2^63, or drop 8 so it takes up no more than 2^55 bytes of memory. Yes that's quite a large chunk. Curiously enough the current structure is already smaller than the proposed unions.

OK. Now back to the biggest question: Slices use references (sometimes) to previous bitarray. Since it's a slice (like an array) this could be expected. The "sometimes" part is not god enough! The problem is that small string optimization (or small array optimization) doesn't work with reference semantics. You may dig up discussion on proposal for small string optimization for char[] and string that was shot down on these grounds. Bulit-in like semantics call for simple ptr/ptr+length struct and you can't get that can you?. E.g. void mess_with(BitArray ba) { ba ~= true; ba ~= false; //does it change anything outside this function? *maybe* ba[0] ^= true; ba[1] ^= true; auto x = ba; //even if used ref BitArray ba, x *maybe* references ba //is x now a reference or a copy? *maybe* } Thus BitArray either becomes value type (that may come as unexpected, see your option c/d). Or it becomes full reference type as in option a (though it sucks). Sorry but b is no go, as it makes code unpredictable I'd rather take my idea with separate ranges then b. Solutions: a) Drop the union, make all functions safe (compared to trusted), and they are all ref/slices b) Leave as sometimes ref, sometimes not; Use .dup when you NEED to ensure different unique copies. c) Apply COW (Copy-On-Write) where a known slice type (canExpand is false) when a change would apply, it replaces the present slice into it's own unique array. d) All slices become new dup'ed allocated arrays (Most expensive of them all) Because of d, I proposed that slices be separate type that is always a reference to original. Then only coping arrays themselves involves dup.
 Begin takes extra ulong I'm not sure it is needed in the container
 itself.

No it doesn't. bitfields are used to handle the beginning/ending offsets, equaling a total of 14 bits. It's the most confusing part (aside from [offset + sliceoffset + i[ formulas). canExpand and isCompact fill the other two bits.

OK, finally I think I understand how your current version works. It's clever but leaves you in half-way value-reference state. To point out few problems with unification of slice type and BitArray: - Every time a small slice is taken say [1..30] it becomes a small array(?) thus it is a value type and doesn't affect original array - No idea if a = b make a copy will make programmers nervous, esp if one is writing a library function as in this case there is no assumption that fits all - Same thing but more, does this work? auto c = ba; //if ba is compact ba ~= true; assert(c.length = ba.length-1); //c should remain with the same length unaffected (at least like built in arrays) -- Dmitry Olshansky
Jul 28 2012
next sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Sunday, 29 July 2012 at 06:27:32 UTC, Dmitry Olshansky wrote:
 OK. Now back to the biggest question:
 Slices use references (sometimes) to previous bitarray. Since 
 it's a slice (like an array) this could be expected.

 The "sometimes" part is not god enough! The problem is that 
 small string optimization (or small array optimization) doesn't 
 work with reference semantics. You may dig up discussion on 
 proposal for small string optimization for char[] and string 
 that was shot down on these grounds.

 Bulit-in like semantics call for simple ptr/ptr+length struct 
 and you can't get that can you?.>
 E.g.

 void mess_with(BitArray ba)
 {
 ba ~= true;
 ba ~= false;
  //does it change anything outside this function? *maybe*

If it was a previous slice, then it never chances outside and appending would automatically allocate new memory.
 ba[0] ^= true;
 ba[1] ^= true;
 auto x = ba; //even if used ref BitArray ba, x *maybe* 
 references ba
 //is x now a reference or a copy? *maybe*

 Thus BitArray either becomes value type (that may come as 
 unexpected, see your option c/d). Or it becomes full reference 
 type as in option a (though it sucks).

 Sorry but b is no go, as it makes code unpredictable I'd rather 
 take my idea with separate ranges then b.

 Solutions:
  a) Drop the union, make all functions  safe (compared to 
  trusted), and they are all ref/slices
  b) Leave as sometimes ref, sometimes not; Use .dup when you 
 NEED to ensure different unique copies.
  c) Apply COW (Copy-On-Write) where a known slice type 
 (canExpand is false) when a change would apply, it replaces the 
 present slice into it's own unique array.
  d) All slices become new dup'ed allocated arrays (Most 
 expensive of them all)

Agreed, B is a bad choice. I'm going to suggest c myself, as changes would only propagate to a new true copy, otherwise it's read-only as a slice.
 Because of d, I proposed that slices be separate type that is 
 always a reference to original. Then only coping arrays 
 themselves involves dup.

Mmmm... i think you're right...
 No it doesn't. bitfields are used to handle the 
 beginning/ending offsets, equaling a total of 14 bits. It's 
 the most confusing part (aside from [offset + sliceoffset + i] 
 formulas). canExpand and isCompact fill the other two bits.

OK, finally I think I understand how your current version works. It's clever but leaves you in half-way value-reference state.

I didn't want to originally use the 1-byte per startBit/endingBit, but it does actually work when you have the functions working.
 To point out few problems with unification of slice type and 
 BitArray:

 - Every time a small slice is taken say [1..30] it becomes a 
 small array(?) thus it is a value type and doesn't affect 
 original array

No, slices of compact arrays remain compact arrays. slices of larger arrays are reference until length is set, dupped, or appended to (forcing a resize/dup)
 - No idea if a = b make a copy will make programmers nervous, 
 esp if one is writing a library function as in this case there 
 is no assumption that fits all

If it's compact, it's an auto separate copy/valuetype. If it's not compact, it's a reference to the original.
 - Same thing but more, does this work?
 	auto c = ba; //if ba is compact
 	ba ~= true;

ba may or may not be a normal/compact at this point
 	assert(c.length = ba.length-1);
 	//c should remain with the same length unaffected (at least 
 like built in arrays)

Most likely it would be compact afterwards (Even if it allocated memory). I guess after going this over, two things need to happen. First it needs to be non-reference type, I don't want to go option d route, so I'll go for c instead. On a copy I'll add a postblit that makes the 'copy' now a slice which will create a new array when it changes in any way. Only question is when you pass the BitArray to a function as you showed, it would still think it's the original, unless the postblit/opAssign (or similar) can handle that. Hmmm time for some testing.
Jul 29 2012
prev sibling parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Sunday, 29 July 2012 at 06:27:32 UTC, Dmitry Olshansky wrote:
 Thus BitArray either becomes value type (that may come as 
 unexpected, see your option c/d). Or it becomes full reference 
 type as in option a (though it sucks). Sorry but b is no go, as 
 it makes code unpredictable I'd rather take my idea with 
 separate ranges then b.

I've had a thought. Why not make it fully reference and at the same time not? Consider making a slice which is the real BitArray, then you have an outer container that when you want value semantics you use it and it would make appropriate copies every time it needed to (including when first put in the container). It would be implicitly convertible to a slice. So... BitArraySlice { //real bitarray stuff and pointer ref } BitArray { //value semantics wrapper? BitArraySlice ba; alias ba this; //include wrapper versions of binary operators this(this) { } } Doing it this way the arguments being passed as reference slices would ensure unneeded copies wouldn't be made. On the other hand a smaller portion of the code would need to be updated. Along with that a slice could be set as COW allowing quite a bit of flexibility. Thoughts?
Jul 29 2012
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 29-Jul-12 12:55, Era Scarecrow wrote:
 On Sunday, 29 July 2012 at 06:27:32 UTC, Dmitry Olshansky wrote:
 Thus BitArray either becomes value type (that may come as unexpected,
 see your option c/d). Or it becomes full reference type as in option a
 (though it sucks). Sorry but b is no go, as it makes code
 unpredictable I'd rather take my idea with separate ranges then b.

I've had a thought. Why not make it fully reference and at the same time not? Consider making a slice which is the real BitArray, then you have an outer container that when you want value semantics you use it and it would make appropriate copies every time it needed to (including when first put in the container). It would be implicitly convertible to a slice. So... BitArraySlice { //real bitarray stuff and pointer ref } BitArray { //value semantics wrapper? BitArraySlice ba; alias ba this; //include wrapper versions of binary operators this(this) { } } Doing it this way the arguments being passed as reference slices would ensure unneeded copies wouldn't be made. On the other hand a smaller portion of the code would need to be updated. Along with that a slice could be set as COW allowing quite a bit of flexibility. Thoughts?

Hm... and small array optimizations go where? BitArraySlice won't be a full reference type if it does it. -- Dmitry Olshansky
Jul 29 2012
parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Sunday, 29 July 2012 at 09:03:08 UTC, Dmitry Olshansky wrote:
 On 29-Jul-12 12:55, Era Scarecrow wrote:
 Doing it this way the arguments being passed as reference 
 slices would ensure unneeded copies wouldn't be made. On the 
 other hand a smaller portion of the code would need to be 
 updated. Along with that a slice could be set as COW allowing 
 quite a bit of flexibility.

  Thoughts?

Hm... and small array optimizations go where? BitArraySlice won't be a full reference type if it does it.

I was thinking about that; If need be the slice can point to another slice which holds the actual small array (Kinda a long workaround). The way to small array optimization to work would be if BitArray generated it, since when you go to slicing you either reference the original data in the slice, or you could drop the small array optimization; although a loss it would simplify code until you come back to it once the code is working the way it should. Another idea is to convert the struct into a class instead, which would allow some options, but won't be a perfect fit, and inheritance could be used; which would solve some of the small array optimization, but at what cost?
Jul 29 2012
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 29-Jul-12 13:18, Era Scarecrow wrote:
 On Sunday, 29 July 2012 at 09:03:08 UTC, Dmitry Olshansky wrote:
 On 29-Jul-12 12:55, Era Scarecrow wrote:
 Doing it this way the arguments being passed as reference slices
 would ensure unneeded copies wouldn't be made. On the other hand a
 smaller portion of the code would need to be updated. Along with that
 a slice could be set as COW allowing quite a bit of flexibility.

  Thoughts?

Hm... and small array optimizations go where? BitArraySlice won't be a full reference type if it does it.

I was thinking about that; If need be the slice can point to another slice which holds the actual small array (Kinda a long workaround). The way to small array optimization to work would be if BitArray generated it, since when you go to slicing you either reference the original data in the slice, or you could drop the small array optimization; although a loss it would simplify code until you come back to it once the code is working the way it should. Another idea is to convert the struct into a class instead, which would allow some options, but won't be a perfect fit, and inheritance could be used; which would solve some of the small array optimization, but at what cost?

I have simpler suggestion. Maybe doing it as 2 containers: BitSet is a plain value type with small array optimization (what you called BitArray in last 2 posts) and no slicing (only as opSlice() i.e. as a whole). Best used for small things under < 1000 bits (as even dup will be cheap enough). BitArray is class or struct that has reference smenatics and drops small array optimization in favor of having simple & fast slicing Best used for big arrays with no less then 100-200 bits. -- Dmitry Olshansky
Jul 29 2012
next sibling parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Sunday, 29 July 2012 at 09:30:06 UTC, Dmitry Olshansky wrote:
 I have simpler suggestion. Maybe doing it as 2 containers:

 BitSet is a plain value type with small array optimization 
 (what you called BitArray in last 2 posts) and no slicing (only 
 as opSlice() I.e. as a whole). Best used for small things under 
 < 1000 bits (as even dup will be cheap enough).

 BitArray is class or struct that has reference smenatics and 
 drops small array optimization in favor of having simple & fast 
 slicing Best used for big arrays with no less then 100-200 bits.

As long as you only use one or the other then it sounds great; I'm sure a conversion/constructor would be available to convert from one to another. That does seem like the best solution, but also feels like we're building it twice (Little code re-use). Might be a third option: A template struct that represents a BitArray type (with 90% of the code); you then insert a structure with key functions that would handle the value/reference & duplication part. I'll meditate on it for a few days and see if anything pops up.
Jul 29 2012
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 29-Jul-12 14:28, Era Scarecrow wrote:
 On Sunday, 29 July 2012 at 09:30:06 UTC, Dmitry Olshansky wrote:
 I have simpler suggestion. Maybe doing it as 2 containers:

 BitSet is a plain value type with small array optimization (what you
 called BitArray in last 2 posts) and no slicing (only as opSlice()
 I.e. as a whole). Best used for small things under < 1000 bits (as
 even dup will be cheap enough).

 BitArray is class or struct that has reference smenatics and drops
 small array optimization in favor of having simple & fast slicing Best
 used for big arrays with no less then 100-200 bits.

As long as you only use one or the other then it sounds great; I'm sure a conversion/constructor would be available to convert from one to another. That does seem like the best solution, but also feels like we're building it twice (Little code re-use).

Yup.
   Might be a third option: A template struct that represents a BitArray
 type (with 90% of the code); you then insert a structure with key
 functions that would handle the value/reference & duplication part.

Sounds a lot like mixin template. I'd try to invent some small set of primitive operations and auto-magically construct the others from it.
   I'll meditate on it for a few days and see if anything pops up.

And just to keep your mind busy :) how about adding SIMD to speed it all up: http://dlang.org/simd.html It's kind of new & cool thing but not very well documented. -- Dmitry Olshansky
Jul 29 2012
parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Sunday, 29 July 2012 at 10:34:50 UTC, Dmitry Olshansky wrote:
 Might be a third option: A template struct that represents a 
 BitArray type (with 90% of the code); you then insert a 
 structure with key functions that would handle the 
 value/reference & duplication part.

Sounds a lot like mixin template. I'd try to invent some small set of primitive operations and auto-magically construct the others from it.

Pretty much... syntactical sugar, just add tea (T) I made a funny! :D
 I'll meditate on it for a few days and see if anything pops up.

And just to keep your mind busy :) how about adding SIMD to speed it all up: http://dlang.org/simd.html It's kind of new & cool thing but not very well documented.

Mmm I'll look it over. If it can work for larger bulk operations I don't see why not.
Jul 29 2012
prev sibling parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Sunday, 29 July 2012 at 09:30:06 UTC, Dmitry Olshansky wrote:
 I have simpler suggestion. Maybe doing it as 2 containers:

 BitSet is a plain value type with small array optimization 
 (what you called BitArray in last 2 posts) and no slicing (only 
 as opSlice() I.e. as a whole). Best used for small things under 
 < 1000 bits (as even dup will be cheap enough).

 BitArray is class or struct that has reference semantics and 
 drops small array optimization in favor of having simple & fast 
 slicing Best used for big arrays with no less then 100-200 bits.

Another thought coming to mind, is to have the BitArray with a few different modes it could operate in. One of the modes could be 'reference' which would make slices and them all work exactly as you would expect, while with referencing off it would act as a normal value type struct, since the mode/flags would modify how the postblit and assignment operators worked, along with the length. (Reference wouldn't use small array optimization). But having them statically separated by name/type seems much more likely to be safer in the long run with reliable results.
Jul 29 2012
parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Sunday, 29 July 2012 at 12:39:13 UTC, Era Scarecrow wrote:
  But having them statically separated by name/type seems much 
 more likely to be safer in the long run with reliable results.

A question regarding templates. A template with different parameters is completely incompatible correct? So... struct X(T) { } alias X!bool XB; alias X!int XI; void func(XB xb) { } func(XB()); //works func(XI()); //should fail to compile Same if it was built differently? so... template X(bool something) //'bool something' should be statically checkable struct XY { static if (something) { //conditional functions or code } XY makeX(){ //XY type is only currently formed template version correct? return XY(); } } } //against above func, with these two... alias X!(false).XY XB; alias X!(true).XY XI; Course if there's a way to avoid asking about the inner XY, I wonder how I would do that. Now if all that is correct, say I want to make two functions that both use X, but are not compatible, but template functions will allow it. So... void tempFunc(T, U)(T t, U u) if( //What do i enter to check they are both template X or struct XY? As signatures will be different... ) body{ //now acts as a middle-man to be compatible for said operation } Finally, I came across an anomaly and may just be a bug. What about a postblits? T func2(T)(T x) safe pure { return T(); } struct XY { this(this) safe pure {} //safe pure added so func can call it void func(XY x) safe pure { XY y = x; //postblits called on all three lines func2(x); func2(y); } } template X(bool something) { struct XY { this(this) safe pure {} void func(XY x) safe pure { XY y = x; //Error: see below func2(x); func2(y); } } } alias X!(true).XY Xtrue; pure function 'func' cannot call impure function '__cpctor' safe function 'func' cannot call system function '__cpctor'
Jul 30 2012
next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 30-Jul-12 23:50, Era Scarecrow wrote:
 On Sunday, 29 July 2012 at 12:39:13 UTC, Era Scarecrow wrote:
  But having them statically separated by name/type seems much more
 likely to be safer in the long run with reliable results.

A question regarding templates. A template with different parameters is completely incompatible correct? So...

Yeah, like twin brothers the have different IDs :)
 struct X(T) {
 }

 alias X!bool XB;
 alias X!int XI;

 void func(XB xb) {

 }

 func(XB()); //works
 func(XI()); //should fail to compile

 Same if it was built differently? so...

 template X(bool something) //'bool something' should be statically
 checkable
    struct XY {
      static if (something) {
        //conditional functions or code
      }
      XY makeX(){
        //XY type is only currently formed template version correct?
        return XY();
      }
    }
 }

 //against above func, with these two...
 alias X!(false).XY XB;
 alias X!(true).XY XI;

 Course if there's a way to avoid asking about the inner XY, I wonder how
 I would do that.


   Now if all that is correct, say I want to make two functions that both
 use X, but are not compatible, but template functions will allow it. So...

 void tempFunc(T, U)(T t, U u)
 if(
    //What do i enter to check they are both template X or struct XY? As
 signatures will be different...
 )
 body{
    //now acts as a middle-man to be compatible for said operation
 }

Not sure what you would like to accomplish here.
 alias X!(true).XY Xtrue;

 pure function 'func' cannot call impure function '__cpctor'
 safe function 'func' cannot call system function '__cpctor'

Yeah, it's a bug. File it please. -- Dmitry Olshansky
Jul 30 2012
parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Monday, 30 July 2012 at 20:19:51 UTC, Dmitry Olshansky wrote:
 Not sure what you would like to accomplish here.

Than an example... struct BitArray { //assume template... ref BitArray opSliceAssign(T)(const T ba, int start, int end) if ( //if T is type bitArray but only a different change regarding parameters //we can use canUseBulk and other speedups. Otherwise a range is required. //checking all possibilities seems silly, but possible ) body { //slice specific speedup and code return this; } int opEquals(T)(const T ba) const //if constraint as above body {} } Code wise; If BitSet is always value semantics and BitArray is always reference (a previous post) and slices/groups work basically the same; then... BitSet bs = BitSet(100); BitArray ba = BitArray(100); bs[0 .. 10] = ba[0 .. 10]; //should work assert(bs[0 .. 10] == ba[0 .. 10]); //should also be possible to work.
 alias X!(true).XY Xtrue;

 pure function 'func' cannot call impure function '__cpctor'
 safe function 'func' cannot call system function '__cpctor'

Yeah, it's a bug. File it please.

Alright... Considered a major (Maybe even blocking). Reminds me. Due to how rvalue/lvalues work, assuming I need to const reference something but I don't care if it's a temporary or not, would I then do... struct X { int opCmp(const X x) const { //catch temporary return opCmp(x); //becomes reference, as it's now named 'x' } int opCmp(const ref X x) const { //do work here return 0; } } X x; assert(x == x); //ref assert(x == X()); //temporary course if the 'in' keyword is the key i will have to test that to make sure.
Jul 30 2012
next sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Monday, 30 July 2012 at 21:03:39 UTC, Era Scarecrow wrote:
 Alright... Considered a major (Maybe even blocking).

http://d.puremagic.com/issues/show_bug.cgi?id=8475
Jul 30 2012
prev sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 31-Jul-12 01:03, Era Scarecrow wrote:
 On Monday, 30 July 2012 at 20:19:51 UTC, Dmitry Olshansky wrote:
 Not sure what you would like to accomplish here.

Than an example...

You can go for simpler separation: struct BitArray{ //() - is an empty template spec ref BitArrayopSliceAssign()(const BitArray ba, int start, int end) { //two bit array can try balk mode etc.
   Reminds me. Due to how rvalue/lvalues work, assuming I need to const
 reference something but I don't care if it's a temporary or not, would I
 then do...

 struct X {
    int opCmp(const X x) const { //catch temporary
      return opCmp(x); //becomes reference, as it's now named 'x'
    }
    int opCmp(const ref X x) const {
      //do work here
      return 0;
    }
 }

 X x;
 assert(x == x);   //ref
 assert(x == X()); //temporary

   course if the 'in' keyword is the key i will have to test that to make
 sure.

in == scope const not sure what scope buys here but couldn't hurt. -- Dmitry Olshansky
Jul 30 2012
parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Monday, 30 July 2012 at 21:56:20 UTC, Dmitry Olshansky wrote:
 You can go for simpler separation:

 struct BitArray{
 //() - is an empty template spec
 	ref BitArrayopSliceAssign()(const BitArray ba, int start, int 
 end)
 {
 //two bit array can try balk mode etc.

I'll give it a try, it may very well be what I need; although I wouldn't have thought you could declare it like that. I have much to learn in the ways of the template.
  course if the 'in' keyword is the key i will have to test 
 that to make sure.

in == scope const not sure what scope buys here but couldn't hurt.

If it can avoid making a new copy, then it likely would help. I'll need to test it. I actually am not quite sure how scope works as an argument; it isn't really covered in TDPL from what I recall.
Jul 30 2012
parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Monday, 30 July 2012 at 22:23:46 UTC, Era Scarecrow wrote:
 On Monday, 30 July 2012 at 21:56:20 UTC, Dmitry Olshansky wrote:
 in == scope const not sure what scope buys here but couldn't 
 hurt.

If it can avoid making a new copy, then it likely would help. I'll need to test it. I actually am not quite sure how scope works as an argument; it isn't really covered in TDPL from what I recall.

Tests with 'in' doesn't help, only ref and catching a temporary seem to prevent postblits. Also the empty template doesn't work, somehow I'm not surprised... template X(bool something) { struct XT { void func()(XT x){ writeln("XT template called: typeID:", typeid(x)); } } } alias X!true.XT A; alias X!false.XT B; A a, b; B ba; a.func(b); //passes a.func(ba); Error: template test.X!(true).XT.func does not match any function template declaration Error: template test.X!(true).XT.func() cannot deduce template function from argument types !()(XT)
Jul 30 2012
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 31-Jul-12 02:40, Era Scarecrow wrote:
 On Monday, 30 July 2012 at 22:23:46 UTC, Era Scarecrow wrote:
 On Monday, 30 July 2012 at 21:56:20 UTC, Dmitry Olshansky wrote:
 in == scope const not sure what scope buys here but couldn't hurt.

If it can avoid making a new copy, then it likely would help. I'll need to test it. I actually am not quite sure how scope works as an argument; it isn't really covered in TDPL from what I recall.

Tests with 'in' doesn't help, only ref and catching a temporary seem to prevent postblits. Also the empty template doesn't work, somehow I'm not surprised... template X(bool something) { struct XT {

Fixed :
     void func(bool smth)(X!(smth).XT x){

By default XT is deduced as X!(current value of smth).XT
          writeln("XT template called: typeID:", typeid(x));
     }
   }
 }

 alias X!true.XT A;
 alias X!false.XT B;

 A a, b;
 B ba;

 a.func(b); //passes
 a.func(ba);

 Error: template test.X!(true).XT.func does not match any function
 template declaration
 Error: template test.X!(true).XT.func() cannot deduce template function
 from argument types !()(XT)

-- Dmitry Olshansky
Jul 30 2012
parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Monday, 30 July 2012 at 22:44:21 UTC, Dmitry Olshansky wrote:
 Fixed :

    void func(bool smth)(X!(smth).XT x){

By default XT is deduced as X!(current value of smth).XT

Doesn't really fix it... a.func(b); //65 - doesn't match declaration. a.func(ba); //66 //other template tests not worth putting, 67-70 a.func!true(b); //71 - explicitly stated a.func!false(ba); //72 a.func!false(b); //73 - Backwards on purpose for test a.func!true(ba); //74 (65): Error: template test.X!(true).XT.func does not match any function template declaration (11): Error: template test.X!(true).XT.func(bool smth) cannot deduce template function from argument types !()(XT) (66): Error: template test.X!(true).XT.func does not match any function template declaration (11): Error: template test.X!(true).XT.func(bool smth) cannot deduce template function from argument types !()(XT) (73): Error: function test.X!(true).XT.func!(false).func (XT x) is not callable using argument types (XT) (73): Error: cannot implicitly convert expression (b) of type XT to XT (74): Error: function test.X!(true).XT.func!(true).func (XT x) is not callable using argument types (XT) (74): Error: cannot implicitly convert expression (ba) of type XT to XT
Jul 30 2012
prev sibling parent reply Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Mon, Jul 30, 2012 at 9:50 PM, Era Scarecrow <rtcvb32 yahoo.com> wrote:

  A question regarding templates. A template with different parameters is
 completely incompatible correct?

Correct. They have no reason, in general, too even generate the same code: template Chameleon(T) { static if (is(T == struct)) enum Chameleon = "I'm a string!"; // Chamelon!(SomeStruct) is a string value else static if (is(T == class)) void Chameleon(int i) { return i+1;} // Chameleon!(SomeClass) is a function else alias T[int] Chameleon; // Here Chameleon is a type } The same for you struct S(T) { ... }. Inside the braces, anything can happen (mwwaahhha hhaa!), depending on T.
 So...

 struct X(T) {
 }

 alias X!bool XB;
 alias X!int XI;

 void func(XB xb) {

 }

 func(XB()); //works
 func(XI()); //should fail to compile

Yes.
  Now if all that is correct, say I want to make two functions that both use
 X, but are not compatible, but template functions will allow it. So...

I'm not sure I understand what you're trying to do. Do you mean you want a function that accept X!(T)s for any T, and not any other type? struct X(T) {} void func(T)(X!T x) {} void main() { X!bool b; X!int i; func(b); func(i); }
 void tempFunc(T, U)(T t, U u)
 if(
   //What do i enter to check they are both template X or struct XY? As
 signatures will be different...
 )
 body{
   //now acts as a middle-man to be compatible for said operation
 }

Here you want a constraint that checks that U and T are both X!(SomeType) or an XY? void tempFunc(T,U)(T t, U u) if (is(T a == X!(SomeType), SomeType) && is(U a == X!(SomeType), SomeType) || is(T == XY) && is(U == XY)) { ... } Is that what you need?
Jul 30 2012
parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Monday, 30 July 2012 at 20:48:26 UTC, Philippe Sigaud wrote:
 Now if all that is correct, say I want to make two functions 
 that both use X, but are not compatible, but template 
 functions will allow it. So...

I'm not sure I understand what you're trying to do. Do you mean you want a function that accept X!(T)s for any T, and not any other type?

Anything of template X! previous post of mine shows sort of an example.
 struct X(T) {}

 void func(T)(X!T x)
 {}

 void main()
 {
     X!bool b;
     X!int i;
     func(b);
     func(i);
 }

Hmmm i do think that seems right... but if it contains multiple parameters, then...? template X(x1, x2, x3) { struct XT {} }
 void func(T)(X!T x) {}

Will this still work?
 Here you want a constraint that checks that U and T are both
 X!(SomeType) or an XY?

As before, of Template X, and struct XY, but T (or sometype) doesn't matter.
 void tempFunc(T,U)(T t, U u) if (is(T a == X!(SomeType), 
 SomeType) &&
 is(U a == X!(SomeType), SomeType)
                                                 || is(T == XY) 
 && is(U == XY))
 {
 ...
 }
 Is that what you need?

I want to say no... But I haven't actually tested it for my use cases. Leaving it unconstrained (without checking) to make it work is a disaster waiting to happen: I want T and U to both be of the same template (X in this case) but not the same instantiation arguments. I hope I'm not repeating myself too much.
Jul 30 2012
parent reply Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Mon, Jul 30, 2012 at 11:19 PM, Era Scarecrow <rtcvb32 yahoo.com> wrote:

 void func(T)(X!T x)
 {}

 void main()
 {
     X!bool b;
     X!int i;
     func(b);
     func(i);
 }

Hmmm i do think that seems right... but if it contains multiple parameters, then...? template X(x1, x2, x3) { struct XT {} }
 void func(T)(X!T x) {}

Will this still work?

No. Use a 3-params template or a tuple: void func(A,B,C)(X!(A,B,C) x) {} or void func(Ts...)(X!(Ts) x) {} or even, more general: void func(T)(T x) if (is(T t = X!(SomeTypes), SomeTypes...)), because template constrains are much more powerful and general than fiddling arguments types. There was a discussion between Simen Kjaeraas and me on the subject not long ago. Kenji Hara extended the functionnality of is( , T...) and did a pull request. I'm not sure it's in 2.060, the request was done a week ago.
 void tempFunc(T,U)(T t, U u) if (is(T a == X!(SomeType), SomeType) &&
 is(U a == X!(SomeType), SomeType)
                                                 || is(T == XY) && is(U ==
 XY))
 {
 ...
 }
 Is that what you need?

I want to say no... But I haven't actually tested it for my use cases. Leaving it unconstrained (without checking) to make it work is a disaster waiting to happen: I want T and U to both be of the same template (X in this case) but not the same instantiation arguments.

Said like this, it's much more clear for me :) Note that in the above code, SomeType is a fresh variable in each is() expression. So, even though I used the same name (shouldn't have done that), T and U can have different type parameters. *shameless plug* Btw, I wrote a tutorial on templates, you can find it here: https://github.com/PhilippeSigaud/D-templates-tutorial/blob/master/dtemplates.pdf (click on 'view raw', that should download the pdf) That should shed some light on the matters at hand.
Jul 30 2012
parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Tuesday, 31 July 2012 at 05:27:58 UTC, Philippe Sigaud wrote:
 No. Use a 3-params template or a tuple:

 void func(A,B,C)(X!(A,B,C) x) {}
 or
 void func(Ts...)(X!(Ts) x) {}

I don't know how many arguments it will have (depends on how many options I give it), and I honestly don't; It should be invisible for the end user as long as we can get the the constraints properly accepted.
 or even, more general:

 void func(T)(T x) if (is(T t = X!(SomeTypes), SomeTypes...)), 
 because template constrains are much more powerful and general 
 than fiddling arguments types.

That doesn't make sense; but I don't have a full grasp of the templates (continued below).
 There was a discussion between Simen Kjaeraas and me on the 
 subject not long ago. Kenji Hara extended the functionnality of 
 is( , T...) and did a pull request. I'm not sure it's in 2.060, 
 the request was done a week ago.

so the (, T...) is shorthand for testing the template type with multiple arguments. Nice. So multiple arguments would be... struct X (T, U) {} if(is(T t = X!(bool, bool), (int, int), (int, bool))) //?
 I want to say no... But I haven't actually tested it for my 
 use cases. Leaving it unconstrained (without checking) to make 
 it work is a disaster waiting to happen: I want T and U to 
 both be of the same template (X in this case) but not the same 
 instantiation arguments.

Said like this, it's much more clear for me :) Note that in the above code, SomeType is a fresh variable in each is() expression. So, even though I used the same name (shouldn't have done that), T and U can have different type parameters.

 *shameless plug*
 Btw, I wrote a tutorial on templates, you can find it here:

 https://github.com/PhilippeSigaud/D-templates-tutorial/blob/master/dtemplates.pdf

 (click on 'view raw', that should download the pdf)
 That should shed some light on the matters at hand.

Indeed... I remember glancing through it, but never really needed it badly; Now it seems I do. At least it's not C++ and it's templates. I'll be reading up on them starting tonight (and printing most of it); It is after all the hardest part (and most powerful part) of the language :P
Jul 30 2012
prev sibling parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
  As BitArray is coming back up and my resumed work I'll comment a 
few questions and suggestions and go from there. I'll polish up 
the code and try to resubmit it.

On Saturday, 28 July 2012 at 21:07:31 UTC, Jonathan M Davis wrote:
 I would point out that while hasSlicing doesn't currently check 
 for it, if opSlice doesn't return a type which is assignable to 
 the original range, it won't work in a lot of Phobos functions. 
 I keep meaning to bring that up for discussion in the main 
 newsgroup. I'd argue that hasSlicing really be changed from

Hmmm.. Well Since the 'sometimes ref' as discussed won't seem to work well from before I'm considering to put the following two rules into effect. 1) If it's an allocated block, it stays allocated. 2) If you dup a BitArray and it's small enough to be compact, it converts. 3) If it's a small/compact array and you give out a slice, it converts the original block to dynamic before returning the new slice. With 1 & 2, reserved and length work a little differently, more ref friendly. With 3 the only issue comes up with if it's const, or a local variable. const BitArray x = BitArray(32); func(x); Would the function get passed the original compact buffer? Copy of x or a dup? May not be so much an issue. However... BitArray x = BitArray(32); const y = x[]; //slice Now if y points to the actual compact buffer and we do the following... x.length = 256; //allocates if it was compact Although y will have const access the data instead of a compact array it's a pointer to an array and broken. If during the x call it reallocated then y would work fine no matter what in that case. If x can't reallocate it, then the issue remains but is much smaller than before, most likely a dup would be safest. Thoughts? Questions? Ideas?
Dec 31 2012
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
12/31/2012 9:35 PM, Era Scarecrow пишет:
   As BitArray is coming back up and my resumed work I'll comment a few
 questions and suggestions and go from there. I'll polish up the code and
 try to resubmit it.

Yay!
 On Saturday, 28 July 2012 at 21:07:31 UTC, Jonathan M Davis wrote:
 I would point out that while hasSlicing doesn't currently check for
 it, if opSlice doesn't return a type which is assignable to the
 original range, it won't work in a lot of Phobos functions. I keep
 meaning to bring that up for discussion in the main newsgroup. I'd
 argue that hasSlicing really be changed from

Hmmm.. Well Since the 'sometimes ref' as discussed won't seem to work well from before I'm considering to put the following two rules into effect. 1) If it's an allocated block, it stays allocated. 2) If you dup a BitArray and it's small enough to be compact, it converts. 3) If it's a small/compact array and you give out a slice, it converts the original block to dynamic before returning the new slice. With 1 & 2, reserved and length work a little differently, more ref friendly. With 3 the only issue comes up with if it's const, or a local variable.

Personally I believe that if we introduce a slice of a BitArray as a separate range type that represents a view of array (but can't be appended to, etc. it's just a RA range with slicing).
   const BitArray x = BitArray(32);
   func(x);

Then here x is passed either by ref or a copy (depending on func).
   Would the function get passed the original compact buffer? Copy of x
 or a dup?
   May not be so much an issue. However...

   BitArray x = BitArray(32);
   const y = x[]; //slice

Then y has type BitArraySlice and references x.
   Now if y points to the actual compact buffer and we do the following...

   x.length = 256; //allocates if it was compact

   Although y will have const access the data instead of a compact array
 it's a pointer to an array and broken.

Then IMO that should be a slice invalidation (akin to c++ iterator invalidation). Depending on how slice is implemented it could avoid being invalidated in this case but in some others it surely will (e.g. shrinking).
 If during the x call it

You mean x[] call i.e. opSlice() call? Seems like an overkill and semantically bad as I totally expect that the following to be true for any Array-like container: auto cont = ...; auto range = cont[]; range[0] = 1; assert(cont[0] == 1); Also: foreach(v; x) --> foreach(v; x[]) if x defines opSlice however since BitArray defines opApply it's called instead so this one is probably OK.
 reallocated then y would work fine no matter what in that case. If x
 can't reallocate it, then the issue remains but is much smaller than
 before, most likely a dup would be safest.

   Thoughts? Questions? Ideas?

The link to the latest code would be cool ;) I'm thinking that the main problem is trying to mimic built-in arrays. They are not ordinary containers in that they easily combine slice and container that is (in fact) partially hidden in run-time & GC implementation. The fact that combination of language features and run-time support makes it easy shouldn't encourage other to go this path in the library. For one thing this way a small string optimization is not achievable because of the many slices that have to reference the data packed in _small_ chunk. Then whenever original array changes (and becomes not small) any other slice to it have to somehow reflect this change. And it's hard if slices internally are exactly the same as original (if the same type). ... Now that I tested the current behavior that we shouldn't break a lot. Clearly current BitArray is old and rusty D1 design. BitArray a, b, c; //the interesting thing is that BitArray(32) doesn't construct BitArray with length 32 but some explosive garbage !! c.length = 512; a.length = 32; assert(a.length == 32); b = a; b[0] = true; //this one is bad as it means both a & b has to reference the same data assert(a[0] == 1); b ~= c; b[1] = 1; //the next fails for me (it also may not fail depending on c's length) assert(a[1] == 1); Currently I think that D container design have to be extended to include 2 kinds of containers: -small containers, values type a-la C++ STL these all use small container optimization (as they can cleanly) -big heavy-duty ones, these have reference semantics and tuned for large sets of values Then the proposed way out is to make 2 containers: - current BitArray to always have reference semantics (both asserts in the above always pass); - and BitString or BitVector to be small value semantic array that is always copied/duped on assign, these should be passed by ref or by range/slice. That's was lengthy but what do you think? And it seems like we'd need to bring this last topic to general D NG. -- Dmitry Olshansky
Jan 02 2013
parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Wednesday, 2 January 2013 at 21:00:38 UTC, Dmitry Olshansky 
wrote:
 12/31/2012 9:35 PM, Era Scarecrow пишет:

 Personally I believe that if we introduce a slice of a BitArray 
 as a separate range type that represents a view of array (but 
 can't be appended to, etc. it's just a RA range with slicing).

Could do that I suppose, but then either exact length or append/alignment optimizations may go out the window if we remove some of those as options. To support those making slices part of the main type was easy enough. But the other issues still have to be addressed (that's mentioned below)
  const BitArray x = BitArray(32);
  func(x);

Then here x is passed either by ref or a copy (depending on func).

yeah... assuming func returned a struct of... struct S { BitArray x; //other stuff } const(BitArray) func(const BitArray s); in the case it's small string optimization and the original string gets allocated OR gets passed outside the original function that made the BitArray. So small string if possible, convert to ref and slice it, or dup it. not 100% ref in all cases but pretty close.
  Would the function get passed the original compact buffer? 
 Copy of x or a dup?  May not be so much an issue. However...

  BitArray x = BitArray(32);
  const y = x[]; //slice

Then y has type BitArraySlice and references x.

Agreed, although I'm still iffy about a separate slice as a range. Could do the range just to give it limitations, but seems like a headache to do less and extra code to manage. Current code I'm writing converts (if possible) the BitArray to a allocated type, then passes that slice to y.
  Now if y points to the actual compact buffer and we do the 
 following...

  x.length = 256; //allocates if it was compact

  Although y will have const access the data instead of a 
 compact array it's a pointer to an array and broken.

Then IMO that should be a slice invalidation (akin to c++ iterator invalidation). Depending on how slice is implemented it could avoid being invalidated in this case but in some others it surely will (e.g. shrinking).

Maybe, with the small string optimization (no allocation) it isn't so much an easy option. That's what this is about. Say it uses the compact (fixed string) and in opSlice... BitArray { bool isCompact; bool canExpand; //represents owner for slices or compact. union { size_t[] normal; size_t[2] compact; } BitArray opSlice() { BitArray ba = this; ba.canExpand = false; if (isCompact) { ba.isCompact = false; ba.normal = cast(size_t[]) this.compact[0 .. $]; } return ba; } } This is where the allocation could cause a problem as the slice could be referring to stack memory at which all the data could be duds. Worse if you overwrite that data suddenly the allocated memory could point to something else! I think it's safe to say that's a bad idea.
 If during the x call it

You mean x[] call i.e. opSlice() call?

well both this(this) and opSlice() would be called at different times so they would have different meanings (copying vs slicing).
 Seems like an overkill and semantically bad as I totally expect 
 that the following to be true for any Array-like container:
 auto cont = ...;
 auto range = cont[];
 range[0] = 1;
 assert(cont[0] == 1);

 Also:
 foreach(v; x) --> foreach(v; x[]) if x defines opSlice

 however since BitArray defines opApply it's called instead so 
 this one is probably OK.

Does that mean we might be able to drop opApply? Hmmm... Depends on if we have enough implemented so the range accepts it. I'll glance over std.range.
 reallocated then y would work fine no matter what in that 
 case. If x can't reallocate it, then the issue remains but is 
 much smaller than before, most likely a dup would be safest.

  Thoughts? Questions? Ideas?

The link to the latest code would be cool ;)

https://github.com/rtcvb32/phobos/blob/BitArray-Updates/std/bitmanip.d There's a merge issue due to having tried to keep the bitfields and bitarray separated; That seems to have made more problems than solutions. Working on it but may have to wait until I've read my Git book when it arrives before I understand this all.
 I'm thinking that the main problem is trying to mimic built-in 
 arrays.

Not really, it's trying to do small string optimization. I suppose if this was done as a class instead most of these problems could/would go away.
 They are not ordinary containers in that they easily combine 
 slice and container that is (in fact) partially hidden in 
 run-time & GC implementation.

 The fact that combination of language features and run-time 
 support makes it easy shouldn't encourage other to go this path 
 in the library.

 For one thing this way a small string optimization is not 
 achievable because of the many slices that have to reference 
 the data packed in _small_ chunk. Then whenever original array 
 changes (and becomes not small) any other slice to it have to 
 somehow reflect this change. And it's hard if slices internally 
 are exactly the same as original (if the same type).

Really depends on the use case. I'm guessing there's enough call for the 64bit small packed bitarray that justifies it, otherwise it would be better to throw it out. As you mention later, separating them does seem like a better idea.
 ...

 Now that I tested the current behavior that we shouldn't break 
 a lot. Clearly current BitArray is old and rusty D1 design.

     BitArray a, b, c;
     //the interesting thing is that BitArray(32) doesn't 
 construct BitArray with length 32 but some explosive garbage !!

     c.length = 512;
     a.length = 32;
     assert(a.length == 32);
     b = a;
     b[0] = true;
 //this one is bad as it means both a & b has to reference the 
 same data
     assert(a[0] == 1);
     b ~= c;
     b[1] = 1;
     //the next fails for me (it also may not fail depending on 
 c's length)
     assert(a[1] == 1);

My code tests that the last two asserts fail, course, a being under 64bits remains with small string optimization. Making it slice... seems I'll add this to my list of tests and get it fixed. Even if i reserved the space it still failed the last one. Mmm...
 Currently I think that D container design have to be extended 
 to include 2 kinds of containers:

  -small containers, values type a-la C++ STL these all use 
 small container optimization (as they can cleanly)
  -big heavy-duty ones, these have reference semantics and tuned 
 for large sets of values

 Then the proposed way out is to make 2 containers:
 - current BitArray to always have reference semantics (both 
 asserts in the above always pass);
 - and BitString or BitVector to be small value semantic array 
 that is always copied/duped on assign, these should be passed 
 by ref or by range/slice.

 That's was lengthy but what do you think? And it seems like 
 we'd need to bring this last topic to general D NG.

hmmm.. Either would need some code duplication, or template to turn off/swap small string optimization. But separating them does seem like a good idea.
Jan 02 2013
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
1/3/2013 6:05 AM, Era Scarecrow пишет:
 On Wednesday, 2 January 2013 at 21:00:38 UTC, Dmitry Olshansky wrote:
 12/31/2012 9:35 PM, Era Scarecrow пишет:

 Personally I believe that if we introduce a slice of a BitArray as a
 separate range type that represents a view of array (but can't be
 appended to, etc. it's just a RA range with slicing).

Could do that I suppose, but then either exact length or append/alignment optimizations may go out the window if we remove some of those as options. To support those making slices part of the main type was easy enough. But the other issues still have to be addressed (that's mentioned below)
  const BitArray x = BitArray(32);
  func(x);

Then here x is passed either by ref or a copy (depending on func).

yeah... assuming func returned a struct of... struct S { BitArray x; //other stuff } const(BitArray) func(const BitArray s); in the case it's small string optimization and the original string gets allocated OR gets passed outside the original function that made the BitArray. So small string if possible, convert to ref and slice it, or dup it. not 100% ref in all cases but pretty close.
  Would the function get passed the original compact buffer? Copy of x
 or a dup?  May not be so much an issue. However...

  BitArray x = BitArray(32);
  const y = x[]; //slice

Then y has type BitArraySlice and references x.

Agreed, although I'm still iffy about a separate slice as a range. Could do the range just to give it limitations, but seems like a headache to do less and extra code to manage. Current code I'm writing converts (if possible) the BitArray to a allocated type, then passes that slice to y.
  Now if y points to the actual compact buffer and we do the following...

  x.length = 256; //allocates if it was compact

  Although y will have const access the data instead of a compact
 array it's a pointer to an array and broken.

Then IMO that should be a slice invalidation (akin to c++ iterator invalidation). Depending on how slice is implemented it could avoid being invalidated in this case but in some others it surely will (e.g. shrinking).

Maybe, with the small string optimization (no allocation) it isn't so much an easy option. That's what this is about. Say it uses the compact (fixed string) and in opSlice... BitArray { bool isCompact; bool canExpand; //represents owner for slices or compact. union { size_t[] normal; size_t[2] compact; } BitArray opSlice() { BitArray ba = this; ba.canExpand = false; if (isCompact) { ba.isCompact = false; ba.normal = cast(size_t[]) this.compact[0 .. $]; } return ba; } } This is where the allocation could cause a problem as the slice could be referring to stack memory at which all the data could be duds. Worse if you overwrite that data suddenly the allocated memory could point to something else! I think it's safe to say that's a bad idea.

Hm, I'd think that having Slice type be: BitArraySlice{ BitArray* bp; size_t start, end; // all else forwards to the pointed array } should work avoiding the most of code duplication. With any luck inliner will expand all of this wrapping. To enable bulk mode operations: void opOpSliceAssign(BitArraySlice rhs) {//has all the data to do it bulkOp(bp, start, end, rhs.bp, rhs.start, rhs.end); } then it's a question of bit of refactoring & tweaking of array bulks ops. Then the only problem left is slice invalidation and original array going out of scope.
 If during the x call it

You mean x[] call i.e. opSlice() call?

well both this(this) and opSlice() would be called at different times so they would have different meanings (copying vs slicing).
 Seems like an overkill and semantically bad as I totally expect that
 the following to be true for any Array-like container:
 auto cont = ...;
 auto range = cont[];
 range[0] = 1;
 assert(cont[0] == 1);

 Also:
 foreach(v; x) --> foreach(v; x[]) if x defines opSlice

 however since BitArray defines opApply it's called instead so this one
 is probably OK.

Does that mean we might be able to drop opApply? Hmmm... Depends on if we have enough implemented so the range accepts it. I'll glance over std.range.

Yes, it's got to be a range to work like this and then opApply is useless.
 reallocated then y would work fine no matter what in that case. If x
 can't reallocate it, then the issue remains but is much smaller than
 before, most likely a dup would be safest.

  Thoughts? Questions? Ideas?

The link to the latest code would be cool ;)

https://github.com/rtcvb32/phobos/blob/BitArray-Updates/std/bitmanip.d There's a merge issue due to having tried to keep the bitfields and bitarray separated; That seems to have made more problems than solutions. Working on it but may have to wait until I've read my Git book when it arrives before I understand this all.
 I'm thinking that the main problem is trying to mimic built-in arrays.

Not really, it's trying to do small string optimization. I suppose if this was done as a class instead most of these problems could/would go away.
 They are not ordinary containers in that they easily combine slice and
 container that is (in fact) partially hidden in run-time & GC
 implementation.

 The fact that combination of language features and run-time support
 makes it easy shouldn't encourage other to go this path in the library.

 For one thing this way a small string optimization is not achievable
 because of the many slices that have to reference the data packed in
 _small_ chunk. Then whenever original array changes (and becomes not
 small) any other slice to it have to somehow reflect this change. And
 it's hard if slices internally are exactly the same as original (if
 the same type).

Really depends on the use case. I'm guessing there's enough call for the 64bit small packed bitarray that justifies it, otherwise it would be better to throw it out. As you mention later, separating them does seem like a better idea.

I've meant the fact that it has to preserve current std lib behavior *and* has small string optimization that makes it overly painful and partially defeats the optimization (by requiring pointless compact->big conversions on slice etc.).
 ...

 Now that I tested the current behavior that we shouldn't break a lot.
 Clearly current BitArray is old and rusty D1 design.

     BitArray a, b, c;
     //the interesting thing is that BitArray(32) doesn't construct
 BitArray with length 32 but some explosive garbage !!

     c.length = 512;
     a.length = 32;
     assert(a.length == 32);
     b = a;
     b[0] = true;
 //this one is bad as it means both a & b has to reference the same data
     assert(a[0] == 1);
     b ~= c;
     b[1] = 1;
     //the next fails for me (it also may not fail depending on c's
 length)
     assert(a[1] == 1);

My code tests that the last two asserts fail, course, a being under 64bits remains with small string optimization. Making it slice... seems I'll add this to my list of tests and get it fixed. Even if i reserved the space it still failed the last one. Mmm...

Right, since it's _a_ that's tested and in current design it shares data with b unless that got relocated (b ~= c can relocate b).
 Currently I think that D container design have to be extended to
 include 2 kinds of containers:

  -small containers, values type a-la C++ STL these all use small
 container optimization (as they can cleanly)
  -big heavy-duty ones, these have reference semantics and tuned for
 large sets of values

 Then the proposed way out is to make 2 containers:
 - current BitArray to always have reference semantics (both asserts in
 the above always pass);
 - and BitString or BitVector to be small value semantic array that is
 always copied/duped on assign, these should be passed by ref or by
 range/slice.

 That's was lengthy but what do you think? And it seems like we'd need
 to bring this last topic to general D NG.

hmmm.. Either would need some code duplication, or template to turn off/swap small string optimization. But separating them does seem like a good idea.

I'd expect a fair chunk of code to be split off into free functions, then both containers would make use of them. Now note that there is also std.container Array!bool that has reference semantics and got to be doing bit packing... it could potentially take place of big BitArray mentioned. -- Dmitry Olshansky
Jan 02 2013
parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Thursday, 3 January 2013 at 07:57:46 UTC, Dmitry Olshansky 
wrote:
 1/3/2013 6:05 AM, Era Scarecrow wrote:

 Hm, I'd think that having Slice type be:

 BitArraySlice{
 BitArray* bp;
 size_t start, end;
 // all else forwards to the pointed array
 }
 should work avoiding the most of code duplication. With any 
 luck inliner will expand all of this wrapping. To enable bulk 
 mode operations:

 void opOpSliceAssign(BitArraySlice rhs)
 {//has all the data to do it
    bulkOp(bp, start, end, rhs.bp, rhs.start, rhs.end);
 }

 then it's a question of bit of refactoring & tweaking of array 
 bulks ops.

 Then the only problem left is slice invalidation and original 
 array going out of scope.

I know we've had this discussion before. So let's assume you do slicing this way. So what happens when... BitArray ba = [0,1,1,0]; ba = ba[1 .. $-1]; // ba is not a BitArraySlice Suddenly it won't work and slicing is only a range and can only be used in foreach. Correct? You could do appending but likely it would require two functions in the original code (one for a BitArray and one for a slice). It it supports a opAssign to handle that now you have to take care of all opAssign's as well (like C++, once you handle one portion of it and opCast, you handle them all).
 Does that mean we might be able to drop opApply? Hmmm... 
 Depends on if we have enough implemented so the range accepts 
 it. I'll glance over std.range.

Yes, it's got to be a range to work like this and then opApply is useless.

Maybe. Foreach if I remember right worked three different ways, and selected whichever one worked. opApply, front/pop/empty, and array (opIndex access). If enough is implemented that it recognizes it as an array then opApply can be removed. A quick tests shows that currently isn't the case.
 Really depends on the use case. I'm guessing there's enough 
 call for the 64bit small packed bitarray that justifies it, 
 otherwise it would be better to throw it out. As you mention 
 later, separating them does seem like a better idea.

I've meant the fact that it has to preserve current std lib behavior *and* has small string optimization that makes it overly painful and partially defeats the optimization (by requiring pointless compact->big conversions on slice etc.).

We could always 'dup' on slice instead, but then you can't use references on it, at which point it's a value type. Hmmm. Being a pure value type then a separate range makes more sense to use.
 hmmm.. Either would need some code duplication, or template to 
 turn off/swap small string optimization. But separating them 
 does seem like a good idea.

I'd expect a fair chunk of code to be split off into free functions, then both containers would make use of them. Now note that there is also std.container Array!bool that has reference semantics and got to be doing bit packing... it could potentially take place of big BitArray mentioned.

I'm aware of the container version, and overall if it can just replace BitArray as it is, then we might consider only worrying about a compact version, at which point it would become a template... Might start seeing BitArray!1024 in places (or BitString!1024 ?).
Jan 03 2013
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
1/3/2013 2:20 PM, Era Scarecrow пишет:
 On Thursday, 3 January 2013 at 07:57:46 UTC, Dmitry Olshansky wrote:
 1/3/2013 6:05 AM, Era Scarecrow wrote:

 Hm, I'd think that having Slice type be:

 BitArraySlice{
 BitArray* bp;
 size_t start, end;
 // all else forwards to the pointed array
 }
 should work avoiding the most of code duplication. With any luck
 inliner will expand all of this wrapping. To enable bulk mode operations:

 void opOpSliceAssign(BitArraySlice rhs)
 {//has all the data to do it
    bulkOp(bp, start, end, rhs.bp, rhs.start, rhs.end);
 }

 then it's a question of bit of refactoring & tweaking of array bulks ops.

 Then the only problem left is slice invalidation and original array
 going out of scope.

I know we've had this discussion before. So let's assume you do slicing this way. So what happens when... BitArray ba = [0,1,1,0]; ba = ba[1 .. $-1]; // ba is not a BitArraySlice Suddenly it won't work and slicing is only a range and can only be used in foreach.

No surprise here. Basically container != range over it. It all flows from there. Range doesn't have append nor it owns any data. About foreach see below. Yes, the BitArray container won't pass hasSlicing trait, but it shouldn't as it's meant for ranges. (so in a sense my comment about trying to mimic built-in array applies) And slice can have opIndex and opSlice being a random access range of bool. To help people convert one to the other (exactly where they need it) we can add .dup(?) for the slice that allocates a BitArray and assigns values from range.
 Correct? You could do appending but likely it would
 require two functions in the original code (one for a BitArray and one
 for a slice).

Again I don't think there is a need to append to a slice. Create an array out of slice (or in fact any range of bool) then append.
It it supports a opAssign to handle that now you have to
 take care of all opAssign's as well (like C++, once you handle one
 portion of it and opCast, you handle them all).

Yes, I'm afraid that for optimal speed they both have to be implemented. The version for BitArray can just forward to the one for slice over the whole of it. Or there could be a common template function that handles all of bit-ops over [a, b) portions of BitArrays. Everything else then forwards to it.
 Does that mean we might be able to drop opApply? Hmmm... Depends on
 if we have enough implemented so the range accepts it. I'll glance
 over std.range.

Yes, it's got to be a range to work like this and then opApply is useless.

Maybe. Foreach if I remember right worked three different ways, and selected whichever one worked. opApply, front/pop/empty, and array (opIndex access). If enough is implemented that it recognizes it as an array then opApply can be removed.

There are 2 ways: range or opApply. How built-in arrays are done is up to compiler, there is no way to tap into it from user level. There is an extra rule however, it is the following. If the object in question is not a range and doesn't have opApply but has opSlice() then it slices it first, then again it checks the 2 ways on this slice - as range or opApply.
   A quick tests shows that currently isn't the case.

 Really depends on the use case. I'm guessing there's enough call for
 the 64bit small packed bitarray that justifies it, otherwise it would
 be better to throw it out. As you mention later, separating them does
 seem like a better idea.

I've meant the fact that it has to preserve current std lib behavior *and* has small string optimization that makes it overly painful and partially defeats the optimization (by requiring pointless compact->big conversions on slice etc.).

We could always 'dup' on slice instead, but then you can't use references on it, at which point it's a value type.

Dup on slice is very bad. Slice in container world (I'm looking at std.container and this not likely to change) is a view of the same data, thus it's *not* a new container on its own.
Hmmm. Being a pure
 value type then a separate range makes more sense to use.

 hmmm.. Either would need some code duplication, or template to turn
 off/swap small string optimization. But separating them does seem
 like a good idea.

I'd expect a fair chunk of code to be split off into free functions, then both containers would make use of them. Now note that there is also std.container Array!bool that has reference semantics and got to be doing bit packing... it could potentially take place of big BitArray mentioned.

I'm aware of the container version, and overall if it can just replace BitArray as it is, then we might consider only worrying about a compact version, at which point it would become a template... Might start seeing BitArray!1024 in places (or BitString!1024 ?).

I'm not sure if it will gain that much to have templated fixed size bit array over one that just has small optimization. Could be a lot but somehow I don't expect much, that needs some measurement. -- Dmitry Olshansky
Jan 03 2013
parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Thursday, 3 January 2013 at 15:48:50 UTC, Dmitry Olshansky 
wrote:
 1/3/2013 2:20 PM, Era Scarecrow wrote:
 Suddenly it won't work and slicing is only a range and can 
 only be used in foreach.

No surprise here. Basically container != range over it. It all flows from there. Range doesn't have append nor it owns any data. About foreach see below. Yes, the BitArray container won't pass hasSlicing trait, but it shouldn't as it's meant for ranges. (so in a sense my comment about trying to mimic built-in array applies) And slice can have opIndex and opSlice being a random access range of bool.

Hmmm.. shouldn't try to make it as close to an actual array as possible. I feel like I'm missing something very very simple to making it all correct.
 To help people convert one to the other (exactly where they 
 need it) we can add .dup(?) for the slice that allocates a 
 BitArray and assigns values from range.

 Correct? You could do appending but likely it would require 
 two functions in the original code (one for a BitArray and one 
 for a slice).

Again I don't think there is a need to append to a slice. Create an array out of slice (or in fact any range of bool) then append.

I want to say you're wrong, I'm thinking heavily in the field where you would be doing compression or encryption where random access and appending could be needed at any step. I can see creating a simple table of the compressed sequences (Say huffman), and then appending the slices which then greatly simplifies the interface. I can also see where if you append to a slice it automatically making a new block of memory would be useful as well, as you know you're going to start fresh. I can make a short test set of functions and post them if you like, doing ranges & slices otherwise might not allow such diversity (simplicity?) in the code.
 If it supports a opAssign to handle that now you have to take 
 care of all opAssign's as well (like C++, once you handle one 
 portion of it and opCast, you handle them all).

Yes, I'm afraid that for optimal speed they both have to be implemented. The version for BitArray can just forward to the one for slice over the whole of it. Or there could be a common template function that handles all of bit-ops over [a, b) portions of BitArrays. Everything else then forwards to it.
 Does that mean we might be able to drop opApply? Hmmm... 
 Depends on if we have enough implemented so the range 
 accepts it. I'll glance over std.range.

Yes, it's got to be a range to work like this and then opApply is useless.

Maybe. Foreach if I remember right worked three different ways, and selected whichever one worked. opApply, front/pop/empty, and array (opIndex access). If enough is implemented that it recognizes it as an array then opApply can be removed.

There are 2 ways: range or opApply. How built-in arrays are done is up to compiler, there is no way to tap into it from user level. There is an extra rule however, it is the following. If the object in question is not a range and doesn't have opApply but has opSlice() then it slices it first, then again it checks the 2 ways on this slice - as range or opApply.

I didn't know that; I know it attempted a slice but not that the slice has to be a range. Likely that needs to be emphasized in the documentation.
  A quick tests shows that currently isn't the case.

 Really depends on the use case. I'm guessing there's enough 
 call for the 64bit small packed bitarray that justifies it, 
 otherwise it would be better to throw it out. As you mention 
 later, separating them does seem like a better idea.

I've meant the fact that it has to preserve current std lib behavior *and* has small string optimization that makes it overly painful and partially defeats the optimization (by requiring pointless compact->big conversions on slice etc.).

We could always 'dup' on slice instead, but then you can't use references on it, at which point it's a value type.

Dup on slice is very bad. Slice in container world (I'm looking at std.container and this not likely to change) is a view of the same data, thus it's *not* a new container on its own.
 Hmmm. Being a pure value type then a separate range makes more 
 sense to use.

 hmmm.. Either would need some code duplication, or template 
 to turn off/swap small string optimization. But separating 
 them does seem like a good idea.

I'd expect a fair chunk of code to be split off into free functions, then both containers would make use of them. Now note that there is also std.container Array!bool that has reference semantics and got to be doing bit packing... it could potentially take place of big BitArray mentioned.

I'm aware of the container version, and overall if it can just replace BitArray as it is, then we might consider only worrying about a compact version, at which point it would become a template... Might start seeing BitArray!1024 in places (or BitString!1024 ?).

I'm not sure if it will gain that much to have templated fixed size bit array over one that just has small optimization. Could be a lot but somehow I don't expect much, that needs some measurement.

I agree, but if it's going to be a value type that's heavy on small string optimization with as little referencing as possible, then it has to know what size to depend on, since resizing beyond the max may not be an option if we take that part out.
Jan 03 2013
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
04-Jan-2013 00:11, Era Scarecrow пишет:
 On Thursday, 3 January 2013 at 15:48:50 UTC, Dmitry Olshansky wrote:
 1/3/2013 2:20 PM, Era Scarecrow wrote:
 Suddenly it won't work and slicing is only a range and can only be
 used in foreach.

No surprise here. Basically container != range over it. It all flows from there. Range doesn't have append nor it owns any data. About foreach see below. Yes, the BitArray container won't pass hasSlicing trait, but it shouldn't as it's meant for ranges. (so in a sense my comment about trying to mimic built-in array applies) And slice can have opIndex and opSlice being a random access range of bool.

Hmmm.. shouldn't try to make it as close to an actual array as possible. I feel like I'm missing something very very simple to making it all correct.

[snip]
 Again I don't think there is a need to append to a slice. Create an
 array out of slice (or in fact any range of bool) then append.

I want to say you're wrong, I'm thinking heavily in the field where you would be doing compression or encryption where random access and appending could be needed at any step. I can see creating a simple table of the compressed sequences (Say huffman), and then appending the slices which then greatly simplifies the interface.

Appending a slice *to* BitArray is perfectly fine as in fact any range of bool (or bit if you like). Any array-like or string-like container has to support appending a range of element type (and insertion for that matter). The table you mention is then an array of BitArray that is sliced (getting a range) and these slices are appended to some other BitArray. BitArrays in this table could also be appended to just fine (so you construct sequences out of others easily). The key point is that *appending a range to container* is fine (and idiomatic) and *slicing is O(1), fast and non-allocating*.
I can also see where if you
 append to a slice it automatically making a new block of memory would be
 useful as well, as you know you're going to start fresh.

Maybe but I can see it being a nasty surprise in a tight loop.
 I can make a
 short test set of functions and post them if you like, doing ranges &
 slices otherwise might not allow such diversity (simplicity?) in the 

code. Would be great to put them together as we have implementation details sorted out but have trouble with usage patterns.
 There is an extra rule however, it is the following. If the object in
 question is not a range and doesn't have opApply but has opSlice()
 then it slices it first, then again it checks the 2 ways on this slice
 - as range or opApply.

I didn't know that; I know it attempted a slice but not that the slice has to be a range. Likely that needs to be emphasized in the documentation.

That's pretty much how std.container tries to work. There is little documentation on the subject though, same note IRC is in TDPL book (highly recommend that one if you don't have it already). [snip]
 I'm aware of the container version, and overall if it can just
 replace BitArray as it is, then we might consider only worrying about
 a compact version, at which point it would become a template... Might
 start seeing BitArray!1024  in places (or BitString!1024 ?).

I'm not sure if it will gain that much to have templated fixed size bit array over one that just has small optimization. Could be a lot but somehow I don't expect much, that needs some measurement.

I agree, but if it's going to be a value type that's heavy on small string optimization with as little referencing as possible, then it has to know what size to depend on, since resizing beyond the max may not be an option if we take that part out.

I meant it's still resizing up to any point, just that it does dup's underlying array on copy (if not small, if small then postblit does nothing). It's questionable to dup on copy but the idea is that it won't be often. If you mean to make type that doesn't resize beyond some maximum (so it's always "small") then indeed having fixed size as template parameter makes more sense. -- Dmitry Olshansky
Jan 03 2013
parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Thursday, 3 January 2013 at 21:15:19 UTC, Dmitry Olshansky 
wrote:
 04-Jan-2013 00:11, Era Scarecrow wrote:

 Appending a slice *to* BitArray is perfectly fine as in fact 
 any range of bool (or bit if you like). Any array-like or 
 string-like container has to support appending a range of 
 element type (and insertion for that matter). The table you 
 mention is then an array of BitArray that is sliced (getting a 
 range) and these slices are appended to some other BitArray. 
 BitArrays in this table could also be appended to just fine (so 
 you construct sequences out of others easily).

 The key point is that *appending a range to container* is fine 
 (and idiomatic) and *slicing is O(1), fast and non-allocating*.

So the slice is a non-allocating view of data some data. I can see adding more limitations for that then.
 I can also see where if you append to a slice it automatically 
 making a new block of memory would be useful as well, as you 
 know you're going to start fresh.

Maybe but I can see it being a nasty surprise in a tight loop.

Only on the first iteration, unless you do a LOT of slicing & appending on non-compact blocks, but if you're appending to a slice in normal arrays it will dup it anyways (if it doesn't own the following memory) so the behavior doesn't really change.
 I can make a short test set of functions and post them if you 
 like, doing ranges & slices otherwise might not allow such 
 diversity (simplicity?) in the code.

Would be great to put them together as we have implementation details sorted out but have trouble with usage patterns.

K, I'll likely re-work my multi-level huffman algorithmn and a LZW compression, although the LZW wouldn't make use of the more exotic features. Got half the LZW written, but I likely won't get this done for a few days.
 There is an extra rule however, it is the following. If the 
 object in question is not a range and doesn't have opApply 
 but has opSlice() then it slices it first, then again it 
 checks the 2 ways on this slice - as range or opApply.

I didn't know that; I know it attempted a slice but not that the slice has to be a range. Likely that needs to be emphasized in the documentation.

That's pretty much how std.container tries to work. There is little documentation on the subject though, same note IRC is in TDPL book (highly recommend that one if you don't have it already).

Got a copy, in fact this is one of the first prints. I've likely read this book a half dozen times to get a full feel of the language (other than templates which I've learned elsewhere)
 I agree, but if it's going to be a value type that's heavy on 
 small string optimization with as little referencing as 
 possible, then it has to know what size to depend on, since 
 resizing beyond the Max may not be an option if we take that 
 part out.

I meant it's still resizing up to any point, just that it does dup's underlying array on copy (if not small, if small then postblit does nothing). It's questionable to dup on copy but the idea is that it won't be often. If you mean to make type that doesn't resize beyond some maximum (so it's always "small") then indeed having fixed size as template parameter makes more sense.

That was the idea. A true value type with no hidden allocation. There are many cases for it, but I think having the BitArray which can allocate has a larger audience than a fixed size. Won't know for sure though, maybe it's the other way around.
Jan 03 2013
parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Thursday, 3 January 2013 at 21:45:24 UTC, Era Scarecrow wrote:
 K, I'll likely re-work my multi-level huffman algorithmn and a 
 LZW compression, although the LZW wouldn't make use of the more 
 exotic features. Got half the LZW written, but I likely won't 
 get this done for a few days.

Nearly done. Got a working LZW, multi-level huffman working, only cleanup/refactoring to do, and a way to save/load the tree structure for external storage. Do I assume you would want me to post it? (Even incomplete and lacking a good portion of documentation/unittests)
Jan 06 2013
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
07-Jan-2013 00:51, Era Scarecrow пишет:
 On Thursday, 3 January 2013 at 21:45:24 UTC, Era Scarecrow wrote:
 K, I'll likely re-work my multi-level huffman algorithmn and a LZW
 compression, although the LZW wouldn't make use of the more exotic
 features. Got half the LZW written, but I likely won't get this done
 for a few days.

Nearly done. Got a working LZW, multi-level huffman working, only cleanup/refactoring to do, and a way to save/load the tree structure for external storage. Do I assume you would want me to post it? (Even incomplete and lacking a good portion of documentation/unittests)

Sure thing! I'd love to check it out. -- Dmitry Olshansky
Jan 06 2013
parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Sunday, 6 January 2013 at 21:14:08 UTC, Dmitry Olshansky wrote:
 07-Jan-2013 00:51, Era Scarecrow wrote:

 Nearly done. Got a working LZW, multi-level huffman working, 
 only cleanup/refactoring to do, and a way to save/load the 
 tree structure for external storage.

 Do I assume you would want me to post it? (Even incomplete and 
 lacking a good portion of documentation/unittests)

Sure thing! I'd love to check it out.

Here you go. It's about 16k currently (5k for LZW, 10k for MLH). I write on the fly so some of it isn't as thought out as it could be. https://github.com/rtcvb32/Side-Projects/blob/master/mlh.d
Jan 06 2013
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
07-Jan-2013 01:54, Era Scarecrow пишет:
 On Sunday, 6 January 2013 at 21:14:08 UTC, Dmitry Olshansky wrote:
 07-Jan-2013 00:51, Era Scarecrow wrote:

 Nearly done. Got a working LZW, multi-level huffman working, only
 cleanup/refactoring to do, and a way to save/load the tree structure
 for external storage.

 Do I assume you would want me to post it? (Even incomplete and
 lacking a good portion of documentation/unittests)

Sure thing! I'd love to check it out.

Here you go. It's about 16k currently (5k for LZW, 10k for MLH). I write on the fly so some of it isn't as thought out as it could be. https://github.com/rtcvb32/Side-Projects/blob/master/mlh.d

Thanks, unfortunately I got busy with my own stuff ... bookmarked for review. -- Dmitry Olshansky
Jan 07 2013
prev sibling parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Saturday, 28 July 2012 at 20:59:15 UTC, Dmitry Olshansky wrote:
 Great! But I strongly suggest to repost it in d.D newsgroup as 
 it has wider audience and is more appropriate for reviews.

I was thinking of not bothering Andrei or Walter while i fixed other issues on it before bringing more major ones forward (which is why i brought it to D.learn), but re-posting it there is easy enough
 My solution is make slices different type e.g. BitSlice. It's 
 always reference to the original BitArray. Array itself still 
 can be either compact or not. All operations with slice go to 
 array (and still check if they can use word-aligned speed up).

 How does it sound?

Someone else also suggested making the slice different (Meaning a new Bitarray always is at max compact size). Since we need at least a single flag to determine if the array is compact or array reference, there would be a larger loss which the current offset/maxOffset handle quite nicely filling in that space. Making BitSlice separate suggests BitArray would be the bare minimum (canUseBulk, length, and index operations) while bitSlice would handle all others. But if you do that, either BitSlice would have to be inherently convertible to BitArray if you want to pass it to a function that expects a BitArray. I'm hesitant to do it, but if there's a strong enough argument for it I wouldn't refuse it.
 opCat isn't it just operator "~" and "~=" ? You can use 
 opOpAssign for "~=" IRC.

Yes, I think opAssign is used for ~= already, but doing the following requires opCat and opCat_r. BitArray a; //filled with something auto x = true ~ a; auto y = a ~ true; I just can't get it to accept it (for whatever reason). Compiler bug?
 I suspect to!string wasn't CTFEable. Regardless you can catch a 
 word from Andrei Alexandrescu on the main D newsgroup who (I 
 think) came up with it in the first place.

If we can use the std.string functions I'll replace them all :) (As I was just following the theme since they were present). I'll move this to the D group and see what we get for this.
Jul 28 2012
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 29-Jul-12 01:20, Era Scarecrow wrote:
 On Saturday, 28 July 2012 at 20:59:15 UTC, Dmitry Olshansky wrote:
 Great! But I strongly suggest to repost it in d.D newsgroup as it has
 wider audience and is more appropriate for reviews.

I was thinking of not bothering Andrei or Walter while i fixed other issues on it before bringing more major ones forward (which is why i brought it to D.learn), but re-posting it there is easy enough
 My solution is make slices different type e.g. BitSlice. It's always
 reference to the original BitArray. Array itself still can be either
 compact or not. All operations with slice go to array (and still check
 if they can use word-aligned speed up).

 How does it sound?

Someone else also suggested making the slice different (Meaning a new Bitarray always is at max compact size).

Obviously that was me (blackwhale) ;) Since we need at least a single
 flag to determine if the array is compact or array reference, there
 would be a larger loss which the current offset/maxOffset handle quite
 nicely filling in that space.

Not sure I followed you - what with offset/maxOffset ? You mean less space available for compact version? Making BitSlice separate suggests BitArray
 would be the bare minimum (canUseBulk, length, and index operations)
 while bitSlice would handle all others. But if you do that, either
 BitSlice would have to be inherently convertible to BitArray if you want
 to pass it to a function that expects a BitArray.

Take a look at std.container from what I see there Array does have a separate range type. I'm hesitant to do it,
 but if there's a strong enough argument for it I wouldn't refuse it.

 opCat isn't it just operator "~" and "~=" ? You can use opOpAssign for
 "~=" IRC.

Yes, I think opAssign is used for ~= already, but doing the following requires opCat and opCat_r. BitArray a; //filled with something auto x = true ~ a; auto y = a ~ true; I just can't get it to accept it (for whatever reason). Compiler bug?

Hm opBinaryRight with if (op == "~") should work.
 I suspect to!string wasn't CTFEable. Regardless you can catch a word
 from Andrei Alexandrescu on the main D newsgroup who (I think) came up
 with it in the first place.

If we can use the std.string functions I'll replace them all :) (As I was just following the theme since they were present). I'll move this to the D group and see what we get for this.

-- Dmitry Olshansky
Jul 28 2012
prev sibling next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Era Scarecrow:

 BitFields:
  I've concentrated on adding default values.

Good, that's useful. Regarding BitArray I have written some ER: http://d.puremagic.com/issues/show_bug.cgi?id=4123 http://d.puremagic.com/issues/show_bug.cgi?id=4124 http://d.puremagic.com/issues/show_bug.cgi?id=4717 http://d.puremagic.com/issues/show_bug.cgi?id=7487 http://d.puremagic.com/issues/show_bug.cgi?id=7488 http://d.puremagic.com/issues/show_bug.cgi?id=7490 Related: http://d.puremagic.com/issues/show_bug.cgi?id=6697 Bye, bearophile
Jul 28 2012
parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Saturday, 28 July 2012 at 22:43:22 UTC, bearophile wrote:
 Era Scarecrow:

 BitFields:
 I've concentrated on adding default values.

Good, that's useful. Regarding BitArray I have written some ER: http://d.puremagic.com/issues/show_bug.cgi?id=4123 http://d.puremagic.com/issues/show_bug.cgi?id=4124 http://d.puremagic.com/issues/show_bug.cgi?id=4717 http://d.puremagic.com/issues/show_bug.cgi?id=7487 http://d.puremagic.com/issues/show_bug.cgi?id=7488 http://d.puremagic.com/issues/show_bug.cgi?id=7490

4123 - fixed 4124 - Set/reset all bits ie: BitArray ba = BitArray(100); ba[] = true; Others haven't been done yet. 4717 - Same as 4124, none others done yet. 7487 - Done. When prepending it extends the array to size_t extra and slowly backtracks until it needs to do it again. To go further, have to disable re-slicing of the allocated array. Or perhaps a 'start at end' during constructor stage. Hmmm.. 7488 - Done, union used and is 'compact' version (by default or resized and can fit) 7490 - haven't touched yet. :(
 Related:
 http://d.puremagic.com/issues/show_bug.cgi?id=6697

Not so sure. Could make a multi-dimentional one that returns slices to various sections, but that's iffy. I'd need an example of how you'd use it with BitArray before I could build a solution.
Jul 28 2012
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Era Scarecrow:

As you guess I have had to use many arrays of bits in my programs 
:-)

 4124 - Set/reset all bits ie:
   BitArray ba = BitArray(100);
   ba[] = true;
 Others haven't been done yet.

Among those 4124 suggestions the most important, and very simple to implement, are: - set n-th bit (this can be a little more efficient than bs[n]=1;) - reset n-th bit (this can be a little more efficient than bs[n]=1;) Another commonly needed operation is a very fast bit count. There are very refined algorithms to do this.
 7487 - Done. When prepending it extends the array to size_t 
 extra and slowly backtracks until it needs to do it again.

Leaving a bit of "capacity" at the beginning is a good idea.
 7488 - Done, union used and is 'compact' version (by default or 
 resized and can fit)

Good, this makes BitArray usable in many other situations :-)
 Related:
 http://d.puremagic.com/issues/show_bug.cgi?id=6697

Not so sure. Could make a multi-dimentional one that returns slices to various sections, but that's iffy. I'd need an example of how you'd use it with BitArray before I could build a solution.

I have needed many times a fast BitMatrix, but I see it as a data structure distinct from BitArray, so don't worry about making them inter-operate. For me a BitMatrix must be as fast as possible, even if this causes some waste of memory (see my implementation in the attach of issue 6697) and I use it for medium and large bit matrices. Generally I don't to count the set bits in a bit matrix. So the two data structures need different capabilities and they are better implemented in different ways (this means a FastBitMatrix is not an array of a BitArray!). Taking a row of a FastBitMatrix and turning it into a a BitArray sounds cute, but generally I don't need to do this operation, so I don't care about this feature. Bye, bearophile
Jul 29 2012
parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Sunday, 29 July 2012 at 14:41:48 UTC, bearophile wrote:
 As you guess I have had to use many arrays of bits in my 
 programs :-)

I'll do what I can :)
 4124 - Set/reset all bits ie:
  BitArray ba = BitArray(100);
  ba[] = true;
 Others haven't been done yet.

Among those 4124 suggestions the most important, and very simple to implement, are: - set n-th bit (this can be a little more efficient than bs[n]=1;) - reset n-th bit (this can be a little more efficient than bs[n]=1;)

If it has to determine between compact and non-compact, then I really don't see how any speed improvement can be made, and if it's optimized it may be inlined which would be about as fast as you could get.
 Another commonly needed operation is a very fast bit count. 
 There are very refined algorithms to do this.

Likely similar to the hamming weight table mentioned in TDPL. Combined with the canUseBulk I think I could make it fairly fast.
 7487 - Done. When prepending it extends the array to size_t 
 extra and slowly backtracks until it needs to do it again.

Leaving a bit of "capacity" at the beginning is a good idea.

With how it's made for speed and slices, it had to :P
 7488 - Done, union used and is 'compact' version (by default 
 or resized and can fit)

Good, this makes BitArray usable in many other situations :-)

Yes, currently a debate of it's partial ref/value issue is coming up. As it is it's usable, and if you want to be sure it's unique you can always dup it, otherwise as a previous suggestion I'll either make two array types for either full reference vs value, or take his separate slices (reference) idea.
 Related:
 http://d.puremagic.com/issues/show_bug.cgi?id=6697

Not so sure. Could make a multi-dimentional one that returns slices to various sections, but that's iffy. I'd need an example of how you'd use it with BitArray before I could build a solution.

I have needed many times a fast BitMatrix, but I see it as a data structure distinct from BitArray, so don't worry about making them inter-operate.

A Matrix is just a multidimensional array right? I'll have to read up on it a bit; doing simple division (Likely whole rows of 32bit at a time) would do that job, but expanding width several times (compared to height) would be quite a bit slower; Although not too horrible.
 For me a BitMatrix must be as fast as possible, even if this 
 causes some waste of memory (see my implementation in the 
 attach of issue 6697) and I use it for medium and large bit 
 matrices. Generally I don't to count the set bits in a bit 
 matrix. So the two data structures need different capabilities 
 and they are better implemented in different ways (this means a 
 FastBitMatrix is not an array of a BitArray!).

more likely bool array/matrix.. Although if you heavily use specific lines it could convert to bool[], and then back again, but if you go all over the place it would be worse than just using bool[].
 Taking a row of a FastBitMatrix and turning it into a a 
 BitArray sounds cute, but generally I don't need to do this 
 operation, so I don't care about this feature.

Jul 29 2012
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Era Scarecrow:

  If it has to determine between compact and non-compact, then I 
 really don't see how any speed improvement can be made,

Maybe it's better to consider two use cases: dynamic length allocated on the heap (or allocated with a given allocator, often a stack-styled allocator), and fixed-size allocated in place, so there's no need for that run-time test.
 and if it's optimized it may be inlined which would be about as 
 fast as you could get.

I think that "a[x] = 1;" is slower than "a.set(x);" even if the array a is allocated in place and everything is inlined. This is probably the most important operation an array of bits has to support. If this is missing, I have to use my own implementation still.
  Likely similar to the hamming weight table mentioned in TDPL. 
 Combined with the canUseBulk I think I could make it fairly 
 fast.

There is lot of literature about implementing this operation efficiently. For the first implementation a moderately fast (and short) code is probably enough. Later faster versions of this operation will go in Phobos, coming from papers.
  Yes, currently a debate of it's partial ref/value issue is 
 coming up. As it is it's usable, and if you want to be sure 
 it's unique you can always dup it, otherwise as a previous 
 suggestion I'll either make two array types for either full 
 reference vs value, or take his separate slices (reference) 
 idea.

The idea of making two different types (for the dynamic and static versions) sounds nice. There is a third use case, that is bit arrays that start small and can grow, and rarely grow bigger than one or two words. This asks for a dynamic-length bit array type that allocates locally only when it's small (so it needs to perform run-time test of the tag. Too bad the D GC doesn't support tagged pointers!). But probably this use case is not common enough to require a third type of array. So I suggest to not worry about this case now, and to focus on the other two more common ones.
  A Matrix is just a multidimensional array right?

Well, yes, but there are many ways to implement arrays and nD arrays. Sometimes the implementations differ.
 I'll have to read up on it a bit; doing simple division (Likely 
 whole rows of 32bit at a time) would do that job,

Take a look at my implementation in Bugzilla :-) It's another type, the third. The 1D and 2D cases are the most common.
 but expanding width several times (compared to height) would be 
 quite a bit slower; Although not too horrible.

A 2D array of bits probably doesn't need to change its size after creation, this is not a common use case.
  more likely bool array/matrix.. Although if you heavily use 
 specific lines it could convert to bool[], and then back again, 
 but if you go all over the place it would be worse than just 
 using bool[].

It's not so simple :-) A bool[] causes many more cache misses, if the matrix is not tiny. A bool[] or bool[][] is a good idea only if the matrix is very small, or if you don't have a bit matrix library and you don't want to write lot of code. Converting to bool[] the current line is not a good idea. Bye, bearophile
Jul 29 2012
parent Don Clugston <dac nospam.com> writes:
On 29/07/12 23:36, bearophile wrote:
 Era Scarecrow:

 Another commonly needed operation is a very fast bit count. There 



are very refined algorithms to do this.
  Likely similar to the hamming weight table mentioned in TDPL.
 Combined with the canUseBulk I think I could make it fairly fast.

There is lot of literature about implementing this operation efficiently. For the first implementation a moderately fast (and short) code is probably enough. Later faster versions of this operation will go in Phobos, coming from papers.

See bug 4717. On x86, even on 32 bits you can get close to 1 cycle per byte, on 64 bits you can do much better.
Jul 31 2012
prev sibling next sibling parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
  Currently since I've been concentrating more on bitfields and 
BitArray needs more work, the two separate patches and trees are 
out of sync and apparently the auto-tester brings them up as a 
fail. How do I remove one of the pulls so I can concentrate on 
the one set of features and continue the others later?

  (Course I could just update what I have for BitArray so it works 
as a value-type for now too...)
Aug 02 2012
parent reply "David Nadlinger" <see klickverbot.at> writes:
On Thursday, 2 August 2012 at 11:11:58 UTC, Era Scarecrow wrote:
 How do I remove one of the pulls so I can concentrate on the 
 one set of features and continue the others later?

Hm? You can just close pull requests (button at the bottom of the page), and reopen them later. David
Aug 02 2012
next sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Thursday, 2 August 2012 at 11:26:42 UTC, David Nadlinger wrote:
 On Thursday, 2 August 2012 at 11:11:58 UTC, Era Scarecrow wrote:
 How do I remove one of the pulls so I can concentrate on the 
 one set of features and continue the others later?

Hm? You can just close pull requests (button at the bottom of the Page), and reopen them later.

Most of this doesn't make sense to me still; just one of those things I'll learn as I use it over the next few years. Anyways when the bitfields are as updated as we're going to make it, I'll resume work on BitArray code. In the meantime I guess I'll check on the auto-tester once it refreshes.
Aug 02 2012
prev sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Thursday, 2 August 2012 at 11:26:42 UTC, David Nadlinger wrote:
 Hm? You can just close pull requests (button at the bottom of 
 the Page), and reopen them later.

So I've removed the pull, but the auto tester is still pulling both code sources, so it's failing automatically because of it. Do I close the bitfields pull too? Or do I open another/new pull for it?
Aug 03 2012
prev sibling next sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
Following from issue list:

bearophile_hugs 2010-10-04 18:16:44 PDT
Another possible feature is to make std.bitmanip.bitfields 
generate two

versions of the code, that get compiled conditionally according to the CPU endianess:
 static if (std.system.endian == std.system.Endian.BigEndian) {
    ...
 } else {
     ...
 }

Having it endian specific brings up a couple questions. Which endian is 'correct'? Resolutions could be... a) Both are correct (Leave as is) b) big/little is correct and should always convert to the other (which one?) c) Specify correct type as a flag, then swaps for the other in generated code Second, which would be the best approach? a) swap bytes in a temporary, modify the temporary, replace storage? (In every function) b) use current endianness, then prepping saving/loading have it swap? (very unlikely) c) use ubyte array and use shifting on each byte (small Endian, also allows for odd byte sizes, not very likely, but possible) I know an instruction for the Pentium system (686+?) bswap will swap the appropriate bytes as a conversion and takes a single cycle, using assembler that can be incorporated assuming it's ready for it (course than the template would be updated more likely and not the bitfield template).
Aug 02 2012
prev sibling next sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
repost from issue list:

Austin Hastings 2010-09-24 18:06:53 PDT
 Also, I think I'm going to request that repeated bitfield 
 definitions be allowed if they are identical - I'd like to 
 redeclare "opcode" rather than "".

How would you tell them apart? If I know how you may want to call them, I may be able to make something. I can understand with registers, but still need some way to work with them. Perhaps as a set then?
 So I would like the bitmanip code to permit redeclaration of 
 bitfields that are identical in all respects.

 That is, obviously the names are the same, but the field width, 
 offset, and type representation has to be the same as well.

Maybe....? struct S { mixin(bitfields!( uint, "opcode", 4, uint, "register", 4, uint, "register", 4, uint, "register", 4 )); } and using the registers would have function signature like... struct Register { uint register_1; uint register_2; uint register_3; } //setters, likely can't be propery void register(uint reg1, uint reg2, uint reg3); void register(uint[] register ...); //maybe? void register(Register register); //getter ?? Register register() const; Or perhaps... struct S { mixin(bitfields!( uint, "opcode", 4, uint, "reg1", 4, uint, "reg2", 4, uint, "reg3", 4 )); mixin(sharedNameSet( "nameForGetterAndSetter", "struct name for returning/passing", "reg1", "reg2", "reg3" //named variables as a set )); //nameForGetterAndSetter's would be added here, perhaps as above. }
Aug 02 2012
prev sibling next sibling parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
  Having insomnia has left me with a certain gifted curse. :P

  I've been thinking, and i think i've written BitArray all wrong, 
in order to better structure it i would break it up into four 
something parts. The following would remove the complication of 
'isCompact' and the union, the code can be  safe (rather than 
 trusted).

  BitString, this is the main of BitArray, handles all basic 
operations but doesn't manage memory of any kind.

  BitArrayRange, which is a reference forwarding for a BitString 
including various range functions.

  Then we come to memory management. This can be done simply as 
wrappers.

  BitArray - Memory management and dynamically 
allocates/reallocates as needed
  BitArrayFixed - Template that includes a fixed size of memory 
for small memory optimization.

It goes something like... (API not yet decided fully, but you 
should get the idea)

BitString {
   //bitfields for start/ending
   size_t[]* bulkRaw; //pointer to the array?
   inout(ubyte[]) noEndianBulkRaw() inout {
     return cast(inout(ubyte[])) bulkRaw;
   }
   this(size[]* raw) {
     bulkRaw = raw;
   }

   //binaryOperators and opIndex, etc
   //length modifier, but no memory re-allocation allowed
}

//Template with bool isConst?
//BitStringRange(bool isConst) {
//  static if (isConst) { const(BitString)* array; }
//  else                { BitString* array; }

BitStringRange {
   BitString* array;
   ulong start, end;
   //forwarding operators for start/end
   //range functions
}

BitArray {
   BitString array;
   size_t[] rawMemory;
   alias array this;
   //include length modifiers and dynamic allocation
   void init() {}
   //opCast can convert to BitArrayFixed?
}

BitArrayFixed(int fixedSize) {
   BitString array;
   size_t[fixedSize] rawMemory;
   alias array this;

   void init(){ array = BitString(rawMemory); }
   //opCast can convert to BitArray?
}

  Thoughts and comments?
Jan 09 2013
parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Wednesday, 9 January 2013 at 20:00:59 UTC, Era Scarecrow wrote:
 BitString {
   //bitfields for start/ending
   size_t[]* bulkRaw; //pointer to the array?
   //binaryOperators and opIndex, etc
   //length modifier, but no memory re-allocation allowed
 }

 BitArray {
   BitString array;
   size_t[] rawMemory;
   alias array this;
   //include length modifiers and dynamic allocation
   void init() {}
   //opCast can convert to BitArrayFixed?
 }

Hmm as I'm thinking about it more, a mixin for BitString may be better.. // 'size_t[] bulkRaw' required by enclosing struct/class mixin template BitString() { struct BitString { } } BitArray { mixin BitString; size_t[] bulkRaw; BitString array; alias array this; //length setter, handling allocation/reallocation } //since BitArray and BitArrayFixed may have different pointer/setups? //Or would either work together? BitArrayRange(BS, bool isConst) { BS* bitString; } If anyone wants to talk live with me, I tend to be on YIM, so rtcvb32 or Era_Scarecrow.
Jan 09 2013
parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Wednesday, 9 January 2013 at 21:16:37 UTC, Era Scarecrow wrote:
  Hmm as I'm thinking about it more, a mixin for BitString may 
 be better..

I'm having troubles trying to understand why this won't work. Anyone clarify? mixin template BitString() { struct BitString { size_t opIndex(int i) { //Error: this for bulkRaw needs to be type BitArray not type BitString return bulkRaw[i]; } } } struct BitArray { mixin BitString; size_t[] bulkRaw; BitString ba; }
Jan 09 2013
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
10-Jan-2013 01:47, Era Scarecrow пишет:
 On Wednesday, 9 January 2013 at 21:16:37 UTC, Era Scarecrow wrote:
  Hmm as I'm thinking about it more, a mixin for BitString may be better..


Overall direction is good (I mean separation of ranges/containers). I'll post more comments once I think it through.The minor detail is that BitString sounds like a individual container whereas it's a "binary view of packed bools".
   I'm having troubles trying to understand why this won't work. Anyone
 clarify?



 mixin template BitString()
 {
      struct BitString {
          size_t opIndex(int i) {
 //Error: this for bulkRaw needs to be type BitArray not type BitString
              return bulkRaw[i];
          }
      }
 }

You want to mixin a definition of a inner struct BitString into each bit array? Or include all of functions of BitString to be mixed into BitArray? If the latter then: mixin template BitString() { size_t opIndex(int i) { return bulkRaw[i]; } //.... } i.e. no struct it's just a template.
 struct BitArray {
      mixin BitString;
      size_t[] bulkRaw;
      BitString ba;
 }

-- Dmitry Olshansky
Jan 10 2013
parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Thursday, 10 January 2013 at 20:11:38 UTC, Dmitry Olshansky 
wrote:
 You want to mixin a definition of a inner struct BitString into 
 each bit array?

That was the idea, yes.
 Or include all of functions of BitString to be mixed into 
 BitArray?

 If the latter then:

Well it seems like it should work, that it should include a silent pointer to the original BitArray (container) similar like delegates do for functions. Odd that it doesn't. Bug? Seems with the problem on that, I'll do a different approach. BitArray(StorageType) { StorageType memory; //now we do memory.length =, and memory.rawBulk[] for direct access, may //add opIndex removing the need to know rawBulk at all } StorageType in this case should contain 'rawBulk', and the storage type handles the memory allocation, giving access to members no different than an array. I have two storage types i've written that hopefully handle all cases: //storage types for BitArrays struct Dynamic { size_t[] rawBulk; alias rawBulk this; } struct Fixed(int maxBits) { size_t[maxBits / (size_t.sizeof * 8)] rawBulk; int _length; //fixed sizes are usually going to be rather small int length() property const safe pure nothrow { return _length; } void length(int setSize) property const safe pure nothrow { assert(setSize < rawBulk.length); _length = setSize; } void reserve(size_t resSize) {} //fixed } Now when modifying memory referencing just treat memory as an array, access the raw data via rawBulk (likely will remove named access and go with opIndex). This removes the need for several workarounds I have currently. while trying to handle compact/allocated memory.
Jan 10 2013
next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
11-Jan-2013 00:33, Era Scarecrow пишет:
 On Thursday, 10 January 2013 at 20:11:38 UTC, Dmitry Olshansky wrote:
 You want to mixin a definition of a inner struct BitString into each
 bit array?

That was the idea, yes.
 Or include all of functions of BitString to be mixed into BitArray?

 If the latter then:

Well it seems like it should work, that it should include a silent pointer to the original BitArray (container) similar like delegates do for functions. Odd that it doesn't. Bug?

It's only for classes AFAIK. Inner structs of structs (LOL) don't have it.
   Seems with the problem on that, I'll do a different approach.

   BitArray(StorageType) {
     StorageType memory;

     //now we do memory.length =, and memory.rawBulk[] for direct access,
 may
     //add opIndex removing the need to know rawBulk at all
   }

   StorageType in this case should contain 'rawBulk', and the storage
 type handles the memory allocation, giving access to members no
 different than an array. I have two storage types i've written that
 hopefully handle all cases:

 //storage types for BitArrays
 struct Dynamic {
    size_t[] rawBulk;
    alias rawBulk this;
 }

 struct Fixed(int maxBits) {
    size_t[maxBits / (size_t.sizeof * 8)] rawBulk;
    int _length;    //fixed sizes are usually going to be rather small

    int length()  property const  safe pure nothrow {
        return _length;
    }
    void length(int setSize)  property const  safe pure nothrow {
        assert(setSize < rawBulk.length);
        _length = setSize;
    }
    void reserve(size_t resSize) {} //fixed
 }

   Now when modifying memory referencing just treat memory as an array,
 access the raw data via rawBulk (likely will remove named access and go
 with opIndex). This removes the need for several workarounds I have
 currently. while trying to handle compact/allocated memory.

Sounds good and seductively simple. -- Dmitry Olshansky
Jan 10 2013
parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Thursday, 10 January 2013 at 21:37:22 UTC, Dmitry Olshansky 
wrote:
 It's only for classes AFAIK. Inner structs of structs (LOL) 
 don't have it.

Why not? I don't see a reason why not personally. Umm maybe I do... the outer struct becomes referenced to S2, so... struct S { int x; struct S2 { int getX(){return x;} } S2 s2; } S.S2 func(){ S s; return s.s2; } auto x = func(); writeln(x.getX()); //prints garbage? Or access violation. If func had used new on s, then it would have worked... Yeah I see the problem now. The only way that would work is if S2 is never allowed to be passed outside a function, and on copying the new hidden pointer is updated to the new location. That makes sense. Maybe something walter will consider.
 Sounds good and seductively simple.

Yeah I hope so. We'll see, that's the design I'll go for. I'll let you know how it goes; Should at least simplify the code and remove the need for bitfields.
Jan 10 2013
prev sibling parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
  Well some minor update. From the looks of how it is going to be 
set up, it is almost entirely a template. Consider you have 
BitArray!(Fixed!(1024)), vs BitArray!(Dynamic), in order for one 
to copy to the other template functions are needed. Override one 
of them, and now i have to define it to handle all slice and 
BitArrays as template functions too. not horrible, but has it's 
downsides.

  Also as for ranges/slices; Some need their contents const while 
others don't care. So..

//slice is also a range
struct BitArraySlice(BA, bool isConst){}


  Got a lot of work to do, so I wouldn't expect anything for a 
couple weeks.
Jan 13 2013
parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
  Well got a few curious problems. Slicing doesn't seem it wants 
to work as a separate type and can cause problems.

  Let's take an example. Say our slice is..

   struct BitArraySlice {
     BitArray* ba;
     ulong start, end;
   }

  Now how much does it depend on the bitarray that it's pointing 
to? If it is a wrapper then we have a problem with range checks 
which should be legal.

   BitArray x = BitArray(100); //100 elements

   auto sl = x[50 .. 100];
   x.length = 50;
   sl[0] = true; //out of range! BitArray valid from 0-49, not 
50-100

  That much is sorta easy to fix with a separate opIndex (and 
fixed sizes), but it's possible to re-slice the dynamic array to 
make it smaller. So even if we have opIndex allow out of ranges...

   struct BitArray {
     size_t[] store; //storage
     ubyte start, end;
   }

   BitArray x = BitArray(100); //100 elements
   auto sl = x[50 .. 100];

   //likely does x.store[] = x.store[0 .. 2];
   //due to compact 1 byte offsets to determine end of bitarray.
   x.length = 50;

   sl[0] = true; //ok, 0-64 valid on 32bit machines
   sl[32] = true; //out of range! 82 past not within 0-63


  So let's take the slice and give it the address of the storage 
instead, other than it could point to a local variable it will 
work; But now I have basically two definitions of bitarray, one 
that can be a range/slice while the other that manages the memory 
and binary operators.

   struct BitArraySlice {
     //same as BitArray now, what advantage does this give?
     size_t[] store;
     ulong start, end;
   }

  Seems like making the slices a separate entity is going to cause 
more problems (Not that they can't be solved but the advantages 
seem smaller); Plus there's issues trying to get immutable/idup 
working.

  Thoughts? Feedback? I'm about ready to drop this and resume my 
previous version and enhance it with recent experiences.
Jan 16 2013
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
17-Jan-2013 07:53, Era Scarecrow пишет:
   Well got a few curious problems. Slicing doesn't seem it wants to work
 as a separate type and can cause problems.

   Let's take an example. Say our slice is..

    struct BitArraySlice {
      BitArray* ba;
      ulong start, end;
    }

   Now how much does it depend on the bitarray that it's pointing to? If
 it is a wrapper then we have a problem with range checks which should be
 legal.

    BitArray x = BitArray(100); //100 elements

    auto sl = x[50 .. 100];
    x.length = 50;

Well, the slice was invalidated by shrinking an array. I don't expect the below to work.
    sl[0] = true; //out of range! BitArray valid from 0-49, not 50-100

The good part is that error can be detected easily. In c++ for instance it typically isn't. The harder problem is what to do when the original BitArray goes out of scope. I guess in case of storage allocated on heap it should work but if on the stack it shouldn't (and can't).
   That much is sorta easy to fix with a separate opIndex (and fixed
 sizes), but it's possible to re-slice the dynamic array to make it
 smaller. So even if we have opIndex allow out of ranges...

    struct BitArray {
      size_t[] store; //storage
      ubyte start, end;
    }

    BitArray x = BitArray(100); //100 elements
    auto sl = x[50 .. 100];

    //likely does x.store[] = x.store[0 .. 2];
    //due to compact 1 byte offsets to determine end of bitarray.
    x.length = 50;

    sl[0] = true; //ok, 0-64 valid on 32bit machines
    sl[32] = true; //out of range! 82 past not within 0-63

That's why it's far better to allow slices to be invalidated depending on the length parameter of BitArray pointed by. Compact array do weird things in the previous version too.
   So let's take the slice and give it the address of the storage
 instead, other than it could point to a local variable it will work; But
 now I have basically two definitions of bitarray, one that can be a
 range/slice while the other that manages the memory and binary operators.

    struct BitArraySlice {
      //same as BitArray now, what advantage does this give?
      size_t[] store;
      ulong start, end;
    }

   Seems like making the slices a separate entity is going to cause more
 problems (Not that they can't be solved but the advantages seem
 smaller); Plus there's issues trying to get immutable/idup working.

Slices are ranges and thus are converted to array via std.array.array, nothing to invent or re-implement.
   Thoughts? Feedback? I'm about ready to drop this and resume my
 previous version and enhance it with recent experiences.

Feel free to do it how you see it. It's just that I think the semantics of the previous version can't be improved to a consistent state. -- Dmitry Olshansky
Jan 17 2013
parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Thursday, 17 January 2013 at 18:12:28 UTC, Dmitry Olshansky 
wrote:
 Well, the slice was invalidated by shrinking an array. I don't 
 expect the below to work.

It just seems like shrinking/expanding the array should still work, perhaps I'm thinking of them too much like a normal array slice. If this is acceptable behavior then okay, but the increased complexity may not be worth it.
 The good part is that error can be detected easily. In c++ for 
 instance it typically isn't.

 The harder problem is what to do when the original BitArray 
 goes out of scope. I guess in case of storage allocated on heap 
 it should work but if on the stack it shouldn't (and can't).

Passing back the BitArray in either form is valid, as dynamic stays valid and a full binary copy stays valid. The slices on the other hand won't be valid unless it's on the heap.
   sl[0] = true; //ok, 0-64 valid on 32bit machines
   sl[32] = true; //out of range! 82 not within 0-63

That's why it's far better to allow slices to be invalidated depending on the length parameter of BitArray pointed by. Compact array do weird things in the previous version too.

Hmmm...
 So let's take the slice and give it the address of the storage 
 instead, other than it could point to a local variable it will 
 work; But now I have basically two definitions of bitarray, 
 one that can be a range/slice while the other that manages the 
 memory and binary operators.

 Seems like making the slices a separate entity is going to 
 cause more problems (Not that they can't be solved but the 
 advantages seem smaller); Plus there's issues trying to get 
 immutable/idup working.

Slices are ranges and thus are converted to array via std.array.array, nothing to invent or re-implement.
  Thoughts? Feedback? I'm about ready to drop this and resume 
 my previous version and enhance it with recent experiences.

Feel free to do it how you see it. It's just that I think the semantics of the previous version can't be improved to a consistent state.

Sure they can, drop the compact part of it and leave it fully dynamic (or vise-verse). But that would make some people unhappy; On the other hand we could go hybrid where I take the new storage implementations and allow template versions based on needs, remove the compact related stuff, but then it's almost back to the separate slice design. Hmmm... Should the slices be 'use at your own risk' when you tamper with the original BitArray? How else should it be?
Jan 17 2013
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
17-Jan-2013 22:34, Era Scarecrow пишет:
  Thoughts? Feedback? I'm about ready to drop this and resume my
 previous version and enhance it with recent experiences.

Feel free to do it how you see it. It's just that I think the semantics of the previous version can't be improved to a consistent state.

Sure they can, drop the compact part of it and leave it fully dynamic (or vise-verse). But that would make some people unhappy; On the other hand we could go hybrid where I take the new storage implementations and allow template versions based on needs, remove the compact related stuff, but then it's almost back to the separate slice design. Hmmm... Should the slices be 'use at your own risk' when you tamper with the original BitArray? How else should it be?

I'm thinking that a opSlice of stack-allocated must be system and a heap allocated can be safe. Possibly there are better ways to handle this. -- Dmitry Olshansky
Jan 17 2013
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Jan 17, 2013 at 11:36:33PM +0400, Dmitry Olshansky wrote:
[...]
 I'm thinking that a opSlice of stack-allocated must be  system and a
 heap allocated can be  safe.

[...] Related: http://d.puremagic.com/issues/show_bug.cgi?id=8838 T -- Who told you to swim in Crocodile Lake without life insurance??
Jan 17 2013
prev sibling parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Thursday, 17 January 2013 at 19:36:34 UTC, Dmitry Olshansky 
wrote:
 I'm thinking that a opSlice of stack-allocated must be  system 
 and a heap allocated can be  safe.

That's just a small part of the problem. With the new design, 90% of it can be safe; Just the actual slice buffer when you request it (with Fixed storage) would be system, and slices (having a pointer). But safe code can't call system code which means that the current design (convert to slices before operations) it all has to all be system. Guess once it's debugged the entire thing can be trusted and only certain things can be marked system instead.
Jan 17 2013
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
18-Jan-2013 01:49, Era Scarecrow пишет:
 On Thursday, 17 January 2013 at 19:36:34 UTC, Dmitry Olshansky wrote:
 I'm thinking that a opSlice of stack-allocated must be  system and a
 heap allocated can be  safe.

That's just a small part of the problem. With the new design, 90% of it can be safe; Just the actual slice buffer when you request it (with Fixed storage) would be system,and slices (having a pointer)

Slices should have a pointer and length so they can check range (in debug builds), alternatively just a pointer to storage that nows the range. But
  safe code can't call  system code which means that the current design
 (convert to slices before operations) it all has to all be  system.

I think slice ops could be made safe if the primitives for bulk transfer and opIndex in the Storage are trusted. If we just mark all of opSlice on Fixed-sized one as system then it should be good enough.In other words (I'm lost in terminology): FixedBitArray!(1024) fixed; auto slice = fixed[0..100]; //this is system slice[10..20] = slice[20..30]; //this is safe foreach(bool x; slice[0..10]) //this is safe {... } The idea is to make code responsible for the "root" of unsafety to be system everything else then can be safe or trusted(where compiler can't see it).
   Guess once it's debugged the entire thing can be  trusted and only
 certain things can be marked  system instead.

I think that the most of slice ops can be safe/ trusted. It's the process of taking a slice out of stack-based container that is not safe. The one taking a slice out of GC-based one is safe. -- Dmitry Olshansky
Jan 18 2013
parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Friday, 18 January 2013 at 16:42:04 UTC, Dmitry Olshansky 
wrote:
 Slices should have a pointer and length so they can check range 
 (in debug builds), alternatively just a pointer to storage that 
 nows the range.

So write an alternate opIndex...
 I think slice ops could be made  safe if the primitives for 
 bulk transfer and opIndex in the Storage are  trusted.

 If we just mark all of opSlice on Fixed-sized one as  system 
 then it should be good enough. In other words (I'm lost in 
 terminology):

<snip>
 The idea is to make code responsible for the "root" of unsafety 
 to be  system everything else then can be  safe or 
  trusted(where compiler can't see it).

Hmmm okay... Somehow this seemed so much more complex than it's put here.
 I think that the most of slice ops can be  safe/ trusted. It's 
 the process of taking a slice out of stack-based container that 
 is not  safe. The one taking a slice out of GC-based one is 
  safe.

And depending on if it is fixed or dynamic will determine if the slice is system or safe... Makes sense. That sets me straight. So for the slice/range, what all should it support? I've tried to have it support (or foward) opBinary, opBinaryAssign, comparing, slicing, copying, as well as inputRange, bidirectional, opIndex & opIndex assign, opDollar and length. If it should support a much smaller sub-set I'll need to know; Could greatly affect how it is written.
Jan 18 2013
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
19-Jan-2013 11:14, Era Scarecrow пишет:
 On Friday, 18 January 2013 at 16:42:04 UTC, Dmitry Olshansky wrote:
 Slices should have a pointer and length so they can check range (in
 debug builds), alternatively just a pointer to storage that nows the
 range.

So write an alternate opIndex...
 I think slice ops could be made  safe if the primitives for bulk
 transfer and opIndex in the Storage are  trusted.

 If we just mark all of opSlice on Fixed-sized one as  system then it
 should be good enough. In other words (I'm lost in terminology):

<snip>
 The idea is to make code responsible for the "root" of unsafety to be
  system everything else then can be  safe or  trusted(where compiler
 can't see it).

Hmmm okay... Somehow this seemed so much more complex than it's put here.
 I think that the most of slice ops can be  safe/ trusted. It's the
 process of taking a slice out of stack-based container that is not
  safe. The one taking a slice out of GC-based one is  safe.

And depending on if it is fixed or dynamic will determine if the slice is system or safe... Makes sense. That sets me straight. So for the slice/range, what all should it support? I've tried to have it support (or foward) opBinary, opBinaryAssign, comparing, slicing, copying, as well as inputRange, bidirectional, opIndex & opIndex assign, opDollar and length. If it should support a much smaller sub-set I'll need to know; Could greatly affect how it is written.

opBinaryAssign --> opOpAssign. opSlice/opSlice assign. In any case it seems to me that (I think you already did this trick before) you can reuse a lot of code by making an operator string a template parameter (for functions that implement =, |=, &=, ^=, and therefore ^, |, &) and use string mixins. Other then this - don't forget the '~' _unary_ operator as it makes a lot of sense for bit-vectors. Now since slices don't support '~' concatenation it won't look so bad. Worst case: auto ba = BitArray(...); auopt slice = ba[...]; ba ~= ~slice; And it's not much of problem. -- Dmitry Olshansky
Jan 19 2013
parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Saturday, 19 January 2013 at 09:07:03 UTC, Dmitry Olshansky 
wrote:
 opBinaryAssign --> opOpAssign. opSlice/opSlice assign. In any 
 case it seems to me that (I think you already did this trick 
 before) you can reuse a lot of code by making an operator 
 string a template parameter (for functions that implement  =, 
 |=, &=, ^=, and therefore ^, |, &) and use string mixins.

Yes I did when I could; The slice versions can only forwards them to the bitarray one.
 Other then this - don't forget the '~' _unary_ operator as it 
 makes a lot of sense for bit-vectors. Now since slices don't 
 support '~' concatenation it won't look so bad. Worst case:

 auto ba = BitArray(...);
 auopt slice = ba[...];
 ba ~= ~slice;

Only works if you create a new version and append it; And all opBinary operations make duplicates due to their nature. So the above couldn't work if slice is a separate type, and instead: auto slice = ba[...]; //slice type of BitArraySlice auto bc ~= ~slice; //ba/bc type of BitArray ba = (ba ~ (~slice))[]; //type of BitArraySlice
 And it's not much of problem.

Hmmm.. Well as mentioned I don't think a separate slice/range type is the best design or as practical; I have it half working but one problem I'm really having is trying to make sense of the errors after I converted to use a pointer. It's confusing since it was compiling before and is an annoyance more-so now; And suddenly can't deduce the type when it should be able to, and if I explicitly call it with the type it doesn't help any. [quote] Error: template bitmanip.BitArray!(Fixed!(1024)).BitArray.opEquals(alias SL)(const SL rhs) if (hasMember!(SL, "isBitArray") || hasMember!(SL, "isBitArraySlice")) forward reference to template opEquals(alias SL)(const SL rhs) if (hasMember!(SL, "isBitArray") || hasMember!(SL, "isBitArraySlice")) Error: template bitmanip.BitArray!(Fixed!(1024)).BitArray.opEquals cannot deduce template function from argument types !()(BitArraySlice!(BitArray!(Fixed!(1024)), true),const(ulong),const(ulong)) Error: cannot implicitly convert expression ((*this.slice).toString(this.start, this.end)) of type string to string [/quote] [code] //isBitArray and isBitArraySlice are enums in the structs //almost all functions templated to handle fixed/dynamic //types as compatible. Plus constness for range(s). //auto ref would be nice/preferred in a lot in these functions.. //Otherwise copying is required unless i do more duplication/code forwarding //if opSlice is system, then almost all these are system, or trusted //slice forwarding bool opEquals(SL)(const SL rhs) const if (hasMember!(SL, "isBitArray") || hasMember!(SL, "isBitArraySlice")) { //opSlice always returns BitArraySlice return opEquals(rhs[], 0, length); } bool opEquals(SL)(const SL rhs, ulong sliceStart, ulong sliceEnd) const if (hasMember!(SL, "isBitArraySlice")) { if ((sliceEnd-sliceStart) != rhs.length) return false; return opCmp(rhs, sliceStart, sliceEnd) == 0; } //slice forwarding int opCmp(SL)(const SL rhs) const if (hasMember!(SL, "isBitArray")) { //opSlice always returns BitArraySlice return opCmp(rhs[], 0, length); } int opCmp(SL)(const SL rhs, ulong sliceStart, ulong sliceEnd) const if (hasMember!(SL, "isBitArraySlice")); [/code] A small consideration was coming to mind in cases where storage management if you only needed to prepend a few bits or append just after for a few bits but don't need to do a full reallocation. In that case perhaps a before/after allocated memory would serve nice. size_t beforeArray, afterArray; int beforeUsed, afterUsed; //course would likely be bitfield or ubyte Now if we use that it comes to mind that although it wouldn't be perfect storage for small compact arrays (as was the original complaint(s)), you could however rely on them just as much as the main array and the first size_t bits (*2 if you prepend/append to the array a lot. It would also be useful to append past the end of a sliced area without tampering with data outside it's access area until it's forced to resize. Fixed BitArraySizes seem far more wanted where you don't want to do extra memory allocation, so using reference counting wouldn't work, but with dynamic arrays you could. So if you did all the slices to a particular bit array could always point to the proper memory. If you did do that, prepending to an array would be a nightmare unless you could offset it with a virtual offset to represent the actual beginning. So... //start in the middle struct STORE { //virtualOffset instead? ulong startOffset = ulong.max / 2; size_t[] store; } //in slice/Bitarray, hope that's right RefCounted!(STORE) store; ulong start, end; With these three you can safely do dynamic slices while all slices point to the same data; in cases where there's only one reference store, start, end & startOffset can freely be changed. Assume you slice. So.. BitArray ba = BitArray(100); auto sl = ba[10 .. 20]; //assume startOffset says say 1024, start, 1024 & end = 1124. //slice holds start 1034, end 1044. Now if we have it prepend a whole lot. foreach(i; 0 .. 100) ba = true ~ ba; It had to be resized, so if it gave it say 512 more bits then... //now startOffset = 512. slice[0] = true; //start-startOffset = actual bit offset in store[] assert(ba[110]); //was set by the slice Full reference could be preserved, although if you prepend to a slice vs the main array then either you overlap data or need some temporary storage until duping/reallocating is required, or just dup/allocate new memory. Or have a separate end/before arrays which won't force reallocation but could make for interesting overlapping as they wouldn't all always point to the same thing. The interesting problems that crop up again. Another thought, is what if we give an ID to fixed arrays so they know if they are pointing to proper data or junk; This would likely mostly be for the known fixed array slices. So.. size_t ID; //in both slice and fixed array. size_t[1024] fixedSize; Now add a variant to ensure the ID's match on each and every access. If the ID gets corrupted the data is corrupted too, although not a guarantee it's pretty darn close if the ID is random. But as you mentioned making it system then we could have it silently convert the BitArray from fixed to dynamic; And make the slice system where normally it's safe (although immediately duping the memory location after it's using the dynamic form would be safe). Reminds me... Should initialization/blit dup or reference/preserve the memory? I'd think it would point to the old memory unless explicitly told to (or going from const to non-const versions).
Jan 19 2013
prev sibling parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
(Moved from: What have I missed?)

  If this is the wrong place to ask these questions I apologize, 
getting back into this takes some work.


  So I guess I need to ask: Should I try to resume work on the 
BitManip library? (So far it seems none of my code has been 
integrated to phobos)

  Assuming I do, should I try to fix lots of small bugs and make 
smaller pulls or should I try to do several at once? When I 
re-wrote the BitArray I personally think it is an overall 
improvement in many ways, and being a complete re-write you can't 
just do bug #xxxx and then bug #xxxx and then bug... etc etc.

  Also to ask, how many people tried out the rewrite I proposed, 
and do they think it was actually an improvement for ease of use, 
speed, fewer bugs/issues, etc?
Aug 06 2014
parent reply "H. S. Teoh via Digitalmars-d-learn" <digitalmars-d-learn puremagic.com> writes:
On Thu, Aug 07, 2014 at 01:10:06AM +0000, Era Scarecrow via Digitalmars-d-learn
wrote:
 (Moved from: What have I missed?)
 
  If this is the wrong place to ask these questions I apologize,
  getting back into this takes some work.

Since this is about contributing to Phobos, probably a better place to ask is on the main D forum.
  So I guess I need to ask: Should I try to resume work on the BitManip
  library? (So far it seems none of my code has been integrated to
  phobos)

Do you have a pull request? Which one is it?
  Assuming I do, should I try to fix lots of small bugs and make
  smaller pulls or should I try to do several at once? When I re-wrote
  the BitArray I personally think it is an overall improvement in many
  ways, and being a complete re-write you can't just do bug #xxxx and
  then bug #xxxx and then bug... etc etc.
 
  Also to ask, how many people tried out the rewrite I proposed, and do
  they think it was actually an improvement for ease of use, speed,
  fewer bugs/issues, etc?

There have been a few fixes to std.bitmanip recently, including a current PR for adding attributes to some of the functions. Are these your PRs, or are you proposing something totally new? If it's something totally new, where can we get the code? T -- "Maybe" is a strange word. When mom or dad says it it means "yes", but when my big brothers say it it means "no"! -- PJ jr.
Aug 06 2014
parent reply "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Thursday, 7 August 2014 at 01:51:46 UTC, H. S. Teoh via 
Digitalmars-d-learn wrote:
 Since this is about contributing to Phobos, probably a better 
 place to ask is on the main D forum.

Yeah posted in my 'what have i missed?' as well...
 Do you have a pull request? Which one is it?

https://github.com/rtcvb32/phobos/pull/1 From the looks of things it's 4 commits that are merged into a single request i think...
 There have been a few fixes to std.bitmanip recently, including 
 a current PR for adding attributes to some of the functions. 
 Are these your PRs, or are you proposing something totally new? 
 If it's something totally new, where can we get the code?

Glancing at the current code, none of my stuff got in. I do a large number of fixes, so likely a lot of those changes would get thrown away or integrated... Knock yourself out... the pull/1 above holds most of the commits and diff/changes to note of. https://github.com/rtcvb32/phobos/blob/master/std/bitmanip.d
Aug 06 2014
next sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Thursday, 7 August 2014 at 02:04:13 UTC, Era Scarecrow wrote:
 On Thursday, 7 August 2014 at 01:51:46 UTC, H. S. Teoh via 
 Digitalmars-d-learn wrote:
 Do you have a pull request? Which one is it?

https://github.com/rtcvb32/phobos/pull/1 From the looks of things it's 4 commits that are merged into a single request i think...

Looks like that pull is just the bitfields... but the actual bitmanip one i listed has the whole source.
Aug 06 2014
prev sibling parent reply "H. S. Teoh via Digitalmars-d-learn" <digitalmars-d-learn puremagic.com> writes:
On Thu, Aug 07, 2014 at 02:04:12AM +0000, Era Scarecrow via Digitalmars-d-learn
wrote:
 On Thursday, 7 August 2014 at 01:51:46 UTC, H. S. Teoh via
 Digitalmars-d-learn wrote:
Since this is about contributing to Phobos, probably a better place to ask
is on the main D forum.

Yeah posted in my 'what have i missed?' as well...
Do you have a pull request? Which one is it?

https://github.com/rtcvb32/phobos/pull/1 From the looks of things it's 4 commits that are merged into a single request i think...

Hold on a sec... that's a pull for your own fork of Phobos. You need to submit a pull to the main Phobos repo in order to get it reviewed and merged. :-)
There have been a few fixes to std.bitmanip recently, including a
current PR for adding attributes to some of the functions. Are these
your PRs, or are you proposing something totally new? If it's
something totally new, where can we get the code?

Glancing at the current code, none of my stuff got in. I do a large number of fixes, so likely a lot of those changes would get thrown away or integrated...

[...] Well, no wonder, your pull was submitted against your local fork, not to the main Phobos repo, so nobody knew about it (except Dmitry, apparently). It would help to submit pulls against the main Phobos repo. :-) Also, looks like you made major changes to the code... if it's a complete rewrite, I'd say submit it as a single pull. If it's a collection of many fixes, it's probably better to submit separate pulls so that the easy fixes will get in first while we work out the wrinkles on the more complicated fixes. T -- Любишь кататься - люби и саночки возить.
Aug 06 2014
parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Thursday, 7 August 2014 at 02:12:20 UTC, H. S. Teoh via 
Digitalmars-d-learn wrote:
 Hold on a sec... that's a pull for your own fork of Phobos. You 
 need to submit a pull to the main Phobos repo in order to get 
 it reviewed and merged. :-)

 Well, no wonder, your pull was submitted against your local 
 fork, not to the main Phobos repo, so nobody knew about it 
 (except Dmitry, apparently). It would help to submit pulls 
 against the main  Phobos repo.
 :-)

 Also, looks like you made major changes to the code... if it's 
 a complete rewrite, I'd say submit it as a single pull. If it's 
 a collection of many fixes, it's probably better to submit 
 separate pulls so that the easy fixes will get in first while 
 we work out the  wrinkles on the more complicated fixes.

I did submit it against the original Phobos, and the auto tester picked it up. For something like 3-4 weeks it tried and kept failing over and over again because it couldn't merge it. The reason? It was something like 13 unresolved newlines that didn't match up... or something that I have no idea how I could fix because whitespace is invisible. Later Walter or Andrei (I forget who) complained when they looked at it and wanted me to break it apart into smaller more indivdual bug fixes (since it's hard to tell what i changed as a single blob/replacement) but as a complete re-write and I don't know if I COULD do that... Trust me. I tried. It failed. Eventually I just pulled the request and gave up at that time... Right about the time of the Dconf 2013 I got burned out and got depressed (Probably major changes in meds).
Aug 06 2014