www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - std.bitarray examples

reply "Me Here" <p9e883002 sneakemail.com> writes:
Are there any good examples of using std.bitarray around?

(And, is my memiry deceiving me or did D have a Bit type once upon a time?)
-- 
May 06 2008
next sibling parent =?ISO-8859-1?Q?Anders_F_Bj=F6rklund?= <afb algonet.se> writes:
Me Here wrote:

 (And, is my memiry deceiving me or did D have a Bit type once upon a time?)
The "bool" type was originally called "bit". It is still aliased even. --anders
May 06 2008
prev sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Me Here wrote:
 Are there any good examples of using std.bitarray around?
The garbage collector uses it.
 (And, is my memiry deceiving me or did D have a Bit type once upon a time?)
Yes, but it was dropped.
May 08 2008
parent reply "Me Here" <p9e883002 sneakemail.com> writes:
Walter Bright wrote:

 Me Here wrote:
 Are there any good examples of using std.bitarray around?
The garbage collector uses it.
 (And, is my memiry deceiving me or did D have a Bit type once upon a time?)
Yes, but it was dropped.
Thanks. I got sidetracked away from this. Now I've got back to it I'm still having trouble working out if I can use std.bitarray for my purposes. I need to decompose an unsigned 16-bit value into 8 x 2-bit integers. I C I'd use bitfields. I know and understand why these are not a part of D, but that doesn't help me solve my problem. The data, 900+MB of it is delivered in this packed format and I need to unpack it. Can I use std.bitarray to extract these 2-bit numbers? If so, a cluebat as to how would be useful. Thanks, b. --
May 09 2008
next sibling parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 09/05/2008, Me Here <p9e883002 sneakemail.com> wrote:
 Walter Bright wrote:

  > Me Here wrote:
  > > Are there any good examples of using std.bitarray around?
  >
  > The garbage collector uses it.
  >
  > > (And, is my memiry deceiving me or did D have a Bit type once upon a
time?)
  >
  > Yes, but it was dropped.

 Thanks. I got sidetracked away from this. Now I've got back to it I'm still
  having trouble working out if I can use std.bitarray for my purposes. I need
to
  decompose an unsigned 16-bit value into 8 x 2-bit integers. I C I'd use
  bitfields. I know and understand why these are not a part of D, but that
  doesn't help me solve my problem. The data, 900+MB of it is delivered in this
  packed format and I need to unpack it.

  Can I use std.bitarray to extract these 2-bit numbers? If so, a cluebat as to
  how would be useful.
std.bitarray is the wrong module to use. It's being deprecated anyway. What you want is std.bitmanip. import std.bitmanip; struct A { ushort n; // unsigned 16-bit number mixin(bitfields!( uint, "a", 2, uint, "b", 2, uint, "c", 2, uint, "d", 2, uint, "e", 2, uint, "f", 2, uint, "g", 2, uint, "h", 2)); } Now you've got bitfields, just like in C. Given A x; assigning x.n assigns the whole 16 bit number, while reading and writing x.a to x.h will read or write the eight individual two-bit fields. The documentation for std.bitarray consists in entirety of the sentence: "Scheduled for deprecation. Use std.bitmanip instead", so I'm surprised you didn't go there yourself.
May 09 2008
parent reply "Me Here" <p9e883002 sneakemail.com> writes:
Janice Caron wrote:


 The documentation for std.bitarray consists in entirety of the
 sentence: "Scheduled for deprecation. Use std.bitmanip instead", so
 I'm surprised you didn't go there yourself.
Really? Which documentation are you reading because the documentation on my harddrive that D1/Phobos starts ands with: std.bitarray struct BitArray; An array of bits. .... BitArray opCat(bool b); BitArray opCat_r(bool b); BitArray opCat(BitArray b); Support for binary operator ~ for bit arrays. And there is no std.bitmanip list in the index? Cheers, b. --
May 10 2008
parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 10/05/2008, Me Here <p9e883002 sneakemail.com> wrote:
 Really? Which documentation are you reading because the documentation on my
  harddrive that D1/Phobos
Oops. Using D2. D2 has bitfields. You might want to consider moving up. :-)
May 10 2008
next sibling parent reply "Me Here" <p9e883002 sneakemail.com> writes:
Janice Caron wrote:

 You might want to consider moving up. :-)
Hm. And so we come full circle. I started out with D2, but the performance impact of using invariant strings (for may application) is just to costly. See http://genome.ucsc.edu/FAQ/FAQformat.html#format7 Essentially, once the compacted (2-bit format has been expanded back to the (huge) strings of ACGTs, then the nBlocks (count/Starts/sizes) and maskBlocks (count/Starts/Sizes) describe ranges of those huge strings that need to be changed to 'N's (nBlocks) or be lower-cased (A->a, C->c etc.). Breaking these huge strings up into pieces to do these tranformations, and then sticking all the pieces back together, when they are *all* 1 to 1 substitutions, is just ludicrous. The impact of generating all those iddy biddy invariant char[]s from one huge invriant char[] and then sticking them all back together to form another huge invariant char[] and throwing away all the intermediates causes the GC to go into fits. When the (compressed) input is 900+MB and the expanded output is > 3GB, the performance impact is considerable and inacceptable. Even once I get around to multi-threading this, the use of invariant char[]s will still be no advantage because I want (*need*) the processing of the strings to operate /in-place/. I (the programmer) need to control what gets copied when. And to control concurrent access by sharing ranges of the data (without copying), for modification /in-place/. Attempting to isolate the programmer (me) from the concerns of multi-threading by duplicating data over and over isn't an option given the volumes of data. (Personnally I think is a waste of time and effort anyway. Better to educate the programmer than to try and nanny his use of threads. The effort expended on this would be far better spent on other things--like fixing the GC. But that's not my call) Cheers, b. --
May 10 2008
parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 10/05/2008, Me Here <p9e883002 sneakemail.com> wrote:
 Janice Caron wrote:

  > You might want to consider moving up. :-)

 Hm. And so we come full circle. I started out with D2, but the performance
  impact of using invariant strings (for may application) is just to costly.
In that case, you might want to consider moving up, and using mutable strings. Hint: char[] works on D2 equally as well as on D1. :-)
May 10 2008
parent reply "Me Here" <p9e883002 sneakemail.com> writes:
Janice Caron wrote:

 On 10/05/2008, Me Here <p9e883002 sneakemail.com> wrote:
 Janice Caron wrote:
 
  > You might want to consider moving up. :-)
 
 Hm. And so we come full circle. I started out with D2, but the performance
  impact of using invariant strings (for may application) is just to costly.
In that case, you might want to consider moving up, and using mutable strings. Hint: char[] works on D2 equally as well as on D1. :-)
Yes. But the string library does not. Is your memory really this bad? b. --
May 10 2008
parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 10/05/2008, Me Here <p9e883002 sneakemail.com> wrote:
 Yes. But the string library does not.
Yet. It's in-progress.
  Is your memory really this bad?
Please don't get personal. You get personal; I add you to my email filter and I don't see your posts any more. It's that simple. Your choices would seem to be: (1) wait until Phobos adds in-place string functions (2) write your own in-place string functions (3) write your own bitfields class Since I already did (3) for you a few posts up, you have a head start on one of the options.
May 10 2008
next sibling parent reply Dee Girl <deegirl noreply.com> writes:
Janice Caron Wrote:

 On 10/05/2008, Me Here <p9e883002 sneakemail.com> wrote:
 Yes. But the string library does not.
Yet. It's in-progress.
  Is your memory really this bad?
Please don't get personal. You get personal; I add you to my email filter and I don't see your posts any more. It's that simple. Your choices would seem to be: (1) wait until Phobos adds in-place string functions (2) write your own in-place string functions (3) write your own bitfields class Since I already did (3) for you a few posts up, you have a head start on one of the options.
It is good news that std.string is changing. I read it now and it seems many functions want string but they could work with const(char) and char[]. These functions should take the superclass of all three const(char)[]. How does D2 solve the problem of passing the same qualifier to the return as it is in the parameter? In the Javari language there is a keyword romaybe. Thank you, Dee Girl
May 10 2008
next sibling parent Sean Kelly <sean invisibleduck.org> writes:
== Quote from Dee Girl (deegirl noreply.com)'s article
 Janice Caron Wrote:
 On 10/05/2008, Me Here <p9e883002 sneakemail.com> wrote:
 Yes. But the string library does not.
Yet. It's in-progress.
  Is your memory really this bad?
Please don't get personal. You get personal; I add you to my email filter and I don't see your posts any more. It's that simple. Your choices would seem to be: (1) wait until Phobos adds in-place string functions (2) write your own in-place string functions (3) write your own bitfields class Since I already did (3) for you a few posts up, you have a head start on one of the options.
It is good news that std.string is changing. I read it now and it seems many functions want string but
they could work with const(char) and char[]. These functions should take the superclass of all three const(char)[]. There have been objections to this in the past. I'll let someone else outline them.
 How does D2 solve the problem of passing the same qualifier to the return as
it is in the parameter?
In the Javari language there is a keyword romaybe. Thank you, Dee Girl There is a "scoped const" proposal aimed at solving this problem, but so far as I know it isn't going to be added to the language. Sean
May 10 2008
prev sibling parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 10/05/2008, Dee Girl <deegirl noreply.com> wrote:
  How does D2 solve the problem of passing the same qualifier to the return as
it is in the parameter?
That is an entirely separate subject. There is a proposal on the table suggested by Steven Schveighoffer and myself which addresses that very problem, and our proposal sounds similar to your romaybe. However, that is not relevant for std.string.
May 10 2008
parent reply Sean Kelly <sean invisibleduck.org> writes:
== Quote from Janice Caron (caron800 googlemail.com)'s article
 On 10/05/2008, Dee Girl <deegirl noreply.com> wrote:
  How does D2 solve the problem of passing the same qualifier to the return as
it is in the parameter?
That is an entirely separate subject. There is a proposal on the table suggested by Steven Schveighoffer and myself which addresses that very problem, and our proposal sounds similar to your romaybe. However, that is not relevant for std.string.
It's relevant in that the implementation of std.string could benefit from the proposal, were it implemented. Sean
May 10 2008
parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 10/05/2008, Sean Kelly <sean invisibleduck.org> wrote:
 That is an entirely separate subject. There is a proposal on the table
> suggested by Steven Schveighoffer and myself which addresses that very > problem, and our proposal sounds similar to your romaybe. However, > that is not relevant for std.string. It's relevant in that the implementation of std.string could benefit from the proposal, were it implemented.
Not so. The invariant versions will use copy-on-write. That is a huge, huge, plus for invariant strings, and no one wants to lose it. At the other end of the spectrum, mutable char arrays offer the possibility of modify-in-place, which invariant clearly cannot do. There is no way that Steven's/my inout proposal would be sufficient to achieve all that is needed for std.string.
May 10 2008
parent Sean Kelly <sean invisibleduck.org> writes:
Janice Caron wrote:
 On 10/05/2008, Sean Kelly <sean invisibleduck.org> wrote:
 That is an entirely separate subject. There is a proposal on the table
> suggested by Steven Schveighoffer and myself which addresses that very > problem, and our proposal sounds similar to your romaybe. However, > that is not relevant for std.string. It's relevant in that the implementation of std.string could benefit from the proposal, were it implemented.
Not so. The invariant versions will use copy-on-write. That is a huge, huge, plus for invariant strings, and no one wants to lose it. At the other end of the spectrum, mutable char arrays offer the possibility of modify-in-place, which invariant clearly cannot do. There is no way that Steven's/my inout proposal would be sufficient to achieve all that is needed for std.string.
I remember Walter saying that he was surprised at how many string operations did not actually involve in-place modification, and I took this as an explanation for why std.string is designed the way it is. The problem with std.string is that one cannot use it with mutable strings /at all/, and Steven's proposal would change this quite naturally. Sean
May 10 2008
prev sibling parent "Me Here" <p9e883002 sneakemail.com> writes:
Janice Caron wrote:

 Please don't get personal. You get personal; I add you to my email
 filter and I don't see your posts any more. It's that simple.
You're right. I apologise. That is a far better solution. b. --
May 10 2008
prev sibling next sibling parent reply "Me Here" <p9e883002 sneakemail.com> writes:
Janice Caron wrote:

 You might want to consider moving up. :-)
Hm. And so we come full circle. I started out with D2, but the performance impact of using invariant strings (for may application) is just to costly. See http://genome.ucsc.edu/FAQ/FAQformat.html#format7 Essentially, once the compacted (2-bit format has been expanded back to the (huge) strings of ACGTs, then the nBlocks (count/Starts/sizes) and maskBlocks (count/Starts/Sizes) describe ranges of those huge strings that need to be changed to 'N's (nBlocks) or be lower-cased (A->a, C->c etc.). Breaking these huge strings up into pieces to do these tranformations, and then sticking all the pieces back together, when they are *all* 1 to 1 substitutions, is just ludicrous. The impact of generating all those iddy biddy invariant char[]s from one huge invriant char[] and then sticking them all back together to form another huge invariant char[] and throwing away all the intermediates causes the GC to go into fits. When the (compressed) input is 900+MB and the expanded output is > 3GB, the performance impact is considerable and inacceptable. Even once I get around to multi-threading this, the use of invariant char[]s will still be no advantage because I want (*need*) the processing of the strings to operate /in-place/. I (the programmer) need to control what gets copied when. And to control concurrent access by sharing ranges of the data (without copying), for modification /in-place/. Attempting to isolate the programmer (me) from the concerns of multi-threading by duplicating data over and over isn't an option given the volumes of data. (Personnally I think is a waste of time and effort anyway. Better to educate the programmer than to try and nanny his use of threads. The effort expended on this would be far better spent on other things--like fixing the GC. But that's not my call) Cheers, b. --
May 10 2008
parent "Janice Caron" <caron800 googlemail.com> writes:
Here's another possibility you might try:

    struct BitFields
    {
        ubyte n;

        uint a() { return (n >> 6) & 3; }
        uint b() { return (n >> 4) & 3; }
        uint c() { return (n >> 2) & 3; }
        uint d() { return (n) & 3; }

        void a(uint x) { n &= 0x3F; n |= (x & 3) << 6; }
        void b(uint x) { n &= 0xCF; n |= (x & 3) << 4; }
        void c(uint x) { n &= 0xF3; n |= (x & 3) << 2; }
        void d(uint x) { n &= 0xFC; n |= (x & 3); }
    }

Viola. Bitfields in D1.
May 10 2008
prev sibling next sibling parent Sean Kelly <sean invisibleduck.org> writes:
== Quote from Janice Caron (caron800 googlemail.com)'s article
 On 10/05/2008, Me Here <p9e883002 sneakemail.com> wrote:
 Really? Which documentation are you reading because the documentation on my
  harddrive that D1/Phobos
Oops. Using D2. D2 has bitfields. You might want to consider moving up. :-)
Bit fields and bit arrays are intended to solve different problems. But fortunately, D2 just moves BitArray from std.bitarray to std.bitmanip so no problem. Sean
May 10 2008
prev sibling parent reply "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"Janice Caron" <caron800 googlemail.com> wrote in message 
news:mailman.583.1210419927.2351.digitalmars-d puremagic.com...
 On 10/05/2008, Me Here <p9e883002 sneakemail.com> wrote:
 Really? Which documentation are you reading because the documentation on 
 my
  harddrive that D1/Phobos
Oops. Using D2. D2 has bitfields. You might want to consider moving up. :-)
You don't *have* to use D2 in order to use bitfields. There's also the std2 project (http://www.dsource.org/projects/std2) which ports many of the new modules in phobos 2, including std.bitmanip, to D1.
May 10 2008
parent Bill Baxter <dnewsgroup billbaxter.com> writes:
Jarrett Billingsley wrote:
 "Janice Caron" <caron800 googlemail.com> wrote in message 
 news:mailman.583.1210419927.2351.digitalmars-d puremagic.com...
 On 10/05/2008, Me Here <p9e883002 sneakemail.com> wrote:
 Really? Which documentation are you reading because the documentation on 
 my
  harddrive that D1/Phobos
Oops. Using D2. D2 has bitfields. You might want to consider moving up. :-)
You don't *have* to use D2 in order to use bitfields. There's also the std2 project (http://www.dsource.org/projects/std2) which ports many of the new modules in phobos 2, including std.bitmanip, to D1.
Yes, it looks like std.bitmanip is available in std2. But not everything is. The project is kind of on hold until Walter fixes IFTI bugs for D1, because Andrei started going wild with APIs that require being able to IFTI remaining parameters when the first few are explicitly specified. Example from memory: all the nifty sorting functions require a string parameter for the "a<b" predicate part, but in D2 can infer the rest of the template arguments. In D1 if you specify any template parameter it instantly stops trying to deduce anything. --bb
May 10 2008
prev sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On 10/05/2008, Janice Caron <caron800 googlemail.com> wrote:
  assigning x.n assigns the whole 16 bit number, while reading and
  writing x.a to x.h will read or write the eight individual two-bit
  fields.
Actually, I got that wrong. The example I gave is equivalent to C's: struct A { unsigned short n; unsigned int a : 2; unsigned int b : 2; unsigned int c : 2; unsigned int d : 2; unsigned int e : 2; unsigned int f : 2; unsigned int g : 2; unsigned int h : 2; } I inadvertantly gave the impression that n overlaps a to h. It doesn't - it's a separate variable. If you wanted n and a to h to overlap, you'd have to use a struct inside a union. Apologies for confusion. But anyway, std.bitmanip is what you want.
May 09 2008