www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Deque impl.

reply Robert Schadek <realburner gmx.de> writes:
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Hi,

I have a Deque implementation that I really like. I would like to get 
some comments on it.
http://dpaste.1azy.net/4bf119e7 dpaste does not run the unittests 
successfully. I don't
know why. I tested on a x32 as well as x64 machine and it works on both.

Best Regards
Robert
Jan 29 2013
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Robert Schadek:

 I have a Deque implementation that I really like. I would like 
 to get some comments on it.
 http://dpaste.1azy.net/4bf119e7 dpaste does not run the

What's the structure of the data it keeps? I'd like a deque implemented as a growable circular queue (implemented with a dynamic array) of pointers to fixed-sized chunks, plus a "intrusive-like" freelist that keeps some of the last removed chunks. Bye, bearophile
Jan 29 2013
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 01/29/2013 08:55 PM, bearophile wrote:
 Robert Schadek:

 I have a Deque implementation that I really like. I would like to get
 some comments on it.
 http://dpaste.1azy.net/4bf119e7 dpaste does not run the

What's the structure of the data it keeps?

It's a simple growable circular queue.
 I'd like a deque implemented as a growable circular queue (implemented
 with a dynamic array) of pointers to fixed-sized chunks, plus a
 "intrusive-like" freelist that keeps some of the last removed chunks.

 Bye,
 bearophile

Jan 29 2013
prev sibling next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
29-Jan-2013 23:44, Robert Schadek пишет:
 Hi,

 I have a Deque implementation that I really like. I would like to get
 some comments on it.
 http://dpaste.1azy.net/4bf119e7 dpaste does not run the unittests
 successfully. I don't
 know why. I tested on a x32 as well as x64 machine and it works on both.

I skimmed through it what caught my eye instantly: this deque exposes too much of operations that are not required of any deque. The end result is that it's pretty much always have to be an array (all these indexed inserts, removals etc.). To put it plainly it *is* an array. To be honest deque can be implemented far better if random access as in indexing, removal and such is dropped. Canonical interface is: front insertFront removeFront back insertBack removeBack length (and even this could be dropped sometimes) empty That is all. The only extra sugar to add is a bidirectional range over it via opSlice() (the one with no args). Then cool stuff like foreach works: Deque!T deque = ...; foreach(value; deque) //calls deque[] -> deque.opSlice() implicitly { ...} What about that better representation I spoke of? Simpler interface unties your hands, and in particular you can layout data as a list (or array) of arrays (pages/block you name it) thus killing the need to move data around on insert{Front,Back}. Calling a type an Iterator and then describing it as a range is bound to baffle your users ;) Another nit is that it's common to just house such types inside of a container and call them simply Range i.e. Deque!T.Range. -- Dmitry Olshansky
Jan 29 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/31/13 3:46 AM, monarch_dodra wrote:
 Quite frankly, I'm bringing more and more into question the fact that
 everything in phobos are structs, especially considering the "no default
 constructor" problem.

 The containers in std.container pretty much emulate final classes via
 pointers to payloads anyways. Not doing it via classes only brings
 problems.

 And even if you succeed in coding the damn thing correctly (Objects like
 DList had so many bugs I consider it virtually unusable). The end result
 is subtle user bugs when passing containers, if they have not been
 initialized yet. It creates an ambiguity in behavior if or if not the
 container is initialized (modification to the container inside the
 function may or may not modify the outside container).

 The only argument I see in favor of structs, is the possibility of
 having a deterministic memory model (std.container.Array). I don't think
 D's *generic* containers should do that though, IMO. It also brings lots
 of problems, just for a specific use case.

 Long story short, I'd say that unless you have a damn good reason to
 stay with structs, use classes. It has easier user semantics and much
 less room for implementation bugs.

We should use structs. Many people who need containers have efficiency as a primary concern. It makes sense for the standard library to make containers as good as possible. The intended semantics of containers in stdlib is reference counted with reference semantics. Yes, there are disadvantages for that. After much deliberation it seemed to me this is the best approach to containers in a language that cares for efficiency. Reference counting and adding allocators later rule out final classes. Andrei
Jan 31 2013
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/31/13 11:12 AM, Robert burner Schadek wrote:
 struct Deque(T) {
 struct Payload {
 T* array;
 size_t head, size;
 size_t ref;
 }

 Payload* load;

 ~this() {
 if(load !is null && (load.ref-=1) == 0) {
 free(load);
 }
 }

Yes, that's the general pattern. I agree it's a pain but I also understand that library containers are not required to look like casual application code. Andrei
Jan 31 2013
prev sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
31-Jan-2013 19:11, Andrei Alexandrescu пишет:
 On 1/31/13 3:46 AM, monarch_dodra wrote:
 Quite frankly, I'm bringing more and more into question the fact that
 everything in phobos are structs, especially considering the "no default
 constructor" problem.

 The containers in std.container pretty much emulate final classes via
 pointers to payloads anyways. Not doing it via classes only brings
 problems.

 And even if you succeed in coding the damn thing correctly (Objects like
 DList had so many bugs I consider it virtually unusable). The end result
 is subtle user bugs when passing containers, if they have not been
 initialized yet. It creates an ambiguity in behavior if or if not the
 container is initialized (modification to the container inside the
 function may or may not modify the outside container).

 The only argument I see in favor of structs, is the possibility of
 having a deterministic memory model (std.container.Array). I don't think
 D's *generic* containers should do that though, IMO. It also brings lots
 of problems, just for a specific use case.

 Long story short, I'd say that unless you have a damn good reason to
 stay with structs, use classes. It has easier user semantics and much
 less room for implementation bugs.

We should use structs. Many people who need containers have efficiency as a primary concern. It makes sense for the standard library to make containers as good as possible. The intended semantics of containers in stdlib is reference counted with reference semantics. Yes, there are disadvantages for that. After much deliberation it seemed to me this is the best approach to containers in a language that cares for efficiency. Reference counting and adding allocators later rule out final classes.

IMHO we need both kinds of containers: small value semantics a-la STL (with the usual samll string optimization and whatnot) and the large bulky ones with clean reference semantics. We can just call them differently and point out the intended usage areas. -- Dmitry Olshansky
Jan 31 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/31/13 12:12 PM, Dmitry Olshansky wrote:
 IMHO we need both kinds of containers: small value semantics a-la STL
 (with the usual samll string optimization and whatnot) and the large
 bulky ones with clean reference semantics.

I don't see much use case for the small value semantics containers. Andrei
Jan 31 2013
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
31-Jan-2013 21:47, Andrei Alexandrescu пишет:
 On 1/31/13 12:12 PM, Dmitry Olshansky wrote:
 IMHO we need both kinds of containers: small value semantics a-la STL
 (with the usual samll string optimization and whatnot) and the large
 bulky ones with clean reference semantics.

I don't see much use case for the small value semantics containers.

Even if we had safe and clean stack allocator consider ref-semantic array: struct Array{ struct Payload{ T* data; size_t len; //... other stuff } Payload* payload; Allocator alloc; //a reference most likely ... } Now you want to replace this C-ish (but effective) code: T[SIZE] data; size_t data_size; ... { ... //push zis data[data_size++] = new_item; //and reiteration of the theme } (keeping in mind that an allocation might cost more then the whole code in question in the majority of cases) The other variation of the above that doesn't have the upper bound is use alloca for starters, then fallback to malloc/free on demand. I've seen quite a few of cases like this throughout the phobos and boy those are all ugly. The primitive for clean small-scale throw-away containers is a dire need. I grant you that there is a lot of cases where arrays are tiny 90% of time and small in 99% of time e.g. identifiers are tiny < 16, small < 64, so are relative file paths etc. Yet there is always some measly percent (say 1% or 0.1%, it exists) of rare larger cases to handle. IRC bearophile once shown us how these types of small vectors used extensively in LLVM with great benefit (they replaced std:vectors). In other words the use case is niche but is frequently reinvented badly. -- Dmitry Olshansky
Jan 31 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/31/13 2:02 PM, Dmitry Olshansky wrote:
 In other words the use case is niche but is frequently reinvented badly.

I agree. I do want to get our story straight with the reference containers first, and then tackle this as an addition. Andrei
Jan 31 2013
prev sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
31-Jan-2013 05:34, Steven Schveighoffer пишет:
 On Tue, 29 Jan 2013 15:08:36 -0500, Dmitry Olshansky
 <dmitry.olsh gmail.com> wrote

 I skimmed through it what caught my eye instantly: this deque exposes
 too much of operations that are not required of any deque. The end
 result is that it's pretty much always have to be an array (all these
 indexed inserts, removals etc.). To put it plainly it *is* an array.

 To be honest deque can be implemented far better if random access as
 in indexing, removal and such is dropped. Canonical interface is:

 front
 insertFront
 removeFront
 back
 insertBack
 removeBack
 length (and even this could be dropped sometimes)
 empty

 That is all.

Hm... I thought deque's major draw was that it had O(1) insertion/removal from the front and back, and ALSO had O(1) indexing.

Hm I must be getting rusty as I'd thought it doesn't have indexing. Anyway indexing is okay just not insertion in the middle. So throw in opIndex among the above. Then the only layout left to propose is an array of blocks. Then indexing is done like: blocks[idx>>N][idx&mask] if block size is power of 2. Not half-bad and still O(1).
 At least in C++, indexing is supposed to be constant.

 My implementation (probably very naive) is two D arrays placed front to
 front.  This allows the O(1) insertion/removal and O(1) indexing.

 http://dsource.org/projects/dcollections/browser/branches/d2/dcollections/Deque.d


 -Steve

-- Dmitry Olshansky
Jan 31 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/31/13 12:16 PM, Dmitry Olshansky wrote:
 Then the only layout left to propose is an array of blocks. Then
 indexing is done like:
 blocks[idx>>N][idx&mask]
 if block size is power of 2. Not half-bad and still O(1).

Yah that's how C++'s deque is implemented. Blocks are of statically-fixed size. Andrei
Jan 31 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/31/13 1:01 PM, bearophile wrote:
 Andrei Alexandrescu:

 Yah that's how C++'s deque is implemented. Blocks are of
 statically-fixed size.

I think I have suggested a better solution.

You mean this?
 I'd like a deque implemented as a growable circular queue
 (implemented with a dynamic array) of pointers to fixed-sized
 chunks, plus a "intrusive-like" freelist that keeps some of the
 last removed chunks.

The freelist used to be part of std::deque but was since eliminated. I'm not sure of the details of the growable circular queue, but probably the most awesome thing would be if you set out to implement it. Andrei
Jan 31 2013
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/31/13 4:47 PM, bearophile wrote:
 For some usage patterns a freelist (used with an pointer carved out of
 the chunk itself) that keeps few of the last used chunks (it's usually
 not necessary to keep them all) is handy. For some other usages the
 freelist doesn't give you much, and it's just added complexity.

 If you keep using the structure as a queue, you keep adding from one
 side and remove from the other. In this case the circular queue of
 pointers to chunks works well, but you risk freeing and adding blocks
 all the time, for no purpose. Even keeping and reusing just one free
 chunk, "moving" it from one end to the other, avoids all/most memory
 allocations during the steady length state.

That works assuming a relatively slow allocator, which historically was the case with C++ and is the case with D. More recently C++ allocator have improved greatly and people complained that STL containers were hoarding memory, so many implementations abandoned freelists. Anyhow, right now the freelist does sound like a good call for D. DO IT!!! Andrei
Jan 31 2013
prev sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 01/31/2013 10:47 PM, bearophile wrote:
 ...

 The missing memory optimization:
 http://en.wikipedia.org/wiki/Circular_queue#Optimization
 ...

As far as I can see, this just gets rid of a handful bitwise and machine instructions and in return at least doubles the address space requirements, also taking up cache-space (for virtually indexed caches.) Does this actually improve performance?
Jan 31 2013
prev sibling next sibling parent Robert Schadek <realburner gmx.de> writes:
On 01/29/2013 08:55 PM, bearophile wrote:
 Robert Schadek:

 I have a Deque implementation that I really like. I would like to get 
 some comments on it.
 http://dpaste.1azy.net/4bf119e7 dpaste does not run the

What's the structure of the data it keeps? I'd like a deque implemented as a growable circular queue (implemented with a dynamic array) of pointers to fixed-sized chunks, plus a "intrusive-like" freelist that keeps some of the last removed chunks. Bye, bearophile

Jan 29 2013
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Tuesday, 29 January 2013 at 19:44:23 UTC, Robert Schadek wrote:
 Hi,

 I have a Deque implementation that I really like. I would like 
 to get
 some comments on it.
 http://dpaste.1azy.net/4bf119e7 dpaste does not run the 
 unittests
 successfully. I don't
 know why. I tested on a x32 as well as x64 machine and it works 
 on both.

 Best Regards
 Robert

Appart from the fact you seem to have copy pasted your code twice, here are a quick comments, interface wise. I didn't check anything implementation-wise 1. When you put a unittest inside the definition of a class/struct, it gets run for every instantiation of said class, and has access to "T". This is good for writing generic tests for the current T type. This can be useful to test things that depend on T, such as making sure some functions are indeed safe, or whatnot. If you are testing a specific implementation, then move it out of the struct. For example: //---- import std.stdio; struct S(T) { unittest { //Tests S!T for every T. writeln("Unittest: S!", T.stringof); } } unittest { //Write a specific S!int test here, for example writeln("Unittest: Global"); } void main() { alias A = S!int; //Force S!int tests alias B = S!double; //Force S!int tests } //---- "rdmd -unittest main.d" produces: //---- Unittest: Global Unittest: S!int Unittest: S!double //---- 2. Nitpick: I'd personally prefer you use the term "Range" for your iterator. Iterator != Range. You are confusing me. Also, I'd consider nesting the iterator definition inside your deque. 3. Nitpick: Don't use parenthesis for single arg templates: No: "Deque!(int)" yes: "Deque!int" 4. Why does your iterator have two template parameters? Seems it is always "T" and "Deque!T". Why bother with the second parameter at all? If you place it inside your Deque implementation, then it becomes parameter-less (appart from the implicit T of course). The added advantage is that it becomes a "Deque!int.Iterator" type. This is good if you ever write "List", and don't want to have the name "Iterator" clash... 5. Users expect "opSlice()" to produce an actual range. You are using it to deep copy the deck. This goes against your comment in regards to slicing an iterator. Just have your "opSlice()" do the same as what "range()" does, and get rid of "range()". This is the usage I'd expect: //---- Deque!int de = ... ; Iterator!int it = de[]; writeln(it); //---- I'll review the actual deque implementation tomorrow.
Jan 29 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/29/2013 11:44 AM, Robert Schadek wrote:
 I have a Deque implementation that I really like. I would like to get some
 comments on it.

One thing that jumped out at me was the declaration of "Iterator". D convention is to use ranges, not iterators. Calling something an iterator suggests it behaves like a C++ iterator. The comments say it is a range - it should be named RandomAccessRange, because that's what it is.
Jan 29 2013
prev sibling next sibling parent Robert Schadek <realburner gmx.de> writes:
On 01/29/2013 09:16 PM, Timon Gehr wrote:
 On 01/29/2013 08:55 PM, bearophile wrote:
 Robert Schadek:

 I have a Deque implementation that I really like. I would like to get
 some comments on it.
 http://dpaste.1azy.net/4bf119e7 dpaste does not run the

What's the structure of the data it keeps?

It's a simple growable circular queue.

 I'd like a deque implemented as a growable circular queue (implemented
 with a dynamic array) of pointers to fixed-sized chunks, plus a
 "intrusive-like" freelist that keeps some of the last removed chunks.

 Bye,
 bearophile


Jan 29 2013
prev sibling next sibling parent Robert Schadek <realburner gmx.de> writes:
On 01/29/2013 09:34 PM, Walter Bright wrote:
 On 1/29/2013 11:44 AM, Robert Schadek wrote:
 I have a Deque implementation that I really like. I would like to get 
 some
 comments on it.

One thing that jumped out at me was the declaration of "Iterator". D convention is to use ranges, not iterators. Calling something an iterator suggests it behaves like a C++ iterator. The comments say it is a range - it should be named RandomAccessRange, because that's what it is.

The naming is bad. Thats pretty much clear now.
Jan 29 2013
prev sibling next sibling parent Robert Schadek <realburner gmx.de> writes:
On 01/29/2013 09:20 PM, monarch_dodra wrote:
 On Tuesday, 29 January 2013 at 19:44:23 UTC, Robert Schadek wrote:
 Hi,

 I have a Deque implementation that I really like. I would like to get
 some comments on it.
 http://dpaste.1azy.net/4bf119e7 dpaste does not run the unittests
 successfully. I don't
 know why. I tested on a x32 as well as x64 machine and it works on both.

 Best Regards
 Robert

Appart from the fact you seem to have copy pasted your code twice, here are a quick comments, interface wise. I didn't check anything implementation-wise

 1. When you put a unittest inside the definition of a class/struct, it 
 gets run for every instantiation of said class, and has access to "T". 
 This is good for writing generic tests for the current T type. This 
 can be useful to test things that depend on T, such as making sure 
 some functions are indeed safe, or whatnot.

 If you are testing a specific implementation, then move it out of the 
 struct.

 For example:
 //----
 import std.stdio;

 struct S(T)
 {
     unittest
     {
         //Tests S!T for every T.
         writeln("Unittest: S!", T.stringof);
     }
 }

 unittest
 {
     //Write a specific S!int test here, for example
     writeln("Unittest: Global");
 }

 void main()
 {
     alias A = S!int;    //Force S!int tests
     alias B = S!double; //Force S!int tests
 }
 //----

 "rdmd -unittest main.d" produces:
 //----
 Unittest: Global
 Unittest: S!int
 Unittest: S!double
 //----

 2. Nitpick: I'd personally prefer you use the term "Range" for your 
 iterator. Iterator != Range. You are confusing me. Also, I'd consider 
 nesting the iterator definition inside your deque.

 3. Nitpick: Don't use parenthesis for single arg templates:
 No:  "Deque!(int)"
 yes: "Deque!int"

style?
 4. Why does your iterator have two template parameters? Seems it is 
 always "T" and "Deque!T". Why bother with the second parameter at all? 
 If you place it inside your Deque implementation, then it becomes 
 parameter-less (appart from the implicit T of course). The added 
 advantage is that it becomes a "Deque!int.Iterator" type. This is good 
 if you ever write "List", and don't want to have the name "Iterator" 
 clash...

nested is the better and prettier way.
 5. Users expect "opSlice()" to produce an actual range. You are using 
 it to deep copy the deck. This goes against your comment in regards to 
 slicing an iterator. Just have your "opSlice()" do the same as what 
 "range()" does, and get rid of "range()". This is the usage I'd expect:

 //----
 Deque!int    de = ... ;
 Iterator!int it = de[];
 writeln(it);
 //----

monarchdodra said (and pointed to source) that opSlice should return the same type in respect to the called object.
 I'll review the actual deque implementation tomorrow.

Jan 29 2013
prev sibling next sibling parent "Namespace" <rswhite4 googlemail.com> writes:
 3. Nitpick: Don't use parenthesis for single arg templates:
 No:  "Deque!(int)"
 yes: "Deque!int"

preferred style?

My style is also Deque!(int), not Deque!int, the latter looks ugly...
Jan 29 2013
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Tuesday, 29 January 2013 at 22:44:35 UTC, Robert Schadek wrote:
 On 01/29/2013 09:20 PM, monarch_dodra wrote:
 3. Nitpick: Don't use parenthesis for single arg templates:
 No:  "Deque!(int)"
 yes: "Deque!int"

preferred style?

It's just style, and a nitpick. Feel free to ignore it.
 5. Users expect "opSlice()" to produce an actual range. You 
 are using it to deep copy the deck. This goes against your 
 comment in regards to slicing an iterator. Just have your 
 "opSlice()" do the same as what "range()" does, and get rid of 
 "range()". This is the usage I'd expect:

 //----
 Deque!int    de = ... ;
 Iterator!int it = de[];
 writeln(it);
 //----

monarchdodra said (and pointed to source) that opSlice should return the same type in respect to the called object.

Yes, 9071. I remember it quite well, and it was the first thing that came to mind the instant I saw your your code. What I said is that for a *range* to offer correct *hasSlicing* primitive, then the "opSlice(size_t, size_t)" primitive must return the same type. Deque is not a range though, Iterator is, and we're talking about the primitive "opSlice()" Again: this is a Container vs Range issue. For example, built-in arrays also have this scheme. //---- //statarr is a static array. A container. statarr is NOT a range. int[5] statarr = [0, 1, 2, 3, 4]; //dynarrX are ranges extracted from their container int[] dynarr1 = statarr[]; int[] dynarr2 = statarr[1 .. 3]; //---- As you can see, "int[]" != "int[5]"
Jan 29 2013
prev sibling next sibling parent Robert burner Schadek <realburner gmx.de> writes:
On 01/30/2013 07:55 AM, monarch_dodra wrote:
 On Tuesday, 29 January 2013 at 22:44:35 UTC, Robert Schadek wrote:
 On 01/29/2013 09:20 PM, monarch_dodra wrote:
 3. Nitpick: Don't use parenthesis for single arg templates:
 No:  "Deque!(int)"
 yes: "Deque!int"

preferred style?

It's just style, and a nitpick. Feel free to ignore it.
 5. Users expect "opSlice()" to produce an actual range. You are 
 using it to deep copy the deck. This goes against your comment in 
 regards to slicing an iterator. Just have your "opSlice()" do the 
 same as what "range()" does, and get rid of "range()". This is the 
 usage I'd expect:

 //----
 Deque!int    de = ... ;
 Iterator!int it = de[];
 writeln(it);
 //----

monarchdodra said (and pointed to source) that opSlice should return the same type in respect to the called object.

Yes, 9071. I remember it quite well, and it was the first thing that came to mind the instant I saw your your code. What I said is that for a *range* to offer correct *hasSlicing* primitive, then the "opSlice(size_t, size_t)" primitive must return the same type. Deque is not a range though, Iterator is, and we're talking about the primitive "opSlice()" Again: this is a Container vs Range issue. For example, built-in arrays also have this scheme. //---- //statarr is a static array. A container. statarr is NOT a range. int[5] statarr = [0, 1, 2, 3, 4]; //dynarrX are ranges extracted from their container int[] dynarr1 = statarr[]; int[] dynarr2 = statarr[1 .. 3]; //---- As you can see, "int[]" != "int[5]"

Jan 30 2013
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 29 Jan 2013 15:08:36 -0500, Dmitry Olshansky  
<dmitry.olsh gmail.com> wrote

 I skimmed through it what caught my eye instantly: this deque exposes  
 too much of operations that are not required of any deque. The end  
 result is that it's pretty much always have to be an array (all these  
 indexed inserts, removals etc.). To put it plainly it *is* an array.

 To be honest deque can be implemented far better if random access as in  
 indexing, removal and such is dropped. Canonical interface is:

 front
 insertFront
 removeFront
 back
 insertBack
 removeBack
 length (and even this could be dropped sometimes)
 empty

 That is all.

Hm... I thought deque's major draw was that it had O(1) insertion/removal from the front and back, and ALSO had O(1) indexing. At least in C++, indexing is supposed to be constant. My implementation (probably very naive) is two D arrays placed front to front. This allows the O(1) insertion/removal and O(1) indexing. http://dsource.org/projects/dcollections/browser/branches/d2/dcollections/Deque.d -Steve
Jan 30 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Steven Schveighoffer:

 Hm... I thought deque's major draw was that it had O(1) 
 insertion/removal from the front and back, and ALSO had O(1) 
 indexing.

Amortized O(1) insert/removal, and hard O(1) for indexing. And the multiplicative constants must be low.
 My implementation (probably very naive) is two D arrays placed 
 front to front.  This allows the O(1) insertion/removal and 
 O(1) indexing.

What's the memory behavour if you keep using it like a queue (adding at the end and removing from the head)? See my first comment: http://forum.dlang.org/thread/mailman.852.1359488662.22503.digitalmars-d puremagic.com#post-kkfkonazfgsuqddkuimy:40forum.dlang.org Bye, bearophile
Jan 30 2013
prev sibling next sibling parent d coder <dlang.coder gmail.com> writes:
 To be honest deque can be implemented far better if random access as in
 indexing, removal and such is dropped.

As other people too pointed out, Deque is indeed a random access range, so indexing operator is a must. Insert/Remove too are good to have. Array implementation in the std.container too implements remove and insert. Another small issue is that this Deque implementation is a Class. std.container provides all the containers as structs. So for sake of consistency, we should implement Deque too as a struct. Regards - Puneet
Jan 30 2013
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Thursday, 31 January 2013 at 07:34:52 UTC, d coder wrote:
 To be honest deque can be implemented far better if random 
 access as in
 indexing, removal and such is dropped.

As other people too pointed out, Deque is indeed a random access range, so indexing operator is a must. Insert/Remove too are good to have. Array implementation in the std.container too implements remove and insert. Another small issue is that this Deque implementation is a Class. std.container provides all the containers as structs. So for sake of consistency, we should implement Deque too as a struct. Regards - Puneet

Quite frankly, I'm bringing more and more into question the fact that everything in phobos are structs, especially considering the "no default constructor" problem. The containers in std.container pretty much emulate final classes via pointers to payloads anyways. Not doing it via classes only brings problems. And even if you succeed in coding the damn thing correctly (Objects like DList had so many bugs I consider it virtually unusable). The end result is subtle user bugs when passing containers, if they have not been initialized yet. It creates an ambiguity in behavior if or if not the container is initialized (modification to the container inside the function may or may not modify the outside container). The only argument I see in favor of structs, is the possibility of having a deterministic memory model (std.container.Array). I don't think D's *generic* containers should do that though, IMO. It also brings lots of problems, just for a specific use case. Long story short, I'd say that unless you have a damn good reason to stay with structs, use classes. It has easier user semantics and much less room for implementation bugs. (PS: Argument also applies to all the PRNGs.)
Jan 31 2013
prev sibling next sibling parent Robert burner Schadek <realburner gmx.de> writes:
On 01/31/2013 02:48 AM, bearophile wrote:
 Steven Schveighoffer:

 Hm... I thought deque's major draw was that it had O(1) 
 insertion/removal from the front and back, and ALSO had O(1) indexing.

Amortized O(1) insert/removal, and hard O(1) for indexing. And the multiplicative constants must be low.
 My implementation (probably very naive) is two D arrays placed front 
 to front.  This allows the O(1) insertion/removal and O(1) indexing.

What's the memory behavour if you keep using it like a queue (adding at the end and removing from the head)? See my first comment: http://forum.dlang.org/thread/mailman.852.1359488662.22503.digitalmars-d puremagic.com#post-kkfkonazfgsuqddkui y:40forum.dlang.org

the find of chrome for front though).
 Bye,
 bearophile

Jan 31 2013
prev sibling next sibling parent Robert burner Schadek <realburner gmx.de> writes:
On 01/31/2013 08:33 AM, d coder wrote:
 To be honest deque can be implemented far better if random access as in
 indexing, removal and such is dropped.

range, so indexing operator is a must. Insert/Remove too are good to have. Array implementation in the std.container too implements remove and insert. Another small issue is that this Deque implementation is a Class. std.container provides all the containers as structs. So for sake of consistency, we should implement Deque too as a struct.

for the Deque being a class is that it holds two value members. And if you pass the Deque around you need to do this as ref. Otherwise you shall not call any method that modifies this value members, because this will only be visible at the calling site. See this tiny example: struct F { size_t a; } void foo(F f) { f.a = 10; } void main() { F a; a.a = 9; foo(a); assert(a.a == 10); } So I think leaving it a class is valid. But I could create a static opCall method that creates a Deque or a convince Function similar to that of the Rb tree
 Regards
 - Puneet

Jan 31 2013
prev sibling next sibling parent Robert burner Schadek <realburner gmx.de> writes:
On 01/31/2013 09:46 AM, monarch_dodra wrote:
 On Thursday, 31 January 2013 at 07:34:52 UTC, d coder wrote:
 To be honest deque can be implemented far better if random access as in
 indexing, removal and such is dropped.

As other people too pointed out, Deque is indeed a random access range, so indexing operator is a must. Insert/Remove too are good to have. Array implementation in the std.container too implements remove and insert. Another small issue is that this Deque implementation is a Class. std.container provides all the containers as structs. So for sake of consistency, we should implement Deque too as a struct. Regards - Puneet

Quite frankly, I'm bringing more and more into question the fact that everything in phobos are structs, especially considering the "no default constructor" problem. The containers in std.container pretty much emulate final classes via pointers to payloads anyways. Not doing it via classes only brings problems. And even if you succeed in coding the damn thing correctly (Objects like DList had so many bugs I consider it virtually unusable). The end result is subtle user bugs when passing containers, if they have not been initialized yet. It creates an ambiguity in behavior if or if not the container is initialized (modification to the container inside the function may or may not modify the outside container). The only argument I see in favor of structs, is the possibility of having a deterministic memory model (std.container.Array). I don't think D's *generic* containers should do that though, IMO. It also brings lots of problems, just for a specific use case. Long story short, I'd say that unless you have a damn good reason to stay with structs, use classes. It has easier user semantics and much less room for implementation bugs. (PS: Argument also applies to all the PRNGs.)

Jan 31 2013
prev sibling next sibling parent d coder <dlang.coder gmail.com> writes:
On Thu, Jan 31, 2013 at 2:43 PM, Robert burner Schadek
<realburner gmx.de> wrote:
 Thats not totally correct. The Rb tree is a class. But the real reason for
 the Deque being a class
 is that it holds two value members. And if you pass the Deque around you
 need to do this as ref.
 Otherwise you shall not call any method that modifies this value members,
 because this will only be
 visible at the calling site.

I would expect all the containers in the std.container to have either pass by value or pass by reference semantics. It seems at least Array, Slist, and Dlist are structs and they follow pass by value semantics.
From Dlang.org:

auto a = DList!int([3, 4]); //Create a new chain auto b = a; //Point to the same chain // (3 - 4) assert(a[].equal([3, 4])); assert(b[].equal([3, 4])); b.stableInsertFront(1); //insert before of b b.stableInsertBack(5); //insert after of b // (2 - (3 - 4) - 5) assert(a[].equal([3, 4])); //a is not changed assert(b[].equal([1, 3, 4, 5])); // but b is changed Regards - Puneet
Jan 31 2013
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Thursday, 31 January 2013 at 11:35:31 UTC, d coder wrote:
 On Thu, Jan 31, 2013 at 2:43 PM, Robert burner Schadek
 <realburner gmx.de> wrote:
 Thats not totally correct. The Rb tree is a class. But the 
 real reason for
 the Deque being a class
 is that it holds two value members. And if you pass the Deque 
 around you
 need to do this as ref.
 Otherwise you shall not call any method that modifies this 
 value members,
 because this will only be
 visible at the calling site.

I would expect all the containers in the std.container to have either pass by value or pass by reference semantics. It seems at least Array, Slist, and Dlist are structs and they follow pass by value semantics.
From Dlang.org:

auto a = DList!int([3, 4]); //Create a new chain auto b = a; //Point to the same chain // (3 - 4) assert(a[].equal([3, 4])); assert(b[].equal([3, 4])); b.stableInsertFront(1); //insert before of b b.stableInsertBack(5); //insert after of b // (2 - (3 - 4) - 5) assert(a[].equal([3, 4])); //a is not changed assert(b[].equal([1, 3, 4, 5])); // but b is changed Regards - Puneet

Technically, all containers in std.container need to adhere to reference semantics when passed by value, regardless of if they are classes or structs. For example, AA's can be considered structs, but when passed, they still adhere to reference semantics. --------- In regards to DList: I wrote that code and documentation. The truth is that DList was written with absolutely 0 concern to what might happen if you mutate a copy. While correcting the code, I preserved the old semantic (accidental) semantic, which was kind of weird. The mistake was to document said semantic. It makes no sense. I should never even have preserved that semantic in the first place anyways. There is now a new version in the tubes for Dlist, to make it behave the way you'd expect it to: https://github.com/D-Programming-Language/phobos/pull/953 The pull is kind of stuck in limbo, specifically because of the problems associated with implementing reference semantics with structs :/
Jan 31 2013
prev sibling next sibling parent Robert burner Schadek <realburner gmx.de> writes:
On 01/31/2013 01:00 PM, monarch_dodra wrote:
 On Thursday, 31 January 2013 at 11:35:31 UTC, d coder wrote:
 On Thu, Jan 31, 2013 at 2:43 PM, Robert burner Schadek
 <realburner gmx.de> wrote:
 Thats not totally correct. The Rb tree is a class. But the real 
 reason for
 the Deque being a class
 is that it holds two value members. And if you pass the Deque around 
 you
 need to do this as ref.
 Otherwise you shall not call any method that modifies this value 
 members,
 because this will only be
 visible at the calling site.

I would expect all the containers in the std.container to have either pass by value or pass by reference semantics. It seems at least Array, Slist, and Dlist are structs and they follow pass by value semantics.
 From Dlang.org:

auto a = DList!int([3, 4]); //Create a new chain auto b = a; //Point to the same chain // (3 - 4) assert(a[].equal([3, 4])); assert(b[].equal([3, 4])); b.stableInsertFront(1); //insert before of b b.stableInsertBack(5); //insert after of b // (2 - (3 - 4) - 5) assert(a[].equal([3, 4])); //a is not changed assert(b[].equal([1, 3, 4, 5])); // but b is changed Regards - Puneet

Technically, all containers in std.container need to adhere to reference semantics when passed by value, regardless of if they are classes or structs.

value members on the heap, but I think there is nothing gained by that.
 For example, AA's can be considered structs, but when passed, they 
 still adhere to reference semantics.

 ---------
 In regards to DList: I wrote that code and documentation.

 The truth is that DList was written with absolutely 0 concern to what 
 might happen if you mutate a copy. While correcting the code, I 
 preserved the old semantic (accidental) semantic, which was kind of 
 weird.

 The mistake was to document said semantic. It makes no sense. I should 
 never even have preserved that semantic in the first place anyways. 
 There is now a new version in the tubes for Dlist, to make it behave 
 the way you'd expect it to:

 https://github.com/D-Programming-Language/phobos/pull/953

 The pull is kind of stuck in limbo, specifically because of the 
 problems associated with implementing reference semantics with structs :/

the std.container2 by jmdavis. I would be really interested to help with the design and impl of this module. Do any further plans exists on that?
Jan 31 2013
prev sibling next sibling parent Robert burner Schadek <realburner gmx.de> writes:
On 01/31/2013 04:11 PM, Andrei Alexandrescu wrote:
 On 1/31/13 3:46 AM, monarch_dodra wrote:
 Quite frankly, I'm bringing more and more into question the fact that
 everything in phobos are structs, especially considering the "no default
 constructor" problem.

 The containers in std.container pretty much emulate final classes via
 pointers to payloads anyways. Not doing it via classes only brings
 problems.

 And even if you succeed in coding the damn thing correctly (Objects like
 DList had so many bugs I consider it virtually unusable). The end result
 is subtle user bugs when passing containers, if they have not been
 initialized yet. It creates an ambiguity in behavior if or if not the
 container is initialized (modification to the container inside the
 function may or may not modify the outside container).

 The only argument I see in favor of structs, is the possibility of
 having a deterministic memory model (std.container.Array). I don't think
 D's *generic* containers should do that though, IMO. It also brings lots
 of problems, just for a specific use case.

 Long story short, I'd say that unless you have a damn good reason to
 stay with structs, use classes. It has easier user semantics and much
 less room for implementation bugs.

We should use structs. Many people who need containers have efficiency as a primary concern. It makes sense for the standard library to make containers as good as possible. The intended semantics of containers in stdlib is reference counted with reference semantics. Yes, there are disadvantages for that. After much deliberation it seemed to me this is the best approach to containers in a language that cares for efficiency. Reference counting and adding allocators later rule out final classes. Andrei

struct Deque(T) { struct Payload { T* array; size_t head, size; size_t ref; } Payload* load; ~this() { if(load !is null && load.ref-1 == 0) { free(load); } }
Jan 31 2013
prev sibling next sibling parent Robert burner Schadek <realburner gmx.de> writes:
On 01/31/2013 05:16 PM, Andrei Alexandrescu wrote:
 On 1/31/13 11:12 AM, Robert burner Schadek wrote:
 struct Deque(T) {
 struct Payload {
 T* array;
 size_t head, size;
 size_t ref;
 }

 Payload* load;

 ~this() {
 if(load !is null && (load.ref-=1) == 0) {
 free(load);
 }
 }

Yes, that's the general pattern. I agree it's a pain but I also understand that library containers are not required to look like casual application code. Andrei

bad either. Just one more indirection.
Jan 31 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 Yah that's how C++'s deque is implemented. Blocks are of 
 statically-fixed size.

I think I have suggested a better solution. Bye, bearophile
Jan 31 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 You mean this?

Yes, I meant that.
 I'm not sure of the details of the growable circular queue,

I meant something like this (but an optimization about memory pages is missing): http://rosettacode.org/wiki/Queue/Usage#Faster_Version The power-of-2 growable circular queue doesn't need to be a too much complex data structure because it doesn't contain the items, it contains the links to the fixed-sized chunks of items, so it grows/shrinks much more slowly. The missing memory optimization: http://en.wikipedia.org/wiki/Circular_queue#Optimization
 The freelist used to be part of std::deque but was since 
 eliminated.

For some usage patterns a freelist (used with an pointer carved out of the chunk itself) that keeps few of the last used chunks (it's usually not necessary to keep them all) is handy. For some other usages the freelist doesn't give you much, and it's just added complexity. If you keep using the structure as a queue, you keep adding from one side and remove from the other. In this case the circular queue of pointers to chunks works well, but you risk freeing and adding blocks all the time, for no purpose. Even keeping and reusing just one free chunk, "moving" it from one end to the other, avoids all/most memory allocations during the steady length state. Bye, bearophile
Jan 31 2013
prev sibling next sibling parent Robert burner Schadek <realburner gmx.de> writes:
On 01/31/2013 11:21 PM, Timon Gehr wrote:
 Does this actually improve performance?

Good question? IMHO I think we should define a template that checks if the right methods are in place and than design a numerous benchmarks and that check.
Jan 31 2013
prev sibling next sibling parent Robert burner Schadek <realburner gmx.de> writes:
On 01/31/2013 11:38 PM, Robert burner Schadek wrote:
 Good question? IMHO I think we should define a template that checks if 
 the right methods are in place and than design a numerous benchmarks 
 and that check.

benchmarks and than check all available implementations against them. It might even be possible to provide multiple implementations and access them through a fassade. This way the user can choose what kind of properties he needs.
Jan 31 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Timon Gehr:

 As far as I can see, this just gets rid of a handful bitwise 
 and machine instructions and in return at least doubles the 
 address space requirements, also taking up cache-space (for 
 virtually indexed caches.)

 Does this actually improve performance?

I have not tried it (that linked code on Rosettacode doesn't have that "optimization"). When you try to create a data structure some benchmarking is required. Regarding the memory requirements, that circular queue doesn't hold the items, only pointers to chunks of items (one chunk is like 2^16 bytes or 2^20 bytes long), so usually it uses only some kilobytes of memory. So wasting a bit of memory here is OK. Bye, bearophile
Jan 31 2013
prev sibling next sibling parent "renoX" <renozyx gmail.com> writes:
On Wednesday, 30 January 2013 at 06:55:54 UTC, monarch_dodra 
wrote:
[cut]
 As you can see, "int[]" != "int[5]"

My reaction is a bit late, but: thanks for showing again the C-style declarations are flawed, in a "Pascal style" declaration we would have 'dyn_array of int' and 'fix_array(5) of int': much more easy to see differences like this. renoX
Feb 04 2013
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Monday, 4 February 2013 at 15:51:10 UTC, renoX wrote:
 On Wednesday, 30 January 2013 at 06:55:54 UTC, monarch_dodra 
 wrote:
 [cut]
 As you can see, "int[]" != "int[5]"

My reaction is a bit late, but: thanks for showing again the C-style declarations are flawed, in a "Pascal style" declaration we would have 'dyn_array of int' and 'fix_array(5) of int': much more easy to see differences like this. renoX

That's not what I was trying to show, and am unsure how you came to that conclusion.
Feb 04 2013
prev sibling parent Robert burner Schadek <realburner gmx.de> writes:
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

I ported my Deque from a class to a struct. http://dpaste.1azy.net/cc983a8d
It uses a ref counted payload which holds a pointer to a chunk of memory.
A helper function called createDeque was created to get around the 
struct ctor stuff.

While implementing I ran in some problems. Like:
http://d.puremagic.com/issues/show_bug.cgi?id=9401 This goes for all 
attributes as Maxin Fomin correctly says.
http://d.puremagic.com/issues/show_bug.cgi?id=4338 This is sort of 
annoying but I got past it by doing some bad casts.
A  trusted { at the beginning of the file is not horored by the compiler 
created methods.

Somehow "new" now creates segfaults (run the code and see the Appender 
die). I have no idea what I do wrong.
When main is left another segfault occurs while joining running threads 
(I did not create any Threads explicitly).

It would be nice if somebody could take a look and give me some pointers.

Best Regards
Robert
Feb 06 2013