www.digitalmars.com         C & C++   DMDScript  

D - Some more ideas

reply "cblack01" <cblack01 cox.net> writes:
Hello D World!

I've been looking over the specs for D.  I really like the way it's
evolving.  As always, I have some ideas.

My first suggestion is to use the std::vector resizing algorithm for array
resizing.  This way, you can add elements to an array without having to
worry about inefficiency.

Another suggestion is to include set syntax.  I think Delphi has sets if I'm
not mistaken.  Sets simplify syntax and save lots of typing.  There are two
ways to use sets.  The first is with the "in" keyword, or "is an element
of", which would work like this:

int a;
...
if(a in {1 .. 3, 6, 9}) { ... }

Note the {1 .. 3, 6, 9} following the "in" keyword is a set.  The compiler
would translate the above expression into:

if (a >=1 && a <=3 || a ==6 || a==9) { ... }

Sets could also be used to perform iterations.  Perhaps something like:

for(int a = {1 .. 3, 6, 9}) { ... }

And the for loop would work as you would expect.  The expression could be
expanded by defining the "..." to be a function.  Then the expression would
expand to:

void func(int a) { ... }
for(int a = 1; a <=3; a++)func(a);
func(6);
func(9);

-Craig
Aug 24 2002
parent reply Pavel Minayev <evilone omen.ru> writes:
On Sat, 24 Aug 2002 14:56:10 -0400 "cblack01" <cblack01 cox.net> wrote:

 My first suggestion is to use the std::vector resizing algorithm for array
 resizing.  This way, you can add elements to an array without having to
 worry about inefficiency.

It was discussed not long ago. Walter says it cannot be done, even though I don't really understand reasons - but as soon as templates get implemented, we could write our own vector.
Aug 24 2002
parent reply "Walter" <walter digitalmars.com> writes:
"Pavel Minayev" <evilone omen.ru> wrote in message
news:CFN374931050334954 news.digitalmars.com...
 On Sat, 24 Aug 2002 14:56:10 -0400 "cblack01" <cblack01 cox.net> wrote:

 My first suggestion is to use the std::vector resizing algorithm for


 resizing.  This way, you can add elements to an array without having to
 worry about inefficiency.

It was discussed not long ago. Walter says it cannot be done, even though

 don't really
 understand reasons - but as soon as templates get implemented, we could

 our own vector.

The reason is the resizer doesn't know if some array slice is using the memory just above it. For example, allocate an array of 100. Take a slice from 0..50. Now take the slice, and resize it up to 80. It will step all over the original array. For example: char a[] = new char[100]; char b[] = a[0..50]; b.length = b.length + 30;
Aug 24 2002
next sibling parent reply Pavel Minayev <evilone omen.ru> writes:
On Sat, 24 Aug 2002 20:29:23 -0700 "Walter" <walter digitalmars.com> wrote:

 The reason is the resizer doesn't know if some array slice is using the
 memory just above it. For example, allocate an array of 100. Take a slice
 from 0..50. Now take the slice, and resize it up to 80. It will step all
 over the original array. For example:
 
     char a[] = new char[100];
     char b[] = a[0..50];
     b.length = b.length + 30;

The answer is simple: slices should not be resized (and the compiler must be able to catch it). This probably needs some kind of flag, for example the most significant bit of length could serve as an indicator. Resizing slices is a weird idea anyhow... Of course ~= should also be forbidden, as well as any other operations that change length. Once slice is taken, it cannot be changed - only data within it can.
Aug 25 2002
parent reply "Walter" <walter digitalmars.com> writes:
"Pavel Minayev" <evilone omen.ru> wrote in message
news:CFN374934732140394 news.digitalmars.com...
 On Sat, 24 Aug 2002 20:29:23 -0700 "Walter" <walter digitalmars.com>

 The reason is the resizer doesn't know if some array slice is using the
 memory just above it. For example, allocate an array of 100. Take a


 from 0..50. Now take the slice, and resize it up to 80. It will step all
 over the original array. For example:

     char a[] = new char[100];
     char b[] = a[0..50];
     b.length = b.length + 30;

The answer is simple: slices should not be resized (and the compiler must

 able to catch it). This probably needs some kind of flag, for example the

 significant bit of length could serve as an indicator. Resizing slices is

 weird
 idea anyhow... Of course ~= should also be forbidden, as well as any other
 operations that change length. Once slice is taken, it cannot be changed -

 data within it can.

The problem is distinguishing a slice from the original allocation. Having an extra bit I suspect will kill off any performance gains that you'd get from resizing in place.
Aug 25 2002
parent reply Pavel Minayev <evilone omen.ru> writes:
On Sun, 25 Aug 2002 09:25:23 -0700 "Walter" <walter digitalmars.com> wrote:

 The problem is distinguishing a slice from the original allocation. Having
 an extra bit I suspect will kill off any performance gains that you'd get
 from resizing in place.

Why? It would be just a single bitwise AND to check the bit, and, if it is a slice, another one to reset it. On the other hand, adding elements to end of list would be _hundred_ times faster on large lists.
Aug 25 2002
parent reply Burton Radons <loth users.sourceforge.net> writes:
Pavel Minayev wrote:

 On Sun, 25 Aug 2002 09:25:23 -0700 "Walter" <walter digitalmars.com> wrote:
 
The problem is distinguishing a slice from the original allocation. Having
an extra bit I suspect will kill off any performance gains that you'd get
from resizing in place.

Why? It would be just a single bitwise AND to check the bit, and, if it is a slice, another one to reset it. On the other hand, adding elements to end of list would be _hundred_ times faster on large lists.

I've done this for my port. Works nice. Assignment, returning, and passing as an in argument also clear the bit, although that doesn't have to be done if the previous owner doesn't take advantage of it or is going out of scope. The only costly operation is in finding out how much more space a list has in it, but this is nothing compared to doing allocation too much.
Aug 25 2002
parent reply "Walter" <walter digitalmars.com> writes:
"Burton Radons" <loth users.sourceforge.net> wrote in message
news:3D6938BA.1070809 users.sourceforge.net...
 Pavel Minayev wrote:
 On Sun, 25 Aug 2002 09:25:23 -0700 "Walter" <walter digitalmars.com>


The problem is distinguishing a slice from the original allocation.



an extra bit I suspect will kill off any performance gains that you'd



from resizing in place.



 slice, another one to
 reset it. On the other hand, adding elements to end of list would be


 times faster
 on large lists.

passing as an in argument also clear the bit, although that doesn't have to be done if the previous owner doesn't take advantage of it or is going out of scope. The only costly operation is in finding out how much more space a list has in it, but this is nothing compared to doing allocation too much.

for (i = 0; i < a.length; i++) becomes: for (i = 0; i < (a.length & 7FFFFFFF); i++) which can have a significant speed penalty.
Aug 26 2002
next sibling parent reply "Juan Carlos Arevalo Baeza" <jcab JCABs-Rumblings.com> writes:
"Walter" <walter digitalmars.com> wrote in message
news:akdohj$2m1h$1 digitaldaemon.com...


 for (i = 0; i < a.length; i++)

 becomes:

    for (i = 0; i < (a.length & 7FFFFFFF); i++)

 which can have a significant speed penalty.

Of course, you can always make a case for this, even when bitmasking is not necessary: int n = a.length & 7FFFFFFF; for (i = 0; i < n; i++) Although, I would expect any decent compiler (wink, wink) to do that optimization for you. Ahem... Salutaciones, JCAB
Aug 26 2002
next sibling parent Mac Reiter <Mac_member pathlink.com> writes:
    for (i = 0; i < (a.length & 7FFFFFFF); i++)

 which can have a significant speed penalty.

Of course, you can always make a case for this, even when bitmasking is not necessary: int n = a.length & 7FFFFFFF; for (i = 0; i < n; i++) Although, I would expect any decent compiler (wink, wink) to do that optimization for you. Ahem...

Unfortunately, while this is valid in most cases, it is not always valid. If something inside the 'for' loop resized 'a', the original loop would still be correct (at least in terms of not going outside the loop. 'i' would usually also have to be manipulated by whatever did the resizing to make the loop logically correct). The optimized loop would not notice the change to length, and could cause a memory violation, corruption, or incomplete loop. If it were a compiler optimization, presumably the compiler could note the array resizing inside the loop body and avoid the optimization. Mac
Aug 26 2002
prev sibling parent "Walter" <walter digitalmars.com> writes:
"Juan Carlos Arevalo Baeza" <jcab JCABs-Rumblings.com> wrote in message
news:akds2h$2pvb$1 digitaldaemon.com...
 "Walter" <walter digitalmars.com> wrote in message
 news:akdohj$2m1h$1 digitaldaemon.com...
 for (i = 0; i < a.length; i++)
 becomes:
    for (i = 0; i < (a.length & 7FFFFFFF); i++)
 which can have a significant speed penalty.

not necessary: int n = a.length & 7FFFFFFF; for (i = 0; i < n; i++) Although, I would expect any decent compiler (wink, wink) to do that optimization for you. Ahem...

It's not so easy, because the compiler has to prove that a.length doesn't change inside the loop.
Aug 26 2002
prev sibling next sibling parent reply Pavel Minayev <evilone omen.ru> writes:
On Mon, 26 Aug 2002 10:28:16 -0700 "Walter" <walter digitalmars.com> wrote:

 for (i = 0; i < a.length; i++)
 
 becomes:
 
    for (i = 0; i < (a.length & 7FFFFFFF); i++)
 
 which can have a significant speed penalty.
 

Store the flag in the capacity field, not in the length (capacity as in std::vector). It is only queried whenever new element is added.
Aug 26 2002
parent reply "Walter" <walter digitalmars.com> writes:
"Pavel Minayev" <evilone omen.ru> wrote in message
news:CFN374949765569907 news.digitalmars.com...
 On Mon, 26 Aug 2002 10:28:16 -0700 "Walter" <walter digitalmars.com>

 for (i = 0; i < a.length; i++)

 becomes:

    for (i = 0; i < (a.length & 7FFFFFFF); i++)

 which can have a significant speed penalty.

std::vector). It is only queried whenever new element is added.

That makes a dynamic array a 3 word quantity instead of a 2 word one. There's a significant penalty for that.
Aug 26 2002
parent reply Pavel Minayev <evilone omen.ru> writes:
On Mon, 26 Aug 2002 17:46:32 -0700 "Walter" <walter digitalmars.com> wrote:

 That makes a dynamic array a 3 word quantity instead of a 2 word one.
 There's a significant penalty for that.

Just how many dynamic arrays you have in your program? std::vector also uses 3 words, by the way, but I haven't heard anyone complaining yet. I don't remember exactly, but Delphi dynamic strings might also use 3 words. On other hand, current implementation of dynamic arrays makes them simply useless (at least for me). When you have an "append to end" ~= operator, but advised against using it because it is extremely slow, it is awful. What you propose is doing memory management manually, preallocating chunks of memory and filling them - back to REDIM days? What's the point of D dynamic arrays then? One of the nice features of D is that it claims to cover large area of STL functionality, but with simpler approach. std::vector is probably the most basic container type, yet there's nothing in D so far that can be really used instead of it...
Aug 27 2002
parent reply "Walter" <walter digitalmars.com> writes:
"Pavel Minayev" <evilone omen.ru> wrote in message
news:CFN374955394183796 news.digitalmars.com...
 On Mon, 26 Aug 2002 17:46:32 -0700 "Walter" <walter digitalmars.com>

 That makes a dynamic array a 3 word quantity instead of a 2 word one.
 There's a significant penalty for that.


I use them all over the place <g>. As a technical aside, being two words means they can be implemented using register pairs, which should make them pretty efficient. Three words means they don't get enregistered.
 std::vector also uses
 3 words, by the way, but I haven't heard anyone complaining yet. I don't
 remember
 exactly, but Delphi dynamic strings might also use 3 words.

I didn't know that.
 On other hand, current implementation of dynamic arrays makes them simply
 useless
 (at least for me). When you have an "append to end" ~= operator, but

 against
 using it because it is extremely slow, it is awful. What you propose is

 memory
 management manually, preallocating chunks of memory and filling them -

 REDIM days? What's the point of D dynamic arrays then?

The tradeoff is to get faster loops on arrays at the expense of slower resizing. The only time the resize speed is really a problem are things like: for (i = 0; i < 1000; i++) a ~= "hello"; and I suggest that such uses aren't that common.
 One of the nice features of D is that it claims to cover large area of STL
 functionality,
 but with simpler approach. std::vector is probably the most basic

 type,
 yet there's nothing in D so far that can be really used instead of it...

Aug 27 2002
next sibling parent reply Pavel Minayev <evilone omen.ru> writes:
On Tue, 27 Aug 2002 10:36:10 -0700 "Walter" <walter digitalmars.com> wrote:

 The tradeoff is to get faster loops on arrays at the expense of slower
 resizing. The only time the resize speed is really a problem are things
 like:
     for (i = 0; i < 1000; i++)
         a ~= "hello";
 and I suggest that such uses aren't that common.

You get something close to that in a parser. For a large program, with thousands of identifiers (e.g. Win32 headers =)), dynamic arrays grow very large, and every new declared identifier requires reallocation... I noticed it when I ran my PAS2D converter. It is _very_ noticeable. The same will happen in a game if one decides to use a dynamic array to store game entities - due to constant creation of new objects (projectiles, particles, blood etc). This also happens when somebody reads a string character by character, which is quite a common operation.
Aug 27 2002
next sibling parent Mac Reiter <Mac_member pathlink.com> writes:
In article <CFN374959716746759 news.digitalmars.com>, Pavel Minayev says...
On Tue, 27 Aug 2002 10:36:10 -0700 "Walter" <walter digitalmars.com> wrote:

 The tradeoff is to get faster loops on arrays at the expense of slower
 resizing. The only time the resize speed is really a problem are things
 like:
     for (i = 0; i < 1000; i++)
         a ~= "hello";
 and I suggest that such uses aren't that common.

You get something close to that in a parser. For a large program, with thousands of identifiers (e.g. Win32 headers =)), dynamic arrays grow very large, and every new declared identifier requires reallocation... I noticed it when I ran my PAS2D converter. It is _very_ noticeable. The same will happen in a game if one decides to use a dynamic array to store game entities - due to constant creation of new objects (projectiles, particles, blood etc). This also happens when somebody reads a string character by character, which is quite a common operation.

Firstly, I suspect that Walter has a very good understanding of how parsers work... Secondly, wouldn't an associative array be better for some of these applications? Parsers, at least, need fast lookups, and depending on what you're doing in your game you may also need fast lookups (dunno -- not a great deal of real game programming experience yet -- all book learnin') Associative arrays *do* automatically increase in size as necessary, although I do not know the exact details of the underlying extendible hash mechanism. The rehash mechanism is also quite nice for optimizing lookups after loading in an associative array. I would assume that associative arrays are also more efficient, at least in the space domain, for sparse array usages. This is not meant as an argument against auto-resizing arrays. I just wanted to mention that at least one form of auto-resizing array does appear to exist. Personally, I always use .reserve even on my STL vectors. It doesn't bother me to determine my own growth rate mechanism. But other people have different tastes... Mac
Aug 27 2002
prev sibling next sibling parent reply "Sean L. Palmer" <seanpalmer earthlink.net> writes:
The trick to game programming memory allocation is to allocate all objects
from fixed pools.  You can even do that in Java.  Too bad you can't disable
the GC completely.

Sean

"Pavel Minayev" <evilone omen.ru> wrote in message
news:CFN374959716746759 news.digitalmars.com...
 On Tue, 27 Aug 2002 10:36:10 -0700 "Walter" <walter digitalmars.com>

 The tradeoff is to get faster loops on arrays at the expense of slower
 resizing. The only time the resize speed is really a problem are things
 like:
     for (i = 0; i < 1000; i++)
         a ~= "hello";
 and I suggest that such uses aren't that common.

You get something close to that in a parser. For a large program, with

 of identifiers (e.g. Win32 headers =)), dynamic arrays grow very large,

 every
 new declared identifier requires reallocation... I noticed it when I ran

 PAS2D
 converter. It is _very_ noticeable. The same will happen in a game if one
 decides
 to use a dynamic array to store game entities - due to constant creation

 objects (projectiles, particles, blood etc). This also happens when

 reads a string character by character, which is quite a common operation.

Aug 27 2002
parent reply Russell Lewis <spamhole-2001-07-16 deming-os.org> writes:
One advantage of the gc being a mark & sweep rather than a refcounter is 
that, once you call gc.disable(), it is altogether disabled (other than 
the memory footprint for the gc code).

Am I wrong here?
Aug 28 2002
parent "Walter" <walter digitalmars.com> writes:
"Russell Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message
news:3D6CDE4E.2030704 deming-os.org...
 One advantage of the gc being a mark & sweep rather than a refcounter is
 that, once you call gc.disable(), it is altogether disabled (other than
 the memory footprint for the gc code).

 Am I wrong here?

Nope. You're right. There are a lot of methods of eliminating the gc pause.
Aug 29 2002
prev sibling parent "Walter" <walter digitalmars.com> writes:
"Pavel Minayev" <evilone omen.ru> wrote in message
news:CFN374959716746759 news.digitalmars.com...
 On Tue, 27 Aug 2002 10:36:10 -0700 "Walter" <walter digitalmars.com>

 The tradeoff is to get faster loops on arrays at the expense of slower
 resizing. The only time the resize speed is really a problem are things
 like:
     for (i = 0; i < 1000; i++)
         a ~= "hello";
 and I suggest that such uses aren't that common.


 of identifiers (e.g. Win32 headers =)), dynamic arrays grow very large,

 every
 new declared identifier requires reallocation... I noticed it when I ran

 PAS2D
 converter. It is _very_ noticeable.

I'm not sure what your program is doing, but if it is a symbol table, using an associative array will get you much faster results.
 The same will happen in a game if one
 decides
 to use a dynamic array to store game entities - due to constant creation

 objects (projectiles, particles, blood etc).

 This also happens when somebody
 reads a string character by character, which is quite a common operation.

What I do for those cases is read the entire file into one array, and then map slices onto it. The wc is an example of that (also of the associative array).
Aug 29 2002
prev sibling parent reply "Juan Carlos Arevalo Baeza" <jcab JCABs-Rumblings.com> writes:
"Walter" <walter digitalmars.com> wrote in message
news:akgeie$2qh4$1 digitaldaemon.com...

 That makes a dynamic array a 3 word quantity instead of a 2 word one.
 There's a significant penalty for that.


I use them all over the place <g>. As a technical aside, being two words means they can be implemented using register pairs, which should make them pretty efficient. Three words means they don't get enregistered.

You can always store the size _and_ capacity as part of the allocated block of memory. It would mean an extra level of indirection for some operations, but the dynamic array would occupy one register, instead of two. It's somewhat of a tradeoff, but it sounds reasonable to me: extra indirection (optimizable in many cases), smaller register footprint and better reallocation in the worst case. Salutaciones, JCAB
Aug 27 2002
parent reply "Walter" <walter digitalmars.com> writes:
"Juan Carlos Arevalo Baeza" <jcab JCABs-Rumblings.com> wrote in message
news:akgq9d$8c8$1 digitaldaemon.com...
 "Walter" <walter digitalmars.com> wrote in message
 news:akgeie$2qh4$1 digitaldaemon.com...
 That makes a dynamic array a 3 word quantity instead of a 2 word




 There's a significant penalty for that.


means they can be implemented using register pairs, which should make


 pretty efficient. Three words means they don't get enregistered.

block of memory. It would mean an extra level of indirection for some operations, but the dynamic array would occupy one register, instead of

    It's somewhat of a tradeoff, but it sounds reasonable to me: extra
 indirection (optimizable in many cases), smaller register footprint and
 better reallocation in the worst case.

That would require a memory allocation (for the 3 word quantity) for every dynamic array. I suspect it would be a big hit on performance.
Aug 29 2002
next sibling parent Pavel Minayev <evilone omen.ru> writes:
"Walter" <walter digitalmars.com> wrote in
news:aklmo8$pm3$1 digitaldaemon.com: 

 That would require a memory allocation (for the 3 word quantity) for
 every dynamic array. I suspect it would be a big hit on performance.

Don't suspect. Just try it. A simple D program, written both using current implementation of dynamic arrays, and then another that emulates the proposal.
Aug 29 2002
prev sibling parent reply "Juan Carlos Arevalo Baeza" <jcab JCABs-Rumblings.com> writes:
"Walter" <walter digitalmars.com> wrote in message
news:aklmo8$pm3$1 digitaldaemon.com...

    You can always store the size _and_ capacity as part of the allocated
 block of memory. It would mean an extra level of indirection for some
 operations, but the dynamic array would occupy one register, instead of

    It's somewhat of a tradeoff, but it sounds reasonable to me: extra
 indirection (optimizable in many cases), smaller register footprint and
 better reallocation in the worst case.

That would require a memory allocation (for the 3 word quantity) for every dynamic array. I suspect it would be a big hit on performance.

No. Just allocate it together (in the same block of memory as) the array elements themselves. No performance hit besides the extra indirection and any calculations neccessary to preserve alignment. It's a pretty common thing to do. All implementations of std::string that I know (except one - flex_string by Alexandrescu, which allocates it separately) of use this to store the string's reference counter, for example. So, for example, an array of 64-bit doubles, if it allocates storage for 4 elements, it allocates 4*8 (elements) + 4 (size) + 4 (capacity) = 40 bytes. Simple and easy. Salutaciones, JCAB
Aug 29 2002
next sibling parent reply "Walter" <walter digitalmars.com> writes:
"Juan Carlos Arevalo Baeza" <jcab JCABs-Rumblings.com> wrote in message
news:akm8rv$1e9n$1 digitaldaemon.com...
 "Walter" <walter digitalmars.com> wrote in message
 news:aklmo8$pm3$1 digitaldaemon.com...

    You can always store the size _and_ capacity as part of the



 block of memory. It would mean an extra level of indirection for some
 operations, but the dynamic array would occupy one register, instead



 two.
    It's somewhat of a tradeoff, but it sounds reasonable to me: extra
 indirection (optimizable in many cases), smaller register footprint



 better reallocation in the worst case.

That would require a memory allocation (for the 3 word quantity) for


 dynamic array. I suspect it would be a big hit on performance.

No. Just allocate it together (in the same block of memory as) the

 elements themselves. No performance hit besides the extra indirection and
 any calculations neccessary to preserve alignment.

    It's a pretty common thing to do. All implementations of std::string

 I know (except one - flex_string by Alexandrescu, which allocates it
 separately) of use this to store the string's reference counter, for
 example.

    So, for example, an array of 64-bit doubles, if it allocates storage

 4 elements, it allocates 4*8 (elements) + 4 (size) + 4 (capacity) = 40
 bytes. Simple and easy.

Allocating it together means that slices (as currently done) cannot work. A slice would require copying the array contents.
Aug 29 2002
parent reply "Sean L. Palmer" <seanpalmer earthlink.net> writes:
A slice is just a pointer and size register pair, right?  So just mention in
the spec that if you resize a dynamic array, all slices of it become
invalid.

Or maybe resizing a dynamic array that has had slices made of it, requires
it to be copied (leaving the old array untouched, until GC reclaims it)

Sean

"Walter" <walter digitalmars.com> wrote in message
news:akmeg2$1k62$1 digitaldaemon.com...
 Allocating it together means that slices (as currently done) cannot work.

 slice would require copying the array contents.

Aug 30 2002
parent reply "Walter" <walter digitalmars.com> writes:
"Sean L. Palmer" <seanpalmer earthlink.net> wrote in message
news:aknd4g$2kgn$1 digitaldaemon.com...
 A slice is just a pointer and size register pair, right?

Yes.
 So just mention in
 the spec that if you resize a dynamic array, all slices of it become
 invalid.

That's the best idea yet. It might actually work <g>. The downside, of course, is it can lead to subtle and disastrous programming bugs.
 Or maybe resizing a dynamic array that has had slices made of it, requires
 it to be copied (leaving the old array untouched, until GC reclaims it)

That turns every slice into a function call with some arbitrary code being executed.
Aug 30 2002
parent reply "Sean L. Palmer" <seanpalmer earthlink.net> writes:
"Walter" <walter digitalmars.com> wrote in message
news:akpd4j$231a$1 digitaldaemon.com...
 "Sean L. Palmer" <seanpalmer earthlink.net> wrote in message
 news:aknd4g$2kgn$1 digitaldaemon.com...
 A slice is just a pointer and size register pair, right?

Yes.
 So just mention in
 the spec that if you resize a dynamic array, all slices of it become
 invalid.

That's the best idea yet. It might actually work <g>. The downside, of course, is it can lead to subtle and disastrous programming bugs.

Right. It would be a good source of bugs. Just like iterators in STL. ;)
 Or maybe resizing a dynamic array that has had slices made of it,


 it to be copied (leaving the old array untouched, until GC reclaims it)

That turns every slice into a function call with some arbitrary code being executed.

No, that means every time you slice an array you have to set a flag bit somewhere in the array so that when the array is resized, it will know it must be copied somewhere else safe, leaving the existing slices unharmed, and of course clear the bit in the new copy. 99% of the work happens during darray resize, which is going to be slow anyway. Maybe it's a bad idea... that setting of a bit per slice operation may add up to be noticeable. Sean
Aug 31 2002
parent reply "Walter" <walter digitalmars.com> writes:
"Sean L. Palmer" <seanpalmer earthlink.net> wrote in message
news:akptsn$2l4q$1 digitaldaemon.com...
 No, that means every time you slice an array you have to set a flag bit
 somewhere in the array so that when the array is resized, it will know it
 must be copied somewhere else safe, leaving the existing slices unharmed,
 and of course clear the bit in the new copy.  99% of the work happens

 darray resize, which is going to be slow anyway.

 Maybe it's a bad idea... that setting of a bit per slice operation may add
 up to be noticeable.

This would require not only setting a bit per slice, but setting that bit every time you make a copy of a reference to an array. Issues similar to reference counting come up.
Aug 31 2002
parent reply "anderson" <anderson firestar.com.au> writes:
Currently, how do you resize slices?

Just another idea to add to the list.
Parhaps an array could be taged as either having extra elements or not using
a single bit. Arrays that do not have this bit set would be treated
normally, until they are resized. When an array is resized the bit is set
and it is oversized my a predefinded step. When the gc runs it could also
perform an action to reclaim memory, by simply removing the bit.

ie
For example (jumps of 10)
Length = 8 size  = 8 items  (with out bit set)
Length = 8 size = 10 items  (with bit set)
Length = 12 size  = 12 items  (with out bit set)
Length = 12 size = 20 items  (with bit set)

This way there would be only one extra bit needed per array.

If you have memory to spare a frame count could be added to the array (ie a
byte), so that array items could have slighly longer life times.

Still theres problems with slices.

I think slices should be dealt with like this at compile time as a constant
array.

const int[] Slice;
const int[] Slice2;
int [] something;
...

//Assign slice
Slice = Something[1..10];

//As a parmitor
function(const int[] aSlice)
{

}

//Implicit conversion
Something = Slice;

//Implicit conversion
Slice = Something;

//Error
Slice = Slice ~ Somthing;

//Impicit conversion
Something = Slice ~ Something;

//Fine
Slice = Slice2;

"Walter" <walter digitalmars.com> wrote in message
news:akr27q$143p$1 digitaldaemon.com...
 "Sean L. Palmer" <seanpalmer earthlink.net> wrote in message
 news:akptsn$2l4q$1 digitaldaemon.com...
 No, that means every time you slice an array you have to set a flag bit
 somewhere in the array so that when the array is resized, it will know


 must be copied somewhere else safe, leaving the existing slices


 and of course clear the bit in the new copy.  99% of the work happens

 darray resize, which is going to be slow anyway.

 Maybe it's a bad idea... that setting of a bit per slice operation may


 up to be noticeable.

This would require not only setting a bit per slice, but setting that bit every time you make a copy of a reference to an array. Issues similar to reference counting come up.

Sep 01 2002
parent reply "Walter" <walter digitalmars.com> writes:
"anderson" <anderson firestar.com.au> wrote in message
news:aktj5r$t5d$1 digitaldaemon.com...
 Currently, how do you resize slices?

allocate a new size and copy.
 Just another idea to add to the list.
 Parhaps an array could be taged as either having extra elements or not

 a single bit. Arrays that do not have this bit set would be treated
 normally, until they are resized. When an array is resized the bit is set
 and it is oversized my a predefinded step. When the gc runs it could also
 perform an action to reclaim memory, by simply removing the bit.

 ie
 For example (jumps of 10)
 Length = 8 size  = 8 items  (with out bit set)
 Length = 8 size = 10 items  (with bit set)
 Length = 12 size  = 12 items  (with out bit set)
 Length = 12 size = 20 items  (with bit set)

 This way there would be only one extra bit needed per array.

 If you have memory to spare a frame count could be added to the array (ie

 byte), so that array items could have slighly longer life times.

 Still theres problems with slices.

 I think slices should be dealt with like this at compile time as a

 array.

 const int[] Slice;
 const int[] Slice2;
 int [] something;
 ...

 //Assign slice
 Slice = Something[1..10];

 //As a parmitor
 function(const int[] aSlice)
 {

 }

 //Implicit conversion
 Something = Slice;

 //Implicit conversion
 Slice = Something;

 //Error
 Slice = Slice ~ Somthing;

 //Impicit conversion
 Something = Slice ~ Something;

 //Fine
 Slice = Slice2;

Setting a bit means that accessing an array means masking off the bit. There can be a significant penalty for loops doing this.
Sep 01 2002
parent reply "anderson" <anderson firestar.com.au> writes:
"Walter" <walter digitalmars.com> wrote in message
news:aku2uu$1e8m$1 digitaldaemon.com...
 "anderson" <anderson firestar.com.au> wrote in message
 news:aktj5r$t5d$1 digitaldaemon.com...
 Currently, how do you resize slices?

allocate a new size and copy.
 Just another idea to add to the list.
 Parhaps an array could be taged as either having extra elements or not

 a single bit. Arrays that do not have this bit set would be treated
 normally, until they are resized. When an array is resized the bit is


 and it is oversized my a predefinded step. When the gc runs it could


 perform an action to reclaim memory, by simply removing the bit.

 ie
 For example (jumps of 10)
 Length = 8 size  = 8 items  (with out bit set)
 Length = 8 size = 10 items  (with bit set)
 Length = 12 size  = 12 items  (with out bit set)
 Length = 12 size = 20 items  (with bit set)

 This way there would be only one extra bit needed per array.

 If you have memory to spare a frame count could be added to the array


 a
 byte), so that array items could have slighly longer life times.

 Still theres problems with slices.

 I think slices should be dealt with like this at compile time as a

 array.

 const int[] Slice;
 const int[] Slice2;
 int [] something;
 ...

 //Assign slice
 Slice = Something[1..10];

 //As a parmitor
 function(const int[] aSlice)
 {

 }

 //Implicit conversion
 Something = Slice;

 //Implicit conversion
 Slice = Something;

 //Error
 Slice = Slice ~ Somthing;

 //Impicit conversion
 Something = Slice ~ Something;

 //Fine
 Slice = Slice2;

Setting a bit means that accessing an array means masking off the bit.

 can be a significant penalty for loops doing this.

Howable using an entire byte for a lifetime value instead.
Sep 02 2002
parent "Walter" <walter digitalmars.com> writes:
"anderson" <anderson firestar.com.au> wrote in message
news:akvere$2li$1 digitaldaemon.com...
 Howable using an entire byte for a lifetime value instead.

Right now, a reference to a dynamic array is a 2 word quantity. It fits nicely into the code generator which deals efficently with register pairs. Increasing the size of the reference will make for much less efficient code to handle the references.
Sep 02 2002
prev sibling parent reply "chris jones" <flak clara.co.uk> writes:
"Juan Carlos Arevalo Baeza" <jcab JCABs-Rumblings.com> wrote in message
news:akm8rv$1e9n$1 digitaldaemon.com...
 "Walter" <walter digitalmars.com> wrote in message
 news:aklmo8$pm3$1 digitaldaemon.com...

    You can always store the size _and_ capacity as part of the



 block of memory. It would mean an extra level of indirection for some
 operations, but the dynamic array would occupy one register, instead



 two.
    It's somewhat of a tradeoff, but it sounds reasonable to me: extra
 indirection (optimizable in many cases), smaller register footprint



 better reallocation in the worst case.

That would require a memory allocation (for the 3 word quantity) for


 dynamic array. I suspect it would be a big hit on performance.

No. Just allocate it together (in the same block of memory as) the

 elements themselves. No performance hit besides the extra indirection and
 any calculations neccessary to preserve alignment.

    It's a pretty common thing to do. All implementations of std::string

 I know (except one - flex_string by Alexandrescu, which allocates it
 separately) of use this to store the string's reference counter, for
 example.

Thats how Delphi does it, a dynamic array is a pointer, the first 2 words of the memory block are the ref count and the number of elements . The pointer points to the first element of the array so the ref cound is at -8 and lenght is at -4. Also in delhpi strings are arrays with copy on write but the neat thing Delphi does is that it automaticly maintans a trailing null char at the end. So to cast a delphi string to a C string you just point at the first char, zero overhead. And because the lenght is stored you dont have to scan the string to determine its length. chris
Sep 11 2002
parent "Walter" <walter digitalmars.com> writes:
"chris jones" <flak clara.co.uk> wrote in message
news:alo02r$njr$2 digitaldaemon.com...
 Thats how Delphi does it, a dynamic array is a pointer, the first 2 words

 the memory block are the ref count and the number of elements . The

 points to the first element of the array so the ref cound is at -8 and
 lenght is at -4.

That will work, but it's slower than D's implementation. It also makes slices not possible without copying.
 Also in delhpi strings are arrays with copy on write but the neat thing
 Delphi does is that it automaticly maintans a trailing null char at the

 So to cast a delphi string to a C string you just point at the first char,
 zero overhead. And because the lenght is stored you dont have to scan the
 string to determine its length.

D in general tries to put 0 bytes at the ends of character strings for the same reason, but it is not possible for slices.
Sep 11 2002
prev sibling parent reply Roland <rv ronetech.com> writes:
Walter a écrit :

 for (i = 0; i < a.length; i++)

 becomes:

    for (i = 0; i < (a.length & 7FFFFFFF); i++)

 which can have a significant speed penalty.

it seems to me not much if end_index (a.length & 7FFFFFFF) is evaluated only once. if end_index need to be evaluated at each iteration of the loop, the loop can be rewritten like this: for (i=0 | (a.lenght & (0x1000000)); i < a.length; i++) roland
Sep 02 2002
parent "Walter" <walter digitalmars.com> writes:
"Roland" <rv ronetech.com> wrote in message
news:3D73B504.4DE3E1C9 ronetech.com...
 Walter a écrit :
 for (i = 0; i < a.length; i++)
 becomes:
    for (i = 0; i < (a.length & 7FFFFFFF); i++)
 which can have a significant speed penalty.


 once.
 if end_index need to be evaluated at each iteration of the loop, the loop

 be rewritten like this:

 for (i=0  |  (a.lenght & (0x1000000)); i < a.length; i++)

True, but proving that the length of the array does not change within the loop, directly or indirectly, is non-trivial.
Sep 02 2002
prev sibling parent reply Patrick Down <pat codemoon.com> writes:
"Walter" <walter digitalmars.com> wrote in
news:ak9ime$bdv$2 digitaldaemon.com: 

     char a[] = new char[100];
     char b[] = a[0..50];
     b.length = b.length + 30;

I have another question along these lines. char[] a; a.length = 1000000; char[] b; b = a[9000..9010]; char[] c; a = c; Now except for 9000..9010 the rest of the 1000000 byte array is unreferenced. Does b hold the entire original memory block from being collected or is there some gc magic that will reclaim the unreferenced parts?
Aug 25 2002
parent reply "Walter" <walter digitalmars.com> writes:
"Patrick Down" <pat codemoon.com> wrote in message
news:Xns927562C96AF5Dpatcodemooncom 63.105.9.61...
 "Walter" <walter digitalmars.com> wrote in
 news:ak9ime$bdv$2 digitaldaemon.com:

     char a[] = new char[100];
     char b[] = a[0..50];
     b.length = b.length + 30;

I have another question along these lines. char[] a; a.length = 1000000; char[] b; b = a[9000..9010]; char[] c; a = c; Now except for 9000..9010 the rest of the 1000000 byte array is unreferenced. Does b hold the entire original memory block from being collected or is there some gc magic that will reclaim the unreferenced parts?

The entire block will be held.
Aug 25 2002
parent Patrick Down <pat codemoon.com> writes:
"Walter" <walter digitalmars.com> wrote in news:akb9vs$28kh$2
 digitaldaemon.com: 

 The entire block will be held.

That's what I thought. This should probably be in the "Things to be careful about" section of "Tips and Tricks of the D programming gurus".
Aug 25 2002