D - resizing arrays

Walter (21/21) Sep 06 2002 Everyone has pointed out that the current implementation of array resizi...

anderson (5/9) Sep 06 2002 resize

Walter (4/14) Sep 06 2002 Yes, I think it was you!

Burton Radons (20/41) Sep 06 2002 I've done a mano a mano comparison on endemic access of the length

Walter (78/83) Sep 06 2002 Not exactly, I said it was hard for the optimizer to prove that the leng...

Sean L. Palmer (13/96) Sep 08 2002 That's what I think sucks about C and C++; the language assumes the wor...

Walter (8/14) Sep 08 2002 worst.

Pavel Minayev (2/5) Sep 07 2002 My vote goes here!

Robert W. Cunningham (19/24) Sep 08 2002 Perhaps arrays should have a "been sliced" bit, so that the .dup on resi...

Russell Lewis (12/41) Sep 09 2002 I've pondered what the semantics of a "copy on write" array would be.

Walter (4/15) Sep 09 2002 Copy-on-write for arrays would be really cool, but I think it would need

Derek Parnell (7/27) Sep 09 2002 The RDS implementation of the Euphoria language does this implictly. An ...

Walter (6/9) Sep 09 2002 array is copied as a

Pavel Minayev (3/5) Sep 09 2002 Strings in Delphi use copy-on-write without any special hardware. =)

Walter (6/11) Sep 09 2002 What's the performance like on:

Sean L. Palmer (23/36) Sep 10 2002 A writable array could be a subtly different type than a nonwritable one...

Walter (5/11) Sep 10 2002 compiler

Pavel Minayev (8/14) Sep 10 2002 Nope. It uses reference counting, and does a copy only if a single

Walter (3/12) Sep 10 2002 Ok, I see. That is the right semantics if you're doing copy on write.

Derek Parnell (37/55) Sep 10 2002 Couldn't resist the challenge :)

Walter (2/2) Sep 12 2002 Ok, it doesn't copy the array 1000 times. But at least it needs to check

Patrick Down (7/16) Sep 09 2002 So let me see if I have this right. If slice size should not be

Walter (4/9) Sep 09 2002 The compiler doesn't know if any random dynamic array is a slice or an

Sean L. Palmer (25/34) Sep 10 2002 Maybe it should. Which reference has the power to resize an array? The

Walter (15/51) Sep 10 2002 There's another problem:

Pavel Minayev (2/8) Sep 10 2002 a should be copied here, so b and a would become two different arrays.

Patrick Down (6/16) Sep 09 2002 Ok the problem stated is:

Walter (3/4) Sep 10 2002 It can be resized in place, rather than by copying.

Patrick Down (6/12) Sep 10 2002 In order to resize an array in place you need to

Walter (4/8) Sep 10 2002 Yes. The garbage collector can figure it out. See the file

Patrick Down (3/8) Sep 11 2002 Thanks, that was the missing part of the puzzle for me.
Pavel Minayev (3/5) Sep 11 2002 Why not make it a read-only array property? .capacity or something?

"Walter" <walter digitalmars.com> writes:

Everyone has pointed out that the current implementation of array resizing
is too slow to be useful - the always copy semantics just aren't good
enough. Various remedies were proposed, most of which added bloat or slowed
down loops. The reason for the copy semantics to begin with were to deal
with the cases of resizing a slize out of a larger array.

So, perhaps this can be fixed by just a specification change - don't resize
slices. While this may seem wrong, it was already the case. For example:

char[] a;
char[] b = a[0..3];

Now, let's resize b. Do a[0..3] and b[0..3] still point to the same memory?
Maybe yes, maybe no, depending on the implementation.

If we write the rule as: If any slices are made of an array, and either the
original array or any of the slices are resized, the resulting relationship
of the resized array with the original layout is implementation defined.

What that means is, if you're going to resize an array and there are slices,
make a copy of the array with .dup before resizing it, so that there are no
slices on it.

(Note: since resizing does not free memory, there isn't a problem with
dangling references to free'd memory. There is also no issue with a possible
interior pointers only to an array, the gc is designed to regard interior
pointers as valid references to an object.)

Sep 06 2002

"anderson" <anderson firestar.com.au> writes:

"Walter" <walter digitalmars.com> wrote in message
news:alb8qn$1pea$1 digitaldaemon.com...
 So, perhaps this can be fixed by just a specification change - don't

resize
 slices. While this may seem wrong, it was already the case. For example:

 char[] a;
 char[] b = a[0..3];

I suggested, something simular to this before, and I like the idea. One vote
from me.

Sep 06 2002

"Walter" <walter digitalmars.com> writes:

Yes, I think it was you!

"anderson" <anderson firestar.com.au> wrote in message
news:albudv$tgr$1 digitaldaemon.com...
 "Walter" <walter digitalmars.com> wrote in message
 news:alb8qn$1pea$1 digitaldaemon.com...
 So, perhaps this can be fixed by just a specification change - don't

 resize
 slices. While this may seem wrong, it was already the case. For example:

 char[] a;
 char[] b = a[0..3];

 I suggested, something simular to this before, and I like the idea. One

vote
 from me.

Sep 06 2002

Burton Radons <loth users.sourceforge.net> writes:

Walter wrote:

 Everyone has pointed out that the current implementation of array resizing
 is too slow to be useful - the always copy semantics just aren't good
 enough. Various remedies were proposed, most of which added bloat or slowed
 down loops. The reason for the copy semantics to begin with were to deal
 with the cases of resizing a slize out of a larger array.


I've done a mano a mano comparison on endemic access of the length 
property.  Using my code generator, there's a large amount of 
variability for both sides.  Converting the code to C with GCC, it's a 
simple "I don't give a damn".  I can only force a difference if I make 
the length volatile, which is a silly handycap.  Length can be optimised 
same as any other struct, only moreso.

You claim that length is dirtied too often to be viably sequestered away 
by the optimiser.  No it isn't.  Passing an array _pointer_ is rare, 
passing as an _out_ parameter is rare.  Passing an array itself is 
common, but if the array is resized in there then it obviously won't 
alter the array the loop is dealing with and the code generator doesn't 
need to deal with it.  It is just not common for the array being looped 
on to be dirtied; good optimisation material.

To put it another way, if your code generator pretends to optimise but 
creates significantly worse code using the ownership bit than without 
outside of fixed, unrealistic tests, it's broken.


 So, perhaps this can be fixed by just a specification change - don't resize
 slices. While this may seem wrong, it was already the case. For example:
 
 char[] a;
 char[] b = a[0..3];
 
 Now, let's resize b. Do a[0..3] and b[0..3] still point to the same memory?
 Maybe yes, maybe no, depending on the implementation.
 
 If we write the rule as: If any slices are made of an array, and either the
 original array or any of the slices are resized, the resulting relationship
 of the resized array with the original layout is implementation defined.
 
 What that means is, if you're going to resize an array and there are slices,
 make a copy of the array with .dup before resizing it, so that there are no
 slices on it.


I don't understand how this helps your situation.  What do you intend to 
do with the change?  For that matter, could you spell out what this does 
change?  The standard is fuzzy here.

Sep 06 2002

"Walter" <walter digitalmars.com> writes:

"Burton Radons" <loth users.sourceforge.net> wrote in message
news:3D799369.5010305 users.sourceforge.net...
 You claim that length is dirtied too often to be viably sequestered away
 by the optimiser.

Not exactly, I said it was hard for the optimizer to prove that the length
has not changed within the loop. Pulling out loop invariants is standard for
advanced optimizers, but it isn't easy to write them.

 To put it another way, if your code generator pretends to optimise but
 creates significantly worse code using the ownership bit than without
 outside of fixed, unrealistic tests, it's broken.

Check this out. Here's some C code:
------------------------------------------
struct Array
{
    int length;
    char *ptr;
};

int loop(struct Array a)
{
    int i;
    int sum = 0;

    for (i = 0; i < (a.length & 0x7FFFFFFF); i++)
       sum += a.ptr[i];
    return sum;
}
-------------------------------------------
Compiled without optimization:

_loop:
  enter 8,0
  push EBX
  push ESI
  xor EAX,EAX
  mov -4[EBP],EAX
  mov -8[EBP],EAX
LE:  mov ECX,8[EBP]
  and ECX,07FFFFFFFh
  cmp ECX,-8[EBP]
  jle L2E
  mov EBX,0Ch[EBP]
  mov ESI,-8[EBP]
  movsx EDX,[ESI][EBX]
  add -4[EBP],EDX
  inc dword ptr -8[EBP]
  jmp short LE
L2E:  mov EAX,-4[EBP]
  pop ESI
  pop EBX
  leave
  ret
----------------------------------
Compiled with optimization:

_loop:
  push EAX
  mov EAX,8[ESP]
  xor ECX,ECX
  push EBX
  xor EDX,EDX
  and EAX,07FFFFFFFh
  push ESI
  cmp EAX,ECX
  push EDI
  jle L26
  mov EBX,EAX
L17:  mov EDI,018h[ESP]
  movsx ESI,[EDI][EDX]
  add ECX,ESI
  inc EDX
  cmp EBX,EDX
  jg L17
L26:  pop EDI
  mov EAX,ECX
  pop ESI
  pop EBX
  pop ECX
  ret
-----------------------------
So, you see, it does pull the mask out of the loop. The trouble is, it took
me years to get all that to work right without bugs - and it took years for
other C compiler vendors to catch up with me and implement the equivalent.

The next problem is if the dynamic array is actually a reference, such as if
it's inside a class definition, it becomes almost impossible to prove the
.length is loop invariant. Every assignment through a pointer and every
function call must be assumed to invalidate all referenced data.

Sep 06 2002

"Sean L. Palmer" <seanpalmer earthlink.net> writes:

That's what I think sucks about C and C++;  the language assumes the worst.
I'd rather it be more restrictive about what you can do and lean toward
allowing the optimizer more flexibility.  Gently tell the programmer not to
write wicked tricky crap.  ;)   Or if they persist, insist on them saying
exactly what they're doing explicitly so the optimizer can tell exactly
what's going on.

Sean

"Walter" <walter digitalmars.com> wrote in message
news:alcb1f$255r$1 digitaldaemon.com...
 "Burton Radons" <loth users.sourceforge.net> wrote in message
 news:3D799369.5010305 users.sourceforge.net...
 You claim that length is dirtied too often to be viably sequestered away
 by the optimiser.

 Not exactly, I said it was hard for the optimizer to prove that the length
 has not changed within the loop. Pulling out loop invariants is standard

for
 advanced optimizers, but it isn't easy to write them.

 To put it another way, if your code generator pretends to optimise but
 creates significantly worse code using the ownership bit than without
 outside of fixed, unrealistic tests, it's broken.

 Check this out. Here's some C code:
 ------------------------------------------
 struct Array
 {
     int length;
     char *ptr;
 };

 int loop(struct Array a)
 {
     int i;
     int sum = 0;

     for (i = 0; i < (a.length & 0x7FFFFFFF); i++)
        sum += a.ptr[i];
     return sum;
 }
 -------------------------------------------
 Compiled without optimization:

 _loop:
   enter 8,0
   push EBX
   push ESI
   xor EAX,EAX
   mov -4[EBP],EAX
   mov -8[EBP],EAX
 LE:  mov ECX,8[EBP]
   and ECX,07FFFFFFFh
   cmp ECX,-8[EBP]
   jle L2E
   mov EBX,0Ch[EBP]
   mov ESI,-8[EBP]
   movsx EDX,[ESI][EBX]
   add -4[EBP],EDX
   inc dword ptr -8[EBP]
   jmp short LE
 L2E:  mov EAX,-4[EBP]
   pop ESI
   pop EBX
   leave
   ret
 ----------------------------------
 Compiled with optimization:

 _loop:
   push EAX
   mov EAX,8[ESP]
   xor ECX,ECX
   push EBX
   xor EDX,EDX
   and EAX,07FFFFFFFh
   push ESI
   cmp EAX,ECX
   push EDI
   jle L26
   mov EBX,EAX
 L17:  mov EDI,018h[ESP]
   movsx ESI,[EDI][EDX]
   add ECX,ESI
   inc EDX
   cmp EBX,EDX
   jg L17
 L26:  pop EDI
   mov EAX,ECX
   pop ESI
   pop EBX
   pop ECX
   ret
 -----------------------------
 So, you see, it does pull the mask out of the loop. The trouble is, it

took
 me years to get all that to work right without bugs - and it took years

for
 other C compiler vendors to catch up with me and implement the equivalent.

 The next problem is if the dynamic array is actually a reference, such as

if
 it's inside a class definition, it becomes almost impossible to prove the
 .length is loop invariant. Every assignment through a pointer and every
 function call must be assumed to invalidate all referenced data.

Sep 08 2002

"Walter" <walter digitalmars.com> writes:

"Sean L. Palmer" <seanpalmer earthlink.net> wrote in message
news:algcfk$2khe$1 digitaldaemon.com...
 That's what I think sucks about C and C++;  the language assumes the

worst.
 I'd rather it be more restrictive about what you can do and lean toward
 allowing the optimizer more flexibility.  Gently tell the programmer not

to
 write wicked tricky crap.  ;)   Or if they persist, insist on them saying
 exactly what they're doing explicitly so the optimizer can tell exactly
 what's going on.

Yeah, I've learned that through bitter experience trying to optimize C.
You've always got to assume the worst. You can't even make use of const as
being constant (one of the reasons I dropped const as a modifier in D, it
looks good but is quite useless. The same goes for volatile.)

Sep 08 2002

Pavel Minayev <evilone omen.ru> writes:

Walter wrote:

 What that means is, if you're going to resize an array and there are slices,
 make a copy of the array with .dup before resizing it, so that there are no
 slices on it.

My vote goes here!

Sep 07 2002

"Robert W. Cunningham" <FlyPG users.sourceforge.net> writes:

Pavel Minayev wrote:

 Walter wrote:

 What that means is, if you're going to resize an array and there are slices,
 make a copy of the array with .dup before resizing it, so that there are no
 slices on it.

 My vote goes here!

Perhaps arrays should have a "been sliced" bit, so that the .dup on resizing a
sliced array could be implicit.  It would also be good if compile-time and
run-time tests could detect and remark upon attempts to resize sliced arrays
then
go ahead and do an automatic .dup followed by the resize.

It seems to be the flip side of GC, where instead of reclaiming unused memory, D
would allocate extra memory for you if you forget to do so manually.  Just as
run-time array index checking can be disabled in D, so should the warning issued
with an attempt to resize a sliced array.

For me, the main use would be to ensure I craft lean&mean code, since I intend
to
use D in embedded systems (where the GC would largely be idle).  However, if I
have a bug in my system where I do attempt to resize a sliced array, I would
still want the operation to succeed in the safest way possible, even if it means
allocating extra memory.

If anything more sophisticated than a "been sliced" bit is needed, then the data
shouldn't be in a "naked" array, but instead should be a properly designed
object
with an appropriate set of methods to completely control allocation, use, and
reallocation.


-BobC

Sep 08 2002

Russell Lewis <spamhole-2001-07-16 deming-os.org> writes:

Walter wrote:
 Everyone has pointed out that the current implementation of array resizing
 is too slow to be useful - the always copy semantics just aren't good
 enough. Various remedies were proposed, most of which added bloat or slowed
 down loops. The reason for the copy semantics to begin with were to deal
 with the cases of resizing a slize out of a larger array.
 
 So, perhaps this can be fixed by just a specification change - don't resize
 slices. While this may seem wrong, it was already the case. For example:
 
 char[] a;
 char[] b = a[0..3];
 
 Now, let's resize b. Do a[0..3] and b[0..3] still point to the same memory?
 Maybe yes, maybe no, depending on the implementation.
 
 If we write the rule as: If any slices are made of an array, and either the
 original array or any of the slices are resized, the resulting relationship
 of the resized array with the original layout is implementation defined.
 
 What that means is, if you're going to resize an array and there are slices,
 make a copy of the array with .dup before resizing it, so that there are no
 slices on it.
 
 (Note: since resizing does not free memory, there isn't a problem with
 dangling references to free'd memory. There is also no issue with a possible
 interior pointers only to an array, the gc is designed to regard interior
 pointers as valid references to an object.)
 
 

I've pondered what the semantics of a "copy on write" array would be. 
COW is a virtual memory device where two virtual pages can point to the 
same physical page as long as neither is modified.  This is very common 
after fork()ing a process...both processes have identical page tables 
until one or the other writes.  The write fails (page fault) and the 
memory manager makes a second copy of the page in question, which is 
writable, and then re-enables the process.

It would be cool to have an array type such that you would be 
referencing the original array until you tried to modify it, either by 
resizing it or changing one of the elements.  I don't know what the 
syntax or usage would look like, though.

Sep 09 2002

"Walter" <walter digitalmars.com> writes:

"Russell Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message
news:3D7CE643.4080108 deming-os.org...
 I've pondered what the semantics of a "copy on write" array would be.
 COW is a virtual memory device where two virtual pages can point to the
 same physical page as long as neither is modified.  This is very common
 after fork()ing a process...both processes have identical page tables
 until one or the other writes.  The write fails (page fault) and the
 memory manager makes a second copy of the page in question, which is
 writable, and then re-enables the process.

 It would be cool to have an array type such that you would be
 referencing the original array until you tried to modify it, either by
 resizing it or changing one of the elements.  I don't know what the
 syntax or usage would look like, though.

Copy-on-write for arrays would be really cool, but I think it would need
hardware support.

Sep 09 2002

Derek Parnell <ddparnell bigpond.com> writes:

On Mon, 9 Sep 2002 12:57:05 -0700, "Walter" <walter digitalmars.com> wrote:
 
 "Russell Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message
 news:3D7CE643.4080108 deming-os.org...
 I've pondered what the semantics of a "copy on write" array would be.
 COW is a virtual memory device where two virtual pages can point to the
 same physical page as long as neither is modified.  This is very common
 after fork()ing a process...both processes have identical page tables
 until one or the other writes.  The write fails (page fault) and the
 memory manager makes a second copy of the page in question, which is
 writable, and then re-enables the process.

 It would be cool to have an array type such that you would be
 referencing the original array until you tried to modify it, either by
 resizing it or changing one of the elements.  I don't know what the
 syntax or usage would look like, though.

 
 Copy-on-write for arrays would be really cool, but I think it would need
 hardware support.
 
 

The RDS implementation of the Euphoria language does this implictly. An array
is copied as a 
reference until it is modified, at which time a physical clone is made. It
makes for very speedy 
array passing in parameters and for temporary intermediate arrays.

---------
Cheers,
Derek Parnell

Sep 09 2002

"Walter" <walter digitalmars.com> writes:

"Derek Parnell" <ddparnell bigpond.com> wrote in message
news:1103_1031628855 news.digitalmars.com...
 The RDS implementation of the Euphoria language does this implictly. An

array is copied as a
 reference until it is modified, at which time a physical clone is made. It

makes for very speedy
 array passing in parameters and for temporary intermediate arrays.

Javascript also does copy-on-write. But the performance is terrible compared
with typical C. Hardware support would make it much more practical.

Sep 09 2002

Pavel Minayev <evilone omen.ru> writes:

Walter wrote:

 Copy-on-write for arrays would be really cool, but I think it would need
 hardware support.

Strings in Delphi use copy-on-write without any special hardware. =)
And strings are just arrays of characters...

Sep 09 2002

"Walter" <walter digitalmars.com> writes:

"Pavel Minayev" <evilone omen.ru> wrote in message
news:aljvp8$1drj$1 digitaldaemon.com...
 Walter wrote:

 Copy-on-write for arrays would be really cool, but I think it would need
 hardware support.

 Strings in Delphi use copy-on-write without any special hardware. =)
 And strings are just arrays of characters...

What's the performance like on:

    for (i = 0; i < 1000; i++)
        a[i] = 'c';

Would that copy the array 1000 times?

Sep 09 2002

"Sean L. Palmer" <seanpalmer earthlink.net> writes:

A writable array could be a subtly different type than a nonwritable one,
and the check could happen during the conversion between the two types.

int[1000] a;
int[] b = a; // slice or reference
// at this point a and b point to the same memory.
// Neither a nor b is resizable so long as there's more than one active
reference.
for (a[])
    a[] = 0; // fill with zeros
// copy happens here as b transitions from normal slice to writable slice
for (b[])
    b[]++; // fill with ones
// now a is an array of 1000 zeros and b is an array of 1000 ones.

Wierd syntax, but kinda cool.  Terse.  How often do you refer to the same
thing by more than one name?  And if you do, you *really* want the compiler
to manage it for you, to make sure you get it right.  I'd way rather the
language be safe than fast, and coming from me I can assure you that means
alot.  It hurts to say it but it's true.  It doesn't do anyone any good to
go 2000 MPH if it's straight toward a cliff.

Sean

"Walter" <walter digitalmars.com> wrote in message
news:alk4qv$1jnq$3 digitaldaemon.com...
 "Pavel Minayev" <evilone omen.ru> wrote in message
 news:aljvp8$1drj$1 digitaldaemon.com...
 Walter wrote:

 Copy-on-write for arrays would be really cool, but I think it would



need
 hardware support.

 Strings in Delphi use copy-on-write without any special hardware. =)
 And strings are just arrays of characters...

 What's the performance like on:

     for (i = 0; i < 1000; i++)
         a[i] = 'c';

 Would that copy the array 1000 times?

Sep 10 2002

"Walter" <walter digitalmars.com> writes:

"Sean L. Palmer" <seanpalmer earthlink.net> wrote in message
news:alk8ub$1pna$1 digitaldaemon.com...
 Wierd syntax, but kinda cool.  Terse.  How often do you refer to the same
 thing by more than one name?  And if you do, you *really* want the

compiler
 to manage it for you, to make sure you get it right.  I'd way rather the
 language be safe than fast, and coming from me I can assure you that means
 alot.  It hurts to say it but it's true.  It doesn't do anyone any good to
 go 2000 MPH if it's straight toward a cliff.

I've been thinking that maybe the answer is like array bounds checking -
additional checks that can be turned off for the release version.

Sep 10 2002

Pavel Minayev <evilone omen.ru> writes:

Walter wrote:

 What's the performance like on:
 
     for (i = 0; i < 1000; i++)
         a[i] = 'c';
 
 Would that copy the array 1000 times?

Nope. It uses reference counting, and does a copy only if a single 
string is referenced twice or more.

By the way, when I was writing my own BASIC, I first ignored 
copy-on-write completely. Later, it turned out that it was a major 
source of slowdown (mostly when string was function argument). Then I 
implemented Delphi-like reference counted copy-on-write... the result 
was about 30% speedup for most programs (and for some it was like 70%)!

Sep 10 2002

"Walter" <walter digitalmars.com> writes:

"Pavel Minayev" <evilone omen.ru> wrote in message
news:alkro8$2g4b$1 digitaldaemon.com...
 Walter wrote:
 What's the performance like on:

     for (i = 0; i < 1000; i++)
         a[i] = 'c';

 Would that copy the array 1000 times?

 Nope. It uses reference counting, and does a copy only if a single
 string is referenced twice or more.

Ok, I see. That is the right semantics if you're doing copy on write.

Sep 10 2002

Derek Parnell <ddparnell bigpond.com> writes:

On Mon, 9 Sep 2002 23:04:48 -0700, "Walter" <walter digitalmars.com> wrote:
 
 "Pavel Minayev" <evilone omen.ru> wrote in message
 news:aljvp8$1drj$1 digitaldaemon.com...
 Walter wrote:

 Copy-on-write for arrays would be really cool, but I think it would need
 hardware support.

 Strings in Delphi use copy-on-write without any special hardware. =)
 And strings are just arrays of characters...

 
 What's the performance like on:
 
     for (i = 0; i < 1000; i++)
         a[i] = 'c';
 
 Would that copy the array 1000 times?
 

Couldn't resist the challenge :)

With respect to Euphoria, it wouldn't copy the array because nothing else is
referencing it. 

I tested the speed and with a 1000 iterations of the above code I got 0.05
seconds. So I doctored it 
to force referencing, thus...
---
sequence a,b,c
atom e
-- Create a 100 element array, zero-filled.
b = repeat(0, 100)
-- start the timer.
e = time()          
-- repeat the test 1000 times 
for q = 1 to 1000 do         

-- Test begins--
    -- Deallocate an array
    a = {}
    -- Append 1000 items, thus growing the array each time.
    for i = 1 to 1000 do
        -- Each item is an array of 98 elements sliced from 'b'
        a = append(a,b[2..99])
        -- copy (reference) the new element
        c = a[i]
        -- change it's 1st element, thus forcing a physical copy.
        c[1] = 'c'
    end for  
end for           
-- stop the timer and print the duration.
printf(1, "%f\n", time()-e)      

----
This took 4.06 seconds.

When I commented out the "c[1] = 'c'" line, it took 3.08 seconds. 

Euphoria is an interpreted language, so I would expect much better times if
this strategy was used 
in D.

--------------
cheers,
Derek Parnell

Sep 10 2002

"Walter" <walter digitalmars.com> writes:

Ok, it doesn't copy the array 1000 times. But at least it needs to check
1000 times to see if it needs to copy it <g>.

Sep 12 2002

Patrick Down <pat codemoon.com> writes:

"Walter" <walter digitalmars.com> wrote in
news:alb8qn$1pea$1 digitaldaemon.com: 

 Everyone has pointed out that the current implementation of array
 resizing is too slow to be useful - the always copy semantics just
 aren't good enough. Various remedies were proposed, most of which
 added bloat or slowed down loops. The reason for the copy semantics to
 begin with were to deal with the cases of resizing a slize out of a
 larger array. 
 
 So, perhaps this can be fixed by just a specification change - don't
 resize slices. 

So let me see if I have this right. If slice size should not be 
changed then it leaves open the possibility that slice length can 
be a loop invariant. If the length is invariant then techniques
like a bit flag can be used to efficiently to distinguish between
arrays and slices.

Sep 09 2002

"Walter" <walter digitalmars.com> writes:

"Patrick Down" <pat codemoon.com> wrote in message
news:Xns9284B5AAB7F25patcodemooncom 63.105.9.61...
 So let me see if I have this right. If slice size should not be
 changed then it leaves open the possibility that slice length can
 be a loop invariant. If the length is invariant then techniques
 like a bit flag can be used to efficiently to distinguish between
 arrays and slices.

The compiler doesn't know if any random dynamic array is a slice or an
original.

Sep 09 2002

"Sean L. Palmer" <seanpalmer earthlink.net> writes:

Maybe it should.  Which reference has the power to resize an array?  The
auto reference.  The others are all slices and can't resize the array thru a
slice.

There are so many ways to get into trouble with slices as they exist now..
resize an array out from under some slices to it, resize a slice of a larger
array.  I'd rather have the language guard against this.

It would almost have to enter the type system to guarantee enforcement.
There should be some class difference between a slice and a real array.
Slices are definitely reduced functionality ... maybe dynamic array is
really a special class of slice that can be resized or modified.  Writable
fixed array is a slice with a write operation.  Dynamic array is a writable
slice that can change size as well.  Slicing an array should really obtain
some kind of lock against resizing of the original array, and the slice's
destructor releases that lock.

Isn't a lock kinda like the opposite of a reference count?  Interesting
viewpoint though... a lock will prevent the auto-destroy from deleting an
object it would otherwise collect.  If everything just incremented on
obtaining a reference, test-and-decremented when releasing it at scope exit
or before assignment, and if zero destroying it... seems simple enough at
least.  Just disallow circular references.  Impossible to emulate this kind
of thing in D without overloading the assignment operator or copy
constructor type of stuff C++ lets you do.

Sean

"Walter" <walter digitalmars.com> wrote in message
news:aljctp$8rt$1 digitaldaemon.com...
 "Patrick Down" <pat codemoon.com> wrote in message
 news:Xns9284B5AAB7F25patcodemooncom 63.105.9.61...
 So let me see if I have this right. If slice size should not be
 changed then it leaves open the possibility that slice length can
 be a loop invariant. If the length is invariant then techniques
 like a bit flag can be used to efficiently to distinguish between
 arrays and slices.

 The compiler doesn't know if any random dynamic array is a slice or an
 original.

Sep 10 2002

"Walter" <walter digitalmars.com> writes:

There's another problem:

    char[] b;
    char[] a = ...;
    b = a;
    a.length += 10;    // uh-oh, what is b[] referring to now?

(Note that this is not a memory corruption problem.)

Essentially, what you want is to guarantee that any array undergoing a
resize is the only pointer into that allocated memory block.

"Sean L. Palmer" <seanpalmer earthlink.net> wrote in message
news:alk6ro$1nbg$1 digitaldaemon.com...
 Maybe it should.  Which reference has the power to resize an array?  The
 auto reference.  The others are all slices and can't resize the array thru

a
 slice.

 There are so many ways to get into trouble with slices as they exist now..
 resize an array out from under some slices to it, resize a slice of a

larger
 array.  I'd rather have the language guard against this.

 It would almost have to enter the type system to guarantee enforcement.
 There should be some class difference between a slice and a real array.
 Slices are definitely reduced functionality ... maybe dynamic array is
 really a special class of slice that can be resized or modified.  Writable
 fixed array is a slice with a write operation.  Dynamic array is a

writable
 slice that can change size as well.  Slicing an array should really obtain
 some kind of lock against resizing of the original array, and the slice's
 destructor releases that lock.

 Isn't a lock kinda like the opposite of a reference count?  Interesting
 viewpoint though... a lock will prevent the auto-destroy from deleting an
 object it would otherwise collect.  If everything just incremented on
 obtaining a reference, test-and-decremented when releasing it at scope

exit
 or before assignment, and if zero destroying it... seems simple enough at
 least.  Just disallow circular references.  Impossible to emulate this

kind
 of thing in D without overloading the assignment operator or copy
 constructor type of stuff C++ lets you do.

 Sean

 "Walter" <walter digitalmars.com> wrote in message
 news:aljctp$8rt$1 digitaldaemon.com...
 "Patrick Down" <pat codemoon.com> wrote in message
 news:Xns9284B5AAB7F25patcodemooncom 63.105.9.61...
 So let me see if I have this right. If slice size should not be
 changed then it leaves open the possibility that slice length can
 be a loop invariant. If the length is invariant then techniques
 like a bit flag can be used to efficiently to distinguish between
 arrays and slices.

 The compiler doesn't know if any random dynamic array is a slice or an
 original.

Sep 10 2002

Pavel Minayev <evilone omen.ru> writes:

Walter wrote:

 There's another problem:
 
     char[] b;
     char[] a = ...;
     b = a;
     a.length += 10;    // uh-oh, what is b[] referring to now?

a should be copied here, so b and a would become two different arrays.

Sep 10 2002

Patrick Down <pat codemoon.com> writes:

"Walter" <walter digitalmars.com> wrote in
news:alb8qn$1pea$1 digitaldaemon.com: 

 Everyone has pointed out that the current implementation of array
 resizing is too slow to be useful - the always copy semantics just
 aren't good enough. 

Ok the problem stated is:

1. Current implementation of array resizing is too slow to be useful

 Various remedies were proposed, most of which
 added bloat or slowed down loops. The reason for the copy semantics to
 begin with were to deal with the cases of resizing a slize out of a
 larger array. 

Understood

 
 So, perhaps this can be fixed by just a specification change - don't
 resize slices. 

How does this solve the problem?

Sep 09 2002

"Walter" <walter digitalmars.com> writes:

"Patrick Down" <pat codemoon.com> wrote in message
news:Xns9284D2EE6549Dpatcodemooncom 63.105.9.61...
 How does this solve the problem?

It can be resized in place, rather than by copying.

Sep 10 2002

Patrick Down <pat codemoon.com> writes:

"Walter" <walter digitalmars.com> wrote in news:alk7l3$1oc8$1
 digitaldaemon.com:

 
 "Patrick Down" <pat codemoon.com> wrote in message
 news:Xns9284D2EE6549Dpatcodemooncom 63.105.9.61...
 How does this solve the problem?

 
 It can be resized in place, rather than by copying.

In order to resize an array in place you need to
know how much extra capacity is beyond the current
length of the array.  Do you have a way of finding
out the extra capacity for an array?

Sep 10 2002

"Walter" <walter digitalmars.com> writes:

"Patrick Down" <pat codemoon.com> wrote in message
news:Xns92855730266AApatcodemooncom 63.105.9.61...
 In order to resize an array in place you need to
 know how much extra capacity is beyond the current
 length of the array.  Do you have a way of finding
 out the extra capacity for an array?

Yes. The garbage collector can figure it out. See the file
\dmd\src\phobos\gc2\gcx.d, and look at the capacity() function.

Sep 10 2002

Patrick Down <pat codemoon.com> writes:

"Walter" <walter digitalmars.com> wrote in news:almj8h$1k8u$2
 digitaldaemon.com:
 length of the array.  Do you have a way of finding
 out the extra capacity for an array?

 
 Yes. The garbage collector can figure it out. See the file
 \dmd\src\phobos\gc2\gcx.d, and look at the capacity() function.

Thanks, that was the missing part of the puzzle for me.

Sep 11 2002

Pavel Minayev <evilone omen.ru> writes:

Walter wrote:

 Yes. The garbage collector can figure it out. See the file
 \dmd\src\phobos\gc2\gcx.d, and look at the capacity() function.

Why not make it a read-only array property? .capacity or something? 
Could be useful sometimes...

Sep 11 2002

D Programming

C/C++ Programming

Other

D - resizing arrays