www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Dynamic array and slices (need Walter and Andrei decision)

reply "Maxim Fomin" <maxim maxim-fomin.ru> writes:
In neighbor thread (especially from 5 page 
http://forum.dlang.org/thread/mailman.175.1369540733.13711.digitalmars-d p
remagic.com?page=5) 
there is discussion about current state of definitions in D 
related to slices, slice expressions and arrays. There is 
significant contradiction between documentation in different 
parts of the D site and people who interpret it.

Problem boils down to following:

- in array and type official spec page, dynamic array is defined 
as T[] type as "Dynamic arrays consist of a length and a pointer 
to the array data.". The page also describes what slicing is. 
Also expression page defines what SliceExpression is. Internally 
dmd follows these conventions.

- in articles part of the site there is article "D Slices" 
written by Steven Schveighoffer, which abolishes current relevant 
parts of current spec. According to the article, dynamic array is 
runtime managed memory which in implementation specific manner 
provides some set of operation related to arrays. According to 
the article T[] is by no means a dynamic array, but a slice. The 
article explicitly claims that spec is wrong.

So, there is contradiction between what T[] is. Either it is a 
slice (and what is more important, not a dynamic array type) 
which point by druntime managed dynamic array, or is object of 
type dynamic array, which may point to heap or stack memory.

Discussion shows that there is no clear consensus on this, so 
there is need for Walter and Andrei to comment on this.
May 30 2013
next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Thursday, 30 May 2013 at 18:25:08 UTC, Maxim Fomin wrote:
 In neighbor thread (especially from 5 page 
 http://forum.dlang.org/thread/mailman.175.1369540733.13711.digitalmars-d p
remagic.com?page=5) 
 there is discussion about current state of definitions in D 
 related to slices, slice expressions and arrays. There is 
 significant contradiction between documentation in different 
 parts of the D site and people who interpret it.

 Problem boils down to following:

 - in array and type official spec page, dynamic array is 
 defined as T[] type as "Dynamic arrays consist of a length and 
 a pointer to the array data.". The page also describes what 
 slicing is. Also expression page defines what SliceExpression 
 is. Internally dmd follows these conventions.

 - in articles part of the site there is article "D Slices" 
 written by Steven Schveighoffer, which abolishes current 
 relevant parts of current spec. According to the article, 
 dynamic array is runtime managed memory which in implementation 
 specific manner provides some set of operation related to 
 arrays. According to the article T[] is by no means a dynamic 
 array, but a slice. The article explicitly claims that spec is 
 wrong.

 So, there is contradiction between what T[] is. Either it is a 
 slice (and what is more important, not a dynamic array type) 
 which point by druntime managed dynamic array, or is object of 
 type dynamic array, which may point to heap or stack memory.

 Discussion shows that there is no clear consensus on this, so 
 there is need for Walter and Andrei to comment on this.

An array IS a slice. A slice IS an array. Nothing contradict anything.
May 30 2013
prev sibling next sibling parent "Brad Anderson" <eco gnuk.net> writes:
On Thursday, 30 May 2013 at 18:46:17 UTC, deadalnix wrote:
 On Thursday, 30 May 2013 at 18:25:08 UTC, Maxim Fomin wrote:
 In neighbor thread (especially from 5 page 
 http://forum.dlang.org/thread/mailman.175.1369540733.13711.digitalmars-d p
remagic.com?page=5) 
 there is discussion about current state of definitions in D 
 related to slices, slice expressions and arrays. There is 
 significant contradiction between documentation in different 
 parts of the D site and people who interpret it.

 Problem boils down to following:

 - in array and type official spec page, dynamic array is 
 defined as T[] type as "Dynamic arrays consist of a length and 
 a pointer to the array data.". The page also describes what 
 slicing is. Also expression page defines what SliceExpression 
 is. Internally dmd follows these conventions.

 - in articles part of the site there is article "D Slices" 
 written by Steven Schveighoffer, which abolishes current 
 relevant parts of current spec. According to the article, 
 dynamic array is runtime managed memory which in 
 implementation specific manner provides some set of operation 
 related to arrays. According to the article T[] is by no means 
 a dynamic array, but a slice. The article explicitly claims 
 that spec is wrong.

 So, there is contradiction between what T[] is. Either it is a 
 slice (and what is more important, not a dynamic array type) 
 which point by druntime managed dynamic array, or is object of 
 type dynamic array, which may point to heap or stack memory.

 Discussion shows that there is no clear consensus on this, so 
 there is need for Walter and Andrei to comment on this.

An array IS a slice. A slice IS an array. Nothing contradict anything.

There is some...peculiar stuff about slices though. If you slice a static array or a pointer and append to it the slice then points to a new distinct GC allocated array. It's not all that different from appending to a slice backed by a dynamic array an exhausted capacity but it can be surprising to people who don't know about this. Perhaps appending to a slice of non-gc allocated memory should be an error (you can't append to a static array, for instance).
May 30 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, May 30, 2013 at 08:46:16PM +0200, deadalnix wrote:
 On Thursday, 30 May 2013 at 18:25:08 UTC, Maxim Fomin wrote:
In neighbor thread (especially from 5 page
http://forum.dlang.org/thread/mailman.175.1369540733.13711.digitalmars-d puremagic.com?page=5)
there is discussion about current state of definitions in D
related to slices, slice expressions and arrays. There is
significant contradiction between documentation in different parts
of the D site and people who interpret it.

Problem boils down to following:

- in array and type official spec page, dynamic array is defined
as T[] type as "Dynamic arrays consist of a length and a pointer
to the array data.". The page also describes what slicing is. Also
expression page defines what SliceExpression is. Internally dmd
follows these conventions.

- in articles part of the site there is article "D Slices" written
by Steven Schveighoffer, which abolishes current relevant parts of
current spec. According to the article, dynamic array is runtime
managed memory which in implementation specific manner provides
some set of operation related to arrays. According to the article
T[] is by no means a dynamic array, but a slice. The article
explicitly claims that spec is wrong.

So, there is contradiction between what T[] is. Either it is a
slice (and what is more important, not a dynamic array type) which
point by druntime managed dynamic array, or is object of type
dynamic array, which may point to heap or stack memory.

Discussion shows that there is no clear consensus on this, so
there is need for Walter and Andrei to comment on this.

An array IS a slice. A slice IS an array. Nothing contradict anything.

I think the confusion comes from the inconsistent terminology used. There is a difference between "array" or "slice", in the sense of a pointer+length pair, and the contents of the array, in the sense of the memory block to contains the array elements. The pointer+length pair is what is visible to the programmer: when you declare T[] or take slices, you're just manipulating these pointer+length pairs. The underlying memory being pointed to, however, is something managed by the D runtime, and isn't directly manipulatable by the programmer. This is what Steven refers to as "dynamic arrays"; he uses the term "slice" for the pointer+length pairs seen by the programmer, whereas I think the spec uses both "array" and "slice" for the pointer+length pair, but leaves the runtime-managed memory blocks unspecified (it doesn't have to be specified, since these memory blocks are supposed to be implementation dependent). As long as you understand the difference between the pointer+length pair and the actual memory containing the array elements, there is no confusion and no contradiction. :) T -- Those who've learned LaTeX swear by it. Those who are learning LaTeX swear at it. -- Pete Bleackley
May 30 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, May 30, 2013 21:15:19 Brad Anderson wrote:
 On Thursday, 30 May 2013 at 18:46:17 UTC, deadalnix wrote:
 An array IS a slice. A slice IS an array. Nothing contradict
 anything.

There is some...peculiar stuff about slices though. If you slice a static array or a pointer and append to it the slice then points to a new distinct GC allocated array. It's not all that different from appending to a slice backed by a dynamic array an exhausted capacity but it can be surprising to people who don't know about this.

But that really doesn't have anything to do with the terminology used. When you operate on T[], you don't know what it refers to, and for the most part, you don't care (there are of course some exceptions with regards to escaping slices of static arrays and the like, but in general, what actually owns the memory for T[] is irrelevant). The problem is that the spec refers to T[] as dynamic arrays as well as slices, and the article takes the point of view that T[] isn't a dynamic array at all but just a slice of one with the dynamic array being the memory owned by the GC. It was written that way because Steven didn't like how the term dynamic array as used by the spec does not fully correspond to the general computer science term (since in CS in general, dynamic arrays own their memory, but they don't in D - the GC does). And so we have a disagreement as to whether T[] should be referred to as a dynamic array (in which case array slices and dynamic arrays are exactly the same thing), or whether the GC-own memory that T[] refers to should be referred to as the dynamic array.
 Perhaps appending to a slice of non-gc allocated memory should be
 an error (you can't append to a static array, for instance).

Why? It would have to be runtime error, not a compilation error, since the compiler can't possible detect it, and it can be quite useful to be able to pass a slice of a static array to a function which operates on dynamic arrays, and if it appends to it to create a new array, that's fine, because it's exactly the same thing that happens when a dynamic array has no extra capacity and you try to append to it. Trying to make appending to a slice of a static arary an error would just create an inconsistency in the language. Array slices aren't designed to know or care who owns the memory that they refer to. The only time when it matters is when you append to them, in which case, if they have no extra capacity (and not being allocated by the GC always means that they don't), then a new array is allocated on the GC heap. If you want to avoid that allocation, just avoid appending to arrays. You risk the allocation with slices which refer to GC memory as you do with any other kind of array slice. - Jonathan M Davis
May 30 2013
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/30/13 2:25 PM, Maxim Fomin wrote:
 In neighbor thread (especially from 5 page
 http://forum.dlang.org/thread/mailman.175.1369540733.13711.digitalmars-d puremagic.com?page=5)
 there is discussion about current state of definitions in D related to
 slices, slice expressions and arrays. There is significant contradiction
 between documentation in different parts of the D site and people who
 interpret it.

 Problem boils down to following:

 - in array and type official spec page, dynamic array is defined as T[]
 type as "Dynamic arrays consist of a length and a pointer to the array
 data.". The page also describes what slicing is. Also expression page
 defines what SliceExpression is. Internally dmd follows these conventions.

 - in articles part of the site there is article "D Slices" written by
 Steven Schveighoffer, which abolishes current relevant parts of current
 spec. According to the article, dynamic array is runtime managed memory
 which in implementation specific manner provides some set of operation
 related to arrays. According to the article T[] is by no means a dynamic
 array, but a slice. The article explicitly claims that spec is wrong.

 So, there is contradiction between what T[] is. Either it is a slice
 (and what is more important, not a dynamic array type) which point by
 druntime managed dynamic array, or is object of type dynamic array,
 which may point to heap or stack memory.

 Discussion shows that there is no clear consensus on this, so there is
 need for Walter and Andrei to comment on this.

T[] is both. You can use it to take a slice of any piece of memory, including freshly GC-allocated arrays. But when you try e.g. to append to it with ~=, that only works if the slice originated as a GC-allocated array. Andrei
May 30 2013
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 30 May 2013 16:19:01 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:


 T[] is both. You can use it to take a slice of any piece of memory,  
 including freshly GC-allocated arrays. But when you try e.g. to append  
 to it with ~=, that only works if the slice originated as a GC-allocated  
 array.

This isn't *precisely* true. ~= works on a slice of a non-GC array. It just turns it into a GC-array-backed slice. -Steve
May 30 2013
prev sibling next sibling parent "Brad Anderson" <eco gnuk.net> writes:
On Thursday, 30 May 2013 at 20:19:01 UTC, Andrei Alexandrescu 
wrote:
 T[] is both. You can use it to take a slice of any piece of 
 memory, including freshly GC-allocated arrays. But when you try 
 e.g. to append to it with ~=, that only works if the slice 
 originated as a GC-allocated array.

 Andrei

It actually works with any slice and will create a new GC-allocated array and point to that for non-GC allocated arrays. void main() { char* a = cast(char*) malloc(5); auto slice_a = a[0..5]; slice_a ~= char.init; assert(slice_a.ptr != a); }
May 30 2013
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 30 May 2013 16:15:36 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 5/30/13 1:44 PM, Maxim Fomin wrote:
 It would be good if Walter or Andrei comment on this.

Not sure I understand the context.

(moved here from other thread) The issue is that the spec defines T[] as a dynamic array type: http://dlang.org/arrays.html#dynamic-arrays While the article I wrote expresses my view of T[] as being a different type of beast called a slice, and specifically NOT a dynamic array (actually, someone pointed out quite correctly that a slice could be a slice of anything, and Ali came up with the term "array slice", which I think makes the most sense). I found this a helpful way of explaining D slices to those who are befuddled by them, and some people seem to agree with that. Whether it's "right" or not is really up to interpretation. Technically, if you never alias a slice, it certainly behaves just like a dynamic array. The thing that Maxim takes vehement issue with (not trying to make a strawman here, correct me if I'm wrong) is the idea that D's website contains both the spec and the article which say contradictory things. I think the spec should be updated to reflect the article's view (or at least note the discrepancy), since using the term "dynamic array" immediately invites newbies onto shaky ground when they think it is completely solid "Ah yes, I know what a dynamic array is. Hey, WTF???" Just my opinion. I'm perfectly fine with leaving things the way they are, or with removing the article if it's too controversial. -Steve
May 30 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, May 30, 2013 16:42:43 Steven Schveighoffer wrote:
 Just my opinion. I'm perfectly fine with leaving things the way they are,
 or with removing the article if it's too controversial.

I don't think that the article is a whole is particular controversal - just its use of terminology. And overall, it's a fantastic article. So, it would be far preferable to adjust its terminology than to get rid of it. And I have no problem with the article pointing out that D's use of the term "dynamic array" differs from how the term might be typically used elsewhere (particularly if that will help people understand D arrays better), but I do think that the article should use terminology which matches the spec - particularly since it's on dlang.org. - Jonathan M Davis
May 30 2013
prev sibling next sibling parent "Simen Kjaeraas" <simen.kjaras gmail.com> writes:
Interestingly, on Lambda the Ultimate[0] today I found an article[1] tha=
t
discusses some of this - the Three Laws of Programming:

- What you get right, nobody mentions it.
- What you get wrong, people bitch about.
- What is difficult to understand you have to explain to people over and=

   over again.

D's spec on slices is covered by #3: "The difficult to understand stuff
is a real bummer. You have to explain it over and over again until
you=E2=80=99re sick, and some people never get it, you have to write hun=
dred of
mails and thousands of words explaining over and over again why this
stuff means and why it is so. For a language designer, or author, this
is a pain in the bottom."

The article also suggests tagging the source with the compiler version i=
t
was written for. Might be worth considering.

[0]: http://lambda-the-ultimate.org/node/4754
[1]: http://joearms.github.io/2013/05/31/a-week-with-elixir.html
-- =

Simen
May 31 2013
prev sibling parent "js.mdnq" <js_adddot+mdng gmail.com> writes:
On Friday, 31 May 2013 at 18:14:57 UTC, Simen Kjaeraas wrote:
 Interestingly, on Lambda the Ultimate[0] today I found an 
 article[1] that
 discusses some of this - the Three Laws of Programming:

 - What you get right, nobody mentions it.
 - What you get wrong, people bitch about.
 - What is difficult to understand you have to explain to people 
 over and
   over again.

And how else should it be? Just think if natural selection worked some other way...
May 31 2013