www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - Dynamic array ot not

reply forkit <forkit gmail.com> writes:
so at this link: https://dlang.org/spec/arrays.html

it indicates an array of type[] is of type 'Dynamic array'.

with that in mind, I ask, is this below a 'Dynamic array'.

If not, why not?


int[][] mArr2 = array(iota(1, 9).chunks(2).map!array.array);
Jan 15 2022
parent reply =?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:
On 1/15/22 20:09, forkit wrote:
 so at this link: https://dlang.org/spec/arrays.html

 it indicates an array of type[] is of type 'Dynamic array'.
I have a problem with calling type[] a dynamic array because it is a slice, which may be providing access to the elements of a dynamic array. One complexity in D's slices is that they will start providing access to a dynamic array upon appending or concatenation. Here is the proof: void main() { int[2] data; int[] slice = data[]; } 'slice' has *nothing* to do with any dynamic array. Period! It will provide access to one with the following operation: slice ~= 42; Again, 'slice' is not a dynamic array; it provides access to the elements of one. (That dynamic array is owned by the D runtime (more precisely, the GC).) Despite that reality, people want dynamic arrays in the language and call slices dynamic arrays. Ok, I feel better now. :)
 with that in mind, I ask, is this below a 'Dynamic array'.
Nothing different from what I wrote above.
 If not, why not?


 int[][] mArr2 = array(iota(1, 9).chunks(2).map!array.array);
mArr2 is a slice of slices. We can see from the initialization that there are many dynamic arrays behind the scenes. The outermost array() call is unnecessary because array() of a slice will produce another slice with the same type (perhaps e.g. a 'const' might be dropped) and elements. This is the same: int[][] mArr2 = iota(1, 9).chunks(2).map!array.array ; Ali
Jan 15 2022
parent reply forkit <forkit gmail.com> writes:
On Sunday, 16 January 2022 at 04:58:21 UTC, Ali Çehreli wrote:
 I have a problem with calling type[] a dynamic array because it 
 is a slice, which may be providing access to the elements of a 
 dynamic array.
Yes. A more useful way of describing [] would be to say: "[] represents a dynamic array except when it represents a slice - in which case it is merely shorthand for a slice that is referencing data stored somewhere else." Something that still confuese me in my example then: mArr[] is actually a slice, and not a dynamic array. That is, my slice is referencing data located somewhere else. But where exactly is that other data located? It seems to me, I have a slice of data that doesn't actually exist once I have that slice.
Jan 16 2022
parent reply =?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:
On 1/16/22 01:43, forkit wrote:
 On Sunday, 16 January 2022 at 04:58:21 UTC, Ali Çehreli wrote:
 I have a problem with calling type[] a dynamic array because it is a
 slice, which may be providing access to the elements of a dynamic array.
Yes. A more useful way of describing [] would be to say: "[] represents a dynamic array except when it represents a slice
I still don't agree with that. :) To me, [] is always a slice. (Again, some other members of the community like to name it a "dynamic array" instead "slice".)
 - in
 which case it is merely shorthand for a slice that is referencing data
 stored somewhere else."
A slice always references data stored somewhere else.
 Something that still confuese me in my example then:
   mArr[] is actually a slice, and not a dynamic array. That is, my slice
 is referencing data located somewhere else. But where exactly is that
 other data located?
Can be anywhere: a) On the stack (as in my earlier example of int[2] static array) b) In "dynamic memory" (i.e. "GC memory") c) In malloc'ed memory (e.g. allocated by a C function) d) etc.
 It seems to me, I have a slice of data that doesn't
 actually exist once I have that slice.
It would be a bug if the slice references non-existing data. Slice consist of two things: 1) A pointer to the beginning of data 2) The number of elements that are being referenced at that location Going back to the three example of data I listed above: a) The data is on the stack and will die when the scope is exited. It would be a bug to refer to that data longer that its lifetime: int[] foo() { int[2] dataOnTheStack; return dataOnTheStack[]; } Luckily, dmd catches that bug in that case: Error: returning `dataOnTheStack[]` escapes a reference to local variable `dataOnTheStack` b) This is the most common case: The data is in dynamic memory. This is owned and managed by druntime. druntime includes the GC, which occasionally performs a cleanup to free unused memory. int[] bar(int[] input) { int[] output = input ~ 42; return output; } void main() { bar([ 1, 2 ]); } The first line of bar() appends 42 to an existing slice. That operation causes druntime to allocate new memory, copy the existing elements of 'input' there and also write 42 at the end of those elements. After bar's first line, this is a picture of 'input' and 'output' during bar(): input.ptr --> [ (some data) ] output.ptr --> [ (copy of 'input's data in dynamic memory) 42 ] Note that while 'input's elements may be e.g. on the stack, 'output's elements are definitely in dynamic memory. The data that is in dynamic memory will be alive as long as there is at least one reference to it. c) The data may be allocated by malloc: import std.stdio; import core.stdc.stdlib; void main() { enum count = 7; // Allocate some memory void* rawData = malloc(int.sizeof * count); // Access that data as int[] int[] slice = (cast(int*)rawData)[0..count]; // Yet another slice of half of those int[] half = slice[0..$/2]; // Free the memory free(rawData); // Ouch! Accessing dead data writeln(slice); writeln(half); } So, in all three examples it is the same D feature, a slice, that references data but the data is managed in different ways. Ali
Jan 16 2022
next sibling parent reply Salih Dincer <salihdb hotmail.com> writes:
On Sunday, 16 January 2022 at 11:43:40 UTC, Ali Çehreli wrote:
 void main() {
   enum count = 7;

   // Allocate some memory
   void* rawData = malloc(int.sizeof * count);
If count is not equal to 8 I get weird results! The reason of course, is the free(): // [93947717336544, 1, 2, 3, 4, 5, 6] I remarked it for exclusion purposes but I couldn't try it as char. [Here](https://run.dlang.io/is/yFmXwO) is my test code.
Jan 16 2022
next sibling parent reply =?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:
On 1/16/22 07:32, Salih Dincer wrote:
 On Sunday, 16 January 2022 at 11:43:40 UTC, Ali Çehreli wrote:
 void main() {
   enum count = 7;

   // Allocate some memory
   void* rawData = malloc(int.sizeof * count);
In practice, malloc'ed memory is cleared e.g. by memset(). Or, there is calloc() which returns memory filled with zeros. It takes two parameters: void* rawData = calloc(count, int.sizeof); (Of course, one also needs to check that the returned value is not null before using it.)
 If count is not equal to 8 I get weird results! The reason of course, is
 the free():
 // [93947717336544, 1, 2, 3, 4, 5, 6]
I didn't know free wrote into the freed buffer but since it's undefined behavior, we shouldn't even be able to know whether it did or not. :/
 [Here](https://run.dlang.io/is/yFmXwO) is my test code.
You have the following loop there: foreach(i, ref s; slice) s = cast(int)i; Note that the assignment operator into a malloc'ed buffer works only for some types like the fundamental ones. Otherwise, the opAssign() of a user-defined type may be very unhappy with random or zero data of 'this' object. (Note that there really wouldn't be any user-defined type in the buffer; we would have just cast'ed it to fool ourselves.) The reason is, opAssign() may have to destroy existing members of 'this' during the assignment and destroying random or zero data may not work for a particular type. So, one would have to emplace() an object in that memory. Here is my explanation: http://ddili.org/ders/d.en/memory.html#ix_memory.construction,%20emplace Ok, this is too much theory and those are parts of the reasons why we don't generally use calloc or malloc. :) Ali
Jan 16 2022
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sun, Jan 16, 2022 at 08:21:29AM -0800, Ali Çehreli via Digitalmars-d-learn
wrote:
 On 1/16/22 07:32, Salih Dincer wrote:
 On Sunday, 16 January 2022 at 11:43:40 UTC, Ali Çehreli wrote:
 void main() {
   enum count = 7;

   // Allocate some memory
   void* rawData = malloc(int.sizeof * count);
In practice, malloc'ed memory is cleared e.g. by memset(). Or, there is calloc() which returns memory filled with zeros.
Correction: malloc() is not guaranteed to clear the allocated memory. Possibly for "performance reasons". Usually, though, I'd recommend using calloc() instead to initialize the allocated memory and prevent surprising results. (E.g., you allocate a buffer expecting it would be zeroed, but it isn't so the code that assumes it does produces garbled results. Or worse, if you're allocating memory for, e.g., a packet buffer to be transmitted across the network, failing to initialize it may leak the past contents of that memory to the internet. Cf. HeartBleed.) [...]
 If count is not equal to 8 I get weird results! The reason of
 course, is the free():
 // [93947717336544, 1, 2, 3, 4, 5, 6]
I didn't know free wrote into the freed buffer but since it's undefined behavior, we shouldn't even be able to know whether it did or not. :/
[...] This is likely caused by the implementation of malloc/free. Some implementations store pointers and other tracking information inside the free blocks themselves instead of using a separate data structure to track free memory (e.g., the start of the block may contain a pointer to the next available block). After calling free(), that memory is now a free block, so the implementation may start storing such information within the block. Some memory allocator implementations may also store a canary value at the beginning of freed blocks to detect double free's (i.e., a value unlikely to occur in user data that, if detected by free(), may indicate a bug in the code that's trying to free the same block twice). T -- Debian GNU/Linux: Cray on your desktop.
Jan 16 2022
parent =?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:
On 1/16/22 17:51, H. S. Teoh wrote:

 In practice, malloc'ed memory is cleared e.g. by memset(). Or, there is
 calloc() which returns memory filled with zeros.
Correction: malloc() is not guaranteed to clear the allocated memory.
Agreed. What I meant is, the programmer ordinarily clears it with memset().
 I didn't know free wrote into the freed buffer but since it's
 undefined behavior, we shouldn't even be able to know whether it did
 or not. :/
[...] This is likely caused by the implementation of malloc/free. Some implementations store pointers and other tracking information inside the free blocks themselves
Yes but they do not (and cannot) put it inside what they returned. They store such information in the area right before the returned pointer.
 Some memory allocator implementations may also store a canary value at
 the beginning of freed blocks to detect double free's
I thought of that as well but I doubt they would go against the mentioned performance ideals. My current theory is that our writeln() calls after the free needed memory for the output strings and it got the last freed memory. (Assuming a common implementation of a free list where the next allocated memory is the one most recently freed.) Well... speculation... :)
 unlikely to occur in user data that, if detected by free()
Yes, those are human-identifiable as well but I don't think it's related in this case because they look random values. (Speculation: A string's .ptr value. :) ) Ali
Jan 16 2022
prev sibling parent Era Scarecrow <rtcvb32 yahoo.com> writes:
On Sunday, 16 January 2022 at 15:32:41 UTC, Salih Dincer wrote:
 If count is not equal to 8 I get weird results! The reason of 
 course, is the free():
 // [93947717336544, 1, 2, 3, 4, 5, 6]
I wonder if you're seeing something you're not suppose to as part of the internals; Typically malloc for 16bit allocation to my understanding worked a little different, where you had 2 bytes BEFORE the memory block which specified a size, and if it was negative then it was a size of free memory (*This allowed a 64k block to have multiple allocations and de-allocations*). So you might have: **64, [64 bytes of data], -1024, [1024 unallocated memory],32,[32 bytes of allocated data, etc]**. in the above if you deallocated the 64 byte block, it would just become -1090 (*merging the next free block, 1024+64+2*). This would also explain why double freeing would break since it would be referring to free size and throw a fit, or it was no longer a length malloc/free could use. Also malloc if it returns anything guarantees at least that size, you might ask for 7 bytes but get 16 and the others is just ignored. Is this how the 32/64bit malloc/free work? Not sure, it wouldn't be unreal for Java and others to pre-allocate say a megabyte and then quickly give/manage a smaller block and extending or getting another block later. Though I'm sure others here have given better/more exact on internals or how allocation works in D.
Jan 16 2022
prev sibling parent reply forkit <forkit gmail.com> writes:
On Sunday, 16 January 2022 at 11:43:40 UTC, Ali Çehreli wrote:
 So, in all three examples it is the same D feature, a slice, 
 that references data but the data is managed in different ways.

 Ali
Well, it's fair to say, that 'range-based programming' is kinda new to me. With this statement: int[][] mArr = iota(1, 9).chunks(2).map!array.array; - one would intuitively expect, that at the end, you end up with a dynamically allocated array, intialised with it's argument. That is, you would expect some GC allocation to be going on, similar to what would happen with this code: int[][] mArr2 = [[1, 2], [3, 4], [5, 6], [7, 8]]; // GC allocation But it turns out: int[][] mArr = iota(1, 9).chunks(2).map!array.array; // no GC allocation going on at all, not anywhere. How can this be I asked myself? Then I watched this, and learn about 'memory disallocation'. http://dconf.org/2015/talks/bright.html Now I understand.. I think ;-)
Jan 16 2022
parent reply =?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:
On 1/16/22 14:43, forkit wrote:

 Well, it's fair to say, that 'range-based programming' is kinda new 
to me. I hope it will be useful to you. :)
 int[][] mArr2 = [[1, 2], [3, 4], [5, 6], [7, 8]]; // GC allocation

 But it turns out:

 int[][] mArr = iota(1, 9).chunks(2).map!array.array; // no GC allocation
 going on at all, not anywhere.
That's not correct. There are many range algorithms that are lazy to defer memory allocation but array() is not one of those. array() does eagerly allocate memory, which is it's whole purpose: https://dlang.org/phobos/std_array.html#array "Allocates an array and initializes it with copies of the elements of range r." So, that range expression has the same allocations as your "GC allocation" line above.
 Then I watched this, and learn about 'memory disallocation'.

 http://dconf.org/2015/talks/bright.html

 Now I understand.. I think ;-)
That applies to most other range algorithms. array() is an expensive one. :) Ali
Jan 16 2022
parent reply forkit <forkit gmail.com> writes:
On Sunday, 16 January 2022 at 23:03:49 UTC, Ali Çehreli wrote:
 That's not correct. There are many range algorithms that are 
 lazy to defer memory allocation but array() is not one of 
 those. array() does eagerly allocate memory, which is it's 
 whole purpose:

   https://dlang.org/phobos/std_array.html#array

 "Allocates an array and initializes it with copies of the 
 elements of range r."

 So, that range expression has the same allocations as your "GC 
 allocation" line above.
Honestly, that is what I thought, and expected. However, when I compiled with: -profile=gc .. it indicated no GC allocation going on. That's really why I got confused - because, as you say, the array statement (as I understand it)will result in a dynamically allocated array (i.e. allocated on GC heap). But -profile=gc .. says no. It's not. int[][] mArr = [[1, 2], [3, 4], [5, 6], [7, 8]]; // GC allocation occurs. int[][] mArr = iota(1, 9).chunks(2).map!array.array; // No GC allocation occurs.
Jan 16 2022
parent reply =?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:
On 1/16/22 15:14, forkit wrote:

 But -profile=gc .. says no. It's not.

 int[][] mArr = [[1, 2], [3, 4], [5, 6], [7, 8]]; // GC allocation occurs.
 int[][] mArr = iota(1, 9).chunks(2).map!array.array; // No GC allocation
 occurs.
Definitely a -profile=gc bug. Here are the existing ones: https://issues.dlang.org/buglist.cgi?quicksearch=profile%20gc Ali
Jan 16 2022
next sibling parent reply forkit <forkit gmail.com> writes:
On Sunday, 16 January 2022 at 23:34:41 UTC, Ali Çehreli wrote:
 Definitely a -profile=gc bug. Here are the existing ones:

   https://issues.dlang.org/buglist.cgi?quicksearch=profile%20gc

 Ali
yeah, a bug makes more sense ... otherwise I really would have had a slice to data that doesn't exist anywhere. some of those reported bugs are pretty old. .. oh well... just another thing in D, that I can't rely on :-(
Jan 16 2022
next sibling parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 1/16/22 7:54 PM, forkit wrote:
 On Sunday, 16 January 2022 at 23:34:41 UTC, Ali Çehreli wrote:
 Definitely a -profile=gc bug. Here are the existing ones:

   https://issues.dlang.org/buglist.cgi?quicksearch=profile%20gc

 Ali
yeah, a bug makes more sense ... otherwise I really would have had a slice to data that doesn't exist anywhere. some of those reported bugs are pretty old. .. oh well... just another thing in D, that I can't rely on :-(
I think what is happening is that phobos is using a direct call to a compiler hook, which bypasses the mechanism that recognizes it as a GC allocation. It appears that profile=gc doesn't recognize any non-language GC calls. Even `GC.malloc` doesn't identify any allocations. https://issues.dlang.org/show_bug.cgi?id=14892 -Steve
Jan 16 2022
prev sibling parent reply forkit <forkit gmail.com> writes:
On Monday, 17 January 2022 at 00:54:19 UTC, forkit wrote:
 ..
module test; import std; safe void main() { // mArr1 is a dynamic array allocated on the gc heap. int[][] mArr1 = [[1, 2], [3, 4], [5, 6], [7, 8]]; static assert(is(typeof(mArr1.array()))); alias R1 = typeof(mArr1); static assert(isRandomAccessRange!R1 && isForwardRange!R1 && isInputRange!R1 && isBidirectionalRange!R1 && hasAssignableElements!R1 && !isInfinite!R1 && hasSlicing!R1); // mArr2 is a dynamic array allocated on the gc heap // although compiling with '-profile=gc' incorrectly suggests otherwise. int[][] mArr2 = iota(1, 9).chunks(2).map!array.array; static assert(is(typeof(mArr2.array()))); alias R2 = typeof(mArr2); static assert(isRandomAccessRange!R2 && isForwardRange!R2 && isInputRange!R2 && isBidirectionalRange!R2 && hasAssignableElements!R2 && !isInfinite!R2 && hasSlicing!R2); static assert(is(typeof(mArr1) == typeof(mArr2))); } /+ if add nogc to above code: (8): Error: array literal in ` nogc` function `D main` may cause a GC allocation (22): Error: ` nogc` function `D main` cannot call non- nogc function `std.array.array!(MapResult!(array, Chunks!(Result))).array` +/
Jan 16 2022
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 1/16/22 9:42 PM, forkit wrote:
 On Monday, 17 January 2022 at 00:54:19 UTC, forkit wrote:
      // mArr2 is a dynamic array allocated on the gc heap
      // although compiling with '-profile=gc' incorrectly suggests otherwise.
 
 if add  nogc to above code:
 (8): Error: array literal in ` nogc` function `D main` may cause a GC 
 allocation
 (22): Error: ` nogc` function `D main` cannot call non- nogc function 
 `std.array.array!(MapResult!(array, Chunks!(Result))).array`
Technically, this is using 2 different mechanisms. nogc is a function attribute that means it can't call other functions not marked as nogc. This is actually a *compile-time* mechanism. A call to `array` may not actually allocate, but *might* allocate, and therefore it's statically not allowed to be called from a nogc function. The profile=gc appears to only show GC allocations that the *compiler* initiates (i.e. via `new`, array operations (like appending) or closure allocations). It does not detect that the functions that actually allocate memory themselves (such as `core.memory.GC.malloc`) are GC allocations. This actually does not care whether a function might or might not allocate, but records when it actually does allocate. -Steve
Jan 16 2022
parent forkit <forkit gmail.com> writes:
On Monday, 17 January 2022 at 03:11:50 UTC, Steven Schveighoffer 
wrote:
 The profile=gc appears to only show GC allocations that the 
 *compiler* initiates (i.e. via `new`, array operations (like 
 appending) or closure allocations). It does not detect that the 
 functions that actually allocate memory themselves (such as 
 `core.memory.GC.malloc`) are GC allocations. This actually does 
 not care whether a function might or might not allocate, but 
 records when it actually does allocate.

 -Steve
yes, that seems to be the case: // ----- module test; import std; safe void main() { int[][] mArr2; mArr2 ~= iota(1, 9).chunks(2).map!array.array; // profilegc.log created ok. } // ----
Jan 16 2022
prev sibling parent reply Salih Dincer <salihdb hotmail.com> writes:
```d
import std; // If we summarize in code...
void main()
{
   // It's a dynamic array and its copy below:
   size_t[] arr = [1, 2, 3, 4, 5, 6];
   auto arrCopy = arr.dup;

   // This is its lazy range:
   auto range = arr.chunks(2);
   typeid(range).writeln(": ", range);

   // But this is its copy (you'll see that later!)
   auto array = arr.chunks(2).map!array.array;

   // This is a series it's created from slices:
   auto slices =[arr[0..2], arr[2..4], arr[4..6]];
   typeid(slices).writeln(": ", slices);

   // Equal for fleeting moment:
   assert(array == slices);

   // Now, the datasource is sorted but (!)
   // All copies will not be affected!
   arrCopy.writeln(" => ", arr.sort!"a>b");

   // and now they are not equal! Because,
   // slices were affected by the sort process,
   // the range too...
   assert(array != slices);

   // Taaata, magic...
   // Your eyes don't surprise you!
   typeid(range).writeln(": ", range);
   typeid(slices).writeln(": ", slices);
}
/* CONSOLE OUT:
std.range.Chunks!(ulong[]).Chunks: [[1, 2], [3, 4], [5, 6]]
ulong[][]: [[1, 2], [3, 4], [5, 6]]
[1, 2, 3, 4, 5, 6] => [6, 5, 4, 3, 2, 1]
std.range.Chunks!(ulong[]).Chunks: [[6, 5], [4, 3], [2, 1]]
ulong[][]: [[6, 5], [4, 3], [2, 1]]
*/
```
Jan 16 2022
parent Salih Dincer <salihdb hotmail.com> writes:
On Monday, 17 January 2022 at 01:06:05 UTC, Salih Dincer wrote:
 ```d
   // Taaata, magic...
   // Your eyes don't surprise you!
   typeid(range).writeln(": ", range);
   typeid(slices).writeln(": ", slices);
 ```
In fact, although range and slice seem to be equal to each other, they are not! Slice points directly to the array source, but range is not... Because range never has slices showing the array source. But it has criteria that make it effective. and when called, it processes those criteria (uses CPU power) and does the same thing each time. Range does the same thing every time, lazily. Maybe it's not lazy, because range is operated when necessary and always obeys. Ranges obey even if they are lazy. 😀 Good weeks
Jan 16 2022