digitalmars.D.learn - How does D distinguish managed pointers from raw pointers?


IGotD- <nise nise.com> writes:

According to the GC documentation this code snippet

char* p = new char[10];
char* q = p + 6; // ok
q = p + 11;      // error: undefined behavior
q = p - 1;       // error: undefined behavior

suggests that char *p is really a "fat pointer" with size 
information.

However, if we get memory from some C library that allocated it 
with malloc, we have no size information. We would get a char* 
without any size information, and according to the documentation 
we can do anything, including accessing out of bounds.

How does D internally know that a pointer was previously 
allocated by the GC or malloc?

If we were to replace the GC with reference counting, how would D 
be able to distinguish a reference counted pointer from a raw 
pointer at compile time in order to insert the code associated 
with the reference counting?

This brings me back to MS managed C++ where they actually had two 
types of "pointers" a managed pointer and the normal C++ 
pointers. Like this:

MyType^ instance = gcnew MyType();

In this case it was obvious what was done with the GC and what 
wasn't (past tense since managed C++ is deprecated). In this case 
it would be trivial to replace the GC algorithm with whatever you 
want, since the compiler knows the type at compile time.

Oct 03 2019

Adam D. Ruppe <destructionator gmail.com> writes:

On Thursday, 3 October 2019 at 14:13:55 UTC, IGotD- wrote:
 suggests that char *p is really a "fat pointer" with size 
 information.

D pointers are plain naked pointers. What that doc segment is 
saying is it works like C - in-bounds arithmetic will work, out 
of bounds is undefined behavior. You can do it, but it might 
crash you or whatever.

There's no difference in the language between a GC pointer and 
any other pointer. But....

 How does D internally know that a pointer was previously 
 allocated by the GC or malloc?

But, this is a bit more nuanced. D, the language, does not know 
how it was allocated, there's no difference in the type system, 
but the runtime can figure it out based on the pointer value, if 
it falls inside the range of the GC's allocated area.

It does NOT use that for bounds checking though! It is just an 
internal detail the runtime uses in some of the GC functions to 
help its sweeps and in some of the interface functions.
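
To make that concrete, here is a minimal sketch of asking the runtime whether an address lies inside GC-managed memory. It relies on `GC.addrOf` from `core.memory`, which returns the base address of the containing GC block, or null if the GC did not allocate it. The helper name `isGcPointer` is made up for this illustration.

```d
import core.memory : GC;
import core.stdc.stdlib : free, malloc;

// Hypothetical helper: true if p points into a block owned by the D GC.
bool isGcPointer(const(void)* p)
{
    // GC.addrOf yields the block's base address, or null for memory
    // the GC did not allocate.
    return GC.addrOf(cast(void*) p) !is null;
}

void main()
{
    int[] gcArray = new int[10];                     // GC heap
    int* gcPtr = gcArray.ptr;

    int* cPtr = cast(int*) malloc(10 * int.sizeof);  // C heap
    assert(cPtr !is null);
    scope (exit) free(cPtr);

    // Both are plain int* to the type system; only the runtime lookup differs.
    assert(isGcPointer(gcPtr + 3));   // interior pointers are found as well
    assert(!isGcPointer(cPtr));
}
```

Note that this is a runtime lookup by address. Nothing in the type of `gcPtr` or `cPtr` records where the memory came from.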

 If we were to replace the GC with reference counting, how would D 
 be able to distinguish a reference counted pointer from a raw 
 pointer at compile time in order to insert the code associated 
 with the reference counting?

It won't. Reference counting in D is done by distinct types today, 
and it would still have to be done that way.
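
As a rough illustration of "different types", the sketch below uses Phobos's `std.typecons.RefCounted` as one example of such a type. The helper `owners` is invented for this example. The counting code lives in the wrapper's copy and destructor logic, not in anything the compiler attaches to raw pointers.

```d
import std.typecons : RefCounted;

// Hypothetical helper: how many RefCounted handles currently share the payload.
size_t owners(ref RefCounted!int handle)
{
    return handle.refCountedStore.refCount;
}

void main()
{
    auto a = RefCounted!int(42);   // reference-counted handle, its own type
    int* raw = new int;            // plain pointer, no counting code involved
    *raw = 42;

    // The two are distinguishable at compile time because the types differ.
    static assert(!is(typeof(a) == int*));

    assert(owners(a) == 1);
    {
        auto b = a;                // copying the handle increments the count
        assert(owners(a) == 2);
    }                              // b is destroyed here, count drops again
    assert(owners(a) == 1);
}
```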

Oct 03 2019

Andrea Fontana <nospam example.com> writes:

On Thursday, 3 October 2019 at 14:13:55 UTC, IGotD- wrote:
 According to the GC documentation this code snippet

 char* p = new char[10];
 char* q = p + 6; // ok
 q = p + 11;      // error: undefined behavior
 q = p - 1;       // error: undefined behavior

 suggests that char *p is really a "fat pointer" with size 
 information.

No it's not. char* is a plain pointer.

The example is wrong, since you can't assign a new char[10] to 
char*.

Probably they mean something like:
auto arr = new char[10];
char* p = arr.ptr;
...

This code actually compiles, but its behaviour is undefined, so 
it is a logical error.


In D arrays are fat pointer instead:

int[10] my_array;

my_array is actually a pair ptr+length.

Oct 03 2019

Johan Engelen <j j.nl> writes:

On Thursday, 3 October 2019 at 14:21:37 UTC, Andrea Fontana wrote:
 In D arrays are fat pointer instead:

 int[10] my_array;

 my_array is actually a pair ptr+length.

```
int[10] my_static_array;
int[] my_dynamic_array;
```

my_static_array will not be a fat pointer. Length is known at 
compile time. Address is known at link/load time so it's also not 
a pointer but just a normal variable (& will give you a pointer 
to the array data).
my_dynamic_array will be a pair for ptr+length.
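
A small sketch that checks this with `sizeof`. The variable names are only for illustration.

```d
void main()
{
    int[10] myStaticArray;                    // ten ints stored in place
    int[] myDynamicArray = myStaticArray[];   // slice: pointer + length

    // The static array is exactly its elements, with no hidden pointer.
    static assert(myStaticArray.sizeof == 10 * int.sizeof);

    // The dynamic array is a length plus a pointer.
    static assert(myDynamicArray.sizeof == size_t.sizeof + (int*).sizeof);

    // The slice refers to the static array's storage rather than copying it.
    assert(myDynamicArray.ptr is &myStaticArray[0]);
    assert(myDynamicArray.length == 10);
}
```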

-Johan

Oct 04 2019

IGotD- <nise nise.com> writes:

On Friday, 4 October 2019 at 15:03:04 UTC, Johan Engelen wrote:
 On Thursday, 3 October 2019 at 14:21:37 UTC, Andrea Fontana 
 wrote:
 In D arrays are fat pointer instead:

 int[10] my_array;

 my_array is actually a pair ptr+length.

 ```
 int[10] my_static_array;
 int[] my_dynamic_array;
 ```

 my_static_array will not be a fat pointer. Length is known at 
 compile time. Address is known at link/load time so it's also 
 not a pointer but just a normal variable (& will give you a 
 pointer to the array data).
 my_dynamic_array will be a pair for ptr+length.

 -Johan

What if you pass a static array to a function that expects a 
dynamic array? Will D automatically create a dynamic array from 
the static array?

Oct 04 2019

Dennis <dkorpel gmail.com> writes:

On Friday, 4 October 2019 at 18:30:17 UTC, IGotD- wrote:
 What if you pass a static array to a function that expects a 
 dynamic array. Will D automatically create a dynamic array from 
 the static array?

No, you have to append [] to create a slice from the static array.

Oct 04 2019

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Oct 04, 2019 at 06:34:40PM +0000, Dennis via Digitalmars-d-learn wrote:
 On Friday, 4 October 2019 at 18:30:17 UTC, IGotD- wrote:
 What if you pass a static array to a function that expects a dynamic
 array. Will D automatically create a dynamic array from the static
 array?

 
 No, you have to append [] to create a slice from the static array.

Actually, it *does* automatically convert the static array to a slice.
Which is actually a bug, because you get problems like this:

	int[] func() {
		int[5] data = [ 1, 2, 3, 4, 5 ];
		return data; // implicit conversion to int[]
	}
	void main() {
		auto data = func();
		// Oops: data now references out-of-scope elements on the stack.
		// Expect garbage values and stack corruption exploits.
	}

See:
	https://issues.dlang.org/show_bug.cgi?id=15932


T

-- 
"How are you doing?" "Doing what?"

Oct 04 2019

Dennis <dkorpel gmail.com> writes:

On Friday, 4 October 2019 at 18:43:34 UTC, H. S. Teoh wrote:
 Actually, it *does* automatically convert the static array to a 
 slice.

You're right, I'm confused. I recall there was a situation where 
you had to explicitly slice a static array, but I can't think of 
it now.

Oct 04 2019

Adam D. Ruppe <destructionator gmail.com> writes:

On Friday, 4 October 2019 at 19:03:14 UTC, Dennis wrote:
 You're right, I'm confused. I recall there was a situation 
 where you had to explicitly slice a static array, but I can't 
 think of it now.

When passing to a range template it is necessary, otherwise the 
template will see it as non-resizable and it will fail the range 
constraint check.
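
A quick sketch of that range case, using `isInputRange` from `std.range.primitives` and `sum` from `std.algorithm.iteration` as an example range template:

```d
import std.algorithm.iteration : sum;
import std.range.primitives : isInputRange;

// A static array cannot shrink, so it has no usable popFront and is not a range.
static assert(!isInputRange!(int[5]));
// A slice of it is a range.
static assert(isInputRange!(int[]));

void main()
{
    int[5] data = [1, 2, 3, 4, 5];

    // Slice explicitly so the template is instantiated with int[], not int[5].
    assert(data[].sum == 15);
}
```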

(personally, though, I like to explicitly slice it all the time; 
it is more clear and the habit is nice)

Oct 04 2019

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Oct 04, 2019 at 07:08:04PM +0000, Adam D. Ruppe via Digitalmars-d-learn
wrote:
 On Friday, 4 October 2019 at 19:03:14 UTC, Dennis wrote:
 You're right, I'm confused. I recall there was a situation where you
 had to explicitly slice a static array, but I can't think of it now.

 
 When passing to a range template it is necessary, otherwise the
 template will see it as non-resizable and it will fail the range
 constraint check.
 
 (personally though I like to explicitly slice it all the time though,
 it is more clear and the habit is nice)

Yeah, and it's always better to consciously slice it, and therefore be
reminded to think about the implications of slicing it, so that you'll
be aware not to let the slice leak past the lifetime of the underlying
static array.


T

-- 
Unix was not designed to stop people from doing stupid things, because that
would also stop them from doing clever things. -- Doug Gwyn

Oct 04 2019

Dennis <dkorpel gmail.com> writes:

On Friday, 4 October 2019 at 19:08:04 UTC, Adam D. Ruppe wrote:
 (personally though I like to explicitly slice it all the time 
 though, it is more clear and the habit is nice)

Turns out I have this habit as well. I'm looking through some of 
my code and see redundant slicing everywhere.

Oct 04 2019

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Friday, October 4, 2019 1:22:26 PM MDT Dennis via Digitalmars-d-learn 
wrote:
 On Friday, 4 October 2019 at 19:08:04 UTC, Adam D. Ruppe wrote:
 (personally though I like to explicitly slice it all the time
 though, it is more clear and the habit is nice)

 Turns out I have this habit as well. I'm looking through some of
 my code and see redundant slicing everywhere.

Really, it should be required by the language, because it's not something
that you want to be hidden. It's an easy source of bugs - especially once
you start passing that dynamic array around. It's incredibly useful to be
able to do it, but you need to be careful with such code. It's the array
equivalent of taking the address of a local variable and passing a pointer
to it around.

IIRC, -dip1000 improves the situation by making it so that the type of a
slice of a static array is scope, but it's still easy to miss, since it only
affects @safe code. It should certainly be possible to slice a static array
in @system code without having to deal with scope, but the fact that
explicit slicing isn't required in such a case makes it more error-prone
than it would be if explicit slicing were required.

- Jonathan M Davis

Oct 04 2019

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Oct 04, 2019 at 11:43:34AM -0700, H. S. Teoh via Digitalmars-d-learn
wrote:
 On Fri, Oct 04, 2019 at 06:34:40PM +0000, Dennis via Digitalmars-d-learn wrote:
 On Friday, 4 October 2019 at 18:30:17 UTC, IGotD- wrote:
 What if you pass a static array to a function that expects a
 dynamic array. Will D automatically create a dynamic array from
 the static array?

 
 No, you have to append [] to create a slice from the static array.

 
 Actually, it *does* automatically convert the static array to a slice.

[...]

Here's an actual working example that illustrates the pitfall of this
implicit conversion:

-----
	struct S {
		int[] data;
		this(int[] _data) { data = _data; }
	}
	S makeS() {
		int[5] data = [ 1, 2, 3, 4, 5 ];
		return S(data);
	}
	void func(S s) {
		import std.stdio;
		writeln("s.data = ", s.data);
	}
	void main() {
		S s = makeS();
		func(s);
	}
-----

Expected output:
	s.data = [1, 2, 3, 4, 5]

Actual output:
	s.data = [-2111884160, 32766, 1535478075, 22053, 5]
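
One possible way to get the expected output, sketched under the assumption that a GC copy is acceptable here: call `.dup` on the slice so that `S` holds heap memory that outlives `makeS`. This is only one option, not necessarily the best fix for every case.

```d
struct S
{
    int[] data;
    this(int[] _data) { data = _data; }
}

S makeS()
{
    int[5] data = [1, 2, 3, 4, 5];
    // Copy the elements to the GC heap instead of keeping a slice of the stack.
    return S(data[].dup);
}

void main()
{
    import std.stdio : writeln;

    S s = makeS();
    assert(s.data == [1, 2, 3, 4, 5]);
    writeln("s.data = ", s.data);
}
```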


T

-- 
MSDOS = MicroSoft's Denial Of Service

Oct 04 2019

Dennis <dkorpel gmail.com> writes:

On Friday, 4 October 2019 at 18:53:30 UTC, H. S. Teoh wrote:
 Here's an actual working example that illustrates the pitfall 
 of this implicit conversion:

Luckily it's caught by -dip1000

Oct 04 2019

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Oct 04, 2019 at 07:21:34PM +0000, Dennis via Digitalmars-d-learn wrote:
 On Friday, 4 October 2019 at 18:53:30 UTC, H. S. Teoh wrote:
 Here's an actual working example that illustrates the pitfall of
 this implicit conversion:

 
 Luckily it's caught by -dip1000

Nice!


T

-- 
"A man's wife has more power over him than the state has." -- Ralph Emerson

Oct 04 2019

rikki cattermole <rikki cattermole.co.nz> writes:

On 04/10/2019 3:13 AM, IGotD- wrote:
 According to the GC documentation this code snippet
 
 char* p = new char[10];
 char* q = p + 6; // ok
 q = p + 11;      // error: undefined behavior
 q = p - 1;       // error: undefined behavior
 
 suggests that char *p is really a "fat pointer" with size information.

The pointer is raw.
There is no size information stored with it.

The GC will store size information separately from it so it can know 
about reallocation and what its memory range is to search for.

 However, if we get memory from some C library that allocated it 
 with malloc, we have no size information. We would get a char* 
 without any size information, and according to the documentation 
 we can do anything, including accessing out of bounds.

Access out of bounds is do-able with a pointer allocated by the GC.

int[] array;
array.length = 5;

int* arrayPointer = array.ptr;
// Compiles, but reads past the end: undefined behavior (may crash or yield garbage).
int value = arrayPointer[10];

And of course that won't work in @safe code.
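
For contrast, here is a small sketch of what happens when the same index goes through the slice instead of the raw pointer. It assumes bounds checks are enabled, which is the default build (they can be switched off, for example with -boundscheck=off). The helper `indexIsRejected` is invented for this example, and catching `RangeError` is done only to demonstrate the check.

```d
import core.exception : RangeError;

// Hypothetical helper: true if reading slice[index] trips the bounds check.
bool indexIsRejected(int[] slice, size_t index)
{
    try
    {
        int value = slice[index];   // the slice carries its length, so this is checked
        return false;
    }
    catch (RangeError e)
    {
        // Catching an Error is for demonstration only; normally let it propagate.
        return true;
    }
}

void main()
{
    int[] array = new int[5];

    assert(!indexIsRejected(array, 4));   // last valid element
    assert(indexIsRejected(array, 10));   // out of bounds, caught by the check
    // array.ptr[10] would skip this check entirely, as in the snippet above.
}
```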

 How does D internally know that a pointer was previously allocated by 
 the GC or malloc?

Either the GC has that information or it doesn't.

 If we were to replace the GC with reference counting, how would D be able 
 to distinguish a reference counted pointer from a raw pointer at compile 
 time in order to insert the code associated with the reference counting?

It can't.

 This brings me back to MS managed C++ where they actually had two types 
 of "pointers" a managed pointer and the normal C++ pointers. Like this:
 
 MyType^ instance = gcnew MyType();
 
 In this case it was obvious what was done with the GC and what wasn't (past 
 tense since managed C++ is deprecated). In this case it would be trivial 
 to replace the GC algorithm with whatever you want, since the compiler 
 knows the type at compile time.

There is only one type of pointer in D.

The GC is a library with language hooks. Nothing more than that.
It is easily swappable from within druntime.

But it does need to hook into threads and control them (e.g. thread 
local storage and pausing them) so there are a few restrictions like it 
must be chosen immediately after libc initialization at the start of 
druntime initialization.

Oct 03 2019
