digitalmars.D.learn - C interface provides a pointer and a length... wrap without copying?


cy <dlang verge.info.tm> writes:

So a lovely C library does its own opaque allocation, and 
provides access to the malloc'd memory, and that memory's length. 
Instead of copying the results into garbage collected memory 
(which would probably be smart) I was thinking about creating a 
structure like:

struct WrappedString {
   byte* ptr;
   size_t length;
}

And then implementing opIndex for it, and opEquals for all the 
different string types, and conversions to those types, and then 
it occurred to me that this sounds like a lot of work. Has 
anybody done this already? Made a pointer/length pair, that acts 
like a string?

Mar 11 2017

ketmar <ketmar ketmar.no-ip.org> writes:

cy wrote:

 So a lovely C library does its own opaque allocation, and provides access 
 to the malloc'd memory, and that memory's length. Instead of copying the 
 results into garbage collected memory (which would probably be smart) I 
 was thinking about creating a structure like:

 struct WrappedString {
    byte* ptr;
    size_t length;
 }

 And then implementing opIndex for it, and opEquals for all the different 
 string types, and conversions to those types, and then it occurred to me 
 that this sounds like a lot of work. Has anybody done this already? Made 
 a pointer/length pair, that acts like a string?

yep, it was done before.

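  import core.stdc.stdlib : malloc;
  // malloc takes a size in bytes, so scale by int.sizeof to hold 1024 ints
  int* a = cast(int*)malloc(1024 * int.sizeof);
  auto b = a[0..1024];
  // yay, b is just an ordinary slice now!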

Mar 11 2017

Nicholas Wilson <iamthewilsonator hotmail.com> writes:

On Saturday, 11 March 2017 at 22:39:02 UTC, cy wrote:
 So a lovely C library does its own opaque allocation, and 
 provides access to the malloc'd memory, and that memory's 
 length. Instead of copying the results into garbage collected 
 memory (which would probably be smart) I was thinking about 
 creating a structure like:

 struct WrappedString {
   byte* ptr;
   size_t length;
 }

 And then implementing opIndex for it, and opEquals for all the 
 different string types, and conversions to those types, and 
 then it occurred to me that this sounds like a lot of work. Has 
 anybody done this already? Made a pointer/length pair, that 
 acts like a string?

A string *is* a pointer length pair, an immutable(char)[]. Your 
`WrappedString` is effectively a byte[].

All you need to do is:

// or byte/char, whatever the pointed-to type returned by giveMeTheMemory is
ubyte[] arr;
arr = giveMeTheMemory()[0 .. getMeTheLength()];

No need to reimplement anything.
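
As a rough sketch of what that looks like end to end: the buffer below is malloc'd locally to stand in for the C library, and `asText` is a hypothetical helper name, not part of any library. It assumes the bytes are valid UTF-8 and that the caller keeps the buffer alive while the slice is in use.

```d
import core.stdc.stdlib : free, malloc;
import core.stdc.string : memcpy;

// View a C buffer as text without copying. const, because D does not own it.
// (Hypothetical helper; assumes the bytes are valid UTF-8.)
const(char)[] asText(const(ubyte)* ptr, size_t length)
{
    return cast(const(char)[]) ptr[0 .. length];
}

void main()
{
    // Stand-in for the C library's own allocation (made-up data).
    immutable msg = "hello";
    ubyte* fromC = cast(ubyte*) malloc(msg.length);
    assert(fromC !is null);
    scope (exit) free(fromC);
    memcpy(fromC, msg.ptr, msg.length);

    ubyte[] arr = fromC[0 .. msg.length];   // pointer + length, no copy
    assert(arr.ptr is fromC);

    auto text = asText(fromC, msg.length);
    assert(text == "hello");
    assert(text[1] == 'e');                 // indexing and == just work
}
```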

Mar 11 2017

cy <dlang verge.info.tm> writes:

On Saturday, 11 March 2017 at 23:43:54 UTC, Nicholas Wilson wrote:
 A string *is* a pointer length pair, an immutable(char)[].

Yes, but surely there's some silly requirement, like that the 
pointer must only ever point to garbage collected memory, or 
something?

 // or byte/char, whatever the pointed-to type returned by giveMeTheMemory is
 ubyte[] arr;
 arr = giveMeTheMemory()[0 .. getMeTheLength()];

...guess not! :D

Mar 11 2017

ketmar <ketmar ketmar.no-ip.org> writes:

cy wrote:

 On Saturday, 11 March 2017 at 23:43:54 UTC, Nicholas Wilson wrote:
 A string *is* a pointer length pair, an immutable(char)[].

 Yes, but surely there's some silly requirement, like that the pointer 
 must only ever point to garbage collected memory, or something?

why should it? a slice can point anywhere, and the GC is smart enough to know 
what memory it owns and what it doesn't. if you try to append something to a 
non-GC-owned slice, the GC will make a copy first (the slice then points at 
the copy, and the malloc'd block still has to be freed by hand). so the only 
requirement is: "don't use `~=` on it if you don't want to have a memory leak".

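A small sketch of that behaviour, assuming a locally malloc'd buffer in place of the C library's: after `~=` the slice points somewhere else, and the original block is still ours to free.

```d
import core.stdc.stdlib : free, malloc;

void main()
{
    enum n = 4;
    int* raw = cast(int*) malloc(n * int.sizeof);
    assert(raw !is null);
    scope (exit) free(raw);   // still our job, whatever happens to the slice

    int[] s = raw[0 .. n];
    s[] = 7;

    s ~= 8;                   // not GC memory, so the data is copied to a GC block

    assert(s.ptr !is raw);    // the slice has moved
    assert(s.length == n + 1);

    s[0] = 1;
    assert(raw[0] == 7);      // writes no longer reach the C buffer
}
```
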
Mar 11 2017

Jonathan M Davis via Digitalmars-d-learn writes:

On Sunday, March 12, 2017 02:47:19 cy via Digitalmars-d-learn wrote:
 On Saturday, 11 March 2017 at 23:43:54 UTC, Nicholas Wilson wrote:
 A string *is* a pointer length pair, an immutable(char)[].

 Yes, but surely there's some silly requirement, like that the
 pointer must only ever point to garbage collected memory, or
 something?

No. A dynamic array is just a struct with a pointer and a length. Aside from
avoiding accessing memory that is no longer valid or avoiding allocations,
it doesn't matter one whit what memory it points to.

char[5] a;
char[] b = a;

is perfectly valid as is slicing memory that comes from malloc or some other
crazy place. Most dynamic array operations don't even need to care about
what memory they refer to. If you access an element, it does the math and
dereferences it like you'd get with naked pointer arithmetic. The difference
is that it also does bounds checking for you, because it knows the length.
If you slice the array, then you just get an array with an adjusted pointer
and/or length.
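
To make the "adjusted pointer and/or length" point concrete, here is a minimal sketch using only a static array; nothing in it touches the GC.

```d
void main()
{
    char[5] a = "hello";
    char[] b = a;            // slice of the static array, no copy
    assert(b.ptr is a.ptr && b.length == 5);

    char[] c = b[1 .. 4];    // same memory, pointer moved by 1, length 3
    assert(c.ptr is a.ptr + 1 && c.length == 3);

    c[0] = 'E';              // writes go straight through to a
    assert(a[1] == 'E');
}
```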

The only times that the GC gets involved are when doing anything involving
appending. If you call capacity, reserve, or ~=, then the GC gets involved.
In those cases, the GC looks at the pointer to determine whether it can
append. If it finds that it's GC-allocated memory, then it will look to see
what room there is in the GC-allocated block after the array. If you're
calling capacity, it will just return how large the array can grow to
without being reallocated. If you're calling reserve or ~= it checks to see
whether the capacity is great enough to grow into that space. If not, it
will allocate a new block of memory, copy the array's elements into that,
and set the array to point to it. In the case of ~=, it will also grow the
array into that memory and put the new elements in it, whereas in the case
of reserve, it just does the reallocation. If there was enough space, then
~= will just expand the array into that space without reallocating, and
reserve will do nothing.

If you have a dynamic array that was not allocated by the GC (be it a slice
of a static array or malloc-ed memory or whatever), then its capacity will
be 0. So, capacity will tell you 0, and ~= and reserve will always result in
the array being reallocated.
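
A quick way to see this for yourself, as a sketch (the malloc'd buffer is just a stand-in for memory handed over by C):

```d
import core.stdc.stdlib : free, malloc;

void main()
{
    int* raw = cast(int*) malloc(4 * int.sizeof);
    assert(raw !is null);
    scope (exit) free(raw);

    int[] fromC = raw[0 .. 4];
    assert(fromC.capacity == 0);      // malloc'd memory: no room to grow in place

    int[4] fixed;
    int[] fromStack = fixed[];
    assert(fromStack.capacity == 0);  // same for a slice of a static array

    int[] fromGC = new int[](4);
    assert(fromGC.capacity >= 4);     // GC block: capacity covers at least the length

    fromC ~= 1;                       // first append reallocates into GC memory
    assert(fromC.ptr !is raw);
    assert(fromC.capacity >= 5);
}
```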

I would suggest that you read this excellent article:

http://dlang.org/d-array-article.html

though I would point out that it uses the wrong terminology in that it
refers to the GC-allocated buffer as the dynamic array rather than T[], and
it refers to T[] as a slice, whereas the official terminology is that T[] is
the dynamic array and that buffer doesn't have an official name, and while
T[] _is_ a slice of memory (assuming that it's not null), slice refers to a
lot of other stuff in D (e.g. slicing a container gives you a range over that
container, so it's a slice of the container, but it's not T[]). And your use
case here is a perfect example of why T[] is the dynamic array and not the
GC-allocated buffer.

Dynamic arrays simply don't care about the memory that they refer to,
because they don't manage their own memory. ~=, reserve, and capacity do
care about what memory a dynamic array refers to, but they work exactly the
same way regardless of what the dynamic array refers to. It's just a
question of when reallocations do or do not occur.

The big concern with slicing static arrays or malloc-ed memory to get a
dynamic array is that it's then up to you to ensure that that dynamic array
does not outlive the memory that it refers to. So, there is some danger
there, but that's no different from operating on raw pointers, and operating
on dynamic arrays gives better safety thanks to bounds-checking, and it also
works with appending, though that will cause the dynamic array to then refer
to GC-allocated memory instead of the original memory.
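
If the data does need to outlive the C buffer, the simple route is to copy it out once and release the original. The sketch below assumes the buffer was malloc'd and may be released with `free`; a real C library may require its own release function instead. `takeOwnership` is a hypothetical name.

```d
import core.stdc.stdlib : free, malloc;
import core.stdc.string : memcpy;

// Hypothetical helper: copy into GC memory, then release the C buffer.
// Assumes ptr came from malloc and nothing else still refers to it.
string takeOwnership(char* ptr, size_t length)
{
    scope (exit) free(ptr);        // runs after the copy below has been made
    return ptr[0 .. length].idup;  // immutable GC-owned copy
}

void main()
{
    immutable src = "from C";
    char* raw = cast(char*) malloc(src.length);
    assert(raw !is null);
    memcpy(raw, src.ptr, src.length);

    string s = takeOwnership(raw, src.length);
    assert(s == "from C");         // safe to keep; raw is gone
}
```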

- Jonathan M Davis

Mar 11 2017
