
digitalmars.D.learn - what's the right way to get char* from string?

reply aki <aki google.com> writes:
Hello,

When I need to call a C function, I often need to
get a char* pointer from a string.

The "Interfacing to C++" page:
https://dlang.org/spec/cpp_interface.html
has the following example.

extern (C) int strcmp(char* string1, char* string2);
import std.string;
int myDfunction(char[] s)
{
     return strcmp(std.string.toStringz(s), "foo");
}

But this is incorrect, because toStringz() returns an immutable
pointer.
One way is to write a mutable version of toStringz():

char* toStringzMutable(string s) @trusted pure nothrow {
     auto copy = new char[s.length + 1];
     copy[0..s.length] = s[];
     copy[s.length] = 0;
     return copy.ptr;
}

But I think this is a common need,
so why is it not provided by Phobos?
(Or tell me if it is.)

Thanks,
aki
May 05 2016
next sibling parent reply pineapple <meapineapple gmail.com> writes:
On Thursday, 5 May 2016 at 07:49:46 UTC, aki wrote:
 Hello,

 When I need to call C function, often need to
 have char* pointer from string.
This might help:

    import std.traits : isSomeString;
    import std.string : toStringz;

    extern (C) int strcmp(char* string1, char* string2);

    int strcmpD0(S)(in S lhs, in S rhs)
    if(is(S == string) || is(S == const(char)[])) { // Best
        return strcmp(
            cast(char*) toStringz(lhs),
            cast(char*) toStringz(rhs)
        );
    }

    int strcmpD1(S)(in S lhs, in S rhs)
    if(is(S == string) || is(S == const(char)[])) { // Works
        return strcmp(
            cast(char*) lhs.ptr,
            cast(char*) rhs.ptr
        );
    }

    /+
    int strcmpD2(S)(in S lhs, in S rhs)
    if(is(S == string) || is(S == const(char)[])) { // Breaks
        return strcmp(
            toStringz(lhs),
            toStringz(rhs)
        );
    }
    +/

    void main(){
        import std.stdio;
        writeln(strcmpD0("foo", "bar")); // Best
        writeln(strcmpD1("foo", "bar")); // Works
        //writeln(strcmpD2("foo", "bar")); // Breaks
    }
May 05 2016
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 5/5/16 11:53 AM, pineapple wrote:
 On Thursday, 5 May 2016 at 07:49:46 UTC, aki wrote:
 Hello,

 When I need to call C function, often need to
 have char* pointer from string.
 This might help:

 import std.traits : isSomeString;
 import std.string : toStringz;

 extern (C) int strcmp(char* string1, char* string2);

 int strcmpD0(S)(in S lhs, in S rhs) if(is(S == string) || is(S ==
 const(char)[])) { // Best
      return strcmp(
          cast(char*) toStringz(lhs),
          cast(char*) toStringz(rhs)
      );
 }
This is likely a correct solution, because strcmp does not modify any data in the string itself. Practically speaking, you can define strcmp as taking const(char)*. This
 int strcmpD1(S)(in S lhs, in S rhs) if(is(S == string) || is(S ==
 const(char)[])) { // Works
      return strcmp(
          cast(char*) lhs.ptr,
          cast(char*) rhs.ptr
      );
 }
Note, this only works if the strings are literals. Do not use this mechanism in general.
 /+
 int strcmpD2(S)(in S lhs, in S rhs) if(is(S == string) || is(S ==
 const(char)[])) { // Breaks
      return strcmp(
          toStringz(lhs),
          toStringz(rhs)
      );
 }
 +/
Given a possibility that you are calling a C function that may actually modify the data, there isn't a really good way to do this. Only thing I can think of is.. um... horrible:

char *toCharz(string s)
{
    auto cstr = s.toStringz;
    return cstr[0 .. s.length + 1].dup.ptr;
}

-Steve
May 05 2016
parent Steven Schveighoffer <schveiguy yahoo.com> writes:
On 5/5/16 3:36 PM, Steven Schveighoffer wrote:
 Only thing I can think of is.. um... horrible:

 char *toCharz(string s)
 {
     auto cstr = s.toStringz;
     return cstr[0 .. s.length + 1].dup.ptr;
 }
Ignore this. What Jonathan said :)

-Steve
May 05 2016
prev sibling next sibling parent reply Jonathan M Davis via Digitalmars-d-learn writes:
On Thu, 05 May 2016 07:49:46 +0000
aki via Digitalmars-d-learn <digitalmars-d-learn puremagic.com> wrote:

 Hello,

 When I need to call C function, often need to
 have char* pointer from string.

 "Interfacing to C++" page:
 https://dlang.org/spec/cpp_interface.html
 have following example.

 extern (C) int strcmp(char* string1, char* string2);
 import std.string;
 int myDfunction(char[] s)
 {
      return strcmp(std.string.toStringz(s), "foo");
 }

 but this is incorrect because toStringz() returns immutable
 pointer.
 One way is to write mutable version of toStringz()

 char* toStringzMutable(string s) @trusted pure nothrow {
      auto copy = new char[s.length + 1];
      copy[0..s.length] = s[];
      copy[s.length] = 0;
      return copy.ptr;
 }

 But I think this is common needs,
 why it is not provided by Phobos?
 (or tell me if it has)
If you want a different mutability, then use the more general function std.utf.toUTFz. e.g. from the documentation:

    auto p1 = toUTFz!(char*)("hello world");
    auto p2 = toUTFz!(const(char)*)("hello world");
    auto p3 = toUTFz!(immutable(char)*)("hello world");
    auto p4 = toUTFz!(char*)("hello world"d);
    auto p5 = toUTFz!(const(wchar)*)("hello world");
    auto p6 = toUTFz!(immutable(dchar)*)("hello world"w);

- Jonathan M Davis
May 05 2016
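A minimal sketch of how toUTFz could be used when the C side really does write to the buffer. `upcaseInPlace` and `upcased` are hypothetical names invented for this illustration; `upcaseInPlace` is only a stand-in for a C API that modifies its argument. It relies on toUTFz!(char*) allocating a fresh mutable copy when given an immutable string, so the original string is left untouched.

```d
import std.utf : toUTFz;
import core.stdc.string : strlen;
import core.stdc.ctype : toupper;

// Hypothetical stand-in for a C function that modifies its argument in place.
extern (C) void upcaseInPlace(char* s) nothrow @nogc
{
    for (; *s != '\0'; ++s)
        *s = cast(char) toupper(*s);
}

// Hypothetical helper: copies `s` into a mutable, null-terminated buffer,
// lets the C-style function modify the copy, and returns it as a slice.
char[] upcased(string s)
{
    char* p = toUTFz!(char*)(s); // mutable copy, null-terminated
    upcaseInPlace(p);
    return p[0 .. strlen(p)];    // slice without the terminator
}

void main()
{
    string original = "hello";
    assert(upcased(original) == "HELLO");
    assert(original == "hello"); // the immutable source is unchanged
}
```

The buffer returned by toUTFz is GC-allocated, so if the C function stores the pointer beyond the call, a reference to it has to be kept alive on the D side.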
parent aki <aki google.com> writes:
On Thursday, 5 May 2016 at 11:35:09 UTC, Jonathan M Davis wrote:
 If you want a different mutability, then use the more general 
 function std.utf.toUTFz. e.g. from the documentation:

     auto p1 = toUTFz!(char*)("hello world");
     auto p2 = toUTFz!(const(char)*)("hello world");
     auto p3 = toUTFz!(immutable(char)*)("hello world");
     auto p4 = toUTFz!(char*)("hello world"d);
     auto p5 = toUTFz!(const(wchar)*)("hello world");
     auto p6 = toUTFz!(immutable(dchar)*)("hello world"w);

 - Jonathan M Davis
Ah! This can be a solution. Thanks Jonathan. -- aki.
May 05 2016
prev sibling next sibling parent Alex Parrill <initrd.gz gmail.com> writes:
On Thursday, 5 May 2016 at 07:49:46 UTC, aki wrote:
 extern (C) int strcmp(char* string1, char* string2);
This signature of strcmp is incorrect. strcmp accepts const char* arguments [1], which in D would be written as const(char)*. The immutable(char)* values returned from toStringz are implicitly convertible to const(char)* and are therefore useable as-is as arguments to strcmp.

    import std.string;
    extern (C) int strcmp(const(char)* string1, const(char)* string2);
    auto v = strcmp(somestring1.toStringz, somestring2.toStringz);

[1] http://linux.die.net/man/3/strcmp
May 05 2016
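As a small self-contained sketch of the point about const(char)*: with the parameter types declared as const, toStringz results can be passed directly, and mutable char[] input works too. `compareC` is a hypothetical wrapper name used only for this illustration.

```d
import std.string : toStringz;

// Own declaration of the C function, with const-correct parameter types.
extern (C) int strcmp(const(char)* s1, const(char)* s2) nothrow @nogc;

// Hypothetical wrapper: accepts string, const(char)[] and char[] alike,
// because all of them convert implicitly to const(char)[].
int compareC(const(char)[] a, const(char)[] b)
{
    // toStringz returns immutable(char)*, which converts to const(char)*.
    return strcmp(a.toStringz, b.toStringz);
}

void main()
{
    char[] mutableInput = "foo".dup;
    assert(compareC(mutableInput, "foo") == 0);
    assert(compareC("abc", "abd") < 0);
    assert(compareC("abd", "abc") > 0);
}
```

No cast is needed anywhere, which is the main reason to prefer the const-correct declaration over casting away immutable.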
prev sibling parent ZombineDev <petar.p.kirov gmail.com> writes:
On Thursday, 5 May 2016 at 07:49:46 UTC, aki wrote:
 Hello,

 When I need to call C function, often need to
 have char* pointer from string.

 "Interfacing to C++" page:
 https://dlang.org/spec/cpp_interface.html
 have following example.

 extern (C) int strcmp(char* string1, char* string2);
 import std.string;
 int myDfunction(char[] s)
 {
     return strcmp(std.string.toStringz(s), "foo");
 }

 but this is incorrect because toStringz() returns immutable 
 pointer.
 One way is to write mutable version of toStringz()

 char* toStringzMutable(string s) @trusted pure nothrow {
     auto copy = new char[s.length + 1];
     copy[0..s.length] = s[];
     copy[s.length] = 0;
     return copy.ptr;
 }

 But I think this is common needs,
 why it is not provided by Phobos?
 (or tell me if it has)

 Thanks,
 aki
In this particular case, if you `import core.stdc.string : strcmp` instead of providing your own extern declaration, it should work, because in there the signature is correctly typed as `in char*`, which is essentially the same as `const(char*)` and can therefore accept mutable, const and immutable arguments alike. It also has the correct attributes, so you can call it from `pure`, `nothrow` and `@nogc` code.

As others have said, when you do need to convert a string slice to a pointer to a null-terminated char/wchar/dchar string, `toUTFz` can be very useful.

But where possible, you should prefer functions that take an explicit length parameter, so you can avoid memory allocation:

```
string s1, s2;
import std.algorithm : min;
import core.stdc.string : strncmp;
strncmp(s1.ptr, s2.ptr, min(s1.length, s2.length));
// (`min` is used to prevent the C function from
//  accessing data beyond the smallest
//  of the two string slices).
```

Also, string slices that point to a **whole** string literal are automatically null-terminated:

```
import core.stdc.string : strlen;
// lit is zero-terminated
string lit = "asdf";
assert (lit.ptr[lit.length] == '\0');
assert (strlen(lit.ptr) == lit.length);
```

However, you need to be very careful, because as soon as you make a sub-slice, this property disappears:

```
// slice is not zero-terminated.
string slice = lit[0..2];
assert (slice.ptr[slice.length] == 'd');
assert (strlen(slice.ptr) != slice.length);
```

This means that you can't be sure that a string slice is zero-terminated unless you can see in your code that it points to a string literal and you're sure that it will never be changed to point to something else (like something returned from a function).
May 05 2016
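One caveat with the strncmp-plus-min approach, sketched below under my own assumptions: when one slice is a prefix of the other, strncmp over the common length returns 0 even though the slices differ, so the lengths still need to be compared afterwards. `compareSlices` is a hypothetical helper name, not something from Phobos.

```d
import core.stdc.string : strncmp;
import std.algorithm.comparison : min;

// Hypothetical helper: allocation-free comparison of two slices that
// never reads past the end of either one.
int compareSlices(const(char)[] a, const(char)[] b) @trusted nothrow @nogc
{
    immutable n = min(a.length, b.length);
    // Skip the C call when n == 0: an empty slice may have a null .ptr.
    immutable r = n == 0 ? 0 : strncmp(a.ptr, b.ptr, n);
    if (r != 0)
        return r;
    // The common prefix is equal, so the shorter slice sorts first.
    return (a.length > b.length) - (a.length < b.length);
}

void main()
{
    string lit = "foobar";
    string slice = lit[0 .. 3]; // "foo", not null-terminated

    assert(compareSlices(slice, "foo") == 0);
    assert(compareSlices("foo", "foobar") < 0); // prefix sorts first
    assert(compareSlices("foobar", "foo") > 0);
    assert(compareSlices("abc", "abd") < 0);
    assert(compareSlices("", "") == 0);
}
```

Note that strncmp still stops at an embedded '\0', so this sketch treats anything after a NUL byte inside the common prefix as equal; for binary-safe comparison a different function would be needed.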