
digitalmars.D.learn - string to char array?

reply "Kyoji Klyden" <kyojiklyden yahoo.com> writes:
Quick question: what is the most efficient way to convert a string 
to a char array?
Jun 02 2015
next sibling parent reply "Dennis Ritchie" <dennis.ritchie mail.ru> writes:
On Tuesday, 2 June 2015 at 15:07:58 UTC, Kyoji Klyden wrote:
 quick question: What is the most efficient way to covert a 
 string to a char array?
    string s = "str";
    char[] strArr = s.dup;
Jun 02 2015
parent reply "Kyoji Klyden" <kyojiklyden yahoo.com> writes:
On Tuesday, 2 June 2015 at 15:26:50 UTC, Dennis Ritchie wrote:
 On Tuesday, 2 June 2015 at 15:07:58 UTC, Kyoji Klyden wrote:
 quick question: What is the most efficient way to covert a 
 string to a char array?
string s = "str"; char[] strArr = s.dup;
Thanks! :)
Jun 02 2015
parent "Meta" <jared771 gmail.com> writes:
On Tuesday, 2 June 2015 at 15:32:12 UTC, Kyoji Klyden wrote:
 On Tuesday, 2 June 2015 at 15:26:50 UTC, Dennis Ritchie wrote:
 On Tuesday, 2 June 2015 at 15:07:58 UTC, Kyoji Klyden wrote:
 quick question: What is the most efficient way to covert a 
 string to a char array?
string s = "str"; char[] strArr = s.dup;
Thanks! :)
Note that this will allocate a new garbage collected array.
Jun 02 2015
prev sibling parent reply "Alex Parrill" <initrd.gz gmail.com> writes:
On Tuesday, 2 June 2015 at 15:07:58 UTC, Kyoji Klyden wrote:
 quick question: What is the most efficient way to covert a 
 string to a char array?
A string is, by definition in D, a character array, specifically `immutable(char)[]`. It's not like, for example, Java in which it's a completely separate type; you can perform all the standard array operations on strings. If you need to mutate a string, then you can create a mutable `char[]` by doing `somestring.dup` as Dennis already mentioned.
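To make the point above concrete, here is a minimal sketch. The helper name `mutableCopy` is made up for illustration; only `.dup` and the `string` alias come from the discussion.

```d
// `string` is only an alias, so this holds at compile time.
static assert(is(string == immutable(char)[]));

// Hypothetical helper, named here only for illustration.
char[] mutableCopy(string s)
{
    return s.dup; // allocates a new GC-managed, mutable copy
}

void main()
{
    string s = "str";
    char[] strArr = mutableCopy(s);
    strArr[0] = 'S';            // fine: strArr is mutable
    assert(strArr == "Str");
    assert(s == "str");         // the immutable original is untouched
    assert(s[1 .. $] == "tr");  // ordinary array operations work on strings
}
```

Writing `s[0] = 'S';` directly would be rejected by the compiler, which is why the copy is needed when mutation is the goal.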
Jun 02 2015
next sibling parent "Dennis Ritchie" <dennis.ritchie mail.ru> writes:
On Tuesday, 2 June 2015 at 15:53:33 UTC, Alex Parrill wrote:
 A string is, by definition in D, a character array, 
 specifically `immutable(char)[]`. It's not like, for example, 
 Java in which it's a completely separate type; you can perform 
 all the standard array operations on strings.
Yes, and I believe this is a problem in D, because when you create a multidimensional array of mutable strings you get into real trouble with .deepDup. I think a new built-in string data type would solve the problem. Having to write .dup just to get a mutable string is really funny! This is a problem!
Jun 02 2015
prev sibling parent reply "Kyoji Klyden" <kyojiklyden yahoo.com> writes:
On Tuesday, 2 June 2015 at 15:53:33 UTC, Alex Parrill wrote:
 On Tuesday, 2 June 2015 at 15:07:58 UTC, Kyoji Klyden wrote:
 quick question: What is the most efficient way to covert a 
 string to a char array?
A string is, by definition in D, a character array, specifically `immutable(char)[]`. It's not like, for example, Java in which it's a completely separate type; you can perform all the standard array operations on strings. If you need to mutate a string, then you can create a mutable `char[]` by doing `somestring.dup` as Dennis already mentioned.
The problem I was having was actually that an OpenGL function (specifically glShaderSource) wouldn't accept strings. I still can't get it to work, actually :P

    glShaderSource (uint, int, const(char*)*, const(int)*)

This one function is a bugger, been going at this for hours.

On Tuesday, 2 June 2015 at 15:38:24 UTC, Meta wrote:
 Note that this will allocate a new garbage collected array.
Thx for the heads up
Jun 02 2015
parent reply "Alex Parrill" <initrd.gz gmail.com> writes:
On Tuesday, 2 June 2015 at 16:23:26 UTC, Kyoji Klyden wrote:
 On Tuesday, 2 June 2015 at 15:53:33 UTC, Alex Parrill wrote:
 On Tuesday, 2 June 2015 at 15:07:58 UTC, Kyoji Klyden wrote:
 quick question: What is the most efficient way to covert a 
 string to a char array?
A string is, by definition in D, a character array, specifically `immutable(char)[]`. It's not like, for example, Java in which it's a completely separate type; you can perform all the standard array operations on strings. If you need to mutate a string, then you can create a mutable `char[]` by doing `somestring.dup` as Dennis already mentioned.
The problem I was having was actually that an opengl function (specifically glShaderSource) wouldn't accept strings. I'm still can't get it to work actually :P glShaderSource (uint, int, const(char*)*, const(int)*) This one function is a bugger, been going at this for hours. On Tuesday, 2 June 2015 at 15:38:24 UTC, Meta wrote:
 Note that this will allocate a new garbage collected array.
Thx for the heads up
glShaderSource accepts an array of null-terminated strings. Try this:

    import std.string : toStringz;

    string sources = source.toStringz;
    int len = source.length;
    glShaderSource(id, sources, 1, &sources, &len);
Jun 02 2015
parent reply "Kyoji Klyden" <kyojiklyden yahoo.com> writes:
On Tuesday, 2 June 2015 at 16:26:30 UTC, Alex Parrill wrote:
 On Tuesday, 2 June 2015 at 16:23:26 UTC, Kyoji Klyden wrote:
 On Tuesday, 2 June 2015 at 15:53:33 UTC, Alex Parrill wrote:
 On Tuesday, 2 June 2015 at 15:07:58 UTC, Kyoji Klyden wrote:
 quick question: What is the most efficient way to covert a 
 string to a char array?
A string is, by definition in D, a character array, specifically `immutable(char)[]`. It's not like, for example, Java in which it's a completely separate type; you can perform all the standard array operations on strings. If you need to mutate a string, then you can create a mutable `char[]` by doing `somestring.dup` as Dennis already mentioned.
The problem I was having was actually that an opengl function (specifically glShaderSource) wouldn't accept strings. I'm still can't get it to work actually :P glShaderSource (uint, int, const(char*)*, const(int)*) This one function is a bugger, been going at this for hours. On Tuesday, 2 June 2015 at 15:38:24 UTC, Meta wrote:
 Note that this will allocate a new garbage collected array.
Thx for the heads up
glShaderSource accepts an array of null-terminated strings. Try this: import std.string : toStringz; string sources = source.toStringz; int len = source.length; glShaderSource(id, sources, 1, &sources, &len);
src:

    string source = readText("test.glvert");

    const string sources = source.toStringz;
    const int len = source.length;

    GLuint vertShader = glCreateShader( GL_VERTEX_SHADER );

    glShaderSource(vertShader, 1, &sources, &len);

pt.d(26): Error: cannot implicitly convert expression (toStringz(source)) of type immutable(char)* to const(string)

pt.d(34): Error: function pointer glShaderSource (uint, int, const(char*)*, const(int)*) is not callable using argument types (uint, int, const(string)*, const(int)*)

-

I also tried passing the char array instead but no go.. What am I missing? :\
Jun 02 2015
next sibling parent reply "Alex Parrill" <initrd.gz gmail.com> writes:
On Tuesday, 2 June 2015 at 16:41:38 UTC, Kyoji Klyden wrote:

 src:

         string source = readText("test.glvert");
 	
 	const string sources = source.toStringz;
 	const int len = source.length;
 	
 	GLuint vertShader = glCreateShader( GL_VERTEX_SHADER );
 	
 	glShaderSource(vertShader, 1, &sources, &len);

 pt.d(26): Error: cannot implicitly convert expression 
 (toStringz(source)) of type immutable(char)* to const(string)

 pt.d(34): Error: function pointer glShaderSource (uint, int, 
 const(char*)*, const(int)*) is not callable using argument 
 types (uint, int, const(string)*, const(int)*)

 -

 I also tried passing the char array instead but no go.. What am 
 I missing? :\
Oops, do `const immutable(char)* sources = source.toStringz` (or just use `auto sources = ...`).
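A small sketch of why the declared type matters here: `toStringz` hands back a bare pointer, not a `string`. The helper name `cLength` is hypothetical; `toStringz` and `strlen` are the real Phobos and C library functions.

```d
import core.stdc.string : strlen;
import std.string : toStringz;

// Hypothetical helper: length as a C function would measure it.
size_t cLength(string s)
{
    auto z = s.toStringz; // inferred as immutable(char)*
    static assert(is(typeof(z) == immutable(char)*));
    return strlen(z);     // counts up to the terminating '\0'
}

void main()
{
    assert(cLength("void main() {}") == 14);
}
```

Since the result is a pointer, declaring the variable as `string` cannot work, which matches the first compiler error quoted above.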
Jun 02 2015
parent "Kyoji Klyden" <kyojiklyden yahoo.com> writes:
On Tuesday, 2 June 2015 at 17:03:32 UTC, Alex Parrill wrote:
 On Tuesday, 2 June 2015 at 16:41:38 UTC, Kyoji Klyden wrote:

 src:

        string source = readText("test.glvert");
 	
 	const string sources = source.toStringz;
 	const int len = source.length;
 	
 	GLuint vertShader = glCreateShader( GL_VERTEX_SHADER );
 	
 	glShaderSource(vertShader, 1, &sources, &len);

 pt.d(26): Error: cannot implicitly convert expression 
 (toStringz(source)) of type immutable(char)* to const(string)

 pt.d(34): Error: function pointer glShaderSource (uint, int, 
 const(char*)*, const(int)*) is not callable using argument 
 types (uint, int, const(string)*, const(int)*)

 -

 I also tried passing the char array instead but no go.. What 
 am I missing? :\
Oops, do `const immutable(char)* sources = source.toStringz` (or just use `auto sources = ...`).
OMG IT FINALLY WORKS :O (1% of the program complete!) Thank you very much, this was a huge help! It's such a small piece but I feel like I learned a lot from this. :)
Jun 02 2015
prev sibling parent reply "Kagamin" <spam here.lot> writes:
On Tuesday, 2 June 2015 at 16:41:38 UTC, Kyoji Klyden wrote:
 src:

         string source = readText("test.glvert");
 	
 	const string sources = source.toStringz;
 	const int len = source.length;
 	
 	GLuint vertShader = glCreateShader( GL_VERTEX_SHADER );
 	
 	glShaderSource(vertShader, 1, &sources, &len);
Judging by the docs, you don't need null-terminated strings and can use native D strings, just pass their lengths:

    string source = readText("test.glvert");

    const char* sources = source.ptr;
    const GLint len = cast(GLint)source.length;

    GLuint vertShader = glCreateShader( GL_VERTEX_SHADER );

    glShaderSource(vertShader, 1, &sources, &len);
Jun 03 2015
parent reply "Kyoji Klyden" <kyojiklyden yahoo.com> writes:
On Wednesday, 3 June 2015 at 08:11:16 UTC, Kagamin wrote:
 On Tuesday, 2 June 2015 at 16:41:38 UTC, Kyoji Klyden wrote:
 src:

        string source = readText("test.glvert");
 	
 	const string sources = source.toStringz;
 	const int len = source.length;
 	
 	GLuint vertShader = glCreateShader( GL_VERTEX_SHADER );
 	
 	glShaderSource(vertShader, 1, &sources, &len);
Judging by the docs, you don't need null-terminated strings and can use native D strings, just pass their lengths: string source = readText("test.glvert"); const char* sources = source.ptr; const GLint len = cast(GLint)source.length; GLuint vertShader = glCreateShader( GL_VERTEX_SHADER ); glShaderSource(vertShader, 1, &sources, &len);
Oh that also works quite well! Is casting necessary there though? DerelictGL treats GL types as D types, and .length is size_t so wouldn't it just turn into an int regardless?? Also the one part I don't understand is with &sources. So is this passing sources as a reference, but sources itself is a pointer to a pointer? I'm just a tad confused on how this part works :S
Jun 03 2015
next sibling parent reply "Marc Schütz" <schuetzm gmx.net> writes:
On Wednesday, 3 June 2015 at 10:21:20 UTC, Kyoji Klyden wrote:
 On Wednesday, 3 June 2015 at 08:11:16 UTC, Kagamin wrote:
 On Tuesday, 2 June 2015 at 16:41:38 UTC, Kyoji Klyden wrote:
 src:

       string source = readText("test.glvert");
 	
 	const string sources = source.toStringz;
 	const int len = source.length;
 	
 	GLuint vertShader = glCreateShader( GL_VERTEX_SHADER );
 	
 	glShaderSource(vertShader, 1, &sources, &len);
Judging by the docs, you don't need null-terminated strings and can use native D strings, just pass their lengths: string source = readText("test.glvert"); const char* sources = source.ptr; const GLint len = cast(GLint)source.length; GLuint vertShader = glCreateShader( GL_VERTEX_SHADER ); glShaderSource(vertShader, 1, &sources, &len);
Oh that also works quite well! Is casting necessary there though? DerelictGL treats GL types as D types, and .length is size_t so wouldn't it just turn into an int regardless??
size_t can be 32bit or 64bit, depending on your platform. Don't know how large GLint is. Assigning 64bit to 32bit requires an explicit cast, because it can lose information.
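As a rough illustration of the narrowing issue, assuming `GLint` is a 32-bit `int` (the usual definition, but check your binding): a plain `cast` truncates silently, while `std.conv.to` checks the range. The helper name `toGLLength` is made up.

```d
import std.conv : ConvOverflowException, to;

// Hypothetical helper: convert a D length to a 32-bit int, checking the range.
int toGLLength(size_t n)
{
    return n.to!int; // throws ConvOverflowException if n does not fit
}

void main()
{
    string source = "void main() {}";
    int viaCast = cast(int) source.length; // explicit cast: no range check
    assert(viaCast == 14);
    assert(toGLLength(source.length) == 14);

    bool threw = false;
    try
        toGLLength(size_t.max); // too large for int on 32- and 64-bit targets
    catch (ConvOverflowException)
        threw = true;
    assert(threw);
}
```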
 Also the one part I don't understand is with &sources. So is 
 this passing sources as a reference, but sources itself is a 
 pointer to a pointer? I'm just a tad confused on how this part 
 works :S
A string (or any other array slice for that matter) is internally the equivalent of:

    struct Slice(T) {
        T* ptr;
        size_t length;
    }

(maybe the order of fields is different, I never remember that part)

`&source` will give you the address of that structure.
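The layout can be poked at directly. This sketch assumes the layout given in the D ABI documentation (length word first, then the pointer), which is what the common compilers use; the helper `rawWords` is hypothetical and the reinterpreting cast is for exploration only.

```d
// Hypothetical helper: reinterpret a slice variable as two machine words.
size_t[2] rawWords(ref string s)
{
    return *cast(size_t[2]*) &s;
}

void main()
{
    string source = "abc";
    static assert(string.sizeof == 2 * size_t.sizeof);

    auto words = rawWords(source);
    assert(words[0] == source.length);           // first word: length
    assert(words[1] == cast(size_t) source.ptr); // second word: pointer

    // The slice variable itself lives somewhere else than the characters.
    assert(cast(void*) &source != cast(void*) source.ptr);
}
```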
Jun 03 2015
parent reply "Kyoji Klyden" <kyojiklyden yahoo.com> writes:
On Wednesday, 3 June 2015 at 10:28:50 UTC, Marc Schütz wrote:
 A string (or any other array slice for that matter) is 
 internally the equivalent of:

     struct Slice(T) {
         T* ptr;
         size_t length;
     }

 (maybe the order of fields is different, I never remember that 
 part)

 `&source` will give you the address of that structure.
Hmm, I'm still a bit confused.. So in "const char* sources = source.ptr;" sources is just turning the property ptr of source into a variable, and then in glShaderSource you're passing the memory address of sources (which is technically source.ptr) to the function? Do I have that right? If I do, then I think this all makes sense
Jun 03 2015
parent reply "anonymous" <anonymous example.com> writes:
On Wednesday, 3 June 2015 at 10:56:21 UTC, Kyoji Klyden wrote:
 So in "const char* sources = source.ptr;" sources is just 
 turning the property ptr of source into a variable,
yes
 and then in glShaderSource you're passing the memory address of 
 sources
yes
 (which is technically source.ptr)
No, the address of sources is not the same as `source.ptr`.
 to the function?

 Do I have that right? If I do, then I think this all makes sense
Jun 03 2015
parent reply "Kyoji Klyden" <kyojiklyden yahoo.com> writes:
Ooooh okay, I'm starting to get it. I think this last question 
should clear it up for me: When a string is made, how is the 
struct Slice handled? What does ptr get assigned?
Jun 03 2015
parent reply "anonymous" <anonymous example.com> writes:
On Wednesday, 3 June 2015 at 11:23:09 UTC, Kyoji Klyden wrote:
 Ooooh okay, I'm starting to get it. I think this last question 
 should clear it up for me: When a string is made, how is the 
 struct Slice handled? What does ptr get assigned?
ptr is a pointer to the first char, in other words the address of the first char.
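A short sketch of that statement; `firstCharAddress` is a made-up helper name.

```d
// Hypothetical helper: address of the first char, taken via indexing.
const(char)* firstCharAddress(string s)
{
    return &s[0];
}

void main()
{
    string s = "hello";
    assert(s.ptr == firstCharAddress(s));

    // Slicing creates a new pointer/length pair over the same memory.
    string tail = s[1 .. $];
    assert(tail.ptr == s.ptr + 1);
    assert(tail.length == 4);
}
```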
Jun 03 2015
parent "Kyoji Klyden" <kyojiklyden yahoo.com> writes:
On Wednesday, 3 June 2015 at 11:28:14 UTC, anonymous wrote:
 On Wednesday, 3 June 2015 at 11:23:09 UTC, Kyoji Klyden wrote:
 Ooooh okay, I'm starting to get it. I think this last question 
 should clear it up for me: When a string is made, how is the 
 struct Slice handled? What does ptr get assigned?
ptr is a pointer to the first char, in other words the address of the first char.
I think I get how it's all basically working now. I've realized that my confusion is all coming from not understanding how D handles arrays, so I'm going to go look into that for a while. Thanks for the help!
Jun 03 2015
prev sibling next sibling parent "anonymous" <anonymous example.com> writes:
On Wednesday, 3 June 2015 at 10:21:20 UTC, Kyoji Klyden wrote:
 On Wednesday, 3 June 2015 at 08:11:16 UTC, Kagamin wrote:
 On Tuesday, 2 June 2015 at 16:41:38 UTC, Kyoji Klyden wrote:
[...]
 	string source = readText("test.glvert");

 	const char* sources = source.ptr;
[...]
 	glShaderSource(vertShader, 1, &sources, &len);
[...]
 Also the one part I don't understand is with &sources. So is 
 this passing sources as a reference, but sources itself is a 
 pointer to a pointer? I'm just a tad confused on how this part 
 works :S
`&sources` is a pointer to `sources`. `sources` itself is a pointer to a char (leaving const-ness aside). So `&sources` is a pointer to a pointer to a char.
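The two levels of indirection can be checked in a few lines. This is only a sketch; `firstCharVia` is a hypothetical helper.

```d
// Hypothetical helper: follow both levels of indirection.
char firstCharVia(const(char)** pp)
{
    return **pp;
}

void main()
{
    string source = "abc";
    const(char)* sources = source.ptr; // copy of the pointer stored in the slice
    const(char)** pp = &sources;       // address of the local variable `sources`

    assert(*pp == source.ptr);         // one hop: the pointer to the chars
    assert(firstCharVia(pp) == 'a');   // two hops: the first char itself

    // The address of `sources` is not the address of the characters.
    assert(cast(const(void)*) pp != cast(const(void)*) source.ptr);
}
```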
Jun 03 2015
prev sibling parent reply "Kagamin" <spam here.lot> writes:
On Wednesday, 3 June 2015 at 10:21:20 UTC, Kyoji Klyden wrote:
 Also the one part I don't understand is with &sources. So is 
 this passing sources as a reference, but sources itself is a 
 pointer to a pointer? I'm just a tad confused on how this part 
 works :S
For some weird reason the function accepts an array of strings for shader source instead of one string. In C speak char* is a string, char** is an array of strings - that's passed to the function.
Jun 03 2015
parent reply "Kyoji Klyden" <kyojiklyden yahoo.com> writes:
On Wednesday, 3 June 2015 at 11:46:25 UTC, Kagamin wrote:
 On Wednesday, 3 June 2015 at 10:21:20 UTC, Kyoji Klyden wrote:
 Also the one part I don't understand is with &sources. So is 
 this passing sources as a reference, but sources itself is a 
 pointer to a pointer? I'm just a tad confused on how this part 
 works :S
For some weird reason the function accepts an array of strings for shader source instead of one string. In C speak char* is a string, char** is an array of strings - that's passed to the function.
That's what I found so confusing about the opengl docs. Just guessing here but char* is a pointer to the first char in the string, then what exactly is char**? Is it pointing to the first char of the first string in an array? Does C/D just scrub through memory until it finds the end of an array? Also what signifies an end of array, or any other keypoints? aaaaahh I have so many questions O_o
Jun 03 2015
next sibling parent "Kagamin" <spam here.lot> writes:
On Wednesday, 3 June 2015 at 11:59:56 UTC, Kyoji Klyden wrote:
 That's what I found so confusing about the opengl docs. Just 
 guessing here but char* is a pointer to the first char in the 
 string, then what exactly is char**? Is it pointing to the 
 first char of the first string in an array?
If you use a pointer for a string, you can have an array of such pointers as array of strings, then char** would point to the first pointer in that array.
 Does C/D just scrub through memory until it finds the end of an 
 array? Also what signifies an end of array, or any other 
 keypoints?
C has various conventions to indicate the length, this function uses three conventions simultaneously, so you can choose, which suits you the best.
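To illustrate, here is a stand-in with the same parameter shape as glShaderSource (minus the shader id). `totalLength` is hypothetical and only counts bytes; the rule it follows (a null `lengths` array or a negative entry means "null-terminated", anything else is an explicit length) is how the OpenGL reference page describes the parameter, so treat it as an assumption to verify there.

```d
import core.stdc.string : strlen;

// Hypothetical stand-in: totals the bytes such a function would read.
size_t totalLength(int count, const(char*)* strings, const(int)* lengths)
{
    size_t total = 0;
    foreach (i; 0 .. count)
    {
        if (lengths is null || lengths[i] < 0)
            total += strlen(strings[i]); // null-terminated convention
        else
            total += lengths[i];         // explicit-length convention
    }
    return total;
}

void main()
{
    string a = "abc";   // D string literals are followed by a '\0'
    string b = "defgh";
    const(char)*[2] ptrs = [a.ptr, b.ptr]; // the "array of strings"
    int[2] lens = [cast(int) a.length, 2]; // read only 2 chars of b

    assert(totalLength(2, ptrs.ptr, lens.ptr) == 5);
    assert(totalLength(2, ptrs.ptr, null) == 8);
}
```

The `count` argument is the third piece of the puzzle: it tells the callee how many pointers sit behind the `char**`.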
Jun 03 2015
prev sibling parent reply ketmar <ketmar ketmar.no-ip.org> writes:
On Wed, 03 Jun 2015 11:59:56 +0000, Kyoji Klyden wrote:

 That's what I found so confusing about the opengl docs. Just guessing
 here but char* is a pointer to the first char in the string, then what
 exactly is char**? Is it pointing to the first char of the first string
 in an array?
it's a pointer to array of pointers to first chars of strings. ;-)=
Jun 03 2015
parent reply "Kyoji Klyden" <kyojiklyden yahoo.com> writes:
On Thursday, 4 June 2015 at 03:25:24 UTC, ketmar wrote:
 On Wed, 03 Jun 2015 11:59:56 +0000, Kyoji Klyden wrote:

 That's what I found so confusing about the opengl docs. Just 
 guessing
 here but char* is a pointer to the first char in the string, 
 then what
 exactly is char**? Is it pointing to the first char of the 
 first string
 in an array?
it's a pointer to array of pointers to first chars of strings. ;-)
Ohh okay. So this is how the function is able to take multiple strings then.. How was I supposed to know it was an array though? Is it because it was a string type pointer? Also does D primarily use explicit length field strings? Thanks!
Jun 04 2015
parent reply "anonymous" <anonymous example.com> writes:
On Thursday, 4 June 2015 at 21:35:40 UTC, Kyoji Klyden wrote:
 On Thursday, 4 June 2015 at 03:25:24 UTC, ketmar wrote:
 On Wed, 03 Jun 2015 11:59:56 +0000, Kyoji Klyden wrote:
[...]
 what
 exactly is char**? Is it pointing to the first char of the 
 first string
 in an array?
it's a pointer to array of pointers to first chars of strings. ;-)
Ohh okay. So this is how the function is able to take multiple strings then.. How was I supposed to know it was an array though? Is it because it was a string type pointer?
Generally, a `char**` is a pointer to a pointer to a char. There may be more pointers to chars behind the pointed-to one. And there may be more chars behind the pointed-to ones. You can't know just from the type. You have to read the documentation of the involved functions for the specifics.
 Also does D primarily use explicit length field strings?
I'm not sure if I understand you right, but yes, D arrays carry their length. And D `string`s are arrays. You should encounter things like `char**` pretty much only when talking to C code.

By the way, there are subtly different meanings of "array" and "string" which I hope you're aware of, but just to be sure:

"array" can refer to D array types, i.e. a pointer-length pair, e.g. char[]. Or it can refer to the general concept of a contiguous sequence of elements in memory.

And as a special case, "string" can refer to D's `string` type, which is an alias for `immutable(char)[]`. Or it can refer to a contiguous sequence of characters in memory.

And when ketmar writes: "it's a pointer to array of pointers to first chars of strings", then "array" and "string" are meant in the generic way, not in the D-specific way.
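A sketch of the difference between the two meanings, using the real Phobos functions `toStringz` and `fromStringz`; the helper name `scannedLength` is made up.

```d
import std.string : fromStringz, toStringz;

// Hypothetical helper: recover a length from a bare C pointer.
size_t scannedLength(const(char)* c)
{
    return c.fromStringz.length; // walks memory until it meets '\0'
}

void main()
{
    string d = "shader";
    assert(d.length == 6);         // D string: the length travels with the pointer

    const(char)* c = d.toStringz;  // generic "string": just a pointer
    assert(scannedLength(c) == 6); // the length has to be rediscovered
    assert(c.fromStringz == d);
}
```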
Jun 04 2015
next sibling parent "Ali Çehreli" <acehreli yahoo.com> writes:
On 06/04/2015 03:28 PM, anonymous wrote:

 Generally, a `char**` is a pointer to a pointer to a char. There may be
 more pointers to chars behind the pointed-to one. And there may be more
 chars behind the pointed-to ones. You can't know just from the type.
Yep, "C's biggest mistake":

http://www.drdobbs.com/architecture-and-design/cs-biggest-mistake/228701625

Ali
Jun 04 2015
prev sibling parent reply "Kyoji Klyden" <kyojiklyden yahoo.com> writes:
On Thursday, 4 June 2015 at 22:28:50 UTC, anonymous wrote:
 Generally, a `char**` is a pointer to a pointer to a char. 
 There may be more pointers to chars behind the pointed-to one. 
 And there may be more chars behind the pointed-to ones. You 
 can't know just from the type. You have to read the 
 documentation of the involved functions for the specifics.
Alright, kinda a bummer I need to do some digging for each third-party function I use, but oh well. This probably comes from lack of experience but I can't really imagine ever writing something that is more than one or two pointers long.. like wouldn't that call for a redesign of whatever library was being written?

On Thursday, 4 June 2015 at 22:33:13 UTC, Ali Çehreli wrote:
 Yep, "C's biggest mistake":


 http://www.drdobbs.com/architecture-and-design/cs-biggest-mistake/228701625

 Ali
Thx for the link. I think I read this a couple years ago, but at the time had no idea what Walter was talking about. (probably because I was only using Python back when I read it)
 Also does D primarily use explicit length field strings?
I'm not sure if I understand you right, but yes, D arrays carry their length. And D `string`s are arrays. You should encounter things like `char**` pretty much only when talking to C code. By the way, there are subtly different meanings of "array" and "string" which I hope you're aware of, but just to be sure: "array" can refer to D array types, i.e. a pointer-length pair, e.g. char[]. Or it can refer to the general concept of a contiguous sequence of elements in memory. And as a special case, "string" can refer to D's `string` type, which is an alias for `immutable(char)[]`. Or it can refer to a contiguous sequence of characters in memory. And when ketmar writes: "it's a pointer to array of pointers to first chars of strings", then "array" and "string" are meant in the generic way, not in the D-specific way.
Yeah that's what I meant. I just got the phrasing for that out of a compiler book I have. I now see a lot of my confusion is coming from me poorly assuming C, D, and sometimes C++ are functioning/handling data the same way. Clearly there's more to it than simply ProgrammingLanguage->IR->MachineCode (obviously to horribly simplify it).

So how does D store arrays in memory then? I know you already explained this part, but.. Does the slice's pointer point to the slice's position in memory? Then if an array isn't sequential, is it at least a sequence of pointers to the slice structs (& those are just in whatever spot in memory they could get?) There's a slice for each array index..right? Or is it only for the first element?

Thanks!
Jun 05 2015
next sibling parent reply "Kagamin" <spam here.lot> writes:
On Friday, 5 June 2015 at 17:27:18 UTC, Kyoji Klyden wrote:
 Does the slice's pointer point to the slice's position in 
 memory? Then if an array isn't sequential, is it atleast a 
 sequence of pointers to the slice structs (& those are just in 
 whatever spot in memory they could get?)
BTW, try to write in assembler, it will give you perfect understanding of all things like memory layout, calling conventions, alignment etc :3
Jun 05 2015
parent reply "Kyoji Klyden" <kyojiklyden yahoo.com> writes:
On Friday, 5 June 2015 at 18:06:25 UTC, Kagamin wrote:
 On Friday, 5 June 2015 at 17:27:18 UTC, Kyoji Klyden wrote:
 Does the slice's pointer point to the slice's position in 
 memory? Then if an array isn't sequential, is it atleast a 
 sequence of pointers to the slice structs (& those are just in 
 whatever spot in memory they could get?)
BTW, try to write in assembler, it will give you perfect understanding of all things like memory layout, calling conventions, alignment etc :3
I did a tiny bit before actually, but I wanna go back to it once I have the time soooo bad (it's pretty high up on my to-do list). I think it's really fun :D
Jun 05 2015
parent reply "Kagamin" <spam here.lot> writes:
Well, reading assembler is good enough:

void f(int[] a)
{
   a[0]=0;
   a[1]=1;
   a[2]=2;
}

Here pointer is passed in rsi register and length - in rdi:

void f(int[]):
	push	rax
	test	rdi, rdi
	je	.LBB0_4
	mov	dword ptr [rsi], 0
	cmp	rdi, 1
	jbe	.LBB0_5
	mov	dword ptr [rsi + 4], 1
	cmp	rdi, 2
	jbe	.LBB0_6
	mov	dword ptr [rsi + 8], 2
	pop	rax
	ret
.LBB0_4:
	mov	edi, 55
	mov	esi, .L.str
	mov	edx, 5
	call	_d_arraybounds
.LBB0_5:
	mov	edi, 55
	mov	esi, .L.str
	mov	edx, 6
	call	_d_arraybounds
.LBB0_6:
	mov	edi, 55
	mov	esi, .L.str
	mov	edx, 7
	call	_d_arraybounds

You can play with the assembler generated for D code at 
http://ldc.acomirei.ru/
Jun 05 2015
parent reply "Kyoji Klyden" <kyojiklyden yahoo.com> writes:
On Friday, 5 June 2015 at 18:30:53 UTC, Kagamin wrote:
 Well, reading assembler is good enough:

 void f(int[] a)
 {
   a[0]=0;
   a[1]=1;
   a[2]=2;
 }

 Here pointer is passed in rsi register and length - in rdi:

 void f(int[]):
 	push	rax
 	test	rdi, rdi
 	je	.LBB0_4
 	mov	dword ptr [rsi], 0
 	cmp	rdi, 1
 	jbe	.LBB0_5
 	mov	dword ptr [rsi + 4], 1
 	cmp	rdi, 2
 	jbe	.LBB0_6
 	mov	dword ptr [rsi + 8], 2
 	pop	rax
 	ret
 .LBB0_4:
 	mov	edi, 55
 	mov	esi, .L.str
 	mov	edx, 5
 	call	_d_arraybounds
 .LBB0_5:
 	mov	edi, 55
 	mov	esi, .L.str
 	mov	edx, 6
 	call	_d_arraybounds
 .LBB0_6:
 	mov	edi, 55
 	mov	esi, .L.str
 	mov	edx, 7
 	call	_d_arraybounds

 You play with assembler generated for D code at 
 http://ldc.acomirei.ru/
Never said I was good at asm but I'll give it a shot... So push rax to the top of the memory stack, test if rdi == rdi since yes jump to.LBB0_4, in LBB0_4 move the value 55 into edi, then move .L.str (whatever that is) into esi, then 5 into edx, then call _d_arraybounds (something from Druntime maybe?) then LBB0_4 has nothing left so go back, move the value 0 into a 32-bit pointer(to rsi register), if rdi == 1 jump to LBB0_5 (pretty much the same as LBB0_4), then move 1 into the pointer (which points to rsi[+ 4 bytes cuz it's an int]), so on and so forth until we pop rax from the memory stack and return. How did I do? :P (hopefully at least B grade) I'm not really sure what .L.str or _d_arraybounds is, but I'm guessing it's the D runtime? Also in the mov parts, is that moving 1 into the pointer or into the rsi register? And is rsi + 4, still in rsi, or does it move to a different register?
Jun 05 2015
parent reply "Marc Schütz" <schuetzm gmx.net> writes:
On Friday, 5 June 2015 at 19:19:23 UTC, Kyoji Klyden wrote:
 On Friday, 5 June 2015 at 18:30:53 UTC, Kagamin wrote:
 Well, reading assembler is good enough:

 void f(int[] a)
 {
  a[0]=0;
  a[1]=1;
  a[2]=2;
 }

 Here pointer is passed in rsi register and length - in rdi:

 void f(int[]):
 	push	rax
 	test	rdi, rdi
 	je	.LBB0_4
 	mov	dword ptr [rsi], 0
 	cmp	rdi, 1
 	jbe	.LBB0_5
 	mov	dword ptr [rsi + 4], 1
 	cmp	rdi, 2
 	jbe	.LBB0_6
 	mov	dword ptr [rsi + 8], 2
 	pop	rax
 	ret
 .LBB0_4:
 	mov	edi, 55
 	mov	esi, .L.str
 	mov	edx, 5
 	call	_d_arraybounds
 .LBB0_5:
 	mov	edi, 55
 	mov	esi, .L.str
 	mov	edx, 6
 	call	_d_arraybounds
 .LBB0_6:
 	mov	edi, 55
 	mov	esi, .L.str
 	mov	edx, 7
 	call	_d_arraybounds

 You play with assembler generated for D code at 
 http://ldc.acomirei.ru/
Never said I was good at asm but I'll give it a shot... So push rax to the top of the memory stack, test if rdi == rdi since yes jump to.LBB0_4, in LBB0_4 move the value 55 into edi, then move .L.str (whatever that is) into esi, then 5 into edx, then call _d_arraybounds (something from Druntime maybe?) then LBB0_4 has nothing left so go back, move the value 0 into a 32-bit pointer(to rsi register), if rdi == 1 jump to LBB0_5 (pretty much the same as LBB0_4), then move 1 into the pointer (which points to rsi[+ 4 bytes cuz it's an int]), so on and so forth until we pop rax from the memory stack and return. How did I do? :P (hopefully at least B grade)
Almost correct :-) The part of "has nothing left, so go back" is wrong. The call to _d_arraybounds doesn't return, because it throws an Error.
 I'm not really sure what .L.str or _d_arraybounds is, but I'm 
 guessing it's the D runtime?
Yes, inside the `f` function, the compiler cannot know the length of the array during compilation. To keep you from accidentally accessing invalid memory (e.g. if the array has only two elements, but you're trying to access the third), it automatically inserts a check, and calls that runtime helper function to throw an Error if the check fails. .L.str is most likely the address of the error message or filename, and 55 is its length. The 5/6/7 values are the respective line numbers. You can disable this behaviour by compiling with `dmd -boundscheck=off`.
 Also in the mov parts, is that moving 1 into the pointer or 
 into the rsi register? And is rsi + 4, still in rsi, or does it 
 move to a different register?
It stores the `1` into the memory pointed to by `rsi`, or `rsi+4` etc. This is what the brackets [...] mean. Because it's an array of ints, and ints are 4 bytes in size, [rsi] is the first element, [rsi+4] the second, and [rsi+8] the third. `rsi+4` is just a temporary value that is only used during the store, it's not saved into a (named) register. This is a peculiarity of the x86 processors; they allow quite complex address calculations for memory accesses.
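The same offsets show up from the D side. A minimal sketch; `byteOffset` is a hypothetical helper.

```d
// Hypothetical helper: distance in bytes from the first element to element i.
size_t byteOffset(int[] a, size_t i)
{
    return cast(size_t)(a.ptr + i) - cast(size_t) a.ptr;
}

void main()
{
    int[] a = [10, 20, 30];
    assert(byteOffset(a, 1) == 4); // corresponds to [rsi + 4]
    assert(byteOffset(a, 2) == 8); // corresponds to [rsi + 8]

    // Storing through the raw pointer hits the same memory as a[2],
    // but skips the bounds check that indexing the slice would do.
    *(a.ptr + 2) = 99;
    assert(a[2] == 99);
}
```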
Jun 06 2015
parent reply "Kyoji Klyden" <kyojiklyden yahoo.com> writes:
On Saturday, 6 June 2015 at 10:12:54 UTC, Marc Schütz wrote:
 ...
Almost correct :-) The part of "has nothing left, so go back" is wrong. The call to _d_arraybounds doesn't return, because it throws an Error.
 ...
Yes, inside the `f` function, the compiler cannot know the length of the array during compilation. To keep you from accidentally accessing invalid memory (e.g. if the array has only two elements, but you're trying to access the third), it automatically inserts a check, and calls that runtime helper function to throw an Error if the check fails. .L.str is most likely the address of the error message or filename, and 55 is its length. The 5/6/7 values are the respective line numbers. You can disable this behaviour by compiling with `dmd -boundscheck=off`.
Thanks for the reply! so I'm a tad unsure of what exactly is happening in this asm, mainly because I'm only roughly familiar with x86 instruction set. _d_arraybounds throws an error because it can't access the runtime? or because as you said the compiler can't know the length of the array? for .L.str, 55 is the length of the address..?
 Also in the mov parts, is that moving 1 into the pointer or 
 into the rsi register? And is rsi + 4, still in rsi, or does 
 it move to a different register?
It stores the `1` into the memory pointed to by `rsi`, or `rsi+4` etc. This is what the brackets [...] mean. Because it's an array of ints, and ints are 4 bytes in size, [rsi] is the first element, [rsi+4] the second, and [rsi+8] the third. `rsi+4` is just a temporary value that is only used during the store, it's not saved into a (named) register. This is a peculiarity of the x86 processors; they allow quite complex address calculations for memory accesses.
Does the address just get calculated whenever the program runs this asm, then? :o
Jun 06 2015
parent reply "Marc Schütz" <schuetzm gmx.net> writes:
On Saturday, 6 June 2015 at 17:31:15 UTC, Kyoji Klyden wrote:
 On Saturday, 6 June 2015 at 10:12:54 UTC, Marc Schütz wrote:
 ...
Almost correct :-) The part of "has nothing left, so go back" is wrong. The call to _d_arraybounds doesn't return, because it throws an Error.
 ...
Yes, inside the `f` function, the compiler cannot know the length of the array during compilation. To keep you from accidentally accessing invalid memory (e.g. if the array has only two elements, but you're trying to access the third), it automatically inserts a check, and calls that runtime helper function to throw an Error if the check fails. .L.str is most likely the address of the error message or filename, and 55 is its length. The 5/6/7 values are the respective line numbers. You can disable this behaviour by compiling with `dmd -boundscheck=off`.
Thanks for the reply! So I'm a tad unsure of what exactly is happening in this asm, mainly because I'm only roughly familiar with the x86 instruction set. Does _d_arraybounds throw an error because it can't access the runtime? Or because, as you said, the compiler can't know the length of the array?
_d_arraybounds() always throws an error because that's its purpose. It's implemented here:
https://github.com/D-Programming-Language/druntime/blob/master/src/core/exception.d#L640

My point was that _d_arraybounds never returns; instead it throws that Error object.

The compiler inserts the checks for the array length whenever you access an array element, _except_ if it can either prove that the array is always long enough (e.g. if it's a fixed-size array), in which case it can leave the check out because it's unnecessary, or if it can prove that the array is never long enough, in which case it may already print an error during compilation.
 for .L.str, 55 is the length of the address..?
No, the length of the string.

It's roughly the equivalent of this pseudo-code:

    extern void _d_arraybounds(void* filename_ptr, size_t filename_len, size_t line);

    void f(void* a_ptr, size_t a_length) {
        if(a_length == 0)
            goto LBB0_4;
        *cast(int*) a_ptr = 0;      // line 5
        if(a_length <= 1)
            goto LBB0_5;
        *cast(int*) (a_ptr+4) = 1;  // line 6
        if(a_length <= 2)
            goto LBB0_6;
        *cast(int*) (a_ptr+8) = 1;  // line 7
        return;
    LBB0_4:
        // (pretend this filename is 55 chars long)
        static string __FILE__ = "/path/to/your/source/file.d";
        _d_arraybounds(__FILE__.ptr, __FILE__.length, 5 /* line number */);
    LBB0_5:
        _d_arraybounds(__FILE__.ptr, __FILE__.length, 6 /* line number */);
    LBB0_6:
        _d_arraybounds(__FILE__.ptr, __FILE__.length, 7 /* line number */);
    }
 Also in the mov parts, is that moving 1 into the pointer or 
 into the rsi register? And is rsi + 4, still in rsi, or does 
 it move to a different register?
It stores the `1` into the memory pointed to by `rsi`, or `rsi+4` etc. This is what the brackets [...] mean. Because it's an array of ints, and ints are 4 bytes in size, [rsi] is the first element, [rsi+4] the second, and [rsi+8] the third. `rsi+4` is just a temporary value that is only used during the store, it's not saved into a (named) register. This is a peculiarity of the x86 processors; they allow quite complex address calculations for memory accesses.
Does the address just get calculated whenever the program runs this asm, then? :o
Yes, but it is extremely fast. I'm pretty sure accessing memory at [RSI] and [RSI+4] both take exactly the same time (but can't find a reference now).
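
The bounds check described above can also be observed from D code rather than from the asm. The following is only a minimal sketch: `f` is a hypothetical stand-in for the function under discussion, `tripsBoundsCheck` is a name made up for this example, and it assumes a build with bounds checks enabled (so no `-boundscheck=off`). Catching an Error like this is done purely for demonstration; normal code should not do it.

```d
import core.exception : RangeError;

// Hypothetical stand-in for the `f` discussed above: three indexed stores,
// each of which gets a compiler-inserted length check.
void f(int[] a)
{
    a[0] = 0;
    a[1] = 1;
    a[2] = 1;
}

// Returns true if calling f on an array of length n trips a bounds check.
bool tripsBoundsCheck(size_t n)
{
    auto a = new int[](n);
    try
    {
        f(a);
        return false;
    }
    catch (RangeError e)
    {
        // Thrown by the runtime helper when an index is out of range.
        // Catching an Error is for demonstration only.
        return true;
    }
}

void main()
{
    import std.stdio : writeln;

    writeln(tripsBoundsCheck(2)); // array too short for a[2]
    writeln(tripsBoundsCheck(3)); // all three stores fit
}
```

With checks enabled, the first call should report a failed check and the second should not. The file name and line number from the asm end up in the thrown object's `file` and `line` fields.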
Jun 06 2015
parent reply "Kyoji Klyden" <kyojiklyden yahoo.com> writes:
On Saturday, 6 June 2015 at 18:43:08 UTC, Marc Schütz wrote:
 _d_arraybounds() always throws an error because that's its 
 purpose. It's implemented here:
 https://github.com/D-Programming-Language/druntime/blob/master/src/core/exception.d#L640

 My point was that _d_arraybounds never returns, instead it 
 throws that Error object.

 The compiler inserts the checks for the array length whenever 
 you access an array element, _except_ if it can either prove 
 that the array is always long enough (e.g. if its a fixed-size 
 array), in which case it can leave the check out because it's 
 unnecessary, or if it can prove that the array is never long 
 enough, in which case it may already print an error during 
 compilation.
Okay I think I roughly get it. Another thing that was on my to-do list just moved up in priority I think.. that thing is learning everything about the D runtime.
 No, the length of the string.

 It's roughly the equivalent of this pseudo-code:

 extern void _d_arraybounds(void* filename_ptr, size_t 
 filename_len, size_t line);

 void f(void* a_ptr, size_t a_length) {
     if(a_length == 0)
         goto LBB0_4;
     *cast(int*) a_ptr = 0;      // line 5
     if(a_length <= 1)
         goto LBB0_5;
     *cast(int*) (a_ptr+4) = 1;  // line 6
     if(a_length <= 2)
         goto LBB0_6;
     *cast(int*) (a_ptr+8) = 1;  // line 7
     return;
 LBB0_4:
     // (pretend this filename is 55 chars long)
     static string __FILE__ = "/path/to/your/source/file.d";
     _d_arraybounds(__FILE__.ptr, __FILE__.length, 5 /* line 
 number */);
 LBB0_5:
     _d_arraybounds(__FILE__.ptr, __FILE__.length, 6 /* line 
 number */);
 LBB0_6:
     _d_arraybounds(__FILE__.ptr, __FILE__.length, 7 /* line 
 number */);
 }
 ...
Yes, but it is extremely fast. I'm pretty sure accessing memory at [RSI] and [RSI+4] both take exactly the same time (but can't find a reference now).
Oh, I forgot that the path is part of the filename, so it now makes a bit more sense why the names might be so long. I'm also getting the logic here in a theoretical sense, but in a practical sense, not quite yet. That'll probably take doing experiments if anything. Do you perchance have any links to learning resources for the D runtime (aside from just the GitHub repository), and also maybe x86 architecture stuff? (I know Intel has some 1000+ page PDF on their site, but I think that's more for hardware and/or OS designers..) Thanks! :)
Jun 07 2015
parent reply "Kagamin" <spam here.lot> writes:
On Sunday, 7 June 2015 at 17:41:11 UTC, Kyoji Klyden wrote:
 Do you perchance have any links to learning resources for the D 
 runtime (aside from just the GitHub repository), and also maybe 
 x86 architecture stuff? (I know Intel has some 1000+ page PDF 
 on their site, but I think that's more for hardware and/or OS 
 designers..)
Well, an instruction reference is all you need: http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2008/10/24594_APM_v3.pdf
Jun 08 2015
parent "Kyoji Klyden" <kyojiklyden yahoo.com> writes:
On Monday, 8 June 2015 at 09:54:28 UTC, Kagamin wrote:
 On Sunday, 7 June 2015 at 17:41:11 UTC, Kyoji Klyden wrote:
 Do you perchance have any links to learning resources for the 
 D runtime (aside from just the GitHub repository), and also 
 maybe x86 architecture stuff? (I know Intel has some 1000+ 
 page PDF on their site, but I think that's more for hardware 
 and/or OS designers..)
Well, an instruction reference is all you need: http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2008/10/24594_APM_v3.pdf
Awesome, this is exactly what I was looking for! Thanks!
Jun 08 2015
prev sibling parent reply "anonymous" <anonymous example.com> writes:
On Friday, 5 June 2015 at 17:27:18 UTC, Kyoji Klyden wrote:
 On Thursday, 4 June 2015 at 22:28:50 UTC, anonymous wrote:
[...]
 By the way, there are subtly different meanings of "array" and 
 "string" which I hope you're aware of, but just to be sure:
 "array" can refer to D array types, i.e. a pointer-length 
 pair, e.g. char[]. Or it can refer to the general concept of a 
 contiguous sequence of elements in memory.
 And as a special case, "string" can refer to D's `string` 
 type, which is an alias for `immutable(char)[]`. Or it can 
 refer to a contiguous sequence of characters in memory.
 And when ketmar writes: "it's a pointer to array of pointers 
 to first chars of strings", then "array" and "string" are 
 meant in the generic way, not in the D-specific way.
[...]
 So how does D store arrays in memory then?

 I know you already explained this part, but..
 Does the slice's pointer point to the slice's position in 
 memory? Then if an array isn't sequential, is it at least a 
 sequence of pointers to the slice structs (& those are just in 
 whatever spot in memory they could get?)
 There's a slice for each array index..right? Or is it only for 
 the first element?
Oh boy, I think I might have put more confusion on top than taking away from it.

I didn't mean to say that D's arrays are not sequential. They are. Or more specifically, the elements are.

"Array" is often meant to mean just that: a contiguous sequence of elements in memory. This is the same in C and in D.

If you have a pointer to such a sequence, and you know the number of elements (or what element is last), then you can access them all.

In C, the pointer and the length are usually passed separately. But D groups them together and calls it a "dynamic array" or "slice": `ElementType[]`. And they're often simply called "arrays". This is done to confuse newbies, of course.

There's an article on slices (dynamic arrays): http://dlang.org/d-array-article.html

C doesn't know about D's slice structure, of course. So when talking to C, it's often necessary to shuffle things around.

For example, where a D function has a parameter `char[] arr`, a C version may have two parameters: `size_t length, char* pointer_to_first`. And if you want to call that C version with a D slice, you'd do `cfun(dslice.length, dslice.ptr)`.

It gets interesting with arrays of arrays. In D that would be `char[][] arr`. And in C it could be `size_t length, char** pointers_to_firsts, size_t* lengths`.

Now what to do? A `char[][]` refers to a sequence of D slices, but the C function expects two sequences: one of pointers, and one of lengths. The memory layout is incompatible. You'd have to split the `char[][]` up into two arrays: a `char*[] ptrs` and a `size_t[] lengths`, and then call the function as `cfun(ptrs.length, ptrs.ptr, lengths.ptr)`.

(Hope I'm not making things worse with this.)
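
The split into a pointer array and a length array could look roughly like the sketch below. Everything named here is made up for illustration: `joinAll` plays the role of the C function (it is written in D so the example runs on its own; a real C function would be declared `extern(C)` and linked in), and `callWithSlices` is the wrapper that does the shuffling.

```d
import std.algorithm : map;
import std.array : array;

// Hypothetical C-style consumer: gets a count, a sequence of pointers to
// first chars, and a parallel sequence of lengths. Written in D here only
// so the sketch is self-contained.
string joinAll(size_t count, char** ptrs, size_t* lengths)
{
    char[] result;
    foreach (i; 0 .. count)
        result ~= ptrs[i][0 .. lengths[i]]; // rebuild a slice from pointer + length
    return result.idup;
}

// Splits a sequence of D slices into the two parallel arrays and passes them on.
string callWithSlices(char[][] arr)
{
    char*[] ptrs = arr.map!(s => s.ptr).array;
    size_t[] lengths = arr.map!(s => s.length).array;
    return joinAll(ptrs.length, ptrs.ptr, lengths.ptr);
}

void main()
{
    import std.stdio : writeln;

    char[][] arr = ["ab".dup, "cde".dup];
    writeln(callWithSlices(arr));
}
```

Note that `ptrs` and `lengths` are new arrays; the character data itself is not copied, only the pointers and lengths are rearranged into the layout the callee expects.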
Jun 05 2015
next sibling parent reply "Kyoji Klyden" <kyojiklyden yahoo.com> writes:
On Friday, 5 June 2015 at 19:18:39 UTC, anonymous wrote:
 On Friday, 5 June 2015 at 17:27:18 UTC, Kyoji Klyden wrote:
 On Thursday, 4 June 2015 at 22:28:50 UTC, anonymous wrote:
[...]
 By the way, there are subtly different meanings of "array" 
 and "string" which I hope you're aware of, but just to be 
 sure:
 "array" can refer to D array types, i.e. a pointer-length 
 pair, e.g. char[]. Or it can refer to the general concept of 
 a contiguous sequence of elements in memory.
 And as a special case, "string" can refer to D's `string` 
 type, which is an alias for `immutable(char)[]`. Or it can 
 refer to a contiguous sequence of characters in memory.
 And when ketmar writes: "it's a pointer to array of pointers 
 to first chars of strings", then "array" and "string" are 
 meant in the generic way, not in the D-specific way.
[...]
 So how does D store arrays in memory then?

 I know you already explained this part, but..
 Does the slice's pointer point to the slice's position in 
 memory? Then if an array isn't sequential, is it at least a 
 sequence of pointers to the slice structs (& those are just in 
 whatever spot in memory they could get?)
 There's a slice for each array index..right? Or is it only for 
 the first element?
Oh boy, I think I might have put more confusion on top than taking away from it. I didn't mean to say that D's arrays are not sequential. They are. Or more specifically, the elements are. "Array" is often meant to mean just that: a contiguous sequence of elements in memory. This is the same in C and in D. If you have a pointer to such a sequence, and you know the number of elements (or what element is last), then you can access them all. In C, the pointer and the length are usually passed separately. But D groups them together and calls it a "dynamic array" or "slice": `ElementType[]`. And they're often simply called "arrays". This is done to confuse newbies, of course. There's an article on slices (dynamic arrays): http://dlang.org/d-array-article.html C doesn't know about D's slice structure, of course. So when talking to C, it's often necessary to shuffle things around. For example, where a D function has a parameter `char[] arr`, a C version may have two parameters: `size_t length, char* pointer_to_first`. And if you want to call that C version with a D slice, you'd do `cfun(dslice.length, dslice.ptr)`. It gets interesting with arrays of arrays. In D that would be `char[][] arr`. And in C it could be `size_t length, char** pointers_to_firsts, size_t* lengths`. Now what to do? A `char[][]` refers to a sequence of D slices, but the C function expects two sequences: one of pointers, and one of lengths. The memory layout is incompatible. You'd have to split the `char[][]` up into two arrays: a `char*[] ptrs` and a `size_t[] lengths`, and then call the function as `cfun(ptrs.length, ptrs.ptr, lengths.ptr)`. (Hope I'm not making things worse with this.)
Okay, so it's primarily an interfacing with C problem that started all this? (My brain is just completely scrambled at this point xP ) So pretty much the slice gives you the pointer to the start of the array in memory and also how many elements are in the array. Then depending on the array type it'll jump that many bytes for each element. (So 5 indexes in an int array would start at address 0xblahblah00, then go to 0xblahblah04, until it reaches 0xblahblah16(?) or something like that) If I FINALLY have it right, then that makes a lot of sense actually.
Jun 05 2015
parent reply "anonymous" <anonymous example.com> writes:
On Friday, 5 June 2015 at 19:30:58 UTC, Kyoji Klyden wrote:
 Okay, so it's primarily an interfacing with C problem that 
 started all this? (My brain is just completely scrambled at 
 this point xP )
Yeah, you wanted to call glShaderSource, which is a C function and as such it's not aware of D slices. So things get more complicated than they would be in D alone.
 So pretty much the slice gives you the pointer to the start of 
 the array in memory and also how many elements are in the array.
Yes.
 Then depending on the array type it'll jump that many bytes for 
 each element. (So 5 indexes in an int array, would start at 
 address 0xblahblah00 , then go to 0xblahblah04, until it 
 reaches 0xblahblah16(?) or something like that)
Yes.
 If I FINALLY have it right, then that makes a lot of sense 
 actually.
Sweet.
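
For the concrete numbers, here is a small sketch that measures the byte offset of each element from the start of the array. `byteOffset` is a helper made up for this example. For a five-element `int[]` the last element sits 16 bytes (decimal) past the first, which is 0x10 when written in hex, since D's `int` is always 4 bytes.

```d
// Byte distance from the start of the array's memory to element `index`,
// computed from real pointers. Indexing is bounds-checked as usual.
size_t byteOffset(T)(T[] arr, size_t index)
{
    return cast(size_t) (cast(ubyte*) &arr[index] - cast(ubyte*) arr.ptr);
}

void main()
{
    import std.stdio : writefln;

    auto ints = new int[](5);
    foreach (i; 0 .. ints.length)
        writefln("element %s at %s, offset %s bytes", i, &ints[i], byteOffset(ints, i));
}
```

The step between elements is `T.sizeof`, so the same helper on a `long[]` would show offsets growing by 8 instead of 4.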
Jun 05 2015
parent "Kyoji Klyden" <kyojiklyden yahoo.com> writes:
On Friday, 5 June 2015 at 19:41:03 UTC, anonymous wrote:
 On Friday, 5 June 2015 at 19:30:58 UTC, Kyoji Klyden wrote:
 Okay, so it's primarily an interfacing with C problem that 
 started all this? (My brain is just completely scrambled at 
 this point xP )
Yeah, you wanted to call glShaderSource, which is a C function and as such it's not aware of D slices. So things get more complicated than they would be in D alone.
 So pretty much the slice gives you the pointer to the start of 
 the array in memory and also how many elements are in the 
 array.
Yes.
 Then depending on the array type it'll jump that many bytes 
 for each element. (So 5 indexes in an int array, would start 
 at address 0xblahblah00 , then go to 0xblahblah04, until it 
 reaches 0xblahblah16(?) or something like that)
Yes.
 If I FINALLY have it right, then that makes a lot of sense 
 actually.
Sweet.
Awesome, thank you very much for all your help! (And of course everyone else who posted, too!)
Jun 05 2015
prev sibling parent "sigod" <sigod.mail gmail.com> writes:
On Friday, 5 June 2015 at 19:18:39 UTC, anonymous wrote:
 If you have a pointer to such a sequence, and you know the 
 number of elements (or what element is last), then you can 
 access them all.
I never really worked with C or C++, but I'm sure you also need to know element size.
Jun 06 2015