
digitalmars.D - Rune strings. Like in Go.

reply Alexey <invalid email.address> writes:
Can we have them in D? :)
Sep 30 2021
next sibling parent reply Paul Backus <snarwin gmail.com> writes:
On Thursday, 30 September 2021 at 20:44:54 UTC, Alexey wrote:
 Can we have them in D? :)
We have them already. :) What Go calls a "rune" is called a `dchar` in D, and a string of them is a `dchar[]`.
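A minimal sketch of the correspondence described above, assuming a recent Phobos. The helper name `toRunes` is hypothetical and only wraps `std.conv.to`; the string uses an escape so the result does not depend on how the source file is encoded.

```d
import std.conv : to;
import std.stdio : writeln;

// Hypothetical helper: decode a UTF-8 string into an array of code points.
dstring toRunes(string s)
{
    return s.to!dstring;
}

void main()
{
    string utf8 = "h\u00E9llo";      // U+00E9 takes two UTF-8 code units
    dstring runes = toRunes(utf8);   // one dchar per code point

    assert(dchar.sizeof == 4);       // a dchar is a 32-bit code unit
    assert(utf8.length == 6);        // length counts UTF-8 code units
    assert(runes.length == 5);       // length counts code points
    assert(runes[1] == '\u00E9');    // indexing yields a whole code point
    writeln(runes);
}
```

As in Go, the difference shows up mainly in `length` and indexing: a `string` counts and indexes UTF-8 code units, a `dstring` counts and indexes code points.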
Sep 30 2021
parent reply Alexey <invalid email.address> writes:
On Thursday, 30 September 2021 at 20:57:34 UTC, Paul Backus wrote:
 On Thursday, 30 September 2021 at 20:44:54 UTC, Alexey wrote:
 Can we have them in D? :)
We have them already. :) What Go calls a "rune" is called a `dchar` in D, and a string of them is a `dchar[]`.
Go's runes and D's dchars have different behavior. A Go rune string, converted to a byte array with `[]byte(string_var)` and saved to a file, results in UTF-8, while dchar is UTF-32. This means frequent conversions between string and dstring are required in D for comfortable work with Unicode.
Sep 30 2021
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Sep 30, 2021 at 10:47:25PM +0000, Alexey via Digitalmars-d wrote:
 On Thursday, 30 September 2021 at 20:57:34 UTC, Paul Backus wrote:
 On Thursday, 30 September 2021 at 20:44:54 UTC, Alexey wrote:
 Can we have them in D? :)
We have them already. :) What Go calls a "rune" is called a `dchar` in D, and a string of them is a `dchar[]`.
Go's runes and D's dchars have different behavior. A Go rune string, converted to a byte array with `[]byte(string_var)` and saved to a file, results in UTF-8, while dchar is UTF-32. This means frequent conversions between string and dstring are required in D for comfortable work with Unicode.
This is not true. D strings are autodecoded with Phobos range functions, i.e., if you iterate over a string with Phobos, you will get a stream of dchars without having to convert the encoding. (Ironically enough, autodecoding is regarded as a bad thing!) Also, IIRC, writing a stream of dchars to a string sink, e.g., appender!string, will automatically encode into UTF-8, so no explicit conversion is needed afterwards.

T

-- 
If you look at a thing nine hundred and ninety-nine times, you are perfectly safe; if you look at it the thousandth time, you are in frightful danger of seeing it for the first time. -- G. K. Chesterton
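The two points in this reply can be sketched as follows. This is an illustration, not code from the thread; `upperByCodePoint` is a hypothetical helper name, and the example assumes a recent Phobos.

```d
import std.array : appender;
import std.range.primitives : ElementType, front;
import std.stdio : writeln;
import std.uni : toUpper;

// Hypothetical helper: upper-case a UTF-8 string one code point at a time.
string upperByCodePoint(string s)
{
    auto sink = appender!string();
    foreach (dchar c; s)        // iterating as dchar decodes UTF-8 on the fly
        sink.put(toUpper(c));   // putting a dchar re-encodes it as UTF-8
    return sink.data;
}

void main()
{
    // Phobos range primitives treat a string as a range of dchar (autodecoding).
    static assert(is(ElementType!string == dchar));

    string s = "\u00E9t\u00E9";
    assert(s.front == '\u00E9');                     // a whole code point, not a byte
    assert(upperByCodePoint(s) == "\u00C9T\u00C9");  // result is still a UTF-8 string
    writeln(upperByCodePoint(s));
}
```

No `dstring` is ever materialized here: the input and output both stay UTF-8, and code points only exist one at a time inside the loop.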
Sep 30 2021
prev sibling parent jfondren <julian.fondren gmail.com> writes:
On Thursday, 30 September 2021 at 22:47:25 UTC, Alexey wrote:
 Go's rune string, being converted to byte array like so 
 `[]byte(string_var)` and saved to file - results in UTF-8, 
 while dchar is UTF-32.
You're explicitly asking for a byte cast and instead of a byte cast you get a reencoding? You might prefer that because it's familiar, but that's a really confusing thing to do. `std.string.representation` meanwhile turns a `dchar[]` into an `uint[]`. And to get UTF-8 in a file, just write the string:

```d
import std;

void write() {
    string noel = "no\u0308el";
    dstring dstr = noel.toUTF32;
    File("a", "w").writeln(noel);
    File("b", "w").writeln(dstr);
}

void main() {
    write;
    ubyte[] fromA = cast(ubyte[]) read("a");
    ubyte[] fromB = cast(ubyte[]) read("b");
    assert(fromA == fromB);
    writeln(fromA, "\n", fromB);
}
```

output:

```
[110, 111, 204, 136, 101, 108, 10]
[110, 111, 204, 136, 101, 108, 10]
```
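To make the distinction between reinterpreting and re-encoding concrete, here is a small sketch using `std.string.representation` and `std.utf.toUTF8`. The helper `utf8Bytes` is a hypothetical name introduced only for this illustration.

```d
import std.string : representation;
import std.utf : toUTF8, toUTF32;

// Hypothetical helper: the explicit two-step equivalent of Go's []byte(string(runes)).
immutable(ubyte)[] utf8Bytes(dstring s)
{
    return s.toUTF8.representation;
}

void main()
{
    string utf8 = "no\u0308el";
    dstring utf32 = utf8.toUTF32;

    // representation only reinterprets the code units; it never re-encodes.
    immutable(ubyte)[] expectedBytes = [110, 111, 204, 136, 101, 108];
    immutable(uint)[] expectedUnits = [110, 111, 0x308, 101, 108];
    assert(utf8.representation == expectedBytes);
    assert(utf32.representation == expectedUnits);

    // Re-encoding is a separate, explicit step.
    assert(utf32.toUTF8 == utf8);
    assert(utf8Bytes(utf32) == expectedBytes);
}
```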
Sep 30 2021
prev sibling next sibling parent Basile B. <b2.temp gmx.com> writes:
On Thursday, 30 September 2021 at 20:44:54 UTC, Alexey wrote:
 Can we have them in D? :)
Do you want a verbatim quote about runes from IRC?
Sep 30 2021
prev sibling parent Ali Çehreli <acehreli yahoo.com> writes:
On 9/30/21 1:44 PM, Alexey wrote:
 Can we have them in D? :)
Looks like Go "could" have them. ;) Ali
Sep 30 2021