digitalmars.D - Random string samples & unicode
- bearophile <bearophileHUGS lycos.com> Sep 10 2010
- bearophile <bearophileHUGS lycos.com> Sep 11 2010
- bearophile <bearophileHUGS lycos.com> Sep 11 2010
- Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> Sep 11 2010
- bearophile <bearophileHUGS lycos.com> Sep 11 2010
- Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> Sep 11 2010
- bearophile <bearophileHUGS lycos.com> Sep 11 2010
- bearophile <bearophileHUGS lycos.com> Sep 11 2010
- bearophile <bearophileHUGS lycos.com> Sep 11 2010
- "Steven Schveighoffer" <schveiguy yahoo.com> Sep 13 2010
- Andrej Mitrovic <andrej.mitrovich gmail.com> Sep 11 2010
The need to take a random sample without replacement is very common. For
example, this is how in Python 2.x I create a fixed-size random string, sampled
without replacement from an input string of chars:
from random import sample
d = "0123456789"
print "".join(sample(d, 2))
This seems to be similar D2 code:
import std.stdio, std.random, std.array, std.range;
void main() {
dchar[] d = "0123456789"d.dup;
dchar[] res = array(take(randomCover(d, rndGen), 2));
writeln(res);
}
Here randomCover() doesn't work with a string, a dstring or a char[]; it only
accepts the dchar[] copy. And if you later need to process that res dchar[]
with std.string you will have trouble.
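One way around this can be sketched as follows. This is only a sketch written against current Phobos, not the 2010 release discussed here; sampleChars is a hypothetical helper name, not a library function. It decodes the string to dchar[] so that randomCover() gets a random-access range, then re-encodes the picked code points as a string, so std.string can be used on the result:

```d
import std.array : array;
import std.conv : to;
import std.random : randomCover;
import std.range : take;

// Hypothetical helper (not part of Phobos): returns up to n distinct
// code points of input, in random order, re-encoded as a UTF-8 string.
string sampleChars(string input, size_t n)
{
    // array() decodes a narrow string into dchar[], which is random-access.
    dchar[] decoded = input.array;
    // randomCover visits each element exactly once in random order;
    // take(n) stops after n of them (or earlier if the input is shorter).
    dchar[] picked = decoded.randomCover.take(n).array;
    // Transcode the picked code points back to UTF-8.
    return picked.to!string;
}
```

Calling sampleChars("0123456789", 2) gives a two-digit string with no repeated digit. The price is one decoded copy of the input.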
But randomShuffle() is able to shuffle a char[] in place:
import std.stdio, std.random;
void main() {
char[] d = "0123456789".dup;
randomShuffle(d);
writeln(d);
}
If randomCover() receives a char[] I think in theory it has to yield its
shuffled chars. And if it receives a string it has to yield its shuffled dchars
(converted from the chars). A string may contain UTF-8 chars that are longer
than 1 byte, but a char[] is not a string, and if you want its items in random
order, randomCover() has to act like randomShuffle().
My head hurts, and I don't know what the right thing to do is.
Maybe I have to work with ubyte[] instead of char[], and add casts:
import std.stdio, std.random, std.array, std.range;
void main() {
char[] d = "0123456789".dup;
char[] res = cast(char[])array(take(randomCover(cast(ubyte[])d, rndGen), 2));
writeln(res);
}
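The same idea can be sketched without explicit casts. This assumes current Phobos, where std.string offers representation() and assumeUTF() (as far as I know neither was available in 2010); sampleAscii is a hypothetical helper name. It is only safe when the buffer holds ASCII, because picking single bytes out of a multi-byte UTF-8 sequence would produce invalid text:

```d
import std.array : array;
import std.random : randomCover;
import std.range : take;
import std.string : assumeUTF, representation;

// Hypothetical helper (not part of Phobos): picks up to n distinct code
// units from the buffer, in random order. Assumes ASCII-only content.
char[] sampleAscii(char[] input, size_t n)
{
    // View the char[] as ubyte[] so it is treated as a plain random-access array.
    ubyte[] bytes = input.representation;
    // Visit the bytes once each in random order and keep the first n.
    ubyte[] picked = bytes.randomCover.take(n).array;
    // Reinterpret the picked bytes as UTF-8 text (valid here because of the ASCII assumption).
    return picked.assumeUTF;
}
```

For example, sampleAscii("0123456789".dup, 2) returns a char[] of two different digits.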
Ideas welcome.
Bye,
bearophile
Sep 10 2010
There randomCover() doesn't work with a string, a dstrings or with a char[]. If later you need to process that res dchar[] with std.string you will have troubles.
The problems are more widespread. This is a simple generator of terms of the "look and say" sequence (to generate a member of the sequence from the previous member, read off the digits of the previous member, counting the number of digits in groups of the same digit: http://en.wikipedia.org/wiki/Look_and_say_sequence ):
import std.stdio, std.conv, std.algorithm;
string lookAndSay(string input) {
    string result;
    foreach (g; group(input))
        result ~= to!string(g._1) ~ (cast(char)g._0);
    return result;
}
void main() {
    string last = "1";
    writeln(last);
    foreach (i; 0 .. 10) {
        last = lookAndSay(last);
        writeln(last);
    }
}
I was not able to remove that cast(char), even if I replace all strings in that program with dstrings.
Is someone else using D2?
Bye,
bearophile
Sep 11 2010
Andrej Mitrovic: I think this might be a compiler bug:
I'll add it to Bugzilla later. But even if you remove that bug, forcing me to use dstrings in the whole program is strange. Or maybe it's a good thing, and the natural state for D programs is to just use dstrings everywhere. Andrei may offer his opinion on the situation. Bye, bearophile
Sep 11 2010
On 9/11/10 10:24 CDT, bearophile wrote: Andrej Mitrovic: I think this might be a compiler bug:
I'll add it to Bugzilla later. But even if you remove that bug, forcing me to use dstrings in the whole program is strange. Or maybe it's a good thing, and the natural state for D programs is to just use dstrings everywhere. Andrei may offer his opinion on the situation. Bye, bearophile
This goes into "bearophile's odd posts coming now and then". Andrei
Sep 11 2010
Andrei Alexandrescu: This goes into "bearophile's odd posts coming now and then".
You aren't helping solve those problems. Bye, bearophile
Sep 11 2010
On 9/11/10 9:48 CDT, Andrej Mitrovic wrote:
I think this might be a compiler bug:
import std.conv : to;
void main() {
    string mystring;
    dchar mydchar;
    // ok, appending dchar to string
    mystring ~= mydchar;
    // error: incompatible types for
    // ((cast(uint)mydchar) ~ (cast(uint)mydchar)): 'uint' and 'uint'
    mystring ~= mydchar ~ mydchar;
}
You can't concatenate two integrals. Andrei
Sep 11 2010
Andrei Alexandrescu: You can't concatenate two integrals.
The compiler has full type information, so what's wrong in concatenating two char or two dchar into a string or dstring? And I think there are other problems: http://d.puremagic.com/issues/show_bug.cgi?id=4853 Bye, bearophile
Sep 11 2010
The compiler has full type information, so what's wrong in concatenating two char or two dchar into a string or dstring?
But in C the ~ among two chars has a different meaning, so in D you may at best disallow it.
And I think there are other problems: http://d.puremagic.com/issues/show_bug.cgi?id=4853
So that's invalid, I have closed it.
With a few contortions it's possible to write lookAndSay() with no casts, but the code is still not good:
import std.stdio, std.conv, std.algorithm;
string lookAndSay(string input) {
    string result;
    foreach (g; group(input)) {
        string s = to!string(g._1);
        s ~= g._0; // string ~ dchar wrong, string ~= dchar good
        result ~= s;
    }
    return result;
}
void main() {
    string last = "1";
    writeln(last);
    foreach (i; 0 .. 10) {
        last = lookAndSay(last);
        writeln(last);
    }
}
Bye,
bearophile
Sep 11 2010
foreach (g; group(input)) {
    string s = to!string(g._1);
    s ~= g._0; // string ~ dchar wrong, string ~= dchar good
    result ~= s;
}
Shorter:
foreach (g; group(input))
    result ~= text(g._1, g._0);
bearophile
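For reference, the text() version can be sketched as a complete function. This is written against current Phobos, where as far as I know tuple fields are reached by index (g[0], g[1]) rather than through the _0 and _1 names used in this thread; that difference is an assumption about the newer library, not something stated above:

```d
import std.algorithm : group;
import std.conv : text;

// Cast-free look-and-say step: "1211" becomes "111221".
string lookAndSay(string input)
{
    string result;
    // group yields (element, run length) tuples; for a string the
    // element is a decoded dchar and the run length is an integer.
    foreach (g; input.group)
        // text() formats the count and encodes the dchar, so neither
        // a cast nor a string ~ dchar concatenation is needed.
        result ~= text(g[1], g[0]);
    return result;
}
```

Starting from "1", repeated calls give "11", "21", "1211", "111221".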
Sep 11 2010
On Sat, 11 Sep 2010 13:20:25 -0400, bearophile <bearophileHUGS lycos.com> wrote: Andrei Alexandrescu: You can't concatenate two integrals.
The compiler has full type information, so what's wrong in concatenating two char or two dchar into a string or dstring?
It's ambiguous also:
string s1 = "abc", s2 = "def";
auto x = s1 ~ s2;
would you expect x to be "abcdef" or ["abc", "def"]?
Essentially, one of the arguments to concatenation must be an array type in order to avoid ambiguity. Fortunately, you can get the results you wish with the bracket notation:
auto x = [s1, s2];
-Steve
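The options for two dchar values can be sketched like this. The three function names are hypothetical, chosen only for illustration, and text() is just one way to build the string:

```d
import std.conv : text;

// Hypothetical helper: the bracket notation gives an array of the two operands.
dchar[] pairToArray(dchar a, dchar b)
{
    return [a, b];
}

// Hypothetical helper: a ~ b is rejected for two dchars, so let
// text() encode both code points into one UTF-8 string instead.
string pairToString(dchar a, dchar b)
{
    return text(a, b);
}

// Hypothetical helper: string ~= dchar is accepted, so appending one
// character at a time also works without casts.
string appendTwice(string s, dchar c)
{
    s ~= c;
    s ~= c;
    return s;
}
```

Here pairToArray('a', 'b') is a dchar[] equal to "ab"d, pairToString('a', 'b') is the string "ab", and appendTwice("x", 'y') is "xyy".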
Sep 13 2010
I think this might be a compiler bug:
import std.conv : to;
void main()
{
string mystring;
dchar mydchar;
// ok, appending dchar to string
mystring ~= mydchar;
// error: incompatible types for
// ((cast(uint)mydchar) ~ (cast(uint)mydchar)): 'uint' and 'uint'
mystring ~= mydchar ~ mydchar;
}
On Sat, Sep 11, 2010 at 3:42 PM, bearophile <bearophileHUGS lycos.com> wrote:
There randomCover() doesn't work with a string, a dstrings or with a char[]. If later you need to process that res dchar[] with std.string you will have troubles.
The problems are more widespread, this is a simple generator of terms of the "look and say" sequence (to generate a member of the sequence from the previous member, read off the digits of the previous member, counting the number of digits in groups of the same digit: http://en.wikipedia.org/wiki/Look_and_say_sequence ):
import std.stdio, std.conv, std.algorithm;
string lookAndSay(string input) {
    string result;
    foreach (g; group(input))
        result ~= to!string(g._1) ~ (cast(char)g._0);
    return result;
}
void main() {
    string last = "1";
    writeln(last);
    foreach (i; 0 .. 10) {
        last = lookAndSay(last);
        writeln(last);
    }
}
I was not able to remove that cast(char), even if I replace all strings in that program with dstrings.
Is someone else using D2?
Bye,
bearophile
Sep 11 2010








