digitalmars.D.learn - How to loop through characters of a string in D language?
- BoQsc (14/14) Dec 08 2021 Let's say I want to skip characters and build a new string.
- Biotronic (10/24) Dec 08 2021 import std.stdio : writeln;
- BoQsc (27/52) Dec 08 2021 I somehow have universal cross language hate for this kind of
- Adam D Ruppe (7/8) Dec 08 2021 foreach(ch; a) {
- BoQsc (21/29) Dec 08 2021 Thanks Adam.
- bauss (2/16) Dec 08 2021 string b = a.replace(";", "");
- BoQsc (13/35) Dec 08 2021 Thanks, that's what I used to do few years ago.
- forkit (8/44) Dec 08 2021 It's also worth noting the differences in compiler output, as
- Stanislav Blinov (4/11) Dec 08 2021 You're passing a literal. Try passing a runtime value (e.g. a
- kdevel (3/4) Dec 09 2021 👍
- Salih Dincer (18/32) Dec 08 2021 I always use split() and joiner pair. You can customize it as you
- Rumbu (6/12) Dec 09 2021 Since it seems there is a contest here:
- IGotD- (2/7) Dec 10 2021 Would that become two for loops or not?
- Rumbu (20/30) Dec 10 2021 I thought it's a beauty contest.
- forkit (4/5) Dec 10 2021 Well, if it's a beauty contest, then i got a beauty..
- Luís Ferreira (8/18) Dec 10 2021 Yes it will. You can use lazy templates instead, like splitter and
- Arjan (4/17) Dec 10 2021 ```d
- forkit (4/5) Dec 10 2021 I don't think we have enough ways of doing the same thing yet...
- Ola Fosheim Grøstad (14/21) Dec 11 2021 Using libraries can trigger hidden allocations.
- forkit (17/18) Dec 11 2021 ok. fine. no unnecessary, hidden allocations then.
- Ola Fosheim Grøstad (18/37) Dec 11 2021 ```putchar(…)``` is too slow!
- forkit (5/6) Dec 12 2021 On planet Mars maybe, but here on earth, my computer can do about
- bauss (2/11) Dec 12 2021 Can I borrow a couple of your ticks?
- Ola Fosheim Grøstad (18/37) Dec 11 2021 ```putchar(…)``` is too slow!
- Ola Fosheim Grøstad (4/5) Dec 11 2021 Shouldn't be there. Residual leftovers… (I don't want to confuse
- Stanislav Blinov (4/7) Dec 11 2021 A function with that name, and calling alloca to boot, cannot be
- Ola Fosheim Grøstad (4/11) Dec 11 2021 :-)
- russhy (181/181) Dec 11 2021 Here is mine
- Rumbu (5/10) Dec 11 2021 You know that this is already in phobos?
- russhy (4/19) Dec 11 2021 you need to import a 8k lines of code module that itself imports
- Ola Fosheim Grøstad (33/35) Dec 12 2021 I agree.
- Ola Fosheim Grøstad (12/17) Dec 12 2021 Bug, it fails if the string ends or starts with ';'.
- Ola Fosheim Grøstad (31/31) Dec 12 2021 Of course, since it is easy to mess up and use ranges in the
- Matheus (20/22) Dec 10 2021 My C way of thinking while using D:
- Stanislav Blinov (14/26) Dec 10 2021 Oooh, finally someone suggested to preallocate storage for all
- Rumbu (3/4) Dec 10 2021 http://lemire.me/blog/2017/01/20/how-quickly-can-you-remove-spaces-from-...
- Ola Fosheim Grøstad (18/20) Dec 10 2021 ```
- Stanislav Blinov (15/38) Dec 11 2021 That is about 500% not what I meant. At all. Original code in
- Ola Fosheim Grøstad (8/13) Dec 11 2021 You worry too much, just have fun with differing ways of
- Ola Fosheim Grøstad (19/23) Dec 11 2021 Scanning short strings twice is not all that expensive as they
- Ola Fosheim Grøstad (28/32) Dec 13 2021 Like this?
- Salih Dincer (45/71) Dec 22 2021 It seems faster than algorithms in Phobos. We would love to see
- Stanislav Blinov (10/21) Dec 23 2021 You're comparing apples and oranges. When benchmarking, at least
- Salih Dincer (6/9) Dec 23 2021 I looked now and you're right. Insomuch that it should be
- rumbu (23/31) Dec 23 2021 It seems because MallocReplace is cheating a lot:
Let's say I want to skip characters and build a new string.

The string example to loop/iterate:

```
import std.stdio;

void main()
{
    string a="abc;def;ab";
}
```

The character I want to skip: `;`

Expected result:
```
abcdefab
```
Dec 08 2021
On Wednesday, 8 December 2021 at 11:23:45 UTC, BoQsc wrote:
> Let's say I want to skip characters and build a new string.
>
> The string example to loop/iterate:
>
> ```
> import std.stdio;
>
> void main()
> {
>     string a="abc;def;ab";
> }
> ```
>
> The character I want to skip: `;`
>
> Expected result:
> ```
> abcdefab
> ```

import std.stdio : writeln;
import std.algorithm.iteration : filter;
import std.conv : to;

void main()
{
    string a = "abc;def;ab";
    string b = a.filter!(c => c != ';').to!string;
    writeln(b);
}
Dec 08 2021
On Wednesday, 8 December 2021 at 11:35:39 UTC, Biotronic wrote:
> On Wednesday, 8 December 2021 at 11:23:45 UTC, BoQsc wrote:
>> Let's say I want to skip characters and build a new string.
>>
>> The string example to loop/iterate:
>>
>> ```
>> import std.stdio;
>>
>> void main()
>> {
>>     string a="abc;def;ab";
>> }
>> ```
>>
>> The character I want to skip: `;`
>>
>> Expected result:
>> ```
>> abcdefab
>> ```
>
> [..]
>     string b = a.filter!(c => c != ';').to!string;
>     writeln(b);
> }

I somehow have universal cross language hate for this kind of algorithm. I'm not getting used to the syntax and that leads to poor readability. But that might be just me.

Anyways, here is what I've come up with.

```
import std.stdio;

void main()
{
    string a = "abc;def;ab";
    string b;

    for(int i=0; i<a.length; i++){
        write(i);
        writeln(a[i]);
        if (a[i] != ';'){
            b ~= a[i];
        }
    }

    writeln(b);
}
```
Dec 08 2021
On Wednesday, 8 December 2021 at 13:01:32 UTC, BoQsc wrote:
> [...]
> I'm not getting used to the syntax and that leads to poor readability.

It depends on what you expect when you read source code. I don't want to read how seats in the memory are assigned to bits and bytes. Instead I want to read what is done.

> But that might be just me.

Unfortunately not.

> Anyways, Here is what I've come up with.
>
> ```
> import std.stdio;
>
> void main()
> {
>     string a = "abc;def;ab";
>     string b;
>
>     for(int i=0; i<a.length; i++){
>         write(i);
>         writeln(a[i]);
>         if (a[i] != ';'){
>             b ~= a[i];
>         }
>     }
>
>     writeln(b);
> }
> ```

PRO:
- saves two lines of boilerplate code

CONS:
- raw loop
- postinc ++ is only permitted in ++C
- inconsistent spacing around "="
- mixing tabs and spaces for indentation
- arrow code
Dec 09 2021
On Thursday, 9 December 2021 at 18:00:42 UTC, kdevel wrote:
> PRO:
> - saves two lines of boilerplate code
>
> CONS:
> - raw loop
> - postinc ++ is only permitted in ++C
> - inconsistent spacing around "="
> - mixing tabs and spaces for indentation
> - arrow code

more PROs:
- You become less dependent on someone else's library.
- You learn how to do some things yourself.

;-)

Of course, I would prefer a less verbose, and safer version, which D enables, such as:

foreach(val; a)
{
    writeln(val);
    if (val != ';')
    {
        b ~= val;
    }
}
Dec 09 2021
On Wednesday, 8 December 2021 at 11:23:45 UTC, BoQsc wrote:
> The string example to loop/iterate:

foreach(ch; a) { }

does the individual chars of the string (the UTF-8 code units). You can also

foreach(dchar ch; a) { }

to decode the UTF-8 into whole code points.
Dec 08 2021
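To make the `char` versus `dchar` distinction from the post above concrete, here is a minimal sketch. The helper names `countCodeUnits`, `countCodePoints` and `without` are made up for this illustration and are not part of Phobos; the sample string is an assumption chosen because it contains one non-ASCII character.

```d
import std.stdio : writeln;

// Hypothetical helper: one loop iteration per UTF-8 code unit (byte).
size_t countCodeUnits(string s)
{
    size_t n;
    foreach (char c; s)   // no decoding takes place here
        ++n;
    return n;
}

// Hypothetical helper: one loop iteration per decoded code point.
size_t countCodePoints(string s)
{
    size_t n;
    foreach (dchar c; s)  // the compiler decodes UTF-8 on the fly
        ++n;
    return n;
}

// Hypothetical helper: drops every occurrence of `skip`.
string without(string s, dchar skip)
{
    string result;
    foreach (dchar c; s)
        if (c != skip)
            result ~= c;  // appending a dchar re-encodes it as UTF-8
    return result;
}

void main()
{
    string a = "gr\u00F8;stad";   // 'ø' takes two bytes in UTF-8
    writeln(countCodeUnits(a));   // counts bytes
    writeln(countCodePoints(a));  // counts characters
    writeln(a.without(';'));
}
```

For a pure ASCII input such as "abc;def;ab" both loops behave the same, so the choice only matters once the string or the skipped character can be non-ASCII.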
On Wednesday, 8 December 2021 at 12:49:39 UTC, Adam D Ruppe wrote:
> On Wednesday, 8 December 2021 at 11:23:45 UTC, BoQsc wrote:
>> The string example to loop/iterate:
>
> foreach(ch; a) { }
>
> does the individual chars of the string
>
> you can also
>
> foreach(dchar ch; a) { }
>
> to decode the utf 8

Thanks Adam.

This is how it would look implemented.

```
import std.stdio;

void main()
{
    string a = "abc;def;ab";
    string b;

    foreach(ch; a) {
        if (ch != ';'){
            b ~= ch;
        }
        writeln(ch);
    }

    writeln(b);
}
```
Dec 08 2021
On Wednesday, 8 December 2021 at 11:23:45 UTC, BoQsc wrote:
> Let's say I want to skip characters and build a new string.
>
> The string example to loop/iterate:
>
> ```
> import std.stdio;
>
> void main()
> {
>     string a="abc;def;ab";
> }
> ```
>
> The character I want to skip: `;`
>
> Expected result:
> ```
> abcdefab
> ```

string b = a.replace(";", "");
Dec 08 2021
On Wednesday, 8 December 2021 at 14:16:16 UTC, bauss wrote:
> On Wednesday, 8 December 2021 at 11:23:45 UTC, BoQsc wrote:
>> Let's say I want to skip characters and build a new string.
>>
>> The string example to loop/iterate:
>>
>> ```
>> import std.stdio;
>>
>> void main()
>> {
>>     string a="abc;def;ab";
>> }
>> ```
>>
>> The character I want to skip: `;`
>>
>> Expected result:
>> ```
>> abcdefab
>> ```
>
> string b = a.replace(";", "");

Thanks, that's what I used to do a few years ago. It's a great solution I forgot about and it works.

```
import std.stdio;
import std.array;

void main()
{
    string a="abc;def;ab";
    string b = a.replace(";", "");
    writeln(b);
}
```
Dec 08 2021
On Wednesday, 8 December 2021 at 14:27:22 UTC, BoQsc wrote:
> On Wednesday, 8 December 2021 at 14:16:16 UTC, bauss wrote:
>> On Wednesday, 8 December 2021 at 11:23:45 UTC, BoQsc wrote:
>>> Let's say I want to skip characters and build a new string.
>>>
>>> The string example to loop/iterate:
>>>
>>> ```
>>> import std.stdio;
>>>
>>> void main()
>>> {
>>>     string a="abc;def;ab";
>>> }
>>> ```
>>>
>>> The character I want to skip: `;`
>>>
>>> Expected result:
>>> ```
>>> abcdefab
>>> ```
>>
>> string b = a.replace(";", "");
>
> Thanks, that's what I used to do few years ago. It's a great solution I forget about and it works.
>
> ```
> import std.stdio;
> import std.array;
>
> void main()
> {
>     string a="abc;def;ab";
>     string b = a.replace(";", "");
>     writeln(b);
> }
> ```

It's also worth noting the differences in compiler output, as well as the time taken to compile, for these two approaches:

(1)
string str = "abc;def;ab".filter!(c => c != ';').to!string;

(2)
string str = "abc;def;ab".replace(";", "");

see: https://d.godbolt.org/z/3dWYsEGsr
Dec 08 2021
On Wednesday, 8 December 2021 at 22:18:23 UTC, forkit wrote:
> It's also worth noting the differences in compiler output, as well as the time taken to compile, these two approaches:
>
> (1)
> string str = "abc;def;ab".filter!(c => c != ';').to!string;
>
> (2)
> string str = "abc;def;ab".replace(";", "");
>
> see: https://d.godbolt.org/z/3dWYsEGsr

You're passing a literal. Try passing a runtime value (e.g. a command line argument). Also, -O2 -release :) Unless, of course, your goal is to look at debug code.
Dec 08 2021
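A minimal sketch of the suggestion above, for anyone who wants to repeat the comparison: the input comes from the command line, so the compiler cannot evaluate the call on a literal ahead of time. The function name `stripSemicolons` and the fallback literal are assumptions made for this example; `-O2 -release` are the LDC flags mentioned in the post.

```d
import std.array : replace;
import std.stdio : writeln;

// Hypothetical wrapper: keeping the call in its own function makes the
// input an ordinary runtime parameter.
string stripSemicolons(string s)
{
    return s.replace(";", "");
}

void main(string[] args)
{
    // args[1] is only known at run time; the literal is merely a
    // fallback so that the program also runs without arguments.
    string input = args.length > 1 ? args[1] : "abc;def;ab";
    writeln(stripSemicolons(input));
}
```

Built with something like `ldc2 -O2 -release` and run with an argument, this should give assembly that is closer to what a real program would execute than the debug build of a literal.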
On Wednesday, 8 December 2021 at 22:35:35 UTC, Stanislav Blinov wrote:
> You're passing a literal. Try passing a runtime value (e.g. a command line argument). Also, -O2 -release :) Uless, of course, your goal is to look at debug code.

but this will change nothing.

the compilation cost of using .replace, will always be apparent (compared to the presented alternative), both in less time taken to compile, and smaller size of executable.
Dec 08 2021
On Wednesday, 8 December 2021 at 22:55:02 UTC, forkit wrote:
> On Wednesday, 8 December 2021 at 22:35:35 UTC, Stanislav Blinov wrote:
>> You're passing a literal. Try passing a runtime value (e.g. a command line argument). Also, -O2 -release :) Uless, of course, your goal is to look at debug code.
>
> but this will change nothing.
>
> the compilation cost of using .replace, will always be apparent (compared to the presented alternative), both in less time taken to compile, and smaller size of executable.

well... maybe not that apparent after all ;-)

.. the mysteries of compiler optimisation ....
Dec 08 2021
On Wednesday, 8 December 2021 at 14:16:16 UTC, bauss wrote:
> [...]
> string b = a.replace(";", "");

👍
Dec 09 2021
On Wednesday, 8 December 2021 at 11:23:45 UTC, BoQsc wrote:
> Let's say I want to skip characters and build a new string.
>
> The string example to loop/iterate:
>
> ```
> import std.stdio;
>
> void main()
> {
>     string a="abc;def;ab";
> }
> ```
>
> The character I want to skip: `;`
>
> Expected result:
> ```
> abcdefab
> ```

I always use split() and joiner pair. You can customize it as you want:

```d
import std.stdio : writeln;
import std.algorithm : joiner;
import std.array : split;

bool isWhite(dchar c) @safe pure nothrow @nogc
{
    return c == ' ' || c == ';' ||
          (c >= 0x09 && c <= 0x0D);
}

void main()
{
    string str = "a\nb c\t;d e f;a b ";
    str.split!isWhite.joiner.writeln(); //abcdefab
}
```
Dec 08 2021
On Wednesday, 8 December 2021 at 11:23:45 UTC, BoQsc wrote:
> Let's say I want to skip characters and build a new string.
>
> The character I want to skip: `;`
>
> Expected result:
> ```
> abcdefab
> ```

Since it seems there is a contest here:

```d
"abc;def;ghi".split(';').join();
```

:)
Dec 09 2021
On Friday, 10 December 2021 at 06:24:27 UTC, Rumbu wrote:
> Since it seems there is a contest here:
>
> ```d
> "abc;def;ghi".split(';').join();
> ```
>
> :)

Would that become two for loops or not?
Dec 10 2021
On Friday, 10 December 2021 at 11:06:21 UTC, IGotD- wrote:
> On Friday, 10 December 2021 at 06:24:27 UTC, Rumbu wrote:
>> Since it seems there is a contest here:
>>
>> ```d
>> "abc;def;ghi".split(';').join();
>> ```
>>
>> :)
>
> Would that become two for loops or not?

I thought it's a beauty contest.

```d
string stripsemicolons(string s)
{
    string result;
    // prevent reallocations
    result.reserve(s.length);

    // append to string only when needed
    size_t i = 0;
    while (i < s.length)
    {
        size_t j = i;
        while (i < s.length && s[i] != ';')
            ++i;
        result ~= s[j..i];
        // step over the semicolon that ended this run
        if (i < s.length)
            ++i;
    }
    return result;
}
```
Dec 10 2021
On Friday, 10 December 2021 at 12:15:18 UTC, Rumbu wrote:
> I thought it's a beauty contest.

Well, if it's a beauty contest, then i got a beauty..

char[("abc;def;ab".length - count("abc;def;ab", ";"))] b = "abc;def;ab".replace(";", "");
Dec 10 2021
Yes it will. You can use lazy templates instead, like splitter and joiner, which split and join lazily, respectively. LDC can optimize those templates fairly well, avoid too many lazy calls, and construct logic that is pretty much equivalent to a for loop.

On 10 December 2021 11:06:21 WET, IGotD- via Digitalmars-d-learn <digitalmars-d-learn puremagic.com> wrote:
> On Friday, 10 December 2021 at 06:24:27 UTC, Rumbu wrote:
>> Since it seems there is a contest here:
>>
>> ```d
>> "abc;def;ghi".split(';').join();
>> ```
>>
>> :)
>
> Would that become two for loops or not?
Dec 10 2021
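As a rough illustration of the lazy variant described above: `splitter` and `joiner` from `std.algorithm.iteration` build no intermediate array of pieces, and memory is only allocated when the result is materialised with `to!string`. The wrapper name `stripLazy` is an assumption made for this sketch.

```d
import std.algorithm.iteration : joiner, splitter;
import std.conv : to;
import std.stdio : writeln;

// Hypothetical wrapper around the lazy pipeline. Only the final
// to!string allocates; splitter and joiner just walk the input.
string stripLazy(string s, char sep)
{
    return s.splitter(sep).joiner.to!string;
}

void main()
{
    // The lazy range can be printed directly, without building a string.
    writeln("abc;def;ab".splitter(';').joiner);

    // Or materialised when an actual string is needed.
    string b = stripLazy("abc;def;ab", ';');
    writeln(b);
}
```

Whether this ends up as a single loop in the generated code depends on the compiler and its optimisation settings, so it is worth checking the assembly rather than assuming it.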
On Friday, 10 December 2021 at 06:24:27 UTC, Rumbu wrote:
> On Wednesday, 8 December 2021 at 11:23:45 UTC, BoQsc wrote:
>> Let's say I want to skip characters and build a new string.
>>
>> The character I want to skip: `;`
>>
>> Expected result:
>> ```
>> abcdefab
>> ```
>
> Since it seems there is a contest here:
>
> ```d
> "abc;def;ghi".split(';').join();
> ```
>
> :)

```d
"abc;def;ghi".tr(";", "", "d" );
```
Dec 10 2021
On Friday, 10 December 2021 at 22:35:58 UTC, Arjan wrote:
> "abc;def;ghi".tr(";", "", "d" );

I don't think we have enough ways of doing the same thing yet...

so here's one more..

"abc;def;ghi".substitute(";", "");
Dec 10 2021
On Saturday, 11 December 2021 at 00:39:15 UTC, forkit wrote:
> On Friday, 10 December 2021 at 22:35:58 UTC, Arjan wrote:
>> "abc;def;ghi".tr(";", "", "d" );
>
> I don't think we have enough ways of doing the same thing yet...
>
> so here's one more..
>
> "abc;def;ghi".substitute(";", "");

Using libraries can trigger hidden allocations.

```
import std.stdio;

string garbagefountain(string s){
    if (s.length == 1) return s == ";" ? "" : s;
    return garbagefountain(s[0..$/2]) ~ garbagefountain(s[$/2..$]);
}

int main() {
    writeln(garbagefountain("abc;def;ab"));
    return 0;
}
```
Dec 11 2021
On Saturday, 11 December 2021 at 08:05:01 UTC, Ola Fosheim Grøstad wrote:
> Using libraries can trigger hidden allocations.

ok. fine. no unnecessary, hidden allocations then.

// ------------------

module test;

import core.stdc.stdio : putchar;

nothrow @nogc void main()
{
    string str = "abc;def;ab";

    ulong len = str.length;

    for (ulong i = 0; i < len; i++)
    {
        if (cast(int) str[i] != ';')
            putchar(cast(int) str[i]);
    }
}

// ------------------
Dec 11 2021
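If the stripped text is needed as a value rather than printed character by character, the same "no hidden allocations" idea can be sketched with a caller-supplied buffer. The function name `stripInto` and the 64-byte stack buffer are assumptions made for this example, not something taken from the thread or from Phobos.

```d
import std.stdio : writeln;

// Hypothetical helper: copies src into buf while skipping `skip` and
// returns the slice of buf that was actually filled. The @nogc
// attribute makes the compiler reject any GC allocation in here.
char[] stripInto(const(char)[] src, char skip, char[] buf)
    @safe pure nothrow @nogc
{
    assert(buf.length >= src.length, "buffer too small");
    size_t n;
    foreach (c; src)
        if (c != skip)
            buf[n++] = c;
    return buf[0 .. n];
}

void main()
{
    char[64] storage;   // lives on the stack, no GC involved
    writeln(stripInto("abc;def;ab", ';', storage[]));
}
```

The returned slice points into the caller's buffer, so it is only valid for as long as that buffer is.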
On Saturday, 11 December 2021 at 08:46:32 UTC, forkit wrote:
> On Saturday, 11 December 2021 at 08:05:01 UTC, Ola Fosheim Grøstad wrote:
>> Using libraries can trigger hidden allocations.
>
> ok. fine. no unnecessary, hidden allocations then.
>
> // ------------------
>
> module test;
>
> import core.stdc.stdio : putchar;
>
> nothrow @nogc void main()
> {
>     string str = "abc;def;ab";
>
>     ulong len = str.length;
>
>     for (ulong i = 0; i < len; i++)
>     {
>         if (cast(int) str[i] != ';')
>             putchar(cast(int) str[i]);
>     }
> }
>
> // ------------------

```putchar(…)``` is too slow!

```
@safe:

extern (C) long write(long, const void *, long);

void donttrythisathome(string s, char stripchar) @trusted {
    import core.stdc.stdlib;
    char* begin = cast(char*)alloca(s.length);
    char* end = begin;
    foreach(c; s) if (c != stripchar) *(end++) = c;
    write(0, begin, end - begin);
}

@system
void main() {
    string str = "abc;def;ab";
    donttrythisathome(str, ';');
}
```
Dec 11 2021
On Saturday, 11 December 2021 at 09:25:37 UTC, Ola Fosheim Grøstad wrote:
> ```putchar(…)``` is too slow!

On planet Mars maybe, but here on earth, my computer can do about 4 billion ticks per second, and my entire program (using putchar) takes only 3084 ticks.
Dec 12 2021
On Monday, 13 December 2021 at 05:46:06 UTC, forkit wrote:
> On Saturday, 11 December 2021 at 09:25:37 UTC, Ola Fosheim Grøstad wrote:
>> ```putchar(…)``` is too slow!
>
> On planet Mars maybe, but here on earth, my computer can do about 4 billion ticks per second, and my entire program (using putchar) takes only 3084 ticks.

Can I borrow a couple of your ticks?
Dec 12 2021
On Saturday, 11 December 2021 at 08:46:32 UTC, forkit wrote:
> On Saturday, 11 December 2021 at 08:05:01 UTC, Ola Fosheim Grøstad wrote:
>> Using libraries can trigger hidden allocations.
>
> ok. fine. no unnecessary, hidden allocations then.
>
> // ------------------
>
> module test;
>
> import core.stdc.stdio : putchar;
>
> nothrow @nogc void main()
> {
>     string str = "abc;def;ab";
>
>     ulong len = str.length;
>
>     for (ulong i = 0; i < len; i++)
>     {
>         if (cast(int) str[i] != ';')
>             putchar(cast(int) str[i]);
>     }
> }
>
> // ------------------

```putchar(…)``` is too slow!

```
@safe:

extern (C) long write(long, const void *, long);

void donttrythisathome(string s, char stripchar) @trusted {
    import core.stdc.stdlib;
    char* begin = cast(char*)alloca(s.length);
    char* end = begin;
    foreach(c; s) if (c != stripchar) *(end++) = c;
    write(0, begin, end - begin);
}

@system
void main() {
    string str = "abc;def;ab";
    donttrythisathome(str, ';');
}
```
Dec 11 2021
On Saturday, 11 December 2021 at 09:34:17 UTC, Ola Fosheim Grøstad wrote:
> @system

Shouldn't be there. Residual leftovers… (I don't want to confuse newbies!)
Dec 11 2021
On Saturday, 11 December 2021 at 09:34:17 UTC, Ola Fosheim Grøstad wrote:
> void donttrythisathome(string s, char stripchar) @trusted {
>     import core.stdc.stdlib;
>     char* begin = cast(char*)alloca(s.length);

A function with that name, and calling alloca to boot, cannot be trusted ;)
Dec 11 2021
On Saturday, 11 December 2021 at 09:40:47 UTC, Stanislav Blinov wrote:
> On Saturday, 11 December 2021 at 09:34:17 UTC, Ola Fosheim Grøstad wrote:
>> void donttrythisathome(string s, char stripchar) @trusted {
>>     import core.stdc.stdlib;
>>     char* begin = cast(char*)alloca(s.length);
>
> A function with that name, and calling alloca to boot, cannot be trusted ;)

:-)

But I am a very trustworthy person! PROMISE!!!
Dec 11 2021
Here is mine

- 0 allocations
- configurable
- lets you use it how you wish
- fast

```D
import std;

void main()
{
    string a = "abc;def;ab";
    writeln("a => ", a);
    foreach(item; split(a, ';'))
        writeln("\t", item);

    string b = "abc; def ;ab";
    writeln("a => ", b);
    foreach(item; split(b, ';', SplitOption.TRIM))
        writeln("\t", item);

    string c= "abc; ; ;def ;ab";
    writeln("a => ",c);
    foreach(item; split(c, ';', SplitOption.TRIM | SplitOption.REMOVE_EMPTY))
        writeln("\t", item);
}

SplitIterator!T split(T)(const(T)[] buffer, const(T) delimiter, SplitOption option = SplitOption.NONE)
{
    return SplitIterator!T(buffer, delimiter, option);
}

struct SplitIterator(T)
{
    const(T)[] buffer;
    const(T) delimiter;
    SplitOption option;
    int index = 0;

    int count()
    {
        int c = 0;
        foreach(line; this)
        {
            c++;
        }
        index = 0;
        return c;
    }

    const(T) get(int index)
    {
        return buffer[index];
    }

    int opApply(scope int delegate(const(T)[]) dg)
    {
        auto length = buffer.length;
        for (int i = 0; i < length; i++)
        {
            if (buffer[i] == '\0')
            {
                length = i;
                break;
            }
        }

        int result = 0;
        for (int i = index; i < length; i++)
        {
            int entry(int start, int end)
            {
                // trim only if we got something
                if ((end - start > 0) && (option & SplitOption.TRIM))
                {
                    for (int j = start; j < end; j++)
                        if (buffer[j] == ' ')
                            start += 1;
                        else
                            break;
                    for (int k = end; k >= start; k--)
                        if (buffer[k - 1] == ' ')
                            end -= 1;
                        else
                            break;

                    // nothing left
                    if(start >= end) return 0;
                }

                //printf("%i to %i :: %i :: total: %lu\n", start, end, index, buffer.length);
                return dg(buffer[start .. end]) != 0;
            }

            auto c = buffer[i];
            if (c == delimiter)
            {
                if (i == index && (option & SplitOption.REMOVE_EMPTY))
                {
                    // skip if we keep finding the delimiter
                    index = i + 1;
                    continue;
                }

                if ((result = entry(index, i)) != 0)
                    break;

                // skip delimiter for next result
                index = i + 1;
            }

            // handle what's left
            if ((i + 1) == length)
            {
                result = entry(index, i + 1);
            }
        }
        return result;
    }

    // copy from above, only replace if above has changed
    int opApply(scope int delegate(int, const(T)[]) dg)
    {
        auto length = buffer.length;
        for (int i = 0; i < length; i++)
        {
            if (buffer[i] == '\0')
            {
                length = i;
                break;
            }
        }

        int n = 0;
        int result = 0;
        for (int i = index; i < length; i++)
        {
            int entry(int start, int end)
            {
                // trim only if we got something
                if ((end - start > 0) && (option & SplitOption.TRIM))
                {
                    for (int j = start; j < end; j++)
                        if (buffer[j] == ' ')
                            start += 1;
                        else
                            break;
                    for (int k = end; k >= start; k--)
                        if (buffer[k - 1] == ' ')
                            end -= 1;
                        else
                            break;

                    // nothing left
                    if(start >= end) return 0;
                }

                //printf("%i to %i :: %i :: total: %lu\n", start, end, index, buffer.length);
                return dg(n++, buffer[start .. end]) != 0;
            }

            auto c = buffer[i];
            if (c == delimiter)
            {
                if (i == index && (option & SplitOption.REMOVE_EMPTY))
                {
                    // skip if we keep finding the delimiter
                    index = i + 1;
                    continue;
                }

                if ((result = entry(index, i)) != 0)
                    break;

                // skip delimiter for next result
                index = i + 1;
            }

            // handle what's left
            if ((i + 1) == length)
            {
                result = entry(index, i + 1);
            }
        }
        return result;
    }
}

enum SplitOption
{
    NONE = 0,
    REMOVE_EMPTY = 1,
    TRIM = 2
}
```
Dec 11 2021
On Saturday, 11 December 2021 at 14:42:53 UTC, russhy wrote:
> Here is mine
>
> - 0 allocations
> - configurable
> - let's you use it how you wish
> - fast

You know that this is already in phobos?

```
"abc;def;ghi".splitter(';').joiner
```
Dec 11 2021
On Saturday, 11 December 2021 at 18:51:12 UTC, Rumbu wrote:
> On Saturday, 11 December 2021 at 14:42:53 UTC, russhy wrote:
>> Here is mine
>>
>> - 0 allocations
>> - configurable
>> - let's you use it how you wish
>> - fast
>
> You know that this is already in phobos?
>
> ```
> "abc;def;ghi".splitter(';').joiner
> ```

you need to import an 8k-line module that itself imports other modules, and then the code is hard to read

https://github.com/dlang/phobos/blob/v2.098.0/std/algorithm/iteration.d#L2917
Dec 11 2021
On Saturday, 11 December 2021 at 19:50:55 UTC, russhy wrote:
> you need to import a 8k lines of code module that itself imports other modules, and then the code is hard to read

I agree.

```
@safe:

auto deatheater(char stripchar)(string str) {
    struct voldemort {
        immutable(char)* begin, end;
        bool empty(){ return begin == end; }
        char front(){ return *begin; }
        char back() @trusted { return *(end-1); }
        void popFront() @trusted {
            while(begin != end){begin++; if (*begin != stripchar) break; }
        }
        void popBack() @trusted {
            while(begin != end){end--; if (*(end-1) != stripchar) break; }
        }
        this(string s) @trusted {
            begin = s.ptr;
            end = s.ptr + s.length;
        }
    }
    return voldemort(str);
}

void main() {
    import std.stdio;
    string str = "abc;def;ab";
    foreach(c; deatheater!';'(str)) write(c);
    writeln();
    foreach_reverse(c; deatheater!';'(str)) write(c);
}
```
Dec 12 2021
On Sunday, 12 December 2021 at 08:58:29 UTC, Ola Fosheim Grøstad wrote:
>         this(string s) @trusted {
>             begin = s.ptr;
>             end = s.ptr + s.length;
>         }
>     }

Bug, it fails if the string ends or starts with ';'.

Fix:

```
        this(string s) @trusted {
            begin = s.ptr;
            end = s.ptr + s.length;
            while(begin!=end && *begin==stripchar) begin++;
            while(begin!=end && *(end-1)==stripchar) end--;
        }
```
Dec 12 2021
Of course, since it is easy to mess up and use ranges in the wrong way, you might want to add ```assert```s. That is most likely *helpful* to newbies that might want to use your kickass library function:

```
auto helpfuldeatheater(char stripchar)(string str) {
    struct voldemort {
        immutable(char)* begin, end;
        bool empty(){ return begin == end; }
        char front(){ assert(!empty); return *begin; }
        char back() @trusted { assert(!empty); return *(end-1); }
        void popFront() @trusted {
            assert(!empty);
            while(begin != end){begin++; if (*begin != stripchar) break; }
        }
        void popBack() @trusted {
            assert(!empty);
            while(begin != end){end--; if (*(end-1) != stripchar) break; }
        }
        this(string s) @trusted {
            begin = s.ptr;
            end = s.ptr + s.length;
            while(begin!=end && *begin==stripchar) begin++;
            while(begin!=end && *(end-1)==stripchar) end--;
        }
    }
    return voldemort(str);
}
```
Dec 12 2021
On Wednesday, 8 December 2021 at 11:23:45 UTC, BoQsc wrote:
> ...
> The character I want to skip: `;`

My C way of thinking while using D:

import std;

string stripsemicolons(string input){
    char[] s = input.dup;
    int j=0;
    for(int i=0;i<input.length;++i){
        if(s[i] == ';'){
            continue;
        }
        s[j++] = s[i];
    }
    s.length = j;
    return s.idup;
}

void main(){
    string s = ";testing;this;thing!;";
    writeln(s);
    writeln(s.stripsemicolons);
    return;
}

Matheus.
Dec 10 2021
On Friday, 10 December 2021 at 13:22:58 UTC, Matheus wrote:
> My C way of thinking while using D:
>
> import std;
>
> string stripsemicolons(string input){
>     char[] s = input.dup;
>     int j=0;
>     for(int i=0;i<input.length;++i){
>         if(s[i] == ';'){
>             continue;
>         }
>         s[j++] = s[i];
>     }
>     s.length = j;
>     return s.idup;
> }

Oooh, finally someone suggested to preallocate storage for all these reinventions of the wheel :D

Instead of the final idup, I would suggest checking the length and only duplicating if a certain waste threshold is exceeded, otherwise just doing https://dlang.org/phobos/std_exception.html#assumeUnique (or a cast to string). The result is unique either way.

Threshold could be relative for short strings and absolute for long ones. Makes little sense reallocating if you only waste a couple bytes, but makes perfect sense if you've just removed pages and pages of semicolons ;)

Be interesting to see if this thread does evolve into a SIMD search...
Dec 10 2021
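A minimal sketch of the threshold idea from the post above. The concrete numbers (an absolute limit of 4096 wasted bytes and a relative limit of one quarter of the input) are assumptions picked only for illustration, and `absoluteWaste` is a made-up name; `assumeUnique` is the Phobos function linked in the post.

```d
import std.exception : assumeUnique;
import std.stdio : writeln;

// Hypothetical threshold: shrink the buffer once this many bytes are wasted.
enum size_t absoluteWaste = 4096;

string stripsemicolons(string input)
{
    char[] s = input.dup;          // one mutable working copy
    size_t j = 0;
    foreach (c; input)
        if (c != ';')
            s[j++] = c;

    immutable wasted = s.length - j;
    s = s[0 .. j];

    // Reallocate only when a lot of space would stay unused: either an
    // absolute amount, or more than a quarter of the original length.
    if (wasted >= absoluteWaste || wasted * 4 > input.length)
        return s.idup;

    // Otherwise hand out the working copy itself. Nothing else refers
    // to it, so treating it as immutable is fine.
    return s.assumeUnique;
}

void main()
{
    writeln(stripsemicolons(";testing;this;thing!;"));
}
```

Where exactly the thresholds should sit depends on the allocator and the workload, so these values would need measuring before being relied on.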
On Friday, 10 December 2021 at 18:47:53 UTC, Stanislav Blinov wrote:
> Be interesting to see if this thread does evolve into a SIMD

http://lemire.me/blog/2017/01/20/how-quickly-can-you-remove-spaces-from-a-string/
Dec 10 2021
On Friday, 10 December 2021 at 18:47:53 UTC, Stanislav Blinov wrote:
> Oooh, finally someone suggested to preallocate storage for all these reinventions of the wheel :D

```
import std.stdio;

char[] dontdothis(string s, int i=0, int skip=0){
    if (s.length == i) return new char[](i - skip);
    if (s[i] == ';') return dontdothis(s, i+1, skip+1);
    auto r = dontdothis(s, i+1, skip);
    r[i-skip] = s[i];
    return r;
}

int main() {
    string s = "abc;def;ab";
    string s_new = cast(string)dontdothis(s);
    writeln(s_new);
    return 0;
}
```
Dec 10 2021
On Friday, 10 December 2021 at 23:53:47 UTC, Ola Fosheim Grøstad wrote:
> ```d
> char[] dontdothis(string s, int i=0, int skip=0){
>     if (s.length == i) return new char[](i - skip);
>     if (s[i] == ';') return dontdothis(s, i+1, skip+1);
>     auto r = dontdothis(s, i+1, skip);
>     r[i-skip] = s[i];
>     return r;
> }
> ```

That is about 500% not what I meant. At all. Original code in question:

- duplicates string unconditionally as mutable storage
- uses said mutable storage to gather all non-semicolons
- duplicates said mutable storage (again) as immutable

I suggested to make the second duplicate conditional, based on amount of space freed by skipping semicolons.

What you're showing is... indeed, don't do this, but I fail to see what that has to do with my suggestion, or the original code.

> Scanning short strings twice is not all that expensive as they will stay in the CPU cache when you run over them a second time.
>
> ```d
> import std.stdio;
>
> @safe:
>
> string stripsemicolons(string s) @trusted {
>     int i,n;
>     foreach(c; s) n += c != ';'; // premature optimization
>     auto r = new char[](n);
>     foreach(c; s) if (c != ';') r[i++] = c;
>     return cast(string)r;
> }
> ```

Again, that is a different algorithm than what I was responding to. But sure, short strings - might as well. So long as you do track the distinction somewhere up in the code and don't simply call this on all strings.
Dec 11 2021
On Saturday, 11 December 2021 at 09:26:06 UTC, Stanislav Blinov wrote:
> What you're showing is... indeed, don't do this, but I fail to see what that has to do with my suggestion, or the original code.

You worry too much, just have fun with differing ways of expressing the same thing.

(Recursion can be completely fine if the compiler supports it well. Tail recursion that is, not my example.)

> Again, that is a different algorithm than what I was responding to.

Slightly different, but same idea. Isn't the point of this thread to present N different ways of doing the same thing? :-)
Dec 11 2021
On Friday, 10 December 2021 at 18:47:53 UTC, Stanislav Blinov wrote:
> Threshold could be relative for short strings and absolute for long ones. Makes little sense reallocating if you only waste a couple bytes, but makes perfect sense if you've just removed pages and pages of semicolons ;)

Scanning short strings twice is not all that expensive as they will stay in the CPU cache when you run over them a second time.

```
import std.stdio;

@safe:

string stripsemicolons(string s) @trusted {
    int i,n;
    foreach(c; s) n += c != ';'; // premature optimization
    auto r = new char[](n);
    foreach(c; s) if (c != ';') r[i++] = c;
    return cast(string)r;
}

int main() {
    writeln(stripsemicolons("abc;def;ab"));
    return 0;
}
```
Dec 11 2021
On Friday, 10 December 2021 at 18:47:53 UTC, Stanislav Blinov wrote:
> Threshold could be relative for short strings and absolute for long ones. Makes little sense reallocating if you only waste a couple bytes, but makes perfect sense if you've just removed pages and pages of semicolons ;)

Like this?

```
@safe:

string prematureoptimizations(string s, char stripchar) @trusted {
    import core.memory;
    immutable uint flags = GC.BlkAttr.NO_SCAN|GC.BlkAttr.APPENDABLE;
    char* begin = cast(char*)GC.malloc(s.length+1, flags);
    char* end = begin + 1;
    foreach(c; s) {
        immutable size_t notsemicolon = c != stripchar;
        // hack: avoid conditional by writing semicolon outside buffer
        *(end - notsemicolon) = c;
        end += notsemicolon;
    }
    immutable size_t len = end - begin - 1;
    begin = cast(char*)GC.realloc(begin, len, flags);
    return cast(string)begin[0..len];
}

void main() {
    import std.stdio;
    string str = "abc;def;ab";
    writeln(prematureoptimizations(str, ';'));
}
```
Dec 13 2021
On Monday, 13 December 2021 at 09:36:57 UTC, Ola Fosheim Grøstad wrote:
> ```d
> @safe:
>
> string prematureoptimizations(string s, char stripchar) @trusted {
>     import core.memory;
>     immutable uint flags = GC.BlkAttr.NO_SCAN|GC.BlkAttr.APPENDABLE;
>     char* begin = cast(char*)GC.malloc(s.length+1, flags);
>     char* end = begin + 1;
>     foreach(c; s) {
>         immutable size_t notsemicolon = c != stripchar;
>         // hack: avoid conditional by writing semicolon outside buffer
>         *(end - notsemicolon) = c;
>         end += notsemicolon;
>     }
>     immutable size_t len = end - begin - 1;
>     begin = cast(char*)GC.realloc(begin, len, flags);
>     return cast(string)begin[0..len];
> }
>
> void main() {
>     import std.stdio;
>     string str = "abc;def;ab";
>     writeln(prematureoptimizations(str, ';'));
> }
> ```

It seems faster than algorithms in Phobos. We would love to see this in our new Phobos.

```d
enum str = "abc;def;gh";
enum res = "abcdefgh";

void main()
{
    void mallocReplace()
    {
        import core.memory;

        immutable uint flags =
            GC.BlkAttr.NO_SCAN|
            GC.BlkAttr.APPENDABLE;

        char* begin = cast(char*)GC.malloc(str.length+1, flags);
        char* end = begin + 1;

        foreach(c; str)
        {
            immutable size_t f = c != ';';
            *(end - f) = c;
            end += f;
        }
        immutable size_t len = end - begin - 1;
        begin = cast(char*)GC.realloc(begin, len, flags);

        assert(begin[0..len] == res);
    }

    void normalReplace()
    {
        import std.string;

        string result = str.replace(';',"");
        assert(result == res);
    }

    void delegate() t1 = &normalReplace;
    void delegate() t2 = &mallocReplace;

    import std.stdio : writefln;
    import std.datetime.stopwatch : benchmark;

    auto bm = benchmark!(t1, t2)(1_000_000);

    writefln("Replace: %s msecs", bm[0].total!"msecs");
    writefln("Malloc : %s msecs", bm[1].total!"msecs");
}/* Console Out:
Replace: 436 msecs
Malloc : 259 msecs
*/
```
Dec 22 2021
On Thursday, 23 December 2021 at 07:14:35 UTC, Salih Dincer wrote:
> It seems faster than algorithms in Phobos. We would love to see this in our new Phobos.
>
> ```d
>     void mallocReplace()
>     void normalReplace()
>         string result = str.replace(';',"");
> }/* Console Out:
> Replace: 436 msecs
> Malloc : 259 msecs
> */
> ```

You're comparing apples and oranges. When benchmarking, at least look at the generated assembly first.

replace is not in Phobos, it's a D runtime vestige. It's not getting inlined even in release builds with lto, whereas that manual version would. Also, benchmark with runtime strings, not literals, otherwise the compiler might even swallow the thing whole.

What you're benchmarking is, basically, inlined optimized search in a literal versus a function call.
Dec 23 2021
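For anyone who wants to redo the measurement along those lines, here is a rough sketch that feeds both candidates a runtime string and keeps each result alive in a variable. The names `stripLoop`, `viaReplace`, `viaLoop` and `sink` are made up for this example, the hand-written loop is a simplified stand-in rather than the GC.malloc version from the thread, and the iteration count is simply carried over from the earlier post. No timings are claimed here.

```d
import std.array : replace;
import std.datetime.stopwatch : benchmark;
import std.stdio : writefln;

// Hypothetical hand-written candidate. Unlike the nested mallocReplace
// above, it takes its input and the character to strip as parameters
// and returns a string.
string stripLoop(string s, char skip)
{
    auto r = new char[](s.length);
    size_t n;
    foreach (c; s)
        if (c != skip)
            r[n++] = c;
    return cast(string)(r[0 .. n]);
}

void main(string[] args)
{
    // A runtime value: the compiler cannot fold the work on it away.
    string input = args.length > 1 ? args[1] : "abc;def;gh";
    string sink;   // storing the result keeps the calls observable

    void viaReplace() { sink = input.replace(";", ""); }
    void viaLoop()    { sink = stripLoop(input, ';'); }

    auto bm = benchmark!(viaReplace, viaLoop)(1_000_000);

    writefln("replace: %s msecs", bm[0].total!"msecs");
    writefln("loop   : %s msecs", bm[1].total!"msecs");
}
```

Even with these changes, the numbers only say something about the given input length and compiler flags, and a look at the generated assembly is still the more reliable check.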
On Thursday, 23 December 2021 at 16:13:49 UTC, Stanislav Blinov wrote:
> You're comparing apples and oranges. When benchmarking, at least look at the generated assembly first.

I looked now and you're right. Insomuch that it should be eggplant not apple, banana not orange...:) Because it's an irrelevant benchmarking!

Thank you all...
Dec 23 2021
On Thursday, 23 December 2021 at 07:14:35 UTC, Salih Dincer wrote:
> It seems faster than algorithms in Phobos. We would love to see this in our new Phobos.
>
> Replace: 436 msecs
> Malloc : 259 msecs
> */

It seems because MallocReplace is cheating a lot:

- it is not called through another function like replace is called;
- accesses directly the constant str;
- assumes that it has a single character to replace;
- assumes that the character will be deleted not replaced with something;
- assumes that the character is always ';'
- assumes that the replacing string is not bigger than the replaced one, so it knows exactly how much space to allocate;
- does not have any parameter, at least on x86 this means that there is no arg pushes when it's called.
- does not return a string, just compares its result with another constant;

Since we already know all this stuff, we can go further :)

```d
string superFast()
{
    enum r = str.replace(";", "");
    return r;
}
```

Replace: 436 msecs
Malloc : 259 msecs
SuperFast: 0 msecs
Dec 23 2021