digitalmars.D.learn - How to loop through characters of a string in D language?

reply BoQsc <vaidas.boqsc gmail.com> writes:
Let's say I want to skip characters and build a new string.

The string example to loop/iterate:

```
import std.stdio;

void main()
{
     string a="abc;def;ab";

}
```

The character I want to skip: `;`

Expected result:
```
abcdefab
```
Dec 08 2021
next sibling parent reply Biotronic <simen.kjaras gmail.com> writes:
On Wednesday, 8 December 2021 at 11:23:45 UTC, BoQsc wrote:
 Let's say I want to skip characters and build a new string.

 The string example to loop/iterate:

 ```
 import std.stdio;

 void main()
 {
     string a="abc;def;ab";

 }
 ```

 The character I want to skip: `;`

 Expected result:
 ```
 abcdefab
 ```
import std.stdio : writeln;
import std.algorithm.iteration : filter;
import std.conv : to;

void main()
{
    string a = "abc;def;ab";
    string b = a.filter!(c => c != ';').to!string;
    writeln(b);
}
Dec 08 2021
parent reply BoQsc <vaidas.boqsc gmail.com> writes:
On Wednesday, 8 December 2021 at 11:35:39 UTC, Biotronic wrote:
 On Wednesday, 8 December 2021 at 11:23:45 UTC, BoQsc wrote:
 Let's say I want to skip characters and build a new string.

 The string example to loop/iterate:

 ```
 import std.stdio;

 void main()
 {
     string a="abc;def;ab";

 }
 ```

 The character I want to skip: `;`

 Expected result:
 ```
 abcdefab
 ```
[..] string b = a.filter!(c => c != ';').to!string; writeln(b); }
I somehow have universal cross language hate for this kind of algorithm. I'm not getting used to the syntax and that leads to poor readability. But that might be just me.

Anyways, here is what I've come up with.

```
import std.stdio;

void main()
{
    string a = "abc;def;ab";
    string b;

    for(int i=0; i<a.length; i++){
        write(i);
        writeln(a[i]);
        if (a[i] != ';'){
            b ~= a[i];
        }
    }

    writeln(b);
}
```
Dec 08 2021
parent reply kdevel <kdevel vogtner.de> writes:
On Wednesday, 8 December 2021 at 13:01:32 UTC, BoQsc wrote:
[...]
 I'm not getting used to the syntax and that leads to poor 
 readability.
It depends on what you expect when you read source code. I don't want to read how seats in the memory are assigned to bits and bytes. Instead I want to read what is done.
 But that might be just me.
Unfortunately not.
 Anyways,
 Here is what I've come up with.

 ```
 import std.stdio;

 void main()
 {
     string a = "abc;def;ab";
 	string b;
 	
 	for(int i=0; i<a.length; i++){
 		write(i);
 		writeln(a[i]);
 		if (a[i] != ';'){
 			b ~= a[i];
 		}
 		
 	}
 	
     writeln(b);
 }
 ```
PRO:

- saves two lines of boilerplate code

CONS:

- raw loop
- postinc ++ is only permitted in ++C
- inconsistent spacing around "="
- mixing tabs and spaces for indentation
- arrow code
Dec 09 2021
parent forkit <forkit gmail.com> writes:
On Thursday, 9 December 2021 at 18:00:42 UTC, kdevel wrote:
 PRO:

 - saves two lines of boilerplate code

 CONS:

 - raw loop
 - postinc ++ is only permitted in ++C
 - inconsistent spacing around "="
 - mixing tabs and spaces for indentation
 - arrow code
more PROs:

- You become less dependent on someone else's library.
- You learn how to do some things yourself.

;-)

of course, I would prefer a less verbose, and safer version, which D enables, such as:

foreach(val; a)
{
    writeln(val);
    if (val != ';')
    {
        b ~= val;
    }
}
Dec 09 2021
prev sibling next sibling parent reply Adam D Ruppe <destructionator gmail.com> writes:
On Wednesday, 8 December 2021 at 11:23:45 UTC, BoQsc wrote:
 The string example to loop/iterate:
foreach(ch; a) { }

does the individual chars of the string. You can also

foreach(dchar ch; a) { }

to decode the UTF-8.
Dec 08 2021
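
To make the difference between the two loop forms concrete, here is a minimal sketch. The helper name `stripChar` and the sample text are made up for this example; they do not come from the thread. It assumes the source file is saved as UTF-8.

```d
import std.stdio : writeln;

// Hypothetical helper: copies s, leaving out every occurrence of skip.
string stripChar(string s, dchar skip)
{
    string result;
    foreach (dchar c; s)      // foreach decodes UTF-8 into whole code points
        if (c != skip)
            result ~= c;      // appending a dchar re-encodes it as UTF-8
    return result;
}

void main()
{
    string s = "añb;c";       // 'ñ' takes two UTF-8 code units

    size_t codeUnits, codePoints;
    foreach (char c; s)  ++codeUnits;   // one iteration per code unit
    foreach (dchar c; s) ++codePoints;  // one iteration per code point
    writeln(codeUnits, " code units, ", codePoints, " code points");

    writeln(stripChar(s, ';'));
}
```

For skipping an ASCII character such as `;` the plain `foreach(ch; a)` form is enough, since the bytes of a multi-byte UTF-8 sequence never equal an ASCII character. The `dchar` form matters once the character to skip is itself non-ASCII.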
parent BoQsc <vaidas.boqsc gmail.com> writes:
On Wednesday, 8 December 2021 at 12:49:39 UTC, Adam D Ruppe wrote:
 On Wednesday, 8 December 2021 at 11:23:45 UTC, BoQsc wrote:
 The string example to loop/iterate:
 foreach(ch; a) { }

 does the individual chars of the string. You can also

 foreach(dchar ch; a) { }

 to decode the UTF-8.
Thanks Adam.

This is how it would look implemented.

```
import std.stdio;

void main()
{
    string a = "abc;def;ab";
    string b;

    foreach(ch; a) {
        if (ch != ';'){
            b ~= ch;
        }
        writeln(ch);
    }

    writeln(b);
}
```
Dec 08 2021
prev sibling next sibling parent reply bauss <jj_1337 live.dk> writes:
On Wednesday, 8 December 2021 at 11:23:45 UTC, BoQsc wrote:
 Let's say I want to skip characters and build a new string.

 The string example to loop/iterate:

 ```
 import std.stdio;

 void main()
 {
     string a="abc;def;ab";

 }
 ```

 The character I want to skip: `;`

 Expected result:
 ```
 abcdefab
 ```
string b = a.replace(";", "");
Dec 08 2021
next sibling parent reply BoQsc <vaidas.boqsc gmail.com> writes:
On Wednesday, 8 December 2021 at 14:16:16 UTC, bauss wrote:
 On Wednesday, 8 December 2021 at 11:23:45 UTC, BoQsc wrote:
 Let's say I want to skip characters and build a new string.

 The string example to loop/iterate:

 ```
 import std.stdio;

 void main()
 {
     string a="abc;def;ab";

 }
 ```

 The character I want to skip: `;`

 Expected result:
 ```
 abcdefab
 ```
string b = a.replace(";", "");
Thanks, that's what I used to do a few years ago. It's a great solution I forgot about, and it works.

```
import std.stdio;
import std.array;

void main()
{
    string a="abc;def;ab";
    string b = a.replace(";", "");
    writeln(b);
}
```
Dec 08 2021
parent reply forkit <forkit gmail.com> writes:
On Wednesday, 8 December 2021 at 14:27:22 UTC, BoQsc wrote:
 On Wednesday, 8 December 2021 at 14:16:16 UTC, bauss wrote:
 On Wednesday, 8 December 2021 at 11:23:45 UTC, BoQsc wrote:
 Let's say I want to skip characters and build a new string.

 The string example to loop/iterate:

 ```
 import std.stdio;

 void main()
 {
     string a="abc;def;ab";

 }
 ```

 The character I want to skip: `;`

 Expected result:
 ```
 abcdefab
 ```
string b = a.replace(";", "");
 Thanks, that's what I used to do a few years ago. It's a great solution I forgot about, and it works.

 ```
 import std.stdio;
 import std.array;

 void main()
 {
     string a="abc;def;ab";
     string b = a.replace(";", "");
     writeln(b);
 }
 ```
It's also worth noting the differences in compiler output, as well as the time taken to compile, between these two approaches:

(1)
string str = "abc;def;ab".filter!(c => c != ';').to!string;

(2)
string str = "abc;def;ab".replace(";", "");

see: https://d.godbolt.org/z/3dWYsEGsr
Dec 08 2021
parent reply Stanislav Blinov <stanislav.blinov gmail.com> writes:
On Wednesday, 8 December 2021 at 22:18:23 UTC, forkit wrote:

 It's also worth noting the differences in compiler output, as 
 well as the time taken to compile, these two approaches:

 (1)
 string str = "abc;def;ab".filter!(c => c != ';').to!string;

 (2)
 string str = "abc;def;ab".replace(";", "");

 see: https://d.godbolt.org/z/3dWYsEGsr
You're passing a literal. Try passing a runtime value (e.g. a command line argument). Also, -O2 -release :) Unless, of course, your goal is to look at debug code.
Dec 08 2021
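
A minimal sketch of what such a comparison could look like with a runtime value. The wrapper names `viaFilter` and `viaReplace` and the fallback input are assumptions made for this example, not part of the godbolt snippet linked above.

```d
import std.algorithm.iteration : filter;
import std.array : replace;
import std.conv : to;
import std.stdio : writeln;

// Hypothetical wrappers around the two approaches being compared.
string viaFilter(string s)  { return s.filter!(c => c != ';').to!string; }
string viaReplace(string s) { return s.replace(";", ""); }

void main(string[] args)
{
    // Taking the input from the command line means the optimizer cannot
    // treat it as a known constant; the literal is only a fallback.
    string input = args.length > 1 ? args[1] : "abc;def;ab";
    writeln(viaFilter(input));
    writeln(viaReplace(input));
}
```

Built once without optimization and once with the flags mentioned above, the two functions can then be compared in the generated output.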
parent reply forkit <forkit gmail.com> writes:
On Wednesday, 8 December 2021 at 22:35:35 UTC, Stanislav Blinov 
wrote:
 You're passing a literal. Try passing a runtime value (e.g. a 
 command line argument). Also, -O2 -release :) Unless, of course, 
 your goal is to look at debug code.
but this will change nothing. the compilation cost of using .replace, will always be apparent (compared to the presented alternative), both in less time taken to compile, and smaller size of executable.
Dec 08 2021
parent forkit <forkit gmail.com> writes:
On Wednesday, 8 December 2021 at 22:55:02 UTC, forkit wrote:
 On Wednesday, 8 December 2021 at 22:35:35 UTC, Stanislav Blinov 
 wrote:
 You're passing a literal. Try passing a runtime value (e.g. a 
 command line argument). Also, -O2 -release :) Unless, of 
 course, your goal is to look at debug code.
but this will change nothing. the compilation cost of using .replace, will always be apparent (compared to the presented alternative), both in less time taken to compile, and smaller size of executable.
well... maybe not that apparent after all ;-) .. the mysteries of compiler optimisation ....
Dec 08 2021
prev sibling parent kdevel <kdevel vogtner.de> writes:
On Wednesday, 8 December 2021 at 14:16:16 UTC, bauss wrote:
[...]
 string b = a.replace(";", "");
👍
Dec 09 2021
prev sibling next sibling parent Salih Dincer <salihdb hotmail.com> writes:
On Wednesday, 8 December 2021 at 11:23:45 UTC, BoQsc wrote:
 Let's say I want to skip characters and build a new string.

 The string example to loop/iterate:

 ```
 import std.stdio;

 void main()
 {
     string a="abc;def;ab";

 }
 ```

 The character I want to skip: `;`

 Expected result:
 ```
 abcdefab
 ```
I always use the split() and joiner pair. You can customize it as you want:

```d
import std.stdio : writeln;
import std.algorithm : joiner;
import std.array : split;

bool isWhite(dchar c) @safe pure nothrow @nogc
{
    return c == ' ' || c == ';' ||
          (c >= 0x09 && c <= 0x0D);
}

void main()
{
    string str = "a\nb c\t;d e f;a b ";
    str.split!isWhite.joiner.writeln(); //abcdefab
}
```
Dec 08 2021
prev sibling next sibling parent reply Rumbu <rumbu rumbu.ro> writes:
On Wednesday, 8 December 2021 at 11:23:45 UTC, BoQsc wrote:
 Let's say I want to skip characters and build a new string.
 The character I want to skip: `;`

 Expected result:
 ```
 abcdefab
 ```
Since it seems there is a contest here:

```d
"abc;def;ghi".split(';').join();
```

:)
Dec 09 2021
next sibling parent reply IGotD- <nise nise.com> writes:
On Friday, 10 December 2021 at 06:24:27 UTC, Rumbu wrote:

 Since it seems there is a contest here:

 ```d
 "abc;def;ghi".split(';').join();
 ```

 :)
Would that become two for loops or not?
Dec 10 2021
next sibling parent reply Rumbu <rumbu rumbu.ro> writes:
On Friday, 10 December 2021 at 11:06:21 UTC, IGotD- wrote:
 On Friday, 10 December 2021 at 06:24:27 UTC, Rumbu wrote:

 Since it seems there is a contest here:

 ```d
 "abc;def;ghi".split(';').join();
 ```

 :)
Would that become two for loops or not?
I thought it's a beauty contest.

```d
string stripsemicolons(string s)
{
    string result;
    // reserve capacity up front to prevent reallocations
    result.reserve(s.length);
    // append to string only when needed
    size_t i = 0;
    while (i < s.length)
    {
        size_t j = i;
        while (i < s.length && s[i] != ';')
            ++i;
        result ~= s[j..i];
        ++i; // step over the ';' so the outer loop makes progress
    }
    return result;
}
```
Dec 10 2021
parent forkit <forkit gmail.com> writes:
On Friday, 10 December 2021 at 12:15:18 UTC, Rumbu wrote:
 I thought it's a beauty contest.
Well, if it's a beauty contest, then i got a beauty..

char[("abc;def;ab".length - count("abc;def;ab", ";"))] b = "abc;def;ab".replace(";", "");
Dec 10 2021
prev sibling parent Luís Ferreira <lsferreira riseup.net> writes:
Yes it will. You can use lazy templates instead, like splitter and joiner, which split and join lazily, respectively. LDC can optimize those templates fairly well, avoid too many lazy calls, and pretty much construct logic equivalent to a for loop.

On 10 December 2021 11:06:21 WET, IGotD- via Digitalmars-d-learn <digitalmars-d-learn puremagic.com> wrote:
 On Friday, 10 December 2021 at 06:24:27 UTC, Rumbu wrote:

  Since it seems there is a contest here:

  ```d
  "abc;def;ghi".split(';').join();
  ```

  :)
 Would that become two for loops or not?
Dec 10 2021
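
A minimal sketch of the lazy variant described above, using Phobos' `splitter` and `joiner`. The function name `stripSemicolons` is chosen only for this example.

```d
import std.algorithm.iteration : joiner, splitter;
import std.conv : to;
import std.stdio : writeln;

// Hypothetical wrapper, named for this example only.
string stripSemicolons(string s)
{
    // splitter and joiner only build lazy ranges over the original string;
    // the single allocation happens when to!string copies the result.
    return s.splitter(';').joiner.to!string;
}

void main()
{
    writeln(stripSemicolons("abc;def;ghi"));
}
```

In contrast, `split` builds an array of slices and `join` then builds the final string, so the eager pair walks the data twice. Whether the lazy version really compiles down to a single loop depends on the compiler and flags.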
prev sibling parent reply Arjan <arjan ask.me.to> writes:
On Friday, 10 December 2021 at 06:24:27 UTC, Rumbu wrote:
 On Wednesday, 8 December 2021 at 11:23:45 UTC, BoQsc wrote:
 Let's say I want to skip characters and build a new string.
 The character I want to skip: `;`

 Expected result:
 ```
 abcdefab
 ```
 Since it seems there is a contest here:

 ```d
 "abc;def;ghi".split(';').join();
 ```

 :)
```d
"abc;def;ghi".tr(";", "", "d" );
```
Dec 10 2021
parent reply forkit <forkit gmail.com> writes:
On Friday, 10 December 2021 at 22:35:58 UTC, Arjan wrote:
 "abc;def;ghi".tr(";", "", "d" );
I don't think we have enough ways of doing the same thing yet...

so here's one more..

"abc;def;ghi".substitute(";", "");
Dec 10 2021
parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Saturday, 11 December 2021 at 00:39:15 UTC, forkit wrote:
 On Friday, 10 December 2021 at 22:35:58 UTC, Arjan wrote:
 "abc;def;ghi".tr(";", "", "d" );
 I don't think we have enough ways of doing the same thing yet...

 so here's one more..

 "abc;def;ghi".substitute(";", "");
Using libraries can trigger hidden allocations.

```
import std.stdio;

string garbagefountain(string s){
    if (s.length == 1) return s == ";" ? "" : s;
    return garbagefountain(s[0..$/2]) ~ garbagefountain(s[$/2..$]);
}

int main() {
    writeln(garbagefountain("abc;def;ab"));
    return 0;
}
```
Dec 11 2021
parent reply forkit <forkit gmail.com> writes:
On Saturday, 11 December 2021 at 08:05:01 UTC, Ola Fosheim 
Grøstad wrote:
 Using libraries can trigger hidden allocations.
ok. fine. no unnecessary, hidden allocations then.

// ------------------

module test;

import core.stdc.stdio : putchar;

nothrow @nogc void main()
{
    string str = "abc;def;ab";

    ulong len = str.length;

    for (ulong i = 0; i < len; i++)
    {
        if (cast(int) str[i] != ';')
            putchar(cast(int) str[i]);
    }
}

// ------------------
Dec 11 2021
next sibling parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Saturday, 11 December 2021 at 08:46:32 UTC, forkit wrote:
 On Saturday, 11 December 2021 at 08:05:01 UTC, Ola Fosheim 
 Grøstad wrote:
 Using libraries can trigger hidden allocations.
 ok. fine. no unnecessary, hidden allocations then.

 // ------------------

 module test;

 import core.stdc.stdio : putchar;

 nothrow @nogc void main()
 {
     string str = "abc;def;ab";

     ulong len = str.length;

     for (ulong i = 0; i < len; i++)
     {
         if (cast(int) str[i] != ';')
             putchar(cast(int) str[i]);
     }
 }

 // ------------------
```putchar(…)``` is too slow!

```
@safe:

extern (C) long write(long, const void *, long);

void donttrythisathome(string s, char stripchar) @trusted {
    import core.stdc.stdlib;
    char* begin = cast(char*)alloca(s.length);
    char* end = begin;
    foreach(c; s) if (c != stripchar) *(end++) = c;
    write(1, begin, end - begin); // file descriptor 1 is stdout
}

@system
void main() {
    string str = "abc;def;ab";
    donttrythisathome(str, ';');
}
```
Dec 11 2021
parent reply forkit <forkit gmail.com> writes:
On Saturday, 11 December 2021 at 09:25:37 UTC, Ola Fosheim 
Grøstad wrote:
 ```putchar(…)``` is too slow!
On planet Mars maybe, but here on earth, my computer can do about 4 billion ticks per second, and my entire program (using putchar) takes only 3084 ticks.
Dec 12 2021
parent bauss <jj_1337 live.dk> writes:
On Monday, 13 December 2021 at 05:46:06 UTC, forkit wrote:
 On Saturday, 11 December 2021 at 09:25:37 UTC, Ola Fosheim 
 Grøstad wrote:
 ```putchar(…)``` is too slow!
On planet Mars maybe, but here on earth, my computer can do about 4 billion ticks per second, and my entire program (using putchar) takes only 3084 ticks.
Can I borrow a couple of your ticks?
Dec 12 2021
prev sibling parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Saturday, 11 December 2021 at 08:46:32 UTC, forkit wrote:
 On Saturday, 11 December 2021 at 08:05:01 UTC, Ola Fosheim 
 Grøstad wrote:
 Using libraries can trigger hidden allocations.
 ok. fine. no unnecessary, hidden allocations then.

 // ------------------

 module test;

 import core.stdc.stdio : putchar;

 nothrow @nogc void main()
 {
     string str = "abc;def;ab";

     ulong len = str.length;

     for (ulong i = 0; i < len; i++)
     {
         if (cast(int) str[i] != ';')
             putchar(cast(int) str[i]);
     }
 }

 // ------------------
```putchar(…)``` is too slow!

```
@safe:

extern (C) long write(long, const void *, long);

void donttrythisathome(string s, char stripchar) @trusted {
    import core.stdc.stdlib;
    char* begin = cast(char*)alloca(s.length);
    char* end = begin;
    foreach(c; s) if (c != stripchar) *(end++) = c;
    write(1, begin, end - begin); // file descriptor 1 is stdout
}

@system
void main() {
    string str = "abc;def;ab";
    donttrythisathome(str, ';');
}
```
Dec 11 2021
next sibling parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Saturday, 11 December 2021 at 09:34:17 UTC, Ola Fosheim 
Grøstad wrote:
  @system
Shouldn't be there. Residual leftovers… (I don't want to confuse newbies!)
Dec 11 2021
prev sibling parent reply Stanislav Blinov <stanislav.blinov gmail.com> writes:
On Saturday, 11 December 2021 at 09:34:17 UTC, Ola Fosheim 
Grøstad wrote:

 void donttrythisathome(string s, char stripchar) @trusted {
 	import core.stdc.stdlib;
     char* begin = cast(char*)alloca(s.length);
A function with that name, and calling alloca to boot, cannot be trusted ;)
Dec 11 2021
parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Saturday, 11 December 2021 at 09:40:47 UTC, Stanislav Blinov 
wrote:
 On Saturday, 11 December 2021 at 09:34:17 UTC, Ola Fosheim 
 Grøstad wrote:

 void donttrythisathome(string s, char stripchar) @trusted {
 	import core.stdc.stdlib;
     char* begin = cast(char*)alloca(s.length);
A function with that name, and calling alloca to boot, cannot be trusted ;)
:-) But I am very trustworthy person! PROMISE!!!
Dec 11 2021
parent reply russhy <russhy_s gmail.com> writes:
Here is mine

- 0 allocations

- configurable

- lets you use it how you wish

- fast


```D
import std;
void main()
{
     string a = "abc;def;ab";
     writeln("a => ", a);

     foreach(item; split(a, ';'))
         writeln("\t", item);


     string b = "abc;    def   ;ab";
     writeln("a => ", b);

     foreach(item; split(b, ';', SplitOption.TRIM))
         writeln("\t", item);


     string c= "abc;    ;       ;def   ;ab";
     writeln("a => ",c);

     foreach(item; split(c, ';', SplitOption.TRIM | 
SplitOption.REMOVE_EMPTY))
         writeln("\t", item);
}

SplitIterator!T split(T)(const(T)[] buffer, const(T) delimiter, 
SplitOption option = SplitOption.NONE)
{
     return SplitIterator!T(buffer, delimiter, option);
}

struct SplitIterator(T)
{
     const(T)[] buffer;
     const(T) delimiter;
     SplitOption option;
     int index = 0;

	int count()
	{
		int c = 0;
		foreach(line; this)
		{
			c++;
		}
		index = 0;
		return c;
	}

	const(T) get(int index)
	{
		return buffer[index];
	}
	
     int opApply(scope int delegate(const(T)[]) dg)
     {
         auto length = buffer.length;
         for (int i = 0; i < length; i++)
         {
             if (buffer[i] == '\0')
             {
                 length = i;
                 break;
             }
         }

         int result = 0;
         for (int i = index; i < length; i++)
         {
             int entry(int start, int end)
             {
                 // trim only if we got something
                 if ((end - start > 0) && (option & 
SplitOption.TRIM))
                 {
                     for (int j = start; j < end; j++)
                         if (buffer[j] == ' ')
                             start += 1;
                         else
                             break;
                     for (int k = end; k >= start; k--)
                         if (buffer[k - 1] == ' ')
                             end -= 1;
                         else
                             break;
					
					// nothing left
					if(start >= end) return 0;
                 }

                //printf("%i to %i :: %i :: total: %lu\n", start, end, index, buffer.length);
                 return dg(buffer[start .. end]) != 0;
             }

             auto c = buffer[i];
             if (c == delimiter)
             {
                 if (i == index && (option & 
SplitOption.REMOVE_EMPTY))
                 {
                     // skip if we keep finding the delimiter
                     index = i + 1;
                     continue;
                 }

                 if ((result = entry(index, i)) != 0)
                     break;

                 // skip delimiter for next result
                 index = i + 1;
             }

             // handle what's left
             if ((i + 1) == length)
             {
                 result = entry(index, i + 1);
             }
         }
         return result;
     }

	// copy from above, only replace if above has changed
     int opApply(scope int delegate(int, const(T)[]) dg)
     {
         auto length = buffer.length;
         for (int i = 0; i < length; i++)
         {
             if (buffer[i] == '\0')
             {
                 length = i;
                 break;
             }
         }

		int n = 0;
         int result = 0;
         for (int i = index; i < length; i++)
         {
             int entry(int start, int end)
             {
                 // trim only if we got something
                 if ((end - start > 0) && (option & 
SplitOption.TRIM))
                 {
                     for (int j = start; j < end; j++)
                         if (buffer[j] == ' ')
                             start += 1;
                         else
                             break;
                     for (int k = end; k >= start; k--)
                         if (buffer[k - 1] == ' ')
                             end -= 1;
                         else
                             break;
					
					// nothing left
					if(start >= end) return 0;
                 }

                //printf("%i to %i :: %i :: total: %lu\n", start, end, index, buffer.length);
                 return dg(n++, buffer[start .. end]) != 0;
             }

             auto c = buffer[i];
             if (c == delimiter)
             {
                 if (i == index && (option & 
SplitOption.REMOVE_EMPTY))
                 {
                     // skip if we keep finding the delimiter
                     index = i + 1;
                     continue;
                 }

                 if ((result = entry(index, i)) != 0)
                     break;

                 // skip delimiter for next result
                 index = i + 1;
             }

             // handle what's left
             if ((i + 1) == length)
             {
                 result = entry(index, i + 1);
             }
         }
         return result;
     }
}


enum SplitOption
{
     NONE = 0,
     REMOVE_EMPTY = 1,
     TRIM = 2
}

```
Dec 11 2021
parent reply Rumbu <rumbu rumbu.ro> writes:
On Saturday, 11 December 2021 at 14:42:53 UTC, russhy wrote:
 Here is mine

 - 0 allocations

 - configurable

 - lets you use it how you wish

 - fast
You know that this is already in Phobos?

```
"abc;def;ghi".splitter(';').joiner
```
Dec 11 2021
parent reply russhy <russhy_s gmail.com> writes:
On Saturday, 11 December 2021 at 18:51:12 UTC, Rumbu wrote:
 On Saturday, 11 December 2021 at 14:42:53 UTC, russhy wrote:
 Here is mine

 - 0 allocations

 - configurable

 - lets you use it how you wish

 - fast
 You know that this is already in Phobos?

 ```
 "abc;def;ghi".splitter(';').joiner
 ```
You need to import an 8k-line module that itself imports other modules, and then the code is hard to read:

https://github.com/dlang/phobos/blob/v2.098.0/std/algorithm/iteration.d#L2917
Dec 11 2021
parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Saturday, 11 December 2021 at 19:50:55 UTC, russhy wrote:
 you need to import a 8k lines of code module that itself 
 imports other modules, and then the code is hard to read
I agree.

```
@safe:

auto deatheater(char stripchar)(string str) {
    struct voldemort {
        immutable(char)* begin, end;
        bool empty(){ return begin == end; }
        char front(){ return *begin; }
        char back() @trusted{ return *(end-1); }
        void popFront() @trusted{
            while(begin != end){begin++; if (*begin != stripchar) break; }
        }
        void popBack() @trusted{
            while(begin != end){end--; if (*(end-1) != stripchar) break; }
        }
        this(string s) @trusted{
            begin = s.ptr;
            end = s.ptr + s.length;
        }
    }
    return voldemort(str);
}

void main()
{
    import std.stdio;
    string str = "abc;def;ab";
    foreach(c; deatheater!';'(str)) write(c);
    writeln();
    foreach_reverse(c; deatheater!';'(str)) write(c);
}
```
Dec 12 2021
parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Sunday, 12 December 2021 at 08:58:29 UTC, Ola Fosheim Grøstad 
wrote:
         this(string s) @trusted{
             begin = s.ptr;
             end = s.ptr + s.length;
         }
 	}
Bug, it fails if the string ends or starts with ';'.

Fix:

```
        this(string s) @trusted{
            begin = s.ptr;
            end = s.ptr + s.length;
            while(begin!=end && *begin==stripchar) begin++;
            while(begin!=end && *(end-1)==stripchar) end--;
        }
```
Dec 12 2021
parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
Of course, since it is easy to mess up and use ranges in the 
wrong way, you might want to add ```assert```s. That is most 
likely *helpful* to newbies that might want to use your kickass 
library function:

```
auto helpfuldeatheater(char stripchar)(string str) {
	struct voldemort {
         immutable(char)* begin, end;
         bool empty(){ return begin == end; }
         char front(){ assert(!empty); return *begin; }
         char back() @trusted{ assert(!empty); return *(end-1); }
         void popFront() @trusted{
             assert(!empty);
             while(begin != end){begin++; if (*begin != stripchar) break; }
         }
         void popBack() @trusted{
             assert(!empty);
             while(begin != end){end--; if (*(end-1) != stripchar) break; }
         }
         this(string s) @trusted{
             begin = s.ptr;
             end = s.ptr + s.length;
             while(begin!=end && *begin==stripchar) begin++;
             while(begin!=end && *(end-1)==stripchar) end--;
         }
	}
     return voldemort(str);
}
```
Dec 12 2021
prev sibling parent reply Matheus <matheus gmail.com> writes:
On Wednesday, 8 December 2021 at 11:23:45 UTC, BoQsc wrote:
 ...
 The character I want to skip: `;`
My C way of thinking while using D:

import std;

string stripsemicolons(string input){
    char[] s = input.dup;
    int j=0;
    for(int i=0;i<input.length;++i){
        if(s[i] == ';'){ continue; }
        s[j++] = s[i];
    }
    s.length = j;
    return s.idup;
}

void main(){
    string s = ";testing;this;thing!;";
    writeln(s);
    writeln(s.stripsemicolons);
    return;
}

Matheus.
Dec 10 2021
parent reply Stanislav Blinov <stanislav.blinov gmail.com> writes:
On Friday, 10 December 2021 at 13:22:58 UTC, Matheus wrote:

 My C way of thinking while using D:

 import std;

 string stripsemicolons(string input){
     char[] s = input.dup;
     int j=0;
     for(int i=0;i<input.length;++i){
         if(s[i] == ';'){ continue; }
         s[j++] = s[i];
     }
     s.length = j;
     return s.idup;
 }
Oooh, finally someone suggested to preallocate storage for all these reinventions of the wheel :D

I would suggest, instead of the final idup, checking the length and only duplicating if a certain waste threshold is exceeded, otherwise just doing https://dlang.org/phobos/std_exception.html#assumeUnique (or a cast to string). The result is unique either way.

Threshold could be relative for short strings and absolute for long ones. Makes little sense reallocating if you only waste a couple bytes, but makes perfect sense if you've just removed pages and pages of semicolons ;)

Be interesting to see if this thread does evolve into a SIMD search...
Dec 10 2021
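
A minimal sketch of that suggestion, building on the quoted function. The two threshold values (a quarter of the buffer, and 4096 bytes) are arbitrary assumptions for illustration; nothing in the thread specifies them.

```d
import std.exception : assumeUnique;
import std.stdio : writeln;

string stripsemicolons(string input)
{
    char[] s = input.dup;            // the only unconditional allocation
    size_t j = 0;
    foreach (c; input)
        if (c != ';')
            s[j++] = c;

    immutable wasted = s.length - j; // bytes freed by skipping semicolons

    // Hypothetical thresholds: relative for short strings,
    // absolute for long ones.
    enum size_t absoluteLimit = 4096;
    immutable tooWasteful = wasted > absoluteLimit || wasted > s.length / 4;

    s = s[0 .. j];
    if (tooWasteful)
        return s.idup;               // copy into a right-sized block
    return assumeUnique(s);          // no copy: s is the only reference
}

void main()
{
    writeln(stripsemicolons(";testing;this;thing!;"));
}
```

`assumeUnique` is safe here only because the mutable buffer never escapes the function. After the call the local slice should not be used again.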
next sibling parent Rumbu <rumbu rumbu.ro> writes:
On Friday, 10 December 2021 at 18:47:53 UTC, Stanislav Blinov 
wrote:
 Be interesting to see if this thread does evolve into a SIMD
http://lemire.me/blog/2017/01/20/how-quickly-can-you-remove-spaces-from-a-string/
Dec 10 2021
prev sibling next sibling parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Friday, 10 December 2021 at 18:47:53 UTC, Stanislav Blinov 
wrote:
 Oooh, finally someone suggested to preallocate storage for all 
 these reinventions of the wheel :D
```
import std.stdio;

char[] dontdothis(string s, int i=0, int skip=0){
    if (s.length == i) return new char[](i - skip);
    if (s[i] == ';') return dontdothis(s, i+1, skip+1);
    auto r = dontdothis(s, i+1, skip);
    r[i-skip] = s[i];
    return r;
}

int main() {
    string s = "abc;def;ab";
    string s_new = cast(string)dontdothis(s);
    writeln(s_new);
    return 0;
}
```
Dec 10 2021
parent reply Stanislav Blinov <stanislav.blinov gmail.com> writes:
On Friday, 10 December 2021 at 23:53:47 UTC, Ola Fosheim Grøstad 
wrote:

```d
 char[] dontdothis(string s, int i=0, int skip=0){
     if (s.length == i) return new char[](i - skip);
     if (s[i] == ';') return dontdothis(s, i+1, skip+1);
     auto r = dontdothis(s, i+1, skip);
     r[i-skip] = s[i];
     return r;
 }
```
That is about 500% not what I meant. At all. Original code in question:

- duplicates string unconditionally as mutable storage
- uses said mutable storage to gather all non-semicolons
- duplicates said mutable storage (again) as immutable

I suggested to make the second duplicate conditional, based on amount of space freed by skipping semicolons. What you're showing is... indeed, don't do this, but I fail to see what that has to do with my suggestion, or the original code.
 Scanning short strings twice is not all that expensive as they 
 will stay in the CPU cache when you run over them a second 
 time.
```d
 import std.stdio;

  @safe:
 string stripsemicolons(string s) @trusted {
     int i,n;
     foreach(c; s) n += c != ';'; // premature optimization
     auto r = new char[](n);
     foreach(c; s) if (c != ';') r[i++] = c;
     return cast(string)r;
 }
```
Again, that is a different algorithm than what I was responding to. But sure, short strings - might as well. So long as you do track the distinction somewhere up in the code and don't simply call this on all strings.
Dec 11 2021
parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Saturday, 11 December 2021 at 09:26:06 UTC, Stanislav Blinov 
wrote:
 What you're showing is... indeed, don't do this, but I fail to 
 see what that has to do with my suggestion, or the original 
 code.
You worry too much, just have fun with differing ways of expressing the same thing. (Recursion can be completely fine if the compiler supports it well. Tail recursion that is, not my example.)
 Again, that is a different algorithm than what I was responding 
 to.
Slightly different, but same idea. Isn't the point of this thread to present N different ways of doing the same thing? :-)
Dec 11 2021
prev sibling next sibling parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Friday, 10 December 2021 at 18:47:53 UTC, Stanislav Blinov 
wrote:
 Threshold could be relative for short strings and absolute for 
 long ones. Makes little sense reallocating if you only waste a 
 couple bytes, but makes perfect sense if you've just removed 
 pages and pages of semicolons ;)
Scanning short strings twice is not all that expensive as they will stay in the CPU cache when you run over them a second time.

```
import std.stdio;

@safe:
string stripsemicolons(string s) @trusted {
    int i,n;
    foreach(c; s) n += c != ';'; // premature optimization
    auto r = new char[](n);
    foreach(c; s) if (c != ';') r[i++] = c;
    return cast(string)r;
}

int main() {
    writeln(stripsemicolons("abc;def;ab"));
    return 0;
}
```
Dec 11 2021
prev sibling parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Friday, 10 December 2021 at 18:47:53 UTC, Stanislav Blinov 
wrote:
 Threshold could be relative for short strings and absolute for 
 long ones. Makes little sense reallocating if you only waste a 
 couple bytes, but makes perfect sense if you've just removed 
 pages and pages of semicolons ;)
Like this?

```
@safe:

string prematureoptimizations(string s, char stripchar) @trusted {
    import core.memory;
    immutable uint flags = GC.BlkAttr.NO_SCAN|GC.BlkAttr.APPENDABLE;
    char* begin = cast(char*)GC.malloc(s.length+1, flags);
    char* end = begin + 1;
    foreach(c; s) {
        immutable size_t notsemicolon = c != stripchar;
        // hack: avoid conditional by writing semicolon outside buffer
        *(end - notsemicolon) = c;
        end += notsemicolon;
    }
    immutable size_t len = end - begin - 1;
    begin = cast(char*)GC.realloc(begin, len, flags);
    return cast(string)begin[0..len];
}

void main() {
    import std.stdio;
    string str = "abc;def;ab";
    writeln(prematureoptimizations(str, ';'));
}
```
Dec 13 2021
parent reply Salih Dincer <salihdb hotmail.com> writes:
On Monday, 13 December 2021 at 09:36:57 UTC, Ola Fosheim Grøstad 
wrote:
 ```d
 @safe:

 string prematureoptimizations(string s, char stripchar) 
 @trusted {
     import core.memory;
     immutable uint flags = 
 GC.BlkAttr.NO_SCAN|GC.BlkAttr.APPENDABLE;
     char* begin = cast(char*)GC.malloc(s.length+1, flags);
     char* end = begin + 1;
     foreach(c; s) {
         immutable size_t notsemicolon = c != stripchar;
         // hack: avoid conditional by writing semicolon outside 
 buffer
         *(end - notsemicolon) = c;
         end += notsemicolon;
     }
     immutable size_t len = end - begin - 1;
     begin = cast(char*)GC.realloc(begin, len, flags);
     return cast(string)begin[0..len];
 }

 void main() {
     import std.stdio;
     string str = "abc;def;ab";
     writeln(prematureoptimizations(str, ';'));
 }
 ```
It seems faster than algorithms in Phobos. We would love to see this in our new Phobos.

```d
enum str = "abc;def;gh";
enum res = "abcdefgh";

void main()
{
    void mallocReplace()
    {
        import core.memory;

        immutable uint flags = GC.BlkAttr.NO_SCAN | GC.BlkAttr.APPENDABLE;
        char* begin = cast(char*)GC.malloc(str.length+1, flags);
        char* end = begin + 1;
        foreach(c; str) {
            immutable size_t f = c != ';';
            *(end - f) = c;
            end += f;
        }
        immutable size_t len = end - begin - 1;
        begin = cast(char*)GC.realloc(begin, len, flags);
        assert(begin[0..len] == res);
    }

    void normalReplace()
    {
        import std.string;
        string result = str.replace(';',"");
        assert(result == res);
    }

    void delegate() t1 = &normalReplace;
    void delegate() t2 = &mallocReplace;

    import std.stdio : writefln;
    import std.datetime.stopwatch : benchmark;

    auto bm = benchmark!(t1, t2)(1_000_000);

    writefln("Replace: %s msecs", bm[0].total!"msecs");
    writefln("Malloc : %s msecs", bm[1].total!"msecs");
}/* Console Out:
Replace: 436 msecs
Malloc : 259 msecs
*/
```
Dec 22 2021
next sibling parent reply Stanislav Blinov <stanislav.blinov gmail.com> writes:
On Thursday, 23 December 2021 at 07:14:35 UTC, Salih Dincer wrote:

 It seems faster than algorithms in Phobos. We would love to see 
 this in our new Phobos.

 ```d
   void mallocReplace()
   void normalReplace()
     string result = str.replace(';',"");

 }/* Console Out:
 Replace: 436 msecs
 Malloc : 259 msecs
 */
 ```
You're comparing apples and oranges. When benchmarking, at least look at the generated assembly first. `replace` is not in Phobos, it's a D runtime vestige. It's not getting inlined even in release builds with LTO, whereas that manual version would. Also, benchmark with runtime strings, not literals, otherwise the compiler might even swallow the thing whole. What you're benchmarking is, basically, inlined optimized search in a literal versus a function call.
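
A minimal sketch of what a fairer setup might look like, assuming the input is read at run time and both candidates are ordinary function calls that return their result. The names `stripChar`, `viaReplace`, `viaLoop` and `sink` are hypothetical and exist only for this illustration; no timings are claimed, and the generated assembly would still need checking.

```d
import std.array : replace;
import std.datetime.stopwatch : benchmark;
import std.stdio : writefln;

// Hypothetical helper for this sketch: a plain loop that drops `ch`.
string stripChar(string s, char ch)
{
    string result;
    result.reserve(s.length); // at most s.length characters are kept
    foreach (char c; s)
        if (c != ch)
            result ~= c;
    return result;
}

void main(string[] args)
{
    // Take the input from the command line when one is given, so the
    // string is only known at run time rather than being a literal
    // the compiler can see through.
    string input = args.length > 1 ? args[1] : "abc;def;ab";

    // Accumulate result lengths so neither call can be discarded as dead code.
    size_t sink;

    void viaReplace() { sink += input.replace(";", "").length; }
    void viaLoop()    { sink += stripChar(input, ';').length; }

    auto results = benchmark!(viaReplace, viaLoop)(100_000);

    writefln("replace: %s msecs", results[0].total!"msecs");
    writefln("loop   : %s msecs", results[1].total!"msecs");
    writefln("(sink: %s)", sink);
}
```

Both candidates here take the same runtime string and produce a real `string`, which is closer to comparing like with like than timing a nested function that reads an `enum` literal.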
Dec 23 2021
parent Salih Dincer <salihdb hotmail.com> writes:
On Thursday, 23 December 2021 at 16:13:49 UTC, Stanislav Blinov 
wrote:
 You're comparing apples and oranges.
 When benchmarking, at least look at
 the generated assembly first.
I looked now and you're right. So much so that it should be eggplant not apple, banana not orange... :) It's an irrelevant benchmark! Thank you all...
Dec 23 2021
prev sibling parent rumbu <rumbu rumbu.ro> writes:
On Thursday, 23 December 2021 at 07:14:35 UTC, Salih Dincer wrote:

 It seems faster than algorithms in Phobos. We would love to see 
 this in our new Phobos.
 Replace: 436 msecs
 Malloc : 259 msecs
 */
It seems so because `mallocReplace` is cheating a lot:

- it is not called through another function the way `replace` is;
- it accesses the constant `str` directly;
- it assumes that there is a single character to replace;
- it assumes that the character will be deleted, not replaced with something;
- it assumes that the character is always ';';
- it assumes that the replacement string is not bigger than the replaced one, so it knows exactly how much space to allocate;
- it does not take any parameters; at least on x86 this means that no arguments are pushed when it's called;
- it does not return a string, it just compares its result with another constant.

Since we already know all this stuff, we can go further :)

```d
string superFast()
{
    enum r = str.replace(";", "");
    return r;
}
```
 Replace: 436 msecs
 Malloc : 259 msecs
 SuperFast: 0 msecs
Dec 23 2021