digitalmars.D.bugs - [Issue 5977] New: String splitting with empty separator
- d-bugmail puremagic.com (58/58) May 10 2011 http://d.puremagic.com/issues/show_bug.cgi?id=5977
- d-bugmail puremagic.com (7/7) Sep 25 2011 http://d.puremagic.com/issues/show_bug.cgi?id=5977
- d-bugmail puremagic.com (10/10) Oct 22 2012 http://d.puremagic.com/issues/show_bug.cgi?id=5977
- d-bugmail puremagic.com (25/48) Oct 22 2012 http://d.puremagic.com/issues/show_bug.cgi?id=5977
- d-bugmail puremagic.com (11/11) Jan 03 2013 http://d.puremagic.com/issues/show_bug.cgi?id=5977
http://d.puremagic.com/issues/show_bug.cgi?id=5977

           Summary: String splitting with empty separator
           Product: D
           Version: D2
          Platform: x86
        OS/Version: Windows
            Status: NEW
          Keywords: patch
          Severity: normal
          Priority: P2
         Component: Phobos
        AssignedTo: nobody puremagic.com
        ReportedBy: bearophile_hugs eml.cc

--- Comment #0 from bearophile_hugs eml.cc 2011-05-10 16:30:55 PDT ---
This D2 program seems to go into an infinite loop (dmd 2.053beta):

    import std.string;
    void main() {
        split("a test", "");
    }

------------------------

My suggestion is to add code like this to std.array.split():

    if (delim.length == 0)
        return split(s);

This means that an empty splitting string behaves like splitting on generic
whitespace. This is useful in code like:

    auto foo(string txt, string delim="") {
        return txt.split(delim);
    }

Calling foo with no delim argument splits txt on whitespace; otherwise it
splits on the given string. This lets foo() use both forms of split without
any if conditions. Python does the same, using None instead of an empty
string.

The modified split is something like this (there is an isSomeString!S2 test
because strings are special: they aren't generic arrays, and splitting on
whitespace is meaningful for strings only):

    Unqual!(S1)[] split(S1, S2)(S1 s, S2 delim)
    if (isForwardRange!(Unqual!S1) && isForwardRange!S2)
    {
        Unqual!S1 us = s;
        if (isSomeString!S2 && delim.length == 0) {
            return split(s);
        } else {
            auto app = appender!(Unqual!(S1)[])();
            foreach (word; std.algorithm.splitter(us, delim)) {
                app.put(word);
            }
            return app.data;
        }
    }

Besides this change, I presume std.algorithm.splitter() also needs to test
for an empty delim.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 10 2011
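The fallback proposed in comment #0 can also be tried in user code without patching Phobos. The following is only a minimal sketch: `splitOrWhitespace` is a hypothetical helper name, not an existing library function. It assumes the one-argument `std.array.split` (whitespace splitting) and the two-argument form behave as documented.

```d
import std.array : split;

// Hypothetical helper (not part of Phobos): an empty delimiter falls
// back to whitespace splitting, as suggested in comment #0.
string[] splitOrWhitespace(string txt, string delim = "")
{
    if (delim.length == 0)
        return txt.split();   // one-argument form: splits on whitespace
    return txt.split(delim);  // two-argument form: splits on delim
}

unittest
{
    // No delimiter given: behaves like a whitespace split.
    assert(splitOrWhitespace("a test") == ["a", "test"]);
    // Explicit delimiter: behaves like the ordinary two-argument split.
    assert(splitOrWhitespace("a,b,c", ",") == ["a", "b", "c"]);
}
```

The unittest block can be run with `dmd -unittest -main -run file.d`. Keeping the fallback in a wrapper leaves the library's own handling of an empty delimiter untouched, which matters given the objections raised later in the thread.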
http://d.puremagic.com/issues/show_bug.cgi?id=5977

--- Comment #1 from bearophile_hugs eml.cc 2011-09-25 08:16:21 PDT ---
Alternative: throw an ArgumentError("delim argument is empty") exception if
delim is empty.
Sep 25 2011
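The alternative in comment #1 can be sketched as a checked wrapper. This is an illustration under assumptions, not the fix that went into Phobos: `splitChecked` is a hypothetical name, and because I am not aware of a Phobos type called `ArgumentError`, the sketch uses `std.exception.enforce`, which throws a plain `Exception`.

```d
import std.array : split;
import std.exception : assertThrown, enforce;

// Hypothetical wrapper (not part of Phobos): rejects an empty delimiter
// instead of guessing at a default behavior.
string[] splitChecked(string txt, string delim)
{
    // enforce throws an Exception carrying this message when the
    // condition is false. (ArgumentError from comment #1 is replaced
    // by Exception here; that substitution is an assumption.)
    enforce(delim.length != 0, "delim argument is empty");
    return txt.split(delim);
}

unittest
{
    // A non-empty delimiter is passed straight through to split.
    assert(splitChecked("a test", " ") == ["a", "test"]);
    // An empty delimiter is reported as an error.
    assertThrown!Exception(splitChecked("a test", ""));
}
```

Failing loudly keeps strings and other ranges consistent, since no string-specific meaning is attached to an empty delimiter.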
http://d.puremagic.com/issues/show_bug.cgi?id=5977

monarchdodra gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |daniel350 bigpond.com

--- Comment #2 from monarchdodra gmail.com 2012-10-22 02:42:42 PDT ---
*** Issue 8551 has been marked as a duplicate of this issue. ***
Oct 22 2012
http://d.puremagic.com/issues/show_bug.cgi?id=5977

monarchdodra gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
                 CC|                            |monarchdodra gmail.com
         AssignedTo|nobody puremagic.com        |monarchdodra gmail.com

--- Comment #3 from monarchdodra gmail.com 2012-10-22 02:52:16 PDT ---
(In reply to comment #0)
> This D2 program seems to go into an infinite loop (dmd 2.053beta):
>
>     import std.string;
>     void main() {
>         split("a test", "");
>     }
>
> ------------------------
>
> My suggestion is to add code like this to std.array.split():
>
>     if (delim.length == 0)
>         return split(s);
>
> This means that an empty splitting string behaves like splitting on generic
> whitespace. This is useful in code like:
>
>     auto foo(string txt, string delim="") {
>         return txt.split(delim);
>     }

I think it is a bad idea on two counts:

1. If the user wanted that behavior, he'd have written it as such. If the user
actually passed a separator that is an empty range, he probably didn't mean
for it to split on whitespace.

2. I think it would also introduce a difference in behavior between strings
and non-strings. Supposing r is empty:

    "hello world".split(""); // OK, splits on whitespace
    [1, 2].split(r);         // Derp.

(In reply to comment #1)
> Alternative: throw an ArgumentError("delim argument is empty") exception if
> delim is empty.

I *really* think that is a *much* saner approach. Splitting with an empty
separator is just not logical. Trying to force a default behavior in that
scenario is wishful thinking (IMO).

I think it should throw an error. I'll implement this.
Oct 22 2012
http://d.puremagic.com/issues/show_bug.cgi?id=5977

hsteoh quickfur.ath.cx changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hsteoh quickfur.ath.cx

--- Comment #4 from hsteoh quickfur.ath.cx 2013-01-03 20:28:42 PST ---
FWIW, in Perl, splitting on an empty string simply returns an array of
characters. I think that better reflects the symmetry with join("", array).
Jan 03 2013
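The Perl-style behavior mentioned in comment #4 can be approximated in user code to show the symmetry with join. This is a rough sketch only: `splitChars` is a hypothetical helper, and it assumes that "characters" means decoded code points, which is what iterating a D string as a range yields.

```d
import std.algorithm.iteration : map;
import std.array : array, join;
import std.conv : to;

// Hypothetical helper (not part of Phobos): splits a string into
// one-character strings, roughly like Perl's split on an empty string.
string[] splitChars(string txt)
{
    // Each range element is a decoded code point (dchar), which is
    // converted back to a UTF-8 string.
    return txt.map!(c => c.to!string).array;
}

unittest
{
    auto parts = splitChars("a test");
    assert(parts == ["a", " ", "t", "e", "s", "t"]);
    // Joining with an empty separator restores the original string.
    assert(parts.join("") == "a test");
}
```

With this reading, splitting on an empty separator and joining with an empty separator are inverses of each other, which is the symmetry the comment points to.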