digitalmars.D.learn - Splitting a string on multiple tokens

reply "ixid" <nuaccount gmail.com> writes:
Is there an effective way of splitting a string on a set of 
tokens? Splitter feels rather limited, and multiple passes give 
you an array of arrays of strings rather than an array of 
strings. I'm not sure if I'm missing an obvious application of 
library methods or if this is absent.
Oct 09 2012
parent reply "jerro" <a a.com> writes:
You can use std.regex.splitter like this:

    import std.regex, std.stdio;

    auto r = regex(`,| |(--)`);
    auto str = "string we,want--to,split";
    writeln(splitter(str, r)); // will print ["string", "we", "want", "to", "split"]
Oct 09 2012
parent reply "ixid" <nuaccount gmail.com> writes:
Thank you, though that removes the tokens, and since they vary, putting them back afterwards would be messy. Is there a way to cut on tokens and keep each token at the end of the statement it terminates? These seem like basic parsing features that are absent.
Oct 10 2012
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
Well, I guess something along these lines:

    auto r = regex(`(?<=,| |(--))`);
    auto str = "string we,want--to,split";
    writeln(splitter(str, r)); // will print: ["string ", "we,", "want--", "to,", "split"]

And just in case: splitter doesn't copy anything, it just slices the original array. If you meant to use this for tight loops, like in a compiler, then you really need something handcrafted for optimal speed.

-- 
Dmitry Olshansky
Oct 11 2012
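
For readers wondering what "something handcrafted" might look like, here is a minimal sketch in D. It is only one possible reading of that suggestion, not code from the thread and not a Phobos function: the name splitKeeping, its signature, and the longest-match rule are all assumptions made for illustration. It scans the string once, cuts after each token, and returns slices of the original string, so nothing is copied apart from the array of slices itself.

```d
/// Hypothetical helper (not part of Phobos): splits `text` after each
/// occurrence of any token in `tokens`, keeping the token at the end of
/// the piece it terminates. Every piece is a slice of `text`.
string[] splitKeeping(string text, string[] tokens)
{
    string[] pieces;
    size_t start = 0; // index where the current piece begins
    size_t i = 0;     // current scan position

    while (i < text.length)
    {
        // Length of the longest token that matches at position i
        // (0 means no token matches here; empty tokens are ignored).
        size_t matched = 0;
        foreach (tok; tokens)
        {
            if (tok.length > matched
                && text.length - i >= tok.length
                && text[i .. i + tok.length] == tok)
            {
                matched = tok.length;
            }
        }

        if (matched > 0)
        {
            i += matched;
            pieces ~= text[start .. i]; // piece ends with its token
            start = i;
        }
        else
        {
            ++i;
        }
    }

    if (start < text.length)
        pieces ~= text[start .. $]; // trailing piece with no token

    return pieces;
}

unittest
{
    auto str = "string we,want--to,split";
    auto parts = splitKeeping(str, [",", " ", "--"]);
    assert(parts == ["string ", "we,", "want--", "to,", "split"]);
    assert(parts[0].ptr is str.ptr); // a slice of the original, not a copy
}
```

Compiling with `dmd -unittest -main` runs the check above. Note that two adjacent tokens, as in "a, b", yield a piece that consists only of the second token; whether that is desirable depends on the grammar being parsed.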