digitalmars.D - std.string.indexOf with an optional start-at parameter?

reply Aleksandar Ružičić <ruzicic.aleksandar gmail.com> writes:
I needed std.string.indexOf to accept start position in the string to
start the search at. I was really surprised when I realized that this
(to me) standard parameter is "missing" (I'm used to indexOf in
javascript, strpos in php and equivalent methods in other languages,
which support start offset parameter).

There might be some other function (in some other module) that does
what I want but I wasn't able to find it (I find D's documentation not
easy to search and read), so I've copied indexOf to my module and
added wanted functionality:

https://gist.github.com/900589

now, I'm able to write, for example:

auto pos = indexOf(haystack, '$', 10); // will start the search at the 11th char in haystack

and

auto pos = indexOf(haystack, '$', -5); // will start the search at the 5th char from the end

My question is: is there a reason why this functionality is missing from
Phobos (maybe there's some language feature I'm not aware of?), and if no
such reason exists, would it be possible to add it in a future version of
Phobos/dmd?
Apr 03 2011
next sibling parent reply "Robert Jacques" <sandford jhu.edu> writes:
On Sun, 03 Apr 2011 13:39:40 -0400, Aleksandar Ružičić  
<ruzicic.aleksandar gmail.com> wrote:

 I needed std.string.indexOf to accept start position in the string to
 start the search at. I was really surprised when I realized that this
 (to me) standard parameter is "missing" (I'm used to indexOf in
 javascript, strpos in php and equivalent methods in other languages,
 which support start offset parameter).

 There might be some other function (in some other module) that does
 what I want but I wasn't able to find it (I find D's documentation not
 easy to search and read), so I've copied indexOf to my module and
 added wanted functionality:

 https://gist.github.com/900589

 now, I'm able to write, for example:

 auto pos = indexOf(haystack, '$', 10); // will start the search at the 11th char in haystack

auto pos = indexOf(haystack[10..$], '$') + 10;

 and

 auto pos = indexOf(haystack, '$', -5); // will start the search at the 5th char from the end

auto pos = indexOf(haystack[$-5..$], '$') + haystack.length-5;

 My question is: is there a reason why this functionality is missing from
 Phobos (maybe there's some language feature I'm not aware of?), and if no
 such reason exists, would it be possible to add it in a future version of
 Phobos/dmd?

Yes, the language feature is called slicing. See above. Also, you may want to look at the various find methods in std.algorithm. Generally, it's better to work with ranges/slices than indexes due to UTF's encoding scheme.
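
The slicing idiom above can be wrapped in a small helper so that the "not found" case and an out-of-range start are handled in one place. This is only a sketch: indexOfFrom is a hypothetical name, not a Phobos function, and the start offset is assumed to be a code-unit (byte) index into a UTF-8 string that falls on a code-point boundary.

```d
import std.string : indexOf;

// Hypothetical helper (not part of Phobos): search for `needle` in
// `haystack`, starting at code-unit offset `start`.
// Returns the offset relative to the whole haystack, or -1 if not found.
ptrdiff_t indexOfFrom(const(char)[] haystack, dchar needle, size_t start)
{
    // Slicing past the end would trip the bounds check, so bail out early.
    if (start >= haystack.length)
        return -1;

    // The slice is a view into the same memory; no characters are copied.
    immutable rel = indexOf(haystack[start .. $], needle);

    // Add the offset only on success; otherwise "-1 + start" would look
    // like a valid position.
    return rel < 0 ? -1 : rel + cast(ptrdiff_t) start;
}
```

Because the result is a code-unit offset, indexOf("Ružičić$", '$') is 10 rather than 7. That is part of why keeping the slice around (as the find functions in std.algorithm do) is often more convenient than juggling indexes.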
Apr 03 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/3/11 1:14 PM, Aleksandar Ružičić wrote:
 I thought first of slicing, but isn't that making a copy of string?

It's not.

 And also, if I'm not mistaken if you slice out of range bounds (i.e.
 haystack[5..$] when haystack.length < 5) you'll get exception, right?

Correct.

 That's why I think this would be nice to have feature, so you don't
 have to worry if start position is within the string bounds, and you
 won't need to write this:

I think that's a natural and simple improvement of indexOf. The one aspect that I'm unsure about is starting from the end for negative indices. Could you please submit an enhancement request to bugzilla?

Andrei
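
To illustrate the two answers above: a slice is just a pointer/length view into the original string, and a start offset can be clamped before slicing so that the bounds check never fires. A minimal sketch, assuming a hypothetical helper named tailFrom (it is not part of Phobos):

```d
import std.algorithm : min;

// Hypothetical helper: return the part of `s` starting at code-unit
// offset `start`, or an empty slice if `start` is past the end.
const(char)[] tailFrom(const(char)[] s, size_t start)
{
    // Clamping keeps the slice bounds valid, so no range error is raised.
    return s[min(start, s.length) .. $];
}

unittest
{
    string haystack = "hello world";
    auto tail = tailFrom(haystack, 6);
    assert(tail == "world");
    // No copy: the slice points into the original string's memory.
    assert(tail.ptr == haystack.ptr + 6);
    // An out-of-range start yields an empty slice instead of an error.
    assert(tailFrom(haystack, 100).length == 0);
}
```

Strictly speaking, an unclamped out-of-bounds slice raises a RangeError (an Error rather than an Exception), and only in builds where bounds checking is enabled.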
Apr 03 2011
parent reply KennyTM~ <kennytm gmail.com> writes:
On Apr 4, 11 02:29, Aleksandar Ružičić wrote:
 On Sun, Apr 3, 2011 at 8:16 PM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org>  wrote:
 It's not.

Seems I've missed that in the docs, I thought it will always make a copy :)

 I think that's a natural and simple improvement of indexOf. The one aspect
 that I'm unsure about is starting from the end for negative indices.

Negative indices might seem a bit odd but it's standard in other languages (like javascript and php which I've already mentioned). I would even like to see this in D:

 array[-2];  // get 'a' from "foobar"

 same for slicing:

 array[-4..2];  // get "ob" from "foobar"

You mean Python and Ruby.

  - Javascript does not support negative index. In fact, JS has no true arrays, it only has associative array.
  - PHP does not support negative index. http://ideone.com/8MZ2T

Many other languages that I've heard of like C#, C, C++, Go, Haskell and Java also do not support negative index.

Also, interestingly, Perl 5 had negative index, but Perl 6 killed it. (http://perlcabal.org/syn/S09.html#Negative_and_differential_subscripts)

  "The Perl 6 semantics avoids indexing discontinuities (a source of subtle runtime errors), and provides ordinal access in both directions at both ends of the array."

This does not mean negative index is useless (I use it all the time when programming in Python), but D shouldn't add a feature just because other languages have it, or even you think that language had it.

 Could you please submit an enhancement request to bugzilla?

sure!
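
For what it's worth, D already spells "count from the end" with $ inside an index or slice expression, so the two examples quoted above can be written without negative indices. The sketch below also shows how Python-style signed bounds could be emulated; sliceSigned is a hypothetical helper written for illustration and is not part of Phobos. Note that with D's exclusive upper bound the "ob" example becomes [$-4 .. $-2].

```d
// Hypothetical helper: slice with signed bounds, where a negative bound
// counts back from the end of the string (as in Python).
const(char)[] sliceSigned(const(char)[] s, ptrdiff_t from, ptrdiff_t to)
{
    // Map a possibly negative bound to an ordinary unsigned index.
    size_t resolve(ptrdiff_t i)
    {
        return i < 0 ? s.length - cast(size_t)(-i) : cast(size_t) i;
    }

    // The usual bounds check still applies to the resolved indices.
    return s[resolve(from) .. resolve(to)];
}

unittest
{
    // Built-in: `$` is the length of the array being indexed.
    assert("foobar"[$ - 2] == 'a');
    assert("foobar"[$ - 4 .. $ - 2] == "ob");

    // Emulated signed bounds give the same result.
    assert(sliceSigned("foobar", -4, -2) == "ob");
    assert(sliceSigned("foobar", 1, -1) == "ooba");
}
```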

Apr 03 2011
parent KennyTM~ <kennytm gmail.com> writes:
On Apr 4, 11 04:07, Aleksandar Ružičić wrote:
 You mean Python and Ruby.

   - Javascript does not support negative index. In fact, JS has no true
 arrays, it only has associative array.
   - PHP does not support negative index. http://ideone.com/8MZ2T

I was talking about javascript's String.prototype.indexOf() and php's strpos functions, not about array indexing.

I see.

 But even for that I wasn't correct :/. Negative start-at index is
 available for substr (both, in php and js), that's why I have confused
 it with indexOf (I thought these things are consistent..)

PHP will never be consistent. ;)

 And javascript _does_ have true arrays, but it _doesn't_ have true
 associative arrays (those are object literals).

I would not call it a true array if it is indexed by string internally. Anyway, this is not the main point.

 This does not mean negative index is useless (I use it all the time when
 programming in Python), but D shouldn't add a feature just because other
 languages have it, or even you think that language had it.

I know, I was just expressing my opinion (what I would like to see in a language, I never programmed in Python or Perl, so I was thinking that negative indices for array indexing are not supported in any language that I know of), I wasn't proposing a new feature :)

Right.
Apr 03 2011
prev sibling next sibling parent Aleksandar Ružičić <ruzicic.aleksandar gmail.com> writes:
I thought first of slicing, but isn't that making a copy of string?
And also, if I'm not mistaken if you slice out of range bounds (i.e.
haystack[5..$] when haystack.length < 5) you'll get exception, right?

That's why I think this would be nice to have feature, so you don't
have to worry if start position is within the string bounds, and you
won't need to write this:

 auto pos = indexOf(haystack[$-5..$], '$') + haystack.length-5;

when you want to start search from the end (since it's somehow less readable than indexOf(haystack, '$', -5)).

On Sun, Apr 3, 2011 at 7:55 PM, Robert Jacques <sandford jhu.edu> wrote:
 On Sun, 03 Apr 2011 13:39:40 -0400, Aleksandar Ružičić
 <ruzicic.aleksandar gmail.com> wrote:

 I needed std.string.indexOf to accept start position in the string to
 start the search at. I was really surprised when I realized that this
 (to me) standard parameter is "missing" (I'm used to indexOf in
 javascript, strpos in php and equivalent methods in other languages,
 which support start offset parameter).

 There might be some other function (in some other module) that does
 what I want but I wasn't able to find it (I find D's documentation not
 easy to search and read), so I've copied indexOf to my module and
 added wanted functionality:

 https://gist.github.com/900589

 now, I'm able to write, for example:

 auto pos = indexOf(haystack, '$', 10); // will start the search at the 11th char in haystack

auto pos = indexOf(haystack[10..$], '$') + 10;

 and

 auto pos = indexOf(haystack, '$', -5); // will start the search at the 5th char from the end

auto pos = indexOf(haystack[$-5..$], '$') + haystack.length-5;

 My question is: is there a reason why this functionality is missing from
 Phobos (maybe there's some language feature I'm not aware of?), and if no
 such reason exists, would it be possible to add it in a future version of
 Phobos/dmd?

Yes, the language feature is called slicing. See above. Also, you may want to look at the various find methods in std.algorithm. Generally, it's better to work with ranges/slices than indexes due to UTF's encoding scheme.

Apr 03 2011
prev sibling next sibling parent Aleksandar Ružičić <ruzicic.aleksandar gmail.com> writes:
On Sun, Apr 3, 2011 at 8:16 PM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 It's not.

Seems I've missed that in the docs, I thought it will always make a copy :)

 I think that's a natural and simple improvement of indexOf. The one aspect
 that I'm unsure about is starting from the end for negative indices.

Negative indices might seem a bit odd but it's standard in other languages (like javascript and php which I've already mentioned). I would even like to see this in D:

array[-2]; // get 'a' from "foobar"

same for slicing:

array[-4..2]; // get "ob" from "foobar"

 Could you please submit an enhancement request to bugzilla?

sure!
Apr 03 2011
prev sibling next sibling parent Aleksandar Ružičić <ruzicic.aleksandar gmail.com> writes:
 You mean Python and Ruby.

   - Javascript does not support negative index. In fact, JS has no true
 arrays, it only has associative array.
   - PHP does not support negative index. http://ideone.com/8MZ2T

I was talking about javascript's String.prototype.indexOf() and php's strpos functions, not about array indexing.

But even for that I wasn't correct :/. Negative start-at index is available for substr (both, in php and js), that's why I have confused it with indexOf (I thought these things are consistent..)

And javascript _does_ have true arrays, but it _doesn't_ have true associative arrays (those are object literals).

 This does not mean negative index is useless (I use it all the time when
 programming in Python), but D shouldn't add a feature just because other
 languages have it, or even you think that language had it.

I know, I was just expressing my opinion (what I would like to see in a language, I never programmed in Python or Perl, so I was thinking that negative indices for array indexing are not supported in any language that I know of), I wasn't proposing a new feature :)
Apr 03 2011
prev sibling parent Aleksandar Ružičić <ruzicic.aleksandar gmail.com> writes:
On Sun, Apr 3, 2011 at 10:56 PM, KennyTM~ <kennytm gmail.com> wrote:
 And javascript _does_ have true arrays, but it _doesn't_ have true
 associative arrays (those are object literals).

I would not call it a true array if it is indexed by string internally. Anyway, this is not the main point.

You're right that JS arrays are not a point here, but I must again disagree with you :) I write Javascript code for living, so I think I know what I'm talking about:

var a = ["foo", "bar", "baz"]; // defining an array
a[0];   // "foo"
a["0"]; // this would also work, but only because JS casts "0" to integer implicitly

there is no string indexing with arrays, only with objects (associative arrays, maps, call it as you like):

var o = {foo: "bar", 0: "baz"}; // defining an object (a.k.a AA)
o.foo;    // "bar"
o["foo"]; // same, returns "bar"
o[0];     // "baz"

now, this just looks like indexing an array, but it really ain't, it's property getter, but JS allows you have numeric properties so it can be confusing, I admit.

That's all I have to say about JS arrays, won't be talking non-D anymore :)

Regards,
Aleksandar
Apr 03 2011