www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.bugs - [Issue 7716] New: Add an indexed overload to countUntil

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7716

           Summary: Add an indexed overload to countUntil
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Phobos
        AssignedTo: nobody puremagic.com
        ReportedBy: andrej.mitrovich gmail.com



09:55:27 PDT ---
Sometimes I want to find the index of a substring, but not necessarily the
index of the first occurence. E.g. in the string "foo x foo y" I want to get
the index of the second foo. Here's a simple implementation based on the
existing countUntil:

import std.algorithm : startsWith;
import std.array;
import std.traits;

sizediff_t countUntil(alias pred = "a == b", R, N)(R haystack, N needle, size_t
index)
if (is(typeof(startsWith!pred(haystack, needle))))
{
    size_t count;
    static if (isNarrowString!R)
    {
        // Narrow strings are handled a bit differently
        auto length = haystack.length;
        for (; !haystack.empty; haystack.popFront())
        {
            if (startsWith!pred(haystack, needle))
            {
                if (count == index)
                    return length - haystack.length;
                else
                    count++;
            }
        }
    }
    else
    {
        typeof(return) result;
        for (; !haystack.empty; ++result, haystack.popFront())
        {
            if (startsWith!pred(haystack, needle))
            {
                if (count == index)
                    return result;
                else
                    count++;
            }
        }
    }
    return -1;
}

void main()
{
    string s = "foo x foo y";

    auto idx1 = s.countUntil("foo", 0);
    auto idx2 = s.countUntil("foo", 1);
    auto idx3 = s.countUntil("foo", 2);

    assert(idx1 == 0);
    assert(idx2 == 6);
    assert(idx3 == -1);
}

Equivalent functionality could be added to std.string.indexOf to allow
case-unsensitive searches.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Mar 15 2012
next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7716


Andrej Mitrovic <andrej.mitrovich gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WONTFIX



14:51:28 PST ---
This niche feature belongs in user-space libraries, not Phobos.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Dec 10 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7716


bearophile_hugs eml.cc changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bearophile_hugs eml.cc



This belongs in Phobos. See Python string functions, that have optional
start-end:

 help(str.find)
Help on method_descriptor: find(...) S.find(sub [,start [,end]]) -> int Return the lowest index in S where substring sub is found, such that sub is contained within s[start:end]. Optional arguments start and end are interpreted as in slice notation. Return -1 on failure.
 help(str.index)
Help on method_descriptor: index(...) S.index(sub [,start [,end]]) -> int Like S.find() but raise ValueError when the substring is not found. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Dec 10 2012
prev sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7716


monarchdodra gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |monarchdodra gmail.com




 Sometimes I want to find the index of a substring, but not necessarily the
 index of the first occurence. E.g. in the string "foo x foo y" I want to get
 the index of the second foo.
This would conflict with this other request: http://d.puremagic.com/issues/show_bug.cgi?id=5507 which would allow counting until multiple needles, eg: //---- string s = "foo x bar y"; size_t index = s.countUntil("bar", 'x'); assert(index == 4); //---- So what you are asking for may have to be implemented as "countUntilN" or something: //----- string s = "foo x bar y"; size_t index = s.countUntilN(1, "bar", 'x'); assert(index == 6); //First found 'x', then found "bar". //-----
 Here's a simple implementation based on the
 existing countUntil:
Just want to point out that that implementation is based on an old and buggy implementation of countUntil. 2 problems: 1. No support for reference type ranges. 2. Incorrect support of unicode. The issue 1 is not that big of an issue, but is unacceptable of a standard library. Issue 2 depends on what you want. Do you want the *index* in the string, or the logical *range_position*: "日本語".countUntil(本); //1 popFront once "日本語".countUntil(本); //3 slice index: "日本語"[3 .. $] If you plan on using countUntil(N), then I'd recommend basing your implementation on https://github.com/D-Programming-Language/phobos/pull/951 Finally, keep in mind that the standard function "count" skips over overlaps: "ababab".count("abab"); //produces 1. Only 1 match. However, your countUntil implementation would find that second match: "ababab".countUntilN(1, "abab"); //Produces 2 This would be kind of weird: "Given a range which holds 1 instance of a needle, countUntilN is capable of finding the second occurrence of said needle :/" -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Dec 11 2012