
digitalmars.D.learn - Detecting at compile time if a string is zero terminated

reply Johannes Pfau <spam example.com> writes:

Hi,
I'm currently writing a small wrapper layer for gettext. My current
getText template looks like this:

----------------------------------------------------
string getText()(string msgid, string domain = null, Category cat =
Category.Messages) {
    auto cmsg = toStringz(msgid);
    auto nmsg = dcgettext(domain ? toStringz(domain) : null,
              cmsg, cat);
    if(cmsg == nmsg)
        return msgid;
    else
    {
        string nstr = cast(string)nmsg[0 .. strlen(nmsg)];
        return nstr;
    }
}

string getText(string msgid, string domain = null, Category cat =
Category.Messages)() {
    auto nmsg = dcgettext(domain ? domain.ptr : null,
              msgid.ptr, cat);
    if(msgid.ptr == nmsg)
        return msgid;
    else
    {
        string nstr = cast(string)nmsg[0 .. strlen(nmsg)];
        return nstr;
    }
}
----------------------------------------------------

As string literals in D are zero terminated, there's no need
for the toStringz overhead. The overload taking compile time
parameters takes advantage of that. The code works and can be used like
this:
----------------------------------------------------
    writeln(getText!"Hello World!"); //no toStringz
    writeln(getText("Hello World!")); //toStringz
----------------------------------------------------
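
To illustrate the point about literals, here is a minimal sketch (the function names are made up for this example and are not part of the wrapper). It shows that a literal can be handed to C's strlen directly, while a slice of that same literal is an ordinary string whose end is not followed by a zero, which is why a run-time string parameter cannot simply be assumed to be terminated:

```d
import core.stdc.string : strlen;

/// Length that C's strlen reports for a D string literal.
size_t literalLengthSeenByC()
{
    // A string literal converts implicitly to const(char)* because the
    // compiler stores a '\0' right after it (not counted in .length).
    const(char)* p = "Hello World!";
    return strlen(p);
}

/// Checks whether the byte just past a slice of a literal is '\0'.
bool sliceEndsWithZero()
{
    string s = "Hello World!";
    string hello = s[0 .. 5];   // "Hello", followed by ' ' inside the literal
    // Reading one past the slice is safe here only because we know the
    // slice sits inside the longer literal.
    return *(hello.ptr + hello.length) == '\0';
}
```

Both getText overloads above take a plain string, so only the compile-time overload knows that its argument came from a literal.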

But if somehow possible I'd like to merge the templates so that there
is only one way to call getText and the fastest way is chosen
automatically.

Does anyone know how to do that?
-- 
Johannes Pfau
Jan 19 2011
next sibling parent reply Jesse Phillips <jessekphillips+D gmail.com> writes:
First off, no. Second, is there really going to be a performance gain from this?
I wouldn't expect static strings to be converted very often. And last, I will
copy and paste a comment from the source code:

    /+ Unfortunately, this isn't reliable.
       We could make this work if string literals are put
       in read-only memory and we test if s[] is pointing into
       that.

       /* Peek past end of s[], if it's 0, no conversion necessary.
        * Note that the compiler will put a 0 past the end of static
        * strings, and the storage allocator will put a 0 past the end
        * of newly allocated char[]'s.
        */
       char* p = &s[0] + s.length;
       if (*p == 0)
           return s;
     +/
Jan 19 2011
parent Jesse Phillips <jessekphillips+D gmail.com> writes:
I would bet that you'd end up spending more time translating the string than
copying it.

Didn't think to look at what type the function accepted. I figured that any
such optimization would exist inside of toStringz if it was possible.
Jan 20 2011
prev sibling parent Johannes Pfau <spam example.com> writes:

Jesse Phillips wrote:
First off, no. Second, is there really going to be a performance gain
from this? I wouldn't expect static strings to be converted very
often. And last, I will copy and paste a comment from the source code:

Thanks for your reply. In case you don't know: gettext is used to
translate strings. You call gettext("english string") and it returns the
translated string. Gettext might be the only corner case, but the strings
gettext returns are usually not cached and big projects could translate
many strings, so I thought it could be an issue. But maybe I'm
overestimating that.

I had a look at the source code of toStringz and found the comment you
mentioned. The comment is for toStringz(const(char)[] s).
toStringz(string s) is even more interesting in this case, as it does do
that optimization in most cases. I think that's good enough ;-)

-- 
Johannes Pfau
Jan 20 2011
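
For readers following the thread, the "peek past the end" idea discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Phobos source: the name toStringzPeek is hypothetical, and the 4-byte alignment test is one conservative way to decide whether the byte after the string is safe to read. If that byte might start a new memory block, the sketch falls back to std.string.toStringz, which copies when necessary.

```d
import core.stdc.string : strlen;
import std.string : toStringz;

/// Hypothetical helper: returns a zero-terminated pointer for s,
/// avoiding a copy when a terminator is already in place.
immutable(char)* toStringzPeek(string s)
{
    // An empty string has no useful pointer to peek behind; use the
    // empty literal, which is zero-terminated.
    if (s.length == 0)
        return "".ptr;

    immutable(char)* p = s.ptr + s.length;

    // Assumption of this sketch: if p is not 4-byte aligned, it lies in
    // the same aligned word as the last character of s, so reading it
    // cannot touch unmapped memory.
    if ((cast(size_t) p & 3) != 0 && *p == 0)
        return s.ptr;            // already terminated, no copy needed

    // Otherwise let the library produce a terminated string.
    return toStringz(s);
}
```

Whichever branch is taken, the caller gets a pointer that C functions can use. The check can still return the original pointer for a string that merely happens to be followed by a zero byte, which is the unreliability the quoted source comment warns about for mutable char[] data.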