digitalmars.D - nonallocating unicode string manipulations

Phobos lacks non-allocating string manipulation functions.

I made these in-place string => string functions (see the unittests):
----
auto takeInplace(T)(T a, size_t n) if (is(T == string));
auto slice(T)(T a, size_t u, size_t v) if (is(T == string));
unittest{
  auto a = "≈açç√ef";
  auto b = a.takeInplace(3);
  assert(b == "≈aç");
  assert(a.ptr == b.ptr);
  assert(a.takeInplace(10) == a);
}
unittest{
  import std.range;
  auto a = "≈açç√ef";
  auto b = a.slice(2, 6);
  assert(a.slice(2, 6) == "çç√e");
  assert(a.slice(2, 6).ptr == a.slice(2, 3).ptr);
  assert(a.slice(0, a.walkLength) is a);
  import std.exception;
  assertThrown(a.slice(2, 8));
  assertThrown(a.slice(2, 1));
}
----
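
For concreteness, here is a minimal sketch of how the two declarations above could be implemented. It is not necessarily the implementation I have; it assumes plain `string` parameters instead of the constrained templates, counts in code points, and relies only on `std.utf.stride` and `std.exception.enforce`:

```d
import std.exception : enforce;
import std.utf : stride;

/// Returns the first n code points of s as a slice of s (no allocation).
/// If s holds fewer than n code points, s is returned whole.
string takeInplace(string s, size_t n)
{
    size_t i = 0; // byte offset into s
    while (n > 0 && i < s.length)
    {
        i += stride(s, i); // number of code units in the code point at i
        --n;
    }
    return s[0 .. i];
}

/// Returns code points [u, v) of s as a slice of s (no allocation).
/// Throws if u > v or if v exceeds the number of code points in s.
string slice(string s, size_t u, size_t v)
{
    enforce(u <= v, "slice: lower bound exceeds upper bound");
    size_t i = 0; // byte offset into s
    size_t n = 0; // code points consumed so far
    for (; n < u; ++n)
    {
        enforce(i < s.length, "slice: lower bound out of range");
        i += stride(s, i);
    }
    immutable start = i;
    for (; n < v; ++n)
    {
        enforce(i < s.length, "slice: upper bound out of range");
        i += stride(s, i);
    }
    return s[start .. i];
}
```

Both functions walk the string once from the front, so they are O(n) in the number of code points skipped, but they only ever return slices of the input.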

A)
Would they belong in Phobos? If so, in which module?

B)
I'd also like to have an efficient range interface over strings whose
'front' property has reference semantics, i.e. yields a slice into the string:
----
auto a = "≈açç√ef";
foreach(i, ref ai; a.byElement){
  alias T = typeof(a); //eg: T==char[]
  alias E = ForEachType!T; //eg: E==char
  assert(is(typeof(ai) == T)); //ai is same type as a; it's a slice into it
  if(i == 0)
    assert(ai.ptr == a.ptr);
}
----
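
A rough sketch of what such a range could look like follows. `ByElement` and `byElement` are hypothetical names (nothing like this is in Phobos as far as this post is concerned), and the sketch assumes that one "element" means one code point, located with `std.utf.stride`:

```d
import std.traits : isSomeString;
import std.utf : stride;

/// Hypothetical input range: every element is a slice of the source string
/// covering exactly one code point, so nothing is decoded or copied.
struct ByElement(S) if (isSomeString!S)
{
    S rest; // the part of the source not yet consumed

    @property bool empty() { return rest.length == 0; }

    /// Slice of the source holding the next code point.
    @property S front() { return rest[0 .. stride(rest, 0)]; }

    void popFront() { rest = rest[stride(rest, 0) .. $]; }
}

/// Convenience constructor so that `a.byElement` works through UFCS.
auto byElement(S)(S s) if (isSomeString!S)
{
    return ByElement!S(s);
}
```

Note that a plain range like this does not supply the index `i` used in the loop above; that would need a manual counter, `std.range.enumerate`, or an `opApply` overload.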

This would make it easy, for example, to implement other non-allocating
in-place Unicode functions, such as toUpper:
----
auto toUpper(T)(T a){
  foreach(i, ref ai; a.byElement){
    //throws if some rare character's toUpper has a different size than its lowercase
    ai[] = ai.toUpper.to!(typeof(ai));
  }
}
----
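
The same idea can be sketched without byElement, working directly on a mutable `char[]`. The name `upperInPlaceStrict` is made up for illustration; the sketch assumes the simple one-to-one mapping of `std.uni.toUpper(dchar)` and treats any change in encoded length as an error:

```d
import std.exception : enforce;
import std.uni : toUpper;
import std.utf : decode, encode;

/// Hypothetical helper: upper-cases s in place, one code point at a time.
/// Throws if an upper-cased code point needs a different number of UTF-8
/// code units than the original, since it then cannot be written back
/// without moving the rest of the string.
void upperInPlaceStrict(char[] s)
{
    size_t i = 0;
    while (i < s.length)
    {
        size_t next = i;
        immutable dchar c = decode(s, next); // advances next past the code point
        char[4] buf;
        immutable len = encode(buf, toUpper(c)); // UTF-8 length of the upper-case form
        enforce(len == next - i,
            "upperInPlaceStrict: upper-case form has a different encoded length");
        s[i .. next] = buf[0 .. len]; // overwrite in place, same width
        i = next;
    }
}
```

One caveat: if it throws partway through, the characters before the offending one have already been upper-cased. A character such as 'ı' (U+0131), whose Unicode upper-case form 'I' is shorter in UTF-8, is the kind of input that would trigger the exception.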

Jul 16 2013