www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - Ropes (concatenation trees) for strings in D ?

reply "Carl Sturtivant" <sturtivant gmail.com> writes:
This is the idea I mean.
http://citeseer.ist.psu.edu/viewdoc/download?doi=10.1.1.14.9450&rep=rep1&type=pdf
Here's a C++ implementation supported I think by gcc.
http://www.sgi.com/tech/stl/Rope.html

Is there a D implementation of a rope out there somewhere?

How is unicode handled? Just by using dchar?
Aug 13 2014
next sibling parent "Messenger" <dont shoot.me> writes:
On Thursday, 14 August 2014 at 05:57:18 UTC, Carl Sturtivant 
wrote:
 This is the idea I mean.
 http://citeseer.ist.psu.edu/viewdoc/download?doi=10.1.1.14.9450&rep=rep1&type=pdf
 Here's a C++ implementation supported I think by gcc.
 http://www.sgi.com/tech/stl/Rope.html

 Is there a D implementation of a rope out there somewhere?

 How is unicode handled? Just by using dchar?
Hmm. Appender?

import std.array; import std.range; enum arrayLength = 1_000_000; void main() { auto writer = appender!string; auto xes = new char[](arrayLength); // dynamic on heap xes[] = 'x'; assert(xes[0] == 'x'); assert(xes[100] == 'x'); assert(xes[1000] == 'x'); writer.reserve((2 * arrayLength) + 3); // optional writer.put(xes); // or writer ~= xes; writer.put("abc"); writer.put(xes); // normal slicing op assert(writer.data[arrayLength..(arrayLength+3)] == "abc"); // .array allocates a new array but not sure how else to do it assert(writer.data.retro.array[arrayLength..(arrayLength+3)] == "cba"); }
Aug 14 2014
prev sibling next sibling parent reply =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 08/13/2014 10:57 PM, Carl Sturtivant wrote:
 This is the idea I mean.
 
http://citeseer.ist.psu.edu/viewdoc/download?doi=10.1.1.14.94 0&rep=rep1&type=pdf
 Here's a C++ implementation supported I think by gcc.
 http://www.sgi.com/tech/stl/Rope.html

 Is there a D implementation of a rope out there somewhere?
I am not aware of any but std.range.chain can be useful: import std.range; void main() { auto r = chain(chain("The qui", "ck brown"), " fox"); assert(r.equal("The quick brown fox")); }
 How is unicode handled? Just by using dchar?
dchar is not necessary when using the rope as a sequence but it looks like each section must be a valid UTF-8 sequence. The following example where a Unicode character (ü) is split between two sections fails: auto r = chain(chain("The q\xc3", "\xbcick brown"), " fox"); // std.utf.UTFException ./phobos/std/utf.d(1109): Attempted // to decode past the end of a string (at index 1) assert(r.equal("The qüick brown fox")); Of course, it works when the character is complete: auto r = chain(chain("The qü", "ick brown"), " fox"); assert(r.equal("The qüick brown fox")); But yes, for O(1) character access dchar is needed. (As you know, it can't be exactly O(1) when there are a large amount of sections and one needs to be memory-efficient for character index lookup.) Ali
Aug 14 2014
next sibling parent Timothee Cour via Digitalmars-d-learn <digitalmars-d-learn puremagic.com> writes:
On Thu, Aug 14, 2014 at 3:59 AM, Ali =C3=87ehreli <
digitalmars-d-learn puremagic.com> wrote:

 On 08/13/2014 10:57 PM, Carl Sturtivant wrote:
 This is the idea I mean.
 http://citeseer.ist.psu.edu/viewdoc/download?doi=3D10.1.1.
14.9450&rep=3Drep1&type=3Dpdf
 Here's a C++ implementation supported I think by gcc.
 http://www.sgi.com/tech/stl/Rope.html

 Is there a D implementation of a rope out there somewhere?
I am not aware of any but std.range.chain can be useful:
chain is very limited compared to what ropes provide. Eg: efficient inserting of a string inside a rope, space efficient maintaining of entire edit history, rebalancing, etc. Ropes are very well suited for various applications (text editors, file system indexing, large scale text processing etc). std.rope would be a very welcome addition for real world applications.
 import std.range;

 void main()
 {
     auto r =3D chain(chain("The qui", "ck brown"), " fox");
     assert(r.equal("The quick brown fox"));

 }

 How is unicode handled? Just by using dchar?
dchar is not necessary when using the rope as a sequence but it looks lik=
e
 each section must be a valid UTF-8 sequence. The following example where =
a
 Unicode character (=C3=BC) is split between two sections fails:

     auto r =3D chain(chain("The q\xc3", "\xbcick brown"), " fox");
     // std.utf.UTFException ./phobos/std/utf.d(1109): Attempted
     // to decode past the end of a string (at index 1)
     assert(r.equal("The q=C3=BCick brown fox"));

 Of course, it works when the character is complete:

     auto r =3D chain(chain("The q=C3=BC", "ick brown"), " fox");
     assert(r.equal("The q=C3=BCick brown fox"));

 But yes, for O(1) character access dchar is needed. (As you know, it can'=
t
 be exactly O(1) when there are a large amount of sections and one needs t=
o
 be memory-efficient for character index lookup.)

 Ali
Aug 15 2014
prev sibling parent reply ketmar via Digitalmars-d-learn <digitalmars-d-learn puremagic.com> writes:
On Fri, 15 Aug 2014 19:04:10 -0700
Timothee Cour via Digitalmars-d-learn
<digitalmars-d-learn puremagic.com> wrote:

sounds like my C library based on this article:
http://e98cuenc.free.fr/wordprocessor/piecetable.html

i'm slowly converting my C code to D (nothing fancy yet, still C-style).

it's not a 'real' rope -- it's designed for text editing tasks, and it
allocates like crazy now, but it's pretty usable to writing rich text
editors and renderers in various GUI toolkits.

don't know when i'll finish first working version though, it's all
little tedious and i have alot other things to do. but i'll announce it
when it will be done. ;-)
Aug 15 2014
parent reply "MrSmith" <mrsmith33 yandex.ru> writes:
On Saturday, 16 August 2014 at 02:26:29 UTC, ketmar via 
Digitalmars-d-learn wrote:
 On Fri, 15 Aug 2014 19:04:10 -0700
 Timothee Cour via Digitalmars-d-learn
 <digitalmars-d-learn puremagic.com> wrote:

 sounds like my C library based on this article:
 http://e98cuenc.free.fr/wordprocessor/piecetable.html

 i'm slowly converting my C code to D (nothing fancy yet, still 
 C-style).

 it's not a 'real' rope -- it's designed for text editing tasks, 
 and it
 allocates like crazy now, but it's pretty usable to writing 
 rich text
 editors and renderers in various GUI toolkits.

 don't know when i'll finish first working version though, it's 
 all
 little tedious and i have alot other things to do. but i'll 
 announce it
 when it will be done. ;-)
I've done some progress on making piece table some time ago, but have no time atm. Have a look https://github.com/MrSmith33/textedit-d It supports inserting, removing, undo, redo and slicing. Provides forward range interface. Can you also share your progress?
Aug 16 2014
next sibling parent ketmar via Digitalmars-d-learn <digitalmars-d-learn puremagic.com> writes:
On Sat, 16 Aug 2014 10:25:01 +0000
MrSmith via Digitalmars-d-learn <digitalmars-d-learn puremagic.com>
wrote:

 Can you also share your progress?
sure. the project has no public repo yet, but you can take a look at it's current state here: http://ketmar.no-ip.org/etx.txz no undo/redo support for now, and it accepts only dchars, but you can find working (i hope, at least unittests works ;-) rb-tree and piecetable there. no slicing yet, and range support is nearly non-existing, but piecetable supports extensible text styles! ;-) i'm not really happy with design though. it mindlessly allocates alot of small classes (it should use pools instead, i believe) and high-level text rendering interface is not ported yet. i'm planning to announce it in d.announces when it will be ready to 'go public'. not sure yet if i'll leave it 'as is' or will try to rewrite in in something like 'D style' before announcing. ah, and it needs at least minimal documentation too. ;-)
Aug 16 2014
prev sibling parent ketmar via Digitalmars-d-learn <digitalmars-d-learn puremagic.com> writes:
On Sat, 16 Aug 2014 10:25:01 +0000
MrSmith via Digitalmars-d-learn <digitalmars-d-learn puremagic.com>
wrote:

 Can you also share your progress?
here is public repo: http://repo.or.cz/w/etx.d.git remember, this is not even alpha-quality code (but it seems to be stable enough to build something upon it). API sux, no docs and so on. enjoy. GPLv3.
Aug 16 2014
prev sibling next sibling parent "Kagamin" <spam here.lot> writes:
Search gave result "enough rope to hang yourself".
Aug 15 2014
prev sibling parent Justin Whear <justin economicmodeling.com> writes:
On Thu, 14 Aug 2014 05:57:17 +0000, Carl Sturtivant wrote:

 This is the idea I mean.
 http://citeseer.ist.psu.edu/viewdoc/download?
doi=10.1.1.14.9450&rep=rep1&type=pdf
 Here's a C++ implementation supported I think by gcc.
 http://www.sgi.com/tech/stl/Rope.html
 
 Is there a D implementation of a rope out there somewhere?
 
 How is unicode handled? Just by using dchar?
I was working on a D Rope implementation years ago (now lost to time), when I came across this article and ditched it: http://scienceblogs.com/goodmath/2009/02/18/gap-buffers-or-why-bother- with-1/ The short version is that while Ropes are interesting theoretically, they have a bad implementation complexity compared to much simpler structures which give more than adequate results for most tasks.
Aug 18 2014