
D - Strings in D

reply Matthias Becker <Matthias_member pathlink.com> writes:
On www.digitalmars.com/d/cppstrings.html you can read the following:

(quote)
C++ strings, as implemented by STLport, are by value and are 0-terminated. [The
latter is an implementation choice, but STLport seems to be the most popular
implementation.] This, coupled with no garbage collection, has some
consequences. First of all, any string created must make its own copy of the
string data. The 'owner' of the string data must be kept track of, because when
the owner is deleted all references become invalid. If one tries to avoid the
dangling reference problem by treating strings as value types, there will be a
lot of overhead of memory allocation, data copying, and memory deallocation.
Next, the 0-termination implies that strings cannot refer to other strings.
String data in the data segment, stack, etc., cannot be referred to. 
D strings are reference types, and the memory is garbage collected. This means
that only references need to be copied, not the string data. D strings can refer
to data in the static data segment, data on the stack, data inside other
strings, objects, file buffers, etc. There's no need to keep track of the
'owner' of the string data. 

The obvious question is if multiple D strings refer to the same string data,
what happens if the data is modified? All the references will now point to the
modified data. This can have its own consequences, which can be avoided if the
copy-on-write convention is followed. All copy-on-write is is that if a string
is written to, an actual copy of the string data is made first. 

The result of D strings being reference only and garbage collected is that code
that does a lot of string manipulating, such as an lzw compressor, can be a lot
more efficient in terms of both memory consumption and speed. 
(/quote)


Sorry, but this text is a bit stupid. It seems like you assume C++-coders to be
stupid. C++ knows references. If you pass a string to a function you pass it by
reference of course.

void foo (const std::string & the_string)
{ ... }
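
To make the point concrete, here is a minimal sketch (the function names are made up for illustration, they are not from any library) showing that a const-reference parameter refers to the caller's own string object, so no character data is copied:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical example function: takes the string by const reference,
// so the caller's character data is read in place and never copied.
std::size_t count_spaces(const std::string& the_string) {
    std::size_t n = 0;
    for (char c : the_string) {
        if (c == ' ') ++n;
    }
    return n;
}

// Returns true when `param` is the very same object as `original`,
// i.e. passing by reference did not create a new string.
bool is_same_object(const std::string& param, const std::string& original) {
    return &param == &original;
}

// Small demonstration; call it from your own main().
void demo_pass_by_reference() {
    std::string s = "strings in D";
    assert(count_spaces(s) == 2);
    assert(is_same_object(s, s));
}
```

Had the parameter been declared as a plain `std::string`, the function would receive its own copy instead.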

The problem with garbage collection is solved by smart pointers. The most common
ones are boost::shared_ptr. And I know no good C++ coder who doesn't use boost
(www.boost.org), so please compare D with C++ + boost, because everything else
is not pragmatic.
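
A rough sketch of what I mean by shared ownership. To keep it self-contained it uses std::shared_ptr as a stand-in for boost::shared_ptr (that substitution is an assumption of this example), and the names SharedString, make_shared_string and owners_after_copy are invented for illustration:

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical alias: a reference-counted handle to immutable string data.
// std::shared_ptr stands in for boost::shared_ptr here.
using SharedString = std::shared_ptr<const std::string>;

// Allocates the string once; every copy of the handle shares it.
SharedString make_shared_string(const char* text) {
    return std::make_shared<std::string>(text);
}

// Copying the handle copies a pointer and bumps the count,
// it does not copy the characters.
long owners_after_copy(const SharedString& s) {
    SharedString alias = s;
    assert(alias.get() == s.get());  // both handles point at the same data
    return s.use_count();            // counts `alias` while it is alive
}
```

The data is freed when the last handle goes away, so nobody has to keep track of a single 'owner'.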

And about your copy on write "optimization": read the following (it's only the
third part; you can find the other articles on the same site):
http://www.gotw.ca/gotw/045.htm
Oct 29 2003
next sibling parent "Lars Ivar Igesund" <larsivi stud.ntnu.no> writes:
"Matthias Becker" <Matthias_member pathlink.com> wrote in message
news:bnobm4$542>

 The problem with garbage collection is solved by smart pointers. The most common
 ones are boost::shared_ptr. And I know no good C++ coder who doesn't use boost
 (www.boost.org), so please compare D with C++ + boost, because everything else
 is not pragmatic.

I know several, and they are among the best.

Lars Ivar Igesund
Oct 29 2003
prev sibling next sibling parent "Matthew Wilson" <matthew-hat -stlsoft-dot.-org> writes:
 And I know no good C++-coder that doesn't use boost

Are you kidding? Are we in a world where there's only one way to do things? Isn't that Java?

--
Matthew Wilson
STLSoft moderator and C++ monomaniac (http://www.stlsoft.org)
Contributing editor, C/C++ Users Journal (www.synesis.com.au/articles.html#columns)

"But if less is more, think how much more more will be!" -- Dr Frazier Crane
Oct 29 2003
prev sibling parent Ilya Minkov <minkov cs.tum.edu> writes:
Matthias Becker wrote:

 Sorry, but this text is a bit stupid. It seems like you assume C++-coders to be
 stupid. C++ knows references. If you pass a string to a function you pass it by
 reference of course.

True, since it is usually evident whether you modify strings or not. The cases where it is not certain are really not worth the worry. It's almost the same in D, except that there you can decide dynamically whether you want to copy a string or not.
 void foo (const std::string & the_string)
 { ... }
 
 The problem with garbage collection is solved by smart pointers. The most common
 ones are boost::shared_ptr. And I know no good C++ coder who doesn't use boost
 (www.boost.org), so please compare D with C++ + boost, because everything else
 is not pragmatic.

Some C++ programmers rely more on ref-counted smart pointers, others on global garbage collection, depending on prior experience and the project at hand. The current D GC is no better than Boehm's C++ GC, but it has the potential to become up to 2 orders of magnitude faster and thus less obtrusive.
 And about your copy on write "optimization": read the following (it's only the
 third part; you can find the other articles on the same site):
 http://www.gotw.ca/gotw/045.htm

This article is about "smart" string implementations, which *force* COW on strings. They check the count on each operation, hence count access must be atomic. However, in D strings are stupid garbage-collected slices. They don't really use COW. It is only a convention that all libraries return a copy of the string if they modify it, instead of changing the existing one. If the old one is no longer used, it will eventually be collected by the GC. No count is ever maintained or checked, and thus there is no interference.

It is equivalent to smart use of a copying string implementation in C++, and only marginally slower because of the use of GC, which is unavoidable because D doesn't support scope-based guaranteed destruction like C++ does.

-eye
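
The convention can be sketched in C++ roughly like this. Slice, slice_of and to_upper_copy are hypothetical names for illustration only, not D's or any library's API, and unlike in D there is no GC here, so the buffer a slice points into must outlive the slice:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical "stupid" slice: just a pointer and a length, no count.
// Assumption: the underlying buffer outlives the slice (no GC in C++).
struct Slice {
    const char* ptr;
    std::size_t len;
};

// Refers to the characters [begin, end) of an existing string without copying.
Slice slice_of(const std::string& s, std::size_t begin, std::size_t end) {
    assert(begin <= end && end <= s.size());
    return Slice{s.data() + begin, end - begin};
}

// Follows the convention: never write through the slice,
// make a fresh copy and modify only that copy.
std::string to_upper_copy(Slice in) {
    std::string out(in.ptr, in.len);
    for (char& c : out) {
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    return out;
}
```

Slicing "world" out of a buffer holding "hello world" and passing it to to_upper_copy yields a new string "WORLD"; the buffer and every other slice into it stay untouched, and no reference count is consulted anywhere.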
Oct 29 2003