
D - An interesting read: Scalable Computer Programming Languages

reply "Andrew Edwards" <edwardsac spamfreeusa.com> writes:
http://www.cs.caltech.edu/~mvanier/hacking/rants/scalable_computer_programming_languages.html
Jul 26 2003
next sibling parent reply Helmut Leitner <helmut.leitner chello.at> writes:
Andrew Edwards wrote:
 
 http://www.cs.caltech.edu/~mvanier/hacking/rants/scalable_computer_programming_languages.html

It *is* interesting, but basically the view of a teacher. -- Helmut Leitner leitner hls.via.at Graz, Austria www.hls-software.com
Jul 26 2003
parent reply "Achilleas Margaritis" <axilmar b-online.gr> writes:
"Helmut Leitner" <helmut.leitner chello.at> wrote in message
news:3F226DD2.8CDD04DA chello.at...
 Andrew Edwards wrote:

  http://www.cs.caltech.edu/~mvanier/hacking/rants/scalable_computer_programming_languages.html

 It *is* interesting, but basically the view of a teacher.

 --
 Helmut Leitner    leitner hls.via.at
 Graz, Austria   www.hls-software.com

I am sorry to say this, but the person that wrote this article is largely ignorant. Here is a piece of it:

"There is a cost to GC, both in time and space efficiency."

A GC that uses a thread is surely not a good solution, since GC can be done without threads.

"Well-designed garbage collectors (especially generational GC) can be extremely efficient"

Java sucks speedwise. Show me a real-life language with good garbage collection that does not hamper performance.

"(more efficient, for instance, than naive approaches such as reference counting)."

But GC uses reference counting. If it did not, how would the GC mechanism know whether something is referenced or not? Furthermore, I don't see why reference counting is bad. Even with cyclic references, objects can be manually deleted (and thus break the cycle).

"However, in order to do this they tend to have significantly greater space usages than programs without GC (I've heard estimates on the order of 50% more total space used)."

I don't know about them, but I have done a complete C++ framework with reference counting and without any memory problems. Each object gets one more int. How is that memory consuming?

"On the other hand, a program that leaks memory has the greatest space usage of all. I've wasted way too much of my life hunting down memory leaks in large C programs, and I have no interest in continuing to do so".

This is because he did not use proper software engineering techniques. And he did not use C++ (or D ;-)).

"However, the reverse is also often true; many program optimizations (normally performed automatically by the compiler) are rendered much more difficult or impossible in code that uses pointers. In other words, languages that enable micro-optimizations often make macro-optimizations impossible."

I've heard this a lot of times. But no one cares to put up an example. Until I see a real-life example of pointers doing badly speedwise, I'll believe otherwise.

"The author of the Eiffel language, Bertrand Meyer, has said that (I'm paraphrasing) "you can have pointer arithmetic, or you can have correct programs, but you can't have both". I agree with him. I think that direct memory access through pointers is the single biggest barrier to programming language scalability."

But there are languages that are extremely powerful, using pointers, but they have no memory leaks. Take ADA, for example. A fine example of a programming language. It uses pointers, but constrained.

"The usual reason why many programmers don't like static type checking is that type declarations are verbose and detract from the purity of the algorithm."

Yes, but if you come back to the code a year after, which one makes more sense?

x = 0;

or

double x = 0;

I like the C type system. Automatic type inference may save a few strokes, but it makes compiling slower and it prohibits me from quickly catching what a piece of code does (by eye scanning). Imagine being in the middle of a two-page algorithm with no types!!! hell!!!

"The relative virtues of static versus dynamic type checking are one of the great holy wars amongst computer language researchers and users".

Well, in real-life (and not in a research lab), static checking wins every time.

"This feature is so useful that almost all new languages incorporate it. Interestingly, exceptions also work much better in the presence of garbage collection; avoiding memory leaks in a language like C++ that has exception handling but has no GC is quite tricky (see Scott Meyers' books Effective C++ and More Effective C++ for an extensive description of this issue). This is yet another argument for garbage collection (as if we needed one)."

Nope, it's not tricky. It's much more deterministic. And if your objects are coded the right way, there is no memory leak, since stack unwinding will release resources properly. Speaking of the stack, languages with no object allocation on the stack Suck (*cough* Java *cough*).

Oh come on now. Please. A program is as good as its engineers are. NASA writes programs in C (or even in assembler!!!), which he calls a "primitive language", but the bug percentage is less than 1%!!! On the other hand, I've seen programmers (newbies) make Java crawl, simply because they must allocate everything, even the smallest integer objects, on the heap!!!

(and before somebody jumps at me, let me tell you that I am a software engineer for a small company that does defense subcontracting for THALES (ex Thomson) and TRS (THALES RAYTHEON), and I have quite a big experience in C, C++, Java and ADA. Java being better than C++ is a myth. Its only advantage is the "write once, run everywhere" and the huge array of classes already made. But this has nothing to do with the language itself).
Jul 26 2003
next sibling parent reply Ilya Minkov <midiclub 8ung.at> writes:
Achilleas Margaritis wrote:
 I am sorry to say this, but the person that wrote this article is largely
 ignorant. Here is a piece of it:

Like most people. :)
 "There is a cost to GC, both in time and space efficiency."
 
 A GC that uses a thread is surely not a good solution, since GC can be done
 without threads.

If a GC is to cope with a threaded environment anyway, it had better be a thread. That way it can pause only one or two threads and leave the rest running.
 "Well-designed garbage collectors (especially generational GC) can be
 extremely efficient"
 
 Java sucks speedwise. Show me a real-life language with good garbage
 collection that does not hamper performance.

Examples:
* C with Boehm GC
* OCaml

A very minor slowdown, not comparable with that of Java: about a 10% slowdown with the all-scan option, and almost no slowdown if you hand-tune the allocation type for each allocation, such as "no pointers", "don't delete", "pointers in the first x bytes only", and so on.
 " (more efficient, for instance, than naive approaches such as reference
 counting). "
 
 But GC uses reference counting. If it did not, how the GC mechanism will
 know if something is referenced or not ? 

No, it doesn't. A GC tracks the allocation of all objects, and whenever the time comes it scans the stack for pointers to allocated objects. These are in turn scanned for pointers. Each object the GC comes across in this process is marked as "reachable". Afterwards, all objects which have not been marked can be deleted.
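To make the mark-and-sweep idea above concrete, here is a minimal C++ sketch. It is only an illustration under stated assumptions: `Obj`, `Heap` and `demo_collect` are hypothetical names, and the roots are passed in explicitly instead of being found by scanning the stack as a real collector would.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical heap object: outgoing pointers plus a mark bit.
struct Obj {
    std::vector<Obj*> refs;
    bool marked = false;
};

// Toy heap: it tracks every allocation so the sweep phase can visit them all.
struct Heap {
    std::vector<Obj*> objects;

    Obj* alloc() {
        Obj* o = new Obj();
        objects.push_back(o);
        return o;
    }

    // Mark phase: flag everything reachable from `o` (iterative, no recursion).
    static void mark(Obj* o) {
        std::vector<Obj*> work{o};
        while (!work.empty()) {
            Obj* cur = work.back();
            work.pop_back();
            if (cur == nullptr || cur->marked) continue;
            cur->marked = true;
            for (Obj* r : cur->refs) work.push_back(r);
        }
    }

    // Full collection: mark from the roots, then free whatever stayed unmarked.
    // Returns the number of objects freed.
    std::size_t collect(const std::vector<Obj*>& roots) {
        for (Obj* r : roots) mark(r);
        std::size_t freed = 0;
        std::vector<Obj*> live;
        for (Obj* o : objects) {
            if (o->marked) {
                o->marked = false;  // reset the flag for the next collection
                live.push_back(o);
            } else {
                delete o;
                ++freed;
            }
        }
        objects.swap(live);
        return freed;
    }

    ~Heap() { for (Obj* o : objects) delete o; }
};

// Builds root -> a plus an unreachable cycle b <-> c, then collects.
std::size_t demo_collect() {
    Heap heap;
    Obj* root = heap.alloc();
    Obj* a = heap.alloc();
    Obj* b = heap.alloc();
    Obj* c = heap.alloc();
    root->refs.push_back(a);
    b->refs.push_back(c);
    c->refs.push_back(b);               // cycle with no path from the root
    std::size_t freed = heap.collect({root});
    assert(heap.objects.size() == 2);   // root and a survive
    return freed;                       // b and c are reclaimed
}
```

Note that no per-object count is kept anywhere: the unreachable cycle is reclaimed simply because nothing marked it.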
 Furthermore, I don't see why
 reference counting is bad. Even with cyclic references, objects can be
 manually deleted (and thus break the cycle).

Manual refcounting is fast but error-prone. Automated refcounting incurs a cost for each assignment and for each parameter passed in a function call. Even the "obvious" cases are not optimised out. Thus, it turns out that "total" GC has significantly less overhead than "total" reference counting.
 "However, in order to do this they tend to have significantly greater space
 usages than programs without GC (I've heard estimates on the order of 50%
 more total space used)."
 
 I don't know about them, but I have done a complete C++ framework with
 reference counting and without any memory problems. Each object gets one
 more int. How is that memory consuming ?

I haven't read the article, but i believe this means the programs without refcounting. You usually allocate your data in a pool or some other structure. That means, that if aliasing is possible, you cannot reliably say when to delete a single value. You can only delete a pool after the job is done. Another strategy is to forbid aliasing, and thus waste some space. It's a decision to make - depending on what is cheaper. Can't say that a memory overhead is always evil. A GC by itself may consume large amounts of memory.
 "On the other hand, a program that leaks memory has the greatest space usage
 of all. I've wasted way too much of my life hunting down memory leaks in
 large C programs, and I have no interest in continuing to do so".
 
 This is because he did not use proper software engineering techniques. And
 he did not use C++ (or D ;-)).

D does not use reference counting. :) And if someone pursues a small memory footprint at any cost, that's what he gets: either C++-style refcounting, which tends to be slow if overused, or memory leaks...
 "However, the reverse is also often true; many program optimizations
 (normally performed automatically by the compiler) are rendered much more
 difficult or impossible in code that uses pointers. In other words,
 languages that enable micro-optimizations often make macro-optimizations
 impossible. "
 
 I've heard this a lot of times. But no one cares to put up an example. Until
 I see a real-life example of pointers doing bad speedwise, I'll believe
 otherwise.

That's a reason why there are e.g. unaliased objects of 2 kinds in Sather. In Sather, e.g. INT is a library object, however, because it's immutable it works just as fast as C int. And in fact resolves one-to-one to it, with stack storage, copying, and all. You can create your own types which behave like that easily.
 "The author of the Eiffel language, Bertrand Meyer, has said that (I'm
 paraphrasing) "you can have pointer arithmetic, or you can have correct
 programs, but you can't have both". I agree with him. I think that direct
 memory access through pointers is the single biggest barrier to programming
 language scalability. "
 
 But there are languages that are extremely powerful, using pointers, but
 they have no memory leaks. Take ADA, for example. A fine example of a
 programming language. It uses pointers, but constrained.

How come it doesn't have memory leaks? Sorry, i don't know ADA. Either it uses a kind of automatic memory management, or it *does* have memory leaks. What kind of constraint is there? I have some Delphi experience, and Pascal/Delphi is quite prone to leaks, even if they are not so frequent for some reason, be it possibilities for better program organisation or similar things.
 "The usual reason why many programmers don't like static type checking is
 that type declarations are verbose and detract from the purity of the
 algorithm. "
 
 Yes, but if you come back to the code a year after, which one make more
 sense ?
 
 x = 0;
 
 or
 
 double x = 0;
 
 I like the C type system. Automatic type inference may save a few strokes,
 but it makes compiling slower and it prohibits me from quickly catching what
 a piece of code does (by eye scanning). Imagine being in the middle of a
 two-page algorithm with no types!!! hell!!!

You are right. The OCaml manual says something like "no need to state the obvious over and over again", while things are not that obvious. One has to keep track of types anyway, and writing them down just helps. If one doesn't want to encode the type into names or comments, one has to rely on some static system. Sather has a system not too dissimilar to other languages, but it has 2 forms of short notation: you can leave out the name of the constructor if you construct an object of the same type as the variable it is placed in (and not a subtype), which saves typing the same type twice on the same line; and a "::=" type-inference assignment operator, for where the type is obvious to the compiler anyway. There is no real type inference, and this simple plug helps separate long expressions into a few more readable parts. The Manual discourages overuse of the latter practice.
 "The relative virtues of static versus dynamic type checking are one of the
 great holy wars amongst computer language researchers and users".
 
 Well, in real-life (and not in a research lab), static checking wins every
 time.

I don't think dynamic typechecking has any chance in a research lab. And on the contrary: there is such a vast number of users writing in Perl, with its guess-your-type system which couldn't be much worse, that it gets really scary.
 "This feature is so useful that almost all new languages incorporate it.
 Interestingly, exceptions also work much better in the presence of garbage
 collection; avoiding memory leaks in a language like C++ that has exception
 handling but has no GC is quite tricky (see Scott Meyers' books Effective
 C++ and More Effective C++ for an extensive description of this issue). This
 is yet another argument for garbage collection (as if we needed one). "
 
 Nope, it's not tricky. It's much more deterministic. And if your objects are
 coded the right way, there is no memory leak, since stack unwinding will
 release resources properly.

Have you read EC++ and MEC++? Well, it does require some effort and some thought, while GC requires none.
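The two positions here (leaks around exceptions are "tricky" versus "stack unwinding will release resources properly") can be put side by side in a small C++ sketch. This is only an illustration: `Resource`, `risky_raw`, `risky_raii` and `leaked_after` are hypothetical names, and `std::unique_ptr` is assumed as the owning wrapper, which is a later standard-library facility than what was available when this thread was written.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical resource that counts how many instances are alive.
struct Resource {
    static int live;
    Resource() { ++live; }
    ~Resource() { --live; }
};
int Resource::live = 0;

// Leaks if the exception is thrown: the `delete` is never reached.
void risky_raw(bool fail) {
    Resource* r = new Resource();
    if (fail) throw std::runtime_error("failed");
    delete r;
}

// The unique_ptr destructor runs during stack unwinding, so nothing leaks.
void risky_raii(bool fail) {
    std::unique_ptr<Resource> r(new Resource());
    if (fail) throw std::runtime_error("failed");
}

// Returns how many Resource objects are still alive after `fn` throws.
int leaked_after(void (*fn)(bool)) {
    assert(fn != nullptr);
    int before = Resource::live;
    try {
        fn(true);
    } catch (const std::runtime_error&) {
        // swallowed on purpose: only the leak count matters here
    }
    return Resource::live - before;
}
```

Calling `leaked_after(risky_raw)` reports one leaked object, while `leaked_after(risky_raii)` reports none. Both claims hold in a sense: the cleanup is deterministic, but only if every owning pointer is wrapped.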
 (and before somebody jumps at me, let me tell you that I am a software
 engineer for a small company that does defense subcontracting for THALES (ex
 Thomson) and TRS (THALES RAYTHEON), and I have quite a big experience in C,
 C++, Java and ADA. Java being better than C++ is a myth. Its only advantage
 is the "write once, run everywhere" and the huge array of classes already
 made. But this has nothing to do with the language itself).

C++ being a [paste-anything-here] language is a myth as well. But hey, there have been so many libraries written for it, we can't flush them all down the drain, can we? :) There's no doubt that Java is way more primitive and that the guys have actually missed a chance to make it somewhat better than C++ ... Like, why are there no properties? Quite a surprise for a language which is not performance-centric. And yet many, many things. [jump!] :> -i.
Jul 26 2003
parent reply "Achilleas Margaritis" <axilmar b-online.gr> writes:
"Ilya Minkov" <midiclub 8ung.at> wrote in message
news:bfunit$g7v$1 digitaldaemon.com...
 Achilleas Margaritis wrote:
 I am sorry to say this, but the person that wrote this article is largely
 ignorant. Here is a piece of it:

Like most people. :)
 "There is a cost to GC, both in time and space efficiency."

 A GC that uses a thread is surely not a good solution, since GC can be done
 without threads.

If a GC is to cope with a threaded environment anyway, it had better be a thread. That way it can pause only one or two threads and leave the rest running.

But if it is a thread, it has to provide synchronization for every pointer that the GC can access, which in turn means a mutex lock for each pointer, which in turn means entering the kernel many times. Now, a program can have thousands of pointers lying around. I am asking you, which is faster: to enter the kernel 1000 times to protect each pointer, or to pause a little and clean up the memory? I know what I want. Furthermore, a 2nd thread makes the implementation terribly complicated. When Java's GC kicks in, although in theory running in parallel, the program freezes. GC is a mistake, in my opinion. I've never had memory leaks with C++, since I always 'delete' what I 'new'.
 "Well-designed garbage collectors (especially generational GC) can be
 extremely efficient"

 Java sucks speedwise. Show me a real-life language with good garbage
 collection that does not hamper performance.

Examples:
* C with Boehm GC
* OCaml

A very minor slowdown, not comparable with that of Java: about a 10% slowdown with the all-scan option, and almost no slowdown if you hand-tune the allocation type for each allocation, such as "no pointers", "don't delete", "pointers in the first x bytes only", and so on.

But if you have to hand-tune the allocation type, it breaks the promise of "just allocate the objects you want, and forget about everything else". And this "hand-tuning" that you mention is a tough nut to crack. For example, a lot of code goes into our Java applications for reusing objects. Well, if I have to make such a big effort to "hand-tune", I had better take over memory allocation and delete the objects myself. And I am talking again about real-life programming languages.
 " (more efficient, for instance, than naive approaches such as reference
 counting). "

 But GC uses reference counting. If it did not, how the GC mechanism will
 know if something is referenced or not ?

No, it doesn't. A GC tracks the allocation of all objects, and whenever the time comes it scans the stack for pointers to allocated objects. These are in turn scanned for pointers. Each object the GC comes across in this process is marked as "reachable". Afterwards, all objects which have not been marked can be deleted.

It can't be using a stack, since a stack is a LIFO thing. Pointers can be nullified in any order. Are you saying that each 'pointer' is allocated from a special area in memory? If so, what happens with member pointers? What is their implementation in reality? Is a member pointer a pointer to a pointer in reality? If so, it's bad. Really bad. And how does the GC mark an object as unreachable? It has to count how many pointers track it. Otherwise, it does not know how many references there are to it. So, it means reference counting, in reality. If it does not use any form of reference counting, as you imply, it first has to reset the 'reachable' flag for every object, then scan pointers and set the 'reachable' flag for those objects that have pointers pointing to them. And I am asking you, how is that more efficient than simple reference counting (which is local, i.e. the reference counter integer is affected only when a new pointer is created/destroyed)?
 Furthermore, I don't see why
 reference counting is bad. Even with cyclic references, objects can be
 manually deleted (and thus break the cycle).

Manual refcounting is fast but error-prone. Automated refcounting incurs a cost for each assignment and for each parameter passed in a function call. Even the "obvious" cases are not optimised out.

Manual refcounting is error-prone, I agree. Automated refcounting incurs no cost at all, unless you are saying that incrementing and decrementing an integer is a serious cost for modern CPUs. Furthermore, not every pointer needs to be reference counted. In my implementation, only those pointers that are concerned with the object's lifetime manage reference counting. Every method that accepts a pointer as a parameter takes a normal C++ pointer. In other words, only member pointers are special pointers that do reference counting. Temporary pointers allocated on the stack do not do reference counting. And there is a reason for it: since they are temporary, they are guaranteed to release the reference when destroyed. Of course, you may say now that some call may destroy the object and leave the stack pointers dangling. And I will say to you that it's your algorithm's fault, not the library's: since the inner call destroyed the object, it was not supposed to be accessed afterwards. So, as you can see, automated refcounting works like a breeze. And you also get the benefit of determinism: you know when destructors are called; and then you can have stack objects that, when destroyed, do away with all the side effects (for example, a File object closes the file automatically when destroyed).
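As a rough illustration of the scheme described above (counted member pointers, plain temporary pointers, cycles broken by hand), here is a C++ sketch. It is not the poster's framework: `std::shared_ptr` stands in for his counted pointer, and `Node`, `cycle_left_alone` and `cycle_broken_manually` are hypothetical names.

```cpp
#include <cassert>
#include <memory>

// Hypothetical node type that counts how many instances are alive.
struct Node {
    static int live;
    std::shared_ptr<Node> next;  // counted "member" pointer, owns its target
    Node() { ++live; }
    ~Node() { --live; }
};
int Node::live = 0;

// Two nodes that point at each other. When the local handles go out of scope
// each node still holds a count on the other, so neither destructor runs.
int cycle_left_alone() {
    int before = Node::live;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->next = a;
    }
    return Node::live - before;  // both nodes are still alive (leaked)
}

// Same graph, but one link is cleared by hand before the handles die.
int cycle_broken_manually() {
    int before = Node::live;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->next = a;
        Node* raw = b.get();     // uncounted temporary pointer on the stack
        assert(raw->next == a);
        a->next.reset();         // break the cycle manually
    }
    return Node::live - before;  // both destructors ran at scope exit
}
```

The second function shows the deterministic behaviour claimed above; the first shows what happens when nobody remembers to break the cycle, which is the case a tracing collector handles without help.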
 Thus, it turns out that "total" GC is significantly less overhead than
 "total" reference counting.

Nope, it does not, as I have demonstrated above.
 "However, in order to do this they tend to have significantly greater space
 usages than programs without GC (I've heard estimates on the order of 50%
 more total space used)."

 I don't know about them, but I have done a complete C++ framework with
 reference counting and without any memory problems. Each object gets one
 more int. How is that memory consuming ?

I haven't read the article, but i believe this means the programs without refcounting. You usually allocate your data in a pool or some other structure. That means, that if aliasing is possible, you cannot reliably say when to delete a single value. You can only delete a pool after the job is done. Another strategy is to forbid aliasing, and thus waste some space. It's a decision to make - depending on what is cheaper. Can't say that a memory overhead is always evil. A GC by itself may consume large amounts of memory.

If the working set is not in the cache, it means a lot of cache misses, thus a slow program. Refcounting only gives 4 bytes extra to each object. If you really want to know when to delete an object, I'll tell you the right moment: when it is no more referenced. And how do you achieve that ? with refcounting.
 "On the other hand, a program that leaks memory has the greatest space usage
 of all. I've wasted way too much of my life hunting down memory leaks in
 large C programs, and I have no interest in continuing to do so".

 This is because he did not use proper software engineering techniques. And
 he did not use C++ (or D ;-)).

D does not use reference counting. :) And if someone pursues a small memory footprint at any cost, that's what he gets: either C++-style refcounting, which tends to be slow if overused, or memory leaks...

As I told earlier, the trick is to use refcounting where it must be used. In other words, not for pointers allocated on the stack.
 "However, the reverse is also often true; many program optimizations
 (normally performed automatically by the compiler) are rendered much more
 difficult or impossible in code that uses pointers. In other words,
 languages that enable micro-optimizations often make macro-optimizations
 impossible. "

 I've heard this a lot of times. But no one cares to put up an example. Until
 I see a real-life example of pointers doing bad speedwise, I'll believe
 otherwise.

That's a reason why there are e.g. unaliased objects of 2 kinds in Sather. In Sather, e.g. INT is a library object, however, because it's immutable it works just as fast as C int. And in fact resolves one-to-one to it, with stack storage, copying, and all. You can create your own types which behave like that easily.

Real-life programming languages only, please. You still don't give me an example of how initialization fails with aliasing.
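For what it's worth, a commonly cited illustration of the aliasing concern from the article is sketched below in C++. The function names are hypothetical and the comments describe what a compiler is generally allowed to assume, not measured timings.

```cpp
#include <cassert>

// Adds *c to *a and then to *b. Because `c` may point at the same int as `a`,
// the compiler generally has to reload *c after the first store instead of
// keeping it in a register.
void add_twice(int* a, int* b, const int* c) {
    *a += *c;
    *b += *c;
}

// Same intent, but the addend is passed by value, so it cannot alias anything
// and no reload is needed.
void add_twice_by_value(int* a, int* b, int c) {
    *a += c;
    *b += c;
}

// `c` aliases `a`: x becomes 2 first, then y = 1 + 2.
int aliased_result() {
    int x = 1, y = 1;
    add_twice(&x, &y, &x);
    return y;
}

// No aliasing: y = 1 + 1.
int unaliased_result() {
    int x = 1, y = 1, z = 1;
    add_twice(&x, &y, &z);
    assert(x == 2);
    return y;
}
```

The two calls give different values for `y`, which is why the pointer version cannot simply be compiled as if `*c` were a constant: the result depends on whether the pointers overlap, and the compiler usually cannot prove that they do not.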
 "The author of the Eiffel language, Bertrand Meyer, has said that (I'm
 paraphrasing) "you can have pointer arithmetic, or you can have correct
 programs, but you can't have both". I agree with him. I think that direct
 memory access through pointers is the single biggest barrier to programming
 language scalability. "

 But there are languages that are extremely powerful, using pointers, but
 they have no memory leaks. Take ADA, for example. A fine example of a
 programming language. It uses pointers, but constrained.

How come it doesn't have memory leaks? Sorry, i don't know ADA. Either it uses a kind of automatic memory management, or it *does* have memory leaks. What kind of constraint is there? I have some Delphi experience, and Pascal/Delphi is quite prone to leaks, even if they are not so frequent for some reason, be it possibilities for better program organisation or similar things.

At first I too thought that ADA was similar to PASCAL. Well, it is syntactically similar, but that's about it. Its pointer usage is constrained. For example, you can do pointer arithmetic, but it is bounds-checked. You can't have pointer casting, unless it is explicitly specified as an alias on the stack.
 "The usual reason why many programmers don't like static type checking is
 that type declarations are verbose and detract from the purity of the
 algorithm. "

 Yes, but if you come back to the code a year after, which one make more
 sense ?

 x = 0;

 or

 double x = 0;

 I like the C type system. Automatic type inference may save a few strokes,
 but it makes compiling slower and it prohibits me from quickly catching what
 a piece of code does (by eye scanning). Imagine being in the middle of a
 two-page algorithm with no types!!! hell!!!

You are right. The OCaml manual says something like "no need to state the obvious over and over again", while things are not that obvious. One has to keep track of types anyway, and writing them down just helps. If one doesn't want to encode the type into names or comments, one has to rely on some static system. Sather has a system not too dissimilar to other languages, but it has 2 forms of short notation: you can leave out the name of the constructor if you construct an object of the same type as the variable it is placed in (and not a subtype), which saves typing the same type twice on the same line; and a "::=" type-inference assignment operator, for where the type is obvious to the compiler anyway. There is no real type inference, and this simple plug helps separate long expressions into a few more readable parts. The Manual discourages overuse of the latter practice.

A cleverer solution would be to have automatic type insertion from the IDE: when I type 'x = 0.0', the IDE converts it to 'double x = 0.0'. After all, it's a typing problem, right? We are frustrated at having to type things that the computer should understand by itself. But that has nothing to do with what the program should look like.

Here is a little thought about Java's lack of templates, which is related to the problem of going back to the code and instantly realizing what's happening: in C++, when I go back to the code, I can easily remember what type the 'list' or 'map' had, because it is mentioned in the templates. In Java, I can't do that, since everything works at the Object level. So, I have to go back to the point where objects are inserted into the list or map and check the type of object inserted. This has two solutions, neither of which is very elegant: either name the collection after the types it uses, for example: TreeMap intToStringMap; or use Javadoc comments to explicitly note which kind of object the map holds. For example: /** maps strings to integers */ TreeMap nameIds;

This has another consequence: two different programmers putting different objects into the same map, and only discovering it when the program runs and raises an exception. This is why explicit statement of types is very important. We should not mix the 'fast typing' problem with the actual programming language.
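The point about templates carrying the element types can be shown with a short C++ sketch. The alias `NameIds`, the helper functions and the sample keys are hypothetical, chosen to mirror the `nameIds` example above.

```cpp
#include <cassert>
#include <map>
#include <string>

// The declaration itself documents the key and value types.
using NameIds = std::map<std::string, int>;

NameIds make_name_ids() {
    NameIds ids;
    ids["alice"] = 1;
    ids["bob"] = 2;
    // ids[42] = "carol";  // would not compile: wrong key and value types
    assert(ids.size() == 2);
    return ids;
}

// Returns the id stored for `name`, or -1 if the name is not in the map.
int lookup(const NameIds& ids, const std::string& name) {
    auto it = ids.find(name);
    return it == ids.end() ? -1 : it->second;
}
```

Here the mistake of putting the wrong kind of object into the map is rejected by the compiler, instead of surfacing as an exception at run time.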
 "The relative virtues of static versus dynamic type checking are one of the
 great holy wars amongst computer language researchers and users".

 Well, in real-life (and not in a research lab), static checking wins every
 time.

I don't think dynamic typechecking has any chance in a research lab. And on the contrary: there is such a vast number of users writing in Perl, with its guess-your-type system which couldn't be much worse, that it gets really scary.
 "This feature is so useful that almost all new languages incorporate it.
 Interestingly, exceptions also work much better in the presence of garbage
 collection; avoiding memory leaks in a language like C++ that has exception
 handling but has no GC is quite tricky (see Scott Meyers' books Effective
 C++ and More Effective C++ for an extensive description of this issue). This
 is yet another argument for garbage collection (as if we needed one). "

 Nope, it's not tricky. It's much more deterministic. And if your objects are
 coded the right way, there is no memory leak, since stack unwinding will
 release resources properly.

Have you read EC++ and MEC++? Well, it does require some effort and some thought, while GC requires none.

Nope, but I don't have any memory leaks in my apps, except only when I forget to delete things. But that's my problem. It's an engineering problem, not a language problem.
 (and before somebody jumps at me, let me tell you that I am a software
 engineer for a small company that does defense subcontracting for THALES (ex
 Thomson) and TRS (THALES RAYTHEON), and I have quite a big experience in C,
 C++, Java and ADA. Java being better than C++ is a myth. Its only advantage
 is the "write once, run everywhere" and the huge array of classes already
 made. But this has nothing to do with the language itself).

C++ being a [paste-anything-here] language is a myth as well. But hey, there have been so many libraries written for it, we can't flush them all down the drain, can we? :) There's no doubt that Java is way more primitive and that the guys have actually missed a chance to make it somewhat better than C++ ... Like, why are there no properties? Quite a surprise for a language which is not performance-centric. And yet many, many things. [jump!] :> -i.

Nope, it isn't. C++ is the only language that cuts it for me:

1) you always know what is happening. It is deterministic.
2) it has quite straightforward syntax, unlike ADA.
3) it supports generics in the best way I have seen (except D of course :-) ). This is very important.
4) lots of things can be automated, including memory management.
5) it supports every programming technique and paradigm God knows.

ADA is too strict, Java sucks, Basic is good only for small projects and prototyping. I also have knowledge of ML (and Haskell), although I would not say that it is a programming language to build large applications with.

All the other languages are fine on a theoretical basis. The only problem I see with C++ is the lack of standard (and free!!!) libraries across different operating systems, especially for the UI.
Jul 27 2003
next sibling parent reply "Carlos Santander B." <carlos8294 msn.com> writes:
"Achilleas Margaritis" <axilmar b-online.gr> wrote in message
news:bg0g9d$286f$1 digitaldaemon.com...
|
| All the other languages are well on a theoretical basis. The only problem I
| see with C++ is the lack of standard (and free!!!) libraries across
| different operating systems, especially for the UI.
|

I'm sorry if I stick my nose where I shouldn't, but...

I had to build a program that solved the 8-puzzle in either C, C++ or Java.
I started in D and had it done quite quickly, but the teacher wouldn't
accept it (lisp freak, he said: if you want to do it in another language, use
lisp... I don't know that much lisp!), and since I didn't want Java, I
re-did it in C++. While in D I could do it in a weekend because of its
simplicity and power, in C++ it took me a whole week because of it being way
too complex.

I must say I learnt many things about C++ that I was never taught, and the
program turned out really good (in DMC it works like a charm, in BCC and
VC6, not so much) but sometimes I was in hell. The last problem I had was
allocating memory for a char *. That can't be good.

What I'm trying to say is C++ gives you a lot of power, yes, but sometimes
its complexity can be... overwhelming, let's say.

(Take all of this as coming from someone who realized this week that knowing
*some* Turbo C++ 3.0 isn't knowing C++)

-------------------------
Carlos Santander


Jul 27 2003
parent "Walter" <walter digitalmars.com> writes:
"Carlos Santander B." <carlos8294 msn.com> wrote in message
news:bg0ili$2ahl$1 digitaldaemon.com...
 I had to build a program that solved the 8-puzzle in either C, C++ or Java.
 I started in D and had it done quite quickly, but the teacher wouldn't
 accept (lisp freak, he said: if you want to do in another language, use
 lisp... I don't know that much lisp!), and since I didn't want Java, I
 re-did it in C++. While in D I could do it in a weekend because of its
 simplicity and power, in C++ it took me a whole week because of it being way
 too complex.

Can you email me both versions? This sounds like a great example!
Aug 17 2003
prev sibling next sibling parent reply "Sean L. Palmer" <palmer.sean verizon.net> writes:
Good for you!

You can be the most conscientious programmer in the world, but it is still
rather easy to get into a situation where it is not clear where or when the
object should be deleted.

With C++/manual memory management, you get either dangling pointers (which
lead to double-deletions or access to invalid objects), or memory leaks.
It takes either very simple program logic, or a superhuman effort by the
programmers, to avoid all the memory problems.

Just because you haven't personally run into them doesn't make the problems
go away for everybody else.  It is a real issue.

The sucky thing is that the job of figuring out when it's safe to delete
the objects, of keeping the "reference counts" or GC, or what-have-you, is a
rather straightforward chore, something the computer could do for you.  This
is why we all want memory management built into the language, so we don't
have to worry about forgetting to delete pointers, or deleting them too
early, or forgetting to clear all pointers to the deleted memory.

Manual memory management is what makes programming in C++ such a PITA.

Sean

"Achilleas Margaritis" <axilmar b-online.gr> wrote in message
news:bg0g9d$286f$1 digitaldaemon.com...
 GC is a mistake, in my opinion. I've never had memory leaks with C++, since
 I always 'delete' what I 'new'.

Jul 27 2003
parent "Walter" <walter digitalmars.com> writes:
"Sean L. Palmer" <palmer.sean verizon.net> wrote in message
news:bg17pj$30u1$1 digitaldaemon.com...
 Manual memory management is what makes programming in C++ such a PITA.

I use a garbage collector with C++ now, too <g>.
Aug 17 2003
prev sibling next sibling parent "DeadCow" <deadcow-remove-this free.fr> writes:
"Achilleas Margaritis" <axilmar b-online.gr> wrote in message news:
bg0g9d$286f$1 digitaldaemon.com...

 GC is a mistake, in my opinion. I've never had memory leaks with C++, since
 I always 'delete' what I 'new'.

Is it really that simple? I don't think so! =)

 And how does the GC marks an object as unreachable ? it has to count how
 many pointers track it. Otherwise, it does not know how many references are
 there to it. So, it means reference counting, in reality.

*Root references* are found at compile time. At run time, the GC starts from the roots and follows each reference in turn, marking every chunk it reaches as reachable. Finally, the unreachable chunks are deleted. That's the "mark & sweep" algorithm.
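
The mark & sweep steps described above can be sketched in a few lines of C++. This is only a toy model under stated assumptions: the names `Node`, `Heap`, `roots` and `collect` are made up for illustration, references are stored as indices into a vector rather than as real pointers, and the roots are registered by hand instead of being found by a compiler.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical heap object: a mark bit plus outgoing references,
// stored as indices into Heap::slots instead of raw pointers.
struct Node {
    bool reachable = false;
    std::vector<std::size_t> refs;
};

struct Heap {
    std::vector<std::unique_ptr<Node>> slots;  // an empty slot means "freed"
    std::vector<std::size_t> roots;            // assumed to be registered by hand

    std::size_t allocate() {
        slots.push_back(std::make_unique<Node>());
        return slots.size() - 1;
    }

    // Mark phase: follow references from one root, flagging what is reached.
    void mark(std::size_t start) {
        std::vector<std::size_t> work{start};
        while (!work.empty()) {
            std::size_t i = work.back();
            work.pop_back();
            Node* n = slots[i].get();
            if (n == nullptr || n->reachable) continue;  // freed or already seen
            n->reachable = true;
            for (std::size_t r : n->refs) work.push_back(r);
        }
    }

    // Full collection: clear the flags, mark from every root, then sweep.
    // Returns how many objects were freed.
    std::size_t collect() {
        for (auto& s : slots) if (s) s->reachable = false;
        for (std::size_t r : roots) mark(r);
        std::size_t freed = 0;
        for (auto& s : slots) {
            if (s && !s->reachable) { s.reset(); ++freed; }
        }
        return freed;
    }
};
```

Note that no per-object counter is kept anywhere: an unreachable pair of objects that point at each other is simply never marked, so the sweep frees both.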
 If it does not use any way of reference counting as you imply, it has first
 to reset the 'reachable' flag for every object, then scan pointers and set
 the 'reachable' flag for those objects that they have pointers that point to
 them. And I am asking you, how is that more efficient than simple reference
 counting (which is local, i.e. only when a new pointer is created/destroyed,
 the actual reference counter integer is affected).

Some points:
- Reference counting can't handle cyclic references, and that's a *big* problem.
- GC collection only occurs when the program runs out of memory; reference counting wastes cycles on each assignment.
- Reference counting doesn't compact the heap.

[snip]
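
The cycle problem from the first point can be reproduced with a small C++ sketch. It uses `std::shared_ptr` and `std::weak_ptr` as a stand-in for whatever refcounting smart pointer a framework provides (an assumption, not the framework discussed in this thread); `Counted` and the two function names are hypothetical.

```cpp
#include <memory>

// Hypothetical refcounted object that tracks how many instances are alive.
struct Counted {
    static int alive;
    std::shared_ptr<Counted> strong_next;  // owning reference (counted)
    std::weak_ptr<Counted> weak_next;      // non-owning reference (not counted)
    Counted() { ++alive; }
    ~Counted() { --alive; }
};
int Counted::alive = 0;

// Two objects own each other: neither count ever drops to zero.
// Returns how many objects are still alive after the handles go out of scope.
int leak_with_strong_cycle() {
    int before = Counted::alive;
    {
        auto a = std::make_shared<Counted>();
        auto b = std::make_shared<Counted>();
        a->strong_next = b;
        b->strong_next = a;
    }
    return Counted::alive - before;
}

// Same shape, but the back edge is weak, so the cycle does not keep
// either object alive.
int no_leak_with_weak_back_edge() {
    int before = Counted::alive;
    {
        auto a = std::make_shared<Counted>();
        auto b = std::make_shared<Counted>();
        a->strong_next = b;
        b->weak_next = a;
    }
    return Counted::alive - before;
}
```

The second function is the kind of manual cycle breaking that refcounting needs: the programmer has to know where the back edges are.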
 Of course, you may say now that some call may destroy the object and leave
 the stack pointers dangling. And I will say to you, that it's your
 algorithm's fault, not of the library's: since the inner call destroyed the
 object, it was not supposed to be accessed afterwards.

Well, if you fail at coding a big application in asm, that's your algorithm's fault too =). But the language & compiler are supposed to help us write programs. C/C++ don't really help with memory management.

 So, as you can see, automated refcounting works like a breeze. And you also
 get the benefit of determinism: you know when destructors are called; and
 then, you can have stack objects that, when destroyed, do away with all the
 side effects (for example, a File object closes the file automatically when
 destroyed).

 Thus, it turns out that "total" GC is significantly less overhead than
 "total" reference counting.

Nope, it does not, as I have demonstrated above.

I'm not convinced =)

-- Nicolas Repiquet
Jul 27 2003
prev sibling next sibling parent reply Ilya Minkov <midiclub 8ung.at> writes:
Achilleas Margaritis wrote:

 But if it is a thread, it means that for every pointer that it can be
  accessed by the GC, it has to provide synchronization. Which in 
 turn, means, provide a mutex locking for each pointer.

Nope. If the GC scans a thread, it freezes it *completely* - mutexing would actually be intolerable performance-wise.

And please: the GC which you get with a Java VM is crap. A lot of research has been done to improve it, but Sun doesn't seem to be interested.

Most obviously, a threaded GC is bad for single-threaded or mostly single-threaded environments like D or C++. However, a free routine should be a low-priority thread.
 GC is a mistake, in my opinion. I've never had memory leaks with C++,
  since I always 'delete' what I 'new'.

That's what good code structuring in C++ allows for. And i believe the author of the original article (actually a rant) only understood that recently. C++ takes a few thousand pages of reading to understand.
 But if you have to hand-tune the allocation type, it breaks the 
 promise of ''just only allocate the objects you want, and forget 
 about everything else". And this "hand-tuning" that you are saying is
  a tough nut to crack. For example, a lot of code goes into our Java 
 applications for reusing the objects. Well, If I have to make such a 
 big effort to "hand-tune", I better take over memory allocation and 
 delete the objects myself.

In GC-enabled languages you definitely cannot tune it by hand - that's what a language compiler is supposed to do, since it's a trivial thing. In C (and maybe C++) you can consider 3 cases:
- you allocate very ordinary storage - like a class - which contains pointers and can contain them anywhere.
- you allocate an array of data which definitely doesn't have to be scanned. It can be textures or something along these lines.
- you allocate storage which begins with possible pointers, followed by an array with no pointers. An example of that is a struct with an open-ended array at its end - strings and other stuff are often implemented this way in C.
 It can't be using a stack, since a stack is a LIFO thing. Pointers 
 can be nullified in any order. Are you saying that each 'pointer' is 
 allocated from a special area in memory ? if it is so, what happens 
 with member pointers ? what is their implementation in reality ? Is a
  member pointer a pointer to a pointer in reality ? if it is so, it's
  bad. Really bad.

ARGH!!!! I'M NOT GOING TO EXPLAIN TO YOU EACH AND EVERY BIT THAT YOU DON'T UNDERSTAND!!! TAKE A SANE BOOK! It explains better than myself and doesn't get impatient.

* a usual allocation procedure is replaced by that of a GC. The GC allocates memory on a usual heap -- or sometimes in preallocated buffers, but that's for performance only.
* when allocating memory, it stores the beginning and end of each allocated block in an efficiently searchable structure, so that additional information can be associated with it (destructor, reachability, and so on).

So, you are dealing with completely usual pointers, just that the GC needs to be informed of every allocation.

And FYI, the program execution stack is just an array of values, which on a 32-bit CPU are 4 bytes large. At the beginning of a program, you can make a local integer variable and take its address. This would be almost the beginning of the stack. Then, at some later point, preferably deep within nested function calls, make another local variable and take its address as well. This range denotes an array which would be a vital part of your application's stack. It's that simple. Every value in this array is treated as a potential pointer.

That's how the Boehm GC works; language-specific GCs can use more efficient strategies. D's GC is currently like the Boehm GC except that it's slower. Both incur *no* runtime cost, a tiny allocation overhead which disappears due to optimisations, and a relatively long collection phase, which stops the application.

There also exist so-called 3-color collectors, which incur some constant run-time slowdown but don't have to stop the application at all. That's the kind of GC used in OCaml.
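
To make the "efficiently searchable structure" and the "every value is a potential pointer" idea concrete, here is a rough C++ sketch. It is not how the Boehm GC or D's collector is actually implemented; `BlockTable`, `add`, `find` and `count_hits` are hypothetical names, a `std::map` stands in for the real lookup structure, and addresses are plain integers so the example does not have to touch the real stack.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical table of allocated blocks, keyed by start address.
class BlockTable {
public:
    // The allocator is assumed to call this for every block it hands out.
    void add(std::uintptr_t start, std::size_t size) {
        blocks_[start] = start + size;  // value is one past the last byte
    }

    // Returns the start of the block that contains `value`, or 0 if none.
    std::uintptr_t find(std::uintptr_t value) const {
        auto it = blocks_.upper_bound(value);  // first block starting after value
        if (it == blocks_.begin()) return 0;   // value lies below every block
        --it;                                  // candidate: last block starting <= value
        return value < it->second ? it->first : 0;
    }

    // Conservative scan: treat each word as a potential pointer and count
    // how many of them land inside a known block.
    std::size_t count_hits(const std::uintptr_t* words, std::size_t n) const {
        std::size_t hits = 0;
        for (std::size_t i = 0; i < n; ++i) {
            if (find(words[i]) != 0) ++hits;
        }
        return hits;
    }

private:
    std::map<std::uintptr_t, std::uintptr_t> blocks_;
};
```

A real collector would mark the block that `find` returns and then scan that block's contents the same way; an integer that merely happens to look like an address keeps a block alive, which is the price of being conservative.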
 And how does the GC marks an object as unreachable ? it has to count 
 how many pointers track it. Otherwise, it does not know how many 
 references are there to it. So, it means reference counting, in 
 reality.

IT DOESN'T DO REFERENCE COUNTING!!! YOU SEEM NOT TO READ WHAT I WRITE! Please re-read everything, and pinpoint what you don't understand. Or CONSULT LITERATURE. You don't even google before you rant off!

http://www.iecc.com/gclist/GC-faq.html

GC DOESN'T MARK ANYTHING AS *UN*REACHABLE! GC DOESN'T MARK ANY POINTERS AT ALL!!!

The GC traverses every pointer in every memory region. Memory regions which it stumbles over become marked as *reachable*, and the rest will be freed. To have a starting point, the stack and registers are scanned first.

There are a number of optimisations which reduce scanning overhead. The following information is used:
- A number which isn't a multiple of 4 cannot be a pointer to the beginning of a memory range. This is enforced by the allocator.
- There are address ranges which cannot contain anything useful. These are sorted out very fast.

Besides, languages built around a GC can reduce the cost of the scan phase by collecting additional information during the program run. Applications become slower in total, but you get rid of these nasty pauses.
 If it does not use any way of reference counting as you imply, it has
  first to reset the 'reachable' flag for every object, then scan 
 pointers and set the 'reachable' flag for those objects that they 
 have pointers that point to them.

Substantially correct. Just that these scans happen once in a while, thus amortising the cost. It is obvious that a program which doesn't do any memory management at all is faster than one which does - given it gets all the memory it needs. In the theoretical case of unlimited memory, it's the same with GC. The GC just waits till memory is full, and then it kicks in to clean some up. So, during the time in which the GC doesn't kick in, a total refcounter would collect and discard, a great number of times, all the information that the GC collects once.
 And I am asking you, how is that more efficient than simple reference
 counting (which is local, i.e. only when a new pointer is
 created/destroyed, the actual reference counter integer is affected).
 

When it's done through a smartpointer, it's not - and it's very bad in terms of space. Usually, not using reference counting is faster, and is the common and safe practice in C++ programs. BTW, that's also what you are saying.
 Thus, it turns out that "total" GC is significantly less overhead 
 than "total" reference counting.

Nope, it does not, as I have demonstrated above.

Your (and common) C++ practice is fairly efficient, and is basically manual memory management with some automated help sprinkled in. However, D has completely automatic memory management by design, and refcounting everything would be much slower than GCing everything. I can clearly see that manual + refcounted memory management is the right thing for C++.
 So, as you can see, automated refcounting works like a breeze. And 
 you also get the benefit of determinism: you know when destructors 
 are called; and then, you can have stack objects that, when 
 destroyed, do away with all the side effects (for example, a File 
 object closes the file automatically when destroyed).

This is definitely a good thing.
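
For readers who have not seen the pattern, the "File object closes the file automatically when destroyed" example from the quote might look roughly like this in C++. The `File` class and `write_then_read` helper are made up for illustration (they are not from the framework under discussion), and error handling is kept to a minimum.

```cpp
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical wrapper: the constructor opens the file, the destructor closes it.
class File {
public:
    File(const std::string& path, const char* mode)
        : handle_(std::fopen(path.c_str(), mode)) {
        if (handle_ == nullptr) throw std::runtime_error("cannot open " + path);
    }
    ~File() { std::fclose(handle_); }  // runs deterministically at scope exit

    File(const File&) = delete;             // copying would close the file twice
    File& operator=(const File&) = delete;

    void write(const std::string& text) { std::fputs(text.c_str(), handle_); }

    std::string read_all() {
        std::string result;
        char buffer[256];
        std::size_t n = 0;
        while ((n = std::fread(buffer, 1, sizeof buffer, handle_)) > 0) {
            result.append(buffer, n);
        }
        return result;
    }

private:
    std::FILE* handle_;
};

// Writes `text` to `path`, then reads it back through a second File object.
std::string write_then_read(const std::string& path, const std::string& text) {
    {
        File out(path, "w");
        out.write(text);
    }  // `out` is destroyed here, so the file is flushed and closed
    File in(path, "r");
    return in.read_all();
}
```

The read only sees the written text because the inner scope has already ended; with a collector that runs finalizers at some unspecified later time, that ordering would not be guaranteed.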
 If the working set is not in the cache, it means a lot of cache 
 misses, thus a slow program. Refcounting only gives 4 bytes extra to 
 each object. If you really want to know when to delete an object, 
 I'll tell you the right moment: when it is no more referenced. And 
 how do you achieve that ? with refcounting.

 As I told earlier, the trick is to use refcounting where it must be 
 used. In other words, not for pointers allocated on the stack.

You cannot tell that to an automated memory management system. :)
 Real-life programming languages only, please. You still don't give me
  an example of how initialization fails with aliasing.

What are you doing here? Is D a real-life language? Definitely not yet. And Sather is also quite a good candidate. There's nothing unnatural or complicated in it. Just amazingly simple and good solutions. Something that may be called elegance, if you like, and which is foreign to any C descendant. :)

I don't think i can follow you. What does initialisation have to do with aliasing? Efficiency (combined with simplicity) is the reason why unaliased objects exist.
 At first I thought too that ADA was similar to PASCAL. Well, it is 
 syntactically similar, but that's about it. Its pointer usage is 
 constrained. For example, you can do pointer arithmetic, but it is 
 bounds-checked. You can't have pointer casting, unless it is 
 explicitly specified as an alias on the stack.

Though these are nice, i *still* don't see how these reduce memory leaks. There must be more to it.
 A cleverer solution would be to have automatic type insertion from 
 the IDE: when I type
 
 'x = 0.0',
 
 the IDE converts it to:
 
 'double x = 0.0'.
 
 After all, it's a typing problem, right ? we are frustrated to type 
 the things that the computer should understand by itself. But that 
 does not have to do about what the program should be like.

Cool idea.
 Here is a little thought about Java's lack of templates, which is 
 related to the problem of going back to the code and instantly 
 realizing what's happening:

Argh. My Delphi practice says: don't use containers as they are! You can subclass a container, and hide all the casts there. This is no more than a screenful of code -- and a significant help onwards. I don't want to justify Java, but it's a sort-of solution.
 This is why explicit statement of types is very important. We should 
 not mix the 'fast typing' problem with the actual programming 
 language.

Yup. Why doesn't anyone use Java extensions - Pizza, Nice, Kiev -- they all have templates, properties, and other things which make Java much easier and better.
 Nope, but I don't have any memory leaks in my apps, except only when 
 I forget to delete things. But that's my problem. It's an engineering
  problem, not a language problem.

HAHAHAHAA! Don't you want to be free of this headache at all? In fact, it's such a common "engineering problem" that it's only natural to search for a solution to it in the language.

And yet: there are many people who may be well educated, professional, and so on, but they just can't keep track of too many things in their head. They deserve help. And you can free up your brain for better things, which doesn't happen with Java because it's a flawed thing.

Now scroll up and read your own quote: "I've never had memory leaks with C++, since I always 'delete' what I 'new'"

Assembly language is very error-prone -- and yet, according to you it would be "an engineering problem, not a language problem"! The same goes, to a lesser extent, for C, C++, and so on.
 Nope, it isn't. C++ is the only language that cuts it for me:
 
 1) you always know what is happening. It is deterministic.

It's not very verbose as to what's happening. Requires some brain strain.
 2) it has quite straightforward syntax, unlike ADA.

Straightforward syntax? Don't make me laugh.
 3) supports generics in the best way I have seen (except D of course
 :-) ). This is very important.

Sather has generics integrated with their typesystem so tightly that any implementation class is a template as well.

I must confess, C++ templates really impress me anew each time. First, it turns out that sorting with the standard algorithms is 10x faster than with C's qsort! Then i find dynamic closures, constructed using templates, and similar wizardry -- even a complete functional programming emulation -- something for which they weren't intended.
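
As a side note on the qsort comparison, the two calls look like this in C++. The sketch makes no attempt to reproduce the 10x figure, which is the poster's own number; it only shows the structural difference that is usually given as the explanation: `qsort` reaches the comparison through a function pointer taking `void*`, while `std::sort` is instantiated for the element type. The helper names are hypothetical.

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// Comparator in the shape qsort expects: untyped pointers, int result.
int compare_ints(const void* a, const void* b) {
    int x = *static_cast<const int*>(a);
    int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);  // avoids the overflow of the classic x - y
}

// C-style sort: every comparison goes through the function pointer.
std::vector<int> sorted_with_qsort(std::vector<int> v) {
    if (!v.empty()) {
        std::qsort(v.data(), v.size(), sizeof(int), compare_ints);
    }
    return v;
}

// Template sort: the comparison (operator< on int) is known at compile
// time, so the compiler is free to inline it.
std::vector<int> sorted_with_std_sort(std::vector<int> v) {
    std::sort(v.begin(), v.end());
    return v;
}
```

Both functions return the same ordering; any speed difference depends on the compiler, the element type and the data, so it is worth measuring rather than assuming.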
 4) lots of things can be automated, including memory management.

Most importantly, it is very flexible as to what you automate and what not: you may use pools, GCs, refcounts, and lots of other stuff.
 5) supports every programming technique and paradigm God
 knows.

C++ is very poorly suited to aspect-oriented programming. Well, like most other languages out there. I believe only Lisp and its descendants, as well as some very special languages, are good at it. Sather is a pure OO language, and yet it has closures which allow you to do functional programming in it.
 ADA is too strict, Java sucks, Basic is good for only small projects 
 and prototyping. I also have knowledge of ML(and Haskell), although I
  would not say that it is a programming language to build large 
 applications with.

That's all of your language baggage? C'mon! Forget Basic, and take a deep breath of something smarter. Java was probably a marketing gag to help Microsoft market VB. :) ML and Haskell are very limited lab toys, even unlike OCaml.

What do you mean by "ADA is too strict"? I dislike Eiffel because it becomes a PITA in quite a number of situations in which even Sather doesn't. "Exception means a broken program", silly loop conditions, old contracts hunting you in the 10th generation, and so on. It is so strict about safety that trying to write software which matches its criteria becomes by itself an unsafe thing to do. :)
 All the other languages are well on a theoretical basis. The only 
 problem I see with C++ is the lack of standard (and free!!!) 
 libraries across different operating systems, especially for the UI.

Name any language besides Java which has them? There are wonderful cross-platform C++ libraries, it's just that they are not standard. And yet: not even Java is perfect in this respect. Its standard Swing and AWT are so sluggish... On the other hand, there is IBM's SWT aka Eclipse GUI, which is quite fast, but yet again: it's not standard...

-i.
Jul 27 2003
parent reply Bill Cox <bill viasic.com> writes:
Hi, Ilya.

Ilya Minkov wrote:
...

 In GC-enabled languages you definitely cannot tune it by hand - that's
 what a language compiler is supposed to do since it's a trivial thing.

Always nice to read your comments, which are always thoughtful. However, I have to disagree with this statement.

In theory, a good GC system automates what I would build by hand otherwise. I find that in practice, it doesn't quite get there. GC systems of today typically don't even allocate objects of a given class in contiguous memory, which dramatically impacts cache and paging performance.

As an example of code that I wouldn't leave up to today's GCs, the inner loops of some printed circuit board and IC routers create "wave" objects to find minimum paths through mazes. Once a minimum path has been found, all the waves get deleted at once. Creation and deletion of waves can easily dominate a router's run-time if left up to the GC. In practice, waves can be allocated in an array up front, and a wave creation can simply be an auto-increment on a global variable. Deleting all the waves is done simply by assigning 0 to the global variable. Of course, this can be done in D, and GC doesn't get in the way.

I also don't think that it's trivial to do memory management well. I look forward to the day that it's done so well that I can stop overriding GC. At a minimum, the compiler needs to do a global analysis of my code, and preferably take some run-time statistics, before making choices for memory layout. Then, it could also do cool automatic optimizations like inserting unions into my classes for fields that have non-overlapping lifetimes, which would be a maintenance nightmare to do by hand. It could factor out the most frequently used fields of objects accessed in inner loops, and put those fields densely together in memory to enhance cache performance. It might even insert the 'delete' statements for some classes automatically to eliminate GC overhead. For infrequently accessed classes, it could ignore byte alignment of objects and pack them more densely.

Languages like C++ are already broken in this regard, since the exact layout of memory for each class is visible to the coder, and specified in the standard, and not the same on all machines. D does a much better job of memory abstraction, which makes advanced optimizations possible.

Bill
Jul 28 2003
parent reply Ilya Minkov <midiclub 8ung.at> writes:
Hello.

Bill Cox wrote:
 Always nice to read your comments, which are always thoughtful. 

Thanks, but how'd i deserve that? :)
  However, I have to disagree with this statement.

What I meant there, was selecting one of 3 allocation options in a standard Boehm GC, which is BTW not only used for its original purpose, but also by many runtime environments for GC-centric languages. I believe MONO and DOTGNU .NET-compatible runtimes do, Sather does, and i bet also lots of others. Now, selecting one of its 3 most usual allocation options is trivial by object type, and would already do performance a lot of good. That's what is going to be also implemented in D, sort-of, just better.
 In theory, a good GC system automates what I would build by hand 
 otherwise.  I find that in practice, it doesn't quite get there.  GC 
 systems of today typically don't even allocate objects of a given class 
 in contiguous memory, which dramatically impacts cache and paging 
 performance.

They allocate same-sized objects in contiguous memory. That's fairly close, but could be better.
 As an example of code that I wouldn't leave up to today's GCs, the inner 
 loop of some printed circuit board and IC routers create "wave" objects 
 to find minimum paths through mazes.  Once a minimum path has been 
 found, all the waves get deleted at once.  Creation and deletion of 
 waves can easily dominate a router's run-time if left up to the GC.  In 
 practice, waves can be allocated in an array up front, and a wave 
 creation can simply be an auto-increment on a global variable.  Deleting 
 all the waves is done simply by assigning 0 to the global variable.

That's a pattern which definitely doesn't require GC. C++ is very good at such things.
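
The wave pattern quoted above could be sketched in C++ along these lines. The names `Wave` and `WavePool` and the three fields are invented for the example, and the counter is a class member rather than the global variable Bill describes, which keeps the sketch self-contained.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical wave record; the real router's fields are not known here.
struct Wave {
    int x = 0;
    int y = 0;
    int cost = 0;
};

// All waves live in one array allocated up front.
class WavePool {
public:
    explicit WavePool(std::size_t capacity) : waves_(capacity), used_(0) {}

    // "Creation" is just an increment of the index into the array.
    // Returns nullptr when the pool is exhausted.
    Wave* create(int x, int y, int cost) {
        if (used_ == waves_.size()) return nullptr;
        Wave* w = &waves_[used_++];
        w->x = x;
        w->y = y;
        w->cost = cost;
        return w;
    }

    // "Deleting" every wave at once is a single assignment.
    void delete_all() { used_ = 0; }

    std::size_t live() const { return used_; }

private:
    std::vector<Wave> waves_;
    std::size_t used_;
};
```

Pointers handed out before `delete_all()` must not be used afterwards, since the same slots are reused by the next search; that discipline is the manual part of the bargain.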
 Of course, this can be done in D, and GC doesn't get in the way.

In a way it does. It still needs to scan this area, which would reduce total application performance. Or you shut it off and you're back where you began.
 I also don't think that it's trivial to do memory management well.  I 
 look forward to the day that it's done so well that I can stop overriding 
 GC.  

Good memory management is in no way trivial. Just that if completely automated memory management is a goal, GC scores well compared to other currently known solutions. As opposed to that, a refcounter is a perfectly good help when memory management is largely manual.

I have stumbled over an experimental GC-enabled smartpointer implementation for C++, and it turns out to be significantly slower than a refcounter. Its major purpose is to be able to collect circular data. Its implementation doesn't only keep track of memory blocks (like the usual GC), but also of the smartpointers themselves. Where a refcounter would simply do an increment or a decrement and a test, it has to actually go and insert this pointer into, or delete it from, a set by traversing a binary tree! This means that the cost of each pointer operation in such a system grows logarithmically with the number of pointers to be tracked, while that of refcounting is constant. This GC is useless beyond very small scale.

This shows that a GC is currently a very inefficient supplement to a manual memory management system, while reference counting is efficient enough. The advantage of this implementation is that it doesn't need to scan the stack, which pays off when used very sparingly. GC has to be used for the major part of the application to be able to match even mediocre manual memory management.

It also appears that even if you generally use a conservative GC in C++, you can still exclude all the classes you manage manually from scanning. Which in turn means that a half-used GC would have less performance impact than in D.

What you are speaking of is basically that a compiler should be able to pick up and implement the best of the memory management strategy patterns, which currently belong to manual memory management (among other things). However, we're simply not there yet with our current knowledge. :(
 Languages like C++ are already broken in this regard, since the exact 
 layout of memory for each class is visible to the coder, and specified 
 in the standard, and not the same on all machines.  D does a much better 
 job of memory abstraction, which makes advanced optimizations possible.

Wait... In C++, you have to distinguish, just like in D, 2 kinds of classes: those which contain a VTable pointer and those which don't. Now, the standard doesn't specify whether the VTable pointer has to be at the beginning (which is common though), at the end, or at any other position in a class's storage. That means that the layout of any class which has any virtual member function is implicitly broken and can be tuned by the compiler at will. Re-order, optimise, ... If i understood it correctly. But then again, for classes which have any predecessors, storage members have to retain the same layout among themselves as in the predecessor, as they compose a part of the class's interface. And that for an obvious performance reason.

Currently in D we have a similar situation. By contrast, in Sather you cannot access data members directly - only through functions, which behave like D's getters/setters. These are generated automatically if one doesn't write specialised ones. Classes are completely independent of the previous generation - and in fact this means that when using a class deep in a hierarchy, it doesn't have to put all of its parents in the executable. Nor is it even generated completely, but only the part which is actually inferred to be invokable. Funny thing: the invokability scanner in Sather is conceptually and algorithmically very close to the reachability scanner in garbage collection. They must both be very essential things then. :)

Another interesting feature in Sather is the separation between interfaces and concrete classes. Its translation to the D world could be a final class. A final class cannot be subclassed, but that's its strength - since usually calls to it can be made without the VTable and possibly inlined. Just that i'm afraid that in D this would look like a kludge, while in Sather it's not even a constraint, since you can still reuse both its code and its interface. This also elegantly *implicitly* separates public from private details and leads to better designs. I believe that when doing full-program compilation, D shall be able to identify final classes by itself.

-i.
Jul 28 2003
parent reply Bill Cox <bill viasic.com> writes:
Hi, Ilya.

...

 Now, selecting one of its 3 most usual allocation options is trivial by 
 object type, and would already do performance a lot of good. That's what 
 is going to be also implemented in D, sort-of, just better.

Manual specification of just a few options would be great.
 Wait... In C++, you have to distinguish, just like in D, 2 kinds of 
 classes: those which contain a VTable pointer and those which don't. 
 Now, the standard doesn't specify, whether the VTable pointer has to be 
 in the beginning (which is common though), at the end, or at any other 
 position in a class storage. That means, that a layout of any class 
 which has any virtual member function is implicitly broken and can be 
 tuned by the compiler at will. Re-order, optimise, ... If i understood 
 it correctly. But then again, for classes which have any predecessors, 
 storage members have to retain the same layout among themselves as in 
 the predecessor, as they compose a part of the class's interface. And that 
 for an obvious performance reason.

I think C++ inherits the field layout of structures from C. Fields are defined to be located sequentially in memory in the order declared in the structure or class. The VTable pointer is less of a problem, I think.

D defines structures to have C-compatible layout, but field sequence is undefined in a class. In fact, I think in D you could spread fields out over memory, and not have them all sequentially, which can be handy for optimizing cache performance.
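
The declaration-order point can be checked for a plain struct with `offsetof`. A small sketch, assuming a hypothetical `Packet` struct with no virtual functions and no access specifiers between its members (the case where C++ keeps the C layout rules); the exact offsets and padding still depend on the platform.

```cpp
#include <cstddef>

// Hypothetical C-style struct: three members, declared in this order.
struct Packet {
    char tag;
    int length;
    double value;
};

// True when each field sits at a higher offset than the one declared
// before it, i.e. the compiler has not reordered them.
constexpr bool fields_in_declaration_order() {
    return offsetof(Packet, tag) == 0 &&
           offsetof(Packet, tag) < offsetof(Packet, length) &&
           offsetof(Packet, length) < offsetof(Packet, value);
}

// Checked at compile time, so a violation would fail the build.
static_assert(fields_in_declaration_order(),
              "members of a plain struct keep their declaration order");
```

Because this order is visible to the programmer, a C++ compiler cannot silently regroup hot fields for cache locality, which is the restriction being discussed here.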
 Currently in D we have a similar situation. To the contrary, in Sather 
 you cannot access data members directly - only through functions, which 
 behave like D's getters/setters. These are generated automatically if 
 one doesn't write specialised ones. Classes are completely independent 
 of the previous generation - and in fact this means that when using a 
 class deep in hierarchy, it doesn't have to put all of its parents in 
 the executable. Nor even it is generated completely, but only to a part 
 which is actually inferred to be invokable. Funny thing: invokability 
 scanner in Sather is conceptually and algorithmically very close to 
 reachability scanner in garbage collection. It must be both very 
 essential things then. :)
 
 Another interesting feature in Sather is a separation between interfaces 
 and concrete classes. Its transition to a D world could be a final 
 class. A final class cannot be subclassed, but that's its strength - 
 since usually calls to it can be made without VTable and possibly 
 inlined. Just that i'm afraid that in D this would look like a kludge, 
 while in Sather it's not even a constraint, since you can still reuse 
 both its code and its interface. This also elegantly *implicitly* 
 separates public from private details and leads to better designs. I 
 believe that when doing full-program compilation, D shall be able to 
 identify final classes by itself.
 
 -i.

Sather does seem to get a lot right. It's too bad Sather didn't use C-like syntax. I think that hurt the language's popularity.

Bill
Jul 29 2003
next sibling parent reply "Sean L. Palmer" <palmer.sean verizon.net> writes:
"Bill Cox" <bill viasic.com> wrote in message
news:3F2685D2.2040704 viasic.com...
 Sather does seem to get a lot right.  It's too bad Sather didn't use
 C-like syntax.  I think that hurt the language's popularity.

You're right. I followed the link to Sather someone (probably Mark Evans) gave me with great interest. I was astounded at each feature I read off the feature list, and each answer in the FAQ made me shiver with anticipation of using this terrific language. I was practically drooling by the time I got to the "sample code" section of the website, thinking surely I had finally found the language for me.

And the syntax made me immediately want to puke. I haven't even bothered to try it. ;)

It's not so bad that they used wordy syntax like Pascal, because I used to program Pascal. I can deal with it, but I wouldn't like it. But they *require* capitalization on some keywords, and force the way you capitalize idents. That's just wrong. That's such a hideous basic flaw, running so contrary to my way of programming, I could never convert.

Sean
Jul 29 2003
parent Ilya Minkov <midiclub 8ung.at> writes:
Sean L. Palmer wrote:
 It's not so bad that they used wordy syntax like Pascal, because I used to
 program Pascal.  I can deal with it, but I wouldn't like it.  But they
 *require* capitalization on some keywords, and force the way you capitalize
 idents.  That's just wrong.  That's such a hideous basic flaw, running so
 contrary to my way of programming, I could never convert.

I got used to the aesthetics very fast.

* almost all identifiers should be in a legible all_lowercase_letters style. Look, it reads almost like normal text! ThisIsMuchLessReadable. HowDidIWriteThatAbbreviation is a common problem with mixed-case and case-sensitive syntax - if you have an abbreviation like DOS, do you write it capitalised or in all-capitals? Try it with a few abbreviations, and you may find some you want to write in one style, and some in the other.
* class names are the only ones required to be UPPERCASE. So, types are visually easy to track in the code. Surprisingly even to me, this doesn't look ugly, unlike uppercase C macros, which do. You don't type them in each line anyway. You can even use type inference within functions if you're really sick of typing. :)
* all keywords are lowercase.
* operators are somewhat funny:
  ~   logical not
  /=  inequality
* the single frankly not very beautiful thing is that iterators are named with an exclamation mark. And possibly that you need to specify out parameters again when calling functions. Well, it goes in line with Sather's explicit style.

That's in no way a language flaw.

Personally, i find mixed-case identifiers which begin with a lowercase letter, like they are used in Java and here, awfully disgusting. This is so much unlike a natural language...

I'm by far not the only person who strongly dislikes C syntax.

http://www.csse.monash.edu.au/~damian/papers/PDF/ModestProposal.pdf

-i.
Jul 29 2003
prev sibling parent Ilya Minkov <midiclub 8ung.at> writes:
Hello.

It took me long to remember i wanted to answer this one yet. :)


Bill Cox wrote:

 Manual specification of just a few options would be great.

That's what i also thought of... But it's a complex problem. Where do you want to specify, say, "nogc" -- for memory areas which are neither scanned, nor allocated by GC? Either at the instantiation site, or in the class itself... The problem is, you have to consider that if a nogc object is a container -- whatever it holds should also be allocated in a nogc mode, in case it is immediately placed in this container. And they may in turn be containers. Can the decision be made at compile-time? Definitely not in all cases when nogc at instantiation is used. If specified in the class, it would mean recursive creation of clone or sub-classes for contained types, which would also be no-gc. Or is there another, multi-context solution?
 I think C++ inherits the field layout of structures from C.  They are
 defined to be located sequentially in memory in the order declared
 in the structure or class.  The VTable pointer is less of a problem,
 I think.  D defines structures to have C compatible layout, but field
 sequence is undefined in a class.  In fact, I think in D you could 
 spread fields out over memory, and not have them all sequentially,
 which can be handy for optimizing cache performance.

You could also interleave C++ classes, with some minor risk, if all their fields are private.
 Sather does seem to get a lot right.  It's too bad Sather didn't use
 C-like syntax.  I think that hurt the language's popularity.

I couldn't care less about popularity. :) Currently even its library is more advanced than that of D. Besides, C-like syntax is evil. :) Not that i would ever propose to change D's syntax to Pascal all of a sudden... D goes in a general direction which is inevitable. What i do care about is whether a project is alive or not. And in the case of Sather, it is dead since mid-2002.

-i.
Aug 07 2003
prev sibling next sibling parent "Walter" <walter digitalmars.com> writes:
"Achilleas Margaritis" <axilmar b-online.gr> wrote in message
news:bg0g9d$286f$1 digitaldaemon.com...
 GC is a mistake, in my opinion. I've never had memory leaks with C++,

 I always 'delete' what I 'new'.

The trouble I've had came when interfacing to a DLL that had memory leaks, and I had no way to change that DLL.
Aug 17 2003
prev sibling next sibling parent "Walter" <walter digitalmars.com> writes:
Check out the book on garbage collection in
www.digitalmars.com/bibliography.html
Aug 17 2003
prev sibling parent "Mike Wynn" <mike.wynn l8night.co.uk> writes:
"Achilleas Margaritis" <axilmar b-online.gr> wrote in message
news:bg0g9d$286f$1 digitaldaemon.com...
 But if it is a thread, it means that for every pointer that it can be
 accessed by the GC, it has to provide synchronization. Which in turn,

 provide a mutex locking for each pointer. Which in turn means to enter the
 kernel a lot of times. Now, a program can have thousands of pointers lying
 around. I am asking you, what is the fastest way ? to enter the kernel

 times to protect each pointer or to pause a little and clean up the memory

 I know what I want. Furthermore, a 2nd thread makes the implementation
 terribly complicated. When the Java's GC kicks in, although in theory
 running in parallel, the program freezes.

locks on pointers (you need a write barrier (check that an unwalked object is not being put into a fully walked object) and a return barrier (check when you leave a function)) but they only need to be active if the GC is active)
 GC is a mistake, in my opinion. I've never had memory leaks with C++,

 I always 'delete' what I 'new'.

newed implies to me that you have either not found them, have only worked on a project that requires simple data structures, or use a large amount of stack-based objects and/or lots of copying. GC is a good idea: it (assuming you trust the GC writer) gives you certainty not only that your objects will get cleaned up, but that you will never delete an object that you should not have (or delete something twice). Once you start working with data structures that have more than one "owner", GC allows you to design much more compact structures and potentially faster code (no copies, no manual checks, no ref counts, etc.).
 But if you have to hand-tune the allocation type, it breaks the promise of
 ''just only allocate the objects you want, and forget about everything
 else". And this "hand-tuning" that you are saying is a tough nut to crack.
 For example, a lot of code goes into our Java applications for reusing the
 objects. Well, If I have to make such a big effort to "hand-tune", I

 take over memory allocation and delete the objects myself.

cache chains of heap blocks so allocation of frequently used objects is fast (there should be a block of the right size already waiting for use), and by holding a set of objects "live" but outside your program you are doing two things that may not be desirable. One, you're back to the situation that you may have a reference held to the object you have manually cached, so when it is "re-allocated" someone else will get a shock. Two, you are also potentially increasing the work the GC does by having a large root set (all your cached objects).
 And I am talking again about real-life programming languages.

 But GC uses reference counting. If it did not, how the GC mechanism



 know if something is referenced or not ?

No, it doesn't. A GC tracks allocation of all objects, and whenever the time comes it scans the stack for pointers to allocated objects. These are in turn scanned for pointers. Each object which GC comes across in this process is marked as "reachable". Afterwards, all objects which have not been marked can be deleted.

It can't be using a stack, since a stack is a LIFO thing. Pointers can be nullified in any order. Are you saying that each 'pointer' is allocated

 a special area in memory ? if it is so, what happens with member pointers

 what is their implementation in reality ? Is a member pointer a pointer to

 pointer in reality ? if it is so, it's bad. Really bad.

memory points to an object or is a value of some kind.
 And how does the GC marks an object as unreachable ? it has to count how
 many pointers track it. Otherwise, it does not know how many references

 there to it. So, it means reference counting, in reality.

do not understand the basics. Good GCs do not reference count; they do not care how many refs there are, all they care about is that there is more than 0 refs. Think of GC as a process that takes a piece of string and ties it to all the objects it can find, starting at the "root set", which is all statics and the stacks of any running threads. Then, once it has "walked" all objects, it pulls the piece of string; anything not attached is obviously garbage, and it is "unreachable".
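
The "piece of string" picture can be sketched in a few lines of C++. This is only a toy under stated assumptions, not how D's collector (or any production GC) is written: `Object`, `Heap`, `allocate` and `collect` are made-up names, the roots are passed in explicitly instead of being found on the stack, and no reference counts are kept anywhere.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Toy heap object: outgoing references plus one mark bit (hypothetical type).
struct Object {
    std::vector<Object*> refs;  // pointers to other heap objects
    bool marked = false;        // set while the collector "ties the string"
};

// Toy heap that owns every allocation and frees what the roots cannot reach.
class Heap {
public:
    // Allocate a new object tracked by the heap.
    Object* allocate() {
        objects_.push_back(std::make_unique<Object>());
        return objects_.back().get();
    }

    // Mark everything reachable from the roots, then free the rest.
    // Returns the number of objects freed.
    std::size_t collect(const std::vector<Object*>& roots) {
        // Mark phase: walk the object graph with an explicit work list.
        std::vector<Object*> work(roots.begin(), roots.end());
        while (!work.empty()) {
            Object* o = work.back();
            work.pop_back();
            if (o == nullptr || o->marked) continue;
            o->marked = true;
            for (Object* r : o->refs) work.push_back(r);
        }
        // Sweep phase: one linear pass over all allocations.
        std::size_t freed = 0;
        std::vector<std::unique_ptr<Object>> live;
        for (auto& p : objects_) {
            if (p->marked) {
                p->marked = false;  // reset for the next collection
                live.push_back(std::move(p));
            } else {
                ++freed;  // destroyed when objects_ is replaced below
            }
        }
        assert(freed + live.size() == objects_.size());
        objects_ = std::move(live);
        return freed;
    }

    std::size_t size() const { return objects_.size(); }

private:
    std::vector<std::unique_ptr<Object>> objects_;
};
```

Two objects that point at each other but hang off no root are freed in the same pass as any other garbage, which is the case plain reference counting cannot handle on its own.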
 If it does not use any way of reference counting as you imply, it has

 to reset the 'reachable' flag for every object, then scan pointers and set
 the 'reachable' flag for those objects that they have pointers that point

 them. And I am asking you, how is that more efficient than simple

 counting (which is local, i.e. only when a new pointer is

 the actual reference counter integer is affected).

above) the mark phase requires one of two things: either a method of determining if a pointer-sized (and usually aligned) value is a pointer to an object or not (this is what D does), or "ref bits" somewhere on the stack and within the object header etc. to determine the "object tree". Next comes the sweep: you walk the heap(s), which is a linear walk, either resetting each object's header to "unwalked" or adding the object to the free list if it is still "unwalked".
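
The first option, deciding whether a pointer-sized value "looks like" a pointer to an object, might be sketched as below. It assumes the allocator keeps a table of live blocks keyed by start address; `BlockTable`, `add` and `find_block` are hypothetical names for illustration and are not taken from D's runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Table of live allocations, keyed by start address (hypothetical helper).
class BlockTable {
public:
    // Record a block of `size` bytes starting at `start`.
    void add(std::uintptr_t start, std::size_t size) {
        assert(size > 0);
        blocks_[start] = size;
    }

    // Conservative test: does this word fall inside a known block?
    // Returns the block's start address, or 0 if the word is not a candidate.
    std::uintptr_t find_block(std::uintptr_t word) const {
        // Find the first block that starts after `word`, then step back one.
        auto it = blocks_.upper_bound(word);
        if (it == blocks_.begin()) return 0;
        --it;
        const std::uintptr_t start = it->first;
        const std::size_t size = it->second;
        return (word - start < size) ? start : 0;
    }

private:
    std::map<std::uintptr_t, std::size_t> blocks_;
};
```

A marker built this way treats any stack word for which `find_block` returns non-zero as a reference. An integer that happens to equal a heap address keeps its block alive, which is the price paid for not needing "ref bits".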
 So, as you can see, automated refcounting works like a breeze. And you

 get the benefit of determinism: you know when destructors are called; and
 then, you can have stack objects that, when destroyed, do away with all

 side effects (for example, a File object closes the file automatically

 destroyed).

needed. D has "auto" objects to give you this determinism. Personally I prefer "try, catch, finally" for doing close on exit; using stack objects can cause problems if you pass them to someone else's library code (and it for some reason holds onto them).
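
For comparison, the stack-object style of "close on exit" can be sketched in C++ as follows. `ScopeGuard`, `FakeFile` and `use_file` are made-up names for illustration, and the "file" is only a flag so that the sketch does no real I/O.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Runs a cleanup action exactly once, when the guard leaves scope.
class ScopeGuard {
public:
    explicit ScopeGuard(std::function<void()> cleanup)
        : cleanup_(std::move(cleanup)) {
        assert(cleanup_);  // an empty cleanup would throw in the destructor
    }
    ~ScopeGuard() { cleanup_(); }
    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;

private:
    std::function<void()> cleanup_;
};

// Stand-in for a file handle (hypothetical; no real I/O is performed).
struct FakeFile {
    bool open = false;
};

// "Opens" the file, runs `work`, and relies on the guard to close the file
// again, whether `work` returns normally or throws.
inline void use_file(FakeFile& f, const std::function<void()>& work) {
    f.open = true;
    ScopeGuard close_on_exit([&f] { f.open = false; });
    work();
}
```

The guard gives the same guarantee as a "finally" block. The caveat about library code still applies: if `work` stashes a pointer to the file somewhere, that pointer refers to a closed file once `use_file` returns.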
 Thus, it turns out that "total" GC is significantly less overhead than
 "total" reference counting.

Nope, it does not, as I have demonstrated above.

closely with a GC designer for a while and started to see how GC's actually reduce the work done to perform automated resource management.
 If the working set is not in the cache, it means a lot of cache misses,

 a slow program. Refcounting only gives 4 bytes extra to each object. If

 really want to know when to delete an object, I'll tell you the right
 moment: when it is no more referenced. And how do you achieve that ? with
 refcounting.

store length) you only need 2 bits for GC info, so having 4-byte-aligned object starts (which you want anyway [or 16-byte for speed on some systems]) gives you 2 free bits in the heap length field. But most also do have a header, and again it's usually 4 bytes.
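
The "2 free bits in the length field" trick might look like this in C++. It is a sketch that assumes block sizes are always multiples of 4; `HeaderWord` and its member names are invented for illustration.

```cpp
#include <cassert>
#include <cstdint>

// One 32-bit word holding a block size plus 2 GC flag bits.
// Sizes are multiples of 4, so their low 2 bits are always zero and can be
// borrowed for flags such as "walked" / "unwalked".
class HeaderWord {
public:
    explicit HeaderWord(std::uint32_t size) : bits_(size) {
        assert(size % 4 == 0);
    }

    // Size with the flag bits masked out.
    std::uint32_t size() const { return bits_ & ~kFlagMask; }

    // The 2 flag bits, as a value in the range 0..3.
    std::uint32_t flags() const { return bits_ & kFlagMask; }

    // Overwrite the flag bits without disturbing the size.
    void set_flags(std::uint32_t f) {
        assert(f <= kFlagMask);
        bits_ = (bits_ & ~kFlagMask) | f;
    }

private:
    static constexpr std::uint32_t kFlagMask = 0x3u;
    std::uint32_t bits_;
};
```

Packed this way the GC information costs no extra memory per object, whereas a separate reference count costs a full word.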
 As I told earlier, the trick is to use refcounting where it must be used.

 other words, not for pointers allocated on the stack.

and it is true in hard real time env, this determinism can be better).
 In Sather, e.g. INT is a library object, however, because it's immutable
 it works just as fast as C int. And in fact resolves one-to-one to it,
 with stack storage, copying, and all. You can create your own types
 which behave like that easily.

Real-life programming languages only, please. You still don't give me an example of how initialization fails with aliasing.

and Java. What makes Sather "not real world"?
 How come it doesn't have memory leaks? Sorry, i don't know ADA. Either
 it uses a kind of automatic memory management, or it *does* have memory
 leaks. What kind of constraint is there? I have some Delphi experience,
 and Pascal/Delphi is quite prone to leaks, even if they are not so often
 due to some reason, be it possibilities for better program organisation
 or similar things.

At first I thought too that ADA was similar to PASCAL. Well, it is syntactically similar, but that's about it. Its pointer usage is constrained. For example, you can do pointer arithmetic, but it is bounds-checked. You can't have pointer casting, unless it is explicitly specified as an alias on the stack.

can free stuff you should not from under your own feet too. Pointers might be a bit more restricted in their use, but that's not the cause of most memory leaks.
 All the other languages are well on a theoretical basis. The only problem

 see with C++ is the lack of standard (and free!!!) libraries across
 different operating systems, especially for the UI.

for VxWindow or VGui and I believe you can get a MFC for Mac and Unix (you'll have to pay though).
Aug 18 2003
prev sibling parent reply Helmut Leitner <helmut.leitner chello.at> writes:
I agree with most of your points but...

Achilleas Margaritis wrote:
 I am sorry to say this, but the person that wrote this article is largely
 ignorant. Here is a piece of it:
 
 "There is a cost to GC, both in time and space efficiency."

What is wrong with this?
 "Well-designed garbage collectors (especially generational GC) can be
 extremely efficient"
 
 Java sucks speedwise. Show me a real-life language with good garbage
 collection that does not hamper performance.

I found the MS Windows implementation (JIT) rather efficient and typically only about 20-30% slower than comparable C code. I would not use "suck". But it seems that there were a number of slow implementations where you had to pay >100%.
 " (more efficient, for instance, than naive approaches such as reference
 counting). "
 
 But GC uses reference counting. If it did not, how the GC mechanism will
 know if something is referenced or not ? 

Books list 3 main GC methods:
- reference counting
- mark / sweep
- copying GC
 Furthermore, I don't see why
 reference counting is bad. Even with cyclic references, objects can be
 manually deleted (and thus break the cycle).

That's not considered safe. But there seem to be methods to solve the cycle problem of the naive reference counting implementation.

--
Helmut Leitner    leitner hls.via.at
Graz, Austria   www.hls-software.com
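
The cycle problem, and one common way around it, can be sketched with the reference-counted smart pointers of later C++ standards (`std::shared_ptr` and `std::weak_ptr`, which postdate this thread). `Counter`, `Node` and `live_after_release` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <memory>

// Counts live nodes so the effect of a cycle can be observed.
struct Counter {
    int live = 0;
};

struct Node {
    explicit Node(Counter& c) : counter(c) { ++counter.live; }
    ~Node() { --counter.live; }
    Counter& counter;
    std::shared_ptr<Node> strong_peer;  // owning link: counted
    std::weak_ptr<Node> weak_peer;      // non-owning link: not counted
};

// Builds two nodes that refer to each other, drops the external references,
// and returns how many nodes are still alive at that point.
inline int live_after_release(bool use_weak_back_link) {
    Counter c;
    std::weak_ptr<Node> watch;  // observer only, does not keep anything alive
    {
        auto a = std::make_shared<Node>(c);
        auto b = std::make_shared<Node>(c);
        a->strong_peer = b;
        if (use_weak_back_link) {
            b->weak_peer = a;    // back link does not own: no cycle of counts
        } else {
            b->strong_peer = a;  // back link owns: counts never reach zero
        }
        watch = a;
        assert(c.live == 2);
    }
    const int live = c.live;  // 2 if the strong cycle kept both nodes alive
    // Break the cycle by hand so that the demo itself does not leak.
    if (auto p = watch.lock()) p->strong_peer.reset();
    assert(c.live == 0);
    return live;
}
```

With two owning links both nodes survive the loss of every outside reference until someone breaks the cycle by hand. Making the back link weak avoids that, at the cost of the programmer having to decide which direction "owns".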
Jul 26 2003
parent "Achilleas Margaritis" <axilmar b-online.gr> writes:
"Helmut Leitner" <helmut.leitner chello.at> wrote in message
news:3F236F13.B37603FB chello.at...
 I agree with most of your points but...

 Achilleas Margaritis wrote:
 I am sorry to say this, but the person that wrote this article is


 ignorant. Here is a piece of it:

 "There is a cost to GC, both in time and space efficiency."

What is wrong with this?
 "Well-designed garbage collectors (especially generational GC) can be
 extremely efficient"

 Java sucks speedwise. Show me a real-life language with good garbage
 collection that does not hamper performance.

I found the MS Windows implementation (JIT) rather efficient and typically only about 20-30% slower than comparable C code. I would not use "suck". But it seems that there were a number of slow implementations where you had to pay >100%.

20-30% not being slow enough? Any slowness that I can notice while working (for example, the program freezing every now and then for a little) deserves the word 'sucks' for me. I am too strict maybe, but that's just me.
 " (more efficient, for instance, than naive approaches such as reference
 counting). "

 But GC uses reference counting. If it did not, how the GC mechanism will
 know if something is referenced or not ?

Books list 3 main GC methods:
- reference counting
- mark / sweep
- copying GC
 Furthermore, I don't see why
 reference counting is bad. Even with cyclic references, objects can be
 manually deleted (and thus break the cycle).

That's not considered safe. But there seem to be methods to solve the cycle problem of the naive reference counting implementation.

--
Helmut Leitner    leitner hls.via.at
Graz, Austria   www.hls-software.com

Jul 27 2003
prev sibling next sibling parent reply "Matthew Wilson" <matthew stlsoft.org> writes:
Academic piffle
Jul 26 2003
parent reply "Matthew Wilson" <matthew stlsoft.org> writes:
Well that's a bit strong perhaps.

Had I not answered within 10 minutes of achieving consciousness this
morning - before my polite hormones started flowing - I would say that I am
always sceptical of any statements asserting one language is better than
another.

I believe C++ to be a superior language than Java, though I use it a lot
more so am probably biased. But I would not choose to use C++ to implement
an e-commerce back-end, when J2EE is so simple, ubiquitous and reliable.

I would not use C, C++, C# or Java to write text file processing code. I use
Perl or Python (depending on whether I need more powerful regex or want to
do a bit of OO in there).

And the list goes on and on.

I've (thankfully) never written a line of COBOL, but no less an authority
than Robert Glass says it is still the preeminent language for certain
classes of business software, and I believe him. Why? Because no language is
perfect, almost all features are useful to someone at some time, and the idea
that a single language and its set of features will in any way compare in
importance to the intelligence and experience of practitioners is fanciful
and does a disservice to us all.
Jul 26 2003
next sibling parent reply "Achilleas Margaritis" <axilmar b-online.gr> writes:
"Matthew Wilson" <matthew stlsoft.org> wrote in message
news:bfv0op$oeh$1 digitaldaemon.com...
 Well that's a bit strong perhaps.

 Had I not answered within 10 minutes of achieving consciousness this
 morning - before my polite hormones started flowing - I would say that I

 always sceptical of any statements asserting one language is better than
 another.

 I believe C++ to be a superior language than Java, though I use it a lot
 more so am probably biased. But I would not choose to use C++ to implement
 an e-commerce back-end, when J2EE is so simple, ubiquitous and reliable.

You would not use it because it lacks something like J2EE, not because Java is a better language. We have, at last, to differentiate between 'language', 'libraries' and 'environment'. Although C++ is a better language, it totally lacks the Java 'environment' and the Java 'libraries'.
 I would not use C, C++, C# or Java to write text file processing code. I

 Perl or Python (depending on whether I need more powerful regex or want to
 do a bit of OO in there).

Again a problem of available libraries.
 And the list goes on and on.

 I've (thankfully) never written a line of COBOL, but no less an authority
 than Robert Glass says it is still the preeminent language for certain
 classes of business software, and I believe him. Why? Because no language

 perfect, almost all features are useful to someone at sometime, and the

 that a single language and its set of features will in any way compare in
 importance to the intelligence and experience of practitioners is fanciful
 and does a disservice to us all.

Jul 27 2003
parent "Matthew Wilson" <matthew stlsoft.org> writes:
 I believe C++ to be a superior language than Java, though I use it a lot
 more so am probably biased. But I would not choose to use C++ to


 an e-commerce back-end, when J2EE is so simple, ubiquitous and reliable.

You would not use it because it lacks something like J2EE, not because

 is a better language. We have at last to differentiate between 'language',
 'libraries' and 'environment'. Although C++ is a better language, it

 lacks the Java 'environment' and the Java 'libraries'.

Take your point here.
 I would not use C, C++, C# or Java to write text file processing code. I

 Perl or Python (depending on whether I need more powerful regex or want


 do a bit of OO in there).

Again a problem of available libraries.

but not here. Regex is part of Perl's syntax, and I struggle to imagine a way in which it could be more simply & succinctly supported in C++
Jul 27 2003
prev sibling next sibling parent reply "Achilleas Margaritis" <axilmar b-online.gr> writes:
And since we are talking about libraries, D will make it only if it has
standard libraries for gui, networking, database etc. Otherwise, it will be
like C++: a technically superior language, but not the language of choice.
Jul 27 2003
parent reply "Matthew Wilson" <matthew stlsoft.org> writes:
"Achilleas Margaritis" <axilmar b-online.gr> wrote in message
news:bg0gks$28j8$1 digitaldaemon.com...
 And since we are talking about libraries, D will make it only if it has
 standard libraries for gui, networking, database etc. Otherwise, it will

 like C++: a technically superior language, but not the language of choice.

Agree here. I am hoping that as soon as the remaining language debates are done that everyone will refocus on making significant, efficient, flexible and easy-to-use libraries
Jul 27 2003
parent "Sean L. Palmer" <palmer.sean verizon.net> writes:
You will get libraries for any language that has a sufficient number of
people interested in writing code in that language.  Obtain rights to the
cream of the crop and call them standard.

Sean

"Matthew Wilson" <matthew stlsoft.org> wrote in message
news:bg28pl$1270$2 digitaldaemon.com...
 "Achilleas Margaritis" <axilmar b-online.gr> wrote in message
 news:bg0gks$28j8$1 digitaldaemon.com...
 And since we are talking about libraries, D will make it only if it has
 standard libraries for gui, networking, database etc. Otherwise, it will

 like C++: a technically superior language, but not the language of


 Agree here. I am hoping that as soon as the remaining language debates are
 done that everyone will refocus on making significant, efficient, flexible
 and easy-to-use libraries

Jul 28 2003
prev sibling parent reply "Walter" <walter digitalmars.com> writes:
"Matthew Wilson" <matthew stlsoft.org> wrote in message
news:bfv0op$oeh$1 digitaldaemon.com...
 I've (thankfully) never written a line of COBOL, but no less an authority
 than Robert Glass says it is still the preeminent language for certain
 classes of business software, and I believe him. Why? Because no language

 perfect, almost all features are useful to someone at sometime, and the

 that a single language and its set of features will in any way compare in
 importance to the intelligence and experience of practitioners is fanciful
 and does a disservice to us all.

That was true, up until D came along. <g>
Sep 08 2003
parent "Matthew Wilson" <matthew stlsoft.org> writes:
Honk!

LOL

"Walter" <walter digitalmars.com> wrote in message
news:bjiia0$1dnr$1 digitaldaemon.com...
 "Matthew Wilson" <matthew stlsoft.org> wrote in message
 news:bfv0op$oeh$1 digitaldaemon.com...
 I've (thankfully) never written a line of COBOL, but no less an


 than Robert Glass says it is still the preeminent language for certain
 classes of business software, and I believe him. Why? Because no


 is
 perfect, almost all features are useful to someone at sometime, and the

 that a single language and its set of features will in any way compare


 importance to the intelligence and experience of practitioners is


 and does a disservice to us all.

That was true, up until D came along. <g>

Sep 08 2003
prev sibling parent reply Mark Evans <Mark_member pathlink.com> writes:
This is a good article by an open-minded person.  The response on D news has
been entirely predictable, though - our way or the highway, one might say.

Of similar interest is the recent "hackers and painters" thread at LL-discuss
where Michael Vanier and several luminaries weigh in -- Paul Graham, Todd
Proebsting of MS Research, Neel Krishnaswami designer of Needle, and many
others.

D news
http://www.digitalmars.com/drn-bin/wwwnews?D
LL
http://www.ai.mit.edu/~gregs/ll1-discuss-archive-html/threads.html#03074
Jul 27 2003
parent "Andrew Edwards" <edwardsac spamfreeusa.com> writes:
"Mark Evans" <Mark_member pathlink.com> wrote in message
news:bg19ge$187$1 digitaldaemon.com...
 This is a good article by an open-minded person.  The response on D news

 been entirely predictable, though - our way or the highway, one might say.

The debate has certainly been interesting though. I have learned a lot from reading the comments posted thus far. Thanks for the links!

Andrew
Jul 27 2003