digitalmars.D - C++ traps that D2 doesn't avoid yet?

reply bearophile <bearophileHUGS lycos.com> writes:
This page lists a lot of C++ traps and pitfalls in a short space. I assume D
has to avoid or render meaningless a quite high percentage of them.
I'd like to list what of them D2 doesn't improve/fix yet, but my limited
knowledge of C++ lets me understand only a certain percentage of such list of
items (probably less than 75%) so probably I am not qualified yet to compare
that with the D situation. Later I may try to write such differential list
anyway...

http://www.johndeacon.net/Cpp/trapsAndPitfalls.asp

The nice thing about that list is that I think most (maybe 80-90%) of those traps
and pitfalls (and several others that aren't listed, like allowing simple
syntactical mistakes such as putting a & where a && was required or vice versa,
etc) can be avoided by a language like D with little or no penalty in running
space and time (and actually D avoids several of them already), but to avoid
some of them you have to pay the price of sometimes having to step away from
C/Java syntax, or probably worse having to assign a different semantics to some
C syntaxes (Walter has explained that this is sometimes dangerous because most
programmers know C/C++/Java and not D, but seeing the huge troubles C++ has
gone to be as much compatible as possible with C I'd say that keeping the
original C semantics when it is known to cause troubles is _always_ bad. In
such situations I prefer a language that acts correctly than one that acts
badly just to be like C, and maybe I don't even like a language that doesn't
use some handy C syntax just because the semantics is now different).

Bye,
bearophile
Nov 05 2008
next sibling parent Walter Bright <newshound1 digitalmars.com> writes:
bearophile wrote:
 or
 probably worse having to assign a different semantics to some C
 syntaxes (Walter has explained that this is sometimes dangerous
 because most programmers know C/C++/Java and not D, but seeing the
 huge troubles C++ has gone to be as much compatible as possible with
 C I'd say that keeping the original C semantics when it is known to
 cause troubles is _always_ bad. In such situations I prefer a
 language that acts correctly than one that acts badly just to be like
 C, and maybe I don't even like a language that doesn't use some handy
 C syntax just because the semantics is now different).

The trouble with this is when translating code. For example, take the md5 algorithm, which is published in C, and translate it to D. It has some fairly complex arithmetic expressions in it. Do you know how they work? I don't. I have no idea. I doubt many people interested in translating them do. This means that if the meaning of an expression is *silently* altered by transliterating it to D, the new md5.d will give wrong results and it will be very difficult for the programmer to figure out where the problem is. This is why when D changes the semantics, it tries to restrict itself to changes that will produce compile time errors if the C is not transliterated correctly. For example, the C-style cast gives an error when it appears in D code, which prompts the user that some changes are necessary.
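The kind of silent change described above can be sketched in C++ itself. The post does not say which md5 expression it has in mind; the rotate below is modelled on the ROTATE_LEFT macro in the RFC 1321 reference code and is chosen here only as an illustration. The template wrapper and the name rotate_left are hypothetical. The same characters give different values depending on the operand width, and nothing is reported at compile time:

```cpp
#include <cstdint>

// The rotate expression from the RFC 1321 reference code,
//   (x << n) | (x >> (32 - n)),
// wrapped in a template (an assumption made for this sketch) so that the
// operand type can be varied. Only meaningful for 0 < n < 32.
template <typename T>
constexpr T rotate_left(T x, int n) {
    return static_cast<T>((x << n) | (x >> (32 - n)));
}

// Intended meaning: a rotate inside a 32-bit unsigned word.
// The top bit wraps around to bit 0.
static_assert(rotate_left<std::uint32_t>(0x80000001u, 1) == 0x00000003u,
              "32-bit operand: top bit wraps around");

// Same expression with a 64-bit operand: the bit shifted out of bit 31
// is kept in bit 32 instead of being discarded. It still compiles cleanly.
static_assert(rotate_left<std::uint64_t>(0x80000001u, 1) == 0x100000003u,
              "64-bit operand: same text, different value");
```

Compiling the snippet is enough to check both claims. A transliteration that changed an operand's width or signedness without a diagnostic would produce exactly this sort of wrong-but-plausible result.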
Nov 05 2008
prev sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
bearophile wrote:
 This page lists a lot of C++ traps and pitfalls in a short space. I
 assume D has to avoid or render meaningless a quite high percentage
 of them. I'd like to list what of them D2 doesn't improve/fix yet,
 but my limited knowledge of C++ lets me understand only a certain
 percentage of such list of items (probably less than 75%) so probably
 I am not qualified yet to compare that with the D situation. Later I
 may try to write such differential list anyway...
 
 http://www.johndeacon.net/Cpp/trapsAndPitfalls.asp

At a glance, D does resolve a lot of them. But preparing a detailed response to his list is many hours of work.
Nov 05 2008
parent reply bearophile <bearophileHUGS lycos.com> writes:
Walter Bright:
 At a glance, D does resolve a lot of them. But preparing a detailed 
 response to his list is many hours of work.

I have removed the most obvious ones and cleaned up the list a little:

1) C++ is officially based on C. C features like name hiding can interfere with C++ features like function overriding.
2) Function name overloading is convenient and good when appropriate; but confusing if misused. Be careful that overloading and name hiding don't interfere.
3) Primitive operands, arguments and returns get converted to their operators' and functions' expectations.
4) The presence of templates means that class member declarations involving a pointer to a template parameter might be interpreted as multiplication if you don't disambiguate with "typename".
5) Because the default for both objects and primitives is to pass to and from functions by copy, copy constructors are called more often than one might first imagine.
6) The special functions will be used more than a sub-guru C++er would imagine. C++ makes copies and temporaries under several circumstances. Special functions have even more reason than usual to avoid side-effects and to be very careful about throwing. (And never throw from a destructor.)
7) C++ doesn't take the attitude that incorrect primitive operand or argument types should be compile errors; it assumes you meant what you wrote and want the operands and arguments to be converted.
8) Although you can start an identifier with an underscore, you shouldn't. Leading underscores are added to identifiers by C++ translators.
9) You might find bizarre things happening if you have an identifier like bitor or compl. They are alternative keyword forms of | and ~.
10) Tidying up columns of numbers with leading zeroes turns them into octal (base 8) numbers.
11) It's not an error to have two character symbols within the single quote marks of a char literal (constant).
12) The encodings for char and wchar_t are not defined. Don't assume ASCII or Unicode.
13) Literal floating-point values (constants) do not have type float; they have type double.
14) Dividing by zero doesn't result in an exception. It results in undefined behavior. You must check your divisors yourself.
15) Because assignment expressions are expressions and evaluate to values, and because values can frequently be converted to bool, accidentally using assignment where you meant comparison will almost always compile and fail silently at run-time.
16) The order in which operands are evaluated is not defined. The fact that, for example, addition is defined to be left-associative, doesn't necessarily mean that the operands of the addition operator are evaluated from the left.
17) The operands of logical and and logical or are evaluated from the left; however, there is a new uncertainty: not all the operands need be evaluated at all.
18) Variables can hold values or pointers; and this applies to objects of class type as well as primitives. Be careful you don't assign to a pointer thinking it's a value, or vice-versa. Consider "Hungarian naming".
19) Under assignment, when a pointer to an object is assigned another pointer to an object, they end up pointing at the same object. However, if an object-holding variable (or a reference to an object) is assigned another object, it becomes some kind of copy.
20) The increment (but not the decrement) operator can be applied to bool! It will set it to true whatever it was.
21) The precedence order of some operators including the assignment operators and the conditional (arithmetic if) was changed in ISO C++.
22) Whether you use curly brackets or not, selection and iteration statements control blocks. And the block begins at the opening parenthesis, not at the opening curly bracket (if there is one).
23) Selection and iteration statements should be controlled by boolean expressions, but it is not a compile error to put in a numeric expression.
24) Although the (somewhat redundant) parentheses that hold the control logic are mandatory, curly brackets are not. Don't leave traps for maintenance programmers: always put the curly brackets in.
25) The = symbol is actually the assignment operator, so it's all too easy to say "if I can assign b to a" rather than "if a is equal to b". In C++, because numeric results can be converted to bool, such mistakes usually compile, to fail silently later.
26) If you don't put the curly brackets into an "if", it will be difficult to tell which "if" an "else" belongs to, if the "if" is, or ever becomes, nested.
27) The switch statement is really a slightly-tarted-up goto. This hints that maybe the switch should be avoided.
28) The switch statement involves just one block. The sections within are demarcated only with statement labels. You will need break statements after each (including the last for maintainers) section.
29) Accidentally drop one of the colons (:s) of a scope resolution operator, and you might just end up with a labeled statement.
30) A switch gets exponentially more difficult to maintain. One new type to be handled can involve updating many, many switch statements. Contrast this with adding a new class behind a polymorphic type. Prefer object-orientation over switches.
31) Many, many examples of the for statement have i++ to increment the loop counter. If i is an int, it's not really significant, but there's always a chance that it might become an iterator, and then ++i is more efficient and here does just the same as i++.
32) A C++ compiler isn't required to report an error if a value-returning function has no return statement.
33) (-PI <= theta <= PI) is always true, because the first comparison results in true or false which then convert to 0 or 1 which are both less than PI. In languages that don't convert boolean the expression is thankfully a compile error.
34) This is almost an anti-trap since consts are nearly always good. However, if you forget to make your query-only member functions const, then const objects of your class won't be able to accomplish much.
35) Applying delete to a pointer doesn't set it to zero; but you should probably consider doing so.
36) Incorrectly accessing an array of arrays (when simulating a multi-dimensional array) as arr[i, j] instead of arr[i][j] is not a compile error even though it won't do what you probably expect.
37) References create aliases for identifiers. Be careful not to end up doing daft things like self-assignment. Self assignment might go horribly wrong for classes with badly-written copy assignment operators.
38) Do not return via references unless you are absolutely sure that you know exactly what you are doing. (And make sure you know if and how your compiler supports the return value optimization.)
39) Don't try to be tidy by including empty parentheses when initializing objects in variables via the default constructor. You will actually be declaring parameterless functions instead.
40) Don't think of public as meaning the same for data and function members. Public data members can be accessed; but public member functions can't be called. Public member functions define the signatures of messages that all other objects can send.
41) Unlike, say, Smalltalk, C++ has class-level encapsulation. An object isn't prevented from accessing the private data members of another instance of its class. Apart from the copy constructor and the copy assignment operator, don't allow one instance to access the data of another instance.
42) C++ doesn't enforce the private access category for data members. One day you'll be tempted to make a data member of a base class protected. Resist. Don't do it. From that moment on, the base class becomes unmaintainable.
43) Object instance arguments are passed by value, i.e. by copying. But forget to provide the copy constructor and one will be provided for you. And Finagle's Law says that anytime you do forget, the implicit copy constructor's behavior won't be right.
44) Forgotten parentheses on an argumentless message or call will often compile because a function's address is a valid numeric expression.
45) Const objects can change. While passing out a pointer (or reference) to a data member is hardly a subtle trap, data members can be declared mutable however; so we hope that such data members have been designed to be undetectable from outside their object.
46) Static data members are close to global. Static member function uses are close to calls (rather than messages). Overusing static members takes one away from object-orientation and towards the less useful class-orientation.
47) Never mix up the two initialization syntax schemes of "=" and "()". Try to always use ()-style initialization. Never, ever allow the initialization of an object of class type to involve = and () at the same time.
48) The C memory mechanisms malloc and free still exist in C++, of course. It's not a compile mistake to free() something newed or to delete something malloc()ed; just lethal.
49) It's not a compile error to "delete" (rather than "delete[]") something that was "new []"ed. It's almost certainly a memory problem though.
50) To realize how quickly it could become impossible to decide who takes out the garbage, one only has to recall that the default argument passing and return mechanisms are by-copy; and that the implicit copy constructor does shallow copy.
51) Looking for smart pointers to help with garbage collection and with polymorphic collections, one turns to the library. But the smart pointer one finds there, auto_ptr, mustn't be used for either of those purposes.
52) Don't pass an auto_ptr to a function by value. The auto_ptr will be copied and the copy will become the owner. When the function finishes, the local copied auto_ptr goes out of scope and it will delete what it was pointing at.
53) Almost uniquely among languages today, C++ allows one to store entire objects in variables. This leads to no end of complications; and certainly leads to the special functions being special.
54) A class without constructors may well have implicit constructors. There are only two circumstances when an implicit copy constructor wouldn't be provided; and only a few more where a default constructor wouldn't. And the implicit ones are public.
55) People sometimes try to call one constructor from another. Don't. You can write something that looks like it might just be a call to a constructor but it won't be. Constructors are always declaration statement stuff rather than expression statement stuff.
56) A one-argument constructor will be implicitly used as a conversion function (argument type to class type) unless you say not to with the "explicit" keyword. And constructors with default parameter values might also be considered single-argument.
57) Another reason not to use the = sign in declaration/initialization is to avoid encouraging the mistaken belief that the copy assignment operator would be used. It wouldn't be. The copy constructor will be used for both = style and () style initialization.
58) The implicit copy constructor effectively does shallow copy: it copies pointers but not pointees.
59) Lulled into complacency by constructor chaining, you might imagine that the copy assignment operator automatically chains as well. It doesn't. You will probably want to though.
60) Data members holding objects of class type are default constructed before the constructor's code block runs. Initialize them via the member initialization list instead.
61) If you initialize an object of class type via {} expression lists, almost any change to the class will invalidate such an initialization.
62) The order of initializers in a member initialization list is irrelevant. Don't try, in a member initialization list, to initialize one member in terms of another unless you know the initialization order rules.
63) Although destructors can be called explicitly, it is an exotic and probably dangerous thing to do. Be happy with automatic calling of destructors.
64) The implicit destructor is non-polymorphic (non-virtual). Just about everyone, just about all the time must consider whether their destructor should be polymorphic (virtual); and probably conclude that it should be.
65) You might not think you need a destructor, but unless you can prove that your class could never be a base class, you should provide a virtual (not pure virtual) destructor, even if it has an empty code block.
66) Remember that neither the number of arguments nor their position contribute to the "name" of a member function. So an overload of unary +, say, in a derived class would hide a binary + in a base class.
67) Operator overloads are inherited but the implicit copy assignment operator usually hides any base class copy assignment operator.
68) You can't make && or || (or ,) operator overloads work intuitively, so don't provide them. (You can't mimic guaranteed operator evaluation order or lazy evaluation with member functions.)
69) You might be tempted to provide operator bool in order that your objects can easily be tested for validity. But beware that once your objects can be converted to bool, they can also be converted to int.
70) [Well known but] Not only is int converted to long, for example, but long is converted to int, and float is even converted to int. If you're lucky, a friendly compiler will issue a warning. So treat C++ compiler warnings as though they were errors.
71) Inheritance of implementation has not turned out to be the labor-saving device we thought it would be. (Unlike composition, it doesn't function through low-coupling, program-by-contract messages.)
72) While the maintainability of the derived classes goes up because of factoring, the maintainability of the base class rapidly goes down. So be sure your code is stable before you start introducing inheritance of implementation.
73) Given the fragility of base classes, they should always be destined and designed to be base classes. A class should never just slip into becoming a base class. Design your concrete, derived classes so that they could never become base classes. So why don't we ban inheritance? Because there's something more important than implementation to inherit. Public inheritance gives objects extra types; it supports polymorphism.
74) Private data members are present in the instances of derived classes. They just can't be accessed by derived class code (which is a good thing).
75) The class keyword brings private access for the members. But that includes inheritance; and private inheritance is not the object-oriented way. Private inheritance is the (now not very popular) mixin way.
76) Don't be trapped into mentioning base class names any more than you have to. Use the "typedef BaseClass super" trick to avoid it. And don't use any class name as part of a function name.
77) [As mentioned earlier] Beware if you think you're overloading a member function from a base class. A derived class member function with the same name as a base class member function name-hides the base class one. A using declaration can sort this out.
78) Unless you use virtual member functions, C++ doesn't give you polymorphic behavior. By default C++ uses the pointer type to select member functions. That way lies redundant overriding and inconsistent binding.
79) Don't be misled into thinking that there's anything "ghostly" or "not really there" about virtual member functions. Pronounce "virtual" as "polymorphic". (The ghostly ones are the pure virtuals.)
80) It's not a compile error to forget to pass a derived class object by pointer or reference, when the parameter declaration was for a base class object. What will happen is that the derived bits will be "sliced" off as the argument is copied onto the stack.
81) Although making a member function polymorphic (virtual) means that the object selects the member function, it's still the pointer type that selects the access category. If you override a member function in a different access category, a) we hope you've done it to be more restrictive, not less, and b) the type of the pointer chooses the access category determining class.
82) Don't be misled by the unforgivably obscure syntax (= 0) into thinking that pure virtual member functions are some dark and dusty corner that is to be ignored. They are pivotal to good OO.
83) Don't imagine that a class with no code and no data members is worthless. A class with nothing but pure virtual member functions (a pABC) is an excellent and simple type that classes can implement.
84) Don't be misled by some books (or even libraries and frameworks) into thinking it's OK to instantiate base classes. It's almost never a good idea: you end up with a class doing three jobs, and two is bad enough.
85) If you disregard the earlier advice to allow one or none of your base classes to have implementation, you will also and always have to consider whether your inheritance should be virtual inheritance.
86) Don't imagine that because the RTTI (run-time type identification) was introduced, you should use it. Treat "What's your class?" as a rude question and use it only once in a blue moon. Overuse of the RTTI probably indicates an architecture that is failing: not having enough 1:m association relationships, or the 1:m association relationships having the wrong types.
87) It's not an afterthought (I hope) that the type_info object supports one in avoiding mentioning a class by name. Avoid building class names into anything but declarations. And even in declarations avoid concrete class names as types.
88) Putting "using namespace std" makes a very large set of unseen names available to clash in "what on earth's going on" kinds of ways with your identifiers - particularly at some point in the future.
89) There might be more namespaces than you thought in which C++ will look for a function. C++ also searches for a function in the namespaces in which any of a function's arguments of class type are defined.
90) If any function you call (or that it calls, or ...) could throw, you need to be an order of magnitude more careful a programmer than you would otherwise have needed.
91) A derived object catcher that follows a catcher for its base class can never be reached.
92) Although you don't have to catch exception objects by reference, getting into the habit of doing anything else is asking for trouble. See bit slicing.

This list isn't complete, there are several things we have discussed in the past that aren't listed there. And then there are other traps/troubles not listed there because they are specific to D and probably absent from C++.

Bye,
bearophile
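A few of the items above can be checked mechanically. What follows is a minimal C++ sketch, not taken from the original list or from this thread, covering items 10, 13 and 33. The helper names in_range_chained and in_range are made up for the illustration; PI and theta are borrowed from item 33. Every claim is a static_assert, so the snippet compiling is the check:

```cpp
#include <type_traits>

// Item 10: a leading zero turns a decimal-looking literal into octal.
constexpr int padded = 010;
static_assert(padded == 8, "010 is octal, i.e. eight");

// Item 13: an unsuffixed floating-point literal has type double, not float.
static_assert(std::is_same<decltype(1.5), double>::value, "1.5 is a double");
static_assert(std::is_same<decltype(1.5f), float>::value, "1.5f is a float");

// Item 33: the chained form parses as (-PI <= theta) <= PI. The inner
// comparison yields a bool, which converts to 0 or 1, and both are <= PI.
// Compilers may warn about this form, but they accept it.
constexpr double PI = 3.141592653589793;

constexpr bool in_range_chained(double theta) {
    return -PI <= theta <= PI;
}

// What was presumably meant: two comparisons joined with &&.
constexpr bool in_range(double theta) {
    return -PI <= theta && theta <= PI;
}

static_assert(in_range_chained(100.0), "chained form accepts any theta");
static_assert(!in_range(100.0), "explicit form rejects 100.0");
static_assert(in_range(0.0), "explicit form accepts 0.0");
```

All three are cases where C++ accepts the code silently or with at most a warning, which is the property the list keeps pointing at.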
Nov 06 2008
parent Walter Bright <newshound1 digitalmars.com> writes:
bearophile wrote:
 This list isn't complete, there are several things we have discussed
 in the past that aren't listed there.

For one thing, it doesn't mention concurrency gotchas.
Nov 06 2008