digitalmars.D - the cast mess

"Vladimir Panteleev" <thecybershadow gmail.com> writes:

Hello digitalmars.D,

I'm concerned about the cast() situation with D. Right now, we use the same
keyword and syntax to do three different things:

1) The obvious plain-old-data cast (casting one pointer type to another,
structures, etc.)
2) Type conversion casts (casting floats to ints)
3) Dynamic casts for classes (may result in null)
(There may be more magic going on behind the scenes when e.g. casting between
arrays with different element sizes - a proposal I saw some time ago to replace
start/length with start/end would get rid of that)

I attached a small program demonstrating a few uses of cast that may have
unexpected results for new users. 

I'd like to ask: am I the only one who thinks that this inconsistent behavior
of one of the language's base features sticks out like a sore thumb?
I can see where this mess is coming from - the first two are from C, and the
third is from the more dynamic languages (C++ has dynamic_cast but it's a
separate language construct). I'm not aware of a language that abuses the same
cast operator for such varying effects.
Someone doing a code review will have to put in extra effort to tell apart safe
casts from unsafe ones.
In a metaprogramming context, nothing can be said about "cast(T)x" - whether
it's type-safe, what overhead it will incur, etc.

I have the following suggestions to remedy the problem. They're probably too
drastic even for D2, so even if they'd get accepted - probably for D3:

1) Use the cast() keyword/syntax only to change the compiler's representation
of the type. The source and target types must be of the same size. Practically
it would be the same as *cast(T)(&x) with the current system, except with the
additional size check. The compiler should issue a warning when previous
versions would have performed data conversion.

2) Move all "magic" value type conversions (float => int etc.) to intrinsic
functions declared in the standard library.
An additional argument for this move is that, IMO, taking the integer part of a
floating-point number should be as accessible as rounding the number to the
nearest integer, or any other float=>int conversions (trunc/ceil/etc.)

3) Switch to a new syntax for dynamic_cast-style casts. I propose the "c is T"
syntax, which should function identically to the current "cast(T)c" and is much
more readable.
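
A rough sketch of how points 1 and 2 might look in today's D, offered as an illustration rather than as anything taken from the post. `reinterpret` is a hypothetical helper name; the rounding functions are the existing ones from `std.math`. The expected bit pattern assumes a 32-bit IEEE 754 `float`.

```d
import std.math : ceil, floor, round, trunc;

/// Hypothetical helper (not part of the language or Phobos):
/// reinterprets the bytes of `x` as a `T`, but only when both
/// types have the same size.
T reinterpret(T, S)(ref S x)
if (T.sizeof == S.sizeof)
{
    return *cast(T*) &x;
}

void main()
{
    float f = -5.7f;

    // Point 1: same-size reinterpretation; a size mismatch is rejected
    // at compile time by the template constraint.
    assert(reinterpret!int(f) == -1061788058);
    static assert(!__traits(compiles, reinterpret!long(f)));

    // Point 2: each float => int policy is spelled out by name.
    assert(cast(int) trunc(f) == -5); // drop the fractional part
    assert(cast(int) floor(f) == -6); // round toward negative infinity
    assert(cast(int) ceil(f)  == -5); // round toward positive infinity
    assert(cast(int) round(f) == -6); // round to nearest
}
```

The template constraint plays the role of the "additional size check" from point 1; whether such a check should be an error or a warning is left open here.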

Alternatively, we could bring back C-style casts for safe and "magic" type
conversions (points 2 and 3), and leave cast() (or rename it to something
scarier) for unsafe "reinterpret" casts.

-- 
Best regards,
 Vladimir                          mailto:thecybershadow gmail.com

Attachment: safetest.d

/// compile with dmd -safe safetest.d
module safetest;

struct X { int x; }
struct Y { float y; }

class A {}
class B : A {}
class C {}

void main()
{
    /// POD cast
    X x = X(5);
    Y y = cast(Y)x;

    /// type conversion cast - works like an intrinsic function
    float f = -5.7f;
    assert(cast(int)f == -5);
    /// magic! the compiler turned the float to an int by taking the integer part

    float[] fa = [1.2, 2.3, 3.4];
    auto ia = cast(int[])fa;
    // assert(ia[0]==1);
    /// the magic vanishes when dealing with arrays - this might appear unexpected/inconsistent to new users

    /// dynamic_cast behavior
    A a = new B;
    assert(a !is null);
    B b = cast(B)a;
    assert(b !is null);
    C c = cast(C)a;
    assert(c is null); /// more magic! the compiler will look at the class's RTTI to see if we can safely downcast

    // c = *cast(C*)(&a); /// this seems to be the only way to do an unsafe (zero-overhead) cast -
                          /// take the reference's pointer, cast that then dereference that -
                          /// quite ugly and the object reference must be in a variable
                          /// (doesn't work in safe mode, obviously)

    auto aa = cast(A[])fa; /// perfectly valid code, even with -safe
    //delete aa[0]; /// this will cause an access violation
    /// notice that we can cast arrays of ints to classes to generate fake pointers, and "break" -safe
}
May 17 2009
"Tim Matthews" <tim.matthews7 gmail.com> writes:
Having a float -5.7 magically turn into -5 is usually what is wanted for a  
float to int conversion. I am not sure if anyone would want it to be  
converted to -1061788058 instead but dmd currently allows both by having  
the normal way easiest to do and having the unusual way achievable by
placing it in an array or struct for the cast.

I am not sure whether I like the inconsistency but is there a use case
for when -5.7 should ever be -1061788058?
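
For reference, both numbers can be reproduced with a few lines of D. This is only a sketch: the helper names are made up for illustration, and the bit pattern assumes a 32-bit IEEE 754 `float`. One situation where the raw pattern is what you want is inspecting the bits of a float directly, such as its sign bit.

```d
/// Value conversion: the fractional part is discarded.
int truncated(float f)
{
    return cast(int) f;
}

/// Reinterpretation: the same 32 bits, read as an int.
int rawBits(float f)
{
    return *cast(int*) &f;
}

/// Casting the array reinterprets its memory instead of
/// converting each element.
int firstElementViaArrayCast(float f)
{
    float[] fa = [f];
    int[] ia = cast(int[]) fa;
    return ia[0];
}

/// Reads the IEEE 754 sign bit out of the raw pattern.
bool signBitSet(float f)
{
    return (rawBits(f) >>> 31) == 1;
}

void main()
{
    assert(truncated(-5.7f) == -5);
    assert(rawBits(-5.7f) == -1061788058); // 0xC0B66666
    assert(firstElementViaArrayCast(-5.7f) == -1061788058);
    assert(signBitSet(-5.7f));
    assert(!signBitSet(5.7f));
}
```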

When such invalid casts like "auto aa = cast(A[])fa;" are visible at  
compile time why doesn't all the error throwing happen at compile time?
May 18 2009
Tomas Lindquist Olsen <tomas.l.olsen gmail.com> writes:
On Mon, May 18, 2009 at 10:28 AM, Tim Matthews <tim.matthews7 gmail.com> wrote:
 Having a float -5.7 magically turn into -5 is usually what is wanted for a
 float to int conversion. I am not sure if anyone would want it to be
 converted to -1061788058 instead but dmd currently allows both by having the
 normal way easiest to do and having the unusual way achievable by placing
 it in an array or struct for the cast.

 I am not sure whether I like the inconsistency but is there a use case for
 when -5.7 should ever be -1061788058?

 When such invalid casts like "auto aa = cast(A[])fa;" are visible at compile
 time why doesn't all the error throwing happen at compile time?

cast(T) is also very poorly documented.
May 18 2009
Kagamin <spam here.lot> writes:
Tim Matthews Wrote:

 When such invalid casts like "auto aa = cast(A[])fa;" are visible at  
 compile time why doesn't all the error throwing happen at compile time?

Currently, cast behaves mostly like a C cast, the special cases being dynamic casts and opCast. So it's very unsafe.
May 18 2009
Don <nospam nospam.com> writes:
Vladimir Panteleev wrote:
 Hello digitalmars.D,
 
 I'm concerned about the cast() situation with D. Right now, we use the same
keyword and syntax to do three different things:
 
 1) The obvious plain-old-data cast (casting one pointer type to another,
structures, etc.)
 2) Type conversion casts (casting floats to ints)
 3) Dynamic casts for classes (may result in null)
 (There may be more magic going on behind the scenes when e.g. casting between
arrays with different element sizes - a proposal I saw some time ago to replace
start/length with start/end would get rid of that)

4) removing const/immutable.
 
 I attached a small program demonstrating a few uses of cast that may have
unexpected results for new users. 
 
 I'd like to ask: am I the only one who thinks that this inconsistent behavior
of one of the language's base features sticks out like a sore thumb?
 I can see where this mess is coming from - the first two are from C, and the
third is from the more dynamic languages (C++ has dynamic_cast but it's a
separate language construct). I'm not aware of a language that abuses the same
cast operator for such varying effects.
 Someone doing a code review will have to put in extra effort to tell apart
safe casts from unsafe ones.
 In a metaprogramming context, nothing can be said about "cast(T)x" - whether
it's type-safe, what overhead it will incur, etc.
 
 I have the following suggestions to remedy the problem. They're probably too
drastic even for D2, so even if they'd get accepted - probably for D3:
 
 1) Use the cast() keyword/syntax only to change the compiler's representation
of the type. The source and target types must be of the same size. Practically
it would be the same as *cast(T)(&x) with the current system, except with the
additional size check. The compiler should issue a warning when previous
versions would have performed data conversion.
 
 2) Move all "magic" value type conversions (float => int etc.) to intrinsic
functions declared in the standard library.
 An additional argument for this move is that, IMO, taking the integer part of
a floating-point number should be as accessible as rounding the number to the
nearest integer, or any other float=>int conversions (trunc/ceil/etc.)

Personally, I _hate_ the fact that you can cast a float to an int. It's
pretty amazing to me that Intel has done three different hardware
modifications to get around a careless decision in the original C spec.

 3) Switch to a new syntax for dynamic_cast-style casts. I propose the "c is T"
syntax, which should function identically to the current "cast(T)c" and is much
more readable.
 
 Alternatively, we could bring back C-style casts for safe and "magic" type
conversions (points 2 and 3), and leave cast() (or rename it to something
scarier) for unsafe "reinterpret" casts.
 

May 18 2009
"Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 18 May 2009 05:09:10 -0400, Don <nospam nospam.com> wrote:

 Vladimir Panteleev wrote:
 Hello digitalmars.D,
  I'm concerned about the cast() situation with D. Right now, we use the  
 same keyword and syntax to do three different things:
  1) The obvious plain-old-data cast (casting one pointer type to  
 another, structures, etc.)
 2) Type conversion casts (casting floats to ints)
 3) Dynamic casts for classes (may result in null)
 (There may be more magic going on behind the scenes when e.g. casting  
 between arrays with different element sizes - a proposal I saw some  
 time ago to replace start/length with start/end would get rid of that)

4) removing const/immutable.

5) casting TO immutable. Note that this is far too easy, since the
implications are not provable by the compiler.

I'm all for overhauling the cast system, as long as we can make the syntax
better than C++ (and I suspect we can).

-Steve
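
To illustrate point 5, here is a minimal sketch (function names are invented for the example). The cast to immutable compiles without any diagnostic even though the mutable slice still refers to the same memory, while `.idup` produces an independent copy that really is immutable.

```d
/// True if the slice produced by cast(immutable) shares memory
/// with the mutable slice it came from.
bool castAliasesSource()
{
    int[] data = [1, 2, 3];
    // Accepted as-is: the compiler cannot prove that no mutable
    // reference to this memory remains.
    immutable(int)[] frozen = cast(immutable(int)[]) data;
    return cast(const(void)*) frozen.ptr is cast(const(void)*) data.ptr;
}

/// True if the slice produced by .idup shares memory with its source.
bool idupAliasesSource()
{
    int[] data = [1, 2, 3];
    immutable(int)[] copy = data.idup; // allocates a fresh copy
    return cast(const(void)*) copy.ptr is cast(const(void)*) data.ptr;
}

void main()
{
    assert(castAliasesSource());  // same memory, still writable via `data`
    assert(!idupAliasesSource()); // separate memory
}
```

The example deliberately stops short of writing through `data` after the cast, since modifying memory that is also viewed as immutable is undefined behavior.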
May 18 2009
grauzone <none example.net> writes:
I'd propose to remove cast(T) from the language, and replace it with
several special template functions in object.d:
- a "B cast!(A, B)(A x);", which _always_ does the safest thing; it
would allow the following: casting between objects and interfaces, all
implicit casts allowed by the language, and possibly use declared opCast
from user data types
- a constCast() to fight with the const/immutable/shared mess
- a lossyCast() for numeric conversions
- possibly a checkedCast() for numeric conversions, which dynamically
checks for overflows and throws an exception on failure
- reinterpretCast(), the most evil thing that can exist

Note that some casts need compiler support. For example, float<->int
conversions are done with special CPU instructions at the lowest level. To
allow implementation of the cast functions in pure D code, these
instructions can be provided as intrinsic functions. (Just like bit scan or
port I/O instructions.)

The worst thing about the current cast(T) operator is that it easily
allows reinterpret cast. This is SO NOT SAFE! Especially with arrays, it
doesn't do what the user expects. It's a hidden reinterpret
cast. This really should be changed.

(Oh btw., with my proposal, you could use cast!(T) for proper array 
casts, which allocate a new array, casts each element with cast!(T), and 
return it; or you could use reinterpretCast() to get the current behavior.)
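
A partial sketch of what some of these functions could look like as ordinary D templates. The names `lossyCast` and `checkedCast` come from the list above, but the signatures and bodies are guesses; `convertElements` is a made-up name for the element-wise array conversion, since `cast` is a keyword and cannot be used as a function name. `std.conv.to` is used for the overflow check.

```d
import std.conv : ConvOverflowException, to;
import std.exception : assertThrown;

/// Assumed signature: numeric conversion that may silently lose information.
T lossyCast(T, S)(S x)
if (__traits(isArithmetic, T) && __traits(isArithmetic, S))
{
    return cast(T) x;
}

/// Assumed signature: numeric conversion that throws when the value
/// does not fit into the target type.
T checkedCast(T, S)(S x)
if (__traits(isArithmetic, T) && __traits(isArithmetic, S))
{
    return to!T(x); // std.conv.to range-checks narrowing conversions
}

/// Hypothetical name: converts each element into a newly allocated
/// array instead of reinterpreting the existing memory.
T[] convertElements(T, S)(S[] xs)
{
    auto result = new T[xs.length];
    foreach (i, x; xs)
        result[i] = lossyCast!T(x);
    return result;
}

void main()
{
    assert(lossyCast!int(-5.7f) == -5);
    assert(lossyCast!byte(300) == 44); // wraps silently
    assert(checkedCast!byte(100) == 100);
    assertThrown!ConvOverflowException(checkedCast!byte(300));

    float[] fa = [1.2f, 2.3f, 3.4f];
    assert(convertElements!int(fa) == [1, 2, 3]);
}
```

Unlike `cast(int[])fa`, the element-wise version gives `[1, 2, 3]` here, which is probably what a new user expects.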
May 18 2009
Christopher Wright <dhasenan gmail.com> writes:
grauzone wrote:
 The worst thing about the current cast(T) operator is that it easily
 allows reinterpret cast. This is SO NOT SAFE! Especially with arrays, it
 doesn't do what the user expects. It's a hidden reinterpret
 cast. This really should be changed.

I use this functionality in a few places. However, there are several other ways of doing it.
May 18 2009