digitalmars.D - Proposal for Implicit Conversion of Types

reply "Rioshin an'Harthen" <rharth75 hotmail.com> writes:
(Note: This post is best viewed with a fixed-width font)

Proposal for Implicit Conversion of Types
=========================================

This proposal spawned from the discussion "No more implicit
conversion real->complex?!" between myself and Don Clugston
after the change introduced in D version 0.150.


References
----------

See the thread "No more implicit conversion real->complex?!"
http://www.digitalmars.com/d/archives/digitalmars/D/35512.html


Rationale
---------

D tries to be mathematically correct, and does a good job of it, e.g.
in initializing floating-point variables to NaN. Although this has
raised some controversy on the newsgroup, Walter's take on it has
remained constant: if we could, we would have a corresponding
initialization for integral values.

Mathematical correctness (at least for me) implies implicit conversion
from real values to complex values, which was removed in version
0.150. Hence the discussion between me and Don Clugston in the
aforementioned thread, during which this proposal was mainly hashed out.


Problem
-------

The problem which 0.150 fixed was the following:

real sin(real x);
creal sin(creal c);

sin(3.0); // error - multiple matching overloads


Proposal
--------

The D language has 24 basic data types, which can be divided into 8
type families. Each family is named after its archetype, the largest
type in the family. These are:

 archetype | types (in order of smallest to largest)
-----------+-----------------------------------------
 void      | void
 bool      | bool
 cent      | byte, short, int, long, cent
 ucent     | ubyte, ushort, uint, ulong, ucent
 real      | float, double, real
 ireal     | ifloat, idouble, ireal
 creal     | cfloat, cdouble, creal
 dchar     | char, wchar, dchar
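
The table can be written down as a small lookup structure. What follows is a minimal Python sketch, meant as a model for discussion rather than D or compiler code. The names FAMILIES, archetype_of and wider_in_family are made up for this illustration and do not come from the proposal.

```python
# Type families from the table: archetype -> members, smallest to largest.
FAMILIES = {
    "void":  ["void"],
    "bool":  ["bool"],
    "cent":  ["byte", "short", "int", "long", "cent"],
    "ucent": ["ubyte", "ushort", "uint", "ulong", "ucent"],
    "real":  ["float", "double", "real"],
    "ireal": ["ifloat", "idouble", "ireal"],
    "creal": ["cfloat", "cdouble", "creal"],
    "dchar": ["char", "wchar", "dchar"],
}


def archetype_of(type_name):
    """Return the archetype (family name) that a basic type belongs to."""
    for archetype, members in FAMILIES.items():
        if type_name in members:
            return archetype
    raise ValueError("not a basic type: " + type_name)


def wider_in_family(type_name):
    """Return the larger types of the same family, smallest first."""
    members = FAMILIES[archetype_of(type_name)]
    return members[members.index(type_name) + 1:]
```

In this model the archetype is always the last (largest) member of its family, which is how the table above names them.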

On a function call with multiple overloaded versions matching, do the
following:

1. Try an exact match

   Look for a function with the signature (disregarding return type)
   exactly matching the call to said function.

   In the problem above, the 3.0 in the call to sin is double. We do not
   have a version of sin for double, so we can't match exactly.

2. Try a widening conversion

   Look for a function whose signature (disregarding return type)
   uses the smallest type that is larger than the current one and
   belongs to the same family of types.

   In the problem above, since 3.0 is a double, and we can't match an
   exact double, try looking for a real version, which we find. Use it.

3. Try a semantic changing conversion

   Look for a function with a signature where the semantics change
   according to the rules specified below. This time we switch to a type
   not having the same archetype as the current type. We still prefer
   the smallest type possible, e.g. double cannot be converted into
   cfloat (loss of data), but cdouble is possible.

   In the example above, remove the function real sin(real x). Now
   call it with the same 3.0 double. We can't match a double parameter,
   nor a widened floating point value, so try with a change in semantics,
   and since complex numbers are a superset of real numbers, try complex
   numbers. We match creal sin(creal c).
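
The three steps can be sketched for single-argument functions as follows. This is a Python model under stated assumptions, not compiler code: only the real and creal families and the single real -> creal rule are modelled, each overload is represented by its parameter type alone, and the position of a type within its family stands in for its size. The names resolve and family_and_rank are hypothetical.

```python
FAMILIES = {
    "real":  ["float", "double", "real"],
    "creal": ["cfloat", "cdouble", "creal"],
}

# Only one cross-family rule is modelled here: real family -> creal family.
SEMANTIC = {"real": ["creal"]}


def family_and_rank(type_name):
    """Return (archetype, position within the family) for a type."""
    for archetype, members in FAMILIES.items():
        if type_name in members:
            return archetype, members.index(type_name)
    raise ValueError("unknown type: " + type_name)


def resolve(arg_type, overloads):
    """Return (chosen parameter type, step used), or raise TypeError."""
    # Step 1: exact match.
    if arg_type in overloads:
        return arg_type, 1

    archetype, rank = family_and_rank(arg_type)

    # Step 2: smallest wider type within the same family.
    for candidate in FAMILIES[archetype][rank + 1:]:
        if candidate in overloads:
            return candidate, 2

    # Step 3: semantics-changing conversion, never to a smaller type.
    for target in SEMANTIC.get(archetype, []):
        for candidate in FAMILIES[target][rank:]:
            if candidate in overloads:
                return candidate, 3

    raise TypeError("no overload accepts " + arg_type)
```

With both overloads of sin present, resolve("double", ["real", "creal"]) picks "real" at step 2. With only the creal overload, resolve("double", ["creal"]) picks "creal" at step 3.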

Semantic Changing Conversion

Allow the following semantics-changing conversions implicitly. The
list covers what D currently allows (as far as I can remember), as
well as conversions that are good to have:

 original | new archetypes
----------+----------------
 void     | -
 bool     | -
 cent     | bool, real, creal
 ucent    | bool, real, creal
 real     | creal
 ireal    | creal
 creal    | -
 dchar    | bool, cent

The main feature of this table is that narrowing conversions
must be specified explicitly, while widening ones should go
through as is. Boolean conversions, however, work the opposite
way; see below for more information.

For example, if the type float is to be converted, the preferred
option is to convert within the family, first trying double, then real.
If this is not possible, then try converting into complex numbers,
first cfloat, second cdouble, and finally creal.

As another example, for double to be converted, we prefer trying
for a real. If this fails, we try complex numbers. cfloat is too
small a type to be able to handle the double, so we can't convert
to it implicitly; instead we try cdouble first, and then creal.
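
The order in which target types are tried can be sketched in Python as below. Several things here are assumptions for illustration and not part of the proposal: the bit widths (real is taken as 80 bits, which is platform dependent, and complex types are measured per component), the use of a plain width comparison as a stand-in for "no loss of data", and the names WIDTHS, SEMANTIC and candidates. The sketch follows the conversion table in sending the real family to the complex family (cfloat, cdouble, creal).

```python
# archetype -> {type: assumed width in bits}, smallest to largest.
WIDTHS = {
    "void":  {"void": 0},
    "bool":  {"bool": 1},
    "cent":  {"byte": 8, "short": 16, "int": 32, "long": 64, "cent": 128},
    "ucent": {"ubyte": 8, "ushort": 16, "uint": 32, "ulong": 64, "ucent": 128},
    "real":  {"float": 32, "double": 64, "real": 80},
    "ireal": {"ifloat": 32, "idouble": 64, "ireal": 80},
    "creal": {"cfloat": 32, "cdouble": 64, "creal": 80},
    "dchar": {"char": 8, "wchar": 16, "dchar": 32},
}

# The semantics-changing conversion table: original -> new archetypes.
SEMANTIC = {
    "void": [], "bool": [], "creal": [],
    "cent": ["bool", "real", "creal"],
    "ucent": ["bool", "real", "creal"],
    "real": ["creal"],
    "ireal": ["creal"],
    "dchar": ["bool", "cent"],
}


def candidates(type_name):
    """List the types tried for an argument, most preferred first."""
    for archetype, family in WIDTHS.items():
        if type_name in family:
            break
    else:
        raise ValueError("not a basic type: " + type_name)
    width = family[type_name]
    # Exact match first, then wider members of the same family.
    order = [t for t, w in family.items() if w >= width]
    # Then other families, skipping targets that are too small.
    for target in SEMANTIC[archetype]:
        for t, w in WIDTHS[target].items():
            if target == "bool" or w >= width:
                order.append(t)
    return order
```

For float this gives float, double, real, cfloat, cdouble, creal; for double it gives double, real, cdouble, creal. The width test is too permissive for integer to floating point conversions (an int does not fit a float's mantissa), a case the proposal does not spell out.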

Boolean Conversions

The conversion from bool to integral values is not a part of
this proposal. The conversion of signed and unsigned integers
to bool is allowed implicitly, mainly because a lot of code
uses the feature in question. This proposal also allows
character types to convert implicitly into bool, to make it
easier to test for the null character when reading characters
from a stream.

This is because a boolean is not an integer or a character,
but integers and characters can be treated as booleans.
Real numbers can easily be treated as booleans as well, but
it is much less common to see code written treating reals
as bools, and treating imaginary and complex numbers as
booleans makes even less sense. Thus, the decision was made
not to allow implicit conversion of any floating point type,
whether real, imaginary, or complex, into boolean true/false
values.
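
The boolean rule reduces to a membership test on the archetype. A minimal Python sketch, with the names BOOL_CONVERTIBLE and converts_implicitly_to_bool invented for this illustration:

```python
FAMILIES = {
    "void":  ["void"],
    "bool":  ["bool"],
    "cent":  ["byte", "short", "int", "long", "cent"],
    "ucent": ["ubyte", "ushort", "uint", "ulong", "ucent"],
    "real":  ["float", "double", "real"],
    "ireal": ["ifloat", "idouble", "ireal"],
    "creal": ["cfloat", "cdouble", "creal"],
    "dchar": ["char", "wchar", "dchar"],
}

# Archetypes whose members may convert implicitly to bool:
# signed integers, unsigned integers and characters.
BOOL_CONVERTIBLE = {"cent", "ucent", "dchar"}


def converts_implicitly_to_bool(type_name):
    """True if the proposal lets this type convert implicitly to bool."""
    for archetype, members in FAMILIES.items():
        if type_name in members:
            return archetype in BOOL_CONVERTIBLE
    raise ValueError("not a basic type: " + type_name)
```

So int, ulong and wchar qualify, while double, ifloat and creal need an explicit cast or comparison.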

Multi-Argument Functions

Multi-argument functions pose a problem for implicit
conversion. Given (example by Don Clugston):

1: func(real, creal);
2: func(creal, real);
3: func(creal, creal);

which alternative should

func(3.0, 2.0)

match, or should it match at all? Originally, I said the third
one; now I'm not so sure about that. After thinking about this
some more for the purpose of writing this proposal, I'm more
inclined to having the compiler give an error in this case.

However, giving an error is not an ideal solution - some scheme
to select a function to call might be better. At least an error
forces the developer to think about which version he wants to
call, and either add explicit casts where necessary or write a
wrapper like

real func(real a, real b)
{
   return cast(real) func(cast(creal) a, cast(creal) b);
}

for it.
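
One way to make the multi-argument case precise is sketched below in Python. The tie-break used here is an assumption of this sketch and not something the proposal settles: each argument gets a cost equal to the position of the parameter type in its preference order, and an overload wins only if it is at least as good as every other viable overload on every argument. Only the real and creal families are modelled, and the names candidates, cost and resolve are hypothetical.

```python
FAMILIES = {
    "real":  ["float", "double", "real"],
    "creal": ["cfloat", "cdouble", "creal"],
}
SEMANTIC = {"real": ["creal"]}


def candidates(arg_type):
    """Preference order for one argument: same family first, then others."""
    for archetype, members in FAMILIES.items():
        if arg_type in members:
            rank = members.index(arg_type)
            order = members[rank:]
            for target in SEMANTIC.get(archetype, []):
                order = order + FAMILIES[target][rank:]
            return order
    raise ValueError("unknown type: " + arg_type)


def cost(arg_type, param_type):
    """Position of param_type in the preference order, or None if unusable."""
    order = candidates(arg_type)
    return order.index(param_type) if param_type in order else None


def resolve(arg_types, overloads):
    """Pick the overload (a tuple of parameter types) or raise TypeError."""
    viable = []
    for params in overloads:
        if len(params) != len(arg_types):
            continue
        costs = [cost(a, p) for a, p in zip(arg_types, params)]
        if None not in costs:
            viable.append((params, costs))
    if not viable:
        raise TypeError("no matching overload")
    # Assumed tie-break: a winner must be no worse on every argument.
    winners = [
        params for params, costs in viable
        if all(all(c <= o for c, o in zip(costs, other)) for _, other in viable)
    ]
    if len(winners) != 1:
        raise TypeError("ambiguous call: " + ", ".join(str(p) for p, _ in viable))
    return winners[0]
```

Under this rule, calling with two doubles against the three overloads listed earlier raises the ambiguity error, since each of the first two is better on one argument and worse on the other. Adding a (real, real) overload, as the wrapper does, makes the call resolve to it.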


Errors and Warnings
-------------------

I am going to describe command line flags passed to the compiler
as if the compiler were gcc, which I know best. This is just
for my own ease; the flags should be easy to understand and to
convert to dmd-specific ones.

The compiler should generate an error if steps 1 to 3 in the
above proposal fail.

The compiler should also generate an error if there are multiple
legal matches to a multi-argument function, see the topic above.

It would be good to have a compiler flag, e.g. -Wsemantic, to
optionally display warnings for semantics-changing implicit
conversions. This is so that the author of software can, if
necessary, spot possible bugs caused by implicit conversions
that cross archetype boundaries.

-- 
  Mikael Segercrantz
  software engineer 
Apr 04 2006
parent reply Don Clugston <dac nospam.com.au> writes:
Rioshin an'Harthen wrote:
> Proposal for Implicit Conversion of Types
> =========================================
>
> This proposal spawned from the discussion "No more implicit
> conversion real->complex?!" between myself and Don Clugston
> after the change introduced in D version 0.150.
>
>  archetype | types (in order of smallest to largest)
> -----------+-----------------------------------------
>  void      | void
>  bool      | bool
>  cent      | byte, short, int, long, cent
>  ucent     | ubyte, ushort, uint, ulong, ucent
>  real      | float, double, real
>  ireal     | ifloat, idouble, ireal
>  creal     | cfloat, cdouble, creal
>  dchar     | char, wchar, dchar

Very well presented!

There's one aspect that I think could be a problem -- conversions between
signed and unsigned types.
As written, that would mean that ushort -> ulong is preferred over
ushort -> short. Since the language currently allows short and ushort to be
interchanged without error (unless you enable warnings), I don't think the
lookup rules can be different to that.

So I think that right now, the cent/ucent categories will need to be
combined: if more than one such conversion is possible, it's an error.
Otherwise we end up in the 'should signed/unsigned conversions be an
error' debate which has historically been unfruitful.
Apr 05 2006
parent "Rioshin an'Harthen" <rharth75 hotmail.com> writes:
"Don Clugston" <dac nospam.com.au> wrote in message 
news:e104fr$1hgk$1 digitaldaemon.com...
> Rioshin an'Harthen wrote:
>> Proposal for Implicit Conversion of Types
>> =========================================
>>
>> This proposal spawned from the discussion "No more implicit
>> conversion real->complex?!" between myself and Don Clugston
>> after the change introduced in D version 0.150.
>>
>>  archetype | types (in order of smallest to largest)
>> -----------+-----------------------------------------
>>  void      | void
>>  bool      | bool
>>  cent      | byte, short, int, long, cent
>>  ucent     | ubyte, ushort, uint, ulong, ucent
>>  real      | float, double, real
>>  ireal     | ifloat, idouble, ireal
>>  creal     | cfloat, cdouble, creal
>>  dchar     | char, wchar, dchar
>
> Very well presented!

Thank you. :)

> There's one aspect that I think could be a problem -- conversions between
> signed and unsigned types.
> As written, that would mean that ushort -> ulong is preferred over
> ushort -> short. Since the language currently allows short and ushort to be
> interchanged without error (unless you enable warnings), I don't think the
> lookup rules can be different to that.
>
> So I think that right now, the cent/ucent categories will need to be
> combined: if more than one such conversion is possible, it's an error.
> Otherwise we end up in the 'should signed/unsigned conversions be an
> error' debate which has historically been unfruitful.

True, it is a problem with this proposal. It is not enough to add
implicit semantic-changing conversions between the cent and ucent
archetypes, since the lookup would still prefer the widening
conversions.

However, I feel doubtful about combining the archetypes together. I
would actually prefer to force signed to unsigned conversions to be
explicit, since negative numbers can be "lost". Unsigned to signed is
not that much of a problem, since (unless the type is ucent) it is
possible to convert to a wider type, e.g. ushort to int.

Still, I am willing to allow the combination of those archetypes,
despite the doubt I mentioned. It would allow existing code to
continue working, which is one of the main intents of the proposal,
although I took some liberty with the boolean type in this case.
Apr 05 2006