D - bug?
- Dario (17/17) Sep 02 2002 I'm not sure this is a bug.
- Walter (4/21) Sep 03 2002 D follows the C rules for operand promotion, which makes the result an
- Dario (3/5) Sep 04 2002 Uh, you're right.
- Walter (7/12) Sep 04 2002 The rationale for using the C rules is that old C programmers like me
The rationale for using the C rules is that old C programmers like me
- Mac Reiter (35/47) Sep 04 2002 There is also, of course, the pragmatic reasoning that if you add two uc...
- Sean L. Palmer (121/169) Sep 05 2002 This still seems akin to me to forcing a type conversion to long or doub...
- Sean L. Palmer (90/266) Sep 05 2002 Oh one more thing: any pragma keywords should trigger a linked
- Alix Pexton (6/29) Sep 05 2002 Sean L. Palmer wrote:
- Sean L. Palmer (5/34) Sep 05 2002 It strips asserts anyway in Release.
- Mac Reiter (44/64) Sep 05 2002 The following discussion is going to talk about classes, because it talk...
- Walter (11/17) Sep 06 2002 presence
- Walter (11/31) Sep 06 2002 Good question. The differences are:
- Mac Reiter (42/58) Sep 05 2002 byte d=150, e=230, f=2;
- Sean L. Palmer (45/90) Sep 05 2002 is
I'm not sure this is a bug.
When you sum two ubytes (and not only when you sum), you get an int.
Shouldn't I get a ubyte, since both operands are ubytes?
Look at this code to experiment with it:
void func(int a)
{ printf("int\n"); }
void func(ubyte a)
{ printf("ubyte\n"); }
int main()
{
ubyte a = 4, b = 8;
func(a);
func(b);
func(a|b);
func(a+b);
return 0;
}
Sep 02 2002
D follows the C rules for operand promotion, which makes the result an
int. -Walter
"Dario" <supdar yahoo.com> wrote in message
news:al0c4t$15an$1 digitaldaemon.com...
I'm not sure this is a bug.
When you sum (but not only when you sum) two ubytes you get an int.
Shouldn't I get a ubyte, since both operands are ubytes?
Look at this code to experiment it:
void func(int a)
{ printf("int\n"); }
void func(ubyte a)
{ printf("ubyte\n"); }
int main()
{
ubyte a = 4, b = 8;
func(a);
func(b);
func(a|b);
func(a+b);
return 0;
}
Sep 03 2002
D follows the C rules for operand promotion, which makes the result an int. -Walter

Uh, you're right.
Anyway, that's not a logical rule, and it can be dangerous when using function overloading. But I guess you're not going to change the D rules.
Sep 04 2002
"Dario" <supdar yahoo.com> wrote in message news:al4od0$7nh$1 digitaldaemon.com...The rationale for using the C rules is that old C programmers like me <g> are so used to them they are like second nature, for good or ill. Changing the way they work will, I anticipate, cause much grief because without thinking people will expect them to behave like C, and so will find D more frustrating than useful.D follows the C rules for operand promotion, which makes the result an int. -WalterUh, you're right. Anyway that's not a logic rule and can be dangerous when using function overloading. But I guess you're not going to change D rules.
Sep 04 2002
In article <al5lsn$21iq$1 digitaldaemon.com>, Walter says...

The rationale for using the C rules is that old C programmers like me <g> are so used to them they are like second nature, for good or ill.

There is also, of course, the pragmatic reasoning that if you add two uchars, the result will not necessarily fit in a uchar any more (150 + 150 = 300, which requires at least a 'short', and ints are faster than shorts). So you put the result in a bigger type so that the answer is correct. Then if the programmer actually wanted to get a truncated result (which would be 44, unless I've screwed up somewhere along the way behind the scenes...), then he/she can always mask it off and/or cast it back down to a uchar later.

If this seems contrived, consider the following simple averaging code:

unsigned char a, b, c;
cin >> a >> b;
c = (a+b)/2;
cout << c;

Now, arguably, the value of (a+b)/2 can't ever be too large for a uchar. However, the compiler has to do this in stages, so it has to do (a+b) and store that somewhere, and then it has to do the /2 to get the result. If the "type" of a+b was unsigned char, and a user input 150 and 150 to the preceding code snippet, the result would be 22 (44/2), which would probably surprise them a great deal. With the staging, a+b converts up to an int, the math occurs, the second stage /2 occurs (also as int), and now the result (150) is held in an internal temporary int. The assignment system looks at c and truncates/casts down to unsigned char (probably issuing a warning along the way), which is fine, since 150 fits in a uchar. Everything is fine.

The compiler can't reasonably distinguish between this case and the function call case -- both are intermediate expressions. Most compilers consider it more important to maintain mathematically correct results than to maintain types, since the programmer can always force the type conversion if necessary, but would not be able to generate the correct result if the compiler discarded it during an intermediate stage.

I used to do some Visual Basic programming, and sometimes they did some of this type enforcement, and it turned some pieces of code that should have been trivial into horribly complicated state machines...

(I apologize if this posting feels offensive to anyone. I did not mean it to be such. I am, unfortunately, somewhat rushed at the moment and do not have the time to run my normal "tone" filters over what just rolled out of my fingers...)

Mac
Sep 04 2002
This still seems akin to me to forcing a type conversion to long or double in order to add two ints and divide by an int. And then what's going to go on if you add two longs and divide by a long... it does it internally in extra extra long? Where does it stop? Just always seemed too arbitrary to me.

I think personally I'd advocate a system where it does the arithmetic in a value at least as large as the two input values, so long as the result of the intermediate arithmetic would, at the end, fit back into the start type. But not that it has to do the math in the larger registers. Imposing a register type conversion forces extra mask operations into the code, or forces the waste of valuable register space on architectures such as Intel that do a fairly good job of dealing with byte registers efficiently. Math runs easier in SIMD registers also if you don't have to ensure intermediate precision. It definitely will slow things down to always require temporary arithmetic to be performed in int-sized registers... there's unpacking, then more of the larger ops (which maybe take more time per op, and take more ops as fewer can be done per cycle), then having to pack the result for storage.

I'd leave it completely unspecified, or introduce a language mechanism to control the intermediate result (such as casting to the desired size first). Either allow it to be controlled precisely or leave it completely undetermined, but please don't do it half-assed like C did.

One idea that came up in a previous discussion was a way to decorate a code block with attributes describing the desired goal of the optimizer... to optimize for speed, or space, or precision, or some combination. (There may be other valid goals as well, some interoperable with the above, some competing with the above.)
I'd love to standardize something like this, since it's always been a matter of using different pragmas or language extensions on different compilers, along with a combination of compiler switches and voodoo, and wrapping the whole thing in #ifdef THISCOMPILER >= THISVERSION crap. At least we have the version statement, but nothing specifically to direct the compiler yet, is there? Some syntax to give the compiler direct commands regarding parsing or code generation of the following or enclosed block.

It could be an attribute like final. Something like

fast          // by fastest load/store and operation
small         // by smallest size (parm is how small it has to be)
precise       // by range or fractional accuracy or both
exact         // in number of bits sorted into categories (sign, exponent, integer, mantissa, fraction)
wrap
clamp
bounded
boundschecked
unsafe

But that litters the global namespace. Something more like

pragma fast(1), precise(1e64, 1e-5)
{
    for (;;);
}

Where the parameters are optional and default to something reasonable. The compiler can then tailor how it handles the intermediates according to the high-level goals for the section: if the goal is speed, avoid unnecessary conversions, especially to slower types, at the expense of precision. If the goal is safety, do bounds checking and throw lots of exceptions. If the goal is precision, do all arithmetic in a precise intermediate format (extended maybe! or maybe just twice as big a register as the original types).

Or possibly:

type numeric(fast(1), precise(1e64, 1e-5)) mynumerictype;
type numeric(fast(1), precise(1, 1e-2))[4] small mycolortype;

mynumerictype a,b,c;
for (a = 1.0; (a -= 0.0125) >= 0; )
    plot(a, a^2, mycolortype{a^3, a^4, a^5, 1-a});

A couple more keywords might be the opposites of the above:

slow
large
imprecise
inexact
managed
safe

Not even sure what I'd use all those keywords for yet either. But the general idea behind this proposal is to have a way to make the intent of the programmer known in a very general way.
Common goals for a programmer involve optimizing for speed, size, precision, safety. The compiler has similar conflicting goals (commercial compilers do, anyway), so why not standardize the interface to it. Think of it as a formal way to give very broad hints, or requests, or strong suggestions to the compiler. ;)

In any case I would not make the result type of adding two chars be any larger than the largest explicit scalar present in the calculation. i.e.

byte b=8, c=9;
byte x = b + c; // stays in byte or larger in intermediate form
int y = b + c;  // intermediates done in int since result is int, and int is more precise than byte

The most general spec should be that results can arbitrarily have more precision than you expected, even if this seems like an error; relying on inaccuracies seems in general to be bad programming practice.

I want control over rounding modes and stuff like that in the language itself. I want control over endianness, saturation; give me as much control as possible and I'll find a use for most of it. ;)

Sean

"Mac Reiter" <Mac_member pathlink.com> wrote in message news:al5onu$2865$1 digitaldaemon.com...
[clip]
Sep 05 2002
Oh one more thing: any pragma keywords should trigger a linked
version-statement-testable flag so code can work in more than one way,
depending on surrounding or calling goal directives.
Imagine if "fast" or "precise" made it into the name mangling for function
overloading... or type overloading... or prerequisites, dbc!!!
template (T)
T operate_on(T a, T b)
in
{
assert(T.version(precision > 20))
}
body
{
version(fast && pentium_7_and_three_quarters)
{
return exp(a * log(b))/b;
}
version(precise && !broken_lexer_Dvprior1_03)
{
return a ** b;
}
}
typedef instance (pragma fast int) fast_int;
pragma precise int res = operate_on(cast(fast_int)1, cast(fast_int)2);
Which leads me to comment about the contracts currently in D. I don't see
how it's semantically much different to write:
in
{
assert(A);
}
out
{
assert(B);
}
body
{
C;
}
than it is to write:
{
assert(A);
C;
assert(B);
}
And in fact the latter seems way more natural. Maybe that shows off my
background. ;) I'd think about how to put more into a contract than just
some asserts. Such as hooking up a specific error message to an assert
(often used in production code BTW) and just listing some expressions that
must all be true (no need for a bunch of assert "statements" inside an in or
out contract.)
We also invariably want to intercept the assert behavior to draw some
message on screen or something, and maybe attempt to resume or abort, or
ignore any future assertions. Hard to do that without language support.
Some kind of global handle_assert function you can overload?
Can you overload the random number generator? The garbage collector? In a
standard manner that'll work on most platforms with compilers from different
vendors?
D already has 2 or 3 vendors, all at various stages on the way to D 1.0
spec. ;) I have only tried one.
Sean
"Sean L. Palmer" <seanpalmer earthlink.net> wrote in message
news:al75nt$26bl$1 digitaldaemon.com...
[clip]
Sep 05 2002
Sean L. Palmer wrote:
The difference is that the D compiler strips the "in" and "out" blocks
when you compile with the "release" flag
or so is my understanding...
Alix...
webmaster "the D journal"
[clip]
Sep 05 2002
It strips asserts anyway in Release.

Sean

"Alix Pexton" <Alix seven-point-star.co.uk> wrote in message news:al7a6t$2gha$1 digitaldaemon.com...
[clip]
Sep 05 2002
In article <al77jb$2anj$1 digitaldaemon.com>, Sean L. Palmer says...
[clip]
The following discussion is going to talk about classes, because it talks about
inheritance and you don't directly inherit functions... It applies to functions
because you can override member functions in derived classes.
In theory, by separating the contracts out, the compiler can provide contract
inheritance. I don't think this has been done yet, but it is a benefit over the
inline asserts. Design By Contract was originally created as part of Eiffel,
where contract inheritance does work. When you inherit from a class in Eiffel,
you don't just get its interfaces and members, you pick up all of its
contractual obligations. This allows any user of a class to also use
subclasses, knowing that they are required by the compiler to fulfill all
contractual requirements of the original class.
Contract inheritance is not as painful as I originally thought it would be. The
actual implementation is not especially efficient, and it frequently gets turned
off in Release builds, but it does at least work.
The idea is that when you derive from a class, you have to accept ANY input that
would have been legal to the base class. However, if your new class is more
capable, it may be able to accept inputs that used to be invalid. The new class
writes a more lenient contract. The compiler combines the contracts by putting
OR between the inheritance levels. That means that the derived class will
accept anything that was legal to the base OR anything that is legal to itself.
When you derive from a class, you also promise not to surprise any customers of
that base class. That means that any results must satisfy the original base
class contract. If your new class is more precise, or more complete, it may be
able to promise more than the base class, so it would have a new out contract.
The compiler combines the contracts by putting AND between the inheritance
levels. That means that the results will satisfy all of the base class
requirements AND all of the derived class requirements.
Presumably, D will eventually support contract inheritance, and you really need
to know which bits are contract and which bits are just asserts.
I agree that I would prefer having the out contract after the body, but the
placement is part of the "contract" idea. In a contract, you don't care as much
about how the work gets done (the body) as you do about what you have to supply
(the in) and what the contractor guarantees to give you (the out). The contract
is the user's view of the function. You get the return type, the parameter
list, and then the description of what you have to do and what the function will
do for you.
I also would prefer to simply have expressions in the contracts. The presence
of the assert() just seems to clutter the code, and nothing but asserts should
be in the contracts anyway, normally. I guess it might be necessary sometimes
to run a small piece of code to generate a test result, and then have an assert
on that result. I dunno. I haven't really used contracts enough to have a firm
grasp on what is really necessary.
Mac
Sep 05 2002
"Mac Reiter" <Mac_member pathlink.com> wrote in message news:al82ip$rf2$1 digitaldaemon.com...I also would prefer to simply have expressions in the contracts. Thepresenceof the assert() just seems to clutter the code, and nothing but assertsshouldbe in the contracts anyway, normally. I guess it might be necessarysometimesto run a small piece of code to generate a test result, and then have anasserton that result. I dunno. I haven't really used contracts enough to havea firmgrasp on what is really necessary.Sometimes contracts are not easilly expressible as functions. Consider a sort() function - the out contract should test that the array really is sorted. That requires a loop, which is not possible with just an assert expression.
Sep 06 2002
"Sean L. Palmer" <seanpalmer earthlink.net> wrote in message
news:al77jb$2anj$1 digitaldaemon.com...
Which leads me to comment about the contracts currently in D. I don't see
how it's semantically much different to write:
[clip]
Good question. The differences are:
1) Aesthetic - the in and out sections clearly mark what are preconditions
and what are postconditions.
2) Syntactic sugar - the out(result) syntax is a handy way to check the
function return value.
3) in and out bodies can contain arbitrary statements, loops, etc., not just
expressions.
4) Inheritance - the in and out clauses of virtual functions are inherited. No way to do that with asserts.
Sep 06 2002
In article <al75nt$26bl$1 digitaldaemon.com>, Sean L. Palmer says...
[clip]

byte b=8, c=9;
byte x = b + c; // stays in byte or larger in intermediate form
int y = b + c;  // intermediates done in int since result is int, and int is more precise than byte

byte d=150, e=230, f=2;
byte x=(d+e)/f; // hosed. Your rule requires the intermediates to be bytes,
                // since nothing but bytes are used in this calculation.
                // Bytes are simply not big enough to do math with bytes.

And as for speed, it doesn't matter how much faster you can make it if the answer is wrong. NOPs run really fast, and they give you an answer that is no more wrong than the above code would...

I *do* like your ideas about standardizing ways to tell the compiler about your intent. I would prefer that certain of these "hints" actually be required -- boundschecked should require boundschecking, rather than just suggesting that "it might be a nice idea to check those arrays, if the compiler isn't too busy...". I generally hate hints -- if I took the time to tell the compiler to do something, it should either do it or throw an error if it is physically impossible to do it (attempts to inline recursive functions, for example).

I agree that it is sometimes preferable to avoid the intermediate casts for performance reasons. I just think that the vast majority of code expects correct answers first, with optimization second. One thing I can think of to give the programmer the ability to turn off conversions would be something that looked like a cast, along these lines (I'm sure the exact syntax would have to be different):

byte x = expr(byte)(b+c) / f;

or, if you get really explicit:

byte x = expr(byte)( expr(byte)(b+c) / f );

although the outer expr() is not really needed, since the result immediately goes into a byte.

expr() is kinda like cast(), except that cast() would evaluate the expression in whatever types it wanted and then cast it afterwards. expr() would tell the compiler what type to use for evaluating the expression, and that the programmer knows the cost and consequences and wants it to behave that way.

That way, when the average programmer writes out a math formula, it generates the right answer (as long as the machine is physically capable of doing so -- I do *not* advocate implicitly doing 'long' math with software emulated bignum routines for the intermediate expressions). But when the advanced programmer really needs to tweak an expression down, and either knows that the result will fit or explicitly wants the wrapping/truncation behavior, he/she can use expr() to force the behavior.

expr() is probably a really bad name. It keeps bringing up lisp and scheme memories... Unfortunately, it is the best thing I could think of on the spur of the moment.

Mac
Sep 05 2002
"Mac Reiter" <Mac_member pathlink.com> wrote in message
news:al7veu$ls1$1 digitaldaemon.com...

In article <al75nt$26bl$1 digitaldaemon.com>, Sean L. Palmer says...

byte b=8, c=9;
byte x = b + c; // stays in byte or larger in intermediate form
int y = b + c;  // intermediates done in int since result is int, and int
                // is more precise than byte

byte d=150, e=230, f=2;
byte x=(d+e)/f; // hosed. Your rule requires the intermediates to be bytes,
                // since nothing but bytes are used in this calculation.
                // Bytes are simply not big enough to do math with bytes.

I confess I thought of some of this as I was typing. What I tried to say, at
least by the end, was that it shouldn't require any *more* precision than the
smallest involved type. And in fact I think the above has that in a comment:
"// stays in byte **or larger** in intermediate form". If you're worried
about it, you can have the result, one of the inputs, or any temporary part
of the expression cast explicitly to a larger type. Or use "precise" goal
directives around it.

You have the exact same problem with integers, and nobody seems to complain
about that. I've personally hit the integer problem a bunch, especially when
doing fixed-point arithmetic. What if all your byte values are only using
maybe 7 of the bits? Then there's no need for a larger type for the
intermediates.

And as for speed, it doesn't matter how much faster you can make it if the
answer is wrong. NOPs run really fast, and they give you an answer that is no
more wrong than the above code would...

If code would be illegal to generate in the SIMD unit due to precision
issues, even though what you're trying to do is force the compiler to do the
calculation in the SIMD unit, it can be pretty frustrating. So making the
rules too restrictive can cause problems too. The answer would only be wrong
because you decided to use bytes (a very imprecise type) to hold your values.
If you're worried about precision, use a more precise type, or add a cast;
it's not that hard.
Sean
Sep 05 2002









"Sean L. Palmer" <seanpalmer earthlink.net> 