## digitalmars.D - OT (partially): about promotion of integers

"eles" <eles eles.com> writes:
```Hello,

bytes, raised a question for me that is somewhat linked to a
difference between Pascal/Delphi/FPC (please, no flame here) and
C/D.

Basically, as far as I get it, both FPC and C use Integer (name
it int, if you like) as a fundamental type. That means, among
other things, that this is the preferred type to cast (implicitly) to.

Now, there is a difference between the int-FPC and the int-C:
int-FPC is the *widest* integer type (and it is signed), and all
other integral types are subranges of this int-FPC. That is, the
unsigned type is simply a subrange of positive numbers, the char
type is simply the subrange between -128 and +127, and so on.

This looks to me like a great advantage, since implicit
conversions are always straightforward and simple: everything is
first converted to the fundamental (widest) type, the calculation is
performed (any optimization should be handled by the compiler, not
by the programmer), then the final result is obtained.

Note that this approach, of making unsigned integrals a subrange
of the int-FPC, halves the maximum number representable as
unsigned, since 1 bit is always reserved for the sign (albeit,
for unsigned, it is always 0).

OTOH, the fact that the int-FPC is the widest available makes
it very natural as a fundamental type and justifies (I think,
beyond doubt) casting all other types to this type, as well as
the result of any arithmetic operation. If this result is in a
subrange, then it might get cast back to a subrange (that is,
another integral type).

In C/D, the problem is that int-C is the fundamental (and
preferred for conversion) type, but it is not the widest. So, you
have a plethora of implicit promotions.

Now, the off-topic question: the loss in unsigned range aside
(which I find to be a small price for the earned clarity), is
there any other reason (except C compatibility) that D would not
implement that model (this is not a suggestion to do it now, I
know D is almost ready for prime time, but it is a question),
that is, the int-FPC-like model for integral types?

Thank you,

Eles
```
Dec 11 2012
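(To make the "plethora of implicit promotions" concrete, here is a small D illustration, not from the thread, of what the C-style rules do:)

```
import std.stdio;

void main() {
    byte a = 100, b = 100;
    auto c = a + b;   // both operands promoted to int, so no wraparound
    writeln(typeof(c).stringof, " ", c);   // prints "int 200"

    // byte d = a + b;          // error: cannot implicitly convert int to byte
    byte d = cast(byte)(a + b); // explicit demotion truncates to -56
    writeln(d);
}
```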
"eles" <eles eles.com> writes:
```  Now, the off-topic question: the loss in unsigned range aside
(which I find to be a small price for the earned clarity), is
there any other reason (except C compatibility) that D would not
implement that model (this is not a suggestion to do it now, I
know D is almost ready for prime time, but it is a question),
that is, the int-FPC-like model for integral types?

Rephrasing all that: it would be as if the fundamental type
in D were the widest integral type, and the unsigned variant
of that widest integral type were eliminated.

Then, all operands in an integral operation would be first
promoted to this widest integral, the computation would be made, then
the final result may be demoted back (the compiler is free to
optimize as it wants, but behind the scenes).
```
Dec 11 2012
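(Spelled out by hand, the proposed promote-then-demote model amounts to something like this hypothetical D sketch:)

```
import std.stdio;

void main() {
    int a = int.max, b = 2;
    long wide = cast(long)a * b; // promote to the widest type first
    writeln(wide);               // 4294967294: mathematically correct
    int narrow = cast(int)wide;  // demote back; here it truncates to -2
    writeln(narrow);
}
```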
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
```On 12/11/12 10:20 AM, eles wrote:
Hello,

raised a question for me that is somewhat linked to a difference between
Pascal/Delphi/FPC (please, no flame here) and C/D.

There's a lot to be discussed on the issue. A few quick thoughts:

* 32-bit integers are a sweet spot for CPU architectures. There's rarely
a provision for 16- or 8-bit operations; the action is at 32- or 64-bit.

* Then, although most 64-bit operations are as fast as 32-bit ones,
transporting operands takes twice as much internal bus real estate and
sometimes twice as much core real estate (i.e. there are units that do
either two 32-bit ops or one 64-bit op).

* The whole reserving a bit and halving the range means extra costs of
operating with a basic type.

Andrei
```
Dec 11 2012
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
```On 12/11/12 11:29 AM, eles wrote:
There's a lot to be discussed on the issue. A few quick thoughts:

* 32-bit integers are a sweet spot for CPU architectures. There's
rarely a provision for 16- or 8-bit operations; the action is at 32-
or 64-bit.

Speed can still be optimized by the compiler, behind the scenes. The
approach does not ask the compiler to promote everything to the
widest-integral, but to do the job "as if". Currently, the choice of
int-C as the fastest-integral instead of widest-integral moves the burden
from the compiler to the user.

Agreed. But then that's one of them "sufficiently smart compiler"
arguments. http://c2.com/cgi/wiki?SufficientlySmartCompiler

* Then, although most 64-bit operations are as fast as 32-bit ones,
transporting operands takes twice as much internal bus real estate and
sometimes twice as much core real estate (i.e. there are units that do
either two 32-bit ops or one 64-bit op).

* The whole reserving a bit and halving the range means extra costs of
operating with a basic type.

Yes, there is a cost. But, as always, there is a balance between
advantages and drawbacks. What is favourable? Simplicity of promotion or
a supplementary bit?

A direct and natural mapping between language constructs and machine
execution is very highly appreciated in the market D is in. I don't see
that changing in the foreseeable future.

Besides, at the end of the day, a half-approach would be to have a
widest-signed-integral and a widest-unsigned-integral type and only play
with those two.

D has terrific abstraction capabilities. Leave primitive types alone and
define a UDT that implements your desired behavior. You can always
implement safe on top of fast but not the other way around.

Andrei
```
Dec 11 2012
Walter Bright <newshound2 digitalmars.com> writes:
```On 12/11/2012 8:35 AM, Andrei Alexandrescu wrote:
Besides, at the end of the day, a half-approach would be to have a
widest-signed-integral and a widest-unsigned-integral type and only play
with those two.

Why stop at 64 bits? Why not have only one integral type, of whatever
precision is necessary to hold the value? This is quite doable, and has
been done.

But at a terrible performance cost.

And, yes, in D you can create your own "BigInt" datatype which exhibits this
behavior.
```
Dec 11 2012
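(Phobos's std.bigint already provides exactly this behavior, at the performance cost noted above; a quick illustration:)

```
import std.bigint, std.stdio;

void main() {
    BigInt i = BigInt("9223372036854775807"); // long.max
    i *= 2;     // no overflow: the value simply grows past 64 bits
    writeln(i); // 18446744073709551614
}
```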
Walter Bright <newshound2 digitalmars.com> writes:
```On 12/11/2012 10:36 AM, eles wrote:
You really miss the point here. Nobody will ask you to promote those numbers to
64-bit or whatever *unless necessary*.

No, I don't miss the point. There are very few cases where the compiler could
statically prove that something will fit in less than 32 bits.

Consider this:

Integer foo(Integer i)
{
    return i * 2;
}

Tell me how many bits that should be.
```
Dec 11 2012
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
```On 12/11/12 1:36 PM, eles wrote:

A bit shameful.

I thought my answer wasn't all that shoddy and not defensive at all.

Andrei
```
Dec 11 2012
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
```On 12/11/12 5:07 PM, eles wrote:
I thought my answer wasn't all that shoddy and not defensive at all.

I step back. I agree. Thank you.

Somebody convinced somebody else of something on the Net. This has good
day written all over it. Time to open that champagne. Cheers!

Andrei
```
Dec 11 2012
Walter Bright <newshound2 digitalmars.com> writes:
```On 12/11/2012 10:45 AM, bearophile wrote:
Walter Bright:

Why stop at 64 bits? Why not have only one integral type, of
whatever precision is necessary to hold the value? This is quite
doable, and has been done.

I think no one has asked for *bignums on default* in this thread.

I know they didn't ask. But they did ask for 64 bits, and the exact same
argument will apply to bignums, as I pointed out.

But at a terrible performance cost.

Common Lisp (and OCaML) uses tagged integers by default, and they are very
far from being "terrible". Tagged integers cause no heap allocations if
they aren't large. Also, the Common Lisp compiler in various situations is
able to infer that an integer can't be too large, replacing it with some
fixnum. And it's easy to add annotations in critical spots to ask the
Common Lisp compiler to use a fixnum, to squeeze out all the performance.

I don't notice anyone reaching for Lisp or Ocaml for high performance
applications.

The result is code that's quick, for most situations. But it's more often
correct. In D you drive with eyes shut; sometimes for me it's hard to know if
some integral overflow has occurred in a long computation.

And, yes, in D you can create your own "BigInt" datatype which exhibits
this behavior.

Currently D bigints don't have short int optimization.

That's irrelevant to this discussion. It is not a problem with the language.
Anyone can improve the library one if they desire, or do their own.

I think the compiler doesn't perform on BigInts the optimizations it does on
ints, because it doesn't know about bigint properties.

I think the general lack of interest in bigints indicates that the builtin types
work well enough for most work.
```
Dec 11 2012
Walter Bright <newshound2 digitalmars.com> writes:
```On 12/11/2012 3:15 PM, deadalnix wrote:
That's irrelevant to this discussion. It is not a problem with the language.
Anyone can improve the library one if they desire, or do their own.

I think it is useful to draw a distinction.

I think the compiler doesn't perform on BigInts the optimizations it does on
ints, because it doesn't know about bigint properties.

I think the general lack of interest in bigints indicates that the builtin
types work well enough for most work.

That argument is fallacious. Something being more used doesn't really mean it
is better, or PHP and C++ are some of the best languages ever made.

I'm interested in crafting D to be a language that people will like and use.
Therefore, what things make a language popular are of significant interest.

I.e. it's meaningless to create the best language evar and be the only user of
it.

Now, if we have int with terrible problems, and bigint that solves those
problems, and yet people still prefer int by a 1000:1 margin, that makes me
very skeptical that those problems actually matter.

We need to be solving the *right* problems with D.
```
Dec 11 2012
Walter Bright <newshound2 digitalmars.com> writes:
```On 12/16/2012 3:24 PM, SomeDude wrote:
Proof is, it seems to me that you (Isaac Gouy) often come around here. We can
magically invoke you every time one talks about the shootout. Which is pretty
astonishing for a language you aren't interested in.

Not really. You can set Google to email you whenever a search phrase turns up a
new result.
```
Dec 16 2012
Walter Bright <newshound2 digitalmars.com> writes:
```On 12/11/2012 10:44 AM, foobar wrote:
All of the above relies on the assumption that the safety problem is due to the
memory layout. There are many other programming languages that solve this by
using a different point of view - the problem lies in the implicit casts and
not
the memory layout. In other words, the culprit is code such as:
uint a = -1;
which compiles under C's implicit coercion rules but _really shouldn't_.
The semantically correct way would be something like:
uint a = 0xFFFF_FFFF;
but C/C++ programmers tend to think the "-1" trick is less verbose and
"better".

Trick? Not at all.

1. -1 adapts to the size of an int, which varies in C.

2. -i means "complement and then increment".

3. Would you allow 2-1? How about 1-1? (1-1)-1?

Arithmetic in computers is different from the math you learned in school. It's
2's complement, and it's best to always keep that in mind when writing programs.
```
Dec 11 2012
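(For the record, both forms discussed here compile in today's D, and unsigned subtraction wraps silently; a small demonstration:)

```
import std.stdio;

void main() {
    uint a = -1;     // accepted: 2's complement wrap to all bits set
    writeln(a);      // 4294967295
    uint b = 2, c = 3;
    uint d = b - c;  // wraps silently instead of going negative
    writeln(d);      // 4294967295 again
}
```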
Walter Bright <newshound2 digitalmars.com> writes:
```On 12/11/2012 3:44 PM, foobar wrote:
Thanks for proving my point. After all, you are a C++ developer, aren't you?
:)

No, I'm an assembler programmer. I know how the machine works, and C, C++,
and D map onto that, quite deliberately. It's one reason why D supports the vector
types directly.

Seriously though, it _is_ a trick and a code smell.

Not to me. There is no trick or "smell" to anyone familiar with how computers
work.

I'm fully aware that computers use 2's complement. I'm also aware of the
fact that the type has an "unsigned" label all over it. You see it right there
in that 'u' prefix of 'int'. An unsigned type should semantically entail _no
sign_ in its operations. You are calling a cat a dog and arguing that dogs
barf? Yeah, I completely agree with that notion, except, we are still talking
about _a cat_.

The inevitable result is that signed and unsigned types *are* conflated in D, and
have to be, otherwise many things stop working.

For example, p[x]. What type is x?

Integer signedness in D is not really a property of the data, it is only how
one
happens to interpret the data in a specific context.

To answer your question, yes, I would enforce overflow and underflow checking
semantics. Any negative result assigned to an unsigned type _is_ a logic error.
You can claim that:
uint a = -1;
is perfectly safe and has a well defined meaning (well, for C programmers that
is), but what about:
uint a = b - c;
what if that calculation results in a negative number? What should the compiler
do? Well, there are _two_ equally possible solutions:
a. The overflow was intended, as in the mask = -1 case; or
b. The overflow is a _bug_.

The user should be made aware of this and should make the decision how to
handle this. This should _not_ be implicitly handled by the compiler, allowing
bugs to go unnoticed.

I think C# solved this _way_ better than C/D.

C# has overflow checking off by default. It is enabled by either using a
checked { } block, or with a compiler switch. I don't see that as "solving"
the issue in any elegant or natural way; it's more of a clumsy hack.

But also consider that C# does not allow pointer arithmetic, or array slicing.
Both of these rely on wraparound 2's complement arithmetic.

Another data point would be (S)ML,
which is a compiled language that requires _explicit conversions_ and has a
very strong typing system. Its programs are compiled to efficient native
executables, and the strong typing allows both the compiler and the programmer
better reasoning about the code. Thus programs are more correct and can be
optimized by the compiler. In fact, several languages are implemented in ML
because of its higher guarantees.

ML has been around for 30-40 years, and has failed to catch on.
```
Dec 11 2012
Walter Bright <newshound2 digitalmars.com> writes:
```On 12/11/2012 5:05 PM, bearophile wrote:
Walter Bright:

ML has been around for 30-40 years, and has failed to catch on.

OcaML, Haskell, F#, and so on are all languages derived more or less directly
from ML, that share many of its ideas. Has Haskell caught on? :-)

Haskell is the language that everyone loves to learn and talk about, and few
actually use.

And it's significantly slower than D, in unfixable ways.
```
Dec 11 2012
Timon Gehr <timon.gehr gmx.ch> writes:
```On 12/12/2012 03:45 AM, Walter Bright wrote:
On 12/11/2012 5:05 PM, bearophile wrote:
Walter Bright:

ML has been around for 30-40 years, and has failed to catch on.

OcaML, Haskell, F#, and so on are all languages derived more or less
directly
from ML, that share many of its ideas. Has Haskell caught on? :-)

Haskell is the language that everyone loves to learn and talk about, and
few actually use.

And it's significantly slower than D,

(Sufficiently sane) languages are not slow or fast and I think the
factor GHC/DMD cannot be more than about 2 or 3 for roughly equivalently
written imperative code.

Furthermore no D implementation has any kind of useful performance for
lazy functional style D code.

In some ways, D is very significantly slower than Haskell. The compilers
optimize specific coding styles better than others.

in unfixable ways.

I disagree. That is certainly fixable. It is a mere QOI issue.
```
Dec 12 2012
Walter Bright <newshound2 digitalmars.com> writes:
```On 12/12/2012 12:01 PM, Timon Gehr wrote:
That is certainly fixable. It is a mere QOI issue.

When you have a language that fundamentally disallows mutation, some algorithms
are doomed to be slower. I asked Erik Meijer, one of the developers of Haskell,
if the implementation does mutation "under the hood" to make things go faster.
He assured me that it does not, that it follows the "no mutation" all the way.

I think the factor GHC/DMD cannot be more than about 2 or 3 for roughly
equivalently written imperative code.

A factor of 2 or 3 is make or break for a large class of programs.

Consider running a server farm. If you can make your code 5% faster, you need
5%
fewer servers. That translates into millions of dollars.
```
Dec 12 2012
Walter Bright <newshound2 digitalmars.com> writes:
```On 12/12/2012 2:17 PM, bearophile wrote:
- I've seen Facebook start from PHP, go to PHP compiled in some ways, and
lately start to switch to faster languages; so when you have tons of servers,
the space and electricity used by CPUs become important for the bottom line.
On the other hand, on similar servers lots of other people use languages
where there is far

I know people who use Java on server farms. They are very, very, very cognizant
of overhead and work like hell trying to reduce it, because reducing it drops
millions of dollars right to the bottom line of profit.

Java makes no attempt to detect integer overflows.

Often small performance differences are not
more important than several other considerations, like coding speed, how
easy it is to find programmers, how cheap those programmers are, etc., even
on server farms.

The problem they have with C++ is it is hard to find C++ programmers, not
because of overflow in C++ programs.

- If your code is buggy (because of overflows, or other causes), its output can
be worthless or even harmful. This is why some people are using OcaML for
high-speed trading (I have given two links in a previous post), where bugs
risk being quite costly.

I personally know people who write high speed trading software. These people
are
concerned with nanosecond delays. They write code in C++. They even hack on the
compiler trying to get it to generate faster code.

It doesn't surprise me a bit that some people who operate server farms use slow
languages like Ruby, Python, and Perl on them. This does cost them money for
extra hardware. There are always going to be businesses that have inefficient
operations, poorly allocated resources, and who leave a lot of money on the
table.
```
Dec 12 2012
Walter Bright <newshound2 digitalmars.com> writes:
```On 12/12/2012 5:32 PM, bearophile wrote:
One "important" firm uses OcaML for high speed trading because it's both very
fast (C++-class fast, faster than Java on certain kinds of code, if well used)
and apparently quite a bit safer to use than C/C++. And it's harder to find OcaML
programmers than C++ ones.

Fair enough, so I challenge you to write an Ocaml version of the input sorting
program.
```
Dec 12 2012
Timon Gehr <timon.gehr gmx.ch> writes:
```On 12/12/2012 10:35 PM, Walter Bright wrote:
On 12/12/2012 12:01 PM, Timon Gehr wrote:
That is certainly fixable. It is a mere QOI issue.

When you have a language that fundamentally disallows mutation,

It does not.

some algorithms are doomed to be slower.

Here's a (real) quicksort:

I asked Erik Meijer, one of the developers of
Haskell, if the implementation does mutation "under the hood" to make
things go faster.

"under the hood", obviously there must be mutation as this is how the
machine works.

He assured me that it does not, that it follows the
"no mutation" all the way.

Maybe he misunderstood. i.e. DMD does not do this to immutable data
either. eg. Control.Monad.ST allows in-place state mutation of data
types eg. from Data.STRef and Data.Array.ST. Such operations are
sequenced and crosstalk between multiple such 'threads' is excluded by
the type system, as long as only safe operations are used.

It is somewhat similar to (the still quite broken) 'pure' in D, but
stronger. (e.g. it is possible to pass mutable references into the rough
equivalent of 'strongly pure' code, but that code won't be able to read
their values, the references can appear as part of the return type, and
the caller will be able to access them again -- Done using basically
nothing but parametric polymorphism, which D lacks.)

Eg:

runST $ do             -- ()pure{
  x <- newSTRef 0      --   auto x = 0;
  writeSTRef x 2       --   x = 2; // mutate x
  y <- readSTRef x     --   auto y = x;
  writeSTRef x 3       --   x = 3; // mutate x
  z <- readSTRef x     --   auto z = x;
  return (y,z)         --   return tuple(y,z);}();
(2,3)                  -- tuple(2,3)

This paper describes how this is implemented in GHC (in-place mutation)
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.50.3299

The only reason I can see why this is not as fast as D is implementation
simplicity on the compiler side.

Here is some of the library code. It makes use of primitives (intrinsics):

I think the factor GHC/DMD cannot be more than about 2 or 3 for roughly
equivalently written imperative code.

A factor of 2 or 3 is make or break for a large class of programs.

Consider running a server farm. If you can make your code 5% faster, you
need 5% fewer servers. That translates into millions of dollars.

Provided the code is correct.
```
Dec 12 2012
Walter Bright <newshound2 digitalmars.com> writes:
```On 12/12/2012 3:23 PM, Timon Gehr wrote:
It is somewhat similar to (the still quite broken) 'pure' in D,

Broken how?

Provided the code is correct.

No language or compiler can prove code correct.
```
Dec 12 2012
Timon Gehr <timon.gehr gmx.ch> writes:
```On 12/13/2012 12:43 AM, Walter Bright wrote:
On 12/12/2012 3:23 PM, Timon Gehr wrote:
It is somewhat similar to (the still quite broken) 'pure' in D,

Broken how?

- There is no way to specify that a delegate is strongly pure without
resorting to type deduction, because
- Member functions/local functions are handled inconsistently.
- Delegate types legally obtained from certain member functions are
illegal to declare.
- 'pure' means 'weakly pure' for member functions and 'strongly
pure' for local functions. Therefore it means 'weakly pure' for
delegates, as those can be obtained from both.
- Delegates may break the transitivity of immutable, and by extension,
shared.

A good first step in fixing up immutable/shared would be to make
everything that is annotated 'error' pass, and the line annotated 'ok'
should fail:

import std.stdio;

struct S{
    int x;
    int foo()pure{
        return x++;
    }
    int bar()immutable pure{
        // return x++; // error
        return 2;
    }
}

int delegate()pure s(){
    int x;
    int foo()pure{
        // return x++; // error
        return 2;
    }
    /+int bar()immutable pure{ // error
        return 2;
    }+/
    return &foo;
}

void main(){
    S s;
    int delegate()pure dg = &s.foo;
    // int delegate()pure immutable dg2 = &s.bar; // error
    writeln(dg(), dg(), dg(), dg()); // 0123
    immutable int delegate()pure dg3 = dg; // ok
    writeln(dg3(), dg3(), dg3(), dg3()); // 4567
    // static assert(is(typeof(cast()dg3)==int delegate() immutable pure)); // error
    auto bar = &s.bar;
    pragma(msg, typeof(bar)); // "int delegate() immutable pure"
}

Provided the code is correct.

No language or compiler can prove code correct.

Sometimes it can. Certainly a compiler can check a user-provided proof.
eg: http://coq.inria.fr/

A minor issue with proving code correct is of course that the proven
specification might contain an error. The formal specification is often
far more explicit and easier to verify manually than the program though.
```
Dec 12 2012
Walter Bright <newshound2 digitalmars.com> writes:
```On 12/12/2012 5:16 PM, Timon Gehr wrote:
On 12/13/2012 12:43 AM, Walter Bright wrote:
On 12/12/2012 3:23 PM, Timon Gehr wrote:
It is somewhat similar to (the still quite broken) 'pure' in D,

Broken how?

- There is no way to specify that a delegate is strongly pure without resorting
to type deduction, because

Are these in bugzilla?
```
Dec 12 2012
Timon Gehr <timon.gehr gmx.ch> writes:
```On 12/13/2012 04:54 AM, Walter Bright wrote:
On 12/12/2012 5:16 PM, Timon Gehr wrote:
On 12/13/2012 12:43 AM, Walter Bright wrote:
On 12/12/2012 3:23 PM, Timon Gehr wrote:
It is somewhat similar to (the still quite broken) 'pure' in D,

Broken how?

- There is no way to specify that a delegate is strongly pure without
resorting
to type deduction, because

Are these in bugzilla?

Now they certainly are.

http://d.puremagic.com/issues/show_bug.cgi?id=9148

The following you can close if you think 'const' should not guarantee no
mutation. It does not break other parts of the type system:
http://d.puremagic.com/issues/show_bug.cgi?id=9149
```
Dec 13 2012
Walter Bright <newshound2 digitalmars.com> writes:
```On 12/13/2012 4:46 AM, Timon Gehr wrote:
Now they certainly are.

http://d.puremagic.com/issues/show_bug.cgi?id=9148

The following you can close if you think 'const' should not guarantee no
mutation. It does not break other parts of the type system:
http://d.puremagic.com/issues/show_bug.cgi?id=9149

Thank you.
```
Dec 13 2012
Walter Bright <newshound2 digitalmars.com> writes:
```On 12/12/2012 3:23 PM, Timon Gehr wrote:
On 12/12/2012 10:35 PM, Walter Bright wrote:
some algorithms are doomed to be slower.

Here's a (real) quicksort:

Ok, I'll bite.

Here's a program in Haskell and D that reads from standard in, splits into
lines, sorts the lines, and writes the result to standard out:

==============================
import Data.List
import qualified Data.ByteString.Lazy.Char8 as L
main = L.interact $ L.unlines . sort . L.lines
==============================
import std.stdio;
import std.array;
import std.algorithm;
void main() {
    stdin.byLine(KeepTerminator.yes).
    map!(a => a.idup).
    array.
    sort.
    copy(
        stdout.lockingTextWriter());
}
===============================

The D version runs twice as fast as the Haskell one. Note that there's nothing
heroic going on with the D version - it's straightforward dumb code.
```
Dec 12 2012
Timon Gehr <timon.gehr gmx.ch> writes:
```On 12/13/2012 12:47 AM, Walter Bright wrote:
On 12/12/2012 3:23 PM, Timon Gehr wrote:
On 12/12/2012 10:35 PM, Walter Bright wrote:
some algorithms are doomed to be slower.

Here's a (real) quicksort:

Ok, I'll bite.

Here's a program in Haskell and D that reads from standard in, splits
into lines, sorts the lines, and writes the result to standard out:

==============================
import Data.List
import qualified Data.ByteString.Lazy.Char8 as L
main = L.interact $ L.unlines . sort . L.lines
==============================
import std.stdio;
import std.array;
import std.algorithm;
void main() {
    stdin.byLine(KeepTerminator.yes).
    map!(a => a.idup).
    array.
    sort.
    copy(
        stdout.lockingTextWriter());
}
===============================

The D version runs twice as fast as the Haskell one.

You are testing some standard library functions that are implemented in
wildly different ways in both languages. They are not the same
algorithms. For example, looking at just the first element of the sorted
list will run in O(length.) in Haskell. If you build a sort function
with that property in D, it will be slower as well. (if a rather
Haskell-inspired implementation strategy is chosen, it will be a lot
slower.) The key difference is that the D version operates in a strict
fashion on arrays, while the Haskell version operates in a lazy fashion
on lazy lists.

This just means that Data.List.sort is inadequate for high-performance
code in case the entire contents of the list get looked at.

This is a good treatment of the matter:

You are using Data.List.sort. The best implementations shown there seem
to be around 5 times faster. I do not know how large the I/O overhead is.

Certainly, you can argue that the faster version should be in a
prominent place in the standard library, but the fact that it is not
does not indicate a fundamental performance problem in the Haskell
language. Also, note that I am completely ignoring what kind of code is
idiomatic in both languages. Fast Haskell code often looks similar to C
code.

Note that there's
nothing heroic going on with the D version - it's straightforward dumb
code.

A significant part of the D code is spent arranging data into the right
layout, while the Haskell code does nothing like that.
```
Dec 12 2012
Walter Bright <newshound2 digitalmars.com> writes:
```On 12/12/2012 5:51 PM, Timon Gehr wrote:
A significant part of the D code is spent arranging data into the right layout,
while the Haskell code does nothing like that.

So, please take the bait :-) and write a Haskell version that runs faster than
the D one.
```
Dec 12 2012
Timon Gehr <timon.gehr gmx.ch> writes:
```On 12/13/2012 10:28 PM, SomeDude wrote:
On Thursday, 13 December 2012 at 01:51:27 UTC, Timon Gehr wrote:
Certainly, you can argue that the faster version should be in a
prominent place in the standard library, but the fact that it is not
does not indicate a fundamental performance problem in the Haskell
language. Also, note that I am completely ignoring what kind of code
is idiomatic in both languages. Fast Haskell code often looks similar
to C code.

You can compare top performance for both languages, but the fact is, if
you write Haskell code extensively, you aren't going to write it like C,
so comparing idiomatic Haskell vs idiomatic D does make sense.

Optimizing bottlenecks is idiomatic in every language.

And comparing programs using the standard libraries also makes sense because
that's how languages are used. It probably doesn't make much sense in a
microbenchmark, but in a larger program it certainly does. ...

That is not what we are arguing.
```
Dec 14 2012
Timon Gehr <timon.gehr gmx.ch> writes:
```On 12/13/2012 09:09 PM, SomeDude wrote:
On Wednesday, 12 December 2012 at 20:01:43 UTC, Timon Gehr wrote:
On 12/12/2012 03:45 AM, Walter Bright wrote:
On 12/11/2012 5:05 PM, bearophile wrote:
Walter Bright:

ML has been around for 30-40 years, and has failed to catch on.

OcaML, Haskell, F#, and so on are all languages derived more or less
directly
from ML, that share many of its ideas. Has Haskell caught on? :-)

Haskell is the language that everyone loves to learn and talk about, and
few actually use.

And it's significantly slower than D,

(Sufficiently sane) languages are not slow or fast and I think the
factor GHC/DMD cannot be more than about 2 or 3 for roughly
equivalently written imperative code.

Furthermore no D implementation has any kind of useful performance for
lazy functional style D code.

In some ways, D is very significantly slower than Haskell. The
compilers optimize specific coding styles better than others.

in unfixable ways.

I disagree. That is certainly fixable. It is a mere QOI issue.

Actually, a factor of 2 to 3 can be huge.

Sure.

Consider that Java is around a
factor of 2 or less from C++ in the Computer Languages Benchmark Game, and
yet, you easily feel the difference every day in your desktop applications.

Most software I use is written in C or C++. I think some of it is way
too slow.

But although the pure computation power is not very different, the real
difference I believe lies in the memory management, which is probably far
less efficient in Java than in C++.

It still depends heavily on how well it is done in each case.
```
Dec 14 2012
Walter Bright <newshound2 digitalmars.com> writes:
```On 12/12/2012 2:53 AM, foobar wrote:
One example that comes to mind is that the
future version of JavaScript is implemented in ML.

Um, there are many implementations of Javascript. In fact, I have
implemented it in both C++ and D.
```
Dec 12 2012
Walter Bright <newshound2 digitalmars.com> writes:
```On 12/11/2012 4:06 PM, bearophile wrote:
Plus one or two switches to disable such checking, if/when someone wants it, to
regain the C performance. (Plus some syntax way to disable/enable such checking
in a small piece of code).

I.e. the C# "solution".

1. The global switch "solution": I wrote about this just today in another
thread. Global switches that change the semantics of the
language are a disaster. It means you cannot write a piece of code and have
confidence that it will behave in a certain way. It means your testing becomes
a
combinatorial explosion of cases - how many modules do you have, and you must
(to be thorough) test every combination of switches across your whole project.
If you have a 2 way switch, and 8 modules, that's 256 test runs.

2. The checked block "solution": This is a blunt club that affects everything
inside a block. What happens with template instantiations, inlined functions,
and mixins, for starters? What if you want one part of the expression checked
and not another? What a mess.

Not likely :-)

What you (and anyone else) *can* do, today, is write a SafeInt struct that acts
just like an int, but checks for overflow. It's very doable (one exists for
C++). Write it, use it, and prove its worth. Then you'll have a far better
case.
Write a good one, and we'll consider it for Phobos.
```
Dec 11 2012
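(A minimal sketch of such a CheckedInt, aborting on overflow. It is hypothetical, not Phobos code, and leans on the core.checkedint intrinsics of today's druntime, which postdate this thread:)

```
import core.checkedint : adds, muls, subs; // overflow-reporting intrinsics
import std.stdio;

struct CheckedInt {
    int value;

    CheckedInt opBinary(string op)(CheckedInt rhs) const {
        bool overflow = false;
        int r;
        static if (op == "+") r = adds(value, rhs.value, overflow);
        else static if (op == "-") r = subs(value, rhs.value, overflow);
        else static if (op == "*") r = muls(value, rhs.value, overflow);
        else static assert(0, "unsupported operator " ~ op);
        assert(!overflow, "integer overflow"); // abort, don't throw
        return CheckedInt(r);
    }
}

void main() {
    auto a = CheckedInt(1_000_000_000);
    writeln((a + a).value); // fine: 2000000000
    // ((a + a) + a) would trip the overflow assert at run time.
}
```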
Walter Bright <newshound2 digitalmars.com> writes:
```On 12/11/2012 5:15 PM, bearophile wrote:
Regarding safeInt I think today there is no way to write it efficiently in D,
because the overflow flags are not accessible from D, and if you use inlined
asm, you lose inlining in DMD. This is just one of the problems.

The way to deal with this is to examine the implementation of CheckedInt, and
design a couple of compiler intrinsics to use in its implementation that will
eliminate the asm code. (This is how the high level vector library Manu is
implementing is done.)

The other problems are syntax incompatibilities of user-defined structs
compared to
built-in ints.

This is not an issue.

Other problems are the probable lack of high-level optimizations
done on such user defined type.

Using intrinsics deals with this issue nicely, as the optimizer knows about
them.

We are very far from a good solution to such problems.

No, we are not.

Why do I say something so provocative? Because I've seen D programmers go to
herculean lengths to get around problems they are having in the language.
These efforts make a strong case that they need better language support (UDAs
are a primo example of this). I see nobody bothering to write a CheckedInt
type and seeing how far they can push it, even though writing such a type is
not a significant effort.

Also, as I said before, there is a SafeInt class in C++. So far as I can tell,
nobody uses it.

Want to prove me wrong? Implement such a user defined type, and demonstrate
user
interest in it.

(Also note the HalfFloat class I implemented for Manu, as a demonstration of
how
a user defined type can implement a floating point type that is unknown to the
compiler.)
```
Dec 11 2012
Walter Bright <newshound2 digitalmars.com> writes:
```On 12/11/2012 8:42 PM, d coder wrote:

Manu posts here, reply to him!
```
Dec 11 2012
Walter Bright <newshound2 digitalmars.com> writes:
```On 12/11/2012 9:51 PM, David Piepgrass wrote:
Why do I say something so provocative? Because I've seen D programmers go to
herculean lengths to get around problems they are having in the language.
These efforts make a strong case that they need better language support (UDAs
are a primo example of this). I see nobody bothering to write a CheckedInt
type and seeing how far they can push it, even though writing such a type is
not a significant effort.

I disagree with the analysis. I do want overflow detection, yet I would not use
a CheckedInt in D for the same reason I do not usually use one in C++: without
compiler support, it is too expensive to detect overflow. In my C++ I have a
lot
of math to do, and I'm using C++ because it's faster than C# which I would
otherwise prefer. Constantly checking for overflow without hardware support
would kill most of the performance advantage, so I don't do it.

You're not going to get performance with overflow checking even with the best
compiler support. For example, much arithmetic code generated for the x86
uses addressing-mode instructions like:

LEA EAX,16[8*EBX][ECX]  for 16+8*b+c

The LEA instruction does no overflow checking. If you wanted it, the best code
would be:

MOV EAX,16
IMUL EBX,8
JO overflow
ADD EAX,EBX
JO overflow
ADD EAX,ECX
JO overflow

Which is considerably less efficient. (The LEA is designed to run in one
cycle).
Plus, often more registers are modified which impedes good register allocation.

This is why performance languages do not do overflow checking, and why C# only
does it under duress. It is not a conspiracy of pig-headed language developers
:-)

I do use "clipped conversion" though: e.g. ClippedConvert<short>(40000)==32767.
I can afford the overhead in this case because I don't do type conversions as
often as addition, bit shifts, etc.

You can't have both performant code and overflow detection.

The C# solution is not good enough either. C# throws exceptions on overflow,
which is convenient but is bad for performance if it happens regularly; it can
also make a debugger almost unusable. Some sort of mechanism that works like an
exception, but faster, would probably be better. Consider:

result = a * b + c * d;

If a * b overflows, there is probably no point to executing c * d so it may as
well jump straight to a handler; on the other hand, the exception mechanism is
costly, especially if the debugger is hooked in and causes a context switch
every single time it happens. So... I dunno. What's the best semantic for an
overflow detector?

If you desire overflows to be programming errors, then you want an abort, not a
thrown exception. I am perplexed by your desire to continue execution when
overflows happen regularly.
```
Dec 11 2012
Walter Bright <newshound2 digitalmars.com> writes:
```On 12/12/2012 2:33 AM, foobar wrote:
This isn't a perfect solution,
since the compiler has builtin knowledge about int and does optimizations that
will be lost with a library type.

```
Dec 12 2012
Walter Bright <newshound2 digitalmars.com> writes:
```On 12/12/2012 3:12 AM, foobar wrote:
Regarding performance and overflow checking, the example you give is x86
specific. What about other platforms? For example ARM is very popular nowadays
in the mobile world and there are many more smart-phones out there than there
are PCs. Does the same issue exist, and if not (I suspect not, but really have
no idea), should D be geared towards current platforms or future ones?

I don't know the ARM instruction set.
```
Dec 12 2012
Walter Bright <newshound2 digitalmars.com> writes:
```On 12/12/2012 4:51 AM, Araq wrote:
From http://embed.cs.utah.edu/ioc/

" Examples of undefined integer overflows we have reported:

An SQLite bug
Some problems in SafeInt
GNU MPC
PHP
Firefox
GCC
PostgreSQL
LLVM
Python

We also reported bugs to BIND and OpenSSL. Most of the SPEC CPU 2006 benchmarks
contain undefined overflows."

Thanks, this is interesting information.

So how does D improve on C's model? If signed integers are required to wrap
around in D (no undefined behaviour), you also prevent some otherwise possible
optimizations (there is a reason it's still undefined behaviour in C).

D requires 2's complement arithmetic, it does not support 1's complement as C
does.
```
Dec 12 2012
Timon Gehr <timon.gehr gmx.ch> writes:
```On 12/12/2012 10:25 PM, Walter Bright wrote:
On 12/12/2012 4:51 AM, Araq wrote:
...
So how does D improve on C's model? If signed integers are required to
wrap
around in D (no undefined behaviour), you also prevent some otherwise
possible
optimizations (there is a reason it's still undefined behaviour in C).

D requires 2's complement arithmetic, it does not support 1's complement
as C does.

I think what he is talking about is that in C, if after a few steps of
inlining and constant propagation you end up with something like:

int x;
// ...
if(x>x+1) {
    // lots and lots of code
} else return 0;

Then a C compiler will assume that the addition does not overflow and
reduce the code to 'return 0;', whereas a D compiler will not apply this
optimization as it might change the semantics of valid D programs.
```
Dec 12 2012
Walter Bright <newshound2 digitalmars.com> writes:
```On 12/12/2012 3:29 PM, Timon Gehr wrote:
On 12/12/2012 10:25 PM, Walter Bright wrote:
On 12/12/2012 4:51 AM, Araq wrote:
...
So how does D improve on C's model? If signed integers are required to
wrap
around in D (no undefined behaviour), you also prevent some otherwise
possible
optimizations (there is a reason it's still undefined behaviour in C).

D requires 2's complement arithmetic, it does not support 1's complement
as C does.

I think what he is talking about is that in C, if after a few steps of inlining
and constant propagation you end up with something like:

int x;
// ...
if(x>x+1) {
    // lots and lots of code
} else return 0;

Then a C compiler will assume that the addition does not overflow and reduce
the
code to 'return 0;', whereas a D compiler will not apply this optimization as
it
might change the semantics of valid D programs.

You're right in that the D optimizer does not take advantage of C "undefined
behavior" in its optimizations. The article mentioned that many bugs were
caused
not by the actual wraparound behavior, but by aggressive C optimizers that
interpreted "undefined behavior" as not having to account for those cases.
```
Dec 12 2012
Walter Bright <newshound2 digitalmars.com> writes:
```On 12/11/2012 8:22 AM, Andrei Alexandrescu wrote:
* 32-bit integers are a sweet spot for CPU architectures. There's rarely a
provision for 16- or 8-bit operations; the action is at 32- or 64-bit.

Requiring integer operations to all be 64 bits would be a heavy burden on 32
bit
CPUs.
```
Dec 11 2012
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
```On Wed, Dec 12, 2012 at 02:15:24AM +0100, bearophile wrote:
H. S. Teoh:

Just because you specify a certain compiler switch, it can cause
unrelated breakage in some obscure library somewhere, that assumes
modular arithmetic with C/C++ semantics.

The idea was about two switches, one for signed integrals, and the
other for both signed and unsigned. But from other posts I guess
Walter doesn't think this is a viable possibility.

Two switches is even worse than one. The problem is that existing code
assumes certain kinds of behaviour from int, uint, etc. Such code may
exist in common libraries imported by your code (directly or indirectly).
Now you compile your code with a switch (or two switches) that modifies
the behaviour of int, and things start to break. Even worse, if you only
use the switches on certain critical source files, then you may end up
with incompatible behaviour of the same library code in the same
executable (e.g. a template got instantiated once with the switches
enabled, once without). It leads to all kinds of inconsistencies and
subtle breakages that totally outweigh whatever benefits it may have.

So the solutions I see now are to stop using D for some kinds of more
important programs, or to use some kind of safeInt, and then work with
the compiler writers to allow user-defined structs to be usable as
naturally as possible as ints (and possibly efficiently).

It's not too late to add a new native type (or types) to the language
that support this kind of checking. I see that as the best solution to
this issue. Don't mess with the existing types, because too much already
depends on it. Add a new type that has the desired behaviour.

But you may have a hard time convincing Walter to put it in, though.

Regarding safeInt I think today there is no way to write it
efficiently in D, because the overflow flags are not accessible from
D, and if you use inlined asm, you lose inlining in DMD. This is
just one of the problems. The other problems are syntax
incompatibilities of user-defined structs compared to built-in ints.
Other problems are the probable lack of high-level optimizations
done on such user defined type.

These are implementation issues that we can work on improving. For one
thing, I'd love to see D get closer to the point where the distinction
between built-in types and user-defined types is gone. We may never
actually reach that point, but the closer we get, the better. This will
let us solve a lot of things, like drop-in replacements for AA's, etc.,
that are a bit ugly to do today.

One thing I've always thought about is a way for user-types to specify
sub-expression optimizations that the compiler can apply. Basically, if
I implement, say, a Matrix class, then I should be able to tell the
compiler that certain Matrix expressions, say A*B+A*C, can be factored
into A*(B+C), and have the optimizer automatically do this for me based
on what is defined in the type. Or specify that write("a");writeln("b");
can be replaced by writeln("ab");. But I haven't come up with a good
generic framework for actually making this implementable yet.

T

--
I don't trust computers, I've spent too long programming to think that
they can get anything right. -- James Miller
```
Dec 11 2012
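(A toy sketch of this idea, hypothetical and limited to the single pattern a*b + a*c, using a lazy proxy returned by '*':)

```
import std.stdio;

struct Mat {
    double[2][2] m;
    // '*' builds a lazy proxy instead of multiplying right away.
    Prod opBinary(string op : "*")(Mat rhs) { return Prod(this, rhs); }
    Mat opBinary(string op : "+")(Mat rhs) {
        Mat r;
        foreach (i; 0 .. 2)
            foreach (j; 0 .. 2)
                r.m[i][j] = m[i][j] + rhs.m[i][j];
        return r;
    }
}

struct Prod {
    Mat a, b;
    Mat eval() {
        Mat r;
        foreach (i; 0 .. 2)
            foreach (j; 0 .. 2) {
                r.m[i][j] = 0;
                foreach (k; 0 .. 2)
                    r.m[i][j] += a.m[i][k] * b.m[k][j];
            }
        return r;
    }
    // a*b + a*c: factor to a*(b+c), saving one matrix multiply.
    Mat opBinary(string op : "+")(Prod rhs) {
        if (a == rhs.a) return Prod(a, b + rhs.b).eval();
        return eval() + rhs.eval();
    }
}

void main() {
    auto a = Mat([[1.0, 2.0], [3.0, 4.0]]);
    auto b = Mat([[1.0, 0.0], [0.0, 1.0]]);
    auto c = Mat([[0.0, 1.0], [1.0, 0.0]]);
    Mat r = a*b + a*c; // evaluated as a*(b+c): [[3,3],[7,7]]
    writeln(r.m);
}
```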
"Araq" <rumpf_a gmx.de> writes:
``` I implement, say, a Matrix class, then I should be able to tell the
compiler that certain Matrix expressions, say A*B+A*C, can be factored
into A*(B+C), and have the optimizer automatically do this for me based
on what is defined in the type. Or specify that write("a");writeln("b");
can be replaced by writeln("ab");. But I haven't come up with a good
generic framework for actually making this implementable yet.

Yeah, it's not that easy; Nimrod uses a hygienic macro system
with term rewriting rules and side effect analysis and alias
analysis for that ;-).

http://build.nimrod-code.org/docs/trmacros.html

http://forum.nimrod-code.org/t/70
```
Dec 12 2012
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
```On Sun, Dec 16, 2012 at 04:45:31PM +0100, jerro wrote:
if, say, GDC was granted to come back in the shootout. Given it's
now widely acknowledged (at least in the programming communities)
to be one of the most promising languages around...

And especially if you also consider the fact that Clean and
ATS are in the shootout and I'm guessing that very few people use
those.

I've used Clean before.

But yeah, probably not many people would be familiar with it.

T

--
Never trust an operating system you don't have source for! -- Martin Schulze
```
Dec 16 2012
"eles" <eles eles.com> writes:
``` There's a lot to be discussed on the issue. A few quick
thoughts:

* 32-bit integers are a sweet spot for CPU architectures.
There's rarely a provision for 16- or 8-bit operations; the
action is at 32- or 64-bit.

Speed can still be optimized by the compiler, behind the scenes.
The approach does not ask the compiler to promote everything to
the widest-integral, but to do the job "as if". Currently, the choice
of int-C as the fastest-integral instead of widest-integral moves
the burden from the compiler to the user.

* Then, although most 64-bit operations are as fast as 32-bit
ones, transporting operands takes twice as much internal bus
real estate and sometimes twice as much core real estate (i.e.
there are units that do either two 32-bit ops or one 64-bit op).

* The whole reserving a bit and halving the range means extra
costs of operating with a basic type.

Yes, there is a cost. But, as always, there is a balance between
advantages and drawbacks. What is favourable? Simplicity of
promotion or a supplementary bit?

Besides, at the end of the day, a half-approach would be to have
a widest-signed-integral and a widest-unsigned-integral type and
only play with those two.

Eles
```
Dec 11 2012
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
```On Tue, Dec 11, 2012 at 11:35:39AM -0500, Andrei Alexandrescu wrote:
On 12/11/12 11:29 AM, eles wrote:
There's a lot to be discussed on the issue. A few quick thoughts:

* 32-bit integers are a sweet spot for CPU architectures. There's
rarely a provision for 16- or 8-bit operations; the action is at 32-
or 64-bit.

Speed can still be optimized by the compiler, behind the scenes. The
approach does not ask the compiler to promote everything to the
widest-integral, but to do the job "as if". Currently, the choice of
int-C as the fastest-integral instead of widest-integral moves the
burden from the compiler to the user.

Agreed. But then that's one of them "sufficiently smart compiler"
arguments. http://c2.com/cgi/wiki?SufficientlySmartCompiler

A sufficiently smart compiler can solve the halting problem. ;-)

T

--
Obviously, some things aren't very obvious.
```
Dec 11 2012
"eles" <eles eles.com> writes:
``` Why stop at 64 bits? Why not have only one integral
type, of whatever precision is necessary to hold the
value? This is quite doable, and has been done.

You really miss the point here. Nobody will ask you to promote
those numbers to 64-bit or whatever *unless necessary*. It will
only modify the implicit promotion rule, from "at least to int"
to "widest-integral".

You may choose, as a compiler, to promote the numbers only to 16
bits, or 32 bits, if you like, but only if the final result is
not vitiated.

The compiler will be free to promote as it likes, as long as it
guarantees that the final result is "as if" the promotion is to
the widest-integral.

The point is that this way the promotion rules, quite
complex now, become straightforward. Yes, the burden will be on
the compiler rather than on the user. But this could improve in
time: C++ classes are nothing else than a burden that falls on
the compiler in order to make the programmer's life easier. Those
classes too, started as big behemoths, so slow that they scared
everyone.

Anyway, I will not defend this to the end of the world. Actually,
if you look in my original post, you will see that this is a
simple question, not a suggestion.

A bit shameful.
```
Dec 11 2012
"foobar" <foo bar.com> writes:
```On Tuesday, 11 December 2012 at 16:35:39 UTC, Andrei Alexandrescu wrote:
On 12/11/12 11:29 AM, eles wrote:
There's a lot to be discussed on the issue. A few quick thoughts:

* 32-bit integers are a sweet spot for CPU architectures. There's
rarely a provision for 16- or 8-bit operations; the action is
at 32- or 64-bit.

Speed can still be optimized by the compiler, behind the scenes.
The approach does not ask the compiler to promote everything to
the widest-integral, but to do the job "as if". Currently, the
choice of int-C as the fastest-integral instead of widest-integral
moves the burden from the compiler to the user.

Agreed. But then that's one of them "sufficiently smart compiler"
arguments. http://c2.com/cgi/wiki?SufficientlySmartCompiler

* Then, although most 64-bit operations are as fast as 32-bit
ones, transporting operands takes twice as much internal bus real
estate and sometimes twice as much core real estate (i.e. there
are units that do either two 32-bit ops or one 64-bit op).

* The whole reserving a bit and halving the range means extra
costs of operating with a basic type.

Yes, there is a cost. But, as always, there is a balance between
advantages and drawbacks. What is favourable? Simplicity of
promotion or a supplementary bit?

A direct and natural mapping between language constructs and
machine execution is very highly appreciated in the market D is
in. I don't see that changing in the foreseeable future.

Besides, at the end of the day, a half-approach would be to have a
widest-signed-integral and a widest-unsigned-integral type and
only play with those two.

D has terrific abstraction capabilities. Leave primitive types
alone and define a UDT that implements your desired behavior.
You can always implement safe on top of fast but not the other
way around.

Andrei

All of the above relies on the assumption that the safety problem
is due to the memory layout. There are many other programming
languages that solve this by using a different point of view -
the problem lies in the implicit casts and not the memory layout.
In other words, the culprit is code such as:
uint a = -1;
which compiles under C's implicit coercion rules but _really
shouldn't_.
The semantically correct way would be something like:
uint a = 0xFFFF_FFFF;
but C/C++ programmers tend to think the "-1" trick is less
verbose and "better".
Another way is to explicitly state the programmer's intention:
uint a = reinterpret!uint(-1); // no run-time penalty should occur

D decided to follow C's coercion rules which I think is a design
mistake but one that cannot be easily changed.

Perhaps as Andrei suggested, a solution would be to use a higher
level "Integer" type defined in a library that enforces better
semantics.
```
Dec 11 2012
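(The reinterpret above is not a real library function; a minimal sketch of such a helper in D:)

```
// Hypothetical helper in the spirit of the example above: make the
// bit reinterpretation explicit, at no run-time cost beyond a copy.
T reinterpret(T, S)(S value) if (T.sizeof == S.sizeof) {
    return *cast(T*)&value;
}

void main() {
    uint a = reinterpret!uint(-1); // explicitly: all bits set
    assert(a == 0xFFFF_FFFF);
}
```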
"bearophile" <bearophileHUGS lycos.com> writes:
```Walter Bright:

Why stop at 64 bits? Why not have only one integral
type, of whatever precision is necessary to hold the
value? This is quite doable, and has been done.

I think no one has asked for *bignums on default* in this thread.

But at a terrible performance cost.

Nope, this is a significant fallacy of yours.
Common Lisp (and OCaML) uses tagged integers by default, and they
are very far from being "terrible". Tagged integers cause no heap
allocations if they aren't large. Also, the Common Lisp compiler
in various situations is able to infer that an integer can't be
too large, replacing it with some fixnum. And it's easy to add
annotations in critical spots to ask the Common Lisp compiler to
use a fixnum, to squeeze out all the performance.
The result is code that's quick, for most situations. But it's
more often correct. In D you drive with eyes shut; sometimes for
me it's hard to know if some integral overflow has occurred in a
long computation.

And, yes, in D you can create your own "BigInt" datatype which
exhibits this behavior.

Currently D bigints don't have short int optimization. And even
when this library problem is removed, I think the compiler
doesn't perform on BigInts the optimizations it does on ints,
because it doesn't know about bigint properties.

Bye,
bearophile
```
Dec 11 2012
"eles" <eles eles.com> writes:
``` Besides, at the end of the day, a half-approach would be to
have a widest-signed-integral and a widest-unsigned-integral
type and only play with those two.

Clarification: to have those two types as fundamental (ie:
promotion-favourite) types, not the sole types in the language.
```
Dec 11 2012
"eles" <eles eles.com> writes:
``` I thought my answer wasn't all that shoddy and not defensive at
all.

I step back. I agree. Thank you.
```
Dec 11 2012
deadalnix writes:
```On Tuesday, 11 December 2012 at 21:57:38 UTC, Walter Bright wrote:
On 12/11/2012 10:45 AM, bearophile wrote:
Walter Bright:

Why stop at 64 bits? Why not have only one integral type, of
whatever precision is necessary to hold the value? This is quite
doable, and has been done.

I think no one has asked for *bignums on default* in this thread.

I know they didn't ask. But they did ask for 64 bits, and the
exact same
argument will apply to bignums, as I pointed out.

Agreed.

But at a terrible performance cost.

Nope, this is a significant fallacy of yours. Common Lisp (and
OCaML) uses tagged integers by default, and they are very far
from being "terrible". Tagged integers cause no heap allocations
if they aren't large. Also, the Common Lisp compiler in various
situations is able to infer that an integer can't be too large,
replacing it with some fixnum. And it's easy to add
annotations in critical spots to ask the Common Lisp compiler
to use a fixnum, to squeeze out all the performance.

I don't notice anyone reaching for Lisp or Ocaml for high
performance applications.

I don't know about Common Lisp performance, never having tried it in
something where that really matters. But OCaml is really very
performant. I don't know how it handles integers internally.

That's irrelevant to this discussion. It is not a problem with
the language.
Anyone can improve the library one if they desire, or do their
own.

The library is part of the language. What is a language with no
vocabulary?

I think the compiler doesn't perform on BigInts the
optimizations it does on
ints, because it doesn't know about bigint properties.

I think the general lack of interest in bigints indicates that
the builtin types work well enough for most work.

That argument is fallacious. Something being more used doesn't really
mean it is better, or PHP and C++ are some of the best languages ever made.
```
Dec 11 2012
"eles" <eles eles.com> writes:
``` Somebody convinced somebody else of something on the Net.

About the non-defensiveness. As for the ints, I tend to consider
the matter controversial, but the balance (between drawbacks and
advantages) of either choice is more even than it seems.
```
Dec 11 2012
"foobar" <foo bar.com> writes:
```On Tuesday, 11 December 2012 at 22:08:15 UTC, Walter Bright wrote:
On 12/11/2012 10:44 AM, foobar wrote:
All of the above relies on the assumption that the safety
problem is due to the
memory layout. There are many other programming languages that
solve this by
using a different point of view - the problem lies in the
implicit casts and not
the memory layout. In other words, the culprit is code such as:
uint a = -1;
which compiles under C's implicit coercion rules but _really
shouldn't_.
The semantically correct way would be something like:
uint a = 0xFFFF_FFFF;
but C/C++ programmers tend to think the "-1" trick is less
verbose and "better".

Trick? Not at all.

1. -1 is the size of an int, which varies in C.

2. -i means "complement and then increment".

3. Would you allow 2-1? How about 1-1? (1-1)-1?

Arithmetic in computers is different from the math you learned
in school. It's 2's complement, and it's best to always keep
that in mind when writing programs.

Thanks for proving my point. After all, you are a C++ developer,
aren't you? :)
Seriously though, it _is_ a trick and a code smell.
I'm fully aware that computers use 2's complement. I'm also
aware of the fact that the type has an "unsigned" label all over
it. You see it right there in the 'u' prefix of 'int'. An
unsigned type should semantically entail _no sign_ in its
operations. You are calling a cat a dog and arguing that dogs
barf? Yeah, I completely agree with that notion, except, we are
_still talking about a cat_.

To answer your question, yes, I would enforce overflow and
underflow checking semantics. Any negative result assigned to an
unsigned type _is_ a logic error.
You can claim that:
uint a = -1;
is perfectly safe and has a well defined meaning (well, for C
programmers that is), but what about:
uint a = b - c;
what if that calculation results in a negative number? What
should the compiler do? well, there are _two_ equally possible
solutions:
a. The overflow was intended as in the mask = -1 case; or
b. The overflow is a _bug_.

The user should be made aware of this and should make the
decision how to handle this. This should _not_ be implicitly
handled by the compiler and allow bugs go unnoticed.

I think C# solved this _way_ better than C/D. Another data point
would be (S)ML, a compiled language which requires _explicit
conversions_ and has a very strong typing system. Its programs are
compiled to efficient native executables, and the strong typing
allows both the compiler and the programmer to reason better about
the code. Thus programs are more correct and can be better
optimized by the compiler. In fact, several languages are
implemented in ML because of its stronger guarantees.
```
Dec 11 2012
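foobar's two cases are easy to reproduce; both lines below compile silently in D today, whether the wraparound is an intended mask or a bug:
```
void main()
{
    uint mask = -1;      // intended wraparound: 0xFFFF_FFFF, accepted silently
    assert(mask == 0xFFFF_FFFF);

    uint b = 2, c = 3;
    uint a = b - c;      // possibly a bug: wraps to uint.max, no diagnostic
    assert(a == uint.max);
}
```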
"bearophile" <bearophileHUGS lycos.com> writes:
```Walter Bright:

I don't notice anyone reaching for Lisp or Ocaml for high
performance applications.

Nowadays Common Lisp is not used much for anything (though people
at ITA use it to plan flights; their code is efficient,
algorithmically complex, and used under heavy loads).

OCaML on the other hand is regarded as quite fast (but it's not
much used in general); it's sometimes used for its high performance
united with its greater safety, so someone uses it in finance:

https://ocaml.janestreet.com/?q=node/61
https://ocaml.janestreet.com/?q=node/82

I think the compiler doesn't perform on BigInts the
optimizations it does on
ints, because it doesn't know about bigint properties.

I think the general lack of interest in bigints indicate that
the builtin types work well enough for most work.

Where do you see this general lack of interest in bigints? In D
or in other languages?

I use bigints often in D. In Python we use only bigints. In
Scheme, OcaML and Lisp-like languages multi-precison numbers are
the default ones. I think if you give programmers better bigints
(this means efficient and usable as naturally as ints), they will
use them.

I think currently in D there is no way to make bigints as
efficient as ints, because there is no way to express in D the
full semantics of integral numbers that ints have. This is a
language limitation. One way to solve this problem, and keep
BigInts as Phobos code, is to introduce a built-in attribute
that's usable to mark user-defined structs as int-like.

----------------------

But OCaml is really very performant.<

It's fast considering it's a mostly functional language.

OCaML Vs C++ in the Shootout:

http://shootout.alioth.debian.org/u32/benchmark.php?test=all&lang=ocaml&lang2=gpp

http://shootout.alioth.debian.org/u32/benchmark.php?test=all&lang=ocaml&lang2=ghc

But as usual you have to take such comparisons cum grano salis,
because there are a lot more people working on the GHC compiler
and because the Shootout Haskell solutions are quite un-idiomatic
(you can see it also from the Shootout site itself, taking a look
at the length of the solutions) and they come from several years
of maniac-level discussions (they have patched the Haskell
compiler and its library several times to improve the results of
those benchmarks):

I don't know how it handles integers internally.<

It uses tagged integers that are 31 or 63 bits long, with the tag
on the least significant side:

http://stackoverflow.com/questions/3773985/why-is-an-int-in-ocaml-only-31-bits

Bye,
bearophile
```
Dec 11 2012
"bearophile" <bearophileHUGS lycos.com> writes:
```foobar:

I would enforce overflow and underflow checking semantics.<

Plus one or two switches to disable such checking, if/when someone
wants it, to regain the C performance. (Plus some syntactic way to
disable/enable such checking in a small piece of code.)

Bye,
bearophile
```
Dec 11 2012
"foobar" <foo bar.com> writes:
```On Wednesday, 12 December 2012 at 00:06:53 UTC, bearophile wrote:
foobar:

I would enforce overflow and underflow checking semantics.<

Plus one or two switches to disable such checking, if/when
someone wants it, to regain the C performance. (Plus some
syntax way to disable/enable such checking in a small piece of
code).

Bye,
bearophile

Yeah, of course, that's why I said the C# semantics are _way_
better. (That's a self quote)

btw, here's the link for SML which does not use tagged ints -
http://www.standardml.org/Basis/word.html#Word8:STR:SPEC

"Instances of the signature WORD provide a type of unsigned
integer with modular arithmetic and logical operations and
conversion operations. They are also meant to give efficient
hardware, and support bit-level operations on integers. They are
not meant to be a ``larger'' int. "
```
Dec 11 2012
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
```On Wed, Dec 12, 2012 at 01:26:08AM +0100, foobar wrote:
On Wednesday, 12 December 2012 at 00:06:53 UTC, bearophile wrote:
foobar:

I would enforce overflow and underflow checking semantics.<

Plus one or two switches to disable such checking, if/when someone
wants it, to regain the C performance. (Plus some syntax way to
disable/enable such checking in a small piece of code).

I don't agree that compiler switches should change language semantics.
Just because you specify a certain compiler switch, it can cause
unrelated breakage in some obscure library somewhere, that assumes
modular arithmetic with C/C++ semantics. And this breakage will in all
likelihood go *unnoticed* until your software is running on the
customer's site and then it crashes horribly. And good luck debugging
that, because the breakage can be very subtle, plus it's *not* in your
own code, but in some obscure library code that you're not familiar
with.

I think a much better approach is to introduce a new type (or new types)
that *does* have the requisite bounds checking and static analysis.
That's what a type system is for.

[...]
Yeah, of course, that's why I said the C# semantics are _way_
better. (That's a self quote)

btw, here's the link for SML which does not use tagged ints -
http://www.standardml.org/Basis/word.html#Word8:STR:SPEC

"Instances of the signature WORD provide a type of unsigned integer
with modular arithmetic and logical operations and conversion
primitive machine word types of the underlying hardware, and support
bit-level operations on integers. They are not meant to be a
``larger'' int. "

It's kinda too late for D to rename int to word, say, but it's not too
late to introduce a new checked int type, say 'number' or something like
that (you can probably think of a better name).

In fact, Andrei describes a CheckedInt type that uses operator
overloading to implement bounds checks.
You can probably expand that into a workable lightweight int
replacement. By wrapping an int in a struct with custom operators, you
can pretty much have an int-sized type (with value semantics, just like
"native" ints, no less!) that does what you want, instead of the usual
C/C++ int semantics.

T

--
In a world without fences, who needs Windows and Gates? -- Christian Surchi
```
Dec 11 2012
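A minimal sketch of the wrapper H. S. Teoh describes (this CheckedInt is hypothetical, not the TDPL one, and only + is shown): an int-sized value type whose operators detect overflow instead of silently wrapping.
```
struct CheckedInt
{
    private int value;

    this(int v) { value = v; }

    CheckedInt opBinary(string op : "+")(CheckedInt rhs) const
    {
        immutable long wide = cast(long) value + rhs.value; // compute in 64 bits
        if (wide < int.min || wide > int.max)
            assert(0, "integer overflow");                  // or throw, or saturate
        return CheckedInt(cast(int) wide);
    }

    int get() const { return value; }
}

unittest
{
    auto a = CheckedInt(int.max);
    auto b = a + CheckedInt(-1);  // fine: stays in range
    assert(b.get == int.max - 1);
}
```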
"bearophile" <bearophileHUGS lycos.com> writes:
```Walter Bright:

ML has been around for 30-40 years, and has failed to catch on.

OcaML, Haskell, F#, and so on are all languages derived more or
less directly from ML, that share many of its ideas. Has Haskell
caught on? :-)

Bye,
bearophile
```
Dec 11 2012
"bearophile" <bearophileHUGS lycos.com> writes:
```H. S. Teoh:

Just because you specify a certain compiler switch, it can cause
unrelated breakage in some obscure library somewhere, that
assumes modular arithmetic with C/C++ semantics.

The idea was about two switches, one for signed integrals, and
the other for both signed and unsigned. But from other posts I
guess Walter doesn't think this is a viable possibility.

So the solutions I see now are to stop using D for some kinds of
more important programs, or to use some kind of SafeInt, and then
work with the compiler writers to allow user-defined structs to be
usable as naturally as possible as ints (and possibly as
efficiently).

Regarding SafeInt, I think today there is no way to write it
efficiently in D, because the overflow flags are not accessible
from D, and if you use inlined asm, you lose inlining in DMD. This
is just one of the problems. Another problem is the syntax
incompatibilities of user-defined structs compared to built-in
ints. Yet another is the probable lack of high-level optimizations
done on such a user-defined type.

We are very far from a good solution to such problems.

Bye,
bearophile
```
Dec 11 2012
"bearophile" <bearophileHUGS lycos.com> writes:
```Walter Bright:

The way to deal with this is to examine the implementation of
CheckedInt, and design a couple of compiler intrinsics to use
in its implementation that will eliminate the asm code.

OK, good. I didn't think of this option.

Using intrinsics deals with this issue nicely, as the optimizer
knows about them.

OK.

Maybe you are right. I think I have never said there is a lot of

Bye,
bearophile
```
Dec 11 2012
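For reference, intrinsics of exactly this kind later landed in druntime as core.checkedint; a sketch assuming that module, checking a 16 + 8*b + c address-style calculation step by step (the overflow flag is sticky, so one test at the end suffices):
```
import core.checkedint : adds, muls;

// 16 + 8*b + c, with every step checked for signed overflow
int checkedAddressCalc(int b, int c)
{
    bool overflow = false;
    immutable r = adds(adds(16, muls(8, b, overflow), overflow), c, overflow);
    assert(!overflow, "16 + 8*b + c overflowed");
    return r;
}

unittest
{
    assert(checkedAddressCalc(2, 3) == 35);
}
```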
d coder <dlang.coder gmail.com> writes:
```On Wed, Dec 12, 2012 at 8:14 AM, Walter Bright
<newshound2 digitalmars.com> wrote:

(This is how the high level vector library Manu is implementing is
done.)

Greetings

regards
- Puneet
```
Dec 11 2012
"jerro" <a a.com> writes:
```On Wednesday, 12 December 2012 at 04:42:57 UTC, d coder wrote:
On Wed, Dec 12, 2012 at 8:14 AM, Walter Bright
<newshound2 digitalmars.com>wrote:

(This is how the high level vector library Manu is
implementing is done.)

Greetings

regards
- Puneet

The code is at https://github.com/TurkeyMan/phobos

It doesn't have anything to do with checked integers, though -
Walter was just using it as an example of an approach that we
could also use with checked integers.
```
Dec 11 2012
"David Piepgrass" <qwertie256 gmail.com> writes:
``` The problem, as I see it, is nobody actually cares about this.
Why would I say something so provocative? Because I've seen D
programmers go to herculean lengths to get around problems they
are having in the language. These efforts make a strong case
that they need better language support (UDAs are a primo
example of this). I see nobody bothering to write a CheckedInt
type and seeing how far they can push it, even though writing
such a type is not a significant challenge; it's a

I disagree with the analysis. I do want overflow detection, yet I
would not use a CheckedInt in D for the same reason I do not
usually use one in C++: without compiler support, it is too
expensive to detect overflow. In my C++ I have a lot of math to
do, and I'm using C++ because it's faster than C# which I would
otherwise prefer. Constantly checking for overflow without
hardware support would kill most of the performance advantage, so
I don't do it.

I do use "clipped conversion" though: e.g.
ClippedConvert<short>(40000)==32767. I can afford the overhead in
this case because I don't do type conversions as often as

The C# solution is not good enough either. C# throws exceptions
on overflow, which is convenient but is bad for performance if it
happens regularly; it can also make a debugger almost unusable.
Some sort of mechanism that works like an exception, but faster,
would probably be better. Consider:

result = a * b + c * d;

If a * b overflows, there is probably no point to executing c * d
so it may as well jump straight to a handler; on the other hand,
the exception mechanism is costly, especially if the debugger is
hooked in and causes a context switch every single time it
happens. So... I dunno. What's the best semantic for an overflow
detector?
```
Dec 11 2012
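The clipped conversion mentioned above is cheap to express in D (clipToShort is a hypothetical helper mirroring the ClippedConvert<short>(40000)==32767 behaviour described): saturate at the target type's bounds instead of wrapping or throwing.
```
short clipToShort(int value)
{
    if (value > short.max) return short.max;   //  40_000 ->  32_767
    if (value < short.min) return short.min;   // -40_000 -> -32_768
    return cast(short) value;
}

unittest
{
    assert(clipToShort(40_000) == 32_767);
    assert(clipToShort(-40_000) == -32_768);
    assert(clipToShort(123) == 123);
}
```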
"bearophile" <bearophileHUGS lycos.com> writes:
```David Piepgrass:

I do want overflow detection, yet I would not use a CheckedInt
in D for the same reason I do not usually use one in C++:
without compiler support, it is too expensive to detect
overflow.

Here I have listed several problems in a library-defined SafeInt,
but Walter has expressed willingness to introduce intrinsics, to
give some compiler support, so it's a start of a solution:

Bye,
bearophile
```
Dec 11 2012
"Christopher Appleyard" <s45267935 sccb.ac.uk> writes:
```Hai :D I have seen your D programming language on Google, it looks
cool! How different is it from the C programming language?
```
Dec 12 2012
"foobar" <foo bar.com> writes:
```On Wednesday, 12 December 2012 at 00:43:39 UTC, H. S. Teoh wrote:
On Wed, Dec 12, 2012 at 01:26:08AM +0100, foobar wrote:
On Wednesday, 12 December 2012 at 00:06:53 UTC, bearophile
wrote:
foobar:

I would enforce overflow and underflow checking semantics.<

Plus one or two switches to disable such checking, if/when
someone
wants it, to regain the C performance. (Plus some syntax way
to
disable/enable such checking in a small piece of code).

I don't agree that compiler switches should change language
semantics.
Just because you specify a certain compiler switch, it can cause
unrelated breakage in some obscure library somewhere, that
assumes
modular arithmetic with C/C++ semantics. And this breakage will
in all
likelihood go *unnoticed* until your software is running on the
customer's site and then it crashes horribly. And good luck
debugging
that, because the breakage can be very subtle, plus it's *not*
in your
own code, but in some obscure library code that you're not
familiar
with.

I think a much better approach is to introduce a new type (or
new types)
that *does* have the requisite bounds checking and static
analysis.
That's what a type system is for.

[...]
Yeah, of course, that's why I said the C# semantics are _way_
better. (That's a self quote)

btw, here's the link for SML which does not use tagged ints -
http://www.standardml.org/Basis/word.html#Word8:STR:SPEC

"Instances of the signature WORD provide a type of unsigned
integer
with modular arithmetic and logical operations and conversion
primitive machine word types of the underlying hardware, and
support
bit-level operations on integers. They are not meant to be a
``larger'' int. "

It's kinda too late for D to rename int to word, say, but it's
not too
late to introduce a new checked int type, say 'number' or
something like
that (you can probably think of a better name).

In fact, Andrei describes a CheckedInt type that uses operator
overloading to implement bounds checks.
You can probably expand that into a workable lightweight int
replacement. By wrapping an int in a struct with custom
operators, you
can pretty much have an int-sized type (with value semantics,
just like
"native" ints, no less!) that does what you want, instead of
the usual
C/C++ int semantics.

T

I didn't say D should change the implementation of integers; in
fact I said the exact opposite - that it's probably too late to
change the semantics for D. Had D been designed from scratch, then
yes, I would have advocated for a different design: either the C#
one or, as you suggest, going even further and having two distinct
types (as in SML), which is even better. But by no means do I
suggest changing D semantics _now_. Sadly, it's likely too late,
and we can only try to paper over it with additional library
types. This isn't a perfect solution, since the compiler has
builtin knowledge about int and does optimizations that will be
lost with a library type.
```
Dec 12 2012
"foobar" <foo bar.com> writes:
```On Wednesday, 12 December 2012 at 00:51:19 UTC, Walter Bright
wrote:
On 12/11/2012 3:44 PM, foobar wrote:
Thanks for proving my point. after all , you are a C++
developer, aren't you? :)

No, I'm an assembler programmer. I know how the machine works,
and C, C++, and D map onto that, quite deliberately. It's one
reason why D supports the vector types directly.

Seriously though, it _is_ a trick and a code smell.

Not to me. There is no trick or "smell" to anyone familiar with
how computers work.

I'm fully aware that computers used 2's complement. I'm also
am aware of the
fact that the type has an "unsigned" label all over it. You
see it right there
in that 'u' prefix of 'int'. An unsigned type should
semantically entail _no
sign_ in its operations. You are calling a cat a dog and
arguing that dogs barf?
Yeah, I completely agree with that notion, except, we are _still
talking about a cat_.

side). The inevitable result is that signed and unsigned types
*are* conflated in D, and have to be, otherwise many things
stop working.

For example, p[x]. What type is x?

Integer signedness in D is not really a property of the data,
it is only how one happens to interpret the data in a specific
context.

To answer you question, yes, I would enforce overflow and
underflow checking
semantics. Any negative result assigned to an unsigned type
_is_ a logic error.
you can claim that:
uint a = -1;
is perfectly safe and has a well defined meaning (well, for C
programmers that is), but what about:
uint a = b - c;
what if that calculation results in a negative number? What
should the compiler
do? well, there are _two_ equally possible solutions:
a. The overflow was intended as in the mask = -1 case; or
b. The overflow is a _bug_.

The user should be made aware of this and should make the
decision how to handle
this. This should _not_ be implicitly handled by the compiler
and allow bugs go
unnoticed.

I think C# solved this _way_ better than C/D.

C# has overflow checking off by default. It is enabled by
either using a checked { } block, or with a compiler switch. I
don't see that as "solving" the issue in any elegant or natural
way, it's more of a clumsy hack.

But also consider that C# does not allow pointer arithmetic, or
array slicing. Both of these rely on wraparound 2's complement
arithmetic.

Another data point would be (S)ML
which is a compiled language which requires _explicit
conversions_ and has a
very strong typing system. Its programs are compiled to
efficient native
executables and the strong typing allows both the compiler and
the programmer
better reasoning of the code. Thus programs are more correct
and can be
optimized by the compiler. In fact, several languages are
implemented in ML
because of its higher guaranties.

ML has been around for 30-40 years, and has failed to catch on.

This is precisely the point: signed and unsigned types are
conflated *in D*.
Other languages, namely ML, chose a different design.
ML chose to have two distinct types: word and int; word is for
binary data and int for integer numbers. Words provide efficient
modular machine arithmetic, while ints represent numbers and do
carry overflow checks. You can convert between the two, and the
compiler/run-time can carry special knowledge about such
conversions in order to provide better optimization.
In ML, array indexing is done with an int since it _is_
conceptually a number.

Btw, SML was standardized in '97. I'll also dispute the claim
that it hasn't caught on - there are many languages derived from
it, and it is just as large, if not larger, than the C family of
languages. It has influenced many languages, and it and its
derivations are being used. One example that comes to mind is that
the future version of JavaScript is implemented in ML. So no, not
forgotten but rather alive and kicking.
```
Dec 12 2012
"foobar" <foo bar.com> writes:
```On Wednesday, 12 December 2012 at 10:35:26 UTC, Walter Bright
wrote:
On 12/12/2012 2:33 AM, foobar wrote:
This isn't a perfect solutions
since the compiler has builtin knowledge about int and does
optimizations that
will be lost with a library type.

Yeah, just saw that :)
So basically you're suggesting implementing Integer and Word
library types using compiler intrinsics, as a way to migrate to
better, ML-compatible semantics. This is a possible solution if it
can be proven to work.

Regarding performance and overflow checking, the example you give
is x86 specific. What about other platforms? For example, ARM is
very popular nowadays in the mobile world, and there are many more
smart-phones out there than there are PCs. Does the same issue
exist there, and if not (I suspect not, but really have no idea),
should D be geared towards current platforms or future ones?
```
Dec 12 2012
"bearophile" <bearophileHUGS lycos.com> writes:
```foobar:

So basically you're suggesting to implement Integer and Word
library types using compiler intrinsics as a way to migrate to
better ML compatible semantics.

I think there were no references to ML in that part of Walter's
answer.

Regarding performance and overflow checking, the example you
give is x86 specific. What about other platforms? For example
ARM is very popular nowadays in the mobile world and there are
many more smart-phones out there than there are PCs. Is the
same issue exists and if not (I suspect not, but really have no
idea) should D be geared towards current platforms or future
ones?

Currently DMD (and a bit D too) is firmly oriented toward x86,
with a moderate orientation toward 64 bit too. Manu has asked for
more attention toward ARM, but (as Andrei has said) maybe
finishing const/immutable/shared is better now.

Bye,
bearophile
```
Dec 12 2012
"Araq" <rumpf_a gmx.de> writes:
``` Arithmetic in computers is different from the math you learned
in school. It's 2's complement, and it's best to always keep
that in mind when writing programs.

From http://embed.cs.utah.edu/ioc/

" Examples of undefined integer overflows we have reported:

An SQLite bug
Some problems in SafeInt
GNU MPC
PHP
Firefox
GCC
PostgreSQL
LLVM
Python

We also reported bugs to BIND and OpenSSL. Most of the SPEC CPU
2006 benchmarks contain undefined overflows."

So how does D improve on C's model? If signed integers are
required to wrap around in D (no undefined behaviour), you also
prevent some otherwise possible optimizations (there is a reason
it's still undefined behaviour in C).
```
Dec 12 2012
"bearophile" <bearophileHUGS lycos.com> writes:
```Araq:

So how does D improve on C's model?

There is some range analysis on shorter integral values. But
overall it shares the same troubles.

If signed integers are required to wrap around in D (no
undefined behaviour),

I think in D specs signed integers don't require the wrap-around
(so it's undefined behaviour).

Bye,
bearophile
```
Dec 12 2012
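The range analysis bearophile refers to is D's value range propagation: a narrowing assignment needs no cast when the expression's result provably fits the destination type. A small demonstration:
```
void main()
{
    int x = 123_456;
    ubyte low = x & 0xFF;     // accepted: result is provably in 0 .. 255
    short s = x & 0x7FFF;     // accepted: result is provably in 0 .. 32767
    // ubyte bad = x;         // rejected: an int doesn't provably fit a ubyte
    assert(low == 64 && s == 0x7E40);
}
```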
"Michael" <pr m1xa.com> writes:
```The machine/hardware has an explicitly defined register size and
knows nothing about sign or data type. The fastest operation is an
unsigned one that fits the register size.

For example, in your case, some algorithm coded with chained
if-checks may become unusable because it will be slow.

http://msdn.microsoft.com/ru-ru/library/74b4xzyw.aspx
By default checking is done only for constants. For expressions at
runtime it must be explicitly enabled.

I think this check must be handled by the developer, through a
library or the compiler.
```
Dec 12 2012
"Michael" <pr m1xa.com> writes:
``` And about C# checked:
http://msdn.microsoft.com/ru-ru/library/74b4xzyw.aspx
By default checking is done only for constants. For expressions at
runtime it must be explicitly enabled.

```
Dec 12 2012
"Max Samukha" <maxsamukha gmail.com> writes:
```On Wednesday, 12 December 2012 at 02:44:42 UTC, Walter Bright
wrote:
UDAs are a primo example of this.

OT: Why are those not allowed on module decls and local decls? We
can't use UDAs on decls in unittest blocks. We can't use a UDA to
mark a module reflectable, can't put an attribute on a
"voldemort" type, etc. Please don't introduce arbitrary
restrictions. That way you exclude many valid potential use
cases, a recurring pattern that constantly pisses off D users.
Features should be as general as reasonably possible. Otherwise,
they *do* make us go to herculean lengths.
```
Dec 12 2012
"Max Samukha" <maxsamukha gmail.com> writes:
```On Wednesday, 12 December 2012 at 08:00:09 UTC, Walter Bright
wrote:
On 12/11/2012 11:53 PM, Walter Bright wrote:
On 12/11/2012 11:47 PM, Han wrote:
Walter Bright wrote:

ML has been around for 30-40 years, and has failed to catch
on.

Isn't D on that same historical path?

Many languages wander in the wilderness for years before they
catch on.

BTW, many rock bands burst out of nowhere on the scene with
instant success. Overlooked is the previous 10 years the band
struggled in obscurity.

This includes bands like The Beatles. Well, 6 years for The
Beatles.

Led Zep, too. A long time ago I read some pseudo-scientific book
called "Heavy Metal" (I don't remember the author) which claimed
it is a rule: a couple of fans in the beginning, several years of
desperation, and only after that - fame and fortune. Of course,
the reality is much more complex.
```
Dec 12 2012
"eles" <eles eles.com> writes:
```On Wednesday, 12 December 2012 at 14:39:40 UTC, Michael wrote:
Machine/hardware have a explicitly defined register size and
does know nothing about sign and data type. fastest operation
is unsigned and fits to register size.

The machine/hardware knows nothing about virtual methods, either.

The question is: why should the DEVELOPER know about that register
size? There are many other things that the developer is unaware of
(read: SSE instructions), and those are optimized behind his back.

The choice to promote to the fastest type is a sweet thing for
the compiler, but a burden for the developer.

OTOH, I *never* asked for compulsory promotion, just mimicking it
(in fact, I was not asking for anything, just raising a question).
The idea is to guarantee, by the compiler, that the final result
of an integral arithmetic expression is AS IF all integrals there
were promoted to some widest-integral type.

In the current implementation too, speed is lost as soon as you
have a long in there, since you need promotion beyond int.
Dec 12 2012
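A concrete illustration of what the as-if-promoted guarantee would change; today the 32-bit multiply wraps before the assignment widens it:
```
void main()
{
    int a = 100_000;
    long r1 = a * a;             // 32-bit multiply wraps first: 1_410_065_408
    long r2 = cast(long) a * a;  // 64-bit multiply: 10_000_000_000
    assert(r1 == 1_410_065_408);
    assert(r2 == 10_000_000_000);
}
```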
"eles" <eles eles.com> writes:
``` OTOH, I *never* asked for compulsory promotion, just mimicking it
(in fact, I was not asking for anything, just raising a question).
The idea is to guarantee, by the compiler, that the final result
of an integral arithmetic expression is AS IF all integrals there
were promoted to some widest-integral type.

And the question is exactly that: what are the reasons to favor
one view over another? (that is int-C over int-FPC)

Is FPC that slow? Is C that easy? Weighing pros and cons?
```
Dec 12 2012
"Michael" <pr m1xa.com> writes:
```I read the whole thread and conclude that developers want a single
button: 'do everything I need'.

As mentioned above, for example, Python has an arbitrary-precision
int (which is implemented as a C library ;)).

C can be used on many platforms. For each platform the developer
has a solution in the form of a library. The right way is creating
something new instead of cutting away something that exists.

Each platform has its own limitations: memory, execution time,
etc. It's good if different platforms can communicate with each
other.

Not all algorithms consume lots of memory or need an
arbitrary-precision int.

In some cases we have a + or a -. The good/right way is: "-" ->
library solution -> "+".
Language features are fundamental features.
```
Dec 12 2012
"eles" <eles eles.com> writes:
``` For each platform the developer has a solution in the form of a
library. The right way is creating something new instead of
cutting away something that exists.

Moving some things from the library to the language is hard and
limiting, but sometimes it is worth the effort.

An example: threads. C/C++ have those as an external library (not
as part of the language). This is very nice, but it limits the
degree to which the compiler is able to optimize and to check the
correctness of code, since the very notion/concept of a thread is
alien to it.

Such a library can be optimized with respect to one compiler, at
most.

Take Java or D: here, threads are part of the language/standard
library, so the compiler knows about them.

(This issue is a bit off-topic, but it shows why it is important
that some things should be *standard*)
```
Dec 12 2012
"Michael" <pr m1xa.com> writes:
```A thread (etc.) is a high-level abstraction that requires support
by the hardware/software/instruction set. If necessary, a library
can be integrated into the language. And it's another one
```
Dec 12 2012
"eles" <eles eles.com> writes:
``` Thread (and etc) is a high level abstraction that requires a
support by hardware/software/instruction set.

Not only. First of all, it requires that the compiler *knows* and
*understands* the concept of a thread. This is why C mimicking C++
will *never* get as fast as a true C++ compiler, for the latter
*knows* what a class is and what to expect from it, what the goals
and the uses of such a concept are.

The same stands for any other thing. The idea is:
conceptualization.

A compiler that does not know what a class is will only partially
optimize, if at all. It is a blind compiler.
```
Dec 12 2012
"eles" <eles eles.com> writes:
```On Wednesday, 12 December 2012 at 21:51:00 UTC, Michael wrote:
Thread (and etc) is a high level abstraction that requires a
support by hardware/software/instruction set.

And you can happily do multi-threading on a single processor, with
no parallelization and so on. It is just time-slicing. This could
be implemented at many levels: at the hardware level, at the OS
level, but also at the compiler level (through a runtime).
```
Dec 12 2012
"Michael" <pr m1xa.com> writes:
```Even OOP is possible in asm.

It's completely OT ;)
```
Dec 12 2012
"bearophile" <bearophileHUGS lycos.com> writes:
```Walter Bright:

Consider running a server farm. If you can make your code 5%
faster, you need 5% fewer servers. That translates into
millions of dollars.

- I've seen Facebook start from PHP, go to PHP compiled in some
ways, and lately start to switch to faster languages; so when you
have tons of servers, the space and electricity used by CPUs
becomes important for the bottom line. On the other hand, on
similar server farms lots of other people use languages where
there is far more overhead, since the performance differences are
not more important than several other considerations, like coding
speed, how easy it is to find programmers, how cheap those
programmers are, etc., even on server farms.
- If your code is buggy (because of overflows, or other causes),
its output can be worthless or even harmful. This is why some
people are using OcaML for high-speed trading (I have given two
links in a preceding post), where bugs risk being quite costly.

Bye,
bearophile
```
Dec 12 2012
"ixid" <nuaccount gmail.com> writes:
```On Wednesday, 12 December 2012 at 21:27:35 UTC, Walter Bright
wrote:
On 12/12/2012 3:12 AM, foobar wrote:
Regarding performance and overflow checking, the example you
give is x86
specific. What about other platforms? For example ARM is very
popular in the mobile world and there are many more smart-phones out
there than there
are PCs. Is the same issue exists and if not (I suspect not,
but really have no
idea) should D be geared towards current platforms or future
ones?

I don't know the ARM instruction set.

I think it would be very hard, at this stage, to argue that you
should be putting your effort into ARM rather than x86. It's a
nice-to-have that doesn't seem very relevant to gaining traction
for D; areas like bioinformatics seem more relevant.
```
Dec 12 2012
```On Wednesday, 12 December 2012 at 22:36:35 UTC, ixid wrote:
On Wednesday, 12 December 2012 at 21:27:35 UTC, Walter Bright
wrote:
On 12/12/2012 3:12 AM, foobar wrote:
Regarding performance and overflow checking, the example you
give is x86
specific. What about other platforms? For example ARM is very
popular in the mobile world and there are many more smart-phones out
there than there
are PCs. Is the same issue exists and if not (I suspect not,
but really have no
idea) should D be geared towards current platforms or future
ones?

I don't know the ARM instruction set.

I think it would be very hard, at this stage, to argue that you
should be putting your effort into ARM rather than x86. It's a
nice to have that doesn't seem very relevant to gaining
traction for D, areas like bioinformatics seem more relevant.

http://santyhammer.blogspot.com/2012/11/something-is-changing-in-desktop.html
```
Dec 12 2012
```On Wednesday, 12 December 2012 at 23:47:26 UTC, Walter Bright
wrote:
On 12/12/2012 3:23 PM, Timon Gehr wrote:
On 12/12/2012 10:35 PM, Walter Bright wrote:
some algorithms are doomed to be slower.

Here's a (real) quicksort:

Ok, I'll bite.

Here's a program in Haskell and D that reads from standard in,
splits into lines, sorts the lines, and writes the result the
standard out:

==============================
import Data.List
import qualified Data.ByteString.Lazy.Char8 as L
main = L.interact \$ L.unlines . sort . L.lines
==============================
import std.stdio;
import std.array;
import std.algorithm;
void main() {
    stdin.byLine(KeepTerminator.yes).
        map!(a => a.idup).
        array.
        sort.
        copy(
            stdout.lockingTextWriter());
}
===============================

The D version runs twice as fast as the Haskell one. Note that
there's nothing heroic going on with the D version - it's
straightforward dumb code.

You'll find a lot of traps like that in D, some of which can kill
your perfs. For instance:

stdin.byLine(KeepTerminator.yes).
map!(a => a.idup).
filter!(a => a).
array

And bazinga, you just doubled the number of memory allocations.
```
Dec 12 2012
"David Piepgrass" <qwertie256 gmail.com> writes:
```On Wednesday, 12 December 2012 at 06:19:14 UTC, Walter Bright
wrote:
You're not going to get performance with overflow checking even
with the best compiler support. For example, much arithmetic
code is generated for the x86 using addressing mode
instructions, like:

LEA EAX,16[8*EBX][ECX]  for 16+8*b+c

The LEA instruction does no overflow checking. If you wanted
it, the best code would be:

MOV EAX,16
IMUL EBX,8
JO overflow
ADD EAX,EBX
JO overflow
ADD EAX,ECX
JO overflow

Which is considerably less efficient. (The LEA is designed to
run in one cycle). Plus, often more registers are modified
which impedes good register allocation.

Thanks for the tip. Of course, I don't need and wouldn't use
overflow checking all the time--in fact, since I've written a big
system in a language that can't do overflow checking, you might
say I "never need" overflow checking, in the same way that C
programmers "never need" constructors, destructors, generics or
exceptions as demonstrated by the fact that they can and do build
large systems without them.

Still, the cost of overflow checking is a lot bigger, and
requires a lot more cleverness, without compiler support. Hence I
work harder to avoid the need for it.

If you desire overflows to be programming errors, then you want
an abort, not a thrown exception. I am perplexed by your desire
to continue execution when overflows happen regularly.

I explicitly say I want to handle overflows quickly, and you
conclude that I want an unrecoverable abort? WTF! No, I think
overflows should be handled efficiently, and should be nonfatal.

Maybe it would be better to think in terms of the carry flag: it
seems to me that a developer needs access to the carry flag in
order to do 128+bit arithmetic efficiently. I have written code
to "make do" without the carry flag, it's just more efficient if
it can be used. So imagine an intrinsic that gets the value of
the carry flag*--obviously it wouldn't throw an exception. I just
think overflow should be handled the same way. If the developer
wants to react to overflow with an exception/abort, fine, but it
should not be mandatory as it is in .NET.

* Yes, I know you'd usually just ADC instead of retrieving the
actual value of the flag, but sometimes you do want to just get
the flag.

Usually when there is an overflow I just want to discard one data
point and move on, or set the result to the maximum/minimum
integer, possibly make a note in a log, but only occasionally do
I want the debugger to break.
```
Dec 12 2012
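Multi-word arithmetic without carry-flag access looks roughly like this in D (UInt128 is a hypothetical stand-in): the carry out of the low word is recomputed with a comparison where an ADC would simply have read the flag.
```
struct UInt128 { ulong lo, hi; }

UInt128 add128(UInt128 a, UInt128 b)
{
    UInt128 r;
    r.lo = a.lo + b.lo;                  // may wrap around
    immutable ulong carry = r.lo < a.lo; // wrapped => carry out of the low word
    r.hi = a.hi + b.hi + carry;
    return r;
}

unittest
{
    auto s = add128(UInt128(ulong.max, 0), UInt128(1, 0));
    assert(s.lo == 0 && s.hi == 1);      // carry propagated into the high word
}
```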
"bearophile" <bearophileHUGS lycos.com> writes:
```Walter Bright:

Java makes no attempt to detect integer overflows.

There are various kinds of code. In some kinds of programs you
want to be more sure that the result is correct, while other
kinds of programs this need is less strong.

I personally know people who write high speed trading software.
These people are concerned with nanosecond delays. They write
code in C++. They even hack on the compiler trying to get it to
generate faster code.

It doesn't surprise me a bit that some people who operate
server farms use slow languages like Ruby, Python, and Perl on
them. This does cost them money for extra hardware. There are
always going to be businesses that have inefficient operations,
poorly allocated resources, and who leave a lot of money on the
table.

One "important" firm uses OcaML for high speed trading because
it's both very fast (C++-class fast, faster than Java on certain
kinds of code, if well used) and apparently quite a bit safer to
use than C/C++. And it's harder to find OcaML programmers than C++
ones.

Bye,
bearophile
```
Dec 12 2012
"foobar" <foo bar.com> writes:
```On Wednesday, 12 December 2012 at 21:05:05 UTC, Walter Bright
wrote:
On 12/12/2012 2:53 AM, foobar wrote:
One example that comes to mind is the
future version of JavaScript is implemented in ML.

Um, there are many implementations of Javascript. In fact, I
have implemented it in both C++ and D.

Completely beside the point.
The discussion was about using ML in real life, not what you
specifically chose to use. The fact that you use D has no bearing
on your own assertion that ML is effectively dead. The fact is
that _other people and organizations_ do use it to great effect.
Also, I said the _future version_ of JS, meaning the next version
of the ECMAScript standard. ML was specifically chosen as it
allows the implementation to be both efficient and verifiably
correct.
```
Dec 13 2012
"jerro" <a a.com> writes:
``` it's both very fast (C++-class fast, faster than Java on
certain kinds of code, if well used) and apparently quite safer

Last I tried OCaml, "well used" in the context of performance
meant avoiding many useful abstractions. One thing I remember is
that using functors always has a run-time cost, and I don't see
why it should.
```
Dec 13 2012
"xenon325" <1 a.net> writes:
```On Wednesday, 12 December 2012 at 08:25:04 UTC, Han wrote:
Walter Bright wrote:
Overlooked is the previous 10 years the band struggled in
obscurity.

You KNOW that D has not been "overlooked". Developers and users
with
applications give it a look (the former mostly) and then choose
something else.

Overlooked? No. Using it? No. Disliked and abandoned? No.

Quite a few times I've seen on the web people saying something
like:
"D looks *really* nice, can't use it right now, but definitely
keeping eye on it"

Same with myself. Keeping eye on it since mid-2010.

Do you really think that D will ever have popularity to the
level of The Beatles?

From what I've seen so far, I'd say that's quite possible.

Do you have a "Moses complex" (psychological) a little bit
maybe?

You do understand this doesn't add anything to the discussion,
right?
```
Dec 13 2012
"SomeDude" <lovelydear mailmetrash.com> writes:
```On Wednesday, 12 December 2012 at 20:01:43 UTC, Timon Gehr wrote:
On 12/12/2012 03:45 AM, Walter Bright wrote:
On 12/11/2012 5:05 PM, bearophile wrote:
Walter Bright:

ML has been around for 30-40 years, and has failed to catch
on.

OcaML, Haskell, F#, and so on are all languages derived more
or less
directly
from ML, that share many of its ideas. Has Haskell caught on?
:-)

Haskell is the language that everyone loves to learn and talk
about, but that few actually use.

And it's significantly slower than D,

(Sufficiently sane) languages are not slow or fast and I think
the factor GHC/DMD cannot be more than about 2 or 3 for roughly
equivalently written imperative code.

Furthermore no D implementation has any kind of useful
performance for lazy functional style D code.

In some ways, D is very significantly slower than Haskell. The
compilers optimize specific coding styles better than others.

in unfixable ways.

I disagree. That is certainly fixable. It is a mere QOI issue.

Actually, a factor of 2 to 3 can be huge. Consider that Java is
within about a factor of 2 of C++ in the Computer Language
Benchmarks Game, and yet you can easily feel the difference every
day. But although the pure computational power is not very
different, the real difference, I believe, lies in the memory
management, which is probably far less efficient in Java than in
C++.
```
Dec 13 2012
"SomeDude" <lovelydear mailmetrash.com> writes:
```On Thursday, 13 December 2012 at 01:32:23 UTC, bearophile wrote:
Walter Bright:

Java makes no attempt to detect integer overflows.

There are various kinds of code. In some kinds of programs you
want to be more sure that the result is correct, while other
kinds of programs this need is less strong.

I personally know people who write high speed trading
software. These people are concerned with nanosecond delays.
They write code in C++. They even hack on the compiler trying
to get it to generate faster code.

It doesn't surprise me a bit that some people who operate
server farms use slow languages like Ruby, Python, and Perl on
them. This does cost them money for extra hardware. There are
always going to be businesses that have inefficient
operations, poorly allocated resources, and who leave a lot of
money on the table.

One "important" firm uses OcaML for high speed trading because
it's both very fast (C++-class fast, faster than Java on
certain kinds of code, if well used) and apparently quite safer
to use than C/C++. And it's harder to find OcaML programmers
than C++ ones.

Bye,
bearophile

According to the Benchmarks Game, the performance of OCaml is
good, but not fantastic. And certainly not "C++-class" fast. It's
more like "Java-class" fast (in fact it's slower than Java 7 on
most tests, but uses much more memory).
Unfortunately, D hasn't been on the game for a long time, but
last time it was, it was effectively faster than g++.

So really, we are not talking the same kind of performance here.
D is likely to be MUCH faster than Ocaml.
```
Dec 13 2012
"SomeDude" <lovelydear mailmetrash.com> writes:
```On Thursday, 13 December 2012 at 01:51:27 UTC, Timon Gehr wrote:
Certainly, you can argue that the faster version should be in a
prominent place in the standard library, but the fact that it
is not does not indicate a fundamental performance problem in
the Haskell language. Also, note that I am completely ignoring
what kind of code is idiomatic in both languages. Fast Haskell
code often looks similar to C code.

You can compare top performance for both languages, but the fact
is, if you write Haskell code extensively, you aren't going to
write it like C, so comparing idiomatic Haskell vs idiomatic D
does make sense. And comparing programs using the standard
libraries also makes sense because that's how languages are used.
It probably doesn't make much sense in a microbenchmark, but in a
larger program it certainly does. And if the standard library is
twice as slow in implementation A as in implementation B, then
most programs will feel *at least* twice as slow, and usually
more, because if you call a function f that's twice as slow in A
than in B from another function that's also twice as slow in A
than in B, then the whole thing is 4 times slower.
```
Dec 13 2012
"SomeDude" <lovelydear mailmetrash.com> writes:
```On Thursday, 13 December 2012 at 21:28:52 UTC, SomeDude wrote:
On Thursday, 13 December 2012 at 01:51:27 UTC, Timon Gehr wrote:

if the standard library is twice as slow in implementation A
as in implementation B, then most programs will feel *at least*
twice as slow, and usually more, because if you call a function
f that's twice as slow in A than in B from another function
that's also twice as slow in A than in B, then the whole thing
is 4 times slower.

```
Dec 13 2012
"evilrat" <evilrat666 gmail.com> writes:
```On Friday, 14 December 2012 at 08:04:55 UTC, Han wrote:
Then put up a real-time tracking chart of that on the D
website: "The
popularity of D vs. the popularity of The Beatles". I think
what you
answered goes to show the level of dissillusionment of (or
shamefully
insulting level of propagandism put forth by) the typical D
fanboy.

Nice try, Mr. Troll, but no: you came here and started saying crap
(sorry if that insults you) over almost 10 pages. If you really
need feature "X" you can always implement it yourself and maybe
even give your solution to the community; instead you are just
flooding with the same template: "how can D be number one (and no
one really wants D to be number one, people want a good usable
language, right?) if (people/devs) don't want to (...)?"

You are calling devs/people ignorant; they give you facts but you
don't want to accept them, so everyone is bad?

And now you are calling the whole community D fanboys. What is
wrong with you, dude?

P.S. I don't want to continue this time-wasting discussion.
```
Dec 14 2012
"evilrat" <evilrat666 gmail.com> writes:
```Never mind, I've lost the whole idea...
```
Dec 14 2012
"Isaac Gouy" <igouy2 yahoo.com> writes:
```On Tuesday, 11 December 2012 at 23:59:29 UTC, bearophile wrote:

-snip-

But as usual you have to take such comparisons cum grano salis,
because there are a lot more people working on the GHC compiler
and because the Shootout Haskell solutions are quite
un-idiomatic (you can see it also from the Shootout site
itself, taking a look at the length of the solutions) and they
come from several years of maniac-level discussions (they have
patched the Haskell compiler and its library several times to
improve the results of those benchmarks):

I looked at that haskellwiki page but I didn't find anything to
suggest -- "they have patched the Haskell compiler and its
library several times to improve the results of those benchmarks"?

Was it something the compiler writers told you?
```
Dec 15 2012
"SomeDude" <lovelydear mailmetrash.com> writes:
```On Saturday, 15 December 2012 at 17:11:01 UTC, Isaac Gouy wrote:
On Tuesday, 11 December 2012 at 23:59:29 UTC, bearophile wrote:

-snip-

But as usual you have to take such comparisons cum grano
salis, because there are a lot more people working on the GHC
compiler and because the Shootout Haskell solutions are quite
un-idiomatic (you can see it also from the Shootout site
itself, taking a look at the length of the solutions) and they
come from several years of maniac-level discussions (they have
patched the Haskell compiler and its library several times to
improve the results of those benchmarks):

I looked at that haskellwiki page but I didn't find anything to
suggest -- "they have patched the Haskell compiler and its
library several times to improve the results of those
benchmarks"?

Was it something the compiler writers told you?

Probably bearophile meant that the shootout allowed them to see
some weaknesses in some implementations, and therefore helped
them improve on those. It's also something that would benefit D
if, say, GDC were allowed to come back into the shootout. Given it's
now widely acknowledged (at least in the programming communities)
to be one of the most promising languages around...
```
Dec 16 2012
"jerro" <a a.com> writes:
``` if, say, GDC was granted to come back in the shootout. Given
it's now widely acknowledged (at least in the programming
communities) to be one of the most promising languages around...

And especially if you also consider the fact that Clean and ATS
are in the shootout, and I'm guessing that very few people use
those.
```
Dec 16 2012
"Isaac Gouy" <igouy2 yahoo.com> writes:
```On Sunday, 16 December 2012 at 15:45:32 UTC, jerro wrote:
if, say, GDC was granted to come back in the shootout. Given
it's now widely acknowledged (at least in the programming
communities) to be one of the most promising languages
around...

And especially if you also consider the fact that there Clean
and ATS are in the shootout and I'm guessing that very few
people use those.

See

http://www.digitalmars.com/d/archives/digitalmars/D/Why_did_D_leave_the_programming_language_shootout_and_will_it_return_144864.html#N144870
```
Dec 16 2012
"Isaac Gouy" <igouy2 yahoo.com> writes:
```On Sunday, 16 December 2012 at 13:05:50 UTC, SomeDude wrote:
-snip-
Was it something the compiler writers told you?

Probably bearophile meant that...

I can make my own guesses, but I wanted to know what bearophile
meant so I asked him ;-)
```
Dec 16 2012
"SomeDude" <lovelydear mailmetrash.com> writes:
```On Sunday, 16 December 2012 at 19:59:31 UTC, Isaac Gouy wrote:
On Sunday, 16 December 2012 at 15:45:32 UTC, jerro wrote:
if, say, GDC was granted to come back in the shootout. Given
it's now widely acknowledged (at least in the programming
communities) to be one of the most promising languages
around...

And especially if you also consider the fact that there Clean
and ATS are in the shootout and I'm guessing that very few
people use those.

See

http://www.digitalmars.com/d/archives/digitalmars/D/Why_did_D_leave_the_programming_language_shootout_and_will_it_return_144864.html#N144870

Still, you don't explain why you picked, say, ATS, which is
significantly more esoteric than D and much less likely to be used
by the community at large. I argue that many more people would be
interested in the performance of D.
```
Dec 16 2012
"SomeDude" <lovelydear mailmetrash.com> writes:
```On Sunday, 16 December 2012 at 23:21:15 UTC, SomeDude wrote:
On Sunday, 16 December 2012 at 19:59:31 UTC, Isaac Gouy wrote:
On Sunday, 16 December 2012 at 15:45:32 UTC, jerro wrote:
if, say, GDC was granted to come back in the shootout. Given
it's now widely acknowledged (at least in the programming
communities) to be one of the most promising languages
around...

And especially if you also consider the fact that there Clean
and ATS are in the shootout and I'm guessing that very few
people use those.

See

http://www.digitalmars.com/d/archives/digitalmars/D/Why_did_D_leave_the_programming_language_shootout_and_will_it_return_144864.html#N144870

Still, you don't explain why you picked, say ATS, which is
significantly more esoteric than D, and much less likely to be
used by the community in the large. I argue that many more
people would be interested in the performance of D.

The proof is that, it seems to me, you (Isaac Gouy) often come
around here. We can magically invoke you every time one talks
about the shootout. Which is pretty astonishing for a language you
aren't interested in.
```
Dec 16 2012
"Isaac Gouy" <igouy2 yahoo.com> writes:
```On Monday, 17 December 2012 at 01:14:37 UTC, Walter Bright wrote:
On 12/16/2012 3:24 PM, SomeDude wrote:
Proof is, it seems to me that you (Isaac Gouy) often come
around here. We can
magically invoke you every time one talks about the shootout.
Which is pretty
astonishing for a language you aren't interested in.

Not really. You can set Google to email you whenever a search
phrase turns up a new result.

Yes, that's more or less what I do.

I have a couple of Google searches saved as bookmarks, and when