
D - Unsigned

reply "Roberto Mariottini" <rmariottini lycosmail.com> writes:
Does D handle unsigned integers in a better way than C/C++?
Note that the following program is valid ISO C and ISO C++, and prints
out -112 and -123 without problems. To me it seems like trash.

#include <stdio.h>

unsigned int f(unsigned int n)
{
  return n - 100U;
}

int main()
{
  unsigned int u = -12;

  printf("the total is: %d\n", f(u));
  printf("the total is: %d\n", f(-23));

  return 0;
}

Ciao
Mar 19 2002
parent reply "Richard Krehbiel" <rich kastle.com> writes:
"Roberto Mariottini" <rmariottini lycosmail.com> wrote in message
news:a774v2$cjj$1 digitaldaemon.com...
 Does D handles unsigned integers in a better way than C/C++?
 Note that the following program is valid ISO C and ISO C++, and prints
 out -112 and -123 without problems. To me it seems trash.
What exactly do you expect?
 #include <stdio.h>

 unsigned int f(unsigned int n)
 {
   return n - 100U;
 }

 int main()
 {
   unsigned int u = -12;

   printf("the total is: %d\n", f(u));
   printf("the total is: %d\n", f(-23));

   return 0;
 }
This program is not "well-defined" ISO C. You've lied to printf, telling it to print a signed value ("%d"), while supplying an unsigned value. Try it again with "%u". (printf needs to be replaced with something type-safe. D doesn't plan to address this.)
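As a rough illustration of the "%u" suggestion, the sketch below reuses f from the original post and prints its result with a matching specifier. The helper name show is made up for this example, and the only assumption is that unsigned arithmetic wraps modulo UINT_MAX + 1, which ISO C guarantees.

```c
#include <limits.h>
#include <stdio.h>

/* Same function as in the original post: unsigned subtraction wraps
   around modulo UINT_MAX + 1 instead of going negative. */
unsigned int f(unsigned int n)
{
    return n - 100U;
}

/* Hypothetical helper: print the result with the matching "%u" specifier. */
void show(unsigned int n)
{
    printf("the total is: %u\n", f(n));
}
```

On a platform with a 32-bit unsigned int, show((unsigned int)-12) prints 4294967184 rather than -112.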
Mar 19 2002
parent reply "Roberto Mariottini" <rmariottini lycosmail.com> writes:
"Richard Krehbiel" <rich kastle.com> ha scritto nel messaggio
news:a77dur$j1u$1 digitaldaemon.com...
 "Roberto Mariottini" <rmariottini lycosmail.com> wrote in message
 news:a774v2$cjj$1 digitaldaemon.com...
 Does D handles unsigned integers in a better way than C/C++?
 Note that the following program is valid ISO C and ISO C++, and prints
 out -112 and -123 without problems. To me it seems trash.
What exactly do you expect?
Something like this:
 #include <stdio.h>

 unsigned int f(unsigned int n)
 {
   return n - 100U;
Here the subtraction should stay within unsigned (both operands being unsigned). A negative result (i.e. carry flag set) should raise some exception.
 }

 int main()
 {
   unsigned int u = -12;
This is a syntax/type error.
   printf("the total is: %d\n", f(u));
I agree that some substitute for printf is A MUST.
   printf("the total is: %d\n", f(-23));
Another syntax/type error.
   return 0;
 }
 This program is not "well-defined" ISO C. You've lied to printf, telling it
 to print a signed value ("%d"), while supplying an unsigned value.  Try it
 again with "%u".
It is 100% ISO C standard. The fact that it malfunctions is programmer-related, not language-related, as you stated.

The fact is that in C and C++, int and unsigned are EXACTLY the same type for most operations. The example shown with unsigned could be written with signed int and would behave exactly the same. The only difference arises in comparisons, when <, >, <= or >= are involved.

Note, anyway, that

    if (10u < -1)
    {
       printf ("I am a stupid language/compiler\n");
    }
    else
    {
       printf ("I am a smart language/compiler\n");
    }

always shows how stupid C/C++ is in handling unsignedness.

Ciao
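The comparison above comes out true because of C's usual arithmetic conversions: the int operand is converted to unsigned before the comparison. A minimal sketch of that rule, next to a value-preserving alternative; both function names are invented for this example.

```c
/* What the compiler does with `u < i`: the int operand is converted to
   unsigned first, so -1 becomes UINT_MAX. The cast spells that out. */
int less_as_c_sees_it(unsigned int u, int i)
{
    return u < (unsigned int)i;
}

/* Hypothetical value-preserving comparison: a negative int is smaller
   than every unsigned value, so u < i can only hold when i >= 0. */
int less_by_value(unsigned int u, int i)
{
    return i >= 0 && u < (unsigned int)i;
}
```

The second form costs one extra test per comparison, which is the kind of overhead the replies below argue about.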
Mar 19 2002
next sibling parent reply "Richard Krehbiel" <rich kastle.com> writes:
"Roberto Mariottini" <rmariottini lycosmail.com> wrote in message
news:a77kqt$ov7$1 digitaldaemon.com...
 "Richard Krehbiel" <rich kastle.com> ha scritto nel messaggio
 news:a77dur$j1u$1 digitaldaemon.com...
 "Roberto Mariottini" <rmariottini lycosmail.com> wrote in message
 news:a774v2$cjj$1 digitaldaemon.com...
 Does D handles unsigned integers in a better way than C/C++?
 Note that the following program is valid ISO C and ISO C++, and prints
 out -112 and -123 without problems. To me it seems trash.
What exactly do you expect?
Something like this:
 #include <stdio.h>

 unsigned int f(unsigned int n)
 {
   return n - 100U;
 Here the subtraction should be internal within unsigned (being both operand
 unsigned). A negative result (i.e. carry flag set) should raise some
 exception.
I use this property of unsigned arithmetic (that of being modulo UINT_MAX+1) a lot. If you're going to take it away, then replace it with something else. (I despise Java's lack of unsigned types and its insistence that overflow and underflow throw exceptions.)
 }

 int main()
 {
   unsigned int u = -12;
This is a syntax/type error.
How then should I represent the value that, when 12 is added to it, becomes zero? And keep in mind I'm still talking about modulo-2**32 math. (Visual Studio prints a warning about such signed/unsigned mismatches. I have resorted to turning them off.)
   printf("the total is: %d\n", f(u));
I agree that some substitute for printf is A MUST.
I have a sinister plan to write a macro language for D, a "DPP" if you will, that has full access to the D type system, and will allow writing a macro that offers a "print" statement with the following syntax:

    print "a = ", a, "b = ", b, "\n";
 The fact is that in C and C++, int and unsigned are EXACTLY the same type
 for most operations.
You've got it right there. This is a consequence of the fact that, on most architectures, there is only a single set of machine instructions for arithmetic, used for both signed and unsigned types. The underlying machine can only do modulo-2**n arithmetic, and attempts to coerce it otherwise adds significant run-time overhead. Since D is supposed to be for high-performance systems work, it'll work this way too. Sorry.
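A small sketch of the modulo arithmetic described here, under the assumption that only standard C unsigned semantics are in play; the function names are invented for the example.

```c
/* The value that becomes zero when k is added to it, modulo UINT_MAX + 1. */
unsigned int additive_inverse(unsigned int k)
{
    return 0U - k;
}

/* Add two signed values through unsigned arithmetic. The resulting bit
   pattern is the same one a two's complement signed add would produce,
   which is why a single machine instruction can serve both types. */
unsigned int add_bits(int a, int b)
{
    return (unsigned int)a + (unsigned int)b;
}
```

additive_inverse(12U) is the same value that the contested initializer `unsigned int u = -12;` stores.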
Mar 19 2002
parent Russell Borogove <kaleja estarcion.com> writes:
Richard Krehbiel wrote:
 "Roberto Mariottini" <rmariottini lycosmail.com> wrote in message
  unsigned int u = -12;
This is a syntax/type error.
How then should I represent the value that, when 12 is added to it, becomes zero? And keep in mind I'm still talking about modulo-2**32 math. (Visual Studio prints a warning about such signed/unsigned mismatches. I have resorted to turning them off.)
This is an interesting point -- most compilers warn on this. Under D, that means it's probably going to be considered an error, because Walter is trying to eliminate warnings. The warning-free C version is:

    unsigned int u = (unsigned)-12;

and in D:

    unsigned int u = cast(unsigned)-12;

Roberto, do you consider this a reasonable statement, or erroneous? As Richard indicates, there's a long history of using constructs like this. I had to do it yesterday myself in converting an alignment to a mask in memory management code.

-RB
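The post does not show the memory management code, so the following is only a guess at the sort of construct meant: negating an unsigned power-of-two alignment yields the mask that clears the low bits. The name align_down and the power-of-two requirement are assumptions of this sketch.

```c
/* Round addr down to a multiple of align.
   Assumption: align is a power of two. */
unsigned int align_down(unsigned int addr, unsigned int align)
{
    /* Same bits as (unsigned)-align, and as ~(align - 1). */
    unsigned int mask = 0U - align;
    return addr & mask;
}
```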
Mar 19 2002
prev sibling parent reply "Walter" <walter digitalmars.com> writes:
"Roberto Mariottini" <rmariottini lycosmail.com> wrote in message
news:a77kqt$ov7$1 digitaldaemon.com...
 The fact is that in C and C++, int and unsigned are EXACTLY the same type
 for most operations.
 The example shown as an unsigned can be treated as a signed int, behaving
 perfectly the same.
 The only difference arise in comparison when <, >, <= or >= are involved.
The /, %, and >> operators behave differently, as do conversions to wider types or to floating point types.

The reason that D does not change any of the semantics of operators, operator precedence, default integral promotions, etc., is that although many of the rules are byzantine, experienced C programmers have become very used to them. Subtly changing them will cause much grief in porting C code to D and porting C programmers to D <g> as weird bugs will appear in formerly working code.
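Two of the differences listed here can be checked directly. This sketch assumes long long is wider than unsigned int, as on common platforms; the function names are invented. Signed >> on a negative value is left out because its result is implementation-defined in C.

```c
#include <limits.h>

/* Division: the same bit pattern gives different quotients. */
int div2_signed(int x)
{
    return x / 2;
}

unsigned int div2_unsigned(unsigned int x)
{
    return x / 2U;
}

/* Widening: signed values are sign-extended, unsigned values are
   zero-extended. */
long long widen_signed(int x)
{
    return (long long)x;
}

long long widen_unsigned(unsigned int x)
{
    return (long long)x;
}
```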
Mar 19 2002
parent reply "Roberto Mariottini" <rmariottini lycosmail.com> writes:
"Walter" <walter digitalmars.com> ha scritto nel messaggio
news:a782ui$13q9$1 digitaldaemon.com...
 "Roberto Mariottini" <rmariottini lycosmail.com> wrote in message
 news:a77kqt$ov7$1 digitaldaemon.com...
 The fact is that in C and C++, int and unsigned are EXACTLY the same type
 for most operations.
 The example shown as an unsigned can be treated as a signed int, behaving
 perfectly the same.
 The only difference arise in comparison when <, >, <= or >= are involved.
 The /, %, and >> behave differently as well as conversions to wider types
 or to floating point types.

 The reason that D does not change any of the semantics with operators,
 operator precedence, default integral promotions, etc., is because although
 many of the rules are byzantine, experienced C programmers have become very
 used to them. Subtly changing them will cause much grief in porting C code
 to D and porting C programmers to D <g> as weird bugs will appear in
 formerly working code.
I'm very disappointed with this.
I've programmed in C for more than ten years now, and in C++ for nearly 10.
And I NEVER use unsigneds, for I've learnt that it is a RIDICULOUS thing, that took me hours of bug tracking to find.
Don't think you'll have much code to break, though, because people tend not to use unsigneds, like me.

I've seen a large quantity of code written by a large variety of people, and I think they can be divided into two kinds of unsignedness usage:

 - People not using unsigned for "normal" operations, using it only in particular cases where it's necessary (like I do).
 - People using unsigned everywhere, in the firm belief that the compiler will not accept code breaking the unsignedness (I often find unsigned function parameters, and find people surprised when I tell them it guarantees nothing about the signedness of the argument).

For the first kind of people (who know what they are doing) a simple explanation of what changed in D and how to achieve the same functionality would be enough.
For the second kind of people (the majority) no change in their programming life is needed, because they always thought it worked like this.

Ciao.

PS: if someone has some old C code that isn't worth changing, he can link it with D programs and let it be.
Mar 20 2002
next sibling parent reply "Pavel Minayev" <evilone omen.ru> writes:
"Roberto Mariottini" <rmariottini lycosmail.com> wrote in message
news:a79gcd$1rhf$1 digitaldaemon.com...

 I'm very disappointed with this.
 I've programmed C for more than ten years now, and C++ for nearly 10.
 And I NEVER use unsigneds, for I've learnt that it is a RIDICULOUS thing,
 that took me hours of bug tracking to find.
 Don't think you'll have much code to break, thoug, because people tend to
 not use unsigneds, like I do.
Funnily enough, my point is exactly the opposite: I always use unsigned ints where the sign is not needed. I just don't understand why I should waste 2 billion possible values if they aren't used anyhow...
  - People using unsigned everywhere, in the firm belief the compiler will not
    accept code breaking the unsignedness (I often find unsigned functions
    parameters, and find people surprised when I tell them it assures nothing
    about signedness of the argument).
Most compilers issue warnings when you try to pass a signed value as an unsigned argument. This is much more reliable (and faster!) than checking the value in your function and raising an exception if it is negative.
Mar 20 2002
parent reply "Roberto Mariottini" <rmariottini lycosmail.com> writes:
"Pavel Minayev" <evilone omen.ru> ha scritto nel messaggio
news:a79q0b$20bt$1 digitaldaemon.com...
 "Roberto Mariottini" <rmariottini lycosmail.com> wrote in message
 news:a79gcd$1rhf$1 digitaldaemon.com...

 I'm very disappointed with this.
 I've programmed C for more than ten years now, and C++ for nearly 10.
 And I NEVER use unsigneds, for I've learnt that it is a RIDICULOUS thing,
 that took me hours of bug tracking to find.
 Don't think you'll have much code to break, though, because people tend to
 not use unsigneds, like I do.
Funny enough, my point is exactly the opposite: I always use unsigned ints where sign is not needed. I just don't understand why should I waste 2 billion of possible values, if they aren't used anyhow...
They are used anyway. Every int can be used as unsigned without problems. In fact they are almost the same type.

    int i = MAX_INT + 13; // you can use the exact literal if you know it
    printf("%u", i);  // prints out MAX_INT + 13

Using unsigned doesn't add anything to your programs.
  - People using unsigned everywhere, in the firm belief the compiler will not
    accept code breaking the unsignedness (I often find unsigned functions
    parameters, and find people surprised when I tell them it assures nothing
    about signedness of the argument).
Most compilers issue warnings when you try to pass a signed value as an unsigned argument. This is much more reliable (and faster!) than checking the value in your function and raising an exception if it is negative.
Yes, but what's a warning? An error? Not-an-error? (This reminds me of fuzzy logic.)
D has no warnings. So it should either:

 - stop with a compiler error
 - raise an exception

Leaving it like it is (no warning AND no error AND no exception) is not the right way.

Ciao
Mar 20 2002
next sibling parent reply "Pavel Minayev" <evilone omen.ru> writes:
"Roberto Mariottini" <rmariottini lycosmail.com> wrote in message
news:a7a0fl$23t5$1 digitaldaemon.com...

 They are used anyway. Every int can be used as unsigned without problems.
 In fact they are almost the same type.

 int i = MAX_INT + 13; // you can use the exact literal if you know it
 printf("%u", i);  // prints out MAX_INT + 13

 Using unsigned doesn't add anything to your programs.
Using unsigned states that this value can only be non-negative. This gives the compiler the opportunity to warn the programmer when he tries to pass a signed value where a function expects an unsigned argument.

And "they are used anyway" is wrong. Haven't you ever written anything like this?

    for (int i = 0; i < n; i++)
        ...

Now, since i is signed, n can only be as large as 0x7fffffff (because otherwise the loop would never get executed). If it were declared as unsigned int, the entire range of values up to 0xffffffff could be used.

I wonder, why use signed int where it is obviously unsigned? What benefits does it give?
 D has no warnings. So it either:

  - stop with a compiler error
  - raise an exception

 Leaving it like it is (no warning AND no error AND no exception), is not
 the right way.
Following this logic, D should also stop with a compiler error or raise an exception when passing a float as an int argument, but it doesn't. Strict type-checking has its benefits, yet I don't like it much; I guess I've just got too used to C's freedom in type conversions...

Exceptions are too costly to be taken seriously in such cases. A compiler error could work, probably requiring an explicit cast. I'd prefer it to be as it is now, though (but this is just a personal point).
Mar 20 2002
parent reply "Roberto Mariottini" <rmariottini lycosmail.com> writes:
"Pavel Minayev" <evilone omen.ru> ha scritto nel messaggio
news:a7a1j3$24cv$1 digitaldaemon.com...
 "Roberto Mariottini" <rmariottini lycosmail.com> wrote in message
 news:a7a0fl$23t5$1 digitaldaemon.com...

 They are used anyway. Every int can be used as unsigned without problems.
 In fact they are almost the same type.

 int i = MAX_INT + 13; // you can use the exact literal if you know it
 printf("%u", i);  // prints out MAX_INT + 13

 Using unsigned doesn't add anything to your programs.
 Using unsigned states that this value can only be positive. This gives
 compiler the opportunity to warn programmer when he tries to pass a signed
 value where function expects an unsigned argument. And "they are used
 anyway" is wrong. Haven't you ever written anything like this?

     for (int i = 0; i < n; i++)
         ...

 Now, since i is signed, n can only be as large as 0x7fffffff (because
 otherwise the loop would never get executed). If it were declared as
 unsigned int, the entire range of values up to 0xffffffff can be used.

 I wonder, why use signed int where it is obviously unsigned? What benefits
 does it give?
There would be no benefits if unsignedness were supported by the language.
Try to realize that unsigned doesn't exist in C/C++ (like arrays).

    unsigned u = MAX_INT + 100;

    ...
    ...  // now I forget that u > MAX_INT
    some_function(..., u, ...);  // this is a third-party function accepting an int
    ...                          // or a long if sizeof(int) == sizeof(long)
    ...

Here you are passing a small negative integer instead of a big unsigned one.
And this bug can also go unnoticed for years:

    void some_function (..., int x, ...)
    {
       ...  // x isn't used
       some_other_func(..., x, ...);   // this function accepts an unsigned
       ...  // x isn't used
    }

Only on the day that some calculations are introduced into some_function by a third-party programmer does your code break (actually my code broke).
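One way to guard the call described above in plain C is an explicit range check before the conversion. This is a sketch only; to_int_checked is a hypothetical helper, not something the thread or any library defines.

```c
#include <limits.h>

/* Hypothetical checked conversion: stores u in *out and returns 1 when the
   value fits in an int; otherwise leaves *out untouched and returns 0. */
int to_int_checked(unsigned int u, int *out)
{
    if (u > (unsigned int)INT_MAX)
        return 0;
    *out = (int)u;
    return 1;
}
```

A caller would test the return value before passing the converted number to the function that expects an int, which is the run-time cost the other posters object to.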
 D has no warnings. So it either:

  - stop with a compiler error
  - raise an exception

 Leaving it like it is (no warning AND no error AND no exception), is not
 the right way.
Following this logic, D should also stop with a compiler error or raise an exception when passing a float as int argument, but it doesn't. Strict type-checking has its benefits, yet I don't like it much; guess I just had got too used to C freedom in type conversions...
If the truncated float is within the int range no exception is needed :-) Pascal was my first love ;-)
 Exceptions are too costly to be taken seriously in such cases. Compiler
 error - could be, probably requiring an explicit cast. I'd prefer it to
 be as it is now, though (but this is just a personal point).
Simply requiring an explicit cast works well in Java and in C/C++ with all warnings turned on. As D is, it doesn't raise a warning... Ciao.
Mar 21 2002
next sibling parent "Sean L. Palmer" <spalmer iname.com> writes:
I would not mind being forced to cast signed to unsigned or vice versa.  It
would prevent this entire class of errors.  However I think int literals
should implicitly convert if they fit within the range of the destination,
no matter what type of int the destination is.  I don't like having to
specify suffixes to all ints ( but 16u is a handy shortcut for
cast(uint)16 ).

I do however value the ability to declare a signed or unsigned integer, as
the situation warrants.  It's a good form of self-documentation that helps
people (and the compiler) know what sort of values should be used.
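The literal rule suggested above amounts to a range test at compile time. A rough C rendering of that test, assuming long long is wide enough to hold every literal of interest; both function names are invented for illustration.

```c
#include <limits.h>

/* Hypothetical check a compiler could apply to an integer literal before
   allowing an implicit conversion to unsigned int. */
int literal_fits_uint(long long value)
{
    return value >= 0 && value <= (long long)UINT_MAX;
}

/* The same check for a signed int destination. */
int literal_fits_int(long long value)
{
    return value >= INT_MIN && value <= INT_MAX;
}
```

Under this rule 16 would convert to either type without a suffix, while -12 would be rejected for an unsigned destination unless cast.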

Sean
Mar 21 2002
prev sibling parent "Pavel Minayev" <evilone omen.ru> writes:
"Roberto Mariottini" <rmariottini lycosmail.com> wrote in message
news:a7c65n$7jl$1 digitaldaemon.com...

 There were no benefits if unsignedness were supported by the language.
 Try to realize that unsigned doesn't exist in C/C++ (like arrays).

  unsigned u = MAX_INT + 100;

 ...
 ...  // now i forget that u > MAX_INT
 some_function(..., u, ...);  // this is a third-party function accepting an int
 ...                          // or a long if sizeof(int) == sizeof(long)
 ...

 Here you are passing a small negative integer instead of a big unsigned one.
This is a problem of C's weak type system, not of signed/unsigned ints.
 And this bug can also be unnoticed for years:

 void some_function (..., int x, ...)
 {
    ...  // x isn't used
    some_other_func(..., x, ...);   // this function accept an unsigned
    ...  // x isn't used
 }
This would result in a "signed to unsigned conversion" warning on most C++ compilers I've seen.
 Simply requiring an explicit cast works well in Java and in C/C++ with all
 warnings turned on.
 As D is, it doesn't raise a warning...
My point is, I could live without casts - but I understand that others might need it as a safety guarding feature.
Mar 21 2002
prev sibling parent reply Russell Borogove <kaleja estarcion.com> writes:
Roberto Mariottini wrote:
 "Pavel Minayev" <evilone omen.ru> ha scritto nel messaggio
 news:a79q0b$20bt$1 digitaldaemon.com...
 
"Roberto Mariottini" <rmariottini lycosmail.com> wrote in message
news:a79gcd$1rhf$1 digitaldaemon.com...


I'm very disappointed with this.
I've programmed C for more than ten years now, and C++ for nearly 10.
And I NEVER use unsigneds [snip]
I'll guess that you don't right-shift very often. :)
 They are used anyway. Every int can be used as unsigned without problems.
 In fact they are almost the same type.
 
 int i = MAX_INT + 13; // you can use the exact literal if you know it
This probably issues a warning on some or most compilers.
 printf("%u", i);  // prints out MAX_INT + 13
This probably issues a warning in GCC, which does printf typechecking. If a warning is not issued, that says more about printf and the varargs mechanism than about unsigned.
 Yes, but what's a warning? An error? Not-an-error? (this recalls me fuzzy
 logic)
In C and C++ a warning is generally not-an-error; in D a warning is generally an error.
 D has no warnings. So it either:
 
  - stop with a compiler error
  - raise an exception
Yes, so D will probably call it a compile-time error. So you're safe from most subtle unsigned/signed mismatch problems, so what's the problem with unsigned? -RB
Mar 20 2002
parent reply "Roberto Mariottini" <rmariottini lycosmail.com> writes:
"Russell Borogove" <kaleja estarcion.com> ha scritto nel messaggio
news:3C98C99A.1010403 estarcion.com...
 Roberto Mariottini wrote:
 "Pavel Minayev" <evilone omen.ru> ha scritto nel messaggio
 news:a79q0b$20bt$1 digitaldaemon.com...

"Roberto Mariottini" <rmariottini lycosmail.com> wrote in message
news:a79gcd$1rhf$1 digitaldaemon.com...


I'm very disappointed with this.
I've programmed C for more than ten years now, and C++ for nearly 10.
And I NEVER use unsigneds [snip]
I'll guess that you don't right-shift very often. :)
You are right. But when I need it, I use explicit casts (I think they are worthwhile anyway with right-shift).
 They are used anyway. Every int can be used as unsigned without problems.
 In fact they are almost the same type.

 int i = MAX_INT + 13; // you can use the exact literal if you know it
This probably issues a warning on some or most compilers.
 printf("%u", i);  // prints out MAX_INT + 13
This probably issues a warning in GCC, which does printf typechecking. If a warning is not issued, that says more about printf and the varargs mechanism than about unsigned.
I agree that we need a typesafe alternative to printf.
 Yes, but what's a warning? An error? Not-an-error? (this recalls me fuzzy
 logic)
In C and C++ a warning is generally not-an-error; in D a warning is generally an error.
 D has no warnings. So it either:

  - stop with a compiler error
  - raise an exception
Yes, so D will probably call it a compile-time error. So you're safe from most subtle unsigned/signed mismatch problems, so what's the problem with unsigned?
The problem is that, AFAIK, D doesn't raise an error. Am I wrong? Ciao
Mar 21 2002
parent reply "Pavel Minayev" <evilone omen.ru> writes:
"Roberto Mariottini" <rmariottini lycosmail.com> wrote in message
news:a7c6b9$7r6$1 digitaldaemon.com...

 I agree that we need a typesafe alternative to printf.
I'd state that we need a typesafe alternative to varargs in general. I don't paramarrays.
Mar 21 2002
parent reply Russ Lewis <spamhole-2001-07-16 deming-os.org> writes:
Remember my recursive solution... ?

I don't know, maybe everybody thought it was awful, but it was a possible
solution...

--
The Villagers are Online! villagersonline.com

.[ (the fox.(quick,brown)) jumped.over(the dog.lazy) ]
.[ (a version.of(English).(precise.more)) is(possible) ]
?[ you want.to(help(develop(it))) ]
Mar 21 2002
parent reply "Pavel Minayev" <evilone omen.ru> writes:
"Russ Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message
news:3C9A024B.838850E0 deming-os.org...

 Remember my recursive solution... ?

 I don't know, maybe everybody thought it was awful, but it was a possible
 solution...
Could you please remind me of it?
Mar 21 2002
parent reply Russ Lewis <spamhole-2001-07-16 deming-os.org> writes:
You would declare a varargs function much like C, but you would not define a
body for it.  The D version of printf might be:
    char[] printf(char[], ...);
The key here is that the first argument is the same type as the return value.

Then you would define a series of functions that had the same name and return
value, that had the same arguments, plus a single additional argument:
    char[] printf(char[], int);
    char[] printf(char[], char);
    char[] printf(char[], Object);
    char[] printf(char[], char[]);
    etc...

The compiler would then turn a varargs call into a recursive call:
    D code:
        printf("%d: %s\n", 123, "abc");
    Expanded by the compiler to be:
        printf(printf("%d: %s\n", 123), "abc");

Now how this would work is that the first (i.e. the innermost) call to printf,
which is
    printf("%d: %s\n", 123)
would parse the first part of the string, printing out the first format
specifier (%d), according to the value given.  It would also print out any
additional characters after that, up to the next format specifier.  Thus, this
call would print out
    "123: "
It would return whatever of the char array is left.  That is, it would return
the string
    "%s\n"
Therefore, the second (outermost) call to printf would work out to the
following:
    printf("%s\n", "abc");
This would work just like the first, printing out the properly formatted data
and any trailing characters.  It prints
    "abc\n"
and returns
    ""

Overall, the recursive printf has printed
    "123: abc\n"
and returned
    ""

Now, what happens if you pass the wrong type and it doesn't match the format
specifier?  Well, then printf throws exceptions.  In some cases, an optimizer
may even be able (at compile time) to look ahead enough and realize that an
exception will be thrown, and issue a warning.

What happens if you call a varargs function with an argument of a type it doesn't
recognize?  You get a compile-time error, of course, just as if you had called
the function directly with invalid arguments.

What happens if you don't use all of the format specifiers?  You can deal with
that by adding an out {} clause to the original printf varargs declaration:
    char[] printf(char[], ...)
        out(result)
        {
            assert(result.length == 0);
        }

Now, you can return whatever you want from each of the printf calls...but when
you're done, the result returned MUST be the empty string.
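The expansion described in this post can be imitated by hand in C. The sketch below is not the proposed D feature: C has no overloading, so the per-type functions get distinct, made-up names (print_int, print_str), the nesting is written out manually instead of being generated by the compiler, output goes to a buffer so it can be inspected, and a NULL return stands in for the exception on a mismatched specifier. "%%" escapes are not handled.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical output buffer, used instead of stdout. */
static char printed[128];

/* Copy literal text up to the next '%' (or the end) into the buffer and
   return a pointer to where copying stopped. */
static const char *emit_literal(const char *fmt)
{
    size_t len = strlen(printed);
    while (*fmt != '\0' && *fmt != '%' && len + 1 < sizeof printed)
        printed[len++] = *fmt++;
    printed[len] = '\0';
    return fmt;
}

/* Consume one "%d" and return the unused tail of the format string.
   NULL stands in for the exception thrown on a mismatch. */
const char *print_int(const char *fmt, int value)
{
    size_t len;
    if (fmt == NULL)
        return NULL;
    fmt = emit_literal(fmt);
    if (fmt[0] != '%' || fmt[1] != 'd')
        return NULL;
    len = strlen(printed);
    snprintf(printed + len, sizeof printed - len, "%d", value);
    return emit_literal(fmt + 2);
}

/* Consume one "%s" in the same way. */
const char *print_str(const char *fmt, const char *value)
{
    size_t len;
    if (fmt == NULL)
        return NULL;
    fmt = emit_literal(fmt);
    if (fmt[0] != '%' || fmt[1] != 's')
        return NULL;
    len = strlen(printed);
    snprintf(printed + len, sizeof printed - len, "%s", value);
    return emit_literal(fmt + 2);
}
```

The call from the post then becomes print_str(print_int("%d: %s\n", 123), "abc"), which leaves "123: abc\n" in the buffer and returns an empty tail, matching the out contract that the final result must be the empty string.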

Mar 21 2002
next sibling parent reply "Richard Krehbiel" <rich kastle.com> writes:
"Russ Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message
news:3C9A0B2A.97E27D89 deming-os.org...
 You would declare a varargs function much like C, but you would not define a
 body for it.  The D version of printf might be:
     char[] printf(char[], ...);
 The key here is that the first argument is the same type as the return value.

 Then you would define a series of functions that had the same name and return
 value, that had the same arguments, plus a single additional argument:
     char[] printf(char[], int);
     char[] printf(char[], char);
     char[] printf(char[], Object);
     char[] printf(char[], char[]);
     etc...

 The compiler would then turn a varargs call into a recursive call:
     D code:
         printf("%d: %s\n", 123, "abc");
     Expanded by the compiler to be:
         printf(printf("%d: %s\n", 123), "abc");
This is very promising - let's talk about it.

It seems to me that the "recursive" behavior is kinda backwards. The rightmost argument is evaluated first, and this may be bad. But, the part about using polymorphism for a variable argument list is interesting, and it made me think.

How about something like the VB ParamArray, but smarter?

Start with the function declaration:

    void printf(char[], PrintfParam args...)
    {
        // Function body
    }

"type args..." is like "type args[]" but signals that the compiler should collect up the arguments into the array. "type... args" should be a synonym of "type args..." just as "type[] array" is the same as "type array[]".

The body of this function treats args as if it were a regular dynamic array, because it is.

When a user codes:

    char[] name = "Cecil";
    printf("My name is %s\n", name);

..the compiler does the equivalent of this:

    PrintfParam[] _t;
    _t[0] = new PrintfParam(name);
    printf("My name is %s\n", _t);

Rules:

  Each function argument must be implicitly compatible with the argument type; else,
  Each argument must be convertible to the argument type via the type's constructor; else,
  The argument's not valid and the compiler prints an error.

  Only the final parameter to a function can be varargs, just like C.

I like this better than the universal VARIANT type that VB uses, since that way allows programmers to pass types that the function doesn't expect, and/or requires the called function to expect every type. The argument type only needs constructors for types it can deal with.

You can't make a varargs function in C which is allowed to take zero arguments. But with this method, you can: void NoArgs(Type args...) could be called with no arguments, and would receive a zero-length array.

"<type> args..." is implicitly compatible with "<type> args[]" and, as such, can be passed around between functions, like C's va_list can. But it's even better: you can write code to construct a "<type> args[]", but you *can't* construct a va_list in C except by calling a varargs function.

Discussion?

-- Richard Krehbiel, Arlington, VA, USA rich kastle.com (work) or krehbiel3 comcast.net (personal)
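For readers who want to see the shape of the proposal in today's C, here is a minimal sketch of what the compiler-generated array might look like if written by hand. Everything in it (PrintArg, arg_int, arg_str, print_args) is a hypothetical name invented for illustration; it only handles int and string arguments and is not the proposed D feature itself.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical argument type: a type ID plus a union, filled in by
   "constructor" functions, one per accepted source type. */
typedef enum { ARG_INT, ARG_STR } ArgKind;

typedef struct {
    ArgKind kind;
    union { int i; const char *s; } u;
} PrintArg;

static PrintArg arg_int(int v)         { PrintArg a; a.kind = ARG_INT; a.u.i = v; return a; }
static PrintArg arg_str(const char *s) { PrintArg a; a.kind = ARG_STR; a.u.s = s; return a; }

/* The callee sees an ordinary array plus a length, so zero arguments is
   legal. Writes the concatenated arguments into out and returns the
   number of characters stored (excluding the terminator). */
static size_t print_args(char *out, size_t cap, const PrintArg *args, size_t n)
{
    size_t used = 0;
    if (cap == 0) return 0;
    out[0] = '\0';
    for (size_t k = 0; k < n && used + 1 < cap; k++) {
        int w = (args[k].kind == ARG_INT)
            ? snprintf(out + used, cap - used, "%d", args[k].u.i)
            : snprintf(out + used, cap - used, "%s", args[k].u.s);
        if (w < 0) break;
        /* snprintf reports the untruncated length; clamp to what fit. */
        used += (size_t)w < cap - used ? (size_t)w : cap - used - 1;
    }
    return used;
}
```

A caller would fill a PrintArg array with arg_str("n = ") and arg_int(42) and pass it with a count of 2; the proposal is that the compiler writes that boilerplate, and rejects any argument for which no constructor exists.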
Mar 21 2002
parent reply Russ Lewis <spamhole-2001-07-16 deming-os.org> writes:
Richard Krehbiel wrote:

 It seems to me that the "recursive" behavior is kinda backwards.  The
 rightmost argument is evaluated first, and this may be bad.
Unless I'm misunderstanding you - no, this isn't true. The leftmost "vararg" is given to the innermost function, and is thus evaluated first. Or am I missing what you mean?
 But, the part
 about using polymorphism for a variable argument list is interesting, and it
 made me think.

 How about something like the VB ParamArray, but smarter?
I like the idea you proposed, though making them actually be cast to the right type (even by a constructor) seems hack-ish to me (personal opinion).

Maybe some sort of union-like syntax? You would tell the compiler which types are valid in the varargs array. The array would be an array of structures, each element of which contains an identifier telling the type and a union.

Still pondering...

-- The Villagers are Online! villagersonline.com .[ (the fox.(quick,brown)) jumped.over(the dog.lazy) ] .[ (a version.of(English).(precise.more)) is(possible) ] ?[ you want.to(help(develop(it))) ]
Mar 21 2002
next sibling parent "Pavel Minayev" <evilone omen.ru> writes:
"Russ Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message
news:3C9A2BDA.9C26890A deming-os.org...

 I like the idea you proposed, though making them actually be cast to the
right
 type (even by a constructor) seems hack-ish to me (personal opinion).
Oh yes! Some syntax to construct arrays on the fly would be needed in this case.
 Maybe some sort of union-like syntax?  You would tell the compiler which
types
 are valid in the varargs array.  The array would be an array of
structures, each
 element of which contains an identifer telling the type and a union.
... which leads us back to the variants (or, at least, some sort of them).
Mar 21 2002
prev sibling parent reply "Richard Krehbiel" <rich kastle.com> writes:
"Russ Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message
news:3C9A2BDA.9C26890A deming-os.org...
 Richard Krehbiel wrote:

 It seems to me that the "recursive" behavior is kinda backwards.  The
 rightmost argument is evaluated first, and this may be bad.
Unless I'm misunderstanding you - no, this isn't true. The leftmost
"vararg" is
 given to the innermost function, and is thus evaluated first.  Or am I
missing
 what you mean?
No, that's what I meant; I misread it to mean that vararg(a, b, c) would convert to vararg(a, vararg(b, c)). But now I've re-read it and I see you meant vararg(a, b, c) becomes vararg(vararg(a, b), c). Um - that's still hard to read. But I think I get it. :-)
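To make the nesting order concrete, here is a rough C sketch of the vararg(vararg(a, b), c) shape. It is a simplification under stated assumptions: C has no overloading, so the per-type functions get distinct hypothetical names (out_start, out_int, out_str), and each call simply appends its argument instead of consuming a format specifier as the original printf idea would.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical accumulator passed through the chain by value. */
typedef struct { char text[64]; } Out;

/* Append a string to the accumulated text and return the result. */
static Out out_str(Out acc, const char *s)
{
    size_t used = strlen(acc.text);
    snprintf(acc.text + used, sizeof acc.text - used, "%s", s);
    return acc;
}

/* Append an int to the accumulated text and return the result. */
static Out out_int(Out acc, int v)
{
    size_t used = strlen(acc.text);
    snprintf(acc.text + used, sizeof acc.text - used, "%d", v);
    return acc;
}

/* Innermost call: starts the chain from the leading string. */
static Out out_start(const char *s)
{
    Out acc = { "" };
    return out_str(acc, s);
}
```

Writing out_str(out_int(out_start("x="), 123), "abc") shows the point under discussion: the leftmost extra argument (123) sits in the innermost call, so it is handled before "abc".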
 But, the part
 about using polymorphism for a variable argument list is interesting,
and it
 made me think.

 How about something like the VB ParamArray, but smarter?
I like the idea you proposed, though making them actually be cast to the
right
 type (even by a constructor) seems hack-ish to me (personal opinion).

 Maybe some sort of union-like syntax?  You would tell the compiler which
types
 are valid in the varargs array.  The array would be an array of
structures, each
 element of which contains an identifer telling the type and a union.
Actually I was trying to avoid having unions and a suite of compiler-defined run-time constants for object kinds. I don't know what it is that I dislike about it, though. In my scheme you can use unions with type IDs if you wish, and you get to define the type IDs too. -- Richard Krehbiel, Arlington, VA, USA rich kastle.com (work) or krehbiel3 comcast.net (personal)
Mar 21 2002
parent reply "Pavel Minayev" <evilone omen.ru> writes:
"Richard Krehbiel" <rich kastle.com> wrote in message
news:a7dbhq$s5k$1 digitaldaemon.com...
 "Russ Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message
 news:3C9A2BDA.9C26890A deming-os.org...
 Richard Krehbiel wrote:

 It seems to me that the "recursive" behavior is kinda backwards.  The
 rightmost argument is evaluated first, and this may be bad.
Unless I'm misunderstanding you - no, this isn't true. The leftmost
"vararg" is
 given to the innermost function, and is thus evaluated first.  Or am I
missing
 what you mean?
No, that's what I meant; I misread it to mean that vararg(a, b, c) would convert to vararg(a, vararg(b, c)). But now I've re-read it and I see you meant vararg(a, b, c) becomes vararg(vararg(a, b), c). Um - that's still hard to read. But I think I get it. :-)
 But, the part
 about using polymorphism for a variable argument list is interesting,
and it
 made me think.

 How about something like the VB ParamArray, but smarter?
I like the idea you proposed, though making them actually be cast to the
right
 type (even by a constructor) seems hack-ish to me (personal opinion).

 Maybe some sort of union-like syntax?  You would tell the compiler which
types
 are valid in the varargs array.  The array would be an array of
structures, each
 element of which contains an identifer telling the type and a union.
Actually I was trying to avoid having unions and a suite of
compiler-defined
 run-time constants for object kinds.  I don't know what it is that I
dislike
 about it, though.

 In my scheme you can use unions with type IDs if you wish, and you get to
 define the type IDs too.

 --
 Richard Krehbiel, Arlington, VA, USA
 rich kastle.com (work) or krehbiel3 comcast.net  (personal)


 ----- Original Message -----
From: "Richard Krehbiel" <rich kastle.com> Newsgroups: D Sent: Thursday, March 21, 2002 10:15 PM Subject: Re: Type-safe varargs
 In my scheme you can use unions with type IDs if you wish, and you get to
 define the type IDs too.
... and you have to fill those unions yourself, which just kills the idea. I want to be able to write print() in a single line, like this:

    print("Hello, world!", foo, bar);

And not like this:

    print(new PrintArg("Hello, world!"), new PrintArg(foo), new PrintArg(bar));
Mar 21 2002
parent reply "Richard Krehbiel" <rich kastle.com> writes:
"Pavel Minayev" <evilone omen.ru> wrote in message
news:a7dbqt$sd1$1 digitaldaemon.com...
 "Richard Krehbiel" <rich kastle.com> wrote in message
 news:a7dbhq$s5k$1 digitaldaemon.com...
 Actually I was trying to avoid having unions and a suite of
 compiler-defined run-time constants for object kinds.
 I don't know what it is that I dislike about it, though.

 In my scheme you can use unions with type IDs if you wish, and you get
to
 define the type IDs too.
... and you have to fill those unions yourself, which just kills the idea. I want to be able to write print() in a single line, like this: print("Hello, world!", foo, bar); And not like this: print(new PrintArg("Hello, world!"), new PrintArg(foo), new PrintArg(bar));
What I meant was the declared homogeneous parameter type could, within its various constructors, set up a union containing the types, and set a type ID; not that the caller of printf would have to do it.

-- Richard Krehbiel, Arlington, VA, USA rich kastle.com (work) or krehbiel3 comcast.net (personal)
Mar 21 2002
parent reply "Pavel Minayev" <evilone omen.ru> writes:
"Richard Krehbiel" <rich kastle.com> wrote in message
news:a7dcsf$sso$1 digitaldaemon.com...

 What I meant was the declared homogenous parameter type could, within it's
 various constructors, set up a union containing the types, and set a type
 ID; not that the caller of printf would have to do it.
That is, you've proposed a variant. Or, better called, "value packaging" -
Mar 21 2002
parent reply "Richard Krehbiel" <rich kastle.com> writes:
"Pavel Minayev" <evilone omen.ru> wrote in message
news:a7ed1d$1chu$1 digitaldaemon.com...
 "Richard Krehbiel" <rich kastle.com> wrote in message
 news:a7dcsf$sso$1 digitaldaemon.com...

 What I meant was the declared homogenous parameter type could, within
it's
 various constructors, set up a union containing the types, and set a
type
 ID; not that the caller of printf would have to do it.
That is, you've proposed a variant.
What I've proposed is that it's really up to the programmer. When he picks the argument type, that type's constructors (which he himself has written) decide what transformation is applied to any given arguments. Are they moved into a union with an appropriate type ID? Are they converted to a common representation (this would make sense for printf; everything eventually becomes a string)? Are their binary bits copied uninterpreted? Your own constructors make these decisions.

And if you pick a basic type for the argument type, then all the arguments are checked for direct compatibility. (If, say, you want a varargs function that takes an arbitrary number of doubles, but no other type.)

-- Richard Krehbiel, Arlington, VA, USA rich kastle.com (work) or krehbiel3 comcast.net (personal)
Mar 22 2002
parent reply "Pavel Minayev" <evilone omen.ru> writes:
"Richard Krehbiel" <rich kastle.com> wrote in message
news:a7f7hn$1qf4$1 digitaldaemon.com...

 What I've proposed is that it's really up to the programmer.  When he
picks
 the argument type, that type's constructors (which he himself has written)
 decide what transformation is applied to any given arguments.  Are they
 moved into a union with an appropriate type ID?  Are they converted to a
 common representation (this would make sense for printf; everything
 eventually becomes a string)?  Are their binary bits copied uninterpreted?
 Your own constructors make these decisions.
Constructors aren't called automatically in D.
Mar 22 2002
parent reply "Richard Krehbiel" <rich kastle.com> writes:
"Pavel Minayev" <evilone omen.ru> wrote in message
news:a7fmeh$22al$1 digitaldaemon.com...
 "Richard Krehbiel" <rich kastle.com> wrote in message
 news:a7f7hn$1qf4$1 digitaldaemon.com...

 What I've proposed is that it's really up to the programmer.  When he
picks
 the argument type, that type's constructors (which he himself has
written)
 decide what transformation is applied to any given arguments.  Are they
 moved into a union with an appropriate type ID?  Are they converted to a
 common representation (this would make sense for printf; everything
 eventually becomes a string)?  Are their binary bits copied
uninterpreted?
 Your own constructors make these decisions.
Constructors aren't called automatically in D.
They're not? Saying "type o = new type;" doesn't call the constructor? Are you sure? I gotta look this up... Well, according to the text in http://www.digitalmars.com/d/class.html the constructor *is* called automatically. -- Richard Krehbiel, Arlington, VA, USA rich kastle.com (work) or krehbiel3 comcast.net (personal)
Mar 22 2002
parent reply "Pavel Minayev" <evilone omen.ru> writes:
"Richard Krehbiel" <rich kastle.com> wrote in message
news:a7fo73$250h$1 digitaldaemon.com...

 They're not?  Saying "type o = new type;" doesn't call the constructor?
Are
 you sure?
When you write "new type", it does. But I thought you don't want to write it in vararg function calls...
Mar 22 2002
parent "Richard Krehbiel" <krehbiel3 comcast.net> writes:
"Pavel Minayev" <evilone omen.ru> wrote in message
news:a7ft8d$1da3$1 digitaldaemon.com...
 "Richard Krehbiel" <rich kastle.com> wrote in message
 news:a7fo73$250h$1 digitaldaemon.com...

 They're not?  Saying "type o = new type;" doesn't call the constructor?
Are
 you sure?
When you write "new type", it does. But I thought you don't want to write it in vararg function calls...
I don't want the programmer to write it, I want the compiler to generate it. So, for the function declared thus:

    int print(Printarg args...) {

...the caller codes precisely this:

    print("The answer is", i, "\n");

...the compiler generates:

    Printarg _t[3];
    _t[0] = new Printarg("The answer is");
    _t[1] = new Printarg(i);
    _t[2] = new Printarg("\n");
    print(_t);
Mar 22 2002
prev sibling parent reply "Pavel Minayev" <evilone omen.ru> writes:
"Russ Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message
news:3C9A0B2A.97E27D89 deming-os.org...

 You would declare a varargs function much like C, but you would not define
a
 body for it.  The D version of printf might be:
     char[] printf(char[], ...);
 The key here is that the first argument is the same type as the return
value.
 Then you would define a series of functions that had the same name and
return
 value, that had the same arguments, plus a single additional argument:
     char[] printf(char[], int);
     char[] printf(char[], char);
     char[] printf(char[], Object);
     char[] printf(char[], char[]);
     etc...

 The compiler would then turn a varargs call into a recursive call:
     D code:
         printf("%d: %s\n", 123, "abc");
     Expanded by the compiler to be:
         printf(printf("%d: %s\n", 123), "abc");
I like the idea! Maybe the syntax could be tweaked a bit, a special attribute or something:

    vararg printf(char[]);
    ...

Anyhow, it seems like a good solution to me. Completely typesafe, with ability to limit range of types that could be passed to a function, rather fast, and should be easier to implement than paramarrays (I think).

What do you think of this, Walter?
Mar 21 2002
parent reply "Walter" <walter digitalmars.com> writes:
"Pavel Minayev" <evilone omen.ru> wrote in message
news:a7dbi4$s5l$1 digitaldaemon.com...
 Anyhow, it seems like a good solution to me. Completely typesafe, with
 ability
 to limit range of types that could be passed to a function, rather fast,
 and should be easier to implement than paramarrays (I think).
 What do you think of this, Walter?
It works, but it's a lot of code generated <g>.
Apr 02 2002
parent reply "Pavel Minayev" <evilone omen.ru> writes:
"Walter" <walter digitalmars.com> wrote in message
news:a8dq8q$2di6$1 digitaldaemon.com...

 It works, but it's a lot of code generated <g>.
I don't see better idea... Besides, printf()'s format string parsing is slow as well.
Apr 02 2002
parent reply "Walter" <walter digitalmars.com> writes:
"Pavel Minayev" <evilone omen.ru> wrote in message
news:a8du0n$2fng$1 digitaldaemon.com...
 "Walter" <walter digitalmars.com> wrote in message
 news:a8dq8q$2di6$1 digitaldaemon.com...
 It works, but it's a lot of code generated <g>.
I don't see better idea... Besides, printf()'s format string parsing is slow as well.
It's just awful hard to beat printf. I've also been known to printf out types as different types, for example, printing out floats with %x so I can check the bit pattern. Not quite as bad as vptr jamming, but ... 99% of the printf's I use are for debugging, and for that a quick and dirty printf works just great. I need to write a D version anyway so the format string can be a char[], and sprintf will write to a char[], etc. I'm really ready to abandon 0 terminated C strings!
Apr 03 2002
next sibling parent reply "Pavel Minayev" <evilone omen.ru> writes:
"Walter" <walter digitalmars.com> wrote in message
news:a8ehr6$3i4$1 digitaldaemon.com...

 It's just awful hard to beat printf. I've also been known to printf out
 types as different types, for example, printing out floats with %x so I
can
 check the bit pattern. Not quite as bad as vptr jamming, but ...
That's why printf should still be there, whatever new output method is provided.
 99% of the printf's I use are for debugging, and for that a quick and
dirty
 printf works just great.
Well maybe, but I was talking not only about screen output, but about generic IO as well: streams and such. Having text-based IO is especially convenient for socket streams, since most TCP/IP protocols are textual.
 I need to write a D version anyway so the format
 string can be a char[], and sprintf will write to a char[], etc. I'm
really
You're too fast, I've just started thinking of writing my own printf specially for D, with support for char[], imaginary and complex numbers etc... =)
Apr 03 2002
parent "OddesE" <OddesE_XYZ hotmail.com> writes:
"Pavel Minayev" <evilone omen.ru> wrote in message
news:a8evq4$5ul$1 digitaldaemon.com...
 "Walter" <walter digitalmars.com> wrote in message
 news:a8ehr6$3i4$1 digitaldaemon.com...
<SNIP>
 I need to write a D version anyway so the format
 string can be a char[], and sprintf will write to a char[], etc. I'm
really
 You're too fast, I've just started thinking of writing my own printf specially for D, with support for char[], imaginary and complex numbers etc... =)
Look who's talking! :) -- Stijn OddesE_XYZ hotmail.com http://OddesE.cjb.net _________________________________________________ Remove _XYZ from my address when replying by mail
Apr 03 2002
prev sibling parent reply "Roberto Mariottini" <rmariottini lycosmail.com> writes:
"Walter" <walter digitalmars.com> ha scritto nel messaggio
news:a8ehr6$3i4$1 digitaldaemon.com...
 It's just awful hard to beat printf. I've also been known to printf out
 types as different types, for example, printing out floats with %x so I
can
 check the bit pattern. Not quite as bad as vptr jamming, but ...

 99% of the printf's I use are for debugging, and for that a quick and
dirty
 printf works just great. I need to write a D version anyway so the format
 string can be a char[], and sprintf will write to a char[], etc. I'm
really
 ready to abandon 0 terminated C strings!
The real printf problems are:

  1. %s does not specify a size. This is due to the use of 0 terminated strings: if you pass something non-0-terminated, or something which is not a string, your program will probably crash (it usually happens on the customer's computer, not yours). If you remove 0-terminated strings, you remove this problem.
  2. the size of the actual arguments passed is not known, and need not match the format specifiers. I wonder what can happen if you specify a long int format and pass a plain int as actual argument.
  3. the user must remember all those little tricky letters ('d' for integers, 'g' for floats, 'h' for shorts, who invented them?) to print simple types.

Once these are solved, I'm fine with printf.

Ciao
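On point 1, standard C does offer a partial workaround: a precision on %s, given as ".*", caps how many bytes are read, so a counted string needs no 0 terminator. The sketch below is only an illustration; format_counted is a hypothetical helper name, and the caller is still trusted to pass a correct length.

```c
#include <stdio.h>

/* Formats a counted (not necessarily 0-terminated) byte range into out,
   wrapped in brackets. Returns what snprintf returns: the number of
   characters the full result needs, excluding the terminator. */
static int format_counted(char *out, size_t cap, const char *data, int len)
{
    /* "%.*s" takes the precision from the int argument before the pointer,
       so at most len bytes of data are read. */
    return snprintf(out, cap, "[%.*s]", len, data);
}
```

This does nothing for points 2 and 3: the length is still an unchecked int, and the format letters still have to be remembered.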
Apr 09 2002
parent reply Russ Lewis <spamhole-2001-07-16 deming-os.org> writes:
Roberto Mariottini wrote:

  2. the size of the actual arguments passed are not known, and need not to
      match with format specifiers. I wonder what can happen if you specify a
      long int format and pass a plain int as actual argument.
I don't know about all compilers (or what the C spec might be), but it seems in my experiments that all ints are cast up to longs when passed to printf. So if you do this:

    char c;
    printf("%c\n",c);

c is actually cast to a long, passed that way, and then the printf code casts it back to a char. I don't know if they automatically get cast to signed or unsigned.

-- The Villagers are Online! villagersonline.com .[ (the fox.(quick,brown)) jumped.over(the dog.lazy) ] .[ (a version.of(English).(precise.more)) is(possible) ] ?[ you want.to(help(develop(it))) ]
Apr 09 2002
parent reply "Pavel Minayev" <evilone omen.ru> writes:
"Russ Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message
news:3CB317F9.3C763A11 deming-os.org...

 I don't know about all compilers (or what the C spec might be), but it
seems in
 my experiments that all ints are cast up to longs when passed to printf.
So if
 you do this:
     char c;
     printf("%c\n",c);
 c is actually cast to a long, passed that way, and then the printf code
casts it
But remember that D long is not the C one... If I recall correctly, when you call a vararg function, the conversions are:

    (unsigned/signed) char, short -> int
    float -> double
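These are C's default argument promotions, and they can be observed from inside a variadic function: the callee has to fetch the promoted types, not the types the caller wrote. A small sketch, with sum_char_and_float as a made-up name for illustration:

```c
#include <stdarg.h>

/* Expects exactly two variadic arguments: something the caller passed as
   a char and something passed as a float. Because of the default argument
   promotions they arrive as int and double respectively. */
static double sum_char_and_float(int count, ...)
{
    va_list ap;
    va_start(ap, count);
    int c = va_arg(ap, int);        /* char and short are promoted to int */
    double f = va_arg(ap, double);  /* float is promoted to double */
    va_end(ap);
    return (double)c + f;
}
```

Asking va_arg for char or float here would be undefined behavior, which is exactly the kind of mismatch printf cannot detect.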
 back to a char.  I don't know if they automatically get cast to signed or
 unsigned.
There's no sense in it, since both have the same bit pattern, so the PUSH instruction generated is absolutely the same.
Apr 09 2002
parent reply Russ Lewis <spamhole-2001-07-16 deming-os.org> writes:
Pavel Minayev wrote:

     (unsigned/signed) char, short -> int
     float -> double

 back to a char.  I don't know if they automatically get cast to signed or
 unsigned.
There's no sense in it, since both have the same bit pattern, so the PUSH instruction generated is absolutely the same.
The push is, but the cast is not. If it's an unsigned, then you just add bytes of 0's...if it's signed, then you sign extend. Anyhow, let's hope that Walter uses my typesafe varargs idea...or comes up with a better one :) -- The Villagers are Online! villagersonline.com .[ (the fox.(quick,brown)) jumped.over(the dog.lazy) ] .[ (a version.of(English).(precise.more)) is(possible) ] ?[ you want.to(help(develop(it))) ]
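The difference between the two widenings can be shown with the same byte, 0xF4, read both ways. This is a sketch with hypothetical helper names; converting 0xF4 to signed char is implementation-defined in C, and the comments assume the usual two's-complement result.

```c
/* Widen a byte to int after reinterpreting it as signed: the sign bit is
   copied into the upper bits (sign extension). On two's-complement
   machines 0xF4 becomes -12. */
static int widen_as_signed(unsigned char byte)
{
    return (signed char)byte;
}

/* Widen the same byte as unsigned: the upper bits are filled with zeros,
   so 0xF4 stays 244. */
static int widen_as_unsigned(unsigned char byte)
{
    return byte;
}
```

Once widened, the two ints have different bit patterns, so what gets pushed is no longer the same value even though the original byte was.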
Apr 09 2002
parent "Pavel Minayev" <evilone omen.ru> writes:
"Russ Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message
news:3CB326F8.2D0A5A4D deming-os.org...

 The push is, but the cast is not.  If it's an unsigned, then you just add
bytes
 of 0's...if it's signed, then you sign extend.
It's passed as is. Your responsibility is to supply the appropriate format specifier.
 Anyhow, let's hope that Walter uses my typesafe varargs idea...or comes up
with
 a better one :)
Oh yes. I'd even add it to the list of what has to be done for D to be considered beta (remember that one?).
Apr 09 2002
prev sibling parent reply Russell Borogove <kaleja estarcion.com> writes:
(This is off-topic, because it's about a C/C++ project,
but since we were just discussing unsigned, I thought
I'd share with the whole class.)

Roberto Mariottini wrote:
 I've programmed C for more than ten years now, and C++ for nearly 10.
 And I NEVER use unsigneds, for I've learnt that it is a RIDICULOUS thing,
 that took me hours of bug tracking to find.
 Don't think you'll have much code to break, thoug, because people tend to
 not use unsigneds, like I do.
Some programmer's use of signed integers instead of unsigned just cost me half a day of work.

I am porting a game from the Playstation 2 to the Nintendo Gamecube. It uses a custom memory manager, as do most console games these days. Among other requirements, the mem manager has to be able to allocate memory aligned to any power-of-two boundary. The code to do this looks something like this:

    void* my_alloc_aligned( int size, int align )
    {
        void *mem;
        int address;
        int aligned;

        mem = my_alloc( size + align );
        address = (int)mem;

        // note that in the case where we happen to get
        // aligned memory in the first place, this appears
        // to waste an alignment, but we actually have to
        // leave an offset cookie before the used memory
        // block, so we can't just take the original
        // allocation.
        aligned = address + align - (address % align);

        ... more housekeeping ...
    }

This works on the PS2. This gives back the wrong address on the Gamecube. The culprit is the % operator acting on a signed number; the user memory space on the Gamecube is mapped in the 0x80000000 and up range. The result of the mod is negative, the block handed back to the application is in the wrong place, and it stomps on the next mem management tracking block.[1]

Using unsigned int throughout this function in place of int solves the problem, and should have been done in the first place. Of course, the memory mapping on the Gamecube seems a questionable choice for this very reason, but still.

Later in the day, I ran across the code:

    // make sure the pointer is reasonable
    ASSERT((int)pointer > 0x100000);

Fortunately, this one was easy to identify after having solved the above.

-Russell B

[1] The compiler can't see that align is always going to be a power of 2, so it can't convert the mod to a mask.[2]

[2] How do you design a language in such a way as to strongly type a variable such that it can only be a power of two, or only divisible by 7, or only prime?
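The failure can be reproduced without a console by feeding the same formula an address with the top bit set. The sketch below uses fixed-width types as stand-ins for the platform's int and pointer width; the function names are hypothetical, and it assumes C99 semantics, where % truncates toward zero and so yields a negative remainder for a negative left operand.

```c
#include <stdint.h>

/* The formula from the post, using signed 32-bit arithmetic. For a
   negative address that is not already aligned, the remainder is
   negative, so subtracting it overshoots by one full alignment step. */
static int32_t align_up_signed(int32_t address, int32_t align)
{
    return address + align - (address % align);
}

/* Same formula with unsigned arithmetic: the remainder is always in
   the range 0 .. align-1, whatever the top bit of the address is. */
static uint32_t align_up_unsigned(uint32_t address, uint32_t align)
{
    return address + align - (address % align);
}
```

With address 0x80000004 (INT32_MIN + 4 when read as signed) and align 32, the signed version lands at 0x80000040, 60 bytes past the start of an allocation that only has 32 bytes of slack, while the unsigned version gives 0x80000020. For addresses below 0x80000000 both agree, which is why the code worked on the PS2.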
Mar 21 2002
next sibling parent reply "Serge K" <skarebo programmer.net> writes:
    aligned = address + align - (address % align);
 [1] The compiler can't see that align is always going
 to be a power of 2, so it can't convert the mod to a
 mask.[2]
Well... Try this:

    // obvious solution, isn't it? ;-)
    aligned = (address + align) & ~(align-1);
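Written out as a function, the mask version looks like the sketch below. The name align_up_mask and the use of uint32_t are assumptions for illustration; the assert records the precondition that the compiler cannot see, namely that align is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Rounds address up to the next multiple of align, always advancing
   (an already aligned address moves up by one full align, which matches
   the original code's need to leave room for a cookie). */
static uint32_t align_up_mask(uint32_t address, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0); /* power of two only */
    return (address + align) & ~(align - 1);
}
```

Because the mask works on the bit pattern, it gives the same answer whether the top bit of the address is set or not.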
Mar 21 2002
parent reply Russell Borogove <kaleja estarcion.com> writes:
Serge K wrote:
   aligned = address + align - (address % align);
[1] The compiler can't see that align is always going
to be a power of 2, so it can't convert the mod to a
mask.[2]
Well... Try this: // obvious solution, isn't it? ;-) aligned = (address + align) & ~(align-1);
Well, yes, but then, so is using "unsigned". -R
Mar 21 2002
parent reply "Serge K" <skarebo programmer.net> writes:
 // obvious solution, isn't it? ;-)

 aligned = (address + align) & ~(align-1);
Well, yes, but then, so is using "unsigned".
In such a situation "unsigned" cannot help the compiler to replace "%" with masking...

(I'm not against unsigned integer types - use them all the time if it fits ;-) and I really don't like that ">>" is an arithmetic shift in Java.)

It's against logic :
 "<<" - left shift with zero-filling,
 ">>" - right shift with sign-filling (???),
 ">>>" - right shift with zero-filling (???).
Then again - I use shifts mostly for bit-twiddling.
Mar 21 2002
parent Russell Borogove <kaleja estarcion.com> writes:
Serge K wrote:
// obvious solution, isn't it? ;-)

aligned = (address + align) & ~(align-1);
Well, yes, but then, so is using "unsigned".
In such situation "unsigned" cannot help compiler to replace "%" with masking...
Correctness first, then performance. (I happen to know that this game doesn't do much mem allocation in midgame, also...)
 It's against logic :
 "<<" - left shift with zero-filling,
 ">>" - right shift with sign-filling (???),
 ">>>" - right shift with zero-filling (???).
 Than again - I use shifts mostly for bit-twiddling.
Too bad the +/- glyph isn't traditional ascii -- we could have >> for LSR and +/->> for ASR. Of course, that would look funny. -R
Mar 21 2002
prev sibling parent reply "Roberto Mariottini" <rmariottini lycosmail.com> writes:
"Russell Borogove" <kaleja estarcion.com> ha scritto nel messaggio
news:3C9A1C02.3060708 estarcion.com...
 (This is off-topic, because it's about a C/C++ project,
 but since we were just discussing unsigned, I thought
 I'd share with the whole class.)

 Roberto Mariottini wrote:
 I've programmed C for more than ten years now, and C++ for nearly 10.
 And I NEVER use unsigneds, for I've learnt that it is a RIDICULOUS
thing,
 that took me hours of bug tracking to find.
 Don't think you'll have much code to break, thoug, because people tend
to
 not use unsigneds, like I do.
Some programmer's use of signed integers instead of unsigned just cost me half a day of work.
[...snip code example ...]
 This works on the PS2. This gives back the wrong
 address on the Gamecube. The culprit is the %
 operator acting on a signed number; the user memory
 space on the Gamecube is mapped in the 0x80000000
 and up range. The result of the mod is negative,
 the block handed back to the application is in the
 wrong place, and it stomps on the next mem management
 tracking block.[1]
This bug has come due to failing preconditions (memory address <= MAX_INT). This is only one of the errors that can come when using non-portable code. And writing portable code is VERY difficult, even for experienced programmers like me and you. You are lucky that they both have 32bit integers...
 Using unsigned int throughout this function in
 place of int solves the problem, and should have
 been done in the first place.
I disagree. A cast should be used instead, IMO ;-) Yes, unsigned can be the solution, if only they were really supported...
 Of course, the memory
 mapping on the Gamecube seems a questionable choice
 for this very reason, but still.

 Later in the day, I ran across the code:

    // make sure the pointer is reasonable
    ASSERT((int)pointer > 0x100000);

 Fortunately, this one was easy to identify after
 having solved the above.
It should be:

    ASSERT(pointer > (pointer_type)0x100000);

You are assuming that pointer representation is linear, this may not be good. Only pointer arithmetic can do the right thing.
 -Russell B

 [1] The compiler can't see that align is always going
 to be a power of 2, so it can't convert the mod to a
 mask.[2]
Ok, but if you are working with bits, maybe a mask is better.
 [2] How do you design a language in such a way as to
 strongly type a variable such that it can only be a
 power of two, or only divisible by 7, or only prime?
I don't know.

Russell, I think we have the same problem. Signed and unsigned as they are in C/C++ are not typesafe, and too error-prone to use consistently. I am not against unsigneds, I'd love to have TRUE unsigneds in C/C++. But there aren't, and this is our problem.

Ciao

P.S.: Just curious, has anybody used the signed-modulus operator for useful tasks?
Mar 22 2002
parent Russell Borogove <kaleja estarcion.com> writes:
Roberto Mariottini wrote:
 "Russell Borogove" <kaleja estarcion.com> ha scritto nel messaggio
 news:3C9A1C02.3060708 estarcion.com...
 
(This is off-topic, because it's about a C/C++ project,
but since we were just discussing unsigned, I thought
I'd share with the whole class.)

Roberto Mariottini wrote:

I've programmed C for more than ten years now, and C++ for nearly 10.
And I NEVER use unsigneds, for I've learnt that it is a RIDICULOUS
thing,
that took me hours of bug tracking to find.
Don't think you'll have much code to break, thoug, because people tend
to
not use unsigneds, like I do.
Some programmer's use of signed integers instead of unsigned just cost me half a day of work.
[...snip code example ...] This bug has come due to failing preconditions (memory address <= MAX_INT).
Which was never asserted, despite the fact that, on the whole, this code is pretty heavily peppered with assertions. (This makes me skeptical of design-by-contract as a cure-all, because some assumptions are always going to be implicit rather than explicit -- not that DBC is a bad thing, far from it!) However, with unsigned integers, it just _works_, regardless of the high address bit. That makes the function more reusable, more portable.
 This is only one of the errors that can come when using non-portable code.
 And writing portable code is VERY difficult, even for experienced
 preogrammers
 like me and you. You are lucky that they both have 32bit integers...
Technically, lucky that int is at least as big as pointer to the mapped memory space on both systems, and that pointers have a reasonable bit representation. It should still be at least a few more years before consoles need more than a 4GB memory space... :) Portability is relative. I just want a little bit more portability.
Using unsigned int throughout this function in
place of int solves the problem, and should have
been done in the first place.
I disagree. A cast should be used instead, IMO ;-)
A cast to what? You can't % or & a pointer....
Later in the day, I ran across the code:

   // make sure the pointer is reasonable
   ASSERT((int)pointer > 0x100000);

Fortunately, this one was easy to identify after
having solved the above.
It should be: ASSERT(pointer > (pointer_type)0x100000);
Yes -- but note that this only makes a difference at the machine instruction level because pointer comparisons are considered _unsigned_. Apart from, maybe, selection of registers, on most machines that have the required integer widths (and simple pointer representations, as you correctly note) the expression ((unsigned int)pointer > 0x100000) should generate the exact same instruction sequence as (pointer > (pointer_type)0x100000)
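The difference between the signed and unsigned form of the check shows up as soon as the address has its top bit set. The sketch below models the pointer as a raw 32-bit value so it runs anywhere; the function names are hypothetical, and the conversion of an out-of-range value to int32_t is implementation-defined in C (the comments assume the usual two's-complement wraparound).

```c
#include <stdint.h>

/* Mirrors ASSERT((int)pointer > 0x100000): an address at or above
   0x80000000 turns negative after the cast and fails the check. */
static int looks_valid_signed(uint32_t raw)
{
    return (int32_t)raw > 0x100000;
}

/* Mirrors the unsigned comparison: high addresses compare as large
   positive values and pass. */
static int looks_valid_unsigned(uint32_t raw)
{
    return raw > 0x100000u;
}
```

For an address such as 0x80001000 the first function returns 0 and the second returns 1; for 0x00200000 both return 1.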
 You are assuming that pointer representation is 
 linear, this may not be good.
 Only pointer arithmethic can do the right thing.
This is a politically incorrect thing for me to say, but every C implementation I've used for the last fourteen years has had a simple, linear pointer representation[1], could hold a pointer in a long if not an int, and could accept arbitrary integers being stored into pointers without crashing until the dereference. I know "I'm not supposed to assume that those features are universal", but anyone who introduces a new architecture that doesn't work that way at this point is a fool, and old architectures that don't work that way are rare and getting rarer.[2]
 Russel, I think we have the same problem. Signed and unsigned as they
 are in C/C++ are not typesafe, and too error-prone to use consistently.
 I am not against unsigneds, I'd love to have TRUE unsigneds in C/C++.
 But there aren't, and this is our problem.
Well, we'll have to agree to disagree. I don't want "true unsigneds" because they don't match the underlying machine representation. Consider that most CPUs don't know or care if the contents of a register, or memory, are signed, unsigned, or pointer -- only the compiler really knows, and it's only through the compiler's selection of instructions to apply to them that they gain an identity as one thing or another. It's ingrained in my C experience to think of casts among pointer and integer types as being bit-preserving, and between integer and floating-point types as value-preserving.

I really feel that I'd rather have a separate, general facility for range-constrained values -- where you can say, not that foo is merely signed or unsigned, but that foo's acceptable range is from -60 to +6, for example, and define whether you want out of range results silently clamped, or exceptions thrown. Such a mechanism would be far more powerful, would incur performance overhead only when asked for, and wouldn't change the behavior of the signed and unsigned types that we're used to.

<rant humor_through_repetition_mode=ON>
Of course, with operator overloading, you could just create a range-constrained value class, complete with a virtual handle-out-of-range function that could clamp, wrap, throw an exception, or "waveshape" out of range results, without having to extend or mutate the language itself.
</rant>

-Russell B

[1] Although the simple representation might not be the _only_ one, cf. "near", "far", "huge" on most 16-bit x86 implementations.

[2] I'm aware that this is vaguely hypocritical given my original complaint.
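As a rough idea of what such a range-constrained value could look like without language support, here is a small C sketch using the clamping policy. Ranged and ranged_set_clamped are hypothetical names; a real facility would also offer the wrap and throw policies mentioned above, which are left out here.

```c
/* A value together with its acceptable range (lo <= value <= hi). */
typedef struct {
    int lo;
    int hi;
    int value;
} Ranged;

/* Stores v, silently clamping it into [lo, hi].
   Returns 1 if clamping happened, 0 if v was already in range. */
static int ranged_set_clamped(Ranged *r, int v)
{
    if (v < r->lo) { r->value = r->lo; return 1; }
    if (v > r->hi) { r->value = r->hi; return 1; }
    r->value = v;
    return 0;
}
```

With a range of -60 to +6, storing 10 leaves the value at 6 and storing -100 leaves it at -60; the cost of the check is paid only by code that asks for it, and plain int behaves as before.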
Mar 22 2002