
digitalmars.D.bugs - [Issue 4172] New: Improve varargs

d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4172

           Summary: Improve varargs
           Product: D
           Version: future
          Platform: Other
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: DMD
        AssignedTo: nobody puremagic.com
        ReportedBy: nfxjfg gmail.com



Right now, "D-style Variadic Functions" provide _argptr and _arguments "magic"
variables. _argptr is just a pointer to the arguments, and getting the pointer
to each actual argument is very hard to implement. Actually it's _impossible_
to do in a portable way. Think of non-x86 architectures.

I propose the introduction of an additional _argarray variable, which is a
void*[]. Each array item should point to an argument's value, similar to how
each item of _arguments contains the type of the argument.

Example:

void foo(...) {
   for (uint i = 0; i < _arguments.length; i++) {
     writefln("type of argument %s: %s", i, _arguments[i]);
     writefln("pointer to argument %s: %s", i, _argarray[i]);
   }
}

(Try to implement that in a portable way without _argarray!)

This would also make positional format arguments easier to implement.

Note that there's the va_arg template, but it only works with argument types
known at compile time. For formatting functions and the like, you want to be
able to do everything at runtime.
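
As a rough illustration of that limitation, here is a minimal D2 sketch (the function name describe is made up for this example). Every type that can be decoded has to be spelled out as a template argument to va_arg, and the first unlisted type ends the walk, because nothing here knows how to step over it.

```d
import core.vararg;
import std.conv : to;

// Hypothetical helper: decodes a D-style variadic argument list,
// but only for the types that are listed explicitly below.
string[] describe(...)
{
    string[] seen;
    foreach (ti; _arguments)
    {
        if (ti == typeid(int))
        {
            // The type must be known at compile time here.
            int v = va_arg!int(_argptr);
            seen ~= "int " ~ to!string(v);
        }
        else if (ti == typeid(double))
        {
            double v = va_arg!double(_argptr);
            seen ~= "double " ~ to!string(v);
        }
        else
        {
            // Unlisted type: this sketch cannot step over it, so it stops.
            seen ~= "unknown";
            break;
        }
    }
    return seen;
}

void main()
{
    import std.stdio : writeln;
    writeln(describe(1, 2.5, "three"));
}
```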

This proposal is meant for both D1 and D2.
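
The effect of the proposed _argarray can be approximated in D2 with a variadic template that builds both arrays at the call site. This is only an emulation under that assumption, not the compiler change requested here; fooImpl and the template wrapper are names invented for the sketch.

```d
import std.stdio : writefln;

// Runtime side: gets one TypeInfo and one void* per argument, which is what
// _arguments and the proposed _argarray would provide. It returns the sum of
// all int arguments to show that other types are skipped without decoding.
int fooImpl(TypeInfo[] arguments, void*[] argarray)
{
    int sum = 0;
    foreach (i, ti; arguments)
    {
        writefln("type of argument %s: %s", i, ti);
        writefln("pointer to argument %s: %s", i, argarray[i]);
        if (ti == typeid(int))
            sum += *cast(int*) argarray[i];
    }
    return sum;
}

// Emulation only: the template knows the types at compile time and
// hands the runtime function a type array and a pointer array.
int foo(Args...)(Args args)
{
    TypeInfo[Args.length] types;
    void*[Args.length] ptrs;
    foreach (i, ref arg; args)
    {
        types[i] = typeid(Args[i]);
        ptrs[i] = cast(void*) &arg;
    }
    return fooImpl(types[], ptrs[]);
}

void main()
{
    foo(1, 2.5, "three", 4);
}
```

Skipping an argument is just a matter of moving on to the next array index, so no knowledge of the stack layout is needed.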

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 10 2010
d-bugmail puremagic.com writes:


Fawzi Mohamed <fawzi gmx.ch> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fawzi gmx.ch



The clean way to fix this (and what LDC does) is to pack all arguments
(aligning them) in a stack allocated array, create the typeinfo array, and then
call the function passing to it a void* to the packed array and the typeinfo
array.

Thus the D vararg would be equivalent to void*,TypeInfo[].
More explicitly:
f(...) -> f(void*,TypeInfo[])

This is slightly less efficient than C for some arguments, but is very
portable, and one can skip an argument just using typeinfo information (tsize,
and possibly alignment).

This was the first bug I encountered when coming to D: I tried to print a
complex number and it failed because Tango did not explicitly decode complex
types. I added support for decoding a few more hard-coded struct types, but as
nfxjfg says, it is impossible to cover all cases on all architectures, because
one would have to enumerate every possible type at compile time and decode
each one with its own va_arg call.

The proposed solution is relatively efficient, and portable.
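
A minimal sketch of how a callee could walk such a packed array using only TypeInfo, assuming the caller aligned each argument according to the tsize and talign values its TypeInfo reports. The helper names alignUp and locate are invented here, and the struct in main merely stands in for a buffer packed by the compiler.

```d
// Round offset up to the next multiple of alignment.
size_t alignUp(size_t offset, size_t alignment)
{
    return (offset + alignment - 1) / alignment * alignment;
}

// Walk a packed argument buffer using only runtime type information.
// Returns a pointer to each argument; stepping over an argument needs
// nothing more than its size and alignment.
void*[] locate(void* packed, TypeInfo[] types)
{
    void*[] result;
    size_t offset = 0;
    foreach (ti; types)
    {
        offset = alignUp(offset, ti.talign);
        result ~= cast(void*) (cast(ubyte*) packed + offset);
        offset += ti.tsize;
    }
    return result;
}

void main()
{
    // Stand-in for a caller-packed buffer: a ubyte, an int, then a double.
    static struct Packed { ubyte a; int b; double c; }
    auto buf = Packed(1, 42, 2.5);

    TypeInfo[3] types;
    types[0] = typeid(ubyte);
    types[1] = typeid(int);
    types[2] = typeid(double);

    auto ptrs = locate(&buf, types[]);
    assert(*cast(int*) ptrs[1] == 42);
    assert(*cast(double*) ptrs[2] == 2.5);
}
```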

Nov 24 2010
d-bugmail puremagic.com writes:




22:08:23 PST ---

> The clean way to fix this (and what LDC does) is to pack all arguments
> (aligning them) in a stack allocated array, create the typeinfo array, and then
> call the function passing to it a void* to the packed array and the typeinfo
> array.

Doesn't dmd do the same?

A handier way is to pack the arguments into a struct and pass a single TypeInfo
for that struct. It already has all the necessary alignment and offset info.
Nov 25 2010
d-bugmail puremagic.com writes:




22:11:11 PST ---
And TypeInfo is immutable and doesn't require construction on every call.

Nov 25 2010
d-bugmail puremagic.com writes:






> The clean way to fix this (and what LDC does) is to pack all arguments
> (aligning them) in a stack allocated array, create the typeinfo array, and then
> call the function passing to it a void* to the packed array and the typeinfo
> array.
> Doesn't dmd do the same?

Yeah, it looks like dmd on 64 bit will emulate the old shitty way that was
"natural" for dmd on 32 bit. But I don't understand why you would WANT to do
that. Why emulate something broken and hard to use? Passing an array of void*,
one per argument (see _argarray in the issue description), would be so much
easier.

> More handy way is to pack arguments into struct and pass single TypeInfo for
> that struct. It already has all necessary align and offset info there.

Not sure what you mean by that. There's no RTTI for struct members, so this
would not be very useful. You'd still need to pass _arguments, and you'd still
need to follow the ABI for the struct layout (granted, that is better than
trying to follow the stack layout).
Nov 26 2010
d-bugmail puremagic.com writes:




10:43:31 PST ---
TypeInfo has an offTi property that returns OffsetTypeInfo[], which is exactly
what you want: the types and offsets of struct members.

What problem do you have with struct layout? Can't you add the field offset to
the address of the struct to get the address of the field? Anyway, if
reflection in D is awful, maybe it's better to make it usable rather than
burden the compiler with work that could be done by reflection but is not done
yet?

Nov 26 2010
d-bugmail puremagic.com writes:





> TypeInfo has offTi property that returns OffsetTypeInfo[], which is exactly
> what you want - types and offsets of struct members.

That information is (and always has been) missing. offTi is always null. Maybe
Walter tried it and then thought it'd use too much memory.

> What problem do you have with struct layout? You can't sum address of struct
> and field offset to get address of field? Anyway, if reflection in D is awful,
> may be it's better to make it usable rather than burden the compiler with work
> that can be done by reflection but is not done yet?

Good luck with that. And I don't see any additional burden. It's already
burdened with "packing" the params on the stack. My proposal might actually
make it simpler for both compiler and user. Why burden the user with highly ABI
dependent struct or stack layouts? Is this assembler?
Nov 26 2010
d-bugmail puremagic.com writes:


Andrei Alexandrescu <andrei metalanguage.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
                 CC|                            |andrei metalanguage.com
            Version|future                      |D1
         AssignedTo|nobody puremagic.com        |bugzilla digitalmars.com



09:32:53 PST ---
Marking this as a D1 only thing and leaving decision to Walter. My suggestion
is to close this as a wontfix.

Nov 28 2010
d-bugmail puremagic.com writes:




I agree. I came to this from a discussion on IRC, and I was thinking it was a
GDC bug (GDC has a badly broken implementation).
If implemented correctly, and with alignment info and size in the typeinfo, it
is perfectly workable.
Not the nicest possible solution, but reasonably efficient and usable.

Nov 28 2010
d-bugmail puremagic.com writes:




21:56:33 PST ---
Unfortunately, the 64 bit C ABI is rather disastrously complex for varargs. I
went back and forth for a while on how to implement it, and whether to use the
C ABI for D variadic functions as well as C variadic functions.

I finally decided that, although the C variadics were inefficient, they are
used rarely enough that it doesn't much matter, and that D will follow the C
ABI.

The result is a much expanded and more complex std.c.stdarg implementation.

We can revisit this and look into making it more efficient later, but for now I
just want to get it working.

Nov 29 2010
d-bugmail puremagic.com writes:




I don't understand why an LDC-like approach (the caller prepares a marshalled
array, and the vararg function is equivalent to (void*,TypeInfo[])) would not
work if typeinfo has alignment info.

The use of the C ABI is the reason gdc is broken: with it, it is not possible
to loop over arbitrary arguments. All possible argument types have to be
accounted for at compile time, which defeats much of the purpose of varargs...

Nov 29 2010
d-bugmail puremagic.com writes:




This discussion might be relevant:

http://dsource.org/projects/tango/ticket/1042

If the C ABI is used, then one is better off using compile-time varargs
(maybe with an explicit extra tuple) to build an "old-style" vararg call.
Thus I would see no point in doing that; deprecating them would be better.

Nov 30 2010
d-bugmail puremagic.com writes:





> We can revisit this and look into making it more efficient later, but for now I
> just want to get it working.

It's about ease of use, not efficiency. The C ABI is indeed "disastrously
complex". Just look at the code here:
http://www.dsource.org/projects/druntime/browser/trunk/src/core/stdc/stdarg.d

The user has to duplicate that code if he wants to use TypeInfos to unpack the
arguments, instead of using compile-time types (which va_start would require).
Now think what that would look like on 64 bits. If you're going to use the
64 bit C ABI for D variadics, you may as well completely remove them from D1
and D2.

I don't understand what's so hard about just creating a void*[] on the stack,
whose items point to local variables containing the actual argument data. I've
done something similar before, when I changed the associative array ABI for my
precise GC scanning patch.
Nov 30 2010
d-bugmail puremagic.com writes:




PS: passing a void*[] instead of a void* would also allow users to build
varargs at the call site.

Backward compatibility with the old way (for D1) can be achieved by naming the
void*[] _argarray, and by setting _argptr to _argarray[0], and making the
compiler write all arguments in a linear array. (That would be done for
transition. The "old" way of traversing args would be declared deprecated.)

Nov 30 2010
d-bugmail puremagic.com writes:




07:50:14 PST ---
Should we close this then? Again, for D2 the vararg problem is settled, and I
see little reason to put much work on improving D1's varargs only.

Nov 30 2010
d-bugmail puremagic.com writes:




To keep backward compatibility and a working compiler at least for D1: if
marshalling is too difficult with the current compiler, I would evaluate the
effort of creating a hidden variadic template function for each variadic
function. Using a tuple, it would pass the arguments, following the old
convention, to the real variadic function, which is defined as taking
void*,TypeInfo[]. Then at each call site, after overload resolution, if the
match is the variadic function, one would call the corresponding variadic
template function instead.

The only differences between variadic functions and template functions are
small overloading differences (maybe no longer in D2, I did not check the
details) and the possibility of overriding variadic functions, and both would
still work correctly. With such an approach one can keep D1 working on 64 bit
architectures as well.
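
A rough D2 sketch of that idea, written by hand rather than generated by the compiler. The names firstIntImpl and firstInt are invented for illustration, and using a struct that holds the tuple as the packed buffer is an assumption of this sketch, not something the thread specifies.

```d
// Stand-in for the "real" variadic function after the rewrite:
// it receives the marshalled arguments and their runtime types.
int firstIntImpl(void* packed, TypeInfo[] types)
{
    // The first argument sits at offset 0 of the packed buffer.
    if (types.length > 0 && types[0] == typeid(int))
        return *cast(int*) packed;
    return 0;
}

// Hypothetical hidden wrapper that would be called whenever overload
// resolution picks the variadic function.
int firstInt(Args...)(Args args)
{
    // A struct holding the tuple is one simple way to get a packed,
    // correctly aligned copy of the arguments on the stack.
    static struct Packed { Args fields; }
    auto packed = Packed(args);

    TypeInfo[Args.length] types;
    foreach (i, T; Args)
        types[i] = typeid(T);

    return firstIntImpl(&packed, types[]);
}

void main()
{
    assert(firstInt(7, 2.5, "x") == 7);
}
```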

Dec 02 2010
d-bugmail puremagic.com writes:




Here we can see the result of the "old" varargs being ported to 64 bits:
http://dsource.org/projects/phobos/changeset/2229

The compiler got more complicated too. Good job, Walter.

Dec 21 2010
d-bugmail puremagic.com writes:


nfxjfg gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |WONTFIX


Dec 21 2010
d-bugmail puremagic.com writes:




20:25:56 PST ---

> Here we can see the result of the "old" varargs being ported to 64 bits:
> http://dsource.org/projects/phobos/changeset/2229
> The compiler got more complicated too. Good job, Walter.

It's not even the "old" varargs. I don't think it's possible to support the C
varargs correctly with the 64 bit ABI. (Note that gcc does not implement the 64
bit varargs ABI correctly - I had to do a lot of experiments to figure out how
it *really* worked.)

We could invent our own ABI for varargs, and it would be simple. But then,
we're screwed trying to interoperate with C code that uses varargs.

The 64 bit varargs works tolerably ok, though I'm not thrilled with it. The
reason for that changeset is I wished to avoid the varargs-style copying of the
argument in order to access it. Much better to point to wherever it is, so the
code has to dip under the hood to the dirty underbelly of varargs. Avoiding the
copy not only speeds things up, it avoids issues like where/when does the
destructor happen on the copy?

To me, the 64 bit varargs ABI looks like a giant mistake that was codified
instead of fixed in order to preserve backwards compatibility. (It suggests the
original designer tried to do some clever optimizations, but failed to think it
through and the result is an inefficient mess.)
Dec 21 2010
d-bugmail puremagic.com writes:





> We could invent our own ABI for varargs, and it would be simple. But then,
> we're screwed trying to interoperate with C code that uses varargs.

The D ABI doesn't need to follow the C ABI, and that includes varargs. Of
course the compiler still needs to implement the C ABI for extern(C) functions,
but that is nothing a D programmer needs to care about. This issue wasn't about
the C ABI at all.
Dec 21 2010