www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - RFC: generic safety, specialized implementation?

reply =?UTF-8?B?THXDrXM=?= Marques <luis luismarques.eu> writes:
Consider some C code like this:

     struct Foo
     {
         struct Foo *next;
         int bar;
     };

     struct Bar
     {
         struct Bar *next;
         char* name;
     };

     struct Unrelated
     {
         int x;
         int y;
     };

     void combulate(void *item)
     {
         struct ItemWithNext
         {
             struct ItemWithNext *next;
         };

         struct ItemWithNext *inext = item;

         // ... lots of code here
         inext->next = ...;
     }

     int main()
     {
         struct Foo foo;
         struct Bar bar;
         struct Unrelated unrelated;
         combulate(&foo);
         combulate(&bar);
         combulate(&unrelated); // bug
     }

The function `combulate` must use a `void*` parameter to accept 
any argument that structurally conforms to the interface it 
expects -- in this case, having a `next` field as its first 
member. That has the disadvantage that the type system doesn't 
catch the mistake of passing it an Unrelated value, which doesn't 
conform to the expected binary interface. In D we might instead 
do something like this:

     struct Foo
     {
         Foo* next;
         int bar;
     }

     struct Bar
     {
         Bar* next;
         char* name;
     }

     struct Unrelated
     {
         int x;
         int y;
     };

     void combulate(T)(T* item)
     {
         // ... lots of code here
         item.next = ...;
     }

     void main()
     {
         Foo foo;
         Bar bar;
         Unrelated unrelated;
         combulate(&foo);
         combulate(&bar);
         combulate(&unrelated); // compilation error
     }

This catches the bug, but will have the disadvantage of 
generating code for the various types to which combulate is 
specialized, even though the body of the function template 
doesn't rely on anything that varies between the specializations. 
That is problematic in the context where I'm using this (embedded 
systems). So instead I've started using a mixed approach, with 
generic code that checks for the appropriate type, but delegates 
the actual work to a non-templated function (a fully specialized 
function template in my actual case), except in the cases where 
the actual code is small enough or the specialization 
significantly improves the performance. Something like this:

     private struct ItemWithNext
     {
         ItemWithNext *next;
     }

     private void combulate(ItemWithNext* item)
     {
         // ... lots of code here
         item.next = ...;
     }

     pragma(inline, true)
     void combulate(T)(T* item)
         if(someKindOfCheck!item)
     {
         combulate(cast(ItemWithNext*) item);
     }

So far this seems to be working well for me. Do you have 
experience writing this kind of code? Do you have any advice that 
might be relevant to this situation?

PS: I know I don't have to define and use a structure to access 
the next field, but I feel like that generalizes better to other 
scenarios and is clearer.
Jan 19 2018
next sibling parent Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 19 January 2018 at 19:18:22 UTC, Luís Marques wrote:
 So far this seems to be working well for me. Do you have 
 experience writing this kind of code? Do you have any advice 
 that might be relevant to this situation?
This is actually the way a lot of stuff is implemented in d runtime code, like the array helper functions (the interface done via a bit of magic in the compiler rather than templates mostly tho). I find it works in some cases with a bit smaller binary, little runtime speed change, and it can sometimes help compile speeds. Of course, you can also sometimes use interfaces and classes to achieve this same goal.
Jan 19 2018
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Jan 19, 2018 at 07:18:22PM +0000, Luís Marques via Digitalmars-d wrote:
[...]
     void combulate(T)(T* item)
     {
         // ... lots of code here
         item.next = ...;
     }
[...] [...]
 This catches the bug, but will have the disadvantage of generating
 code for the various types to which combulate is specialized, even
 though the body of the function template doesn't rely on anything that
 varies between the specializations.
Yeah, I think this is one area where the way the compiler implements templates could be improved. Quite often, in a complex template (whether it's a template function or a template aggregate like a struct or class) only a subset of the code actually depends on the template arguments. Nevertheless, the compiler duplicates the entirety of the code in the generated template. This leads to a lot of template bloat. I'm not sure how possible it is in the current compiler implementation, but it would be nice if the compiler were a bit smarter about instantiating templates. If an instantiated function (could be any subset of the code, but granularity at the function level is probably easiest to implement) would not result in code that differs from other instantiations in the generated IR, only emit the code if it hasn't been emitted yet; otherwise just alias that particular instantiation to the previous instantiation. Perhaps an enhancement request could be filed for this. [...]
 That is problematic in the context where I'm using this (embedded
 systems). So instead I've started using a mixed approach, with generic
 code that checks for the appropriate type, but delegates the actual
 work to a non-templated function (a fully specialized function
 template in my actual case), except in the cases where the actual code
 is small enough or the specialization significantly improves the
 performance. Something like this:
[...]
 So far this seems to be working well for me. Do you have experience
 writing this kind of code? Do you have any advice that might be
 relevant to this situation?
[...] Yeah, basically, this is just doing what I described above by hand. I've done similar refactorings before, to reduce template bloat. It certainly seems to work well. However, it would be nice if the compiler automated such rote work for us. T -- Help a man when he is in trouble and he will remember you when he is in trouble again.
Jan 19 2018
prev sibling parent Timothee Cour <thelastmammoth gmail.com> writes:
this is very related to ICF (identical code folding), which some linkers do.

in summary: gold and lld have options to do that, and it can be unsafe
to do if code relies on each function having distinct address;
 gold's --icf=safe option only enables ICF for functions that can be proven not
to have their address taken, so code that relies on distinct addresses will
still work.
links: https://stackoverflow.com/questions/38662972/does-the-linker-usually-optimize-away-duplicated-code-from-different-c-templat https://stackoverflow.com/questions/15168924/gcc-clang-merging-functions-with-identical-instructions-comdat-folding https://releases.llvm.org/3.9.0/tools/lld/docs/ReleaseNotes.html On Fri, Jan 19, 2018 at 11:29 AM, H. S. Teoh via Digitalmars-d <digitalmars-d puremagic.com> wrote:
 On Fri, Jan 19, 2018 at 07:18:22PM +0000, Luís Marques via Digitalmars-d
wrote:
 [...]
     void combulate(T)(T* item)
     {
         // ... lots of code here
         item.next = ...;
     }
[...] [...]
 This catches the bug, but will have the disadvantage of generating
 code for the various types to which combulate is specialized, even
 though the body of the function template doesn't rely on anything that
 varies between the specializations.
Yeah, I think this is one area where the way the compiler implements templates could be improved. Quite often, in a complex template (whether it's a template function or a template aggregate like a struct or class) only a subset of the code actually depends on the template arguments. Nevertheless, the compiler duplicates the entirety of the code in the generated template. This leads to a lot of template bloat. I'm not sure how possible it is in the current compiler implementation, but it would be nice if the compiler were a bit smarter about instantiating templates. If an instantiated function (could be any subset of the code, but granularity at the function level is probably easiest to implement) would not result in code that differs from other instantiations in the generated IR, only emit the code if it hasn't been emitted yet; otherwise just alias that particular instantiation to the previous instantiation. Perhaps an enhancement request could be filed for this. [...]
 That is problematic in the context where I'm using this (embedded
 systems). So instead I've started using a mixed approach, with generic
 code that checks for the appropriate type, but delegates the actual
 work to a non-templated function (a fully specialized function
 template in my actual case), except in the cases where the actual code
 is small enough or the specialization significantly improves the
 performance. Something like this:
[...]
 So far this seems to be working well for me. Do you have experience
 writing this kind of code? Do you have any advice that might be
 relevant to this situation?
[...] Yeah, basically, this is just doing what I described above by hand. I've done similar refactorings before, to reduce template bloat. It certainly seems to work well. However, it would be nice if the compiler automated such rote work for us. T -- Help a man when he is in trouble and he will remember you when he is in trouble again.
Jan 19 2018