www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - A Tango Fibers question and a functional programming anecdote.

reply downs <default_357-line yahoo.de> writes:
Here's a question to the tango devs.
I've been developing my own stackthreads (Fibers basically) library for
a while now, and finally got it working a few days ago.

Then we did some speed tests and discovered that it was about twice as
slow as Tango's Fiber implementation.

This prompted me to dig into the Fibers sourcecode, where I discovered
that Fibers doesn't backup ECX/EDX on context switch, and the FP
registers only on windows.

Now I am confused.

Could a knowledgeable tango dev kindly shed light on this issue? Why
isn't it necessary to save all registers on a context switch?

----
Now for the anecdote.
Today, at five in the morning, I wrote what I believe to be the first
purely functional D code with as little redundancy as possible in the
language.

 ["Load ", "Prefetch ", "Prefetch "] /zip/ ([0, 1, 2] /map/ (&info
/rfix/ currentComic /fix/ (&subtract!(int) /fix/ currentStrip))) /map/ &concat!(string) /zip/ [&update, &prefetch /fix/ (currentStrip-1), &prefetch /fix/ (currentStrip-2)] /map/ &Pool.addTask;

So I looked at it, and for some _strange_ reason my eyes suddenly
started bleeding heavily.
That's when I knew I had overdone it on the templates again.
:)


PS: Here's the same code with some line breaks. But where's the fun in that?

 ["Load ", "Prefetch ", "Prefetch "] /zip/ (
   [0, 1, 2] /map/ (
     &info /rfix/ currentComic /fix/ (
       &subtract!(int) /fix/ currentStrip
       )
     )
   )
   /map/ &concat!(string) /zip/ [
     &update,
     &prefetch /fix/ (currentStrip-1),
     &prefetch /fix/ (currentStrip-2)
   ] /map/ &Pool.addTask;
Oct 27 2007
next sibling parent reply "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"downs" <default_357-line yahoo.de> wrote in message 
news:fg15ev$12uh$1 digitalmars.com...
 Here's a question to the tango devs.
 I've been developing my own stackthreads (Fibers basically) library for
 a while now, and finally got it working a few days ago.

 Then we did some speed tests and discovered that it was about twice as
 slow as Tango's Fiber implementation.

 This prompted me to dig into the Fibers sourcecode, where I discovered
 that Fibers doesn't backup ECX/EDX on context switch, and the FP
 registers only on windows.

 Now I am confused.

 Could a knowledgeable tango dev kindly shed light on this issue? Why
 isn't it necessary to save all registers on a context switch?
I'm not knowledgeable, but you said ECX/EDX and it triggered something in my brain. Sure enough, according to the D 1.0 ABI, "EAX, ECX, EDX are scratch registers and can be destroyed by a function." So they don't have to be preserved because you're allowed to do anything with them. As for nothing but ST(0-7) being preserved on Windows.. no clue.
 ["Load ", "Prefetch ", "Prefetch "] /zip/ ([0, 1, 2] /map/ (&info
/rfix/ currentComic /fix/ (&subtract!(int) /fix/ currentStrip))) /map/ &concat!(string) /zip/ [&update, &prefetch /fix/ (currentStrip-1), &prefetch /fix/ (currentStrip-2)] /map/ &Pool.addTask;
 ["Load ", "Prefetch ", "Prefetch "] /zip/ (
   [0, 1, 2] /map/ (
     &info /rfix/ currentComic /fix/ (
       &subtract!(int) /fix/ currentStrip
       )
     )
   )
   /map/ &concat!(string) /zip/ [
     &update,
     &prefetch /fix/ (currentStrip-1),
     &prefetch /fix/ (currentStrip-2)
   ] /map/ &Pool.addTask;
Why/how do you write code like this? Is it some kind of thing in your brain that makes you think about everything inside out? Stop it. Please. For all our brains' sakes.
Oct 27 2007
parent reply downs <default_357-line yahoo.de> writes:
Jarrett Billingsley wrote:
 I'm not knowledgeable, but you said ECX/EDX and it triggered something in my 
 brain.  Sure enough, according to the D 1.0 ABI, "EAX, ECX, EDX are scratch 
 registers and can be destroyed by a function."  So they don't have to be 
 preserved because you're allowed to do anything with them.
 
Removed the push for ECX and EDX ( the others are used internally ). It seems to work. Many thanks! :) Still runs somewhat slower than Fibers, though. I can't imagine why. Also, oddly, GDC doesn't allow me to push EBP in -O mode and D inline assembly. Works in native GCC assembly. *Very* odd.
 As for nothing but ST(0-7) being preserved on Windows.. no clue.
 
Aww. Still thanks for answering. I'm sure somebody will know it.
 ["Load ", "Prefetch ", "Prefetch "] /zip/ ([0, 1, 2] /map/ (&info
/rfix/ currentComic /fix/ (&subtract!(int) /fix/ currentStrip))) /map/ &concat!(string) /zip/ [&update, &prefetch /fix/ (currentStrip-1), &prefetch /fix/ (currentStrip-2)] /map/ &Pool.addTask;
 ["Load ", "Prefetch ", "Prefetch "] /zip/ (
   [0, 1, 2] /map/ (
     &info /rfix/ currentComic /fix/ (
       &subtract!(int) /fix/ currentStrip
       )
     )
   )
   /map/ &concat!(string) /zip/ [
     &update,
     &prefetch /fix/ (currentStrip-1),
     &prefetch /fix/ (currentStrip-2)
   ] /map/ &Pool.addTask;
Why/how do you write code like this? Is it some kind of thing in your brain that makes you think about everything inside out? Stop it. Please. For all our brains' sakes.
:evil cackle: Braaaaaaaaaaaaaaains! Actually, it started out with a piece of code that looked like this
 Pool.addTask("Load "~info(), &update);
 Pool.addTask("Prefetch "~info(), &update);
 Pool.addTask("Prefetch "~info(), &update);
Clearly, the addTask is redundant here. I replaced it with
 [stuple("Load "~info(), &update), stuple...] /map/ &Pool.addTask;
Somebody in IRC pointed out that the stuple was redundant. He must feel horrible for indirectly causing this.
 ["Load "~info, ...] /zip/ [&update, ...] /map/ &Pool.addTask;
Then I noticed the info calls were also redundant. It went kinda out of control after that. You know the result. :) --downs
Oct 28 2007
next sibling parent Daniel Keep <daniel.keep.lists gmail.com> writes:
downs wrote:
 Jarrett Billingsley wrote:
 ...
 As for nothing but ST(0-7) being preserved on Windows.. no clue.
Aww. Still thanks for answering. I'm sure somebody will know it.
I think this might be it (from the ABI): "The FPU stack must be empty when calling a function." So when it calls the function, the caller will have emptied the FP stack, so there's nothing *in* ST(0-7) to save.
 ...
 
 ["Load "~info, ...] /zip/ [&update, ...] /map/ &Pool.addTask;
Then I noticed the info calls were also redundant. It went kinda out of control after that. You know the result. :) --downs
You're my hero, downs. -- Daniel
Oct 28 2007
prev sibling parent "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"downs" <default_357-line yahoo.de> wrote in message 
news:fg1qhp$22m6$1 digitalmars.com...

 Actually, it started out with a piece of code that looked like this

 Pool.addTask("Load "~info(), &update);
 Pool.addTask("Prefetch "~info(), &update);
 Pool.addTask("Prefetch "~info(), &update);
There's an idea for a language feature (off-the-wall and will never get in, but regardless!) class Pool { void addTask(char[][] names..., void delegate() dgs...) { // names and dgs should be the same length // pairs are in names[i] and dgs[i] } } Pool.addTask("Load " ~ info(), &update, "Prefetch " ~ info(), &update); Heee. I don't see, though, why you couldn't do void addTask(void delegate()[char[]] tasks) { ... } Pool.addTask(["Load " ~ info() : &update, "Prefetch " ~ info() : &update]);
 Clearly, the addTask is redundant here.
 I replaced it with

 [stuple("Load "~info(), &update), stuple...] /map/ &Pool.addTask;
Somebody in IRC pointed out that the stuple was redundant. He must feel horrible for indirectly causing this.
 ["Load "~info, ...] /zip/ [&update, ...] /map/ &Pool.addTask;
Then I noticed the info calls were also redundant. It went kinda out of control after that. You know the result. :)
Redundancy is not much to worry about, especially if the result of removing it is entirely unreadable.
Oct 28 2007
prev sibling next sibling parent Don Clugston <dac nospam.com.au> writes:
downs wrote:
 Here's a question to the tango devs.
 I've been developing my own stackthreads (Fibers basically) library for
 a while now, and finally got it working a few days ago.
 
 Then we did some speed tests and discovered that it was about twice as
 slow as Tango's Fiber implementation.
 
 This prompted me to dig into the Fibers sourcecode, where I discovered
 that Fibers doesn't backup ECX/EDX on context switch, and the FP
 registers only on windows.
 
 Now I am confused.
 
 Could a knowledgeable tango dev kindly shed light on this issue? Why
 isn't it necessary to save all registers on a context switch?
 
 ----
 Now for the anecdote.
 Today, at five in the morning, I wrote what I believe to be the first
 purely functional D code with as little redundancy as possible in the
 language.
 
 ["Load ", "Prefetch ", "Prefetch "] /zip/ ([0, 1, 2] /map/ (&info
/rfix/ currentComic /fix/ (&subtract!(int) /fix/ currentStrip))) /map/ &concat!(string) /zip/ [&update, &prefetch /fix/ (currentStrip-1), &prefetch /fix/ (currentStrip-2)] /map/ &Pool.addTask; So I looked at it, and for some _strange_ reason my eyes suddenly started bleeding heavily. That's when I knew I had overdone it on the templates again. :)
In my best Hercules Grytpype-Thynne voice: "You silly, twisted boy." http://www.thegoonshow.net/quotes.asp
Oct 28 2007
prev sibling next sibling parent reply Robert Fraser <fraserofthenight gmail.com> writes:
downs Wrote:

 ["Load ", "Prefetch ", "Prefetch "] /zip/ ([0, 1, 2] /map/ (&info
/rfix/ currentComic /fix/ (&subtract!(int) /fix/ currentStrip))) /map/ &concat!(string) /zip/ [&update, &prefetch /fix/ (currentStrip-1), &prefetch /fix/ (currentStrip-2)] /map/ &Pool.addTask;
Ewwww! That whole create-your-own infix operator thing *is* cool, no doubt, but IMO, it reduces readability. To me: map(array, &func); ... is more readable than... array /map/ &func; ... but maybe that's because I've never actually done any functional programming.
Oct 28 2007
parent Bill Baxter <dnewsgroup billbaxter.com> writes:
Robert Fraser wrote:
 downs Wrote:
 
 ["Load ", "Prefetch ", "Prefetch "] /zip/ ([0, 1, 2] /map/ (&info
/rfix/ currentComic /fix/ (&subtract!(int) /fix/ currentStrip))) /map/ &concat!(string) /zip/ [&update, &prefetch /fix/ (currentStrip-1), &prefetch /fix/ (currentStrip-2)] /map/ &Pool.addTask;
Ewwww! That whole create-your-own infix operator thing *is* cool, no doubt, but IMO, it reduces readability. To me: map(array, &func); ... is more readable than... array /map/ &func; ... but maybe that's because I've never actually done any functional programming.
The use of infix operators is unrelated to functional programming. Lisp doesn't have infix operators, for instance. map(array, &func) is every bit as "functional" as array /map/ &func. Or array.map(&func) for that matter. The functional style part comes from hiding the explicit procedural steps taken by the computer in functions like 'map', and by treating functions like 'func' more as data than code. --bb
Oct 28 2007
prev sibling next sibling parent reply BCS <ao pathlink.com> writes:
Reply to Downs,

 ["Load ", "Prefetch ", "Prefetch "] /zip/ (
 [0, 1, 2] /map/ (
 &info /rfix/ currentComic /fix/ (
 &subtract!(int) /fix/ currentStrip
 )
 )
 )
 /map/ &concat!(string) /zip/ [
 &update,
 &prefetch /fix/ (currentStrip-1),
 &prefetch /fix/ (currentStrip-2)
 ] /map/ &Pool.addTask;
how the heck is that parsing?? Is the /map/ actualy this: (A / B) / C; ??
Oct 28 2007
parent reply BCS <ao pathlink.com> writes:
Reply to Benjamin,

 Reply to Downs,
 
 ["Load ", "Prefetch ", "Prefetch "] /zip/ (
 [0, 1, 2] /map/ (
 &info /rfix/ currentComic /fix/ (
 &subtract!(int) /fix/ currentStrip
 )
 )
 )
 /map/ &concat!(string) /zip/ [
 &update,
 &prefetch /fix/ (currentStrip-1),
 &prefetch /fix/ (currentStrip-2)
 ] /map/ &Pool.addTask;
how the heck is that parsing?? Is the /map/ actualy this: (A / B) / C; ??
BTW if that is what you are doing, I may be force to do something totally EVIL to reclaim some ground in the "does nasty things with D" ranking. I have a few ideas though.... }:-) (BBBBWWWWAAAA HAA Haa haaa.....)
Oct 28 2007
parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
BCS wrote:
 Reply to Benjamin,
 ...
 how the heck is that parsing?? Is the /map/ actualy this:

 (A / B) / C; ??
BTW if that is what you are doing, I may be force to do something totally EVIL to reclaim some ground in the "does nasty things with D" ranking. I have a few ideas though.... }:-) (BBBBWWWWAAAA HAA Haa haaa.....)
IIRC, map is basically defined like this: struct map { static map_ opDiv_r(T)(some_array_type!(T)); } struct map_ { some_array_type!(T) opDiv(T)(some_delegate_type!(T)); } Or something similar. -- Daniel
Oct 28 2007
parent reply downs <default_357-line yahoo.de> writes:
Daniel Keep wrote:
 
 BCS wrote:
 Reply to Benjamin,
 ...
 how the heck is that parsing?? Is the /map/ actualy this:

 (A / B) / C; ??
BTW if that is what you are doing, I may be force to do something totally EVIL to reclaim some ground in the "does nasty things with D" ranking. I have a few ideas though.... }:-) (BBBBWWWWAAAA HAA Haa haaa.....)
Looking forward to it! :) The world needs more weird and disturbing things. Shake it up.
 
 IIRC, map is basically defined like this:
 
 struct map
 {
   static map_ opDiv_r(T)(some_array_type!(T));
 }
 
 struct map_
 {
   some_array_type!(T) opDiv(T)(some_delegate_type!(T));
 }
 
 Or something similar.
 
 	-- Daniel
Yes exactly; though I have all the ugliness packed around in a template. Quote from the respective file:
 mixin(Operator!("map", "
   static if (is(ReturnType!(RHS)==void)) {
     foreach (ref entry; lhs) rhs(mixin(Value!(ElemType!(LHS), RHS,
"~Quote!("entry")~")));
   } else {
     auto res=new ReturnType!(RHS)[lhs.length];
     foreach (id, ref entry; res)
entry=rhs(mixin(Value!(ElemType!(LHS), RHS, "~Quote!("lhs[id]")~")));
     return res;
   }
 "));
Oct 28 2007
parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
downs wrote:
 Daniel Keep wrote:
 BCS wrote:
 Reply to Benjamin,
 ...
 how the heck is that parsing?? Is the /map/ actualy this:

 (A / B) / C; ??
BTW if that is what you are doing, I may be force to do something totally EVIL to reclaim some ground in the "does nasty things with D" ranking. I have a few ideas though.... }:-) (BBBBWWWWAAAA HAA Haa haaa.....)
Looking forward to it! :) The world needs more weird and disturbing things. Shake it up.
 IIRC, map is basically defined like this:

 struct map
 {
   static map_ opDiv_r(T)(some_array_type!(T));
 }

 struct map_
 {
   some_array_type!(T) opDiv(T)(some_delegate_type!(T));
 }

 Or something similar.

 	-- Daniel
Yes exactly; though I have all the ugliness packed around in a template.
Does it compile down to the same code as just a plain call to a map() function? (Or is there some other benefit that justifies the extra runtime cost and loss of readability?) --bb
Oct 29 2007
parent downs <default_357-line yahoo.de> writes:
Bill Baxter wrote:
 
 Does it compile down to the same code as just a plain call to a map()
 function?  (Or is there some other benefit that justifies the extra
 runtime cost and loss of readability?)
 
 --bb
Loss of readability is debatable. Personally I like it better, or else I wouldn't use it. :) The runtime cost might not be as high as you expect. Let us do a test.
 module test12;
 import std.stdio;
 import tools.base, tools.functional, tools.log;

 int add(int a, int b) { return a+b; }

 mixin(Operator!("add3", "return lhs+rhs;"));

 void main() {
   const long count=1_000_000_000;
   long test;
   int delegate(int, int) add1=(int a, int b) { return add(a, b); };
   int function(int, int) add2=&add;
   logln("Function: ", time({ for (int i=0; i<count; ++i) test+=add(0,
1); }), " -test ", test);
   logln("Delegate function: ", time({ for (int i=0; i<count; ++i)
test+=add1(0, 1); }), " -test ", test);
   logln("Function ... function: ", time({ for (int i=0; i<count; ++i)
test+=add2(0, 1); }), " -test ", test);
   logln("Infix function: ", time({ for (int i=0; i<count; ++i) test+=0
/add3/ 1; }), " -test ", test);
 }
The results:
 gentoo-pc ~/d $ gdc test12.d -o test12 -O3 -frelease -Itools
tools/tools/*.d && ./test12
 /--[ Main Thread ]------
 | Function: 0 -test 1000000000
 | Delegate function: 7234 -test 2000000000
 | Function ... function: 6415 -test -1294967296
 | Infix function: 1501 -test -294967296
 \-----------------------------
Hm ... interesting. I had expected the infix function to be inlined as well as the straight function. Ah well. Four times the speed of an un-inlined function is good enough for me. :)
Oct 29 2007
prev sibling next sibling parent bearophile <bearophile lycos.com> writes:
downs Wrote:
 ["Load ", "Prefetch ", "Prefetch "] /zip/ ([0, 1, 2] /map/ (&info
/rfix/ currentComic /fix/ (&subtract!(int) /fix/ currentStrip))) /map/ &concat!(string) /zip/ [&update, &prefetch /fix/ (currentStrip-1), &prefetch /fix/ (currentStrip-2)] /map/ &Pool.addTask;
I too have developed a large functional-style lib for D, but I think D needs a bit more built-in support for it (the first class functions, allowing a better syntax, etc) and you need to program in a more tidy way :-) bearophile
Oct 29 2007
prev sibling parent reply Mikola Lysenko <mclysenk mtu.edu> writes:
downs Wrote:

 Here's a question to the tango devs.
 I've been developing my own stackthreads (Fibers basically) library for
 a while now, and finally got it working a few days ago.
 
 Then we did some speed tests and discovered that it was about twice as
 slow as Tango's Fiber implementation.
 
 This prompted me to dig into the Fibers sourcecode, where I discovered
 that Fibers doesn't backup ECX/EDX on context switch, and the FP
 registers only on windows.
 
 Now I am confused.
 
 Could a knowledgeable tango dev kindly shed light on this issue? Why
 isn't it necessary to save all registers on a context switch?
 
Well, as one of the culprits implicated in writing the fiber switching code, I might be able to explain things. First, let me clear one thing up: Tango does not store the float registers on Windows. Since the C calling convention on both Posix and Win32 states that the FPU is not preserved across calls, there is no need to do such a thing. You may be confusing the floating point stack with the necromancy involving the FS segment values, which are part of the thread information block on windows. Those values are necessary for unwinding the stack in the event of an exception and absolutely must be saved in that exact order. Doing otherwise risks horrible disaster if the system were to inadvertently switch threads in the middle of a fiber switch or if one of the fibers were to generate an exception (via throw or otherwise). Many things went in to making the fiber switching on Tango fast, including optimizing register usage and restructuring the API such that each operation could be performed in O(1) time with a low constant cost. Seemingly simple stuff like marking/unmarking GC regions had to be overhauled from the Phobos' O(n) algorithm to make them efficient. Even without optimizations, just getting fiber switching *right* is very difficult. There are many subtle issues involving garbage collection, thread synchronization and exception handling. I spent many hours hunting down tiny race conditions and testing various extreme cases in order to track down and exterminate as many bugs as we could possibly find. For these reasons and more, I would strongly recommend that you just use Tango's Fibers rather than try to roll your own. They are highly optimized and have been rigorously checked. While it is possible to get a fairly rudimentary version of context switching working, I can't say that I would trust such an implementation. -Mik
Oct 29 2007
parent reply downs <default_357-line yahoo.de> writes:
Mikola Lysenko wrote:
 
 Well, as one of the culprits implicated in writing the fiber switching code, I
might be able to explain things.  First, let me clear one thing up:  
 
 Tango does not store the float registers on Windows.  Since the C calling
convention on both Posix and Win32 states that the FPU is not preserved across
calls, there is no need to do such a thing.  You may be confusing the floating
point stack with the necromancy involving the FS segment values, which are part
of the thread information block on windows.  Those values are necessary for
unwinding the stack in the event of an exception and absolutely must be saved
in that exact order.  Doing otherwise risks horrible disaster if the system
were to inadvertently switch threads in the middle of a fiber switch or if one
of the fibers were to generate an exception (via throw or otherwise).
 
 Many things went in to making the fiber switching on Tango fast, including
optimizing register usage and restructuring the API such that each operation
could be performed in O(1) time with a low constant cost.  Seemingly simple
stuff like marking/unmarking GC regions had to be overhauled from the Phobos'
O(n) algorithm to make them efficient.
 
 Even without optimizations, just getting fiber switching *right* is very
difficult.  There are many subtle issues involving garbage collection, thread
synchronization and exception handling.  I spent many hours hunting down tiny
race conditions and testing various extreme cases in order to track down and
exterminate as many bugs as we could possibly find.  
 
 For these reasons and more, I would strongly recommend that you just use
Tango's Fibers rather than try to roll your own.  They are highly optimized and
have been rigorously checked.  While it is possible to get a fairly rudimentary
version of context switching working, I can't say that I would trust such an
implementation.
 
 -Mik
Hell, I'll *gladly* use them the moment they're available on Phobos. Even getting my current version to work has been a week-long nightmare. Phobos being the target platform for scrapple.tools though, until then, I'll have no choice but to go with my own thang. So I guess all I can hope for is that merging the Phobos and Tango base code happen fast. Apropos, when that happens, will Fibers become a core part of Phobos / the shared core? I'd very much like to see that day. In the meantime, thanks for the explanation. --downs
Oct 29 2007
parent reply Sean Kelly <sean f4.ca> writes:
downs wrote:
 
 So I guess all I can hope for is that merging the Phobos and Tango base
 code happen fast. Apropos, when that happens, will Fibers become a core
 part of Phobos / the shared core?
They'll have to, since thread support is a part of the runtime library. Sean
Oct 29 2007
parent reply Brad Roberts <braddr puremagic.com> writes:
Sean Kelly wrote:
 downs wrote:
 So I guess all I can hope for is that merging the Phobos and Tango base
 code happen fast. Apropos, when that happens, will Fibers become a core
 part of Phobos / the shared core?
They'll have to, since thread support is a part of the runtime library. Sean
They will be, but not in 1.x, quite probably. Later, Brad
Oct 29 2007
parent Sean Kelly <sean f4.ca> writes:
Brad Roberts wrote:
 Sean Kelly wrote:
 downs wrote:
 So I guess all I can hope for is that merging the Phobos and Tango base
 code happen fast. Apropos, when that happens, will Fibers become a core
 part of Phobos / the shared core?
They'll have to, since thread support is a part of the runtime library.
They will be, but not in 1.x, quite probably.
Oops. Thanks for clarifying that. Yes, Tango/Phobos compatibility will likely be handled differently for 1.0. Sean
Oct 29 2007