## digitalmars.D.announce - Deformable registration in D

- Robert Jacques (20/20) Jul 31 2007 I recently presented at the 49th Annual Meeting of the American
- Don Clugston (17/39) Aug 01 2007 Awesome!
- Robert Jacques (12/31) Aug 02 2007 Actually, this isn't too unexpected. MSVC7 uses the SSE2 optimized...
- Saaa (4/12) Aug 02 2007 Exp with finite support, how does this go?
- Daniel Keep (11/17) Aug 02 2007 I can't think of a formal definition of numerical stability, but it
- Saaa (2/12) Aug 03 2007 Would this mean it accepts 1/3 as input or just floating point types?
- Robert Jacques (1/2) Aug 03 2007 Floating point types.
- Robert Jacques (8/9) Aug 02 2007 The support of a function is basically the range in which its non-zero. ...
- Don Clugston (5/15) Aug 03 2007 What cutoff are you using? Denormals are only generated for x around -11...
- Robert Jacques (5/9) Aug 03 2007 I first started with -5 and and used -3 in the release, as the method is...
- Saaa (6/15) Aug 03 2007 Thanks for your reply.
- Dave (2/24) Aug 03 2007
- Robert Jacques (5/6) Aug 03 2007 My plane got delayed, so I missed a portion of the poster session, but I...

I recently presented at the 49th Annual Meeting of the American Association of Physicists in Medicine an application of a deformable registration algorithm during the general poster discussion session. Of course, the reason I'm posting this here is that I wrote it in D. Or more precisely, I refactored it to D from C++, with some surprising results.

First, the main algorithm dropped to half the lines of code, and I was able to completely remove some supporting routines. Second, the numerical stability of the algorithm increased, helping reduce a major fundamental issue/flaw in the algorithm (ah, the joys of a continuous algorithm working on discrete data). And lastly, it ran three to four times slower. The culprit turned out to be the exponential function, which is the crux of the algorithm and also, I assume, what was responsible for the increased stability. I switched to using an exponential with finite support and regained the speed while keeping the stability.

For reference, I was using Microsoft Visual Studio 2003 and D1.x/Phobos on an Intel/XP box. The code is now open source and available at: http://dailabs.duhs.duke.edu/imagequality.html And the associated abstract is published in Medical Physics: http://www.medphys.org/

P.S. Thank you Walter and the community for this great language.

Jul 31 2007

Robert Jacques wrote:
> I recently presented at the 49th Annual Meeting of the American Association of Physicists in Medicine an application of a deformable registration algorithm [...] And lastly, it ran three to four times slower. The culprit turned out to be the exponential function [...] I switched to using an exponential with finite support and regained the speed while keeping the stability. For reference, I was using Microsoft Visual Studio 2003 and D1.x/Phobos on an Intel/XP box.

Awesome! A few comments:

(1) D's exp() shouldn't be too much slower than C's. Certainly not a factor of four (unless you were creating a lot of denormalised numbers). So your result is interesting. Perhaps there is a performance bug in exp().

(2) DMD does essentially no floating-point optimisation. The fact that it can compete directly with Visual C++ is impressive, since there is enormous potential for improvement (a factor of 2 at the very least).

(3) The increased stability could also be due to the use of 80-bit reals instead of 64 bits. You could easily test this by putting

    import std.c.fenv;
    fesetprec(FE_DBLPREC);

into main(), and seeing if the stability disappears.

(4) My experience has been that D is a superb language for developing floating-point algorithms. It's great to see further confirmation of this.

I presume that you were using DMD, not GDC. If not, it would explain (1) and (2).
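The precision experiment Don suggests in (3) amounts to a two-line change. A minimal sketch of it (the body of main() stands in for the actual registration code, which is not shown here):

```d
import std.c.fenv;

void main()
{
    // Round all x87 intermediate results to 64-bit double precision,
    // so D's default 80-bit reals can no longer mask rounding problems.
    fesetprec(FE_DBLPREC);

    // ... run the registration algorithm here and compare stability ...
}
```

If the stability problems reappear with this in place, the earlier improvement came from the 80-bit intermediates rather than from the algorithm itself.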

Aug 01 2007

On Wed, 01 Aug 2007 07:40:05 -0400, Don Clugston <dac nospam.com.au> wrote:

> (1) D's exp() shouldn't be too much slower than C's. Certainly not a factor of four. (Unless you were creating a lot of denormalised numbers). So your result is interesting. Perhaps there is a performance bug in exp().
> (2) DMD does essentially no floating-point optimisation. The fact that it can compete directly with Visual C++ is impressive, since there is enormous potential for improvement (a factor of 2 at the very least).

Actually, this isn't too unexpected. MSVC7 uses the SSE2-optimized exp function, which performed similarly to simple addition, as other bottlenecks, like memory bandwidth, seemed to come into play. And both D and C should read and write arrays at about the same speed.

> (3) The increased stability could also be due to the use of 80-bit reals instead of 64 bits. You could easily test this by putting import std.c.fenv; fesetprec(FE_DBLPREC); into main(), and seeing if the stability disappears.

Actually, while I did some tests using reals, I ended up using floats in the end to cut down on memory traffic, which was the main bottleneck at the time, and this improved the algorithm's fundamental stability.

> (4) My experience has been that D is a superb language for developing floating-point algorithms. It's great to see further confirmation of this.

I agree.

> I presume that you were using DMD, not GDC. If not, it would explain (1) and (2).

And yes, I was using DMD. Thanks for the suggestions. I'm going to look into the exp function directly.

Aug 02 2007

> Second, the numerical stability of the algorithm increased, helping reduce a major fundamental issue/flaw in the algorithm (Ah, the joys of a continuous algorithm working on discrete data).

What does numerical stability actually mean? And... what is a continuous algorithm?

> And lastly, it ran three to four times slower. The culprit turned out to be the exponential function, which is the crux of the algorithm and also what I assume was responsible for the increased stability. I switched to using an exponential with finite support and regained the speed while keeping the stability.

Exp with finite support, how does this go?

It all sounds interesting, so I would like to understand it all :)

Aug 02 2007

Saaa wrote:
> What does numerical stability actually mean? And... what is a continuous algorithm?

I can't think of a formal definition of numerical stability, but it basically means that an algorithm produces the correct answer to within a reasonable margin of error. More at http://en.wikipedia.org/wiki/Numerical_stability

A continuous algorithm is one which operates over a continuous set of values. For instance, integers (1, 7, 42) are discrete, whilst real numbers (1.2, 7.85, 3.141...) are continuous. Continuous sets don't have any "gaps" between members, whilst discrete sets do.

Hope that helps :)

-- Daniel

Aug 02 2007

> I can't think of a formal definition of numerical stability, but it basically means that an algorithm produces the correct answer to within a reasonable margin of error. More at http://en.wikipedia.org/wiki/Numerical_stability

Thanks, I should have been able to find that one myself, sorry :)

> A continuous algorithm is one which operates over a continuous set of values. For instance, integers (1, 7, 42) are discrete, whilst real numbers (1.2, 7.85, 3.141...) are continuous. Continuous sets don't have any "gaps" between members, whilst discrete sets do.

Would this mean it accepts 1/3 as input, or just floating point types?

Aug 03 2007

> Would this mean it accepts 1/3 as input or just floating point types?

Floating point types.

Aug 03 2007

> Exp with finite support, how does this go?

The support of a function is basically the range in which it's non-zero. See http://en.wikipedia.org/wiki/Support_(mathematics) for more details. I'm specifically using it to calculate the Gaussian, which is essentially exp(-x) where x > 0 (x is a distance). So I have a few small x's which become relatively large exp(-x), and many large x's that become really tiny exp(-x). By approximating all those large x's (tiny exp(-x)) with zero (i.e. making the support finite), I don't have to calculate as many exp functions, which speeds up the computation.
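Robert's description suggests a simple wrapper: evaluate exp() only inside the finite support and return zero everywhere else. A minimal D sketch under that reading (the function name and parameters here are illustrative, not taken from the released code):

```d
import std.math;

/// Gaussian with finite support: beyond `cutoff` standard deviations
/// the value is treated as exactly zero, so the costly exp() call
/// (and any denormal results) are skipped entirely.
float gaussFinite(float x, float sigma, float cutoff)
{
    float t = x / sigma;
    if (t > cutoff || t < -cutoff)
        return 0.0f;               // outside the support: approximate as 0
    return exp(-0.5f * t * t);     // inside the support: the usual Gaussian
}
```

With a cutoff of 3 sigma, roughly 99.7% of the Gaussian's mass is retained, while every distant sample costs only a comparison instead of an exp().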

Aug 02 2007

Robert Jacques wrote:
> The support of a function is basically the range in which it's non-zero. [...] By approximating all those large x's (tiny exp(-x)) with zero (i.e. making the support finite) I don't have to calculate as many exp functions, which speeds up the computation.

What cutoff are you using? Denormals are only generated for x around -11300 for 80-bit reals, but for floats they'll happen for x between about -85 and -105. IIRC, denormals are ~100 times slower than normals on x87, so they could easily be performance killers.

Aug 03 2007

> What cutoff are you using? Denormals are only generated for x around -11300 for 80-bit reals, but for floats they'll happen for x between about -85 and -105. IIRC, denormals are ~100 times slower than normals on x87, so could easily be performance killers.

I first started with -5 and used -3 in the release, as the method is multi-resolution/sigma. Matlab, if I remember, used a sigma of 2.5 as its default cutoff for its Gaussian window. And yes, due to their slow speed, I included denormal detection in order to error early (infinity or NaN are invalid outputs and happen only when the algorithm becomes unstable).
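The error-early check Robert describes can be sketched with the classification functions in D1's std.math (the function and message strings below are illustrative, not from the released code):

```d
import std.math;

/// Fail fast on values that signal trouble: subnormals are ~100x
/// slower on x87, and infinity/NaN mean the algorithm went unstable.
void checkSample(float y)
{
    if (issubnormal(y))
        throw new Exception("denormal detected: performance/stability risk");
    if (isinfinity(y) || isnan(y))
        throw new Exception("invalid output: algorithm became unstable");
}
```

Catching these at the first bad sample keeps a silent slowdown (or a slow spread of NaNs) from wasting an entire registration run.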

Aug 03 2007

> The support of a function is basically the range in which it's non-zero. [...] By approximating all those large x's (tiny exp(-x)) with zero (i.e. making the support finite) I don't have to calculate as many exp functions, which speeds up the computation.

Thanks for your reply. Thus your gauss function will simply check something like:

    if (x > 18) return 0;

I see how this affects the speed ;) (if only ints are accepted as input I would just put all 18 of them in there :D)

Aug 03 2007

Robert Jacques wrote:
> I recently presented at the 49th Annual Meeting of the American Association of Physicists in Medicine an application of a deformable registration algorithm during the general poster discussion session. Of course, the reason I'm posting this here is that I wrote it in D. Or more precisely, I refactored it to D from C++, with some surprising results. [...]

What did the audience think of D?

Aug 03 2007

> What did the audience think of D?

My plane got delayed, so I missed a portion of the poster session, but I did get talking in depth with one person about it. He had heard of D before, but hadn't yet had a chance to try it, and generally seemed positive, or at least as much as anyone would be towards an unknown (to them) language.

Aug 03 2007