## digitalmars.D - Anything in the review queue?

- Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> Mar 20 2011
- dsimcha <dsimcha yahoo.com> Mar 20 2011
- Jesse Phillips <jessekphillips+D gmail.com> Mar 20 2011
- "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> Mar 20 2011
- Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> Mar 20 2011
- "Simen kjaeraas" <simen.kjaras gmail.com> Mar 20 2011
- "Robert Jacques" <sandford jhu.edu> Mar 20 2011
- Jonathan M Davis <jmdavisProg gmx.com> Mar 20 2011
- Jonas Drewsen <jdrewsen nospam.com> Mar 20 2011
- Jonathan M Davis <jmdavisProg gmx.com> Mar 20 2011
- dsimcha <dsimcha yahoo.com> Mar 20 2011
- bearophile <bearophileHUGS lycos.com> Mar 20 2011
- dsimcha <dsimcha yahoo.com> Mar 20 2011
- Don <nospam nospam.com> Mar 21 2011
- dsimcha <dsimcha yahoo.com> Mar 21 2011
- KennyTM~ <kennytm gmail.com> Mar 21 2011
- Don <nospam nospam.com> Mar 21 2011
- dsimcha <dsimcha yahoo.com> Mar 21 2011
- Don <nospam nospam.com> Mar 21 2011
- dsimcha <dsimcha yahoo.com> Mar 21 2011
- Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> Mar 21 2011
- dsimcha <dsimcha yahoo.com> Mar 21 2011
- Don <nospam nospam.com> Mar 22 2011
- bearophile <bearophileHUGS lycos.com> Mar 21 2011
- Don <nospam nospam.com> Mar 21 2011
- dsimcha <dsimcha yahoo.com> Mar 22 2011
- Don <nospam nospam.com> Mar 23 2011
- bearophile <bearophileHUGS lycos.com> Mar 23 2011
- Don <nospam nospam.com> Mar 23 2011
- Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> Mar 23 2011
- "Simen kjaeraas" <simen.kjaras gmail.com> Mar 21 2011
- Jacob Carlborg <doob me.com> Mar 21 2011

Since David kindly agreed to work some more on std.parallelism, there is now time for carrying one review cycle on another library. I recall there's work on: * std.goodxml (status?) * std.net (beyond mere libcurl bindings) * std.path (improvements to path) * A bunch of almost-finished projects that need some tending (rational numbers etc.) * anything else? Is there anything ready for a formal review? Thanks, Andrei

Mar 20 2011

On 3/20/2011 11:10 AM, Andrei Alexandrescu wrote:Since David kindly agreed to work some more on std.parallelism, there is now time for carrying one review cycle on another library. I recall there's work on: * std.goodxml (status?) * std.net (beyond mere libcurl bindings) * std.path (improvements to path) * A bunch of almost-finished projects that need some tending (rational numbers etc.) * anything else? Is there anything ready for a formal review? Thanks, Andrei

Definitely not rational. I just looked at that and it's going to need some work to improve the docs a little and remove some workarounds for BigInt bugs that have been fixed. It doesn't need a lot of work, but I'd rather focus on std.parallelism and not have rational on my plate at the same time.

Mar 20 2011

On 03/20/2011 10:49 AM, dsimcha wrote:On 3/20/2011 11:10 AM, Andrei Alexandrescu wrote:Since David kindly agreed to work some more on std.parallelism, there is now time for carrying one review cycle on another library. I recall there's work on: * std.goodxml (status?) * std.net (beyond mere libcurl bindings) * std.path (improvements to path) * A bunch of almost-finished projects that need some tending (rational numbers etc.) * anything else? Is there anything ready for a formal review? Thanks, Andrei

Definitely not rational. I just looked at that and it's going to need some work to improve the docs a little and remove some workarounds for BigInt bugs that have been fixed. It doesn't need a lot of work, but I'd rather focus on std.parallelism and not have rational on my plate at the same time.

Would you donate the code to someone to finalize? Andrei

Mar 20 2011

Andrei Alexandrescu Wrote:

I'm still looking into taking advantage of David's CSV design for my own parser. But it isn't ready for review and I don't have a planned schedule for it. And I don't think my slicing design will be the appropriate one for including into Phobos.

Mar 20 2011

On Sun, 20 Mar 2011 10:10:20 -0500, Andrei Alexandrescu wrote:

My std.path proposal is pretty much ready, but the deadline for my PhD thesis is in less than two weeks, and I have very little time for anything else right now. :( -Lars

Mar 20 2011

On 03/20/2011 11:50 AM, Lars T. Kyllingstad wrote:On Sun, 20 Mar 2011 10:10:20 -0500, Andrei Alexandrescu wrote:

My std.path proposal is pretty much ready, but the deadline for my PhD thesis is in less than two weeks, and I have very little time for anything else right now. :( -Lars

Good luck!!! Andrei

Mar 20 2011

On Sun, 20 Mar 2011 17:50:59 +0100, Lars T. Kyllingstad <public kyllingen.nospamnet> wrote:On Sun, 20 Mar 2011 10:10:20 -0500, Andrei Alexandrescu wrote:

My std.path proposal is pretty much ready, but the deadline for my PhD thesis is in less than two weeks, and I have very little time for anything else right now. :(

Good luck! -- Simen

Mar 20 2011

On Sun, 20 Mar 2011 11:10:20 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

I've been working on an update to std.variant/std.json and recently an patch to the put/isOutputRange/Appender triad, to fix a major scaling issue. Regarding status, I need to add trivial function to json, but otherwise am happy with it. I've been struggling with Variant's compile times, but have finally gotten them somewhat acceptable. I still have a few features I wish to add, and need to improve/update the documentation and unit tests. So nothing ready for formal review yet.

Mar 20 2011

Since David kindly agreed to work some more on std.parallelism, there is now time for carrying one review cycle on another library. I recall there's work on: * std.goodxml (status?) * std.net (beyond mere libcurl bindings) * std.path (improvements to path) * A bunch of almost-finished projects that need some tending (rational numbers etc.) * anything else? Is there anything ready for a formal review?

It would take a couple of days to be ready given what I've got going at the moment (I had assumed that std.parallelism and std.path would be ahead in line), but I have been intending to take the portions of assertPred which won't end up in an improved assert, split them into the appropriate functions, and put those up for review. And it likely wouldn't need to be a long review process either, since I suspect that it would primarily be take it or leave it for the most part (either you like a particular function or you don't) - at most a few tweaks might be necessary. We could choose to vote on functions individually (making it take it or leave it per function as opposed to the whole), but I doubt that there would be a lot of review necessary simply because the functions are fairly simple and versions of them have been looked at before. So, I could get that ready for review within a couple of days. - Jonathan M Davis

Mar 20 2011

On 20/03/11 16.10, Andrei Alexandrescu wrote:

I've committet the review fixes for the curl C declarations. I wonder if the pull request has been resent after I updated the commit to be pulled. Anyone got notified? Regarding the libcurl wrapper I'm working on the improvement requests I got from the curl RFC thread. /Jonas

Mar 20 2011

On 20/03/11 16.10, Andrei Alexandrescu wrote:

I've committet the review fixes for the curl C declarations. I wonder if the pull request has been resent after I updated the commit to be pulled. Anyone got notified?

Anyone participating in the pull request should get notified whenever an update to it is made - be it a comment or commit (the only time that everyone with commit access to Phobos gets notified is with the creation of the initial pull request). - Jonathan M Davis

Mar 20 2011

On 3/20/2011 11:10 AM, Andrei Alexandrescu wrote:

On second thought, given the difficulty finding anything else, rational may be the thing that's most ready. I'll offer it up for review now under the following caveats, since it's a relatively simple module that I doubt will have nearly as many issues as std.parallelism. This may also be good because it won't overwhelm the community, so I can continue getting feedback about the much more complex std.parallelism in parallel. Caveats: 1. If anyone with less other stuff on their plate has something they want reviewed (so far noone seems to) their stuff can take precedence. 2. If major changes are suggested, we may have to "git stash" rational in a couple weeks to un-stash std.parallelism. Code: https://github.com/dsimcha/Rational Docs: http://cis.jhu.edu/~dsimcha/d/phobos/std_rational.html

Mar 20 2011

dsimcha:On second thought, given the difficulty finding anything else, rational may be the thing that's most ready. I'll offer it up for review now

It's good to have rationals in Phobos, thank you. Is this GCD? There is already a gcd in Phobos. Is this efficient when numbers gets very large? CommonInteger!(I1,I2) gcf(I1, I2)(I1 num1, I2 num2); An alternative is to give the number of binary digits of precision for the mantissa (see std.math.feqrel): Rational!(Int) toRational(Int)(real floatNum, real epsilon = 1e-08); Bye, bearophile

Mar 20 2011

On 3/20/2011 10:26 PM, bearophile wrote:dsimcha:On second thought, given the difficulty finding anything else, rational may be the thing that's most ready. I'll offer it up for review now

It's good to have rationals in Phobos, thank you. Is this GCD? There is already a gcd in Phobos. Is this efficient when numbers gets very large? CommonInteger!(I1,I2) gcf(I1, I2)(I1 num1, I2 num2);

I knew someone was going to ask this, so I probably should have answered it pre-emptively. std.numeric.gcd doesn't work with BigInts. I'm considering making my implementation private and eventually fixing std.numeric rather than having duplicate public functionality.An alternative is to give the number of binary digits of precision for the mantissa (see std.math.feqrel): Rational!(Int) toRational(Int)(real floatNum, real epsilon = 1e-08);

That's worth considering. The only reason I did it the way I did is because the Maxima computer algebra system did it this way and, when in doubt, I generally do what Maxima does in this library. I also think absolute error is the better metric for this case. If you do: auto foo = toRational!BigInt(1e-200); Do we really want to return some ridiculously huge number (that possibly overflows in the case of fixed width integers) for the denominator in this case?

Mar 20 2011

dsimcha wrote:On 3/20/2011 10:26 PM, bearophile wrote:dsimcha:On second thought, given the difficulty finding anything else, rational may be the thing that's most ready. I'll offer it up for review now

It's good to have rationals in Phobos, thank you. Is this GCD? There is already a gcd in Phobos. Is this efficient when numbers gets very large? CommonInteger!(I1,I2) gcf(I1, I2)(I1 num1, I2 num2);

I knew someone was going to ask this, so I probably should have answered it pre-emptively. std.numeric.gcd doesn't work with BigInts. I'm considering making my implementation private and eventually fixing std.numeric rather than having duplicate public functionality.An alternative is to give the number of binary digits of precision for the mantissa (see std.math.feqrel): Rational!(Int) toRational(Int)(real floatNum, real epsilon = 1e-08);

That's worth considering. The only reason I did it the way I did is because the Maxima computer algebra system did it this way and, when in doubt, I generally do what Maxima does in this library. I also think absolute error is the better metric for this case. If you do: auto foo = toRational!BigInt(1e-200); Do we really want to return some ridiculously huge number (that possibly overflows in the case of fixed width integers) for the denominator in this case?

The original plan for BigInt was to have a BigRational type. Suppose that such a thing existed -- would it affect plans for this library? I'm not sure that it is realistic to have a single templated type that does both BigInt rationals, together with rationals based on built-in integral types. The algorithms are different in a few respects: (1) copying built-ins is practically zero cost; (2) operations on BigInts depend on the size of the number (whereas for builtins, all operations are O(1)); (3) BigInts cannot overflow. In particular, gcd algorithms for arbitrary precision types are quite different to gcd for built-ins. I guess the primary question is: If there were a BigRational type, would you still want rational? Ie, are there use cases for rational over built-in types (rather than it just being a workaround for a lack of a BigRational type)? If the answer to this question is yes, then I suspect that the semantics for rational over built-ins are so different to the semantics for BigRational, that the two could be relatively independent.

Mar 21 2011

On 3/21/2011 3:59 AM, Don wrote:The original plan for BigInt was to have a BigRational type. Suppose that such a thing existed -- would it affect plans for this library? I'm not sure that it is realistic to have a single templated type that does both BigInt rationals, together with rationals based on built-in integral types. The algorithms are different in a few respects: (1) copying built-ins is practically zero cost;

Doesn't BigInt use COW rather than eager copying?(2) operations on BigInts depend on the size of the number (whereas for builtins, all operations are O(1));

I don't understand how this is relevant.(3) BigInts cannot overflow. In particular, gcd algorithms for arbitrary precision types are quite different to gcd for built-ins.

Ok, I don't know much about how BigInts work under the hood. I used a fairly simple implementation of Euler's algorithm here. Is there something much more efficient for BigInts?I guess the primary question is: If there were a BigRational type, would you still want rational? Ie, are there use cases for rational over built-in types (rather than it just being a workaround for a lack of a BigRational type)?

My thoughts were that Rational would be **primarily** used with BigInt, but occasionally, if you really know what you're doing with regard to overflow, you could used fixed width as an optimization.If the answer to this question is yes, then I suspect that the semantics for rational over built-ins are so different to the semantics for BigRational, that the two could be relatively independent.

Mar 21 2011

On Mar 21, 11 21:54, dsimcha wrote:(3) BigInts cannot overflow. In particular, gcd algorithms for arbitrary precision types are quite different to gcd for built-ins.

Ok, I don't know much about how BigInts work under the hood. I used a fairly simple implementation of Euler's algorithm here. Is there something much more efficient for BigInts?

http://en.wikipedia.org/wiki/Lehmer%27s_GCD_algorithm

Mar 21 2011

dsimcha wrote:On 3/21/2011 3:59 AM, Don wrote:The original plan for BigInt was to have a BigRational type. Suppose that such a thing existed -- would it affect plans for this library? I'm not sure that it is realistic to have a single templated type that does both BigInt rationals, together with rationals based on built-in integral types. The algorithms are different in a few respects: (1) copying built-ins is practically zero cost;

Doesn't BigInt use COW rather than eager copying?

Yes, but it operations which create temporaries are still expensive.(2) operations on BigInts depend on the size of the number (whereas for builtins, all operations are O(1));

I don't understand how this is relevant.

With a built-in type, multiplication and addition are equally cheap - eg on x86, multiply is only about half as slow as an addition. For BigInt, addition is O(N) while multiplication is O(N^^2) decreasing to say O(N^^1.5) for large numbers. This means that you really want to avoid multiplication with BigInts. Interestingly, division of builtins is really slow (~20 * multiply), but division of a BigInt is only a few times slower than a multiply.(3) BigInts cannot overflow. In particular, gcd algorithms for arbitrary precision types are quite different to gcd for built-ins.

Ok, I don't know much about how BigInts work under the hood. I used a fairly simple implementation of Euler's algorithm here. Is there something much more efficient for BigInts?

Yes. Euler's algorithm performs very poorly for BigInts. In fact, you probably want to use the BinaryGCD algorithm even for builtins. There is a sub-quadratic algorithm for GCD, but it's very complicated. (It's not the one Kenny posted).I guess the primary question is: If there were a BigRational type, would you still want rational? Ie, are there use cases for rational over built-in types (rather than it just being a workaround for a lack of a BigRational type)?

My thoughts were that Rational would be **primarily** used with BigInt, but occasionally, if you really know what you're doing with regard to overflow, you could used fixed width as an optimization.

This isn't sounding so promising. I hate to be saying that.

Mar 21 2011

== Quote from Don (nospam nospam.com)'s articledsimcha wrote:On 3/21/2011 3:59 AM, Don wrote:The original plan for BigInt was to have a BigRational type. Suppose that such a thing existed -- would it affect plans for this library? I'm not sure that it is realistic to have a single templated type that does both BigInt rationals, together with rationals based on built-in integral types. The algorithms are different in a few respects: (1) copying built-ins is practically zero cost;

Doesn't BigInt use COW rather than eager copying?

(2) operations on BigInts depend on the size of the number (whereas for builtins, all operations are O(1));

I don't understand how this is relevant.

on x86, multiply is only about half as slow as an addition. For BigInt, addition is O(N) while multiplication is O(N^^2) decreasing to say O(N^^1.5) for large numbers. This means that you really want to avoid multiplication with BigInts. Interestingly, division of builtins is really slow (~20 * multiply), but division of a BigInt is only a few times slower than a multiply.

Ok, I don't know much about how BigInts work under the hood. I used a fairly simple implementation of Euler's algorithm here. Is there something much more efficient for BigInts?

probably want to use the BinaryGCD algorithm even for builtins. There is a sub-quadratic algorithm for GCD, but it's very complicated. (It's not the one Kenny posted).I guess the primary question is: If there were a BigRational type, would you still want rational? Ie, are there use cases for rational over built-in types (rather than it just being a workaround for a lack of a BigRational type)?

My thoughts were that Rational would be **primarily** used with BigInt, but occasionally, if you really know what you're doing with regard to overflow, you could used fixed width as an optimization.

I hate to be saying that.

That's ok. Rational was a very small side project for me. I didn't have tons of time invested in it or anything (probably an order of magnitude less than std.parallelism). I do think it's interesting to decouple the Rational representation from the underlying integer type. Whether it's practical and warrants inclusion in Phobos is a different question. As far as BinaryGCD, would that be efficiently implementable using only the public interface of BigInt? If not, it might make sense to change the public interface of BigInt to make it implementable. (Or it might not. I have no idea what's involved here. Please comment.)

Mar 21 2011

dsimcha wrote:== Quote from Don (nospam nospam.com)'s articledsimcha wrote:On 3/21/2011 3:59 AM, Don wrote:

on x86, multiply is only about half as slow as an addition. For BigInt, addition is O(N) while multiplication is O(N^^2) decreasing to say O(N^^1.5) for large numbers. This means that you really want to avoid multiplication with BigInts. Interestingly, division of builtins is really slow (~20 * multiply), but division of a BigInt is only a few times slower than a multiply.

fairly simple implementation of Euler's algorithm here. Is there something much more efficient for BigInts?

probably want to use the BinaryGCD algorithm even for builtins. There is a sub-quadratic algorithm for GCD, but it's very complicated. (It's not the one Kenny posted).

but occasionally, if you really know what you're doing with regard to overflow, you could used fixed width as an optimization.

I hate to be saying that.

That's ok. Rational was a very small side project for me. I didn't have tons of time invested in it or anything (probably an order of magnitude less than std.parallelism). I do think it's interesting to decouple the Rational representation from the underlying integer type. Whether it's practical and warrants inclusion in Phobos is a different question. As far as BinaryGCD, would that be efficiently implementable using only the public interface of BigInt? If not, it might make sense to change the public interface of BigInt to make it implementable. (Or it might not. I have no idea what's involved here. Please comment.)

ExtendedGCD should be a part of the BigInt internals. (The other missing low-level BigInt function is square root). These functions should take care of switching algorithms from the naive version to ones with better big-O performance but large overheads, when the length is long. I had planned to implement it, but then got distracted by fixing compiler bugs. I've been distracted for a long time <g>.

Mar 21 2011

== Quote from Don (nospam nospam.com)'s articleOk, I don't know much about how BigInts work under the hood. I used a fairly simple implementation of Euler's algorithm here. Is there something much more efficient for BigInts?

probably want to use the BinaryGCD algorithm even for builtins. There is a sub-quadratic algorithm for GCD, but it's very complicated. (It's not the one Kenny posted).

My thoughts were that Rational would be **primarily** used with BigInt, but occasionally, if you really know what you're doing with regard to overflow, you could used fixed width as an optimization.

I hate to be saying that.

I've thought about this some more. The following **may** be a good way to proceed. I'd like Don's input, since I don't know in detail what he was planning. 1. Review the design of my rational lib with a focus on whether sufficient infrastructure for creating specializations without introducing breaking changes is available. 2. If no issues not related to the lack of specialization for BigInt are raised, accept Rational into Phobos for now. 3. At his leisure, Don can write a specialization of Rational specifically for BigInt (or even just a specialization of gcd, if that's all that's needed). Once this is done, we include it as an optimization, but without changes to the interface. Every integer type except BigInt still uses the generic implementation. 4. Maybe we provide hooks for specializing certain performance-critical things like GCD, in case someone wants to use their own custom BigInt type with Rational without having to modify the code for Phobos. For example, a BigInt type might optionally define a gcd() method. Rational would look for this and use it if it exists, or use the generic fallback algorithm otherwise.

Mar 21 2011

On 3/21/11 12:18 PM, dsimcha wrote:== Quote from Don (nospam nospam.com)'s articleOk, I don't know much about how BigInts work under the hood. I used a fairly simple implementation of Euler's algorithm here. Is there something much more efficient for BigInts?

probably want to use the BinaryGCD algorithm even for builtins. There is a sub-quadratic algorithm for GCD, but it's very complicated. (It's not the one Kenny posted).

My thoughts were that Rational would be **primarily** used with BigInt, but occasionally, if you really know what you're doing with regard to overflow, you could used fixed width as an optimization.

I hate to be saying that.

I've thought about this some more. The following **may** be a good way to proceed. I'd like Don's input, since I don't know in detail what he was planning. 1. Review the design of my rational lib with a focus on whether sufficient infrastructure for creating specializations without introducing breaking changes is available. 2. If no issues not related to the lack of specialization for BigInt are raised, accept Rational into Phobos for now. 3. At his leisure, Don can write a specialization of Rational specifically for BigInt (or even just a specialization of gcd, if that's all that's needed). Once this is done, we include it as an optimization, but without changes to the interface. Every integer type except BigInt still uses the generic implementation. 4. Maybe we provide hooks for specializing certain performance-critical things like GCD, in case someone wants to use their own custom BigInt type with Rational without having to modify the code for Phobos. For example, a BigInt type might optionally define a gcd() method. Rational would look for this and use it if it exists, or use the generic fallback algorithm otherwise.

I think by and large failure to define rational for BigInt in a way that has many commonalities with rational for built-in integrals reflects a failure of BigInt. By its very design, BigInt is supposed to transparently replace integers. If this is largely the case, the design has succeeded. If the design of anything must be reworked essentially from scratch to work with BigInt instead of int et al, then the abstraction may be too porous to be useful. There are a few approaches we can take from here. One is to define certain traits that differentiate BigInt from other integrals (e.g. preferAdditionToMultiplication or whatnot), and then design Rational to use those traits. Another is of course to specialize the entire Rational on BigInt. Third would be to specialize certain core routines (gcd and friends) for BigInt and keep Rational agnostic. Anyhow, BigInt should be cheap to copy (i.e. use refcounting). If there are good reasons for which that's impossible, we should give up on the dream of enacting that assumption throughout Phobos. Don? Andrei

Mar 21 2011

On 3/21/2011 8:33 PM, Andrei Alexandrescu wrote:I think by and large failure to define rational for BigInt in a way that has many commonalities with rational for built-in integrals reflects a failure of BigInt. By its very design, BigInt is supposed to transparently replace integers. If this is largely the case, the design has succeeded. If the design of anything must be reworked essentially from scratch to work with BigInt instead of int et al, then the abstraction may be too porous to be useful. There are a few approaches we can take from here. One is to define certain traits that differentiate BigInt from other integrals (e.g. preferAdditionToMultiplication or whatnot), and then design Rational to use those traits. Another is of course to specialize the entire Rational on BigInt. Third would be to specialize certain core routines (gcd and friends) for BigInt and keep Rational agnostic. Anyhow, BigInt should be cheap to copy (i.e. use refcounting). If there are good reasons for which that's impossible, we should give up on the dream of enacting that assumption throughout Phobos. Don? Andrei

The biggest perf issue, though, seems to be Euler's algorithm instead of BinaryGCD. This is definitely going to get fixed eventually by me, since I've read up on BinaryGCD and it doesn't look hard to implement generically. I naively thought Euler's algorithm was pretty darn efficient when I wrote the first version, not knowing much about how BigInts work under the hood. I'm not sure if Don had something even more efficient than a good generic BinaryGCD implementation in mind, or if there are other areas where a templated Rational is a Bad Idea, though.

Mar 21 2011

Simen kjaeraas wrote:On Tue, 22 Mar 2011 02:08:39 +0100, dsimcha <dsimcha yahoo.com> wrote:The biggest perf issue, though, seems to be Euler's algorithm instead of BinaryGCD. This is definitely going to get fixed eventually by me, since I've read up on BinaryGCD and it doesn't look hard to implement generically. I naively thought Euler's algorithm was pretty darn efficient when I wrote the first version, not knowing much about how BigInts work under the hood. I'm not sure if Don had something even more efficient than a good generic BinaryGCD implementation in mind, or if there are other areas where a templated Rational is a Bad Idea, though.

The algorithmic complexity is the same for BinaryGCD as it is for normal GCD. What Don had in mind was sub-quadratic. Might have been something from this paper: http://www.lysator.liu.se/~nisse/archive/S0025-5718-07-02017-0.pdf Page 10 onwards claims subquadratic performance.

Yes, and there's been a couple of improved algorithmic tweaks since then. The best algorithms are described in "Modern Computer Arithmetic" which was published just a few months back. But this stuff doesn't really matter terribly much, as these algorithms are only beneficial for very large numbers. The most important thing, really, is to avoid Euclid's algorithm.

Mar 22 2011

Andrei:By its very design, BigInt is supposed to transparently replace integers.

Small missing parts: http://d.puremagic.com/issues/show_bug.cgi?id=5765 To allow them in switch too: http://d.puremagic.com/issues/show_bug.cgi?id=596 Bye, bearophile

Mar 21 2011

Andrei Alexandrescu wrote:On 3/21/11 12:18 PM, dsimcha wrote:== Quote from Don (nospam nospam.com)'s articleOk, I don't know much about how BigInts work under the hood. I used a fairly simple implementation of Euler's algorithm here. Is there something much more efficient for BigInts?

probably want to use the BinaryGCD algorithm even for builtins. There is a sub-quadratic algorithm for GCD, but it's very complicated. (It's not the one Kenny posted).

My thoughts were that Rational would be **primarily** used with BigInt, but occasionally, if you really know what you're doing with regard to overflow, you could used fixed width as an optimization.

I hate to be saying that.

I've thought about this some more. The following **may** be a good way to proceed. I'd like Don's input, since I don't know in detail what he was planning. 1. Review the design of my rational lib with a focus on whether sufficient infrastructure for creating specializations without introducing breaking changes is available. 2. If no issues not related to the lack of specialization for BigInt are raised, accept Rational into Phobos for now. 3. At his leisure, Don can write a specialization of Rational specifically for BigInt (or even just a specialization of gcd, if that's all that's needed). Once this is done, we include it as an optimization, but without changes to the interface. Every integer type except BigInt still uses the generic implementation. 4. Maybe we provide hooks for specializing certain performance-critical things like GCD, in case someone wants to use their own custom BigInt type with Rational without having to modify the code for Phobos. For example, a BigInt type might optionally define a gcd() method. Rational would look for this and use it if it exists, or use the generic fallback algorithm otherwise.

I think by and large failure to define rational for BigInt in a way that has many commonalities with rational for built-in integrals reflects a failure of BigInt. By its very design, BigInt is supposed to transparently replace integers.If this is largely the case, the design has succeeded. If the design of anything must be reworked essentially from scratch to work with BigInt instead of int et al, then the abstraction may be too porous to be useful. There are a few approaches we can take from here. One is to define certain traits that differentiate BigInt from other integrals (e.g. preferAdditionToMultiplication or whatnot), and then design Rational to use those traits. Another is of course to specialize the entire Rational on BigInt. Third would be to specialize certain core routines (gcd and friends) for BigInt and keep Rational agnostic.

You're missing something important here. Rational big integers are a very well defined area of research, with its own set of algorithms. Although you can pretend a BigInt behaves as an int, and make a rational type, the performance will be pathetic, and everyone will laugh at you.Anyhow, BigInt should be cheap to copy (i.e. use refcounting). If there are good reasons for which that's impossible, we should give up on the dream of enacting that assumption throughout Phobos. Don?

Well, it uses copy-on-write so copying itself isn't expensive. But temporaries are never going to be cheap. Things like bigint *= bigint is NEVER an in-place operation (the size is twice as big, so a memory allocation is always required). Even bigint += bigint may trigger a memory allocation. Refcounting might be better than copy-on-write, in the end, but probably not a huge win. One thing which I will do eventually is implement the small-integer optimisation (so anything smaller than 31 bits/63 bits is stored inside the pointer and doesn't have any memory allocation at all). But that doesn't help the general case.

Mar 21 2011

On 3/22/2011 1:25 AM, Don wrote:There are a few approaches we can take from here. One is to define certain traits that differentiate BigInt from other integrals (e.g. preferAdditionToMultiplication or whatnot), and then design Rational to use those traits. Another is of course to specialize the entire Rational on BigInt. Third would be to specialize certain core routines (gcd and friends) for BigInt and keep Rational agnostic.

You're missing something important here. Rational big integers are a very well defined area of research, with its own set of algorithms. Although you can pretend a BigInt behaves as an int, and make a rational type, the performance will be pathetic, and everyone will laugh at you.

I fully understand this based on your previous posts. I therefore agree that Rational doesn't belong in Phobos exactly as-is. I'm trying to understand which of the following scenarios is true, though: Scenario A: Making BigInt efficient would require changes only in a few small places, like GCD. If we provide hooks for arbitrary precision types to specialize these, and provide specializations for std.bigint.BigInt, all will be well. Scenario B: Making BigInt efficient would have ripple effects all over the implementation, require rethinking of every operation, etc., but in a way that wouldn't bleed out into the interface. It might make sense to get the interface and a generic implementation right first, then specialize it to improve performance later. Scenario C: Making BigInt efficient would have ripple effects for both the interface and the implementation. In this case, we probably shouldn't provide a generic implementation because the arbitrary precision one is much more generally useful and the fixed width one would mostly add clutter. BTW, does BigInt over-allocate initially to allow certain operations (like +=) to be done in-place more frequently?

Mar 22 2011

dsimcha wrote:On 3/22/2011 1:25 AM, Don wrote:There are a few approaches we can take from here. One is to define certain traits that differentiate BigInt from other integrals (e.g. preferAdditionToMultiplication or whatnot), and then design Rational to use those traits. Another is of course to specialize the entire Rational on BigInt. Third would be to specialize certain core routines (gcd and friends) for BigInt and keep Rational agnostic.

You're missing something important here. Rational big integers are a very well defined area of research, with its own set of algorithms. Although you can pretend a BigInt behaves as an int, and make a rational type, the performance will be pathetic, and everyone will laugh at you.

I fully understand this based on your previous posts. I therefore agree that Rational doesn't belong in Phobos exactly as-is. I'm trying to understand which of the following scenarios is true, though: Scenario A: Making BigInt efficient would require changes only in a few small places, like GCD. If we provide hooks for arbitrary precision types to specialize these, and provide specializations for std.bigint.BigInt, all will be well. Scenario B: Making BigInt efficient would have ripple effects all over the implementation, require rethinking of every operation, etc., but in a way that wouldn't bleed out into the interface. It might make sense to get the interface and a generic implementation right first, then specialize it to improve performance later. Scenario C: Making BigInt efficient would have ripple effects for both the interface and the implementation. In this case, we probably shouldn't provide a generic implementation because the arbitrary precision one is much more generally useful and the fixed width one would mostly add clutter.

From the BigInt side, the only real change is the addition of gcd and gcdExtended. (gcdExtended returns x, y such that a*x + b*y = gcd(a, b) ). I believe that is the only low-level algorithm which is missing; pretty much everything else remains unchanged. However, the other case which is interesting is when BigInt is replaced with FixedInt!n (maybe someone can come up with a better name for this?) -- an integer with a length of a fixed number of ints. Unlike BigInt, this has basically the same semantics as built-in integer types. In fact, FixedInt!1 == int, FixedInt!2 == long, FixedInt!4 == cent. This is possibly even more relevant for Rational. I haven't thought much about the implications though.BTW, does BigInt over-allocate initially to allow certain operations (like +=) to be done in-place more frequently?

No, it doesn't. As long as it uses copy-on-write, there's no benefit to doing so. Reference counting would clearly be superior for FixedInt, but I'm not at all sure that it would be a win for BigInt. But of course FixedInt wouldn't need to over-allocate. Using TempAlloc would have a far greater effect on performance.

Mar 23 2011

Don:However, the other case which is interesting is when BigInt is replaced with FixedInt!n (maybe someone can come up with a better name for this?) -- an integer with a length of a fixed number of ints. Unlike BigInt, this has basically the same semantics as built-in integer types. In fact, FixedInt!1 == int, FixedInt!2 == long, FixedInt!4 == cent. This is possibly even more relevant for Rational. I haven't thought much about the implications though.BTW, does BigInt over-allocate initially to allow certain operations (like +=) to be done in-place more frequently?

No, it doesn't. As long as it uses copy-on-write, there's no benefit to doing so. Reference counting would clearly be superior for FixedInt, but I'm not at all sure that it would be a win for BigInt. But of course FixedInt wouldn't need to over-allocate.

Are FixedInt fully allocated on the stack? Bye, bearophile

Mar 23 2011

bearophile wrote:Don:However, the other case which is interesting is when BigInt is replaced with FixedInt!n (maybe someone can come up with a better name for this?) -- an integer with a length of a fixed number of ints. Unlike BigInt, this has basically the same semantics as built-in integer types. In fact, FixedInt!1 == int, FixedInt!2 == long, FixedInt!4 == cent. This is possibly even more relevant for Rational. I haven't thought much about the implications though.BTW, does BigInt over-allocate initially to allow certain operations (like +=) to be done in-place more frequently?

doing so. Reference counting would clearly be superior for FixedInt, but I'm not at all sure that it would be a win for BigInt. But of course FixedInt wouldn't need to over-allocate.

Are FixedInt fully allocated on the stack?

Note that they don't currently exist (though I heard someone is working on it) but the implementation I would envisage is: If size <= long.sizeof, alias to built-in type. If size < threshold (maybe 2*cent.sizeof), allocate on the stack. Otherwise, implement as a pointer to reference counted heap memory.

Mar 23 2011

On 3/21/11 10:25 PM, Don wrote:Andrei Alexandrescu wrote:I think by and large failure to define rational for BigInt in a way that has many commonalities with rational for built-in integrals reflects a failure of BigInt. By its very design, BigInt is supposed to transparently replace integers.If this is largely the case, the design has succeeded. If the design of anything must be reworked essentially from scratch to work with BigInt instead of int et al, then the abstraction may be too porous to be useful. There are a few approaches we can take from here. One is to define certain traits that differentiate BigInt from other integrals (e.g. preferAdditionToMultiplication or whatnot), and then design Rational to use those traits. Another is of course to specialize the entire Rational on BigInt. Third would be to specialize certain core routines (gcd and friends) for BigInt and keep Rational agnostic.

You're missing something important here. Rational big integers are a very well defined area of research, with its own set of algorithms. Although you can pretend a BigInt behaves as an int, and make a rational type, the performance will be pathetic, and everyone will laugh at you.

Interesting. Thanks for the insight (missed this post for a while). Andrei

Mar 23 2011

On Tue, 22 Mar 2011 02:08:39 +0100, dsimcha <dsimcha yahoo.com> wrote:The biggest perf issue, though, seems to be Euler's algorithm instead of BinaryGCD. This is definitely going to get fixed eventually by me, since I've read up on BinaryGCD and it doesn't look hard to implement generically. I naively thought Euler's algorithm was pretty darn efficient when I wrote the first version, not knowing much about how BigInts work under the hood. I'm not sure if Don had something even more efficient than a good generic BinaryGCD implementation in mind, or if there are other areas where a templated Rational is a Bad Idea, though.

The algorithmic complexity is the same for BinaryGCD as it is for normal GCD. What Don had in mind was sub-quadratic. Might have been something from this paper: http://www.lysator.liu.se/~nisse/archive/S0025-5718-07-02017-0.pdf Page 10 onwards claims subquadratic performance. -- Simen

Mar 21 2011

On 2011-03-20 16:10, Andrei Alexandrescu wrote:

I have the std.net.isemail module almost ready for review. Just need to generate the documentation, make sure it looks good and make it available online. -- /Jacob Carlborg

Mar 21 2011

On 3/21/11 3:15 AM, Jacob Carlborg wrote:On 2011-03-20 16:10, Andrei Alexandrescu wrote:

I have the std.net.isemail module almost ready for review. Just need to generate the documentation, make sure it looks good and make it available online.

Let's make this next. Please finalize, find a manager, and submit. Good luck! Andrei

Mar 21 2011