
digitalmars.D.announce - Decimal string to floating point conversion with correct half-to-even

reply 9il <ilyayaroshenko gmail.com> writes:
Hey everyone,

So excited to finally announce we can correctly parse 
floating-point numbers according to IEEE round half-to-even 
(bankers) rule like in C/C++, Rust, and others.
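
To make the half-to-even rule concrete, here is a minimal sketch in Python rather than D. It relies on the assumption that the interpreter's built-in `float()` is itself correctly rounded (true for mainstream CPython builds on IEEE 754 hardware); it does not use or represent the Mir API. The two inputs are decimal integers that sit exactly halfway between two adjacent doubles, so the parser has to break the tie by picking the neighbour with the even significand.

```python
import math

def significand_is_even(x: float) -> bool:
    """True if the 53-bit integer significand of a normal double is even."""
    mantissa, _ = math.frexp(x)  # x == mantissa * 2**e with 0.5 <= mantissa < 1
    return int(mantissa * 2**53) % 2 == 0

# 2**53 + 1 is exactly halfway between the doubles 2**53 and 2**53 + 2.
TIE_LOW = "9007199254740993"
# 2**53 + 3 is exactly halfway between the doubles 2**53 + 2 and 2**53 + 4.
TIE_HIGH = "9007199254740995"

parsed_low = float(TIE_LOW)    # tie broken downwards, to 2**53
parsed_high = float(TIE_HIGH)  # tie broken upwards, to 2**53 + 4

print(int(parsed_low), significand_is_even(parsed_low))
print(int(parsed_high), significand_is_even(parsed_high))
```

A parser that always rounds ties up (or always truncates) would get one of these two inputs wrong, which makes them a handy smoke test.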

A `@nogc`, optionally `nothrow` API is provided as part of Mir 
Algorithm v3.9.0 [0].
The documentation is available [1].

In case you are surprised: neither the D compilers parse decimal 
FP literals correctly [2, 3, 4], nor does Phobos parse decimal FP 
strings correctly [6].

Mir decimal parsing supports up to quadruple precision.

The conversion error is 0 ULP for normal numbers.

Subnormal numbers with a decimal exponent greater than or equal 
to -512 have an upper error bound of 1 ULP. A zero error 
bound for subnormal numbers can be supported in the future when 
Mir Ion, the ASDF successor, is ready.
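
For readers who want to check such error bounds themselves, one possible way to count the distance between two doubles in ULPs is sketched below in Python. The helper names `float_ordinal` and `ulp_distance` are made up for this illustration and are not part of Mir; the sketch assumes IEEE 754 binary64 and finite inputs.

```python
import struct

def float_ordinal(x: float) -> int:
    """Map a finite double to an integer so that adjacent doubles differ by 1."""
    (bits,) = struct.unpack("<q", struct.pack("<d", x))
    # Doubles are stored as sign-magnitude; mirror the negative half so that
    # the mapping is monotonic and -0.0 and +0.0 both map to 0.
    return bits if bits >= 0 else -(bits & 0x7FFF_FFFF_FFFF_FFFF)

def ulp_distance(a: float, b: float) -> int:
    """Number of representable doubles one has to step over to get from a to b."""
    return abs(float_ordinal(a) - float_ordinal(b))

# A result that matches the reference exactly is 0 ULP away from it.
print(ulp_distance(float("0.1"), 0.1))
# The smallest subnormal is one step away from zero.
print(ulp_distance(0.0, 5e-324))
```

Comparing a parser's output against a trusted reference with `ulp_distance` gives 0 for a correctly rounded result and 1 for an off-by-one-ULP result, which is the kind of bound quoted above.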

The implementation is based on the paper [7].

The error bounds above are valid for LDC. DMD may have slightly 
larger errors because of incorrect code generation for 
ulong-to-double conversion [5].

This work has been sponsored by Symmetry Investments and Kaleidic 
Associates.

Best regards,
Ilya

[0] https://github.com/libmir/mir-algorithm

[2] https://issues.dlang.org/show_bug.cgi?id=20951
[3] https://issues.dlang.org/show_bug.cgi?id=20952
[4] https://issues.dlang.org/show_bug.cgi?id=20953
[5] https://issues.dlang.org/show_bug.cgi?id=20963
[6] https://issues.dlang.org/show_bug.cgi?id=20967
[7] https://www.researchgate.net/publication/2295884_How_to_Read_Floating_Point_Numbers_Accurately
Jun 21
next sibling parent reply Dukc <ajieskola gmail.com> writes:
On Sunday, 21 June 2020 at 15:24:14 UTC, 9il wrote:
 Hey everyone,

 So excited to finally announce we can correctly parse 
 floating-point numbers according to IEEE round half-to-even 
 (bankers) rule like in C/C++, Rust, and others.
Finally a worthy alternative to Vladimir Panteleev's parser [1]. A few months back I went looking for a `nothrow` parser that can handle errors reliably, and I was surprised that I could find only one that was better than my simple custom-made one. Can mir_parse handle other bases than decimal? [1] https://github.com/CyberShadow/ae/blob/master/utils/text/parsefp.d
Jun 22
parent reply 9il <ilyayaroshenko gmail.com> writes:
On Monday, 22 June 2020 at 10:53:02 UTC, Dukc wrote:
 On Sunday, 21 June 2020 at 15:24:14 UTC, 9il wrote:
 Can mir_parse handle other bases than decimal?
No, only the decimal base is supported for now. Support for hexadecimal FP/integer parsing can be added though. The basic stuff for correct FP hexadecimal parsing is done: we can convert a big integer view to an FP number with half-to-even rounding. So the algorithm would look like:
1. Parse hexadecimal big integer
2. Parse exponent
3. Cast big integer to `Fp` with a specific number of meaningful bits (it's already implemented)
4. Add exponent to `Fp`'s exponent, and cast the result to a hardware floating point type.
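
The four steps above can be sketched roughly as follows. This is an illustration in Python, not Mir code: the function name `parse_hex_float`, the accepted input shape (`0x<hex>[.<hex>][p<exp>]`, unsigned) and the use of `math.ldexp` for the final step are all assumptions made for the example, and only results in the normal double range are handled.

```python
import math

def parse_hex_float(text: str, bits: int = 53) -> float:
    """Parse an unsigned hex float such as '0x1.8p1' with half-to-even rounding."""
    mantissa_text, _, exp_text = text.lower().partition("p")
    if mantissa_text.startswith("0x"):
        mantissa_text = mantissa_text[2:]
    int_part, _, frac_part = mantissa_text.partition(".")

    # Step 1: parse all hex digits as one big integer.
    big = int(int_part + frac_part, 16)
    # Step 2: parse the binary exponent; each fractional hex digit is worth 4 bits.
    exp2 = int(exp_text or "0") - 4 * len(frac_part)
    if big == 0:
        return 0.0

    # Step 3: keep only `bits` significant bits, rounding half-to-even.
    extra = big.bit_length() - bits
    if extra > 0:
        q, r = divmod(big, 1 << extra)
        half = 1 << (extra - 1)
        if r > half or (r == half and q & 1):
            q += 1
        big, exp2 = q, exp2 + extra

    # Step 4: add the exponent and convert to a hardware double.
    return math.ldexp(big, exp2)

print(parse_hex_float("0x1.8p1"))
```

Because a hexadecimal significand is already binary, no power-of-ten scaling is needed; the only place where information can be lost is the rounding in step 3.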
Jun 22
parent reply 9il <ilyayaroshenko gmail.com> writes:
On Monday, 22 June 2020 at 12:04:13 UTC, 9il wrote:
 On Monday, 22 June 2020 at 10:53:02 UTC, Dukc wrote:
 On Sunday, 21 June 2020 at 15:24:14 UTC, 9il wrote:
 Can mir_parse handle other bases than decimal?
No, only the decimal basis is supported for now. Support for hexadecimal FP/integer parsing can be added though. The basic stuff for correct FP hexadecimal parsing is done: we can convert a big integer view to FP number with half-to-even rounding. So the algorithm would look like: 1. Parse hexadecimal big integer 2. Parse exponent 3. Cast big integer to `Fp` with a specific number of meaningful bits (its already implemented) 4. Add exponent to `Fp`'s exponent, and cast the result to a hardware floating point type.
My bad, the hexadecimal parsing is already implemented for big integers! So, each part of the algorithm above is implemented. Maybe we need to rework fromHexStringImpl to make it return a boolean value.
Jun 22
parent Dukc <ajieskola gmail.com> writes:
On Monday, 22 June 2020 at 12:07:26 UTC, 9il wrote:
 On Monday, 22 June 2020 at 12:04:13 UTC, 9il wrote:
 So the algorithm would look like:
 1. Parse hexadecimal big integer
 2. Parse exponent
 3. Cast big integer to `Fp` with a specific number of 
 meaningful bits (its already implemented)
 4. Add exponent to `Fp`'s exponent, and cast the result to a 
 hardware floating point type.
My bad, the hexadecimal parsing is already implemented for big integers!
So only a bit left to go. Great!
 So, each part of the algorithm above is implemented. Maybe we 
 need to rework fromHexStringImpl to make it return a boolean 
 value.
Good idea. It should pay off when one wants to parse a large number of strings that are likely to contain a lot of non-integers.
Jun 22
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/21/2020 8:24 AM, 9il wrote:
 So excited to finally announce we can correctly parse floating-point numbers 
 according to IEEE round half-to-even (bankers) rule like in C/C++, Rust, and 
 others.
Great work! Would you like to add it to dmd?
Jul 04
next sibling parent reply 9il <ilyayaroshenko gmail.com> writes:
On Saturday, 4 July 2020 at 20:35:48 UTC, Walter Bright wrote:
 On 6/21/2020 8:24 AM, 9il wrote:
 So excited to finally announce we can correctly parse 
 floating-point numbers according to IEEE round half-to-even 
 (bankers) rule like in C/C++, Rust, and others.
Great work! Would you like to add it to dmd?
Thank you! Yes. It would be very much appreciated to preserve the `mir.` namespace for the `parse` module and the required `bignum` package*. Also, wherever the code is located, I would like to have control over it. Will you agree?
* - `mir.bignum` is 6K LOC and it is expected to grow up to 20K LOC if finished. The package includes abstract views for big integers, decimal, and binary FP numbers; stack-allocated big integers; midsize unsigned integers; software FP numbers with extended precision.
Jul 04
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/4/2020 8:09 PM, 9il wrote:
 On Saturday, 4 July 2020 at 20:35:48 UTC, Walter Bright wrote:
 On 6/21/2020 8:24 AM, 9il wrote:
 So excited to finally announce we can correctly parse floating-point numbers 
 according to IEEE round half-to-even (bankers) rule like in C/C++, Rust, and 
 others.
Great work! Would you like to add it to dmd?
Thank you! Yes. It would be very much appreciated to preserve the `mir.` namespace for the `parse` module, the required `bignum` package*. Also, whenever the code will be located I would like to have control over it. Will you agree? * - `mir.bignum` is 6K LOC and it is expected to grow up to 20K LOC if finished. The package includes abstract views for big integers, decimal, and binary FP numbers; stack-allocated big integers; midsize unsigned integers; software FP numbers with extended precision.
Does the float parsing code require bignum? I'm also not sure I know what you mean by control. Contributions to dmd would need to be Boost Licensed, which means anyone can do what they like with them.
Jul 04
parent reply 9il <ilyayaroshenko gmail.com> writes:
On Sunday, 5 July 2020 at 06:23:35 UTC, Walter Bright wrote:
 On 7/4/2020 8:09 PM, 9il wrote:
 On Saturday, 4 July 2020 at 20:35:48 UTC, Walter Bright wrote:
 On 6/21/2020 8:24 AM, 9il wrote:
 So excited to finally announce we can correctly parse 
 floating-point numbers according to IEEE round half-to-even 
 (bankers) rule like in C/C++, Rust, and others.
Great work! Would you like to add it to dmd?
Thank you! Yes. It would be very much appreciated to preserve the `mir.` namespace for the `parse` module, the required `bignum` package*. Also, whenever the code will be located I would like to have control over it. Will you agree? * - `mir.bignum` is 6K LOC and it is expected to grow up to 20K LOC if finished. The package includes abstract views for big integers, decimal, and binary FP numbers; stack-allocated big integers; midsize unsigned integers; software FP numbers with extended precision.
Does the float parsing code require bignum?
Yes. The decimal float parsing requires big integer arithmetic and software floating-point multiplication with extended precision (128-bit mantissa).
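
To illustrate why big integers come into play, here is a deliberately naive, exact conversion sketched in Python. It is not the algorithm Mir uses (that one follows the paper cited in the announcement and adds an extended-precision fast path); it only shows that deciding the last bit correctly can require exact arithmetic on integers far wider than 64 bits. The function name `parse_decimal_exact` is hypothetical, the input is assumed to be a non-negative digit string already converted to an integer plus a decimal exponent, and only results in the normal double range are handled.

```python
import math

def parse_decimal_exact(digits: int, exp10: int) -> float:
    """Correctly round digits * 10**exp10 to a double (normal range only)."""
    if digits == 0:
        return 0.0
    # Represent the value exactly as a ratio of two big integers.
    if exp10 >= 0:
        num, den = digits * 10**exp10, 1
    else:
        num, den = digits, 10**(-exp10)

    def divide(shift: int):
        """Return (quotient, remainder, divisor) of num * 2**shift / den."""
        n, d = (num << shift, den) if shift >= 0 else (num, den << -shift)
        q, r = divmod(n, d)
        return q, r, d

    # Pick the shift so that the quotient has exactly 53 significant bits.
    shift = 53 - (num.bit_length() - den.bit_length())
    q, r, d = divide(shift)
    if q.bit_length() > 53:  # the estimate can be one bit too generous
        shift -= 1
        q, r, d = divide(shift)

    # Round half-to-even using the exact remainder.
    if 2 * r > d or (2 * r == d and q & 1):
        q += 1
    return math.ldexp(q, -shift)

print(parse_decimal_exact(1, -1) == 0.1)
```

For an input like `17976931348623157e292` the numerator alone is over a thousand bits wide, which is why a fixed 64-bit or 128-bit intermediate is not enough in the general case.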
 I'm also not sure I know what you mean by control. 
 Contributions to dmd would need to be Boost Licensed, which 
 means anyone can do what they like with them.
The code is already Boost licensed. We need not only literal parsing but also library text parsing, so the code should be available both to users and to the compiler. I see two possible solutions that look good to me.

The first one is to add the mir-algorithm package, or part of it, as an external dependency for DMD. It is preferable and either way. If you will accept the PR, I will do it.

The second solution is to move `mir.bignum` and `mir.parse` to DRuntime/Phobos. In this case, I would like to preserve the `mir.` namespace and the same authority and veto right for this part of the codebase as I have at Mir Org. I mean the following. Your voice has a veto right for DMD and Dlang evaluation. Andrei has a veto right for Phobos. Atila seems to have almost the same veto right as Andrei and you, and blocks Dlang features required for Mir [1, 2]. Furthermore, if you and Andrei really want to add or change something, you will force it to happen. I want the same veto right for evaluation of the Mir parts in case you think they should be moved to DRuntime/Phobos. Also, the code under the `mir.` namespace should be less constrained than `core`/`std` code in terms of API changes.

[1] https://github.com/dlang/dmd/pull/9778
[2] https://github.com/dlang/DIPs/blob/master/DIPs/other/DIP1023.md
Jul 05
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/5/2020 12:24 AM, 9il wrote:
 On Sunday, 5 July 2020 at 06:23:35 UTC, Walter Bright wrote:
 On 7/4/2020 8:09 PM, 9il wrote:
 Does the float parsing code require bignum?
Yes. The decimal float parsing requires big integer arithmetic and software
Arbitrary precision or simply a fixed amount of more precision?
 floating-point multiplication with extended precision (128-bit mantissa).
Some of that is already done in dmd. Essentially, just enough to get this specific use to work.
 I'm also not sure I know what you mean by control. Contributions to dmd would 
 need to be Boost Licensed, which means anyone can do what they like with them.
The code is already Boost licensed. We need not only literals parsing but also library text parsing. So the code should be available for users and for the compiler. I see two possible solutions that look good to me. The first one is to add mir-algorithm package or its part as an external dependency for DMD. It is preferable and either way. If you will accept the PR, I will do it. The second solution is to move `mir.bignum` and `mir.parse` to DRuntime/Phobos. In this case, I would like to preserve the `mir.` namespace and the same authority and veto right for this part of the codebase as I have at Mir Org. I mean the following. Your voice has a veto right for DMD and Dlang evaluation. Andrei has a veto right for Phobos. Atila seems to have almost the same veto right as Andrei and you and blocks required Dlang features for Mir [1, 2]. Furthermore, if you and Andrei really want to add or change something you will force it to happen. I want the same veto right for evaluation of the Mir parts in case you think they should be moved to DRuntime/Phobos. Also, the code under `mir.` namespace should be less constrained then `core`/`std` code in terms of API changes. [1] https://github.com/dlang/dmd/pull/9778 [2] https://github.com/dlang/DIPs/blob/master/DIPs/other/DIP1023.md
It doesn't work quite like that. The D Language Foundation controls it. Andrei, Atila, and myself control it only as far as the DLF empowers us to, which can change. Official parts of the DMD distribution have to be controlled by the DLF. It's unworkable otherwise.
Jul 05
next sibling parent 9il <ilyayaroshenko gmail.com> writes:
On Sunday, 5 July 2020 at 08:15:53 UTC, Walter Bright wrote:
 On 7/5/2020 12:24 AM, 9il wrote:
 On Sunday, 5 July 2020 at 06:23:35 UTC, Walter Bright wrote:
 On 7/4/2020 8:09 PM, 9il wrote:
 Does the float parsing code require bignum?
Yes. The decimal float parsing requires big integer arithmetic and software
Arbitrary precision or simply a fixed amount of more precision?
Up to 2^^16384 - 1.
Jul 05
prev sibling parent reply 9il <ilyayaroshenko gmail.com> writes:
On Sunday, 5 July 2020 at 08:15:53 UTC, Walter Bright wrote:
 It doesn't work quite like that. The D Language Foundation 
 controls it. Andrei, Atila, and myself control it only as far 
 as we DLF empowers us to, which can change. Official parts of 
 the DMD distribution have to be controlled by the DLF. It's 
 unworkable otherwise.
If I remember correctly, some time ago DMD wasn't even Boost licensed. Also, DMD uses C libraries at least. I can't see why adding an open-source Boost-licensed dependency is unworkable then.
Jul 05
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/5/2020 1:56 AM, 9il wrote:
 On Sunday, 5 July 2020 at 08:15:53 UTC, Walter Bright wrote:
 It doesn't work quite like that. The D Language Foundation controls it. 
 Andrei, Atila, and myself control it only as far as we DLF empowers us to, 
 which can change. Official parts of the DMD distribution have to be controlled 
 by the DLF. It's unworkable otherwise.
If I remember correctly some time ago DMD hasn't been even Boost licensed. Also, DMD uses C libraries at least. I can't see why adding an open-source Boost licensed dependency is unworkable then.
All of DMD, Druntime, and Phobos use Boost, except for Curl and the zip library (which we probably shouldn't have added).
Jul 05
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/5/2020 3:35 AM, Walter Bright wrote:
 All of DMD, Druntime, and Phobos use Boost, except for Curl and the zip 
 library (which we probably shouldn't have added).
Also, there are no dependencies on Curl and zip. We don't distribute the C libraries, we use whatever is on the user's system.
Jul 05
parent reply 9il <ilyayaroshenko gmail.com> writes:
On Sunday, 5 July 2020 at 10:39:49 UTC, Walter Bright wrote:
 On 7/5/2020 3:35 AM, Walter Bright wrote:
 All of DMD, Druntime, and Phobos use Boost, except for Curl 
 and the zip library (which we probably shouldn't have added).
Also, there are no dependencies on Curl and zip. We don't distribute the C libraries, we use whatever is on the user's system.
DMD statically links the C standard library (and maybe something else). There is no risk for DMD and the DLF in depending on Mir's Boost-licensed library. If something happens with Mir, or Mir changes the license, the DLF will be able to fork the required code at any point in the Boost-licensed part of the git history.
Jul 05
parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On Sunday, 5 July 2020 at 11:07:55 UTC, 9il wrote:
 There is no risk for DMD and DFL to depend on a Mir's Boost 
 licensed library. If something happens with Mir or Mir change 
 the license, DFL will be able to fork the required code at any 
 point in the Boost licensed part of git history.
Can't speak for Walter or the D foundation here, but I'm not sure the concern is really about licensing. It's about putting in place a required dependency on code where maintenance decisions are outside the hands of the D Foundation.
Jul 05
next sibling parent reply IGotD- <nise nise.com> writes:
On Sunday, 5 July 2020 at 12:46:58 UTC, Joseph Rushton Wakeling 
wrote:
 Can't speak for Walter or the D foundation here, but I'm not 
 sure the concern is really about licensing.  It's about putting 
 in place a required dependency on code where maintenance 
 decisions are outside the hands of the D Foundation.
It's a resource question again. I'm all for the idea that D should, for example, have a native alternative to curl including SSL/TLS support. If someone is willing to invest the man hours into such a project, I'm all for it. Nim went that way with partial native HTTP support, so it isn't impossible.
Jul 05
parent Jacob Carlborg <doob me.com> writes:
On Sunday, 5 July 2020 at 14:29:22 UTC, IGotD- wrote:

 It's a resource question again. I'm all for that for example D 
 should have a native alternative to curl including SSL/TLS 
 support. If someone is willing to invest the man hours into 
 such project, I'm all for it. Nim went that way having partial 
 native http support so it isn't impossible.
I agree, that would be really nice. Unfortunately it's quite tricky if you want to integrate with the platform provided TLS implementation [1]. Unless you're suggesting to create a crypto library from scratch, which is a whole different beast, and probably a bad idea [2]. [1] https://forum.dlang.org/post/prsrbzoanxwytrtpyqgv forum.dlang.org [2] https://forum.dlang.org/post/hsmjcgxzujgwsiegikos forum.dlang.org -- /Jacob Carlborg
Jul 06
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/5/2020 5:46 AM, Joseph Rushton Wakeling wrote:
 On Sunday, 5 July 2020 at 11:07:55 UTC, 9il wrote:
 There is no risk for DMD and DFL to depend on a Mir's Boost licensed library. 
 If something happens with Mir or Mir change the license, DFL will be able to 
 fork the required code at any point in the Boost licensed part of git history.
Can't speak for Walter or the D foundation here, but I'm not sure the concern is really about licensing.  It's about putting in place a required dependency on code where maintenance decisions are outside the hands of the D Foundation.
That's right, it's not about the licensing. It's that the DLF should control the code it distributes. Businesses will not want to commit to a balkanized project. The proposal is for Mir to become a central required component of DMD and Phobos. This means it needs to become part of the D Language Foundation.
Jul 07
next sibling parent reply Guillaume Piolat <first.name gmail.com> writes:
On Tuesday, 7 July 2020 at 07:49:02 UTC, Walter Bright wrote:
 That's right, it's not about the licensing. It's that the DLF 
 should control the code it distributes.

 Businesses will not want to commit to a balkanized project.
From a business point of view, having slightly more correct string to float conversion holds very little value. I'll stick with sscanf thanks...
Jul 07
parent reply 9il <ilyayaroshenko gmail.com> writes:
On Tuesday, 7 July 2020 at 09:28:26 UTC, Guillaume Piolat wrote:
 On Tuesday, 7 July 2020 at 07:49:02 UTC, Walter Bright wrote:
 That's right, it's not about the licensing. It's that the DLF 
 should control the code it distributes.

 Businesses will not want to commit to a balkanized project.
From a business point of view, having slightly more correct string to float conversion holds very little value. I'll stick with sscanf thanks...
For real high-tech markets (aerospace, automotive, science, military-industrial complex), correct decimal literal parsing has a small but absolutely mandatory value. If SpaceX is landing a rocket, they want it to land on the platform; somewhere nearby wouldn't make sense. Note that these companies hold a huge amount of legacy C/C++ code and they are potential Dlang markets, but only if Dlang is able to match C exactly for numeric code. Otherwise merging C/C++ code would have a huge negative impact on them.
Jul 07
next sibling parent reply Guillaume Piolat <first.name gmail.com> writes:
On Tuesday, 7 July 2020 at 10:58:25 UTC, 9il wrote:
 From a business point of view, having slightly more correct 
 string to float conversion holds very little value. I'll stick 
 with sscanf thanks...
For a high tech real markets (airspace, automotive, science, military-industrial complex) having a correct decimal literal parsing...
Phobos is the stdlib of the language. Mir is not. Likewise, you've made the std.experimental.allocator on DUB depend on mir-core... stdlib shouldn't depend on non-stdlib, there isn't anything to debate on this point.
Jul 07
next sibling parent reply jmh530 <john.michael.hall gmail.com> writes:
On Tuesday, 7 July 2020 at 12:14:16 UTC, Guillaume Piolat wrote:
 [snip]

 Likewise, you've made the std.experimental.allocator on DUB 
 depends on mir-core...
Is that not an example of how Steve thinks it should work? Both are Boost licensed. mir-core has no external dependencies. Ilya could split the parse/bignum packages to a separate repo with no other dependencies and then public import them in mir-algorithm and just normally import them elsewhere.
 stdlib shouldn't depend on non-stdlib, there isn't anything to 
 debate on this point.
Walter's original comment was about adding it to DMD. He may have intended Phobos, but I read it as DMD.
Jul 07
parent reply jmh530 <john.michael.hall gmail.com> writes:
On Tuesday, 7 July 2020 at 12:33:40 UTC, jmh530 wrote:
 On Tuesday, 7 July 2020 at 12:14:16 UTC, Guillaume Piolat wrote:
 [snip]

 Likewise, you've made the std.experimental.allocator on DUB 
 depends on mir-core...
Is that not an example of how Steve thinks it should work? Both are Boost licensed. mir-core has no external dependencies. Ilya could split the parse/bignum packages to a separate repo with no other dependencies and then public import them in mir-algorithm and just normally import them elsewhere.
 stdlib shouldn't depend on non-stdlib, there isn't anything to 
 debate on this point.
Walter's original comment was about adding it to DMD. He may have intended Phobos, but I read it as DMD.
Eh, it looks like parse.d depends on std. So it wouldn't make sense to add to DMD since then the compiler would depend on the standard library.
Jul 07
parent 9il <ilyayaroshenko gmail.com> writes:
On Tuesday, 7 July 2020 at 12:35:57 UTC, jmh530 wrote:
 On Tuesday, 7 July 2020 at 12:33:40 UTC, jmh530 wrote:
 On Tuesday, 7 July 2020 at 12:14:16 UTC, Guillaume Piolat 
 wrote:
 [...]
Is that not an example of how Steve thinks it should work? Both are Boost licensed. mir-core has no external dependencies. Ilya could split the parse/bignum packages to a separate repo with no other dependencies and then public import them in mir-algorithm and just normally import them elsewhere.
 [...]
Walter's original comment was about adding it to DMD. He may have intended Phobos, but I read it as DMD.
Eh, it looks like parse.d depends on std. So it wouldn't make sense to add to DMD since then the compiler would depend on the standard library.
All std.* dependencies in Mir are minor and can be replaced in a day.
Jul 07
prev sibling next sibling parent 9il <ilyayaroshenko gmail.com> writes:
On Tuesday, 7 July 2020 at 12:14:16 UTC, Guillaume Piolat wrote:
 On Tuesday, 7 July 2020 at 10:58:25 UTC, 9il wrote:
 From a business point of view, having slightly more correct 
 string to float conversion holds very little value. I'll 
 stick with sscanf thanks...
For a high tech real markets (airspace, automotive, science, military-industrial complex) having a correct decimal literal parsing...
Phobos is the stdlib of the language. Mir is not. Likewise, you've made the std.experimental.allocator on DUB depends on mir-core... stdlib shouldn't depend on non-stdlib, there isn't anything to debate on this point.
Mir is stdlib for my business. Phobos is not. It wasn't me who proposed to use Mir code in DMD. I have heard Walter. There isn't anything to debate on this point.
Jul 07
prev sibling parent Petar Kirov [ZombineDev] <petar.p.kirov gmail.com> writes:
On Tuesday, 7 July 2020 at 12:14:16 UTC, Guillaume Piolat wrote:
 Phobos is the stdlib of the language. Mir is not.
I'm not sure why you point this out. No one is arguing that it is. On the other hand, it does many things better already.
 Likewise, you've made the std.experimental.allocator on DUB 
 depends on mir-core... stdlib shouldn't depend on non-stdlib, 
 there isn't anything to debate on this point.
stdx-allocator [1] is not "stdlib" and not meant to be part of Phobos (the opposite actually). It's a fork of the code in Phobos made with the obvious intention of doing things differently than Phobos. If you want to use Phobos, then... use Phobos :D Actually, mir-core is not a dependency of stdx-allocator in general, just in V3. You can still use the 2.77.z branch which doesn't have any dependencies [2]. Many projects do use it. [1]: https://github.com/dlang-community/stdx-allocator [2]: https://github.com/dlang-community/stdx-allocator/blob/2.77.z/dub.sdl
Jul 07
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 7/7/2020 3:58 AM, 9il wrote:
 For a high tech real markets (airspace, automotive, science, 
 military-industrial complex) having a correct decimal literal parsing has a 
 little but absolutely mandatory value. If SpaceX is lending a rocket, they 
 want it located on the platform, something around wouldn't make sense. Note, 
 that these companies hold a huge amount of the legacy C/C++ code and they are 
 potential Dlang markets. But only if Dlang will be able to match C exactly 
 for numeric code. Otherwise merging C/C++ code would have a huge negative 
 impact on them.
I agree with the importance of being correct to the last bit.
Jul 08
prev sibling next sibling parent reply 9il <ilyayaroshenko gmail.com> writes:
On Tuesday, 7 July 2020 at 07:49:02 UTC, Walter Bright wrote:
 On 7/5/2020 5:46 AM, Joseph Rushton Wakeling wrote:
 On Sunday, 5 July 2020 at 11:07:55 UTC, 9il wrote:
 There is no risk for DMD and DFL to depend on a Mir's Boost 
 licensed library. If something happens with Mir or Mir change 
 the license, DFL will be able to fork the required code at 
 any point in the Boost licensed part of git history.
Can't speak for Walter or the D foundation here, but I'm not sure the concern is really about licensing.  It's about putting in place a required dependency on code where maintenance decisions are outside the hands of the D Foundation.
That's right, it's not about the licensing. It's that the DLF should control the code it distributes. Businesses will not want to commit to a balkanized project. The proposal is for Mir to become a central required component of DMD and Phobos. This means it needs to become part of the D Language Foundation.
These don't serve my business needs. The DLF doesn't serve my business needs. The DLF blocks the initiatives my business needs. Given the current state of things, making Mir a part of the DLF codebase is nonsense.
Jul 07
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 7/7/20 7:13 AM, 9il wrote:
 On Tuesday, 7 July 2020 at 07:49:02 UTC, Walter Bright wrote:
 On 7/5/2020 5:46 AM, Joseph Rushton Wakeling wrote:
 On Sunday, 5 July 2020 at 11:07:55 UTC, 9il wrote:
 There is no risk for DMD and DFL to depend on a Mir's Boost licensed 
 library. If something happens with Mir or Mir change the license, 
 DFL will be able to fork the required code at any point in the Boost 
 licensed part of git history.
Can't speak for Walter or the D foundation here, but I'm not sure the concern is really about licensing.  It's about putting in place a required dependency on code where maintenance decisions are outside the hands of the D Foundation.
That's right, it's not about the licensing. It's that the DLF should control the code it distributes. Businesses will not want to commit to a balkanized project. The proposal is for Mir to become a central required component of DMD and Phobos. This means it needs to become part of the D Language Foundation.
These don't serve my business needs. DLF doesn't serve my business needs. DLF blocks the initiatives my business needs. For the current state of things being a part of DLF codebase for Mir is nonsense.
Guys, this is all open source, all licensed identically. There are ways to solve this. Practically speaking, just because DMD depends on Mir, doesn't mean that Mir has control over how the dependency works. DMD can depend on a specific version of Mir, upgraded when reasonable (i.e. it should take a PR change to DMD for upgrading which code exactly is depended on) and if something changes in the future, you can fork it, or move back to using libc. This way, the code is only maintained in one place unless something catastrophic happens. In this sense, the DLF *does* control which code is used, as well as if it were in the DMD repository itself. We have a boost license for a reason. -Steve
Jul 07
next sibling parent 9il <ilyayaroshenko gmail.com> writes:
On Tuesday, 7 July 2020 at 12:04:43 UTC, Steven Schveighoffer 
wrote:
 On 7/7/20 7:13 AM, 9il wrote:
 [...]
Guys, this is all open source, all licensed identically. There are ways to solve this. Practically speaking, just because DMD depends on Mir, doesn't mean that Mir has control over how the dependency works. DMD can depend on a specific version of Mir, upgraded when reasonable (i.e. it should take a PR change to DMD for upgrading which code exactly is depended on) and if something changes in the future, you can fork it, or move back to using libc. This way, the code is only maintained in one place unless something catastrophic happens. In this sense, the DLF *does* control which code is used, as well as if it were in the DMD repository itself. We have a boost license for a reason. -Steve
Exactly. Thank you
Jul 07
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On Tuesday, 7 July 2020 at 12:04:43 UTC, Steven Schveighoffer 
wrote:
 Guys, this is all open source, all licensed identically. There 
 are ways to solve this. Practically speaking, just because DMD 
 depends on Mir, doesn't mean that Mir has control over how the 
 dependency works. DMD can depend on a specific version of Mir, 
 upgraded when reasonable (i.e. it should take a PR change to 
 DMD for upgrading which code exactly is depended on) and if 
 something changes in the future, you can fork it, or move back 
 to using libc. This way, the code is only maintained in one 
 place unless something catastrophic happens.
Well, for avoidance of doubt, I'm not personally opposed per se to the use of 3rd party libs in DMD or druntime/phobos. It just struck me as important that everyone understands clearly each other's concerns. What I think provoked concern was not the idea of using mir-algorithm as a 3rd party lib, but this alternative suggestion:
 The second solution is to move `mir.bignum` and `mir.parse` to
 DRuntime/Phobos. In this case, I would like to preserve the 
 `mir.`
 namespace and the same authority and veto right for this part of
 the codebase as I have at Mir Org.
... where what's being asked is a rather stronger commitment, and not really workable: if the code's going to live in D foundation repos then any maintainer or veto rights are going to be provisional. It's just not reasonable for the maintainer of 2 stdlib modules to be able to unilaterally veto changes, or force breaking change. There's no reason per se why 3rd party libs can't be used under the hood to provide implementations, but druntime and phobos maintainers need to be able to make their own API guarantees.
Jul 07
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:
On 7/7/20 8:04 AM, Steven Schveighoffer wrote:
 On 7/7/20 7:13 AM, 9il wrote:
 On Tuesday, 7 July 2020 at 07:49:02 UTC, Walter Bright wrote:
 On 7/5/2020 5:46 AM, Joseph Rushton Wakeling wrote:
 On Sunday, 5 July 2020 at 11:07:55 UTC, 9il wrote:
 There is no risk for DMD and DFL to depend on a Mir's Boost 
 licensed library. If something happens with Mir or Mir change the 
 license, DFL will be able to fork the required code at any point in 
 the Boost licensed part of git history.
Can't speak for Walter or the D foundation here, but I'm not sure the concern is really about licensing.  It's about putting in place a required dependency on code where maintenance decisions are outside the hands of the D Foundation.
That's right, it's not about the licensing. It's that the DLF should control the code it distributes. Businesses will not want to commit to a balkanized project. The proposal is for Mir to become a central required component of DMD and Phobos. This means it needs to become part of the D Language Foundation.
These don't serve my business needs. DLF doesn't serve my business needs. DLF blocks the initiatives my business needs. As things currently stand, being part of the DLF codebase is nonsense for Mir.
Guys, this is all open source, all licensed identically. There are ways to solve this. Practically speaking, just because DMD depends on Mir, doesn't mean that Mir has control over how the dependency works. DMD can depend on a specific version of Mir, upgraded when reasonable (i.e. it should take a PR change to DMD for upgrading which code exactly is depended on) and if something changes in the future, you can fork it, or move back to using libc. This way, the code is only maintained in one place unless something catastrophic happens. In this sense, the DLF *does* control which code is used, as well as if it were in the DMD repository itself. We have a boost license for a reason.
FWIW it would be wisest to simply copy the code from Mir into druntime now with due credit. It's a minimally committal decision that can be easily revisited later. It is legal, appropriate, and there's no shame to it any more than it is for other projects to fork (parts of) dmd, druntime, or phobos.
Aug 07
prev sibling next sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Tuesday, 7 July 2020 at 07:49:02 UTC, Walter Bright wrote:
 Businesses will not want to commit to a balkanized project.
It's been ages since I worked on a software project for a business that didn't have many random third (and fourth and fifth and sixth and seventh.....) party dependencies. Trying to remove or avoid them would universally encounter pushback from management. "Don't reinvent the wheel" they say. It is really absurd. But anyway this whole debate is moot because if you like the code, you can simply copy/paste it (with attribution as required by Boost copyright of course) into your own files. You keep full control and get all the benefits of using it.
Jul 07
next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 7/7/20 8:52 AM, Adam D. Ruppe wrote:
 On Tuesday, 7 July 2020 at 07:49:02 UTC, Walter Bright wrote:
 Businesses will not want to commit to a balkanized project.
It's been ages since I worked on a software project for a business that didn't have many random third (and fourth and fifth and sixth and seventh.....) party dependencies. Trying to remove or avoid them would universally encounter pushback from management. "Don't reinvent the wheel" they say. It is really absurd.
+1. To any customer who is shy about including 3rd-party dependencies: isn't using DMD a 3rd-party dependency?
 
 
 But anyway this whole debate is moot because if you like the code, you 
 can simply copy/paste it (with attribution as required by Boost 
 copyright of course) into your own files. You keep full control and get 
 all the benefits of using it.
Doing that these days would be silly. You can depend on a specific version of a repository without problems. Git even allows you to add a dependency on another git project, and freeze it at that version. -Steve
Jul 07
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Tuesday, 7 July 2020 at 13:00:04 UTC, Steven Schveighoffer 
wrote:
 Doing that these days would be silly. You can depend on a 
 specific version of a repository without problems.
I always have problems when trying to do that. git submodules bring pretty consistent pain in my experience. But it probably isn't so bad if the submodule rarely changes. Just for 100% control anyway nothing beats copy/paste. Then there's zero difference between that and writing it yourself. I kinda wish the D upstream were more willing to do that. My view is it shouldn't be on independent developers to add stuff to Phobos, for example; instead, the Phobos team should just be copying and testing modules they are interested in on their own.
Jul 07
next sibling parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 7/7/20 11:08 AM, Adam D. Ruppe wrote:
 On Tuesday, 7 July 2020 at 13:00:04 UTC, Steven Schveighoffer wrote:
 Doing that these days would be silly. You can depend on a specific 
 version of a repository without problems.
I always have problems when trying to do that. git submodules bring pretty consistent pain in my experience. But it probably isn't so bad if the submodule rarely changes.
It's cumbersome in my experience, I do it on some iOS projects that I have. But as long as you don't touch ANY FILES inside the cloned submodule, and just use git to update it (i.e. git checkout v1.x when upgrading), you are just versioning which commit your code depends on. Xcode makes this difficult because it tries to write some temporary files inside the submodule, and I have to clear those out before upgrading. But that problem shouldn't exist here. Like all things git, it's hard until you figure it out, and then it becomes easy.
 
 Just for 100% control anyway nothing beats copy/paste. Then there's zero 
 difference between you writing it yourself.
 
 I kinda wish the D upstream were more willing to do that. My view is it 
 shouldn't be on independent developers to add stuff to Phobos, for 
 example, instead the Phobos team should just be copying and testing 
 modules they are interested in on their own.
The problem here is that you are maintaining a separate copy. There is always the chance that you don't copy it right (e.g. you copy a local checkout that has some test changes you made, or you copy the wrong version). Imagine how fun the reviews will be when you have to check that all the cumulative changes from 1.0 to 1.5 match what the source repository has. We have computers and software to do source control for a reason. -Steve
Jul 07
prev sibling next sibling parent John Colvin <john.loughran.colvin gmail.com> writes:
On Tuesday, 7 July 2020 at 15:08:33 UTC, Adam D. Ruppe wrote:
 On Tuesday, 7 July 2020 at 13:00:04 UTC, Steven Schveighoffer 
 wrote:
 Doing that these days would be silly. You can depend on a 
 specific version of a repository without problems.
I always have problems when trying to do that. git submodules bring pretty consistent pain in my experience. But it probably isn't so bad if the submodule rarely changes. Just for 100% control anyway nothing beats copy/paste. Then there's zero difference between you writing it yourself. I kinda wish the D upstream were more willing to do that. My view is it shouldn't be on independent developers to add stuff to Phobos, for example, instead the Phobos team should just be copying and testing modules they are interested in on their own.
git subtree: it's like submodules but also like copy-paste.
Jul 07
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Jul 07, 2020 at 03:08:33PM +0000, Adam D. Ruppe via
Digitalmars-d-announce wrote:
 On Tuesday, 7 July 2020 at 13:00:04 UTC, Steven Schveighoffer wrote:
 Doing that these days would be silly. You can depend on a specific
 version of a repository without problems.
I always have problems when trying to do that. git submodules bring pretty consistent pain in my experience.
git submodules serve a very specific niche; using them for anything outside of that most definitely brings in gigantic pain. That very specific niche is this: there's an external repository R that contains code you'd like to use, BUT that you'll never edit (unless you wish to push it back upstream). You check it out in some subdirectory of your source tree, say ./someRepo/*, and add it as a submodule. This records the SHA hash of the exact revision of the code in your own repository's tree (.gitmodules itself only stores the submodule's path and URL), meaning that you depend on that exact version of R.

Occasionally, you want to update R to some (presumably newer) version, so you do a `git submodule foreach git pull ...` to pull the revision you want. Committing the result updates the recorded SHA to point to the new revision.

What you do *not* want to do is to edit the contents of the submodule, because that will start creating diverging branches in the submodule, which generally leads to a gigantic mess, like when somebody checks out your code and tries to fetch submodules, they may not find the revision being referred to (you haven't pushed the commits upstream, upstream rejected it, etc).

Basically, git submodules let you refer to a specific commit in a specific repo. Don't expect it to do anything else for you, including housekeeping that you *thought* it ought to do.

I've used git submodules quite happily for my own projects, where I want to pull in code from another project but don't want to have to worry about version compatibility and all of that dependency hell. Basically you update a submodule to a new revision when and only when *you* initiate it, and don't commit the submodule update until you've verified that the new revision didn't break your code. The submodule SHA ensures that you'll get the exact version of the submodule that you last checked in, not some random new version or some corrupted/edited version that some unreliable network source has given you instead.
 But it probably isn't so bad if the submodule rarely changes.
Yeah, you do *not* want to use submodules if you're interested in keeping up with the latest bleeding edge from upstream. Well I mean you *can*, but just don't expect it to automate anything for you. The onus is upon you to test everything with the new revision before committing it. Oh, and another point: it's *probably* a good idea to git clone (i.e. fork in github parlance) the submodule into a local copy, so that if the network source vanishes into the ether, you aren't left with uncompilable code. Don't laugh, the way modern software development is going, I will be SO not surprised when one day some obscure project that everyone implicitly depends on suddenly vanishes into the ether and the rest of the world collapses because everybody and his neighbour's dog blindly assumed that "if it's on the network, it'll be there forever".
 Just for 100% control anyway nothing beats copy/paste. Then there's
 zero difference between you writing it yourself.
I highly recommend this approach when your dependency is small. Or if you want to ensure no external dependencies. There have been far, FAR too many times IME in the past few years where I encountered a project that was no longer compilable because one or more dependencies have vanished into the ether. Or the code no longer compiles with the dependency because the latest version of said dependency has migrated to a brand new codebase, and the old revision that the project depended on is not compatible with the new version. Or said project itself has moved on and happily broke functionality I depended on.

These days, my policy is: download the danged source code for the specific version of the specific project I'm depending on, AND download the danged source code of the danged dependencies of that project, etc., and keep a local copy of the whole recursive dependency tree so that I can ensure I can always build that specific version of that specific project with that specific functionality that I'm using. Trying to keep up with projects that gratuitously break stuff, or abandoned projects whose dependencies are no longer compatible with it, etc., is a hell I wish to have no part in.

I've lost faith in the emperor's code reuse clothes; copy-n-paste is what gives the real guarantees. And I'm not alone in this -- I've noticed that quite a few open source projects are distributing a copy of the sources of the librar{y,ies} they depend on in their own source tree as a fallback, in case the target system's version of that library doesn't exist, or is hard to find, or is somehow incompatible with the version they're expecting to use.
 I kinda wish the D upstream were more willing to do that. My view is
 it shouldn't be on independent developers to add stuff to Phobos, for
 example, instead the Phobos team should just be copying and testing
 modules they are interested in on their own.
+1, this guarantees Phobos doesn't (directly) have dependencies outside of itself. As a standard library, that's a no-no (cf. the repeated problems we had over the years with libcurl, zlib, etc.). Just keep a copy of the source code in some dedicated subdirectory that can be easily replaced when we need to upgrade to a new version -- that's what open source is for!!!

T

--
People tell me I'm stubborn, but I refuse to accept it!
Jul 07
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 7/7/20 1:37 PM, H. S. Teoh wrote:
 cf. the repeated
 problems we had over the years with libcurl, zlib, etc.
zlib is actually included copy-paste style in Phobos [1]. So it's interesting that you cite it as an example of causing problems because we don't include a copy of it. -Steve [1] https://github.com/dlang/phobos/tree/ea070e9a5c168de2e8460413e44bff1d406ff5c3/etc/c/zlib
Jul 07
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Jul 07, 2020 at 01:47:59PM -0400, Steven Schveighoffer via
Digitalmars-d-announce wrote:
 On 7/7/20 1:37 PM, H. S. Teoh wrote:
 cf. the repeated problems we had over the years with libcurl, zlib,
 etc.
zlib is actually included copy-paste style in Phobos [1]. So it's interesting that you cite it as an example of causing problems because we don't include a copy of it.
[...] Ah, haha, I think it was mainly libcurl that was causing problems. Now that I think about it again, I don't recall zlib causing problems, probably because we *didn't* rely on it being available on the target machine! T -- Dogs have owners ... cats have staff. -- Krista Casada
Jul 07
prev sibling parent reply 9il <ilyayaroshenko gmail.com> writes:
On Tuesday, 7 July 2020 at 12:52:35 UTC, Adam D. Ruppe wrote:
 On Tuesday, 7 July 2020 at 07:49:02 UTC, Walter Bright wrote:
 Businesses will not want to commit to a balkanized project.
It's been ages since I worked on a software project for a business that didn't have many random third (and fourth and fifth and sixth and seventh.....) party dependencies. Trying to remove or avoid them would universally encounter pushback from management. "Don't reinvent the wheel" they say. It is really absurd. But anyway this whole debate is moot because if you like the code, you can simply copy/paste it (with attribution as required by Boost copyright of course) into your own files. You keep full control and get all the benefits of using it.
This would be good advertising for DLF, haha.
Jul 07
parent Adam D. Ruppe <destructionator gmail.com> writes:
On Tuesday, 7 July 2020 at 13:01:10 UTC, 9il wrote:
 This would be good advertising for DLF, haha.
I don't know what you mean...
Jul 07
prev sibling parent reply Dibyendu Majumdar <mobile majumdar.org.uk> writes:
On Tuesday, 7 July 2020 at 07:49:02 UTC, Walter Bright wrote:
 That's right, it's not about the licensing. It's that the DLF 
 should control the code it distributes.

 Businesses will not want to commit to a balkanized project.

 The proposal is for Mir to become a central required component 
 of DMD and Phobos. This means it needs to become part of the D 
 Language Foundation.
This argument seems a bit odd given ... when D code was contributed to gcc, did you follow the FSF rule of assigning copyright to FSF?
Jul 11
parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On Saturday, 11 July 2020 at 10:58:31 UTC, Dibyendu Majumdar 
wrote:
 This argument seems a bit odd given ... when D code was 
 contributed to gcc, did you follow the FSF rule of assigning 
 copyright to FSF?
The issue is about maintenance of the codebase, not about who owns the copyright.
Jul 11
prev sibling parent reply kinke <kinke gmx.net> writes:
On Saturday, 4 July 2020 at 20:35:48 UTC, Walter Bright wrote:
 On 6/21/2020 8:24 AM, 9il wrote:
 So excited to finally announce we can correctly parse 
 floating-point numbers according to IEEE round half-to-even 
 (bankers) rule like in C/C++, Rust, and others.
Great work! Would you like to add it to dmd?
AFAIU, the 'problem' is that *all* floating-point literals are parsed as real_t values, which for DMD is x87 real (usually using the host C runtime's `strtold`). When emitting them as double or float literals, the compiler converts these values to a lower precision, where the increased intermediate precision might break the 'banker's rule'. So wouldn't the trivial 'fix' be using `strtod` for double literals and `strtof` for floats? [For LDC, we wouldn't rely on the host C runtime or a mir implementation, but use LLVM facilities anyway.]
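To illustrate the double-rounding effect described above, here is a minimal C sketch. It is not the DMD, LDC, or Mir code. It assumes a C runtime whose `strtod` and `strtold` round correctly and a `long double` that is wider than `double` (for example the x87 80-bit format). The names `parse_direct`, `parse_via_long_double` and `JUST_ABOVE_MIDPOINT` are hypothetical. The input is 1 + 2^-53 written out exactly, plus one extra trailing digit, so it sits just above the midpoint between the adjacent doubles 1.0 and 1.0 + DBL_EPSILON.

```c
#include <assert.h>
#include <float.h>
#include <stdlib.h>

/* Decimal text slightly above the midpoint of 1.0 and 1.0 + DBL_EPSILON. */
const char JUST_ABOVE_MIDPOINT[] =
    "1."
    "000000000000000"                         /* 15 zeros */
    "11102230246251565404236316680908203125"  /* exact digits of 2^-53 */
    "1";                                      /* tiny excess above the midpoint */

/* One rounding step: decimal text straight to double. */
double parse_direct(const char *text)
{
    assert(text != NULL);
    return strtod(text, NULL);
}

/* Two rounding steps: decimal text to long double, then long double to
   double. The first step can discard the tiny excess, which turns the
   second step into an exact tie that is resolved towards the even value. */
double parse_via_long_double(const char *text)
{
    assert(text != NULL);
    long double wide = strtold(text, NULL);
    return (double)wide;
}
```

Under those assumptions, `parse_direct(JUST_ABOVE_MIDPOINT)` should round up to `1.0 + DBL_EPSILON`, while `parse_via_long_double` should first land exactly on the midpoint and then tie to even, giving `1.0`. On platforms where `long double` has the same format as `double`, both paths should agree.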
Jul 07
parent reply 9il <ilyayaroshenko gmail.com> writes:
On Tuesday, 7 July 2020 at 16:38:39 UTC, kinke wrote:
 On Saturday, 4 July 2020 at 20:35:48 UTC, Walter Bright wrote:
 On 6/21/2020 8:24 AM, 9il wrote:
 [...]
Great work! Would you like to add it to dmd?
AFAIU, the 'problem' is that *all* floating-point literals are parsed as real_t values, which for DMD is x87 real (usually using the host C runtime's `strtold`). When emitting them as double or float literals, the compiler converts these values to a lower precision, where the increased intermediate precision might break the 'banker's rule'. So wouldn't the trivial 'fix' be using `strtod` for double literals and `strtof` for floats? [For LDC, we wouldn't rely on the host C runtime or a mir implementation, but use LLVM facilities anyway.]
This should work if the C runtime handles the values correctly. Does this actually mean DMD wouldn't be able to compile itself with the DigitalMars C runtime?
Jul 07
parent reply kinke <noone nowhere.com> writes:
On Tuesday, 7 July 2020 at 23:52:05 UTC, 9il wrote:
 On Tuesday, 7 July 2020 at 16:38:39 UTC, kinke wrote:
 So wouldn't the trivial 'fix' be using `strtod` for double 
 literals and `strtof` for floats? [For LDC, we wouldn't rely 
 on the host C runtime or a mir implementation, but use LLVM 
 facilities anyway.]
This should work if the C runtime handles the values correctly.
I've just opened a PR for DMD (and LDC too): https://github.com/dlang/dmd/pull/11387
 Does this actually mean DMD wouldn't be able to compile itself 
 with the DigitalMars C runtime?
Sorry, I don't understand. - I think this excess precision was at some point considered a feature (but probably only for D, not for DigitalMars C), there's even a test making sure 0.9L and 0.9 are parsed to the same compile-time value.
Jul 07
parent 9il <ilyayaroshenko gmail.com> writes:
On Tuesday, 7 July 2020 at 23:56:45 UTC, kinke wrote:
 On Tuesday, 7 July 2020 at 23:52:05 UTC, 9il wrote:
 On Tuesday, 7 July 2020 at 16:38:39 UTC, kinke wrote:
 So wouldn't the trivial 'fix' be using `strtod` for double 
 literals and `strtof` for floats? [For LDC, we wouldn't rely 
 on the host C runtime or a mir implementation, but use LLVM 
 facilities anyway.]
This should work if the C runtime handles the values correctly.
I've just opened a PR for DMD (and LDC too): https://github.com/dlang/dmd/pull/11387
 Does this actually mean DMD wouldn't be able to compile itself 
 with the DigitalMars C runtime?
Sorry, I don't understand. - I think this excess precision was at some point considered a feature (but probably only for D, not for DigitalMars C), there's even a test making sure 0.9L and 0.9 are parsed to the same compile-time value.
DMC strtod [1] isn't IEEE compatible. Just a nitpick; it's unlikely that it is used to compile DMD, though.

[1] https://github.com/DigitalMars/dmc/blob/9a774f3f2b3227fd416ec3a83cb9eb8f8751425f/src/core/strtod.c
Jul 07
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/21/2020 8:24 AM, 9il wrote:
 [0] https://github.com/libmir/mir-algorithm

 https://issues.dlang.org/show_bug.cgi?id=20951
 [3] https://issues.dlang.org/show_bug.cgi?id=20952
 [4] https://issues.dlang.org/show_bug.cgi?id=20953
 [5] https://issues.dlang.org/show_bug.cgi?id=20963
 [6] https://issues.dlang.org/show_bug.cgi?id=20967
 [7] https://www.researchgate.net/publication/2295884_How_to_Read_Floating_Point_Numbers_Accurately
 
As an update, I've gotten 20963 working now. It's a foundational fix, and without it I expect correcting the other issues would be unduly difficult and frustrating. https://github.com/dlang/dmd/pull/11393
Jul 11