digitalmars.D - RFC: SI Units facility for Phobos

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Benjamin Shropshire wrote an SI Units implementation which he refers to 
here:

http://d.puremagic.com/issues/show_bug.cgi?id=3725

For convenience, here are direct links to the code:

http://www.dsource.org/projects/scrapple/browser/trunk/units/si2.d
http://www.dsource.org/projects/scrapple/browser/trunk/units/rational.d

This cannot be considered a reviewable proposal yet because it doesn't 
have sufficient documentation and is essentially in the stage where 
Benjamin is asking for input on improving the design and implementation.

Below are my comments in response to Benjamin's RFC. In brief, the 
design is a bit off from what I hope to have in a units library, but 
things can be fixed without too much difficulty by shuffling things around.

* Call the library std.units and generally focus on units, not on SI units.

* I suggest doing away with abstract unit names ("Distance", "Time", 
"Mass" etc.) and use concrete plural units ("Meters", "Seconds", 
"Kilograms" etc) instead. I agree that at a level operating with the 
abstract names seems to be more pure, but at a concrete level you need 
to have various reference points. For example, a molecular physics 
program would want to operate with Angstroms, which should be a distinct 
type from Meters.

* There should be ways to define scalars of distinct types and 
relationships between them. For example, "Radians" and "Degrees" should 
be distinct types, although both are scalar.

* The previous points bring me to an important design artifact: each and 
every unit should have a multiplier (constant, template argument) that 
describes its relationship to the SI corresponding entity. The SI units 
themselves will have a 1.0 multiplier, and e.g. Angstroms has a 1e-10 
multiplier. The current library has a facility for that, but I think 
that's not as good.

* In the proposed design the user can define a lot of distinct types, 
such as Miles, Yards, and Lbs, which are strictly unnecessary 
(Kilometers, Meters, and Kilograms could be used instead, with 
appropriate I/O conversions to and from other units). I think offering 
scale-less units chosen by the user is a good thing as long as there is 
a unified mechanism for converting between those units without risking 
confusion and bugs.

* There should be no implicit conversion to double and generally few 
conversion smarts. The units should have a writable public field "value".

* There should also be a property called siValue which yields the value, 
converted to SI, of type double. For an Angstroms, siValue returns value 
* 1e-10.
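
A minimal sketch of the multiplier, "value" and siValue points above. The `Unit` template, its parameter names, and the aliases are hypothetical and are not taken from si2.d; the only thing borrowed from the text is that Angstroms scale to SI by 1e-10.

```d
import std.math : isClose;

/// Hypothetical unit type: `name` only makes the type distinct,
/// `toSI` is the factor that converts a stored value to the SI unit.
struct Unit(string name, double toSI)
{
    double value;   // writable, stored in the unit's own scale

    /// The stored value expressed in the corresponding SI unit.
    @property double siValue() const { return value * toSI; }
}

alias Meters    = Unit!("Meters", 1.0);
alias Angstroms = Unit!("Angstroms", 1e-10);

unittest
{
    auto bond = Angstroms(10);                 // stored as the number 10
    assert(bond.value == 10);
    assert(isClose(bond.siValue, 1e-9));       // 10 Angstroms in meters
    static assert(!is(Angstroms == Meters));   // distinct types
}
```

Because the factor is a template argument, it costs nothing at run time until siValue is actually requested.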

(Step-by-step on the code:)

* The code should use TypeTuple instead of T.

* I think FullSI should be always in effect. Even though many users 
don't care for lumens and moles, they can just sit there defaulted at 
the very end and shouldn't be bothersome.

* Each artifact (extra, extra2, Batch...) should be documented.

* I'm not sure about the use of fractional exponents. They add a fair 
amount of complication. Could we dump them or use a simple fixed-point 
scheme to accommodate them?

* The naming convention should consistently use NamesLikeThis for types 
and namesLikeThis for values (including constants).

* A scheme similar to std.conv.to should serve as an all-around 
converter, e.g. to!Kilometers(Miles(10)) should yield a value of type 
Kilometers that contains 16.09 or whatever.

* All operators should be converted to D2 (yet another success story of 
the new design :o)).

* Packages of remarkable constants would be nice to have, of course in 
the appropriate units. The fields of astronomy, classical/relativistic 
mechanics, electromagnetism, molecular physics, quantum mechanics, come 
to mind.
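
The std.conv.to-like converter mentioned in the list above could look roughly like this. It is only a sketch: `convertTo`, `Unit` and `factor` are made-up names, and there is no check that source and target measure the same dimension, which a real design would need.

```d
import std.math : isClose;

/// Hypothetical unit type carrying its SI factor at compile time.
struct Unit(string name, double toSI)
{
    double value;
    enum double factor = toSI;   // SI units per one of this unit
}

alias Kilometers = Unit!("Kilometers", 1000.0);
alias Miles      = Unit!("Miles", 1609.344);   // international mile in meters

/// Hypothetical converter: rescales through the two SI factors.
Target convertTo(Target, Source)(Source s)
{
    return Target(s.value * Source.factor / Target.factor);
}

unittest
{
    auto km = convertTo!Kilometers(Miles(10));
    static assert(is(typeof(km) == Kilometers));
    assert(isClose(km.value, 16.09344));
}
```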

All - please add your comments. Benjamin, I hope you're as enthusiastic 
as always about pushing this into Phobos!


Andrei
Jan 01 2011
bearophile <bearophileHUGS lycos.com> writes:
Andrei:

 Benjamin Shropshire wrote an SI Units implementation which he refers to 
 here:

Frink is a little language that uses units of measure a lot:
http://futureboy.us/frinkdocs/

Bye,
bearophile
Jan 01 2011
Simon <s.d.hammett gmail.com> writes:
On 01/01/2011 20:24, Andrei Alexandrescu wrote:

such as Miles, Yards, and Lbs, which are strictly unnecessary (Kilometers, Meters, and Kilograms could be used instead, with appropriate I/O conversions to and from other units).

Nope. Where I work we've had complaints before from American customers where we've done calcs in SI and then converted to Imperial. For engineering it's important to be able to do your calcs in the correct unit system.

-- 
My enormous talent is exceeded only by my outrageous laziness.
http://www.ssTk.co.uk
Jan 01 2011
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/1/11 3:56 PM, Simon wrote:
 On 01/01/2011 20:24, Andrei Alexandrescu wrote:

such as Miles, Yards, and Lbs, which are strictly unnecessary (Kilometers, Meters, and Kilograms could be used instead, with appropriate I/O conversions to and from other units).

Nope. Where I work we've had complaints before from American customers where we've done calcs in SI and then converted to Imperial. For engineering it's important to be able to do your calcs in the correct unit system.

You sure can!

Andrei
Jan 01 2011
BCS <anon anon.com> writes:
Hello Andrei,

 Benjamin Shropshire wrote an SI Units implementation which he refers
 to here:
 
 http://d.puremagic.com/issues/show_bug.cgi?id=3725
 
 For convenience, here are direct links to the code:
 
 http://www.dsource.org/projects/scrapple/browser/trunk/units/si2.d
 http://www.dsource.org/projects/scrapple/browser/trunk/units/rational.d
 
 This cannot be considered a reviewable proposal yet because it doesn't
 have sufficient documentation and is essentially in the stage where
 Benjamin is asking for input on improving the design and
 implementation.
 
 Below are my comments in response to Benjamin's RFC. In brief, the
 design is a bit off from what I hope to have in a units library, but
 things can be fixed without too much difficulty by shuffling things
 around.
 
 * Call the library std.units and generally focus on units, not on SI
 units.
 
 * I suggest doing away with abstract unit names ("Distance", "Time",
 "Mass" etc.) and use concrete plural units ("Meters", "Seconds",
 "Kilograms" etc) instead. I agree that at a level operating with the
 abstract names seems to be more pure, but at a concrete level you need
 to have various reference points. For example, a molecular physics
 program would want to operate with Angstroms, which should be a
 distinct type from Meters.

Why would that be an improvement? The current system encodes all length measurements as meters but allows you to work with other units by converting at the point where you convert an FP type to a unit-carrying type. The issue I see with making different units of length different types is that there is an unbounded set of them, and I don't see any reasonable way to encode the conversion structures for them. If someone else is able to make such a library that is as clean as this one, I won't stand in its way, but I have no interest in writing such a beast.
 
 * There should be ways to define scalars of distinct types and
 relationships between them. For example, "Radians" and "Degrees"
 should be distinct types, although both are scalar.

Ditto my comments for non-scalars, on a smaller scale.
 
 * The previous points bring me to an important design artifact: each
 and every unit should have a multiplier (constant, template argument)
 that describes its relationship to the SI corresponding entity. The SI
 units themselves will have a 1.0 multiplier, and e.g. Angstroms has a
 1e-10 multiplier. The current library has a facility for that, but I
 think that's not as good.

That sounds to me like what the library has so I must not be understanding what you are asking for. Could you elaborate?
 
 * In the proposed design the user can define a lot of distinct types,
 such as Miles, Yards, and Lbs, which are strictly unnecessary
 (Kilometers, Meters, and Kilograms could be used instead, with
 appropriate I/O conversions to and from other units). I think offering
 scale-less units chosen by the user is a good thing as long as there
 is a unified mechanism for converting between those units without
 risking confusion and bugs.

Again, that sounds to me like what the library does. All distance units are of the same type and internally are encoded as meters. The rest of the units are converted on access.
 
 * There should be no implicit conversion to double and generally few
 conversion smarts. The units should have a writable public field
 "value".
 
 * There should also be a property called siValue which yields the
 value, converted to SI, of type double. For an Angstroms, siValue
 returns value * 1e-10.
 
 (Step-by-step on the code:)
 
 * The code should use TypeTuple instead of T.
 
 * I think FullSI should be always in effect. Even though many users
 don't care for lumens and moles, they can just sit there defaulted at
 the very end and shouldn't be bothersome.

That's my soap box protest to SI's (IMHO stupid) inclusion of them as base units. :)
 
 * Each artifact (extra, extra2, Batch...) should be documented.

Um, Yah. :o)
 
 * I'm not sure about the use of fractional exponents. They add a fair
 amount of complication. Could we dump them or use a simple fixed-point
 scheme to accommodate them?
 

The only unit I know for sure has fractional exponents is in fracture mechanics (kPa*m^0.5), but if you allow anything beyond that, any fixed-point scheme I can think of would fall over right away (X^(1/3)?).
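
To illustrate the point about X^(1/3): a binary fixed-point exponent cannot hold one third exactly, whereas an exponent kept as a reduced fraction can. The `Exponent` struct below is a hypothetical sketch (it is not the type in rational.d) and assumes positive denominators.

```d
import std.numeric : gcd;

/// Hypothetical unit exponent stored as a reduced fraction num/den.
struct Exponent
{
    int num;
    int den = 1;   // assumed to stay positive

    /// Multiplying two quantities adds their exponents.
    Exponent opBinary(string op : "+")(Exponent rhs) const
    {
        int n = num * rhs.den + rhs.num * den;
        int d = den * rhs.den;
        int g = gcd(n < 0 ? -n : n, d);
        return Exponent(n / g, d / g);
    }
}

unittest
{
    auto third = Exponent(1, 3);
    assert(third + third + third == Exponent(1, 1));   // exact, no drift

    auto half = Exponent(1, 2);                        // e.g. the m^0.5 in kPa*m^0.5
    assert(half + half == Exponent(1, 1));
}
```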
 * The naming convention should consistently use NamesLikeThis for
 types and namesLikeThis for values (including constants).
 
 * A scheme similar to std.conv.to should serve as an all-around
 converter, e.g. to!Kilometers(Miles(10)) should yield a value of type
 Kilometers that contains 16.09 or whatever.
 
 * All operators should be converted to D2 (yet another success story
 of the new design :o)).
 
 * Packages of remarkable constants would be nice to have, of course in
 the appropriate units. The fields of astronomy, classical/relativistic
 mechanics, electromagnetism, molecular physics, quantum mechanics,
 come to mind.
 
 All - please add your comments. Benjamin, I hope you're as
 enthusiastic as always about pushing this into Phobos!
 

I am.
 Andrei
 

Jan 04 2011
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/5/11 12:26 AM, BCS wrote:
 * I suggest doing away with abstract unit names ("Distance", "Time",
 "Mass" etc.) and use concrete plural units ("Meters", "Seconds",
 "Kilograms" etc) instead. I agree that at a level operating with the
 abstract names seems to be more pure, but at a concrete level you need
 to have various reference points. For example, a molecular physics
 program would want to operate with Angstroms, which should be a
 distinct type from Meters.

Why would that be an improvement? The current system encodes all length measurements as meters but allows you to work with other units by converting at the point where you convert an FP type to a unit-carrying type. The issue I see with making different units of length different types is that there is an unbounded set of them, and I don't see any reasonable way to encode the conversion structures for them. If someone else is able to make such a library that is as clean as this one, I won't stand in its way, but I have no interest in writing such a beast.

It would be an improvement because there wouldn't be the need to multiply by a bias every time a value is assigned, with the corresponding loss in speed and precision. To exemplify, say a program wants to work in Angstroms. As all distances are stored in meters, ultimately all values stored and operated on would be very small, which adversely affects precision. At the other end of the scale, an astronomy program would want to work with light-years, which would force storage of large values as meters.

To solve this issue, each unit may include a static multiplier that converts it to SI (e.g. meter), while at the same time allowing the program to store and operate directly on the unit of choice. So a program may actually store 10 Angstroms as the number 10, or 10 light-years as the number 10.
 * There should be ways to define scalars of distinct types and
 relationships between them. For example, "Radians" and "Degrees"
 should be distinct types, although both are scalar.

Ditto my comments for non-scalars, on a smaller scale.

The crux of the matter is that Radians and Degrees should be distinct types, and that a conversion should be defined taking one to the other. How can we express that in the current library, or what could be added to it to make that possible?
 * The previous points bring me to an important design artifact: each
 and every unit should have a multiplier (constant, template argument)
 that describes its relationship to the SI corresponding entity. The SI
 units themselves will have a 1.0 multiplier, and e.g. Angstroms has a
 1e-10 multiplier. The current library has a facility for that, but I
 think that's not as good.

That sounds to me like what the library has so I must not be understanding what you are asking for. Could you elaborate?

I think my comments above clarify this. If not please let me know. In brief: one should be able to operate on values that are implicitly scaled, which are of distinct types (Angstroms, LightYears, Radians, Degrees would be illustrative examples).
 * In the proposed design the user can define a lot of distinct types,
 such as Miles, Yards, and Lbs, which are strictly unnecessary
 (Kilometers, Meters, and Kilograms could be used instead, with
 appropriate I/O conversions to and from other units). I think offering
 scale-less units chosen by the user is a good thing as long as there
 is a unified mechanism for converting between those units without
 risking confusion and bugs.

Again, that sounds to me like what the library does. All distance units are of the same type and internally are encoded as meters. The rest of the units are converted on access.

The issue is that the choice of the unified format may be problematic.

Andrei
Jan 04 2011
"BCS" <bcs not-here.com> writes:
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 On 1/5/11 12:26 AM, BCS wrote:
 * I suggest doing away with abstract unit names ("Distance", "Time",
 "Mass" etc.) and use concrete plural units ("Meters", "Seconds",
 "Kilograms" etc) instead. I agree that at a level operating with the
 abstract names seems to be more pure, but at a concrete level you need
 to have various reference points. For example, a molecular physics
 program would want to operate with Angstroms, which should be a
 distinct type from Meters.

Why would that be an improvement? The current system encodes all length measurements as meters but allows you to work with other units by converting at the point where you convert an FP type to a unit-carrying type. The issue I see with making different units of length different types is that there is an unbounded set of them, and I don't see any reasonable way to encode the conversion structures for them.

It would be an improvement because there wouldn't be the need to multiply with a bias every time a value is assigned, with the corresponding loss in speed and precision. To exemplify, say a program wants to work in Angstroms. As all distances are stored in meters, ultimately all values stored and operated on would be very small, which adversely affects precision. At the other end of the scale, an astronomy program would want to work with light-years, which would force storage of large values as meters. To solve this issue, each unit may include a static multiplier that converts it to SI (e.g. meter), while at the same time allowing to store and operate directly on the unit of choice. So a program may actually store 10 Angstroms as the number 10, or 10 light-years as the number 10.

Ah. I see what you are getting at. OTOH I'm still not convinced it's any better. A quick check shows that 1 light year = 9.4605284 × 10^25 angstroms: a mere 25 orders of magnitude difference, where IEEE754 doubles have a range of 307 orders of magnitude. As to the issue of where to do the conversions: I suspect that the majority of computation will be between unit-carrying types (particularly if the library is used the way I'm intending it to be) and as such, I expect that both performance and precision will benefit from having a unified internal representation.

There /might/ be reason to have a very limited set of scaling factors (e.g. atomic scale, human scale, astro scale) and define each of the other units from one of them, but then you run into issues of what to do when you do computations that involve more than one (for example, computing the resolution of an X-ray telescope involves all three scales).

When I started writing the library, I looked at these issues just enough that I knew sorting them out wasn't going to be a fun project. So, rather than hash out these issues myself, I copied as much as I could from the best units-handling tool I know of: MathCAD. As best I can tell, it uses the same setup I do.
 * There should be ways to define scalars of distinct types and
 relationships between them. For example, "Radians" and "Degrees"
 should be distinct types, although both are scalar.

Ditto my comments for non-scalars, on a smaller scale.

The crux of the matter is that Radians and Degrees should be distinct types, and that a conversion should be defined taking one to the other. How can we express that in the current library, or what could be added to it to make that possible?

I don't think there /is/ a good solution to that problem because many of the computations that result in radians naturally give scalar values (arc-length/radius). As a result, the type system has no way to determine what the correct type for the expression is without the user forcing a cast or the like. If angles are treated as an alias for scalar then the conversion to degrees can be handled in a reasonable way (but that would also allow converting any scalar value to degrees). I again punted on this one because people who have put more time than I have available (MathCAD again) couldn't come up with anything better.
 * In the proposed design the user can define a lot of distinct types,
 such as Miles, Yards, and Lbs, which are strictly unnecessary
 (Kilometers, Meters, and Kilograms could be used instead, with
 appropriate I/O conversions to and from other units). I think offering
 scale-less units chosen by the user is a good thing as long as there
 is a unified mechanism for converting between those units without
 risking confusion and bugs.

Again, that sounds to me like what the library does. All distance units are of the same type and internally are encoded as meters. The rest of the units are converted on access.


The issue I see is that the choice of a non-unified format will be problematic. Unless you can show examples (e.g. benchmarks) of where the current solution has precision or performance problems, or where its expressive power is inadequate, I will remain reluctant to change it.
Jan 05 2011
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/5/11 10:32 AM, BCS wrote:
 Andrei Alexandrescu<SeeWebsiteForEmail erdani.org>  wrote:

 Ah. I see what you are getting at. OTOH I'm still not convinced it's any
better.

 A quick check shows that 1 light year = 9.4605284 × 10^25
 angstroms. A mere 25 orders of magnitude differences, where IEEE754
 doubles have a range of 307 orders of magnitude. As to the issue of
 where to do the conversions: I suspect that the majority of
 computation will be between unit carrying types (particularly if the
 library is used the way I'm intending it to be) and as such, I expect
 that both performance and precision will benefit from having a
 unified internal representation.

People might want to use float for compactness, which has range 1e38 or so. But that's not necessarily the largest issue (see below).
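
The range figures being traded here can be checked against the properties D exposes on its floating-point types. This is just a sanity-check sketch; the constant name is made up, and the value is the one quoted in the thread.

```d
unittest
{
    // Largest decimal exponents each type can represent.
    static assert(float.max_10_exp == 38);
    static assert(double.max_10_exp == 308);

    // Figure quoted in the thread: angstroms in one light year.
    enum double angstromsPerLightYear = 9.4605284e25;

    // It fits in float's range, but float guarantees only 6 significant
    // decimal digits, against 15 for double.
    static assert(angstromsPerLightYear < float.max);
    static assert(float.dig == 6);
    static assert(double.dig == 15);
}
```

So range alone is not the constraint for either type; the number of significant digits is the tighter limit when storing such values in float.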
 There /might/ be reason to have a very limited set of scaling factors
 (e.g. atomic scale, human scale, astro scale) and define each of the
 other units from one of them. but then you run into issues of what to
 do when you do computations that involve more than one (for example,
 computing the resolution of an X-ray telescope involves all three
 scales).

There are two issues apart from scale. One, creeping errors due to conversions. Someone working in miles would not like that after a few calculations that look integral they get 67.9999998 miles. Second, let's not forget the cost of implicit conversions to and from. The way I see it, forcing an internal unit for representation has definite issues that reduce its potential applicability.
 When I started writing the library, I looked at these issue just
 enough that I knew sorting it wasn't going to be a fun project. So,
 rather than hash out these issue my self, I copied as much as I could
 from the best units handling tool I know of: MathCAD. As best I can
 tell, it uses the same setup I am.

I don't know MathCAD, but as far as I understand that's a system, not a library, and as such might have a slightly different charter. In terms of charter Boost units (http://www.boost.org/doc/libs/1_38_0/doc/html/boost_units.html) is the closest library to this. I haven't looked at it for a while, but indeed it does address the issue of scale as I suggested: it allows people to store numbers in their own units instead of forcing a specific unit. In fact the library makes it a point to distinguish itself from an "SI" library as early as its second page: "While this library attempts to make simple dimensional computations easy to code, it is in no way tied to any particular unit system (SI or otherwise). Instead, it provides a highly flexible compile-time system for dimensional analysis, supporting arbitrary collections of base dimensions, rational powers of units, and explicit quantity conversions. It accomplishes all of this via template metaprogramming techniques." Like it or not, Boost units will be the yardstick against which anything like it in D will be compared. I hope that D being a superior language it will make it considerably easier to implement anything metaprogramming-heavy.
 The crux of the matter is that Radians and Degrees should be distinct
 types, and that a conversion should be defined taking one to the other.
 How can we express that in the current library, or what could be added
 to it to make that possible?

I don't think there /is/ a good solution to that problem because many of the computations that result in radians naturally give scalar values (arc-length/radius). As a result, the type system has no way to determine what the correct type for the expression is without the user forcing a cast or the like.

Not a cast, but a conversion. Consider:

void computeFiringSolution(Radians angle)
{
    auto s = sin(angle.value);
    ...
    auto newAngle = Radians(arcsin(s));
}

Much of the point of using units is that there is a good amount of being explicit in their handling. The user knows that sin takes a double which is meant in Radians. Her program encodes that assumption in a type, but is also free to simply fetch the value when using the untyped primitives.
 If angles are treated as an alias for scalar then the conversion to
 degrees can be handled in a reasonable way (but that would also allow
 converting any scalar value to degrees). I again punted on this one
 because people who have put more time than I have available (MathCAD
 again) couldn't come up with anything better.

ArcDegrees and Radians would be two distinct types. You wouldn't be able to add ArcDegrees to Radians without explicitly stating where you want to be:

ArcDegrees a1;
Radians a2;
auto a = a1 + a2; // error!
auto b = a1 + ArcDegrees(a2); // fine, b is stored in ArcDegrees
auto c = Radians(a1) + a2; // fine, c is stored in Radians

The same goes for Kilometers and Miles:

Kilometers d1;
Miles d2;
...
auto a = d1 + d2; // error!
auto b = d1 + Kilometers(d2); // fine, b is stored in Kilometers
auto c = Miles(d1) + d2; // fine, c is stored in Miles
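
One way the addition rules described here could be realized in D2 is sketched below. The struct layouts and constructors are assumptions made for illustration, not code from the submitted library; only the type names come from the post.

```d
import std.math : PI, isClose;

struct Radians
{
    double value;

    this(double v) { value = v; }
    /// Explicit conversion from degrees.
    this(ArcDegrees d) { value = d.value * PI / 180; }

    /// Only Radians + Radians is defined.
    Radians opBinary(string op : "+")(Radians rhs) const
    {
        return Radians(value + rhs.value);
    }
}

struct ArcDegrees
{
    double value;

    this(double v) { value = v; }
    /// Explicit conversion from radians.
    this(Radians r) { value = r.value * 180 / PI; }

    /// Only ArcDegrees + ArcDegrees is defined.
    ArcDegrees opBinary(string op : "+")(ArcDegrees rhs) const
    {
        return ArcDegrees(value + rhs.value);
    }
}

unittest
{
    auto a1 = ArcDegrees(90.0);
    auto a2 = Radians(PI / 2);

    static assert(!__traits(compiles, a1 + a2));   // mixed units are rejected
    auto b = a1 + ArcDegrees(a2);                  // stored in ArcDegrees
    auto c = Radians(a1) + a2;                     // stored in Radians
    assert(isClose(b.value, 180.0));
    assert(isClose(c.value, cast(double) PI));
}
```

D never calls a constructor implicitly, so the mixed addition fails to compile without any extra machinery; the conversion has to be spelled out at the call site.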
 Again, that sounds to me like what the library does. All distance units
 are of the same type and internally are encoded as meters. The rest of
 the units are converted on access.


The issue I see is that the choice of a non-unified format will be problematic. Unless you can show examples (e.g. benchmarks) of where the current solution has precision or performance problems, or where its expressive power is inadequate, I will remain reluctant to change it.

Examples of precision issues with scaling back and forth by means of a multiplier shouldn't be necessary as the problem is obvious. Here's an example that took me a couple of minutes to produce:

immutable real metersPerLightyear = 9.4605284e15;
auto a1 = metersPerLightyear * 15.3;
auto a2 = metersPerLightyear * 16.3;
auto a3 = metersPerLightyear * 1;
writeln("Total distance in lightyears: ",
    (a1 - a2 + a3) / metersPerLightyear);
auto b1 = 15.3;
auto b2 = 16.3;
auto b3 = 1;
writeln("Total distance in lightyears: ", (b1 - b2 + b3));

Regarding expressiveness, it is quite clear that there are features simply missing: working in Celsius vs. Fahrenheit vs. Kelvin, allowing the user to define and use their own units, allowing the user to define units with runtime multipliers (monetary) etc. There's always a need to stop somewhere as the list could go on forever, but I think the current submission stops a bit too early.

If you believe that the library is good as it is, that's definitely fine. Don't forget, however, that a good part of the review's purpose is to improve the library, not to defend its initial design and implementation. A submitter who is willing to go with the library as-is although there are beneficial suggested improvements (and that refers to everything including e.g. documentation) may be less likely to maintain the library in the future. At least that's my perception. In contrast, I'm quite hopeful Jonathan will follow through with std.datetime because he has been willing to act on all sensible feedback.

Andrei
Jan 05 2011
"BCS" <bcs not-here.com> writes:
In conclusion (yes I know this normally goes at the bottom) I think we are
wanting different and contradictory things from this library.

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 On 1/5/11 10:32 AM, BCS wrote:
 Andrei Alexandrescu<SeeWebsiteForEmail erdani.org>  wrote:

 Ah. I see what you are getting at. OTOH I'm still not convinced it's any
better.

 A quick check shows that 1 light year = 9.4605284 × 10^25
 angstroms. A mere 25 orders of magnitude differences, where IEEE754
 doubles have a range of 307 orders of magnitude. As to the issue of
 where to do the conversions: I suspect that the majority of
 computation will be between unit carrying types (particularly if the
 library is used the way I'm intending it to be) and as such, I expect
 that both performance and precision will benefit from having a
 unified internal representation.

People might want to use float for compactness, which has range 1e38 or so. But that's not necessarily the largest issue (see below).
 There /might/ be reason to have a very limited set of scaling factors
 (e.g. atomic scale, human scale, astro scale) and define each of the
 other units from one of them. but then you run into issues of what to
 do when you do computations that involve more than one (for example,
 computing the resolution of an X-ray telescope involves all three
 scales).

There are two issues apart from scale. One, creeping errors due to conversions. Someone working in miles would not like that after a few calculations that look integral they get 67.9999998 miles. Second, let's not forget the cost of implicit conversions to and from. The way I see it, forcing an internal unit for representation has definite issues that reduce its potential applicability.

IMHO both of these are somewhat synthetic, that is, they aren't significant issues in the real world. For the first, anyone who expects FP to give exact answers needs to learn more about FP. If you need exact answers, use an integer or rational type as your base type. As for the second point about perf: the usage mode I designed for will only perform conversions in I/O operations. Values are converted to unit-bearing types at the first opportunity and remain there until the last possible moment. As such I expect that any operation that is doing more than a handful of conversions will be I/O bound, not compute bound.
 When I started writing the library, I looked at these issue just
 enough that I knew sorting it wasn't going to be a fun project. So,
 rather than hash out these issue my self, I copied as much as I could
 from the best units handling tool I know of: MathCAD. As best I can
 tell, it uses the same setup I am.

I don't know MathCAD, but as far as I understand that's a system, not a library, and as such might have a slightly different charter. In terms of charter Boost units (http://www.boost.org/doc/libs/1_38_0/doc/html/boost_units.html) is the closest library to this. I haven't looked at it for a while, but indeed it does address the issue of scale as I suggested: it allows people to store numbers in their own units instead of forcing a specific unit. In fact the library makes it a point to distinguish itself from an "SI" library as early as its second page: "While this library attempts to make simple dimensional computations easy to code, it is in no way tied to any particular unit system (SI or otherwise). Instead, it provides a highly flexible compile-time system for dimensional analysis, supporting arbitrary collections of base dimensions, rational powers of units, and explicit quantity conversions. It accomplishes all of this via template metaprogramming techniques." Like it or not, Boost units will be the yardstick against which anything like it in D will be compared. I hope that D being a superior language it will make it considerably easier to implement anything metaprogramming-heavy.

Reiterating my prior point, what I'm interested in developing is a library that handles physical units, of which there is a known, finite set of base units plus the derived units. You might be able to talk me into doing statically scaled units as distinct types, but I'm not at all interested in allowing an arbitrary number of base units or in treating physically equivalent units (e.g. feet and meters) as different base units. My unwillingness to go there is because I see very little value in doing a little of that (what other dimensions can be added that act like the SI base units?) and enormous cost in doing more of it (if you allow dimensions to act differently: how? in what ways? where do you stop?).
 The crux of the matter is that Radians and Degrees should be distinct
 types, and that a conversion should be defined taking one to the other.
 How can we express that in the current library, or what could be added
 to it to make that possible?

I don't think there /is/ a good solution to that problem because many of the computations that result in radians naturally give scalar values (arc-length/radius). As a result, the type system has no way to determine what the correct type for the expression is without the user forcing a cast or the like.


That's what I was referring to by "and the like". :)
 void computeFiringSolution(Radians angle)
 {
      auto s = sin(angle.value);
      ...
      auto newAngle = Radians(arcsin(s));
 }

The way I would like that code to look would be:

void computeFiringSolution(Radians angle)
{
    auto s = angle.sin(); // only exists for Radians (and Scalar)
    ...
    auto newAngle = std.units.arcsin(s); // returns Radians
    static assert(is(typeof(newAngle) : Radians));
}
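
A rough sketch of how that could be wired up, using free functions and D's uniform function call syntax. The struct definitions and the module-level `sin` and `arcsin` here are illustrative assumptions; no std.units module exists, and the Scalar case is left out.

```d
import std.math : PI, isClose;
static import std.math;

struct Radians { double value; }
struct Meters  { double value; }

/// Defined for Radians only; UFCS lets callers write `angle.sin()`.
double sin(Radians a) { return std.math.sin(a.value); }

/// Inverse sine that hands back a typed angle (hypothetical name).
Radians arcsin(double s) { return Radians(std.math.asin(s)); }

unittest
{
    auto angle = Radians(PI / 6);
    auto s = angle.sin();
    assert(isClose(s, 0.5));

    auto newAngle = arcsin(s);
    static assert(is(typeof(newAngle) == Radians));

    // No sin overload accepts a length, so this is rejected at compile time.
    static assert(!__traits(compiles, Meters(1.0).sin()));
}
```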
 Much of the point of using units is that there is a good amount of being 
 explicit in their handling.

The objective I was going for is that you are explicit at the edges (converting to and from other types) and ignore it in the middle.
 The user knows that sin takes a double which 
 is meant in Radians.

I'd rather the user know they can take the sin of something that is an angle and not worry about the units.
 Her program encodes that assumption in a type, but 
 is also free to simply fetch the value when using the untyped primitives.

The way I wrote it, accessing the value directly is much the same as using a reinterpret_cast; a blunt hack. Rather than doing that, the user is forced (by design) to explicitly state what unit the value should be returned in or what it is being provided as.
 If angles are treated as an alias for scalar then the conversion to
 degrees can be handled in a reasonable way (but that would also allow
 converting any scalar value to degrees). I again punted on this one
 because people who have put more time than I have available (MathCAD
 again) couldn't come up with anything better.

ArcDegrees and Radians would be two distinct types. You wouldn't be able to add ArcDegrees to Radians without explicitly stating where you want to be:

 ArcDegrees a1;
 Radians a2;
 auto a = a1 + a2; // error!

That expression is something I explicitly want to be valid (thus the reason the type aliases are named Length, Mass, Time, ... rather than Meter, Kilogram, Second, ...). They are both a measure of angle so should be addable. One of the fundamental requirements of the library is that things that measure the same property can be used interchangeably. This ability is a very large part of the reason that I wrote the library in the first place and I have no interest in continuing without it.
 auto b = a1 + ArcDegrees(a2); // fine, b is stored in ArcDegrees
 auto c = Radians(a1) + a2;    // fine, c is stored in Radians
 The same goes about Kilometers and Miles:
 Kilometers d1;
 Miles d2;
 ...
 auto a = d1 + d2; // error!
 auto b = d1 + Kilometers(d2); // fine, b is stored in Kilometers
 auto c = Miles(d1) + d2;    // fine, c is stored in Miles

Ditto the same as above.
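
A minimal sketch of the distinct-types approach quoted above, assuming hypothetical Kilometers and Miles structs that are not taken from si2.d or from any proposed std.units API. Mixed addition is rejected at compile time, and the constructor call names the unit the result is stored in.

```d
import std.stdio : writeln;

// Kilometers per international mile (exact by definition).
enum double kmPerMile = 1.609344;

/// Hypothetical distance type whose number is stored in kilometers.
struct Kilometers
{
    double value;

    this(double v) { value = v; }

    /// Explicit conversion: the caller asks for kilometers by name.
    this(Miles m) { value = m.value * kmPerMile; }

    /// Addition is defined only between two Kilometers values.
    Kilometers opBinary(string op : "+")(Kilometers rhs) const
    {
        return Kilometers(value + rhs.value);
    }
}

/// Hypothetical distance type whose number is stored in miles.
struct Miles
{
    double value;

    this(double v) { value = v; }

    /// Explicit conversion: the caller asks for miles by name.
    this(Kilometers k) { value = k.value / kmPerMile; }

    /// Addition is defined only between two Miles values.
    Miles opBinary(string op : "+")(Miles rhs) const
    {
        return Miles(value + rhs.value);
    }
}

void main()
{
    auto d1 = Kilometers(10.0);
    auto d2 = Miles(5.0);

    // Mixing the two types directly does not compile.
    static assert(!__traits(compiles, d1 + d2));

    auto b = d1 + Kilometers(d2); // stored in kilometers
    auto c = Miles(d1) + d2;      // stored in miles

    writeln(b.value, " km");
    writeln(c.value, " mi");
}
```

In this sketch no value is ever rescaled unless the user writes a conversion, which is the property the scaled-units side of the discussion is asking for.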
 Again, that sounds to me like what the library does. All distance units
 are of the same type and internally are encoded as meters. The rest of
 the units are converted on access.


The issue I see is that the choice of a non-unified format will be problematic. Unless you can show examples (e.g. benchmarks, etc.) of where the current solution has precision or performance problems or where its expressive power is inadequate, I will remain reluctant to change it.

multiplier shouldn't be necessary as the problem is obvious. Here's an example that took me a couple of minutes to produce:

immutable real metersPerLightyear = 9.4605284e15;
auto a1 = metersPerLightyear * 15.3;
auto a2 = metersPerLightyear * 16.3;
auto a3 = metersPerLightyear * 1;
writeln("Total distance in lightyears: ", (a1 - a2 + a3) / metersPerLightyear);

auto b1 = 15.3;
auto b2 = 16.3;
auto b3 = 1;
writeln("Total distance in lightyears: ", (b1 - b2 + b3));

I wasn't asking for cases where values come out unequal but where they come out unusable.
 Regarding expressiveness, it is quite clear that there are features 
 simply missing: working in Celsius vs. Fahrenheit vs. Kelvin,

I'll grant I don't have Celsius and Fahrenheit but they are very special cases as they have non zero origins. OTOH it will give differences in Fahrenheit via the Rankine scale.
 allowing 
 the user to define and use their own units, allowing the user to define 
 units with runtime multipliers (monetary) etc. There's always a need to 
 stop somewhere as the list could go on forever,

Agreed
 but I think the current submission stops a bit too early.

I think that the current point is the only logical point (in that any other is just arbitrary).
 If you believe that the library is good as it is, that's definitely 
 fine. Don't forget, however, that a good part of the review's purpose is 
 to improve the library, not to defend its initial design and 
 implementation. A submitter who is willing to go with the library as-is 
 although there are beneficial suggested improvements (and that refers to 
 everything including e.g. documentation) may be less likely to maintain 
 the library in the future. At least that's my perception.

To be clear, many of your points are very relevant and I omitted commenting on them because a long list of Yup, yup, yup, ... is just noise. Also, given that we are thrashing out one or two fundamental points about it, I think those "lesser" issues can wait.
 In contrast, 
 I'm quite hopeful Jonathan will follow through with std.datetime because 
 he has been willing to act on all sensible feedback.
 Andrei

Jan 05 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/5/11 4:06 PM, BCS wrote:
 In conclusion (yes I know this normally goes at the bottom) I think
 we are wanting different and contradictory things from this
 library.

I almost didn't read the rest thinking that that's all you inserted. All: there's more, scroll down! One additional practical matter:
 The way I would like that code to look would be:

 void computeFiringSolution(Radians angle)
 {
      auto s = angle.sin(); // only exists for Radians (and Scalar)
      ...
      auto newAngle = std.units.arcsin(s);  // returns Radians
      static assert(is(typeof(newAngle) : Radians));
 }

This is nice in theory but would have you essentially wrap by hand an unbounded number of functions. And for what? So they write angle.sin() instead of sin(angle.value). I appreciate the additional theoretical safety, but I don't see how that benefit compensates the cost. I want a practical library that allows me to work with libraries designed outside of it.

Anyway, let's not forget that at the end of the day my opinion is one opinion and my vote is one vote. For the record, my vote is against the library in its current form for the following reasons:

(a) Poor documentation
(b) Limited expressiveness
(c) Numeric issues as I described (and no amount of rhetoric will set that straight; FWIW given the obvious question of scaling you need to prove it works, not me to prove it doesn't)
(d) Unrealized potential (if we approve this, backward compatibility will prevent more comprehensive libraries having the same aim but a different design). This argument is to be taken with a grain of salt as in general it can be easily abused. What I'm saying is that once this library is in we may as well forget about scaled units a la boost units (which are the kind I'd want to use).

Going from here I see a few possibilities.

1. Other people deem the library adequate as it is and it gets voted in;
2. You and somebody else agree to work together on this submission;
3. You agree to pass your work to someone who will continue to work towards a submission;
4. The library is not made part of Phobos but remains of course available as a third-party library.

Andrei
Jan 05 2011
next sibling parent reply "BCS" <bcs not-here.com> writes:
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 On 1/5/11 4:06 PM, BCS wrote:
 In conclusion (yes I know this normally goes at the bottom) I think
 we are wanting different and contradictory things from this
 library.

All: there's more, scroll down!

Oops. :o|
 One additional practical matter:
 The way I would like that code to look would be:

 void computeFiringSolution(Radians angle)
 {
      auto s = angle.sin(); // only exists for Radians (and Scalar)
      ...
      auto newAngle = std.units.arcsin(s);  // returns Radians
      static assert(is(typeof(newAngle) : Radians));
 }

unbounded number of functions. And for what? So they write angle.sin() instead of sin(angle.value).

There are very few general functions I know of that take units other than scalars. As a result, I would expect that allowing scalars to implicitly convert to floating point would cover most of those. And for the rest (sin, etc.) there are few enough that adding them to the type may be practical. For the non-general functions that do take non-scalar values, I would think explicitly asking for the value as a given unit (as the library currently does) would be the better choice rather than having to convert it to the related type for the unit and then asking for the value:

FnTakingMeters(Meter(length).value); // that could be redundant if length is already meters...
FnTakingMeters(length.value); // but are you sure length is already in meters?

vs.

FnTakingMeters(length.meter); // gives length in meters.
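
A minimal sketch of the unified-storage approach described here, assuming a hypothetical Length struct (not the actual si2.d code) that always stores meters internally and hands out the raw number only through unit-named accessors. FnTakingMeters below is a stand-in for an external function that expects a plain number in meters.

```d
import std.stdio : writeln;

/// Hypothetical length type: always stored in meters, converted on access.
struct Length
{
    private double meters_;

    /// Named constructors state the unit of the incoming number.
    static Length fromMeters(double v) { return Length(v); }
    static Length fromFeet(double v) { return Length(v * 0.3048); }

    /// Accessors state the unit of the outgoing number.
    @property double meter() const { return meters_; }
    @property double feet() const { return meters_ / 0.3048; }

    /// Any two lengths can be added, whatever unit they were entered in.
    Length opBinary(string op : "+")(Length rhs) const
    {
        return Length(meters_ + rhs.meters_);
    }
}

/// Stand-in for a library function designed without any units type.
double FnTakingMeters(double m) { return m * 2.0; }

void main()
{
    // Inputs arrive in a mix of units; the sum needs no explicit conversion.
    auto length = Length.fromFeet(10.0) + Length.fromMeters(1.0);

    // The unit is stated only at the edge, when leaving the type system.
    writeln(FnTakingMeters(length.meter));
    writeln(length.feet);
}
```

There is deliberately no unit-less value accessor in this sketch, so every exit from the type names the unit being requested.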
 I appreciate the additional theoretical 
 safety, but I don't see how that benefit compensates the cost. I want a 
 practical library that allows me to work with libraries designed outside 
 of it.

I agree on what but I'm not sure on how.
 Anyway, let's not forget that at the end of the day my opinion is one 
 opinion and my vote is one vote. For the record, my vote is against the 
 library in its current form for the following reasons:
 (a) Poor documentation

Um, Yeah. :o)
 (b) Limited expressiveness

In which way? Adding arbitrary base units? Things like Dynamic conversion rates?
 (c) Numeric issues as I described (and no amount of rhetoric will set 
 that straight; FWIW given the obvious question of scaling you need to 
 prove it works, not me to prove it doesn't)
 (d) Unrealized potential (if we approve this, backward compatibility 
 will prevent more comprehensive libraries having the same aim but a 
 different design). This argument is to be taken with a grain of salt as 
 in general it can be easily abused. What I'm saying is that once this 
 library is in we may as well forget about scaled units a la boost units 
 (which are the kind I'd want to use).

We have both said our piece on these, what do others think? I'd be particularly interested in what Don has to say on the numeric issues. Does an extra layer or two of FP rounding really matter?
Jan 05 2011
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
Jonathan M Davis wrote:
 On Wednesday, January 05, 2011 15:40:37 BCS wrote:
 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 (c) Numeric issues as I described (and no amount of rhetoric will set
 that straight; FWIW given the obvious question of scaling you need to
 prove it works, not me to prove it doesn't)
 (d) Unrealized potential (if we approve this, backward compatibility
 will prevent more comprehensive libraries having the same aim but a
 different design). This argument is to be taken with a grain of salt as
 in general it can be easily abused. What I'm saying is that once this
 library is in we may as well forget about scaled units a la boost units
 (which are the kind I'd want to use).

particularly interested in what Don has to say on the numeric issues. Does an extra layer or two of FP rounding really matter?

Personally, I tend to cringe when I see much in the way of floating points in anything that needs precision, but it's not like you can avoid it in this case. Regardless, I agree with pretty much everything that Andrei has said. I particularly don't like that the values are all in meters internally - _especially_ when dealing with floating point values. I'd be very worried about precision issues. The Boost solution seems like a solid one to me. However, I'm not likely the sort of person who's going to be using a unit library very often. I just don't deal with code that cares about that sort of thing very often.

One thing we learned in engineering school was to never do premature rounding. Always defer such to the final calculation. My experience with "round trip" rounding, where X=>Y=>X, is that the result "drifts". You can see it with the mouse sometimes, as it can slowly drift to one corner of the screen.
Jan 05 2011
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/5/11 9:55 PM, BCS wrote:
 After a little more thinking I'm wondering if I'm targeting a
 different use case than other people are thinking about.

 The case I'm designing for, is where you have a relatively small
 number of inputs (that may be in a mishmash of units and systems), a
 relatively large number of computations and a relatively small number
 of outputs. The systems that Andrei is arguing for may be more
 desirable if there is relatively less computation (thus less
 internal rounding) or if all or most of the inputs are in a
 consistent system of units (resulting in very few necessary
 conversions).

 I'm primarily interested in the first use case because it is the kind
 of problem I have dealt with the most (particularly the mishmash of
 units bit) and for that, the two proposals are almost equivalent from
 a perf and accuracy standpoint because each should convert the inputs
 to a consistent system, do all the math in it, and then convert to
 the output units (I'm not even assuming the outputs form a consistent
 system). The only difference is that the current arrangement picks
 the consistent system for you where the alternative allows (and
 forces) you to select it.

I think this all is sensible. What I like about Boost units is that they didn't define SI units; they defined a framework in which units can be defined (and indeed "si" is a sub-namespace inside units that has no special rights).

This review conclusion is a very good read: http://lists.boost.org/boost-announce/2007/04/0126.php

I recommend that everyone read the entire review thread to get an idea of the scope and sophistication of the Boost review process. It has tremendously increased the quality of Boost libraries. We need to get there.

Andrei
Jan 06 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, January 05, 2011 15:40:37 BCS wrote:
 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 (c) Numeric issues as I described (and no amount of rhetoric will set
 that straight; FWIW given the obvious question of scaling you need to
 prove it works, not me to prove it doesn't)
 (d) Unrealized potential (if we approve this, backward compatibility
 will prevent more comprehensive libraries having the same aim but a
 different design). This argument is to be taken with a grain of salt as
 in general it can be easily abused. What I'm saying is that once this
 library is in we may as well forget about scaled units a la boost units
 (which are the kind I'd want to use).

We have both said our piece on these, what do others think? I'd be particularly interested in what Don has to say on the numeric issues. Does an extra layer or two of FP rounding really matter?

Personally, I tend to cringe when I see much in the way of floating points in anything that needs precision, but it's not like you can avoid it in this case. Regardless, I agree with pretty much everything that Andrei has said. I particularly don't like that the values are all in meters internally - _especially_ when dealing with floating point values. I'd be very worried about precision issues. The Boost solution seems like a solid one to me. However, I'm not likely the sort of person who's going to be using a unit library very often. I just don't deal with code that cares about that sort of thing very often.

- Jonathan M Davis
Jan 05 2011
prev sibling parent "BCS" <bcs not-here.com> writes:
Jonathan M Davis <jmdavisProg gmx.com> wrote:
 On Wednesday, January 05, 2011 15:40:37 BCS wrote:
 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 (c) Numeric issues as I described (and no amount of rhetoric will set
 that straight; FWIW given the obvious question of scaling you need to
 prove it works, not me to prove it doesn't)
 (d) Unrealized potential (if we approve this, backward compatibility
 will prevent more comprehensive libraries having the same aim but a
 different design). This argument is to be taken with a grain of salt as
 in general it can be easily abused. What I'm saying is that once this
 library is in we may as well forget about scaled units a la boost units
 (which are the kind I'd want to use).

We have both said our piece on these, what do others think? I'd be particularly interested in what Don has to say on the numeric issues. Does an extra layer or two of FP rounding really matter?

Personally, I tend to cringe when I see much in the way of floating points in anything that needs precision, but it's not like you can avoid it in this case. Regardless, I agree with pretty much everything that Andrei has said. I particularly don't like that the values are all in meters internally - _especially_ when dealing with floating point values. I'd be very worried about precision issues. The Boost solution seems like a solid one to me. However, I'm not likely the sort of person who's going to be using a unit library very often. I just don't deal with code that cares about that sort of thing very often.

After a little more thinking I'm wondering if I'm targeting a different use case than other people are thinking about.

The case I'm designing for is where you have a relatively small number of inputs (that may be in a mishmash of units and systems), a relatively large number of computations and a relatively small number of outputs. The systems that Andrei is arguing for may be more desirable if there is relatively less computation (thus less internal rounding) or if all or most of the inputs are in a consistent system of units (resulting in very few necessary conversions).

I'm primarily interested in the first use case because it is the kind of problem I have dealt with the most (particularly the mishmash of units bit) and for that, the two proposals are almost equivalent from a perf and accuracy standpoint because each should convert the inputs to a consistent system, do all the math in it, and then convert to the output units (I'm not even assuming the outputs form a consistent system). The only difference is that the current arrangement picks the consistent system for you where the alternative allows (and forces) you to select it.
Jan 05 2011