digitalmars.D - RFC: SI Units facility for Phobos

Andrei Alexandrescu (64/64) Jan 01 2011 Benjamin Shropshire wrote an SI Units implementation which he refers to

bearophile (5/7) Jan 01 2011 Frink is a little language that uses units of measure a lot:
Simon (8/13) Jan 01 2011 Nope. Where I work we've had complaints before from American customers

Andrei Alexandrescu (3/13) Jan 01 2011 You sure can!

BCS (22/115) Jan 04 2011 Why would that be an improvement? The current system encode as length me...

Andrei Alexandrescu (23/61) Jan 04 2011 It would be an improvement because there wouldn't be the need to

BCS (8/57) Jan 05 2011 Ah. I see what you are getting at. OTOH I'm still not convinced it's any...

Andrei Alexandrescu (83/128) Jan 05 2011 People might want to use float for compactness, which has range 1e38 or

BCS (23/158) Jan 05 2011 IMHO both of these are somewhat synthetic, that is they aren't significa...

Andrei Alexandrescu (32/43) Jan 05 2011 I almost didn't read the rest thinking that that's all you inserted.

BCS (12/49) Jan 05 2011 There are very few general functions I know of that take units other tha...

Jonathan M Davis (10/24) Jan 05 2011 Personally, I tend to cringe when I see much in the way of floating poin...

Walter Bright (6/29) Jan 05 2011 One thing we learned in engineering school was to never do premature rou...
BCS (4/27) Jan 05 2011 After a little more thinking I'm wondering if I'm targeting a different ...

Andrei Alexandrescu (11/30) Jan 06 2011 I think this all is sensible. What I like about Boost units is that they...

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Benjamin Shropshire wrote an SI Units implementation which he refers to 
here:

http://d.puremagic.com/issues/show_bug.cgi?id=3725

For convenience, here are direct links to the code:

http://www.dsource.org/projects/scrapple/browser/trunk/units/si2.d
http://www.dsource.org/projects/scrapple/browser/trunk/units/rational.d

This cannot be considered a reviewable proposal yet because it doesn't 
have sufficient documentation and is essentially in the stage where 
Benjamin is asking for input on improving the design and implementation.

Below are my comments in response to Benjamin's RFC. In brief, the 
design is a bit off from what I hope to have in a units library, but 
things can be fixed without too much difficulty by shuffling things around.

* Call the library std.units and generally focus on units, not on SI units.

* I suggest doing away with abstract unit names ("Distance", "Time", 
"Mass" etc.) and use concrete plural units ("Meters", "Seconds", 
"Kilograms" etc) instead. I agree that at a level operating with the 
abstract names seems to be more pure, but at a concrete level you need 
to have various reference points. For example, a molecular physics 
program would want to operate with Angstroms, which should be a distinct 
type from Meters.

* There should be ways to define scalars of distinct types and 
relationships between them. For example, "Radians" and "Degrees" should 
be distinct types, although both are scalar.

* The previous points bring me to an important design artifact: each and 
every unit should have a multiplier (constant, template argument) that 
describes its relationship to the SI corresponding entity. The SI units 
themselves will have a 1.0 multiplier, and e.g. Angstroms has a 1e-10 
multiplier. The current library has a facility for that, but I think 
that's not as good.

* In the proposed design the user can define a lot of distinct types, 
such as Miles, Yards, and Lbs, which are strictly unnecessary 
(Kilometers, Meters, and Kilograms could be used instead, with 
appropriate I/O conversions to and from other units). I think offering 
scale-less units chosen by the user is a good thing as long as there is 
a unified mechanism for converting between those units without risking 
confusion and bugs.

* There should be no implicit conversion to double and generally few 
conversion smarts. The units should have a writable public field "value".

* There should also be a property called siValue which yields the value, 
converted to SI, of type double. For an Angstroms, siValue returns value 
* 1e-10.

(Step-by-step on the code:)

* The code should use TypeTuple instead of T.

* I think FullSI should be always in effect. Even though many users 
don't care for lumens and moles, they can just sit there defaulted at 
the very end and shouldn't be bothersome.

* Each artifact (extra, extra2, Batch...) should be documented.

* I'm not sure about the use of fractional exponents. They add a fair 
amount of complication. Could we dump them or use a simple fixed-point 
scheme to accommodate them?

* The naming convention should consistently use NamesLikeThis for types 
and namesLikeThis for values (including constants).

* A scheme similar to std.conv.to should serve as an all-around 
converter, e.g. to!Kilometers(Miles(10)) should yield a value of type 
Kilometers that contains 16.09 or whatever.

* All operators should be converted to D2 (yet another success story of 
the new design :o)).

* Packages of remarkable constants would be nice to have, of course in 
the appropriate units. The fields of astronomy, classical/relativistic 
mechanics, electromagnetism, molecular physics, quantum mechanics, come 
to mind.

All - please add your comments. Benjamin, I hope you're as enthusiastic 
as always about pushing this into Phobos!


Andrei

Jan 01 2011

bearophile <bearophileHUGS lycos.com> writes:

Andrei:

 Benjamin Shropshire wrote an SI Units implementation which he refers to 
 here:

Frink is a little language that uses units of measure a lot:
http://futureboy.us/frinkdocs/

Bye,
bearophile

Jan 01 2011

Simon <s.d.hammett gmail.com> writes:

On 01/01/2011 20:24, Andrei Alexandrescu wrote:

 * In the proposed design the user can define a lot of distinct types,
 such as Miles, Yards, and Lbs, which are strictly unnecessary
 (Kilometers, Meters, and Kilograms could be used instead, with
 appropriate I/O conversions to and from other units).

Nope. Where I work we've had complaints before from American customers 
where we've done calcs in SI and then converted to Imperial.

For engineering it's important to be able to do your calcs in the 
correct unit system.

-- 
My enormous talent is exceeded only by my outrageous laziness.
http://www.ssTk.co.uk

Jan 01 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/1/11 3:56 PM, Simon wrote:
 On 01/01/2011 20:24, Andrei Alexandrescu wrote:

 * In the proposed design the user can define a lot of distinct types,
 such as Miles, Yards, and Lbs, which are strictly unnecessary
 (Kilometers, Meters, and Kilograms could be used instead, with
 appropriate I/O conversions to and from other units).

 Nope. Where I work we've had complaints before from American customers
 where we've done calcs in SI and then converted to Imperial.

 For engineering it's important to be able to do your calcs in the
 correct unit system.

You sure can!

Andrei

Jan 01 2011

BCS <anon anon.com> writes:

Hello Andrei,

 Benjamin Shropshire wrote an SI Units implementation which he refers
 to here:
 
 http://d.puremagic.com/issues/show_bug.cgi?id=3725
 
 For convenience, here are direct links to the code:
 
 http://www.dsource.org/projects/scrapple/browser/trunk/units/si2.d
 http://www.dsource.org/projects/scrapple/browser/trunk/units/rational.d
 
 This cannot be considered a reviewable proposal yet because it doesn't
 have sufficient documentation and is essentially in the stage where
 Benjamin is asking for input on improving the design and
 implementation.
 
 Below are my comments in response to Benjamin's RFC. In brief, the
 design is a bit off from what I hope to have an a units library, but
 things can be fixed without too much difficulty by shuffling things
 around.
 
 * Call the library std.units and generally focus on units, not on SI
 units.
 
 * I suggest doing away with abstract unit names ("Distance", "Time",
 "Mass" etc.) and use concrete plural units ("Meters", "Seconds",
 "Kilograms" etc) instead. I agree that at a level operating with the
 abstract names seems to be more pure, but at a concrete level you need
 to have various reference points. For example, a molecular physics
 program would want to operate with Angstroms, which should be a
 distinct type from Meters.

Why would that be an improvement? The current system encodes all length 
measurements as meters but allows you to work with other units by converting 
at the point where you convert a FP type to a unit-bearing type. The issue I 
see with making different units of length different types is that there is an 
unbounded set of those and I don't see any reasonable way to allow encoding 
the conversion structures for them.

If someone else is able to make such a library that is as clean as this one, 
I'd not stand in its way, but I have no interest in writing such a beast.

 
 * There should be ways to define scalars of distinct types and
 relationships between them. For example, "Radians" and "Degrees"
 should be distinct types, although both are scalar.

Ditto my comments for non-scalars, on a smaller scale.

 
 * The previous points bring me to an important design artifact: each
 and every unit should have a multiplier (constant, template argument)
 that describes its relationship to the SI corresponding entity. The SI
 units themselves will have a 1.0 multiplier, and e.g. Angstroms has a
 1e10 multiplier. The current library has a facility for that, but I
 think that's not as good.

That sounds to me like what the library has so I must not be understanding 
what you are asking for. Could you elaborate? 

 
 * In the proposed design the user can define a lot of distinct types,
 such as Miles, Yards, and Lbs, which are strictly unnecessary
 (Kilometers, Meters, and Kilograms could be used instead, with
 appropriate I/O conversions to and from other units). I think offering
 scale-less units chosen by the user is a good thing as long as there
 is a unified mechanism for converting between those units without
 risking confusion and bugs.

Again, that sounds to me like what the library does. All distance units are 
of the same type and internally are encoded as meters. The rest of the units 
are converted on access.

 
 * There should be no implicit conversion to double and generally few
 conversion smarts. The units should have a writable public field
 "value".
 
 * There should also be a property called siValue which yields the
 value, converted to SI, of type double. For an Angstroms, siValue
 returns value * 1e-10.
 
 (Step-by-step on the code:)
 
 * The code should use TypeTyple instead of T.
 
 * I think FullSI should be always in effect. Even though many users
 don't care for lumens and moles, they can just sit there defaulted at
 the very end and shouldn't be bothersome.

That's my soap box protest to SI's (IMHO stupid) inclusion of them as base 
units. :)

 
 * Each artifact (extra, extra2, Batch...) should be documented.

Um, Yah. :o)

 
 * I'm not sure about the use of fractional exponents. They add a fair
 amount of complication. Could we dump them or use a simple fixed-point
 scheme to accommodate them?
 

The only unit I know for sure has fractional exponents is in fracture mechanics 
(kPa*m^0.5) but if you allow anything beyond that, any fixed-point scheme 
I can think of would fall over right away (X^1/3?).
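
To make the exponent question concrete, here is a minimal sketch of exact rational exponents next to a binary fixed-point encoding. The Ratio type is hypothetical and written only for illustration; it is not the code in rational.d, and the 16-bit fixed-point layout is likewise an assumption.

```d
import std.numeric : gcd;

// Hypothetical exponent type: an exact fraction kept in lowest terms.
struct Ratio
{
    int num;
    int den = 1;

    this(int n, int d = 1)
    {
        assert(d != 0);
        if (d < 0) { n = -n; d = -d; }          // keep the sign in the numerator
        immutable g = gcd(n < 0 ? -n : n, d);   // d > 0, so g is never 0
        num = n / g;
        den = d / g;
    }

    // Multiplying two quantities adds the exponents of each dimension.
    Ratio opBinary(string op : "+")(Ratio r) const
    {
        return Ratio(num * r.den + r.num * den, den * r.den);
    }
}

// Assumed fixed-point layout: 16 fractional bits.
enum int fixedOne   = 1 << 16;
enum int fixedThird = fixedOne / 3;   // 1/3 truncated to 16 fractional bits

void main()
{
    // m^(1/2) * m^(1/2) is exactly m^1.
    assert(Ratio(1, 2) + Ratio(1, 2) == Ratio(1));
    // Three cube roots multiply back to the first power, exactly.
    assert(Ratio(1, 3) + Ratio(1, 3) + Ratio(1, 3) == Ratio(1));
    // The fixed-point thirds do not add back up to one.
    assert(fixedThird * 3 != fixedOne);
}
```

The rational form costs a gcd per operation, but all of that happens at compile time if the exponents are template arguments.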

 * The naming convention should consistently use NamesLikeThis for
 types and namesLikeThis for values (including constants).
 
 * A scheme similar to std.conv.to should serve as an all-around
 converter, e.g. to!Kilometers(Miles(10)) should yield a value of type
 Kilometers that contains 16.05 or whatever.
 
 * All operators should be converted to D2 (yet another success story
 of the new design :o)).
 
 * Packages of remarkable constants would be nice to have, of course in
 the appropriate units. The fields of astronomy, classical/relativistic
 mechanics, electromagnetism, molecular physics, quantum mechanics,
 come to mind.
 
 All - please add your comments. Benjamin, I hope you're as
 enthusiastic as always about pushing this into Phobos!
 

I am.

 Andrei

Jan 04 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/5/11 12:26 AM, BCS wrote:
 * I suggest doing away with abstract unit names ("Distance", "Time",
 "Mass" etc.) and use concrete plural units ("Meters", "Seconds",
 "Kilograms" etc) instead. I agree that at a level operating with the
 abstract names seems to be more pure, but at a concrete level you need
 to have various reference points. For example, a molecular physics
 program would want to operate with Angstroms, which should be a
 distinct type from Meters.

 Why would that be an improvement? The current system encode as length
 measurements as meters but allows you to work with other units by
 converting at the point that you convert a FP type to a united type. The
 issue I see with making different units of length different types is
 that there is an unbounded set of those and I don't see any reasonable
 way to allow encoding the conversion structures for them.

 If someone else is able to make such a library that is as clean as this
 one, I'd not stand in its way, but I have no interest in writing such a
 beast.

It would be an improvement because there wouldn't be the need to 
multiply with a bias every time a value is assigned, with the 
corresponding loss in speed and precision.

To exemplify, say a program wants to work in Angstroms. As all distances 
are stored in meters, ultimately all values stored and operated on would 
be very small, which adversely affects precision. At the other end of 
the scale, an astronomy program would want to work with light-years, 
which would force storage of large values as meters.

To solve this issue, each unit may include a static multiplier that 
converts it to SI (e.g. meter), while at the same time allowing to store 
and operate directly on the unit of choice. So a program may actually 
store 10 Angstroms as the number 10, or 10 light-years as the number 10.
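
A minimal sketch of that idea follows, assuming a hypothetical Length template. None of these names come from the submitted library, and the light-year factor is simply the figure quoted elsewhere in this thread.

```d
import std.stdio;

// Hypothetical sketch: a length that stores its value in its own unit and
// carries a compile-time factor relating that unit to meters.
struct Length(real toMeters)
{
    double value;   // stored in the unit itself, never rescaled on assignment

    // The value expressed in meters; scaling happens only when asked for.
    @property double siValue() const
    {
        return cast(double)(value * toMeters);
    }

    // Same-unit arithmetic never touches the multiplier.
    Length opBinary(string op)(Length rhs) const
        if (op == "+" || op == "-")
    {
        return Length(mixin("value " ~ op ~ " rhs.value"));
    }
}

alias Meters     = Length!(1.0L);
alias Angstroms  = Length!(1e-10L);
alias LightYears = Length!(9.4605284e15L);

void main()
{
    auto bond = Angstroms(10);    // held as the number 10
    auto trip = LightYears(10);   // also held as the number 10
    writeln(bond.value, " ", trip.value);
    writeln(bond.siValue, " m");
    writeln((bond + Angstroms(2.5)).value, " angstroms");
}
```

Because the factor is a template argument, Angstroms and LightYears are distinct types, and adding one to the other fails to compile unless a conversion is written out.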

 * There should be ways to define scalars of distinct types and
 relationships between them. For example, "Radians" and "Degrees"
 should be distinct types, although both are scalar.

 Ditto my comments for non-scalers on a smaller scale.

The crux of the matter is that Radians and Degrees should be distinct 
types, and that a conversion should be defined taking one to the other. 
How can we express that in the current library, or what could be added 
to it to make that possible?

 * The previous points bring me to an important design artifact: each
 and every unit should have a multiplier (constant, template argument)
 that describes its relationship to the SI corresponding entity. The SI
 units themselves will have a 1.0 multiplier, and e.g. Angstroms has a
 1e10 multiplier. The current library has a facility for that, but I
 think that's not as good.

 That sounds to me like what the library has so I must not be
 understanding what you are asking for. Could you elaborate?

I think my comments above clarify this. If not please let me know. In 
brief: one should be able to operate on values that are implicitly 
scaled, which are of distinct types (Angstroms, LightYears, Radians, 
Degrees would be illustrative examples).

 * In the proposed design the user can define a lot of distinct types,
 such as Miles, Yards, and Lbs, which are strictly unnecessary
 (Kilometers, Meters, and Kilograms could be used instead, with
 appropriate I/O conversions to and from other units). I think offering
 scale-less units chosen by the user is a good thing as long as there
 is a unified mechanism for converting between those units without
 risking confusion and bugs.

 Again, that sounds to me like what the library does. All distance units
 are of the same type and internally are encoded as meters, The rest of
 the units are converted on access.

The issue is that the choice of the unified format may be problematic.


Andrei

Jan 04 2011

"BCS" <bcs not-here.com> writes:

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 On 1/5/11 12:26 AM, BCS wrote:
 * I suggest doing away with abstract unit names ("Distance", "Time",
 "Mass" etc.) and use concrete plural units ("Meters", "Seconds",
 "Kilograms" etc) instead. I agree that at a level operating with the
 abstract names seems to be more pure, but at a concrete level you need
 to have various reference points. For example, a molecular physics
 program would want to operate with Angstroms, which should be a
 distinct type from Meters.

 Why would that be an improvement? The current system encode as length
 measurements as meters but allows you to work with other units by
 converting at the point that you convert a FP type to a united type. The
 issue I see with making different units of length different types is
 that there is an unbounded set of those and I don't see any reasonable
 way to allow encoding the conversion structures for them.

 It would be an improvement because there wouldn't be the need to 
 multiply with a bias every time a value is assigned, with the 
 corresponding loss in speed and precision.
 To exemplify, say a program wants to work in Angstroms. As all distances 
 are stored in meters, ultimately all values stored and operated on would 
 be very small, which adversely affects precision. At the other end of 
 the scale, an astronomy program would want to work with light-years, 
 which would force storage of large values as meters.
 To solve this issue, each unit may include a static multiplier that 
 converts it to SI (e.g. meter), while at the same time allowing to store 
 and operate directly on the unit of choice. So a program may actually 
 store 10 Angstroms as the number 10, or 10 light-years as the number 10.

Ah. I see what you are getting at. OTOH I'm still not convinced it's any better.

A quick check shows that 1 light-year = 9.4605284 × 10^25 angstroms. That is a
mere 25 orders of magnitude of difference, where IEEE 754 doubles have a range
of 307 orders of magnitude. As to the issue of where to do the conversions: I
suspect that the majority of computation will be between unit-carrying types
(particularly if the library is used the way I'm intending it to be) and as
such, I expect that both performance and precision will benefit from having a
unified internal representation.

There /might/ be reason to have a very limited set of scaling factors (e.g.
atomic scale, human scale, astro scale) and define each of the other units from
one of them. But then you run into the issue of what to do when a
computation involves more than one (for example, computing the resolution
of an X-ray telescope involves all three scales).

When I started writing the library, I looked at these issues just enough that I
knew sorting them out wasn't going to be a fun project. So, rather than hash out
these issues myself, I copied as much as I could from the best units handling
tool I know of: MathCAD. As best I can tell, it uses the same setup I am.

 * There should be ways to define scalars of distinct types and
 relationships between them. For example, "Radians" and "Degrees"
 should be distinct types, although both are scalar.

 Ditto my comments for non-scalers on a smaller scale.

 The crux of the matter is that Radians and Degrees should be distinct 
 types, and that a conversion should be defined taking one to the other. 
 How can we express that in the current library, or what could be added 
 to it to make that possible?

I don't think there /is/ a good solution to that problem because many of the
computations that result in radians naturally give scalar values
(arc-length/radius). As a result, the type system has no way to determine what
the correct type for the expression is without the user forcing a cast or the
like.

If angles are treated as an alias for scalar then the conversion to degrees can
be handled in a reasonable way (but that would also allow converting any scalar
value to degrees). I again punted on this one because people who have put more
time than I have available (MathCAD again) couldn't come up with anything
better.

 * In the proposed design the user can define a lot of distinct types,
 such as Miles, Yards, and Lbs, which are strictly unnecessary
 (Kilometers, Meters, and Kilograms could be used instead, with
 appropriate I/O conversions to and from other units). I think offering
 scale-less units chosen by the user is a good thing as long as there
 is a unified mechanism for converting between those units without
 risking confusion and bugs.

 Again, that sounds to me like what the library does. All distance units
 are of the same type and internally are encoded as meters, The rest of
 the units are converted on access.

 The issue is that the choice of the unified format may be problematic.

The issue I see is that the choice of a non-unified format will be problematic.
Unless you can show examples (e.g. benchmarks, etc.) of where the current
solution has precision or performance problems or where its expressive power
is inadequate, I will remain reluctant to change it.

Jan 05 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/5/11 10:32 AM, BCS wrote:
 Andrei Alexandrescu<SeeWebsiteForEmail erdani.org>  wrote:

[snip]
 Ah. I see what you are getting at. OTOH I'm still not convinced it's
 any better.

 A quick check shows that 1 light years = 9.4605284 × 10^25
 angstroms. A mere 25 orders of magnitude differences, where IEEE754
 doubles have a range of 307 orders of magnitude. As to the issue of
 where to do the conversions: I suspect that the majority of
 computation will be between unit carrying types (particularly if the
 library is used the way I'm intending it to be) and as such, I expect
 that both performance and precision will benefit from having a
 unified internal representation.

People might want to use float for compactness, which has range 1e38 or 
so. But that's not necessarily the largest issue (see below).
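
For reference, a small sketch of the range and precision in question, using D's built-in floating-point properties. The cubic light-year is only an illustrative case of a derived quantity, and the conversion factor is the one quoted in this thread.

```d
import std.stdio;

// Decimal range and guaranteed decimal digits of the two storage types.
static assert(float.max_10_exp == 38 && float.dig == 6);
static assert(double.max_10_exp == 308 && double.dig == 15);

// Figure quoted in this thread; assumed here only for illustration.
enum double metersPerLightYear = 9.4605284e15;

// One cubic light-year expressed in cubic meters.
enum double cubicLightYear =
    metersPerLightYear * metersPerLightYear * metersPerLightYear;

// A derived quantity stored in SI base units can leave float's range
// even though the length itself fits comfortably.
static assert(metersPerLightYear < float.max);
static assert(cubicLightYear > float.max);

void main()
{
    writeln("float:  max ", float.max, ", ", float.dig, " digits");
    writeln("double: max ", double.max, ", ", double.dig, " digits");
    writeln("cubic light-year in m^3: ", cubicLightYear);
}
```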

 There /might/ be reason to have a very limited set of scaling factors
 (e.g. atomic scale, human scale, astro scale) and define each of the
 other units from one of them. but then you run into issues of what to
 do when you do computations that involve more than one (for example,
 computing the resolution of an X-ray telescope involves all three
 scales).

There are two issues apart from scale. One, creeping errors due to 
conversions. Someone working in miles would not like that after a few 
calculations that look integral they get 67.9999998 miles. Second, let's 
not forget the cost of implicit conversions to and from. The way I see 
it, forcing an internal unit for representation has definite issues that 
reduce its potential applicability.

 When I started writing the library, I looked at these issue just
 enough that I knew sorting it wasn't going to be a fun project. So,
 rather than hash out these issue my self, I copied as much as I could
 from the best units handling tool I know of: MathCAD. As best I can
 tell, it uses the same setup I am.

I don't know MathCAD, but as far as I understand that's a system, not a 
library, and as such might have a slightly different charter. In terms 
of charter Boost units 
(http://www.boost.org/doc/libs/1_38_0/doc/html/boost_units.html) is the 
closest library to this. I haven't looked at it for a while, but indeed 
it does address the issue of scale as I suggested: it allows people to 
store numbers in their own units instead of forcing a specific unit. In 
fact the library makes it a point to distinguish itself from an "SI" 
library as early as its second page:

"While this library attempts to make simple dimensional computations 
easy to code, it is in no way tied to any particular unit system (SI or 
otherwise). Instead, it provides a highly flexible compile-time system 
for dimensional analysis, supporting arbitrary collections of base 
dimensions, rational powers of units, and explicit quantity conversions. 
It accomplishes all of this via template metaprogramming techniques."

Like it or not, Boost units will be the yardstick against which anything 
like it in D will be compared. I hope that D being a superior language 
it will make it considerably easier to implement anything 
metaprogramming-heavy.

 The crux of the matter is that Radians and Degrees should be distinct
 types, and that a conversion should be defined taking one to the other.
 How can we express that in the current library, or what could be added
 to it to make that possible?

 I don't think there /is/ a good solution to that problem because many
 of the computations that result in radians naturally give scalar
 values (arc-length/radius). As a result, the type system has no way
 to determine what the correct type for the expression is without the
 user forcing a cast or the like.

Not a cast, but a conversion. Consider:

void computeFiringSolution(Radians angle)
{
     auto s = sin(angle.value);
     ...
     auto newAngle = Radians(asin(s)); // asin is std.math's inverse sine
}

Much of the point of using units is that there is a good amount of being 
explicit in their handling. The user knows that sin takes a double which 
is meant in Radians. Her program encodes that assumption in a type, but 
is also free to simply fetch the value when using the untyped primitives.

 If angles are treated as an alias for scalar then the conversion to
 degrees can be handled in a reasonable way (but that would also allow
 converting any scalar value to degrees). I again punted on this one
 because people who have put more time than I have available (MathCAD
 again) couldn't come up with anything better.

ArcDegrees and Radians would be two distinct types. You wouldn't be able 
to add ArcDegrees to Radians without explicitly stating where you want to be:

ArcDegrees a1;
Radians a2;
auto a = a1 + a2; // error!
auto b = a1 + ArcDegrees(a2); // fine, b is stored in ArcDegrees
auto c = Radians(a1) + a2;    // fine, c is stored in Radians

The same goes about Kilometers and Miles:

Kilometers d1;
Miles d2;
...
auto a = d1 + d2; // error!
auto b = d1 + Kilometers(d2); // fine, b is stored in Kilometers
auto c = Miles(d1) + d2;    // fine, c is stored in Miles
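
One way such types could be written is sketched below. The Unit template, its name tag and the metersPerUnit member are hypothetical, chosen for illustration rather than taken from the submitted library or proposed as the std.units API.

```d
import std.stdio;

// Hypothetical sketch: each unit is its own type, tagged with a name and a
// compile-time factor to meters.
struct Unit(string name, real toMeters)
{
    enum real metersPerUnit = toMeters;
    double value;   // stored in this unit, not in meters

    this(double v) { value = v; }

    // Explicit conversion from any other type that exposes metersPerUnit.
    this(Other)(Other other)
        if (is(typeof(Other.metersPerUnit) : real))
    {
        value = cast(double)(other.value * Other.metersPerUnit / metersPerUnit);
    }

    // Addition is defined only between operands of the very same type.
    Unit opBinary(string op : "+")(Unit rhs) const
    {
        return Unit(value + rhs.value);
    }
}

alias Kilometers = Unit!("km", 1000.0L);
alias Miles      = Unit!("mi", 1609.344L);

void main()
{
    auto d1 = Kilometers(5);
    auto d2 = Miles(10);

    // Mixing units without saying where to end up does not compile.
    static assert(!__traits(compiles, d1 + d2));

    auto b = d1 + Kilometers(d2);   // b is stored in Kilometers
    auto c = Miles(d1) + d2;        // c is stored in Miles
    writeln(b.value, " km, ", c.value, " mi");
}
```

The only multiplication by a scale factor happens inside the converting constructor, so code that stays within one unit pays nothing for it.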

 Again, that sounds to me like what the library does. All distance units
 are of the same type and internally are encoded as meters, The rest of
 the units are converted on access.

 The issue is that the choice of the unified format may be problematic.

 The issue I see is that the choice of a non unified format will be
 problematic. Unless you can show examples (e.i. benchmarks, etc.) of
 where the current solution has precision or performance problems or
 where it's expressive power is inadequate, I will remain reluctant to
 change it.

Examples of precision issues with scaling back and forth by means of a 
multiplier shouldn't be necessary as the problem is obvious. Here's an 
example that took me a couple of minutes to produce:

import std.stdio;

void main()
{
    // Route every value through meters, then convert back at the end.
    immutable real metersPerLightyear = 9.4605284e15;
    auto a1 = metersPerLightyear * 15.3;
    auto a2 = metersPerLightyear * 16.3;
    auto a3 = metersPerLightyear * 1;
    writeln("Total distance in lightyears: ",
        (a1 - a2 + a3) / metersPerLightyear);

    // The same arithmetic performed directly in light-years.
    auto b1 = 15.3;
    auto b2 = 16.3;
    auto b3 = 1;
    writeln("Total distance in lightyears: ", (b1 - b2 + b3));
}

Regarding expressiveness, it is quite clear that there are features 
simply missing: working in Celsius vs. Fahrenheit vs. Kelvin, allowing 
the user to define and use their own units, allowing the user to define 
units with runtime multipliers (monetary) etc. There's always a need to 
stop somewhere as the list could go on forever, but I think the current 
submission stops a bit too early.

If you believe that the library is good as it is, that's definitely 
fine. Don't forget, however, that a good part of the review's purpose is 
to improve the library, not to defend its initial design and 
implementation. A submitter who is willing to go with the library as-is 
although there are beneficial suggested improvements (and that refers to 
everything including e.g. documentation) may be less likely to maintain 
the library in the future. At least that's my perception. In contrast, 
I'm quite hopeful Jonathan will follow through with std.datetime because 
he has been willing to act on all sensible feedback.


Andrei

Jan 05 2011

"BCS" <bcs not-here.com> writes:

In conclusion (yes, I know this normally goes at the bottom) I think we are
wanting different and contradictory things from this library.

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 On 1/5/11 10:32 AM, BCS wrote:
 Andrei Alexandrescu<SeeWebsiteForEmail erdani.org>  wrote:

 [snip]
 Ah. I see what you are getting at. OTOH I'm still not convinced it's
 any better.

 A quick check shows that 1 light years = 9.4605284 × 10^25
 angstroms. A mere 25 orders of magnitude differences, where IEEE754
 doubles have a range of 307 orders of magnitude. As to the issue of
 where to do the conversions: I suspect that the majority of
 computation will be between unit carrying types (particularly if the
 library is used the way I'm intending it to be) and as such, I expect
 that both performance and precision will benefit from having a
 unified internal representation.

 People might want to use float for compactness, which has range 1e38 or 
 so. But that's not necessarily the largest issue (see below).
 There /might/ be reason to have a very limited set of scaling factors
 (e.g. atomic scale, human scale, astro scale) and define each of the
 other units from one of them. but then you run into issues of what to
 do when you do computations that involve more than one (for example,
 computing the resolution of an X-ray telescope involves all three
 scales).

 There are two issues apart from scale. One, creeping errors due to 
 conversions. Someone working in miles would not like that after a few 
 calculations that look integral they get 67.9999998 miles. Second, let's 
 not forget the cost of implicit conversions to and from. The way I see 
 it, forcing an internal unit for representation has definite issues that 
 reduce its potential applicability.

IMHO both of these are somewhat synthetic, that is, they aren't significant
issues in the real world. For the first, anyone who expects FP to give exact
answers needs to learn more about FP. If you need exact answers, use an integer
or rational type as your base type. As for the second point about perf: the
usage mode I designed for will only perform conversions in I/O operations.
Values are converted to unit-bearing types at the first opportunity and remain
there until the last possible moment. As such, I expect that any operation that
is doing more than a handful of conversions will be I/O bound, not compute
bound.

 When I started writing the library, I looked at these issue just
 enough that I knew sorting it wasn't going to be a fun project. So,
 rather than hash out these issue my self, I copied as much as I could
 from the best units handling tool I know of: MathCAD. As best I can
 tell, it uses the same setup I am.

 I don't know MathCAD, but as far as I understand that's a system, not a 
 library, and as such might have a slightly different charter. In terms 
 of charter Boost units 
 (http://www.boost.org/doc/libs/1_38_0/doc/html/boost_units.html) is the 
 closest library to this. I haven't looked at it for a while, but indeed 
 it does address the issue of scale as I suggested: it allows people to 
 store numbers in their own units instead of forcing a specific unit. In 
 fact the library makes it a point to distinguish itself from an "SI" 
 library as early as its second page:
 "While this library attempts to make simple dimensional computations 
 easy to code, it is in no way tied to any particular unit system (SI or 
 otherwise). Instead, it provides a highly flexible compile-time system 
 for dimensional analysis, supporting arbitrary collections of base 
 dimensions, rational powers of units, and explicit quantity conversions. 
 It accomplishes all of this via template metaprogramming techniques."
 Like it or not, Boost units will be the yardstick against which anything 
 like it in D will be compared. I hope that D being a superior language 
 it will make it considerably easier to implement anything 
 metaprogramming-heavy.

Reiterating my prior point, what I'm interested in developing is a library that
handles physical units, for which there is a known, finite set of base units
plus the units derived from them. You might be able to talk me into doing
statically scaled units as distinct types, but I'm not at all interested in
allowing an arbitrary number of base units or in treating physically
equivalent units (e.g. feet and meters) as different base units. My
unwillingness to go there is because I see very little value in doing
a little of that (what other dimensions can be added that act like the SI base
units?) and enormous cost in doing more of it (if you allow dimensions to act
differently: how? in what ways? where do you stop?).

 The crux of the matter is that Radians and Degrees should be distinct
 types, and that a conversion should be defined taking one to the other.
 How can we express that in the current library, or what could be added
 to it to make that possible?

 I don't think there /is/ a good solution to that problem because many
 of the computations that result in radians naturally give scalar
 values (arc-length/radius). As a result, the type system has no way
 to determine what the correct type for the expression is without the
 user forcing a cast or the like.

 Not a cast, but a conversion. Consider:

That's what I was referring to by "and the like". :)

 void computeFiringSolution(Radians angle)
 {
      auto s = sin(angle.value);
      ...
      auto newAngle = Radians(arcsin(s));
 }

The way I would like that code to look would be:

void computeFiringSolution(Radians angle)
{
     auto s = angle.sin(); // only exists for Radians (and Scalar)
     ...
     auto newAngle = std.units.arcsin(s);  // returns Radians
     static assert(is(typeof(newAngle) : Radians));
}

 Much of the point of using units is that there is a good amount of being 
 explicit in their handling.

The objective I was going for is that you are explicit at the edges (converting
to and from other types) and ignore it in the middle.

 The user knows that sin takes a double which 
 is meant in Radians.

I'd rather the user know they can take the sin of something that is an angle
and not worry about the units.
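
One way to read this preference is sketched below. It is only an illustration under stated assumptions, not the API of si2.d: the `Angle` type, its `fromRadians`/`fromDegrees` factories, the `radiansPerDegree` constant and the free function `arcsin` are all hypothetical names. The idea shown is that the unit is named only where a raw number enters or leaves the type, while `sin` works on the angle itself.

```d
import std.math; // PI, asin, fabs; sin is called fully qualified below

// Hypothetical helper constant, not taken from si2.d.
enum double radiansPerDegree = PI / 180.0;

/// Hypothetical angle type: one type for every angle unit, stored in radians.
struct Angle
{
    private double rad;

    static Angle fromRadians(double r) { return Angle(r); }
    static Angle fromDegrees(double d) { return Angle(d * radiansPerDegree); }

    // The unit is named only when a raw number leaves the type.
    double radians() const { return rad; }
    double degrees() const { return rad / radiansPerDegree; }

    /// Sine of the angle; callers never touch the raw radian value.
    double sin() const { return std.math.sin(rad); }
}

/// Inverse sine that returns a typed angle instead of a bare double.
Angle arcsin(double s) { return Angle.fromRadians(asin(s)); }

unittest
{
    auto angle = Angle.fromDegrees(30);
    auto s = angle.sin();        // close to 0.5, up to FP rounding
    auto newAngle = arcsin(s);   // typed result, no unit chosen yet
    assert(fabs(newAngle.degrees - 30.0) < 1e-9);
}
```

The cost Andrei raises elsewhere in the thread is visible here too: every function that should accept an angle has to be wrapped by hand.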

 Her program encodes that assumption in a type, but 
 is also free to simply fetch the value when using the untyped primitives.

The way I wrote it, accessing the value directly is much the same as using a
reinterpret_cast; a blunt hack. Rather than doing that, the user is forced (by
design) to explicitly state what unit the value should be returned in or what
it is being provided as.

 If angles are treated as an alias for scalar then the conversion to
 degrees can be handled in a reasonable way (but that would also allow
 converting any scalar value to degrees). I again punted on this one
 because people who have put more time than I have available (MathCAD
 again) couldn't come up with anything better.

 ArcDegrees and Radians would be two distinct types. You wouldn't be able 
 to add ArcDegrees to Radians without explicitly stating where you want to be:
 ArcDegrees a1;
 Radians a2;
 auto a = a1 + a2; // error!

That expression is something I explicitly want to be valid (thus the reason the
type aliases are named Length, Mass, Time, ... rather than Meter, Kilogram,
Second, ...). They are both a measure of angle so should be addable. One of the
fundamental requirements of the library is that things that measure the same
property can be used interchangeably. This ability is a very large part of the
reason that I wrote the library in the first place and I have no interest in
continuing without it.

 auto b = a1 + ArcDegrees(a2); // fine, b is stored in ArcDegrees
 auto c = Radians(a1) + a2;    // fine, c is stored in Radians
 The same goes about Kilometers and Miles:
 Kilometers d1;
 Miles d2;
 ...
 auto a = d1 + d2; // error!
 auto b = d1 + Kilometers(d2); // fine, b is stored in Kilometers
 auto c = Miles(d1) + d2;    // fine, c is stored in Miles

Ditto.
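
For comparison, the distinct-types design being proposed in the quoted text could be sketched roughly as follows. This is a minimal illustration, not code from Boost units or from si2.d; the `value` field, the converting constructors and the `radiansPerDegree` constant are assumptions made for the example. It shows the point under dispute: mixed addition is rejected at compile time until one side is converted explicitly.

```d
import std.math : PI, fabs;

enum double radiansPerDegree = PI / 180.0; // hypothetical helper constant

struct Radians
{
    double value;

    this(double v) { value = v; }
    // Explicit conversion: the caller states that the result is in radians.
    this(ArcDegrees d) { value = d.value * radiansPerDegree; }

    Radians opBinary(string op)(Radians rhs) const
        if (op == "+" || op == "-")
    {
        return Radians(mixin("value " ~ op ~ " rhs.value"));
    }
}

struct ArcDegrees
{
    double value;

    this(double v) { value = v; }
    // Explicit conversion: the caller states that the result is in degrees.
    this(Radians r) { value = r.value / radiansPerDegree; }

    ArcDegrees opBinary(string op)(ArcDegrees rhs) const
        if (op == "+" || op == "-")
    {
        return ArcDegrees(mixin("value " ~ op ~ " rhs.value"));
    }
}

unittest
{
    auto a1 = ArcDegrees(90);
    auto a2 = Radians(90 * radiansPerDegree);

    // Mixing the two types directly does not compile.
    static assert(!__traits(compiles, a1 + a2));

    auto b = a1 + ArcDegrees(a2); // stored in ArcDegrees
    auto c = Radians(a1) + a2;    // stored in Radians
    assert(fabs(b.value - 180.0) < 1e-9);
    assert(fabs(c.value - PI) < 1e-9);
}
```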

 Again, that sounds to me like what the library does. All distance units
 are of the same type and internally are encoded as meters. The rest of
 the units are converted on access.

 The issue is that the choice of the unified format may be problematic.

 The issue I see is that the choice of a non-unified format will be
 problematic. Unless you can show examples (e.g. benchmarks, etc.) of
 where the current solution has precision or performance problems or
 where its expressive power is inadequate, I will remain reluctant to
 change it.

 Examples of precision issues with scaling back and forth by means of a 
 multiplier shouldn't be necessary as the problem is obvious. Here's an 
 example that took me a couple of minutes to produce:
 immutable real metersPerLightyear = 9.4605284e15;
 auto a1 = metersPerLightyear * 15.3;
 auto a2 = metersPerLightyear * 16.3;
 auto a3 = metersPerLightyear * 1;
 writeln("Total distance in lightyears: ", (a1 - a2 + a3) / 
 metersPerLightyear);
 auto b1 = 15.3;
 auto b2 = 16.3;
 auto b3 = 1;
 writeln("Total distance in lightyears: ", (b1 - b2 + b3));

I wasn't asking for cases where values come out unequal but where they come out
unusable.

 Regarding expressiveness, it is quite clear that there are features 
 simply missing: working in Celsius vs. Fahrenheit vs. Kelvin,

I'll grant I don't have Celsius and Fahrenheit, but they are very special cases
as they have nonzero origins. OTOH it will give differences in Fahrenheit via
the Rankine scale.
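
The distinction can be made concrete with a short sketch. The function names below are hypothetical and not part of the library; the constants are the standard ones (0 degrees C = 273.15 K, 0 degrees F = 459.67 degrees R, one Rankine degree = 5/9 K). Absolute temperatures need an offset and a scale, whereas a temperature difference needs only the scale, which is why a pure multiplier such as Rankine is enough for differences.

```d
import std.math : fabs;

enum double kelvinPerRankine = 5.0 / 9.0;

// Absolute temperatures: an offset as well as a scale factor is needed.
double kelvinFromCelsius(double c)    { return c + 273.15; }
double kelvinFromFahrenheit(double f) { return (f + 459.67) * kelvinPerRankine; }

// Temperature differences: the offsets cancel, so only the scale remains.
double kelvinDeltaFromFahrenheitDelta(double df) { return df * kelvinPerRankine; }

unittest
{
    // The freezing point of water agrees between the two absolute conversions.
    assert(fabs(kelvinFromFahrenheit(32) - kelvinFromCelsius(0)) < 1e-9);

    // A difference of 18 Fahrenheit degrees is a difference of 10 kelvin.
    assert(fabs(kelvinDeltaFromFahrenheitDelta(18) - 10.0) < 1e-9);
}
```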

 allowing 
 the user to define and use their own units, allowing the user to define 
 units with runtime multipliers (monetary) etc. There's always a need to 
 stop somewhere as the list could go on forever,

Agreed

 but I think the current submission stops a bit too early.

I think that the current point is the only logical point (in that any other is
just arbitrary).

 If you believe that the library is good as it is, that's definitely 
 fine. Don't forget, however, that a good part of the review's purpose is 
 to improve the library, not to defend its initial design and 
 implementation. A submitter who is willing to go with the library as-is 
 although there are beneficial suggested improvements (and that refers to 
 everything including e.g. documentation) may be less likely to maintain 
 the library in the future. At least that's my perception.

To be clear, many of your points are very relevant and I omitted commenting on
them because a long list of Yup, yup, yup, ... is just noise. Also, given that
we are thrashing out one or two fundamental points about it, I think those
"lesser" issues can wait. 

 In contrast, 
 I'm quite hopeful Jonathan will follow through with std.datetime because 
 he has been willing to act on all sensible feedback.
 Andrei

Jan 05 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/5/11 4:06 PM, BCS wrote:
 In conclusion (yes I know this normally goes at the bottom) I think
 we are wanting different and contradictory things from this
 library.

I almost didn't read the rest thinking that that's all you inserted. 
All: there's more, scroll down!

One additional practical matter:

 The way I would like that code to look would be:

 void computeFiringSolution(Radians angle)
 {
      auto s = angle.sin(); // only exists for Radians (and Scalar)
      ...
      auto newAngle = std.units.arcsin(s);  // returns Radians
      static assert(is(typeof(newAngle) : Radians));
 }

This is nice in theory but would have you essentially wrap by hand an 
unbounded number of functions. And for what? So they write angle.sin() 
instead of sin(angle.value). I appreciate the additional theoretical 
safety, but I don't see how that benefit compensates the cost. I want a 
practical library that allows me to work with libraries designed outside 
of it.

Anyway, let's not forget that at the end of the day my opinion is one 
opinion and my vote is one vote. For the record, my vote is against the 
library in its current form for the following reasons:

(a) Poor documentation

(b) Limited expressiveness

(c) Numeric issues as I described (and no amount of rhetoric will set 
that straight; FWIW given the obvious question of scaling you need to 
prove it works, not me to prove it doesn't)

(d) Unrealized potential (if we approve this, backward compatibility 
will prevent more comprehensive libraries having the same aim but a 
different design). This argument is to be taken with a grain of salt as 
in general it can be easily abused. What I'm saying is that once this 
library is in we may as well forget about scaled units a la boost units 
(which are the kind I'd want to use).

Going from here I see a few possibilities.

1. Other people deem the library adequate as it is and it gets voted in;

2. You and somebody else agree to work together on this submission;

3. You agree to pass your work to someone who will continue to work 
towards a submission;

4. The library is not made part of Phobos but remains of course 
available as a third-party library.


Andrei

Jan 05 2011

"BCS" <bcs not-here.com> writes:

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 On 1/5/11 4:06 PM, BCS wrote:
 In conclusion (yes I know this normally goes at the bottom) I think
 we are wanting different and contradictory things from this
 library.

 I almost didn't read the rest thinking that that's all you inserted. 
 All: there's more, scroll down!

Oops. :o|

 One additional practical matter:
 The way I would like that code to look would be:

 void computeFiringSolution(Radians angle)
 {
      auto s = angle.sin(); // only exists for Radians (and Scalar)
      ...
      auto newAngle = std.units.arcsin(s);  // returns Radians
      static assert(is(typeof(newAngle) : Radians));
 }

 This is nice in theory but would have you essentially wrap by hand an 
 unbounded number of functions. And for what? So they write angle.sin() 
 instead of sin(angle.value).

There are very few general functions I know of that take units other than
scalars. As a result, I would expect that allowing scalars to implicitly
convert to floating point would cover most of those. And for the rest (sin,
etc.) there are few enough that adding them to the type may be practical.

For the non-general functions that do take non-scalar values, I would think
explicitly asking for the value as a given unit (as the library currently does)
would be the better choice, rather than having to convert it to the related type
for the unit and then asking for the value:

FnTakingMeters(Meter(length).value);  // redundant if length is already meters...
FnTakingMeters(length.value);  // but are you sure length is already meters?

vs.

FnTakingMeters(length.meter); // gives length in meters.
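
A minimal sketch of the accessor style described here follows, under loud assumptions: the `Length` struct, its factories, the `feet` accessor and `fnTakingMeters` are hypothetical stand-ins written for this example, and only the `.meter` spelling comes from the post above. It illustrates a single length type with canonical storage in meters, where the unit is named exactly once, at the boundary with code that takes bare numbers.

```d
import std.math : fabs;

enum double metersPerFoot = 0.3048; // international foot, exact by definition

/// Hypothetical length type: every length unit shares one type.
struct Length
{
    private double m; // canonical storage in meters

    static Length fromMeters(double v) { return Length(v); }
    static Length fromFeet(double v)   { return Length(v * metersPerFoot); }

    // Reading the value always names the unit explicitly.
    double meter() const { return m; }
    double feet() const  { return m / metersPerFoot; }

    Length opBinary(string op)(Length rhs) const
        if (op == "+" || op == "-")
    {
        return Length(mixin("m " ~ op ~ " rhs.m"));
    }
}

// Stand-in for an external function that expects a bare number of meters.
double fnTakingMeters(double meters) { return 2 * meters; }

unittest
{
    // Feet and meters mix freely because both are simply lengths.
    auto length = Length.fromFeet(10) + Length.fromMeters(1);
    assert(fabs(fnTakingMeters(length.meter) - 8.096) < 1e-9);
}
```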

 I appreciate the additional theoretical 
 safety, but I don't see how that benefit compensates the cost. I want a 
 practical library that allows me to work with libraries designed outside 
 of it.

I agree on what but I'm not sure on how.

 Anyway, let's not forget that at the end of the day my opinion is one 
 opinion and my vote is one vote. For the record, my vote is against the 
 library in its current form for the following reasons:
 (a) Poor documentation

Um, Yeah. :o)

 (b) Limited expressiveness

In which way? Adding arbitrary base units? Things like Dynamic conversion
rates? 

 (c) Numeric issues as I described (and no amount of rhetoric will set 
 that straight; FWIW given the obvious question of scaling you need to 
 prove it works, not me to prove it doesn't)
 (d) Unrealized potential (if we approve this, backward compatibility 
 will prevent more comprehensive libraries having the same aim but a 
 different design). This argument is to be taken with a grain of salt as 
 in general it can be easily abused. What I'm saying is that once this 
 library is in we may as well forget about scaled units a la boost units 
 (which are the kind I'd want to use).

We have both said our piece on these, what do others think? I'd be particularly
interested in what Don has to say on the numeric issues. Does an extra layer or
two of FP rounding really matter?

Jan 05 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Wednesday, January 05, 2011 15:40:37 BCS wrote:
 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 (c) Numeric issues as I described (and no amount of rhetoric will set
 that straight; FWIW given the obvious question of scaling you need to
 prove it works, not me to prove it doesn't)
 (d) Unrealized potential (if we approve this, backward compatibility
 will prevent more comprehensive libraries having the same aim but a
 different design). This argument is to be taken with a grain of salt as
 in general it can be easily abused. What I'm saying is that once this
 library is in we may as well forget about scaled units a la boost units
 (which are the kind I'd want to use).

 
 We have both said our piece on these, what do others think? I'd be
 particularly interested in what Don has to say on the numeric issues. Does
 an extra layer or two of FP rounding really matter?

Personally, I tend to cringe when I see much in the way of floating point in
anything that needs precision, but it's not like you can avoid it in this case.
Regardless, I agree with pretty much everything that Andrei has said. I
particularly don't like that the values are all in meters internally -
_especially_ when dealing with floating point values. I'd be very worried about
precision issues. The Boost solution seems like a solid one to me. However, I'm
not likely the sort of person who's going to be using a unit library very often.
I just don't deal with code that cares about that sort of thing very often.

- Jonathan M Davis

Jan 05 2011

Walter Bright <newshound2 digitalmars.com> writes:

Jonathan M Davis wrote:
 On Wednesday, January 05, 2011 15:40:37 BCS wrote:
 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 (c) Numeric issues as I described (and no amount of rhetoric will set
 that straight; FWIW given the obvious question of scaling you need to
 prove it works, not me to prove it doesn't)
 (d) Unrealized potential (if we approve this, backward compatibility
 will prevent more comprehensive libraries having the same aim but a
 different design). This argument is to be taken with a grain of salt as
 in general it can be easily abused. What I'm saying is that once this
 library is in we may as well forget about scaled units a la boost units
 (which are the kind I'd want to use).

 We have both said our piece on these, what do others think? I'd be
 particularly interested in what Don has to say on the numeric issues. Does
 an extra layer or two of FP rounding really matter?

 
 Personally, I tend to cringe when I see much in the way of floating point in
 anything that needs precision, but it's not like you can avoid it in this case.
 Regardless, I agree with pretty much everything that Andrei has said. I
 particularly don't like that the values are all in meters internally -
 _especially_ when dealing with floating point values. I'd be very worried about
 precision issues. The Boost solution seems like a solid one to me. However, I'm
 not likely the sort of person who's going to be using a unit library very often.
 I just don't deal with code that cares about that sort of thing very often.

One thing we learned in engineering school was to never do premature rounding. 
Always defer such to the final calculation.

My experience with "round trip" rounding, where X=>Y=>X, is that the result 
"drifts". You can sometimes see this with the mouse, as it can slowly drift to 
one corner of the screen.
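
A small integer analogue of this effect can be sketched as follows. The 7/3 scale factor and the function name are made up for the illustration, and it uses truncating integer division rather than floating point so that the outcome is exactly reproducible. A position is stored in coarse units, converted to pixels for each move, and truncated on the way back; the moves cancel out, yet the stored position still creeps toward zero.

```d
/// Returns the stored coarse position after `jitterPairs` rounds of a
/// +1 pixel move followed by a -1 pixel move (net movement: zero).
int coarseAfterJitter(int coarse, int jitterPairs)
{
    foreach (i; 0 .. jitterPairs)
    {
        foreach (delta; [+1, -1])
        {
            int pixels = coarse * 7 / 3; // Y => X, truncating
            pixels += delta;
            coarse = pixels * 3 / 7;     // X => Y, truncating again
        }
    }
    return coarse;
}

unittest
{
    // Ten zero-sum jitters move the stored position from 42 down to 32.
    assert(coarseAfterJitter(42, 10) == 32);
}
```

Because truncation always rounds toward zero here, the error never averages out; it accumulates in one direction, which matches the "drifts to one corner" behaviour described above.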

Jan 05 2011

"BCS" <bcs not-here.com> writes:

Jonathan M Davis <jmdavisProg gmx.com> wrote:
 On Wednesday, January 05, 2011 15:40:37 BCS wrote:
 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 (c) Numeric issues as I described (and no amount of rhetoric will set
 that straight; FWIW given the obvious question of scaling you need to
 prove it works, not me to prove it doesn't)
 (d) Unrealized potential (if we approve this, backward compatibility
 will prevent more comprehensive libraries having the same aim but a
 different design). This argument is to be taken with a grain of salt as
 in general it can be easily abused. What I'm saying is that once this
 library is in we may as well forget about scaled units a la boost units
 (which are the kind I'd want to use).

 
 We have both said our piece on these, what do others think? I'd be
 particularly interested in what Don has to say on the numeric issues. Does
 an extra layer or two of FP rounding really matter?

 Personally, I tend to cringe when I see much in the way of floating point in
 anything that needs precision, but it's not like you can avoid it in this case.
 Regardless, I agree with pretty much everything that Andrei has said. I
 particularly don't like that the values are all in meters internally -
 _especially_ when dealing with floating point values. I'd be very worried about
 precision issues. The Boost solution seems like a solid one to me. However, I'm
 not likely the sort of person who's going to be using a unit library very often.
 I just don't deal with code that cares about that sort of thing very often.

After a little more thinking I'm wondering if I'm targeting a different use
case than other people are thinking about.

The case I'm designing for, is where you have a relatively small number of
inputs (that may be in a mishmash of units and systems), a relatively large
number of computations and a relatively small number of outputs. The systems
that Andrei is arguing for may be more desirable if there are relatively fewer
computations (thus less internal rounding) or if all or most of the inputs are
in a consistent system of units (resulting in very few necessary conversions).

I'm primarily interested in the first use case because it is the kind of
problem I have dealt with the most (particularly the mishmash of units bit) and
for that, the two proposals are almost equivalent from a perf and accuracy
standpoint because each should convert the inputs to a consistent system, do
all the math in it, and then convert to the output units (I'm not even assuming
the outputs form a consistent system). The only difference is that the current
arrangement picks the consistent system for you where the alternative allows
(and forces) you to select it.

Jan 05 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/5/11 9:55 PM, BCS wrote:
 After a little more thinking I'm wondering if I'm targeting a
 different use case than other people are thinking about.

 The case I'm designing for, is where you have a relatively small
 number of inputs (that may be in a mishmash of units and systems), a
 relatively large number of computations and a relatively small number
 of outputs. The systems that Andrei is arguing for may be more
 desirable if there are relatively fewer computations (thus less
 internal rounding) or if all or most of the inputs are in a
 consistent system of units (resulting in very few necessary
 conversions).

 I'm primarily interested in the first use case because it is the kind
 of problem I have dealt with the most (particularly the mishmash of
 units bit) and for that, the two proposals are almost equivalent from
 a perf and accuracy standpoint because each should convert the inputs
 to a consistent system, do all the math in it, and then convert to
 the output units (I'm not even assuming the outputs form a consistent
 system). The only difference is that the current arrangement picks
 the consistent system for you where the alternative allows (and
 forces) you to select it.

I think this all is sensible. What I like about Boost units is that they 
didn't define SI units; they defined a framework in which units can be 
defined (and indeed "si" is a sub-namespace inside units that has no 
special rights).

This review conclusion is a very good read:

http://lists.boost.org/boost-announce/2007/04/0126.php

I recommend that everyone read the entire review thread to get an idea of 
the scope and sophistication of the Boost review process. It has 
tremendously increased the quality of Boost libraries. We need to get there.


Andrei

Jan 06 2011
