
digitalmars.D - Implement the "unum" representation in D ?

reply "Nick B" <nick.barbalich gmail.com> writes:
Hi everyone.

I'm attending the SKA conference in Auckland next week and I would 
like to discuss an opportunity for the D community.

I am based in Wellington, New Zealand.  In Auckland, NZ, from 
Tuesday to Friday next week there will be two seminars held.

The first 2 days (Tuesday and Wednesday) are for the multicore 
conference. Details are   http://www.multicoreworld.com/

Here is the schedule for 2 days (Thursday & Friday) of the SKA 
conference:

http://openparallel.com/multicore-world-2014/computing-for-ska/schedule-computing-for-ska-2014/


John Gustafson will be presenting a keynote on Thursday 27th 
February at 11:00 am.

The abstract is here:  
http://openparallel.com/multicore-world-2014/speakers/john-gustafson/

There is also an excellent background paper (PDF, 64 pages), 
which can be found here:

http://sites.ieee.org/scv-cs/files/2013/03/Right-SizingPrecision1.pdf

The math details are beyond me, but I understand his basic idea.

I would like to draw your attention to page 34 and his comments 
re "standard committees", and to page 62 and his comment "Coded in 
Mathematica for now. Need a fast native version.."

I am sure you can see where I am going with this ....

1.  Would it be possible to implement the "unum" representation 
in D and therefore make it a contender for the SKA?

2.  Is there any interest in this format within the D community?

Destroy.

Nick
Feb 20 2014
next sibling parent "Nick B" <nick.barbalich gmail.com> writes:
On Thursday, 20 February 2014 at 10:10:13 UTC, Nick B wrote:
 Hi everyone.

 I'm attending the SKA conference in Auckland next week and I would 
 like to discuss an opportunity for the D community.

Sorry if I was not clear about what the SKA is. In a nutshell, it is a truly massive telescope project which will require massive computing resources.

https://www.skatelescope.org/

Nick
Feb 20 2014
prev sibling next sibling parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Thursday, 20 February 2014 at 10:10:13 UTC, Nick B wrote:
 Hi everyone.

 I'm attending the SKA conference in Auckland next week and I would 
 like to discuss an opportunity for the D community.

 I am based in Wellington, New Zealand.  In Auckland, NZ, from 
 Tuesday to Friday next week there will be two seminars held.

 The first 2 days (Tuesday and Wednesday) are for the multicore 
 conference. Details are   http://www.multicoreworld.com/

 Here is the schedule for 2 days (thursday & friday) of the SKA 
 conference:

 http://openparallel.com/multicore-world-2014/computing-for-ska/schedule-computing-for-ska-2014/


 John Gustafson will be presenting a keynote on Thursday 27th 
 February at 11:00 am

 The abstract is here:  
 http://openparallel.com/multicore-world-2014/speakers/john-gustafson/

 There is also an excellent background paper (PDF, 64 pages), 
 which can be found here:

 http://sites.ieee.org/scv-cs/files/2013/03/Right-SizingPrecision1.pdf

 The math details are beyond me, but I understand his basic idea.

 I would like to bring your attention to Page 34 and his 
 comments re "standard committees"  and page 62 and his comments 
 "Coded in Mathematica  for now. Need a fast native version.."

 I am sure you can see where I am going with this ....

 1.  Would it be possible to implement the "unum"  
 representation in D and therefore make it a  contender for the 
 SKA ?

 2.  Is there any interest in this format within the D community

 Destroy.

 Nick

Hmm. Interesting. It could be done in D, definitely. However, I'm a bit skeptical about how efficient it would ever be without specialist hardware. I'd have to read it over a few more times to get a proper grip on it.
Feb 20 2014
prev sibling next sibling parent "Rikki Cattermole" <alphaglosined gmail.com> writes:
On Thursday, 20 February 2014 at 10:10:13 UTC, Nick B wrote:
 Hi everyone.

 I'm attending the SKA conference in Auckland next week and I would 
 like to discuss an opportunity for the D community.

 I am based in Wellington, New Zealand.  In Auckland, NZ, from 
 Tuesday to Friday next week there will be two seminars held.

 The first 2 days (Tuesday and Wednesday) are for the multicore 
 conference. Details are   http://www.multicoreworld.com/

 Here is the schedule for 2 days (thursday & friday) of the SKA 
 conference:

 http://openparallel.com/multicore-world-2014/computing-for-ska/schedule-computing-for-ska-2014/


 John Gustafson will be presenting a keynote on Thursday 27th 
 February at 11:00 am

 The abstract is here:  
 http://openparallel.com/multicore-world-2014/speakers/john-gustafson/

 There is also an excellent background paper (PDF, 64 pages), 
 which can be found here:

 http://sites.ieee.org/scv-cs/files/2013/03/Right-SizingPrecision1.pdf

 The math details are beyond me, but I understand his basic idea.

 I would like to bring your attention to Page 34 and his 
 comments re "standard committees"  and page 62 and his comments 
 "Coded in Mathematica  for now. Need a fast native version.."

 I am sure you can see where I am going with this ....

 1.  Would it be possible to implement the "unum"  
 representation in D and therefore make it a  contender for the 
 SKA ?

 2.  Is there any interest in this format within the D community

 Destroy.

 Nick

It looks like something that could be made into a library, with the help of inline assembly. Shame I've got study next week. Definitely no chance of making it.
Feb 20 2014
prev sibling next sibling parent "w0rp" <devw0rp gmail.com> writes:
This is very interesting, thank you for sharing this. My 
knowledge of assembly, compilers, and how machines actually work 
is limited, but I knew enough about the details of floating point 
to get the gist of it. The diagrams in the PDF also helped. I 
hope someone does more research on this, as it looks promising 
for improving the quality of calculations.
Feb 20 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Slide 6:

Even the IEEE standard (1985) made choices that look dubious now<

On the other hand it's still working surprisingly well today.
 Negative zero. (ugh!)

It's not bad.

Slide 15: are (Page Rank and) mp3 players using floating point values?

Slide 18: "is harder than doing THIS with them?" Hardware multiplication uses a different algorithm.

Regarding the variable length of his FP numbers, their energy savings are beer-based numbers, I don't think they have any experimental basis (yet).

Bye,
bearophile
Feb 20 2014
prev sibling next sibling parent "jerro" <jkrempus gmail.com> writes:
 Regarding the variable length of his FP numbers, their energy 
 savings are beer-based numbers, I don't think they have any 
 experimental basis (yet).

Also, because they are variable sized, you need to access them through pointers if you want random access. Now you are reading both the pointer and the value from memory. Since pointers and double precision floats have the same size on modern hardware, one would expect this to actually consume more energy than just reading a double precision value. An additional indirection can also have great performance cost.

And there's one more step you need to do. After getting the pointer you need to first read the utag so you can decide how many bytes to read. So where you would normally just read the value, you now need to read the pointer, use that to read the utag and use the utag to read the value.
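For illustration, here is a rough D sketch of that access pattern. The layout below (a one-byte "utag" that just holds the payload width, plus a separate offset table) is made up for the example and is not Gustafson's actual encoding:

import std.stdio;

// Hypothetical layout: each value is a one-byte "utag" giving the payload
// width in bytes, followed by the payload. Because records differ in size,
// random access needs a separate offset (the "pointer") per element.
struct PackedValues
{
    ubyte[] storage;   // utag + payload records, back to back
    size_t[] offsets;  // one extra word per element just to find it

    // Three dependent reads: offset -> utag -> payload bytes.
    ulong read(size_t i) const
    {
        size_t off = offsets[i];        // 1. read the "pointer"
        ubyte width = storage[off];     // 2. read the utag
        ulong value = 0;
        foreach (b; 0 .. width)         // 3. read the payload
            value |= cast(ulong) storage[off + 1 + b] << (8 * b);
        return value;
    }
}

void main()
{
    ubyte[] storage = [1, 0x2A, 2, 0x34, 0x12];  // 0x2A (1 byte), 0x1234 (2 bytes)
    size_t[] offsets = [0, 2];
    auto packed = PackedValues(storage, offsets);
    writefln("%x %x", packed.read(0), packed.read(1)); // 2a 1234
}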
Feb 20 2014
Iain Buclaw <ibuclaw gdcproject.org> writes:
On 20 February 2014 15:26, bearophile <bearophileHUGS lycos.com> wrote:
 Slide 6:

 Even the IEEE standard (1985) made choices that look dubious now<

On the other hand it's still working surprisingly well today.
 Negative zero. (ugh!)

It's not bad. Slide 15: are (Page Rank and) mp3 Players using floating point values?

Most encoding formats use a form of discrete cosine transform, which involves floating point maths.
Feb 20 2014
Walter Bright <newshound2 digitalmars.com> writes:
On 2/20/2014 2:10 AM, Nick B wrote:
 1.  Would it be possible to implement the "unum"  representation in D and
 therefore make it a  contender for the SKA ?

Yes, as a library type. I don't think it'll save any energy, though.
 2.  Is there any interest in this format within the D community

I think it would be a fun and useful project. Any takers? (Be aware that there's a Python 'unum' type that is something quite different.)
Feb 20 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Iain Buclaw:

 Most encoding formats use a form of discrete cosine transform, 
 which involves floating point maths.

I don't believe this much :-(

Bye,
bearophile
Feb 20 2014
prev sibling next sibling parent "Francesco Cattoglio" <francesco.cattoglio gmail.com> writes:
On Thursday, 20 February 2014 at 10:10:13 UTC, Nick B wrote:
 The abstract is here:  
 http://openparallel.com/multicore-world-2014/speakers/john-gustafson/

"The pursuit of exascale floating point is ridiculous, since we do not need to be making 10^18 sloppy rounding errors per second; we need instead to get provable, valid results for the first time, by turning the speed of parallel computers into higher quality answers instead of more junk per second" Ok, I think I know a bunch of people who could question the contents of that sentence. Or, at the very least, question this guy's way of presenting sensational news.
Feb 20 2014
"Nordlöw" <per.nordlow gmail.com> writes:
 Also, because they are variable sized, you need to access them 
 through pointers if you want random access. Now you are reading 
 both the pointer and the value from memory.

We might not need random access though. For basic linear algebra forward or bidirectional should be enough which should suit this format. Depends on the application.
Feb 20 2014
prev sibling next sibling parent "jerro" <jkrempus gmail.com> writes:
 We might not need random access though.

Even if you don't need random access, you can't store them in a packed way if you want to be able to mutate them in place, since mathematical operations on them can change the number of bits they take.
 Depends on the application.

I suspect that in most numerical applications the number of bits needed to store the numbers will keep increasing during the computation until it reaches the maximal supported value. That can actually happen very fast - a single division with a non power of two is enough. Applications that could actually benefit from this in any way are probably extremely rare. The only one I can think of is to use it as a very simple form of compression, but I'm sure there are better options for that.
Feb 20 2014
"Nordlöw" <per.nordlow gmail.com> writes:
The unum variable length encoding is very similar to how msgpack 
packs integers. See msgpack-d on github for a superb 
implementation in D.
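To make the comparison concrete, here is a small D sketch of msgpack-style variable-length packing of unsigned integers. This is not msgpack-d's API (the library does all of this for you), just a hand-rolled illustration of the wire-format idea:

import std.stdio;

// msgpack-style packing of an unsigned integer: values below 0x80 fit in a
// single byte; larger ones get a one-byte tag (0xcc/0xcd/0xce/0xcf) followed
// by 1, 2, 4 or 8 big-endian payload bytes, so the encoded size tracks the
// magnitude of the value, much like unum's variable-width fields.
ubyte[] packUint(ulong v)
{
    ubyte[] payload(size_t n)   // n big-endian bytes of v
    {
        ubyte[] r;
        foreach_reverse (i; 0 .. n)
            r ~= cast(ubyte)(v >> (8 * i));
        return r;
    }

    if (v < 0x80)        return payload(1);                      // positive fixint
    if (v <= ubyte.max)  return [cast(ubyte) 0xcc] ~ payload(1); // uint8
    if (v <= ushort.max) return [cast(ubyte) 0xcd] ~ payload(2); // uint16
    if (v <= uint.max)   return [cast(ubyte) 0xce] ~ payload(4); // uint32
    return [cast(ubyte) 0xcf] ~ payload(8);                      // uint64
}

void main()
{
    foreach (v; [5UL, 200, 70_000, 5_000_000_000])
        writefln("%d -> %(%02x %)", v, packUint(v));
}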
Feb 20 2014
prev sibling next sibling parent "Francesco Cattoglio" <francesco.cattoglio gmail.com> writes:
On Thursday, 20 February 2014 at 23:21:26 UTC, Nordlöw wrote:
 We might not need random access though.

 For basic linear algebra forward or bidirectional should be 
 enough which should suit this format.

 Depends on the application.

"Basic Linear Algebra" often requires sparse matrices. Sparse Matrices = both random access and indirection.
Feb 20 2014
prev sibling next sibling parent "Chris Williams" <yoreanon-chrisw yahoo.co.jp> writes:
On Thursday, 20 February 2014 at 23:13:20 UTC, Francesco 
Cattoglio wrote:
 On Thursday, 20 February 2014 at 10:10:13 UTC, Nick B wrote:
 The abstract is here:  
 http://openparallel.com/multicore-world-2014/speakers/john-gustafson/

"The pursuit of exascale floating point is ridiculous, since we do not need to be making 10^18 sloppy rounding errors per second; we need instead to get provable, valid results for the first time, by turning the speed of parallel computers into higher quality answers instead of more junk per second" Ok, I think I know a bunch of people who could question the contents of that sentence. Or, at the very least, question this guy's way of presenting sensational news.

I don't quite understand his ubox stuff, but his unum format doesn't really solve the 0.1 problem, except maybe by allowing the size of his values to exceed 64-bits so that precision errors creep up a little bit slower. (I'm not sure how many bits his format tops out at and I don't want to re-open the PDF again to look.) It also wasn't clear whether his format removed the multiple values of NaN, -0, etc. It looked like it was just the current IEEE formats bonded to a sliding bit-length, which would bring along with it all the problems of the IEEE format that he mentioned.

I think the only way to solve for problems like 0.1 in decimal not mapping to any reasonable value in binary is to store numbers as integer equations (i.e. 0.1 = 1/10), which would take a hell of a complex format to represent and some pretty fancy CPUs.
Feb 20 2014
prev sibling next sibling parent "Francesco Cattoglio" <francesco.cattoglio gmail.com> writes:
On Thursday, 20 February 2014 at 23:52:13 UTC, Chris Williams 
wrote:
 I don't quite understand his ubox stuff, but his unum format 
 doesn't really solve the 0.1 problem, except maybe by allowing 
 the size of his values to exceed 64-bits so that precision 
 errors creap up a little bit slower. (I'm not sure how many 
 bits his format tops out at and I don't want to re-open the PDF 
 again to look).

It also wasn't clear whether his format removed
 the multiple values of NaN, -0, etc. It looked like it was just 
 the current IEEE formats bonded to a sliding bit-length

The problem of multiple representations is still there. The only real advantage is that you can probably store a NaN in 8 bits. Honestly, who cares. If I get a NaN in my numerical simulation, I have bigger concerns than saving memory space.
 I think the only way to solve for problems like 0.1 in decimal 
 not mapping to any reasonable value in binary is to store 
 numbers as integer equations (i.e. 0.1 = 1/10), which would 
 take a hell of a complex format to represent and some pretty 
 fancy CPUs.

Rational numbers. How extraordinary! :P

The whole idea has one merit: the float now stores its own accuracy. But I think you can achieve more or less the same goal by storing a pair of something.

The whole idea of saving space sounds bogus, at least for my field of application. We already have an amazing technique for saving a lot of space for numerical simulation of PDEs, it's called grid refinement.
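For what it's worth, here is a minimal D sketch of the "pair of something" idea, i.e. a value that carries explicit lower/upper bounds. It is only an illustration: a real interval type would also control the rounding direction on every operation.

import std.stdio;

// A value that carries its own accuracy as an explicit [lo, hi] pair.
struct Interval
{
    double lo;
    double hi;

    // Sketch only: a proper implementation would round lo down and hi up.
    Interval opBinary(string op : "+")(Interval rhs) const
    {
        return Interval(lo + rhs.lo, hi + rhs.hi);
    }

    double width() const { return hi - lo; }
}

void main()
{
    // An enclosure of 0.1, which has no exact binary representation.
    auto tenth = Interval(0.09999999999999999, 0.10000000000000001);
    auto sum = tenth + tenth + tenth;
    writefln("[%.17g, %.17g], width %g", sum.lo, sum.hi, sum.width());
}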
Feb 20 2014
prev sibling next sibling parent "Frustrated" <c1514843 drdrb.com> writes:
On Thursday, 20 February 2014 at 23:41:12 UTC, jerro wrote:
 We might not need random access though.

Even if you don't need random access, you can't store them in a packed way if you want to be able to mutate them in place, since mathematical operations on them can change the number of bits they take.
 Depends on the application.

I suspect that in most numerical applications the number of bits needed to store the numbers will keep increasing during the computation until it reaches the maximal supported value. That can actually happen very fast - a single division with a non power of two is enough. Applications that could actually benefit from this in any way are probably extremely rare. The only one I can think of is to use it as a very simple form of compression, but I'm sure there are better options for that.

Yes, but his ubox method attempts to solve this problem by providing only the most accurate answers. By using a "sliding" floating point you can control the accuracy much better, and by using the ubox method you can zero in on the most accurate solutions.

I think a few here are missing the point. First: the unums self scale to provide the "fewest bit" representation a la his morse code vs ascii example. Not a huge deal, after all, it's only memory. But by using such representations one can have more accurate results because one can better represent values that are not accurately representable in standard fp.

I think though adding a "repeating" bit would make it even more accurate so that repeating decimals within the bounds of maximum bits used could be represented perfectly. e.g., 1/3 = 0.3333... could be represented perfectly with such a bit and sliding fp type. With proper cpu support one could have 0.3333... * 3 = 1 exactly.

By having two extra bits one could represent constants to any degree of accuracy. e.g., the last bit says the value represents the ith constant in some list. This would allow very common irrational constants to be used: e, pi, sqrt(2), etc... with more accuracy than normal and handled internally by the cpu. With such constants one could have sqrt(2)*sqrt(2) = 2 exactly. (Designate part of the constants as "squares", since one could have up to 2^n - 1 constants.)

The problem I see is that to make it all work requires a lot of work on the hardware. Without it, there is no real reason to use it.

One also runs into the problem of optimization. Having variable sized fp numbers may not be very efficient in memory alignment. A list of resizable fp types would contain 1, 2, 3, 4, 5, ... byte numbers of random size. If you do a calculation on the list and try to use the list as storage again, you could end up with a different sized list... which could be very inefficient. You'll probably have to allocate a list of the maximum size possible in the first place to prevent buffer overflows, which then defeats the purpose in some sense.

In any case, one could have a type where the last 2 bits designate the representation:

00 - Positive Integer
01 - Floating point
10 - Constant (value represents an index, possibly virtual, into a table of constants)
11 - Negative Integer

For floating points, the 3rd last bit could represent a repeating decimal, or they could be used in the constants for common repeating decimals. (Since chances are, repeated calculations would not produce repeating decimals.)
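A toy D sketch of what decoding such a 2-bit tag could look like; the names and field widths are invented purely to illustrate the idea and are not part of the unum proposal:

import std.stdio;

// The last two bits of a word select how the rest is interpreted.
enum Kind : ubyte
{
    positiveInt   = 0b00,
    floatingPoint = 0b01,
    constant      = 0b10,  // payload is an index into a table of constants
    negativeInt   = 0b11
}

Kind kindOf(ulong word)
{
    return cast(Kind)(word & 0b11);
}

void main()
{
    // Hypothetical encoding: payload 42 (a constant-table index) plus the tag.
    ulong encoded = (42UL << 2) | Kind.constant;
    writeln(kindOf(encoded));   // constant
    writeln(encoded >> 2);      // 42
}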
Feb 20 2014
prev sibling next sibling parent "francesco cattoglio" <francesco.cattoglio gmail.com> writes:
On Friday, 21 February 2014 at 05:21:53 UTC, Frustrated wrote:
 I think though adding a "repeating" bit would make it even more
 accurate so that repeating decimals within the bounds of maximum
 bits used could be represented perfectly. e.g., 1/3 = 0.3333...
 could be represented perfectly with such a bit and sliding fp
 type. With proper cpu support one could have 0.3333... * 3 = 1
 exactly.

 By having two extra bits one could represent constants to any
 degree of accuracy. e.g., the last bit says the value represents
 the ith constant in some list. This would allow very common
 irrational constants to be used: e, pi, sqrt(2), etc...

I doubt there would be much use for such constants. Moreover, such a "repeating bit" should become a "repeating counter", since you usually get a certain number of repeating digits, not just a single one.
 For floating points, the 3rd last bit could represent a 
 repeating
 decimal or they could be used in the constants for common
 repeating decimals. (since chances are, repeated calculations
 would not produce repeating decimals)

Such tricks could be useful for serialization (I'm thinking mostly at message passing via TCP/IP), but have no real use in scientific computation. If you want good precision, you might as well be better off with bignum numbers.
Feb 20 2014
prev sibling next sibling parent "ponce" <contact gam3sfrommars.fr> writes:
On Thursday, 20 February 2014 at 23:43:18 UTC, Nordlöw wrote:
 The unum variable length encoding is very similar to how 
 msgpack packs integers. See msgpack-d on github for a superb 
 implementation in D.

msgpack-d is indeed a great library that makes serialization almost instant to implement. I implemented CBOR, another binary encoding scheme, and it was obvious CBOR brings nothing over msgpack; it even did worse with integer encoding.
Feb 21 2014
prev sibling next sibling parent "Ivan Kazmenko" <gassa mail.ru> writes:
On Friday, 21 February 2014 at 05:21:53 UTC, Frustrated wrote:
 I think though adding a "repeating" bit would make it even more
 accurate so that repeating decimals within the bounds of maximum
 bits used could be represented perfectly. e.g., 1/3 = 0.3333...
 could be represented perfectly with such a bit and sliding fp
 type. With proper cpu support one could have 0.3333... * 3 = 1
 exactly.

I believe that the repeating decimals, or better, repeating binary fractions, will hardly be more useful than a rational representation like p/q.

The reason is, if we take a reciprocal 1/q and represent it as a binary, decimal, or other fixed-base fraction, the representation is surely periodic, but the upper bound on the length of its period is as high as q-1, and it is not unlikely to be the exact bound. For example, at http://en.wikipedia.org/wiki/Binary_number#Fractions , note that the binary fraction for 1/11 has period 10, and for 1/13 the period is 12.

Thus repeating decimal for a fraction p/q will take up to q-1 bits when we store it as a repeating decimal, but log(p)+log(q) bits when stored as a rational number (numerator and denominator).

Ivan Kazmenko.
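To illustrate the period-length claim, here is a short D sketch that computes the period of 1/q as a binary fraction; it is the multiplicative order of 2 modulo the odd part of q (no overflow care is taken, so it is only meant for small q):

import std.stdio;

// Length of the repeating part of 1/q written in binary.
size_t binaryPeriod(ulong q)
{
    while (q % 2 == 0) q /= 2;   // factors of two only add a pre-period
    if (q == 1) return 0;        // terminating fraction, no period
    ulong r = 2 % q;
    size_t len = 1;
    while (r != 1)
    {
        r = (r * 2) % q;
        ++len;
    }
    return len;
}

void main()
{
    foreach (q; [3UL, 5, 7, 11, 13])
        writefln("1/%s has binary period %s", q, binaryPeriod(q));
    // 1/11 -> 10 and 1/13 -> 12, hitting the q-1 worst case mentioned above.
}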
Feb 21 2014
prev sibling next sibling parent "Chris Williams" <yoreanon-chrisw yahoo.co.jp> writes:
On Friday, 21 February 2014 at 09:04:40 UTC, Ivan Kazmenko wrote:
 I believe that the repeating decimals, or better, repeating 
 binary fractions, will hardly be more useful than a rational 
 representation like p/q.

Yeah, in retrospect I would say that a binary layout like:

numerator length | +- | numerator | divisor

might be a stronger solution, or at least one which offers different advantages over what we have today. It still wouldn't be able to represent pi accurately, since it isn't a rational number, but I wouldn't be surprised if it could represent rational numbers that can't be represented in IEEE format.
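A quick D sketch of the semantic pieces of such a representation (sign, numerator, denominator), with reduction to lowest terms via the GCD; the packed bit layout itself is left out and the names are made up:

import std.numeric : gcd;
import std.stdio;

struct Rational
{
    bool negative;
    ulong num;
    ulong den = 1;

    // Keep the fraction in lowest terms so each value has one representation.
    void normalize()
    {
        auto g = gcd(num, den);
        if (g > 1) { num /= g; den /= g; }
    }
}

void main()
{
    auto r = Rational(false, 10, 90);
    r.normalize();
    writefln("%s/%s", r.num, r.den);  // 1/9, stored exactly

    // 0.1 = 1/10 is likewise exact here, unlike in binary floating point.
    auto tenth = Rational(false, 1, 10);
    writefln("%s/%s", tenth.num, tenth.den);
}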
Feb 21 2014
prev sibling next sibling parent "Frustrated" <c1514843 drdrb.com> writes:
On Friday, 21 February 2014 at 07:42:36 UTC, francesco cattoglio
wrote:
 On Friday, 21 February 2014 at 05:21:53 UTC, Frustrated wrote:
 I think though adding a "repeating" bit would make it even more
 accurate so that repeating decimals within the bounds of 
 maximum
 bits used could be represented perfectly. e.g., 1/3 = 0.3333...
 could be represented perfectly with such a bit and sliding fp
 type. With proper cpu support one could have 0.3333... * 3 = 1
 exactly.

 By having two extra bits one could represent constants to any
 degree of accuracy. e.g., the last bit says the value 
 represents
 the ith constant in some list. This would allow very common
 irrational constants to be used: e, pi, sqrt(2), etc...

I doubt there would be much use for such constants. Moreover, such a "repeating bit" should become a "repeating counter", since you usually get a certain number of repeating digits, not just a single one.

I disagree. Many computations done by computers involve constants in the calculation... pi, e, sqrt(2), sqrt(-1), many physical constants, etc. The cpu could store these constants at a higher bit resolution than standard, say 128 bits or whatever.
 For floating points, the 3rd last bit could represent a 
 repeating
 decimal or they could be used in the constants for common
 repeating decimals. (since chances are, repeated calculations
 would not produce repeating decimals)

(I'm thinking mostly at message passing via TCP/IP), but have no real use in scientific computation. If you want good precision, you might as well be better off with bignum numbers.

Simply not true. Maple, for example, uses constants and can compute using constants. It might speed up the calculations significantly if one could compute at the cpu level using constants. e.g. pi*sqrt(2) is just another constant. 2*pi is another constant.

Obviously there is a limit to the number of constants one can store, but one can encode many constants easily without storage (where the "index" itself is used as the value).

Also, the cpu could reduce values to constants... so if you have a fp value that is 3.14159265..... or whatever, the cpu could convert this to the constant pi, since for all purposes it is pi (even if it is just an approximation or pi - 1/10^10).

Basically the benefit of this (and potential) outweigh the cost of 1 bit out of 64-bits. I doubt anyone would miss it. In fact, probably no real loss.... (NaN could be a constant, in which case you would have only one rather than the 2 billion or whatever you have now.)
Feb 21 2014
prev sibling next sibling parent "Francesco Cattoglio" <francesco.cattoglio gmail.com> writes:
On Friday, 21 February 2014 at 19:12:39 UTC, Frustrated wrote:
 Simply not true. Maple, for example, uses constants and can
 compute using constants.

The two are completely unrelated.
 Basically the benefit of this(and potential) outweigh the cost 
 of
 1 bit out of 64-bits.

You are basically creating a tagged union type. It's pretty much unrelated to the "unum" proposed by the PDF.
Feb 21 2014
prev sibling next sibling parent "Frustrated" <c1514843 drdrb.com> writes:
On Friday, 21 February 2014 at 19:59:36 UTC, Francesco Cattoglio
wrote:
 On Friday, 21 February 2014 at 19:12:39 UTC, Frustrated wrote:
 Simply not true. Maple, for example, uses constants and can
 compute using constants.

The two are completely unrelated.
 Basically the benefit of this(and potential) outweigh the cost 
 of
 1 bit out of 64-bits.

You are basically creating a tagged union type. It's pretty much unrelated to the "unum" proposed by the PDF.

Yes, I mentioned that we could extend floating points to include other representations that are more efficient and reduce errors and such.
Feb 21 2014
prev sibling next sibling parent "Frustrated" <c1514843 drdrb.com> writes:
On Friday, 21 February 2014 at 09:04:40 UTC, Ivan Kazmenko wrote:
 On Friday, 21 February 2014 at 05:21:53 UTC, Frustrated wrote:
 I think though adding a "repeating" bit would make it even more
 accurate so that repeating decimals within the bounds of 
 maximum
 bits used could be represented perfectly. e.g., 1/3 = 0.3333...
 could be represented perfectly with such a bit and sliding fp
 type. With proper cpu support one could have 0.3333... * 3 = 1
 exactly.

I believe that the repeating decimals, or better, repeating binary fractions, will hardly be more useful than a rational representation like p/q. The reason is, if we take a reciprocal 1/q and represent it as a binary, decimal, or other fixed-base fraction, the representation is surely periodic, but the upper bound on the length of its period is as high as q-1, and it is not unlikely to be the exact bound. For example, at http://en.wikipedia.org/wiki/Binary_number#Fractions , note that the binary fraction for 1/11 has period 10, and for 1/13 the period is 12. Thus repeating decimal for a fraction p/q will take up to q-1 bits when we store it as a repeating decimal, but log(p)+log(q) bits when stored as a rational number (numerator and denominator).

No, you are comparing apples to oranges. The q is not the same in both equations.

The number of bits for p/q when p and q are stored separately is log[2](p) + log[2](q). But when p/q is stored as a repeating decimal (assuming it repeats), then it is a fixed constant dependent on the number of bits. So in some cases the rational expression is cheaper and in other cases the repeating decimal is cheaper. e.g., 0.1... in theory could take 1 bit to represent while 1/9 takes 1 + 4 = 5. (The point being that the rational representation sometimes takes way more bits than the repeating decimal representation, other times it doesn't.)

Basically, if we use n bits to represent the repeating decimal we can't represent repetitions that require more than n bits; in this case a rational expression may be cheaper. If we use some type of compression then rational expressions would, in general, be cheaper the larger n is. e.g., 1/9 takes 5 bits to represent as a rational fraction (4 if we require the numerator to be 1). 0.1... takes 1 bit to represent in theory, but for n bits, it takes n bits. e.g., 1/9 = 0.111... in decimal = 0.000111... in binary, which takes 6 bits to store (000111).

If n was large and fixed then I think statistically it would be best to use rational expressions rather than repeating decimals. But rational expressions are not unique and that may cause some problems with representation... unless the cpu implemented some algorithms for reduction. 1/9 = 10/90, but 10/90 takes way more bits to represent than 1/9 even though they represent the same repeating decimal, and the repeating decimal's bits are fixed.

In any case, if the fpu had a history of calculations it could potentially store the calculations in an expression tree and attempt to reduce them. e.g., p/q*(q/p) = 1. A history could also be useful for constants in that multiplying several constants together could produce a new constant which would not introduce any new accumulation errors.
Feb 21 2014
prev sibling next sibling parent "Ivan Kazmenko" <gassa mail.ru> writes:
On Friday, 21 February 2014 at 20:27:18 UTC, Frustrated wrote:
 On Friday, 21 February 2014 at 09:04:40 UTC, Ivan Kazmenko 
 wrote:
 Thus repeating decimal for a fraction p/q will take up to q-1 
 bits when we store it as a repeating decimal, but 
 log(p)+log(q) bits when stored as a rational number (numerator 
 and denominator).

No, you are comparing apples to oranges. The q is not the same in both equations.

Huh? I believe my q is actually the same q in both. What I am saying is, more formally:

1. Representing p/q as a rational number asymptotically takes on the order of log(p)+log(q) bits.

2. Representing p/q as a repeating binary fraction asymptotically takes on the order of log(p)+q bits in the worst case, and this worst case is common.

Make that a decimal fraction instead of a binary fraction if you wish, the statement remains the same.
 The number of bits for p/q when p and q are stored separately is
 log[2](p) + log[2](q). But when p/q is stored as a repeating
 decimal(assuming it repeats), then it is a fixed constant
 dependent on the number of bits.

A rational represented as a fixed-base (binary, decimal, etc.) fraction always has a period. Sometimes the period is degenerate, that is, consists of a single zero: 1/2 = 0.100000... = 0.1 as a binary fraction. Sometimes there is a non-degenerate pre-period part before the period: 13/10 = 1.0100110011 = 1.0(1001) as a binary fraction, the "1.0" part being the pre-period and the "(1001)" part the period. The same goes for decimal fractions.
 So in some cases the rational expression is cheaper and in other
 cases the repeating decimal is cheaper.

Sure, it depends on the p and q, I was talking about the asymptotic case. Say, for p and q uniformly distributed in [100..999], I believe the rational representation is much shorter on average. We can measure that if the practical need arises... :) In practice, one may argue that large prime denominators don't occur too often; well, for such applications, fine then.
 If n was large and fixed then I think statistically it would be
 best to use rational expressions rather than repeating decimals.

Yeah, that's similar to what I was trying to state.
 But rational expressions are not unique and that may cause some
 problems with representation... unless the cpu implemented some
 algorithms for reduction. 1/9 = 10/90 but 10/90 takes way more
 bits to represent than 1/9 even though they represent the same
 repeating decimal and the repeating decimals bits are fixed.

Given a rational, bringing it to lowest terms is as easy (yet costly) as calculating the GCD.
 In any case, if the fpu had a history of calculations it could
 potentially store the calculations in an expression tree and
 attempt to reduce them. e.g., p/q*(q/p) = 1. A history could 
 also be useful for constants in that multiplying several 
 constants
 together could produce a new constant which would not introduce
 any new accumulation errors.

As far as I know, most compilers do that for constants known at compile time. As for the run-time usage of constants, note that the pool of constants will still have to be stored somewhere, and addressed somehow, and this may cancel out the performance gains.

Ivan Kazmenko.
Feb 22 2014
prev sibling next sibling parent "Ivan Kazmenko" <gassa mail.ru> writes:
On Saturday, 22 February 2014 at 14:17:23 UTC, Ivan Kazmenko 
wrote:
 Sometimes there is a non-degenerate pre-period part before the 
 period:
 13/10 = 1.0100110011 = 1.0(1001) as a binary fraction, the 
 "1.0" part being the pre-period and the "(1001)" part the 
 period.  The same goes for decimal fractions.

Sorry, I meant 13/10 = 1.0100110011..., that is, the "1001" part repeating forever - as opposed to just stopping at the 10th bit of the fraction.
Feb 22 2014
prev sibling next sibling parent "Nick B" <nick.barbalich gmail.com> writes:
Hi

I will ask my question again.

Is there any interest in this format within the D community?

Nick
Feb 23 2014
Iain Buclaw <ibuclaw gdcproject.org> writes:
On 20 February 2014 22:46, bearophile <bearophileHUGS lycos.com> wrote:
 Iain Buclaw:


 Most encoding formats use a form of discrete cosine transform, which
 involves floating point maths.

I don't believe this much :-(

:-( I looked up vorbis (very, very) briefly by way of example. Most structs that deal with encoding/decoding use integers, floats, and doubles to represent values such as decay, attenuation, thresholds, amplification, etc.
Feb 23 2014