digitalmars.D - Less free underscores in number literals

reply bearophile <bearophileHUGS lycos.com> writes:
This is a minor thing, if you aren't interested, ignore it.

The support for underscore in number literals as done in D and Ada is a feature
I like a lot. But you may write:

long x = 1_000_000_000_00;

The underscores there don't fall on the thousands boundaries, which may lead
to mistakes, and then maybe to bugs. Something similar may happen for hex
(both integral and FP), octal or binary number literals (which you usually
don't divide into groups of 3).
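
To illustrate the concern, here is a minimal sketch showing that underscore placement carries no meaning for the compiler, so a misgrouped literal silently has the same value as a correctly grouped one. The variable names are made up for the example.

```d
import std.stdio;

void main()
{
    // Underscores are ignored when the literal is read; only the digits count.
    long misgrouped = 1_000_000_000_00;  // last group has only two digits
    long grouped    = 100_000_000_000;   // same value, grouped by thousands

    assert(misgrouped == grouped);

    // The same holds for hex and binary literals.
    static assert(0xFF_FF == 0xFFFF);
    static assert(0b1_0000 == 16);

    writeln(misgrouped); // prints 100000000000
}
```

Nothing in the source text warns the reader that the first literal is one hundred billion rather than something near a trillion.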

In D I have written numbers with underscores positioned in a way that I
consider wrong.

So isn't it better to restrict underscores to every 3 digits (counting from
the least significant one) in decimal literals, and to every 4 or 8 or 16 or
32 digits in binary/octal/hex number literals? (4 or 8 or 16 or 32 means that
you are free to pick one of those four styles, but then you need to use it
consistently within one number literal).
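
The decimal half of this rule can be sketched as a small checker that works on the literal's text. This is only an illustration of the proposal, not anything the compiler does; `groupedByThrees` is a hypothetical helper name, and suffixes such as `L` or `u` are deliberately not handled.

```d
import std.algorithm : all;
import std.array : split;
import std.ascii : isDigit;

/// Hypothetical helper: true if a decimal literal has no underscores,
/// or is grouped in threes counted from the least significant digit.
bool groupedByThrees(string literal)
{
    if (literal.length == 0)
        return false;

    auto groups = literal.split("_");

    // A literal without any underscore is always acceptable.
    if (groups.length == 1)
        return literal.all!isDigit();

    // The leading group holds 1 to 3 digits; every later group exactly 3.
    if (groups[0].length < 1 || groups[0].length > 3)
        return false;
    foreach (g; groups[1 .. $])
        if (g.length != 3)
            return false;

    // Every group must consist of decimal digits only.
    foreach (g; groups)
        if (!g.all!isDigit())
            return false;
    return true;
}

void main()
{
    assert(groupedByThrees("1000000"));
    assert(groupedByThrees("1_000_000"));
    assert(groupedByThrees("100_000_000_000"));
    assert(!groupedByThrees("1_000_000_000_00")); // the literal from above
    assert(!groupedByThrees("54_95"));
}
```

A rule for binary/octal/hex literals would need the same check with a group size of 4, 8, 16 or 32, applied consistently within one literal.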

A problem with this is that not everybody uses groups of 3 digits in decimal
number literals (Do Chinese people use groups of four?).

(When I proposed introducing underscores in Python number literals, they
discussed this subtopic too.)

Bye,
bearophile
Oct 22 2010
next sibling parent KennyTM~ <kennytm gmail.com> writes:
On Oct 23, 10 11:11, bearophile wrote:
 This is a minor thing, if you aren't interested, ignore it.

 The support for underscore in number literals as done in D and Ada is a
feature I like a lot. But you may write:

 long x = 1_000_000_000_00;

 The usage of underscores there doesn't correspond to the thousands, this may
lead to mistakes, and then maybe to bugs. Something similar may happen for hex
(both integral and FP), octal or binary number literals (that usually you don't
divide in groups of 3).

 In D I have written numbers with underscores positioned in a way that I
consider wrong.

 So isn't it better to restrict the usage of the underscores every 3 digits
(starting from the less significant one) for decimal literals, and every 4 or 8
or 16 or 32 digits in binary/octal/hex number literals? (4 or 8 or 16 or 32
means that you are free to use one of those four styles, but then you need to
use it consistently in one number literal).

 A problem with this is that not everybody uses groups of 3 digits in decimal
number literals (Do Chinese people use groups of four?).

 (When I have proposed to introduce underscores in Python number literals they
have discussed about this sub topic too.)

 Bye,
 bearophile

Not all kinds of numbers are naturally grouped by 3 digits, e.g. phone number 555_6789, credit card numbers 1234_5678_9012_3456L etc. (arguably they may be better stored as strings though.)

std/traits.d of Phobos also contains non-standard position of separation:

    enum ParameterStorageClass : uint
    {
        /**
         * These flags can be bitwise OR-ed together to represent complex storage
         * class.
         */
        NONE    = 0,
        SCOPE   = 0b000_1,  /// ditto
        OUT     = 0b001_0,  /// ditto
        REF     = 0b010_0,  /// ditto
        LAZY    = 0b100_0,  /// ditto
    }
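
As a small sketch of what those literals evaluate to, the enum can be copied locally and checked. The copy below is for illustration only; the actual declaration lives in std.traits and may differ between Phobos versions.

```d
// Local copy of the quoted enum, for illustration only.
enum ParameterStorageClass : uint
{
    NONE  = 0,
    SCOPE = 0b000_1,
    OUT   = 0b001_0,
    REF   = 0b010_0,
    LAZY  = 0b100_0,
}

void main()
{
    // The underscore sits one digit from the right, yet the values are
    // simply 1, 2, 4 and 8.
    static assert(ParameterStorageClass.SCOPE == 1);
    static assert(ParameterStorageClass.OUT   == 2);
    static assert(ParameterStorageClass.REF   == 4);
    static assert(ParameterStorageClass.LAZY  == 8);

    // Flags can be combined with bitwise OR.
    uint combined = ParameterStorageClass.REF | ParameterStorageClass.LAZY;
    assert(combined == 0b110_0);
}
```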
Oct 22 2010
prev sibling next sibling parent Austin Hastings <ah08010-d yahoo.com> writes:
On 10/22/2010 11:11 PM, bearophile wrote:
 This is a minor thing, if you aren't interested, ignore it.

 The support for underscore in number literals as done in D and Ada is a
feature I like a lot. But you may write:

 long x = 1_000_000_000_00;

 The usage of underscores there doesn't correspond to the thousands, this may
lead to mistakes, and then maybe to bugs. Something similar may happen for hex
(both integral and FP), octal or binary number literals (that usually you don't
divide in groups of 3).

 In D I have written numbers with underscores positioned in a way that I
consider wrong.

 So isn't it better to restrict the usage of the underscores every 3 digits
(starting from the less significant one) for decimal literals, and every 4 or 8
or 16 or 32 digits in binary/octal/hex number literals? (4 or 8 or 16 or 32
means that you are free to use one of those four styles, but then you need to
use it consistently in one number literal).

 A problem with this is that not everybody uses groups of 3 digits in decimal
number literals (Do Chinese people use groups of four?).

 (When I have proposed to introduce underscores in Python number literals they
have discussed about this sub topic too.)

I'm pretty opposed to this idea. Not just because it's euro-centric:

==========
From http://en.wikipedia.org/wiki/Decimal_mark#Digit_grouping:

For example, in various countries (e.g., China, India, and Japan), there have been traditional conventions of grouping by 2 or 4 digits.
==========

But also because there's a lot I do that doesn't involve 3-digit grouping. Hex numbers, for example, make sense grouped as 2 or 4 digits. Binary numbers make sense grouped as 3 (for octal) and 4 (for nibbles), and bit-masks will frequently be unaligned, or aligned left instead of right (to describe upper-bit masks).

It may be that a warning is convenient if the radix is 10. But it should probably be a very low-profile warning. And easy to suppress.

=Austin
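
The groupings mentioned above can be sketched with a few compile-time checks. The specific constants and the name `upperMask` are chosen only for the example.

```d
void main()
{
    // Hex grouped by 2 digits (bytes) or by 4 digits (16-bit words).
    static assert(0xDE_AD_BE_EF == 0xDEAD_BEEF);

    // Binary grouped by 3 bits mirrors octal digits: 6, 4, 0.
    static assert(0b110_100_000 == 6 * 64 + 4 * 8 + 0);

    // Binary grouped by 4 bits mirrors hex nibbles.
    static assert(0b1010_0101 == 0xA5);

    // A left-aligned group marking the top three bits of a byte.
    enum ubyte upperMask = 0b111_00000;
    static assert(upperMask == 0xE0);
}
```

A fixed "every 4 digits from the right" rule would reject both the 3-bit grouping and the left-aligned mask.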
Oct 22 2010
prev sibling parent reply Olivier Pisano <olivier.pisano laposte.net> writes:
Le 23/10/2010 05:11, bearophile a écrit :
 This is a minor thing, if you aren't interested, ignore it.

 The support for underscore in number literals as done in D and Ada is a
feature I like a lot. But you may write:

 long x = 1_000_000_000_00;

 The usage of underscores there doesn't correspond to the thousands, this may
lead to mistakes, and then maybe to bugs. Something similar may happen for hex
(both integral and FP), octal or binary number literals (that usually you don't
divide in groups of 3).

 In D I have written numbers with underscores positioned in a way that I
consider wrong.

 So isn't it better to restrict the usage of the underscores every 3 digits
(starting from the less significant one) for decimal literals, and every 4 or 8
or 16 or 32 digits in binary/octal/hex number literals? (4 or 8 or 16 or 32
means that you are free to use one of those four styles, but then you need to
use it consistently in one number literal).

 A problem with this is that not everybody uses groups of 3 digits in decimal
number literals (Do Chinese people use groups of four?).

 (When I have proposed to introduce underscores in Python number literals they
have discussed about this sub topic too.)

 Bye,
 bearophile

Hi,

Chinese and Japanese people do create large numbers by grouping digits in myriads (every 10,000) rather than the Western thousands (1000):

http://en.wikipedia.org/wiki/Japanese_numerals#Large_numbers

Thanks to Unicode, D has successfully enabled those people to write their identifiers using their own characters if they want to. I don't see why we should force them to count the same way as we do (I am European).

Olivier.
Oct 23 2010
parent reply Kagamin <spam here.lot> writes:
Olivier Pisano Wrote:

 Chinese and Japanese people do create large numbers by grouping
 digits in myriads (every 10,000) rather than the Western thousands (1000):
 
 http://en.wikipedia.org/wiki/Japanese_numerals#Large_numbers

It's how their language builds numbers. Numbers written in ideographs use this grouping, but this doesn't mean they use the same grouping for Arabic digits. For example, amazon.co.jp uses Arabic numerals and Western 3-digit grouping.
Oct 23 2010
parent reply Rainer Deyke <rainerd eldwood.com> writes:
On 10/23/2010 10:53, Kagamin wrote:
 It's how their language builds numbers. Numbers written in ideographs
 use this grouping, but this doesn't mean, they use the same grouping
 for arabic digits. For example, amazon.co.jp uses arabic numbers and
 western 3-digit grouping.

Using groupings of three digits in Japanese seems extremely awkward, especially for larger numbers, since you would have to mentally regroup the digits in groups of four in order to read it. It's not just the written language but the spoken language that uses groups of four. For example, the number 1,234,567,890 would be read as 12億, 3456万, 7890.

If amazon.jp uses groups of three, then my initial reaction is "imperfect localization".

Also, even in English there are cases where groupings other than three make sense. Consider:

    int price_in_cents = 54_95;

--
Rainer Deyke - rainerd eldwood.com
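
A minimal sketch of the price example: the underscore marks the dollars/cents boundary for the reader, while the value is still a plain integer count of cents. The `dollars` and `cents` variables are added only for illustration.

```d
import std.stdio;

void main()
{
    int price_in_cents = 54_95;  // underscore marks the dollars/cents boundary

    assert(price_in_cents == 5495);

    int dollars = price_in_cents / 100;
    int cents   = price_in_cents % 100;
    writefln("$%d.%02d", dollars, cents); // prints $54.95
}
```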
Oct 23 2010
next sibling parent KennyTM~ <kennytm gmail.com> writes:
On Oct 24, 10 03:11, Rainer Deyke wrote:
 On 10/23/2010 10:53, Kagamin wrote:
 It's how their language builds numbers. Numbers written in ideographs
 use this grouping, but this doesn't mean, they use the same grouping
 for arabic digits. For example, amazon.co.jp uses arabic numbers and
 western 3-digit grouping.

 Using groupings of three digits in Japanese seems extremely awkward,
 especially for larger numbers, since you would have to mentally regroup
 the digits in groups of four in order to read it.  It's not just the
 written language but the spoken language that uses groups of four.  For
 example, the number 1,234,567,890 would be read as 12億, 3456万, 7890.

 If amazon.jp uses groups of three, then my initial reaction is
 "imperfect localization".

[Off topic] While it is read in groups of 4, I've never seen the number grouped by 4 digits in any part of East Asia. It is either written in groups of three ("1,234,567,890"), without grouping ("1234567890"), or using the native units, like "12億" or "十二億三千四百五十六万七千八百九十".

 Also, even in English there are cases where groupings other than three
 make sense.  Consider:

 int price_in_cents = 54_95;

Oct 23 2010
prev sibling next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Rainer Deyke:

 Also, even in English there are cases where groupings other than three
 make sense.  Consider:
 
 int price_in_cents = 54_95;

I see. It was a cute idea, but in the end it doesn't work. Thank you and the other people for all the answers.

Bye,
bearophile
Oct 23 2010
prev sibling parent reply Kagamin <spam here.lot> writes:
Rainer Deyke Wrote:

 Using groupings of three digits in Japanese seems extremely awkward,
 especially for larger numbers, since you would have to mentally regroup
 the digits in groups of four in order to read it.  It's not just the
 written language but the spoken language that uses groups of four.  For
 example, the number 1,234,567,890 would be read as 12億, 3456万, 7890.
 
 If amazon.jp uses groups of three, then my initial reaction is
 "imperfect localization".

You can see a video made by Japanese people themselves:
http://www.youtube.com/watch?v=zQrP8ELjVLc
At 0:10 you can see the YouTube site with a watch count of 1024,310, which is effortlessly read as hyaku man.
screenshot: http://i55.tinypic.com/2niazvk.jpg
Oct 24 2010
parent Olivier Pisano <olivier.pisano laposte.net> writes:
Le 24/10/2010 15:23, Kagamin a écrit :
 Rainer Deyke Wrote:

 Using groupings of three digits in Japanese seems extremely awkward,
 especially for larger numbers, since you would have to mentally regroup
 the digits in groups of four in order to read it.  It's not just the
 written language but the spoken language that uses groups of four.  For
 example, the number 1,234,567,890 would be read as 12億, 3456万, 7890.

 If amazon.jp uses groups of three, then my initial reaction is
 "imperfect localization".

 You can see a video made by japanese themselves.
 http://www.youtube.com/watch?v=zQrP8ELjVLc
 At 0:10 you can see youtube site with watch count 1024,310 which is
 effortlessly read as hyaku man.
 screenshot: http://i55.tinypic.com/2niazvk.jpg

Really strange, IMHO. My last Japanese lesson at university was nine years ago, and what I can remember is pretty much what Rainer Deyke wrote.

I don't think an anime is relevant, since what the voice actor actually read is not the "1024,310" on the screen, but a script which is not shown in the final product and could have been written differently. Only a native Japanese person could tell us whether (s)he can read a number formatted in such a way effortlessly, and whether (s)he finds it natural.

Anyway, I still don't see the point of imposing our way of doing things on others.

Cheers,
Olivier
Oct 24 2010