digitalmars.D - Less free underscores in number literals

bearophile <bearophileHUGS lycos.com> writes:

This is a minor thing, if you aren't interested, ignore it.

The support for underscore in number literals as done in D and Ada is a feature
I like a lot. But you may write:

long x = 1_000_000_000_00;

The underscores there don't correspond to the thousands; this may lead to
mistakes, and then maybe to bugs. Something similar may happen for hex (both
integral and FP), octal or binary number literals (which you usually don't
divide into groups of 3).
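
A minimal sketch of the concern, added for illustration and not taken from the post: the D lexer ignores underscore placement, so the literal above is simply one hundred billion, however it is grouped. The names `misgrouped` and `grouped` are made up for this example.

```d
// Underscores in D integer literals are purely visual separators.
enum long misgrouped = 1_000_000_000_00; // last group has only two digits
enum long grouped    = 100_000_000_000;  // same value, grouped by thousands

// Both spellings denote the same number, so the compiler cannot object.
static assert(misgrouped == grouped);
static assert(misgrouped == 100000000000L);

void main() {}
```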

In D I have written numbers with underscores positioned in a way that I
consider wrong.

So isn't it better to restrict underscores to every 3 digits (starting from
the least significant one) for decimal literals, and to every 4, 8, 16 or 32
digits in binary/octal/hex number literals? (4, 8, 16 or 32 means that you are
free to use any one of those four styles, but then you need to use it
consistently within one number literal.)
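
The proposed rule could be sketched roughly as below. This is a hypothetical checker, not an existing compiler feature: the function name `followsProposedGrouping` is invented here, and the sketch assumes plain decimal, `0x` and `0b` integer literals without suffixes (octal and floating-point literals are left out).

```d
import std.algorithm.searching : all, startsWith;
import std.array : split;

/// Hypothetical helper: returns true when `literal` places its underscores
/// the way the proposal would require.
bool followsProposedGrouping(string literal)
{
    // startsWith with several needles returns 0 when none of them matches.
    immutable bool prefixed = literal.startsWith("0x", "0X", "0b", "0B") != 0;
    string digits = prefixed ? literal[2 .. $] : literal;

    auto groups = digits.split("_");
    if (groups.length < 2)
        return true; // no underscores, so there is nothing to check

    // The group width is fixed by the groups after the first one.
    immutable size_t width = groups[1].length;
    immutable bool widthAllowed = prefixed
        ? (width == 4 || width == 8 || width == 16 || width == 32)
        : width == 3;

    return widthAllowed
        && groups[0].length >= 1 && groups[0].length <= width
        && groups[1 .. $].all!(g => g.length == width);
}

void main()
{
    assert( followsProposedGrouping("1_000_000"));
    assert(!followsProposedGrouping("1_000_000_000_00")); // the example above
    assert( followsProposedGrouping("0xFFFF_0000"));
    assert(!followsProposedGrouping("0b000_1"));          // 3 + 1 grouping
}
```

Note that a rule like this rejects any grouping that is not aligned to the least significant digit, whatever the reason for it.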

A problem with this is that not everybody uses groups of 3 digits in decimal
number literals (Do Chinese people use groups of four?).

(When I proposed introducing underscores in Python number literals, they
discussed this subtopic too.)

Bye,
bearophile

Oct 22 2010

KennyTM~ <kennytm gmail.com> writes:

Not all kinds of numbers are naturally grouped by 3 digits, e.g. the phone
number 555_6789, the credit card number 1234_5678_9012_3456L etc. (though
arguably these are better stored as strings).

std/traits.d in Phobos also contains a non-standard separator position:

enum ParameterStorageClass : uint
{
     /**
      * These flags can be bitwise OR-ed together to represent complex
      * storage class.
      */
     NONE    = 0,
     SCOPE   = 0b000_1,  /// ditto
     OUT     = 0b001_0,  /// ditto
     REF     = 0b010_0,  /// ditto
     LAZY    = 0b100_0,  /// ditto
}

Oct 22 2010

Austin Hastings <ah08010-d yahoo.com> writes:

I'm pretty opposed to this idea. Not just because it's euro-centric:

==========
 From http://en.wikipedia.org/wiki/Decimal_mark#Digit_grouping:

For example, in various countries (e.g., China, India, and Japan), there 
have been traditional conventions of grouping by 2 or 4 digits.

==========

But also because there's a lot I do that doesn't involve 3-digit 
grouping. Hex numbers, for example, make sense grouped as 2 or 4 digits.

Binary numbers make sense grouped as 3 (for octal) and 4 (for nibbles),
and bit-masks will frequently be unaligned, or aligned left instead of 
right (to describe upper-bit masks).
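
A few illustrative D constants for the groupings mentioned above. This is only a sketch; all of the names are invented for the example.

```d
// Hex grouped by 2 digits (bytes) and by 4 digits (16-bit words).
enum uint hexBytes = 0xDE_AD_BE_EF;
enum uint hexWords = 0xDEAD_BEEF;

// Binary grouped by 3 (one octal digit per group) and by 4 (nibbles).
enum ushort octalStyle = 0b111_101_101; // reads as octal 755
enum ubyte  nibbles    = 0b1010_0101;

// A mask for the upper three bits, grouped from the left rather than the right.
enum ubyte upperMask = 0b111_00000;

static assert(hexBytes == hexWords);
static assert(octalStyle == 493); // 7*64 + 5*8 + 5
static assert(nibbles == 0xA5);
static assert(upperMask == 0xE0);

void main() {}
```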

It may be that a warning is convenient if the radix is 10. But it should 
probably be a very low-profile warning. And easy to suppress.

=Austin

Oct 22 2010

Olivier Pisano <olivier.pisano laposte.net> writes:

Hi,

Chinese and Japanese people build large numbers by grouping digits in
myriads (every 10,000) rather than the Western thousands (1,000):

http://en.wikipedia.org/wiki/Japanese_numerals#Large_numbers

Thanks to Unicode, D has successfully enabled those people to write their
identifiers using their own characters if they want to. I don't see why
we should force them to count the same way as we do (I am European).

Olivier.

Oct 23 2010

Kagamin <spam here.lot> writes:

Olivier Pisano Wrote:

 Chinese and Japanese people do create large numbers are by grouping 
 digits in myriads (every 10,000) rather than the Western thousands (1000) :
 
 http://en.wikipedia.org/wiki/Japanese_numerals#Large_numbers

It's how their language builds numbers. Numbers written in ideographs use this
grouping, but this doesn't mean they use the same grouping for Arabic digits.
For example, amazon.co.jp uses Arabic numerals and Western 3-digit grouping.

Oct 23 2010

Rainer Deyke <rainerd eldwood.com> writes:

On 10/23/2010 10:53, Kagamin wrote:
 It's how their language builds numbers. Numbers written in ideographs
 use this grouping, but this doesn't mean, they use the same grouping
 for arabic digits. For example, amazon.co.jp uses arabic numbers and
 western 3-digit grouping.

Using groupings of three digits in Japanese seems extremely awkward,
especially for larger numbers, since you would have to mentally regroup
the digits in groups of four in order to read it.  It's not just the
written language but the spoken language that uses groups of four.  For
example, the number 1,234,567,890 would be read as 12億, 3456万, 7890.
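
The regrouping described here can be sketched in D as follows. This is an illustration only; the helper name `groupFromRight` is hypothetical and the separators are arbitrary.

```d
import std.array : join;
import std.stdio : writeln;

/// Hypothetical helper: splits a string of decimal digits into groups of
/// `size` digits, counting from the least significant (rightmost) digit.
string[] groupFromRight(string digits, size_t size)
{
    assert(size > 0);
    string[] groups;
    size_t end = digits.length;
    while (end > 0)
    {
        immutable size_t start = end > size ? end - size : 0;
        groups = [digits[start .. end]] ~ groups; // prepend the next group
        end = start;
    }
    return groups;
}

void main()
{
    writeln(groupFromRight("1234567890", 3).join(",")); // 1,234,567,890
    writeln(groupFromRight("1234567890", 4).join("_")); // 12_3456_7890
}
```

The same ten digits split into four groups by thousands but into three groups by myriads, which is why the written and spoken forms do not line up.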

If amazon.jp uses groups of three, then my initial reaction is
"imperfect localization".

Also, even in English there are cases where groupings other than three
make sense.  Consider:

int price_in_cents = 54_95;


-- 
Rainer Deyke - rainerd eldwood.com

Oct 23 2010

KennyTM~ <kennytm gmail.com> writes:

On Oct 24, 10 03:11, Rainer Deyke wrote:
 On 10/23/2010 10:53, Kagamin wrote:
 It's how their language builds numbers. Numbers written in ideographs
 use this grouping, but this doesn't mean, they use the same grouping
 for arabic digits. For example, amazon.co.jp uses arabic numbers and
 western 3-digit grouping.

 Using groupings of three digits in Japanese seems extremely awkward,
 especially for larger numbers, since you would have to mentally regroup
 the digits in groups of four in order to read it.  It's not just the
 written language but the spoken language that uses groups of four.  For
 example, the number 1,234,567,890 would be read as 12億, 3456万, 7890.

 If amazon.jp uses groups of three, then my initial reaction is
 "imperfect localization".

[Off topic]

While it is read in groups of 4, I've never seen the number grouped by 4
digits in any part of East Asia. It is either written in groups of three
"1,234,567,890", without grouping "1234567890", or using the native units
like "12億", "十二億三千四百五十六万七千八百九十" etc.

 Also, even in English there are cases where groupings other than three
 make sense.  Consider:

 int price_in_cents = 54_95;

Oct 23 2010

bearophile <bearophileHUGS lycos.com> writes:

Rainer Deyke:

 Also, even in English there are cases where groupings other than three
 make sense.  Consider:
 
 int price_in_cents = 54_95;

I see. It was a cute idea, but in the end it doesn't work. Thank you and the
other people for all the answers.

Bye,
bearophile

Oct 23 2010

Kagamin <spam here.lot> writes:

Rainer Deyke Wrote:

 Using groupings of three digits in Japanese seems extremely awkward,
 especially for larger numbers, since you would have to mentally regroup
 the digits in groups of four in order to read it.  It's not just the
 written language but the spoken language that uses groups of four.  For
 example, the number 1,234,567,890 would be read as 12億, 3456万, 7890.
 
 If amazon.jp uses groups of three, then my initial reaction is
 "imperfect localization".

You can see a video made by Japanese people themselves.
http://www.youtube.com/watch?v=zQrP8ELjVLc
At 0:10 you can see the YouTube site with a watch count of 1024,310, which is
effortlessly read as hyaku man.
screenshot: http://i55.tinypic.com/2niazvk.jpg

Oct 24 2010

Olivier Pisano <olivier.pisano laposte.net> writes:

On 24/10/2010 15:23, Kagamin wrote:
 Rainer Deyke Wrote:

 Using groupings of three digits in Japanese seems extremely awkward,
 especially for larger numbers, since you would have to mentally regroup
 the digits in groups of four in order to read it.  It's not just the
 written language but the spoken language that uses groups of four.  For
 example, the number 1,234,567,890 would be read as 12億, 3456万, 7890.

 If amazon.jp uses groups of three, then my initial reaction is
 "imperfect localization".

 You can see a video made by japanese themselves.
 http://www.youtube.com/watch?v=zQrP8ELjVLc
 At 0:10 you can see youtube site with watch count 1024,310 which is
effortlessly read as hyaku man.
 screenshot: http://i55.tinypic.com/2niazvk.jpg

Really strange, IMHO.
My last Japanese lesson at university was nine years ago, and what I can
remember is pretty much what Rainer Deyke wrote.

I don't think an anime is relevant, since what the voice actor actually read
is not the "1024,310" on the screen but a script which is not shown in the
final product and could have been written differently.

Only a native Japanese speaker could tell us whether (s)he can read a number
formatted in such a way effortlessly, and whether (s)he finds it natural.
Anyway, I still don't see the point of imposing our way of doing things
on others.

Cheers,

Olivier

Oct 24 2010

Jesse Phillips <jessekphillips+D gmail.com> writes:

Rainer Deyke Wrote:

 Using groupings of three digits in Japanese seems extremely awkward,
 especially for larger numbers, since you would have to mentally regroup
 the digits in groups of four in order to read it.  It's not just the
 written language but the spoken language that uses groups of four.  For
 example, the number 1,234,567,890 would be read as 12億, 3456万, 7890.

 Rainer Deyke - rainerd eldwood.com

I just asked, and the Japanese separate by the thousands when they write, and
it is read as you described.

Oct 24 2010

Jimmy Cao <jcao219 gmail.com> writes:

I don't know about the Japanese, but Chinese people read it like
12亿，3456万，7890 (simplified; the traditional version would be written
exactly as in Japanese).
I've never seen it separated with commas though; I always see 1234567890.

Oct 24 2010
