digitalmars.D - Octal literals: who uses this?

Christopher Wright <dhasenan gmail.com> writes:

I've been looking at dil and lexing D. Lexing character literals and 
string literals is not quite so easy as I thought it would be, but 
overall not difficult either.

One thing I'm curious about:
There are three forms of hex literals:
\x: 2 digits
\u: 4 digits
\U: 8 digits

There is one form of octal literal:
\: 1 to 3 digits

Why? With hex literals, each option is a fixed width. That is sensible.

Octal literals aren't necessary with hex literals, but they might be 
convenient. However, making them variable width seems like it opens up 
the possibility for obscure bugs. I would not recommend that anyone use 
octal literals, and I don't think they're an advantage to the language. 
Even if they were, their current representation is not.

Can we just remove this?
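
To make the variable-width concern concrete, here is a minimal sketch of the C rule (one to three octal digits after the backslash). The helper name `decode_octal_escape` is hypothetical and is not taken from dil or dmd; it only illustrates how the same leading digit produces different characters depending on what follows it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (an assumption for illustration, not dil/dmd code):
 * decode the characters following a backslash using the C rule of one to
 * three octal digits. Stores the value in *value and returns the number of
 * digits consumed, or 0 if the first character is not an octal digit. */
static size_t decode_octal_escape(const char *s, unsigned *value)
{
    size_t n = 0;
    unsigned v = 0;
    while (n < 3 && s[n] >= '0' && s[n] <= '7') {
        v = v * 8 + (unsigned)(s[n] - '0');
        n++;
    }
    *value = v;
    return n;
}

/* The width of the escape depends on the characters that come after it. */
static void check_octal_widths(void)
{
    unsigned v;
    assert(decode_octal_escape("1", &v) == 1 && v == 1);     /* \1            */
    assert(decode_octal_escape("12", &v) == 2 && v == 10);   /* \12           */
    assert(decode_octal_escape("123", &v) == 3 && v == 83);  /* \123          */
    assert(decode_octal_escape("1234", &v) == 3 && v == 83); /* \123 then '4' */
    assert(decode_octal_escape("18", &v) == 1 && v == 1);    /* \1 then '8'   */
}
```

A fixed-width form such as \x avoids this: the lexer always knows how many digits belong to the escape without looking at the text that follows.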

Mar 14 2009

Sean Kelly <sean invisibleduck.org> writes:

Christopher Wright wrote:
 I've been looking at dil and lexing D. Lexing character literals and 
 string literals is not quite so easy as I thought it would be, but 
 overall not difficult either.
 
 One thing I'm curious about:
 There are three forms of hex literals:
 \x: 2 digits
 \u: 4 digits
 \U: 8 digits
 
 There is one form of octal literal:
 \: 1 to 3 digits
 
 Why? With hex literals, each option is a fixed width. That is sensible.
 
 Octal literals aren't necessary with hex literals, but they might be 
 convenient. However, making them variable width seems like it opens up 
 the possibility for obscure bugs. I would not recommend that anyone use 
 octal literals, and I don't think they're an advantage to the language. 
 Even if they were, their current representation is not.
 
 Can we just remove this?

All the escaped literals are going away, I believe.

Mar 14 2009

Stewart Gordon <smjg_1998 yahoo.com> writes:

Sean Kelly wrote:
<snip>
 All the escaped literals are going away, I believe.

I think all that's happening there is the removal of escaped characters 
not enclosed in quotes.

Stewart.

Mar 14 2009

Jarrett Billingsley <jarrett.billingsley gmail.com> writes:

On Sat, Mar 14, 2009 at 9:13 AM, Christopher Wright <dhasenan gmail.com> wrote:
 I've been looking at dil and lexing D. Lexing character literals and string
 literals is not quite so easy as I thought it would be, but overall not
 difficult either.

 One thing I'm curious about:
 There are three forms of hex literals:
 \x: 2 digits
 \u: 4 digits
 \U: 8 digits

 There is one form of octal literal:
 \: 1 to 3 digits

 Why? With hex literals, each option is a fixed width. That is sensible.

 Octal literals aren't necessary with hex literals, but they might be
 convenient. However, making them variable width seems like it opens up the
 possibility for obscure bugs. I would not recommend that anyone use octal
 literals, and I don't think they're an advantage to the language. Even if
 they were, their current representation is not.

People use octal?

Agreed.

Mar 14 2009

Stewart Gordon <smjg_1998 yahoo.com> writes:

Christopher Wright wrote:
<snip>
 Octal literals aren't necessary with hex literals, but they might be 
 convenient. However, making them variable width seems like it opens up 
 the possibility for obscure bugs. I would not recommend that anyone use 
 octal literals, and I don't think they're an advantage to the language. 
 Even if they were, their current representation is not.
 
 Can we just remove this?

One octal literal is very commonly used: \0.

At least save this one.  Just don't go allowing things like "\012" to 
mean ['\0', '1', '2'].
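
As a small sketch of the distinction, this is how a C compiler treats the two readings. It assumes nothing beyond standard C; the function name is made up for the example.

```c
#include <assert.h>
#include <string.h>

/* Illustrative check (function name is an assumption for this example). */
static void check_nul_vs_octal(void)
{
    /* "\012" is one escape: octal 12 is decimal 10. One byte plus the
     * terminating NUL gives a size of 2. */
    assert(sizeof("\012") == 2);
    assert("\012"[0] == 10);

    /* Splitting the literal ends the escape after \0, giving the
     * three characters NUL, '1', '2' followed by the terminator. */
    const char expected[] = {'\0', '1', '2', '\0'};
    assert(sizeof("\0" "12") == 4);
    assert(memcmp("\0" "12", expected, sizeof expected) == 0);
}
```

So keeping only \0 while dropping longer octal escapes would silently change the meaning of "\012" unless such sequences were rejected outright.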

Stewart.

Mar 14 2009

Walter Bright <newshound1 digitalmars.com> writes:

Christopher Wright wrote:
 Can we just remove this?

The octal literals are done the way C does them. They are there so that 
obscure bugs are not introduced when translating C code to D code.

Mar 14 2009

Stewart Gordon <smjg_1998 yahoo.com> writes:

Walter Bright wrote:
<snip>
 The octal literals are done the way C does them. The reason they are 
 there are for when translating C code to D code, obscure bugs are not 
 introduced.

How would making them illegal not achieve this aim?

Stewart.

Mar 14 2009

BCS <none anon.com> writes:

Hello Stewart,

 Walter Bright wrote:
 <snip>
 The octal literals are done the way C does them. The reason they are
 there are for when translating C code to D code, obscure bugs are not
 introduced.
 

 How would making them illegal not achieve this aim?
 
 Stewart.
 

Unless you also drop \0, any octal literal starting with 0 will get
lexed incorrectly.

Mar 14 2009

Walter Bright <newshound1 digitalmars.com> writes:

Stewart Gordon wrote:
 Walter Bright wrote:
 <snip>
 The octal literals are done the way C does them. The reason they are 
 there are for when translating C code to D code, obscure bugs are not 
 introduced.

 
 How would making them illegal not achieve this aim?

The only point to making them illegal would be to eventually remove them 
completely, which puts us back to \00 meaning something different in D 
than in C.

Mar 14 2009

Don <nospam nospam.com> writes:

Walter Bright wrote:
 Stewart Gordon wrote:
 Walter Bright wrote:
 <snip>
 The octal literals are done the way C does them. The reason they are 
 there are for when translating C code to D code, obscure bugs are not 
 introduced.

 How would making them illegal not achieve this aim?

 
 The only point to making them illegal would be to eventually remove them 
 completely, which puts us back to \00 meaning something different in D 
 than in C.

The "Obscure bugs during translation from C" argument presumes that such 
errors are more likely than ones such as:

int powersOfTen[] = {
   0001, // okay
   0010, // error: this is 8, not 10
   0100, // error: this is 64, not 100
   1000, // okay
};

and what the heck does "\000000\000000000\000\0000" mean?
I doubt there is much extant C code which uses octal. Automated 
translations of octal literals can be done accurately, and you're even 
supplying the 'htod' converter!
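
Both examples can be checked against a C compiler. The following is only a sketch; the names `powers_of_ten` and `check_don_examples` are made up here, and the byte count for the string follows from the rule that each backslash takes at most three octal digits.

```c
#include <assert.h>

/* A leading zero makes an integer literal octal in C. */
static const int powers_of_ten[] = { 0001, 0010, 0100, 1000 };

/* Illustrative check (names are assumptions for this example). */
static void check_don_examples(void)
{
    assert(powers_of_ten[0] == 1);
    assert(powers_of_ten[1] == 8);    /* not 10  */
    assert(powers_of_ten[2] == 64);   /* not 100 */
    assert(powers_of_ten[3] == 1000);

    /* The string splits as: \000 "000" \000 "000000" \000 \000 "0".
     * That is 4 NUL bytes and 10 '0' characters, 14 bytes in total,
     * plus the terminating NUL. */
    assert(sizeof("\000000\000000000\000\0000") == 15);
}
```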


there's a precedent for dropping them. This also means that right now, 

that that's a scenario that is at least as likely as bugs from C.

I think the argument for octal is very, very weak.

Mar 17 2009

BCS <ao pathlink.com> writes:

Reply to don,


 I think the argument for octal is very, very weak.
 

OTOH even if I grant that, I don't see much reason for dropping them.

Mar 17 2009

Walter Bright <newshound1 digitalmars.com> writes:

Don wrote:
 and what the heck does "\000000\000000000\000\0000" mean?

It doesn't matter, because if you're translating C code to D, the code 
is probably correct even if you don't know what it means.

 I doubt there is much extant C code which uses octal. Automated 
 translations of octal literals can be done accurately, and you're even 
 supplying the 'htod' converter!

htod is not intended for creating implementation source code. It's just 
for headers. I expect most C translations will be done by hand.



 there's a precedent for dropping them. This also means that right now, 

 that that's a scenario that is at least as likely as bugs from C.


I do see translating C to D (I do it myself!).

 I think the argument for octal is very, very weak.

The issue is really the cost of it being in vs the benefit of pulling it 
out. I see very little cost of leaving it in, so it doesn't need much 
benefit to make it worthwhile.

Mar 17 2009

BCS <ao pathlink.com> writes:

Reply to Walter,


 I do see translating C to D (I do it myself!).
 


I am working with a ~11KLOC C# code base and a tool to automatically 
translate it to D

"I had a problem, I decided to solve it with reg-ex, now I have 200 problems" 
<g>

Mar 17 2009

Walter Bright <newshound1 digitalmars.com> writes:

BCS wrote:
 Reply to Walter,
 

 I do see translating C to D (I do it myself!).

 

 translate it to D

Color me wrong, then!

Mar 17 2009

BCS <none anon.com> writes:

Hello Walter,

 BCS wrote:
 
 Reply to Walter,
 

 I do see translating C to D (I do it myself!).
 


 translate it to D
 

 Color me wrong, then!
 

Not too far off, you just forgot to qualify it with "sane".


translate well.

We have plans to release the translator "at some point".

Mar 17 2009

Walter Bright <newshound1 digitalmars.com> writes:

BCS wrote:
 We have plans to release the translator "at some point".

That'll be cool!

Mar 17 2009

Don <nospam nospam.com> writes:

Walter Bright wrote:
 Don wrote:
 and what the heck does "\000000\000000000\000\0000" mean?

 
 It doesn't matter, because if you're translating C code to D, the code 
 is probably correct even if you don't know what it means.

Note that in C, you can't reasonably have \0 embedded in a string. But 

C. It's far more likely in D that someone would write:
"1st\02nd\03rd\04th\0";
and expect it to work.
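
For illustration, here is what the C escape rule does to that literal, and one way to get the intended result. This is a sketch in C, assuming the same one-to-three-digit octal rule applies; the function and variable names are invented for the example.

```c
#include <assert.h>
#include <string.h>

/* Illustrative check (names are assumptions for this example). */
static void check_embedded_nul_list(void)
{
    /* \02, \03 and \04 are each read as one octal escape, so the digits
     * 2, 3 and 4 disappear and only the final \0 is a NUL byte. */
    static const char packed[] = "1st\02nd\03rd\04th\0";
    assert(sizeof packed == 14);
    assert(packed[3] == 2 && packed[4] == 'n');
    assert(strlen(packed) == 12);

    /* Adjacent-literal concatenation ends each escape before the digit,
     * giving four NUL-terminated items. */
    static const char fixed[] = "1st\0" "2nd\0" "3rd\0" "4th\0";
    assert(sizeof fixed == 17);
    assert(strcmp(fixed + 4, "2nd") == 0);
}
```

Nothing warns about the first form; the text simply comes out three characters shorter than the author expected.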

 I doubt there is much extant C code which uses octal. Automated 
 translations of octal literals can be done accurately, and you're even 
 supplying the 'htod' converter!

 
 htod is not intended for creating implementation source code. It's just 
 for headers. I expect most C translations will be done by hand.

The point is that a reasonable fraction of the few remaining instances 
of octal literals will be machine translated, and will therefore be 
free from these errors.

 

 there's a precedent for dropping them. This also means that right now, 

 argue that that's a scenario that is at least as likely as bugs from C.

 

 see translating C to D (I do it myself!).
 
 I think the argument for octal is very, very weak.

 
 The issue is really the cost of it being in vs the benefit of pulling it 
 out. I see very little cost of leaving it in, so it doesn't need much 
 benefit to make it worthwhile.

Inertia is the strongest argument, I think.
Octal-related bugs may occur
(1) when translating from ancient C code, if octal is removed.

(3) when writing new D code, if octal is retained.

IMHO, (2) and (3) are more probable than (1). However, all 3 cases are 
quite unlikely. It's extremely low on the list of priorities.

Mar 18 2009

Christopher Wright <dhasenan gmail.com> writes:

Walter Bright wrote:
 Christopher Wright wrote:
 Can we just remove this?

 
 The octal literals are done the way C does them. The reason they are 
 there are for when translating C code to D code, obscure bugs are not 
 introduced.

Okay, that makes sense. Removing it would be an option; \0 would have to 
change to \x00. But it's not a big deal, just an annoying blemish.

Mar 14 2009
