D - String literals

"Walter" <walter digitalmars.com> writes:
Currently, there are 3 kinds of string literals:
'string' : wysiwyg strings
"string" : escaped strings
\ : single character strings

There is no character literal syntax; 1 character long strings are
implicitly converted to character literals based on context. Unfortunately,
this leads to ambiguities with no reasonable way out (other than crafting
arbitrary and confusing rules).

So, I've been thinking of going back to the C way and having ' ' for
character literals. That means that wysiwyg strings are left without a
lexical syntax. Any ideas for something that would look nice? How about
using back quotes ` `, or is that just too hard to distinguish in certain
fonts? One thing to keep in mind is that wysiwyg strings are not going to be
used with nearly the same frequency as escaped strings, so the syntax can be
a bit less convenient for them.

I'd like to use /string/, but that leads to too many lexical ambiguities.

Some possibilities are:
1) prefixing the " with a letter or a character, as in:
    W"string"
    %"string"
    !"string"
2) using a character not used in C, such as:
    `string`
    $string$
    @string@
    #string#
Jul 26 2003
Russ Lewis <spamhole-2001-07-16 deming-os.org> writes:
What about double double quotes for wysiwyg?

""string"" : wysiwyg string

I don't really like that syntax, but I think it's better than any of the 
other alternatives proposed so far.  I like the fact that it uses a 
quote character to denote a string, rather than some other, arbitrarily 
selected character.

Walter wrote:
 Currently, there are 3 kinds of string literals:
 'string' : wysiwyg strings
 "string" : escaped strings
 \ : single character strings
 
 There is no character literal syntax; 1 character long strings are
 implicitly converted to character literals based on context. Unfortunately,
 this leads to ambiguities with no reasonable way out (other than crafting
 arbitrary and confusing rules).
 
 So, I've been thinking of going back to the C way and having ' ' for
 character literals. That means that wysiwyg strings are left without a
 lexical syntax. Any ideas for something that would look nice? How about
 using back quotes ` `, or is that just too hard to distinguish in certain
 fonts? One thing to keep in mind is that wysiwyg strings are not going to be
 used with nearly the same frequency as escaped strings, so the syntax can be
 a bit less convenient for them.
 
 I'd like to use /string/, but that leads to too many lexical ambiguities.
 
 Some possibilities are:
 1) prefixing the " with a letter or a character, as in:
     W"string"
     %"string"
     !"string"
 2) using a character not used in C, such as:
     `string`
     $string$
     @string@
     #string#
 
 

Jul 26 2003
"Walter" <walter digitalmars.com> writes:
"Russ Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message
news:bfuc2b$5ck$1 digitaldaemon.com...
 What about double double quotes for wysiwyg?

 ""string"" : wysiwyg string

 I don't really like that syntax, but I think it's better than any of the
 other alternatives proposed so far.  I like the fact that it uses a
 quote character to denote a string, rather than some other, arbitrarily
 selected character.

It's a good idea, but it conflicts with using "" to denote an empty string. I thought of using "'string'", but it looks too weird <g>.
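
The conflict can be sketched in Python, assuming a simplified C-style lexer in which a double quote opens a string that runs to the next double quote (escapes are ignored for brevity). The `tokenize` helper and its rules are hypothetical and are not taken from the D lexer:

```python
import re

# Hypothetical C-style token rules: a quoted string runs to the next '"'.
TOKEN = re.compile(r'"[^"]*"|[A-Za-z_]\w*|\s+')

def tokenize(src):
    """Split src into (kind, text) pairs using the simplified rules above."""
    out = []
    pos = 0
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        text = m.group()
        if not text.isspace():
            kind = "string" if text.startswith('"') else "ident"
            out.append((kind, text))
        pos = m.end()
    return out

# The proposed ""string"" form is read as: empty string, identifier, empty string.
print(tokenize('""string""'))
```

Under these assumed rules the lexer has already accepted `""` as a complete empty string before it ever sees the word `string`, which is the clash described above.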
Jul 26 2003
"Sean L. Palmer" <palmer.sean verizon.net> writes:
I thought of that too, but it conflicts with '"' and "'".

What about '' (two single quotes)?

'c' // char
''string''  // wysiwyg string
"string"  // escaped string

I wouldn't be opposed to backquotes, either.

`string`

or

``string``

Unicode has some nice quotes too.

Sean

"Walter" <walter digitalmars.com> wrote in message
news:bfucor$64l$1 digitaldaemon.com...
 "Russ Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message
 news:bfuc2b$5ck$1 digitaldaemon.com...
 What about double double quotes for wysiwyg?

 ""string"" : wysiwyg string

 I don't really like that syntax, but I think it's better than any of the
 other alternatives proposed so far.  I like the fact that it uses a
 quote character to denote a string, rather than some other, arbitrarily
 selected character.

 It's a good idea, but it conflicts with using "" to denote an empty
 string. I thought of using "'string'", but it looks too weird <g>.

Jul 26 2003
"Walter" <walter digitalmars.com> writes:
"Sean L. Palmer" <palmer.sean verizon.net> wrote in message
news:bfunju$g8e$1 digitaldaemon.com...
 What about '' (two single quotes)?

 'c' // char
 ''string''  // wysiwyg string
 "string"  // escaped string

In a proportional font, they are nearly indistinguishable.
 I wouldn't be opposed to backquotes, either.

 `string`

 or

 ``string``

Probably ` is the current leading contender.
 Unicode has some nice quotes too.

I want to use something on a regular keyboard <g>.
Jul 26 2003
Ilya Minkov <midiclub 8ung.at> writes:
Walter wrote:
 "Sean L. Palmer" <palmer.sean verizon.net> wrote in message
 news:bfunju$g8e$1 digitaldaemon.com...
 
What about '' (two single quotes)?

'c' // char
''string''  // wysiwyg string
"string"  // escaped string

In a proportional font, they are nearly indistinguishable.

I think Python uses something like triple double quotes for something.

"""This is a docstring or a preformatted string, IIRC."""

-i.
Jul 26 2003
Chris Lawson <cl nospamhere.tinfoilhat.ca> writes:
Ilya Minkov wrote:
 Walter wrote:
 
 In a proportional font, they are nearly indistinguishable.

 I think Python uses something like triple double quotes for something.

 """This is a docstring or a preformatted string, IIRC."""

Preformatted.

foo = """ here the \n's are maintained. """

There are some formatting options as well with them (%s etc).
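
A short Python check of what triple quotes do there, offered as an illustration of that language only. The variable names are made up for the example:

```python
# Triple-quoted literal: the line breaks typed in the source are kept.
block = """Line 1
Line 2"""

# Escape sequences are still processed inside an ordinary triple-quoted string...
cooked = """a\tb"""
# ...unless the literal also carries the r prefix, which leaves them as typed.
raw = r"""a\tb"""

print(block.splitlines())
print(len(cooked), len(raw))
```

So in Python the triple quotes preserve layout, while the separate r prefix is what switches off escapes. The %s formatting mentioned above is the `%` operator, which applies to any Python string and is not specific to triple quotes.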
 
 -i.
 

Jul 28 2003
Andy Friesen <andy ikagames.com> writes:
Walter wrote:
 "Russ Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message
 news:bfuc2b$5ck$1 digitaldaemon.com...
 
What about double double quotes for wysiwyg?

""string"" : wysiwyg string

I don't really like that syntax, but I think it's better than any of the
other alternatives proposed so far.  I like the fact that it uses a
quote character to denote a string, rather than some other, arbitrarily
selected character.

It's a good idea, but it conflicts with using "" to denote an empty string. I thought of using "'string'", but it looks too weird <g>.

Python uses r"string" for raw (wysiwyg) strings.  C# uses @"string".  This is probably easier to parse, as @ isn't used anywhere else in the language. (as far as I know)
Jul 27 2003
Patrick Down <pat codemoon.com> writes:
"Walter" <walter digitalmars.com> wrote in
news:bfu9kp$3f1$1 digitaldaemon.com: 

 Currently, there are 3 kinds of string literals:
 'string' : wysiwyg strings
 "string" : escaped strings
 \ : single character strings
 
 There is no character literal syntax; 1 character long strings are
 implicitly converted to character literals based on context.
 Unfortunately, this leads to ambiguities with no reasonable way out
 (other than crafting arbitrary and confusing rules).
 
 So, I've been thinking of going back to the C way and having ' ' for
 character literals. That means that wysiwyg strings are left without a
 lexical syntax. Any ideas for something that would look nice? How
 about using back quotes ` `, or is that just too hard to distinguish
 in certain fonts? One thing to keep in mind is that wysiwyg strings
 are not going to be used with nearly the same frequency as escaped
 strings, so the syntax can be a bit less convenient for them.
 
 I'd like to use /string/, but that leads to too many lexical
 ambiguities. 
 
 Some possibilities are:
 1) prefixing the " with a letter or a character, as in:
     W"string"
     %"string"
     !"string"

These are ok.
 2) using a character not used in C, such as:
     `string`
     $string$
     @string@
     #string#

I think you should reserve these symbols for other things.

Python uses """ for multi line strings.  I find it
useful.

a="""
Line 1
Line 2
"""
Jul 26 2003
"Walter" <walter digitalmars.com> writes:
"Patrick Down" <pat codemoon.com> wrote in message
news:Xns93C480D51FA3Dpatcodemooncom 63.105.9.61...
 Python uses """ for multi line strings.  I find it
 useful.

 a="""
 Line 1
 Line 2
 """

That will work. But an empty string would be """""". Hmm. How about perhaps two single quotes, as in ''f''? Sadly, in a proportional font, it looks just like "f".
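
For comparison, Python (whose triple-quote syntax is cited above) already accepts the six-quote empty literal. A quick check, shown only to illustrate that language's behaviour and not anything D would necessarily adopt:

```python
# Six double quotes: a triple-quote opener immediately followed by its closer.
empty = """"""

print(repr(empty))
print(len(empty))
```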
Jul 26 2003
"Fabian Giesen" <rygNO SPAMgmx.net> writes:
 That will work. But an empty string would be """""". Hmm. How about

Yeah, but as there is absolutely no point in using the WYSIWYG syntax for empty strings, I doubt this will have any practical problems :)

-fg
Jul 27 2003
"Walter" <walter digitalmars.com> writes:
"Fabian Giesen" <rygNO SPAMgmx.net> wrote in message
news:bg05b6$1tnm$1 digitaldaemon.com...
 That will work. But an empty string would be """""". Hmm. How about

Yeah, but as there is absolutely no point in using the WYSIWYG syntax for empty strings, I doubt this will have any practical problems :)

You do have a point.
Jul 27 2003
Ilya Minkov <midiclub 8ung.at> writes:
Walter wrote:
 Some possibilities are:
 2) using a character not used in C, such as:
     `string`
     $string$
     @string@
     #string#

Using these ugly characters seems somewhat... ugly to me. Vote Against.
 1) prefixing the " with a letter or a character, as in:
     W"string"
     %"string"
     !"string"

Why do you want to prefix wysiwyg strings? You may also prefix a string which is to mean character literal - which seems to make more sense to me.

W"" looks too much like wide-something.

Using % would prohibit overloading % as an operator where string is a second parameter. I consider it useful as a formatting operator or somesuch.

! seems to be OK since it's a unary operator anyway, and as such cannot be overloaded for built-in types.

-i.
Jul 26 2003
"DeadCow" <deadcow-remove-this free.fr> writes:
what about a perl-like quotation ?

s (for string?) followed by any character as delimiter.

s/.../
s#...#
s%...%
s!...!

So you can choose the delimiter to avoid escaping.

ok it's a bit ugly.
can the lexer handle this kind of backward reference ?

-- Nicolas Repiquet
Jul 26 2003
"Walter" <walter digitalmars.com> writes:
"DeadCow" <deadcow-remove-this free.fr> wrote in message
news:bfuit9$bnr$1 digitaldaemon.com...
 what about a perl-like quotation ?

 s (for string?) followed by any character as delimiter.

 s/.../
 s#...#
 s%...%
 s!...!

 So you can choose the delimiter to avoid escaping.

 ok it's a bit ugly.

<g>
 can the lexer handle this kind of backward reference ?

Yes.
Jul 26 2003
"Walter" <walter digitalmars.com> writes:
"Walter" <walter digitalmars.com> wrote in message
news:bfussh$kiq$1 digitaldaemon.com...
 "DeadCow" <deadcow-remove-this free.fr> wrote in message
 news:bfuit9$bnr$1 digitaldaemon.com...
 what about a perl-like quotation ?

 s (for string?) followed by any character as delimiter.

 s/.../
 s#...#
 s%...%
 s!...!

 So you can choose the delimiter to avoid escaping.

 ok it's a bit ugly.

<g>
 can the lexer handle this kind of backward reference ?

Yes.

My mistake, no it can't. s% would tokenize ambiguously.
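
A rough Python sketch of that ambiguity: the same characters can be split into two different token streams, depending on whether `s%` is read as an identifier followed by the modulus operator or as the opener of the proposed literal. Both helper functions and the sample input are hypothetical and exist only for this illustration:

```python
import re

# One identifier, one number, or any single non-space character.
WORD_OR_OP = re.compile(r"[A-Za-z_]\w*|\d+|\S")

def lex_as_operators(src):
    """Reading 1: 's' is an ordinary identifier and '%' is an operator."""
    return WORD_OR_OP.findall(src)

def lex_as_literal(src):
    """Reading 2: 's' plus a punctuation delimiter opens a string that
    runs to the next occurrence of that same delimiter."""
    tokens, pos = [], 0
    while pos < len(src):
        if src[pos].isspace():
            pos += 1
            continue
        nxt = src[pos + 1] if pos + 1 < len(src) else ""
        is_delim = bool(nxt) and not (nxt.isalnum() or nxt == "_" or nxt.isspace())
        if src[pos] == "s" and is_delim:
            end = src.index(nxt, pos + 2)  # ValueError if the literal never closes
            tokens.append(src[pos:end + 1])
            pos = end + 1
        else:
            m = WORD_OR_OP.match(src, pos)
            tokens.append(m.group())
            pos = m.end()
    return tokens

source = "x = s%a%b;"
print(lex_as_operators(source))
print(lex_as_literal(source))
```

Both readings are plausible for a variable named `s`, so a lexer working without type or scope information has no principled way to choose between them.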
Jul 26 2003
"Luna Kid" <lunakid neuropolis.org> writes:
"More spectacularly" distinguishing char literals from string
literals, in that they won't be surrounded by delimiters but
just prefixed with something, may also be a good solution.
(Char literals, after all, are totally different animals than
string literals in many ways.) Then some prefix char and some
simple rules to allow various (unambiguous) representations
of a char need to be defined.

Sz.


"Walter" <walter digitalmars.com> wrote in message
news:bfu9kp$3f1$1 digitaldaemon.com...
 Currently, there are 3 kinds of string literals:
 'string' : wysiwyg strings
 "string" : escaped strings
 \ : single character strings

 There is no character literal syntax; 1 character long strings are
 implicitly converted to character literals based on context. Unfortunately,
 this leads to ambiguities with no reasonable way out (other than crafting
 arbitrary and confusing rules).

 So, I've been thinking of going back to the C way and having ' ' for
 character literals. That means that wysiwyg strings are left without a
 lexical syntax. Any ideas for something that would look nice? How about
 using back quotes ` `, or is that just too hard to distinguish in certain
 fonts? One thing to keep in mind is that wysiwyg strings are not going to be
 used with nearly the same frequency as escaped strings, so the syntax can be
 a bit less convenient for them.

 I'd like to use /string/, but that leads to too many lexical ambiguities.

 Some possibilities are:
 1) prefixing the " with a letter or a character, as in:
     W"string"
     %"string"
     !"string"
 2) using a character not used in C, such as:
     `string`
     $string$
     @string@
     #string#

Jul 26 2003
"Luna Kid" <lunakid neuropolis.org> writes:
 simple rules to allow various (unambiguous) representations
 of a char need to be defined.

Umm, those rules are already defined as EscapeSequence. :)

So, how about this?

Strings:

    'verbatim, as before'

    "escaped,\x20as\x20before"

Chars:

    #c    <-- letter 'c'
    ##    <-- is this problematic in D?
    #\t   <-- Tab
    #\x20 <-- Space
    #\\   <-- Backslash

Sz.
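
To make the proposal concrete, here is a small Python sketch of how such prefixed character literals might be decoded. The `#` syntax is only the suggestion from this post, and `parse_char_literal` plus its short escape table are hypothetical stand-ins for D's EscapeSequence rules:

```python
# A deliberately small escape table; the real EscapeSequence set is larger.
SIMPLE_ESCAPES = {"t": "\t", "n": "\n", "\\": "\\"}

def parse_char_literal(src):
    # Decode a proposed literal such as #c, ##, #\t, #\x20 or #\\ .
    if len(src) < 2 or src[0] != "#":
        raise ValueError("expected '#' followed by a character")
    body = src[1:]
    if body[0] != "\\":
        if len(body) != 1:
            raise ValueError("an unescaped literal must be one character")
        return body
    esc = body[1:]
    if esc[:1] == "x":
        return chr(int(esc[1:], 16))  # hex escape, e.g. x20
    if esc in SIMPLE_ESCAPES:
        return SIMPLE_ESCAPES[esc]
    raise ValueError(f"unknown escape: {body}")

for text in ["#c", "##", "#\\t", "#\\x20", "#\\\\"]:
    print(text, repr(parse_char_literal(text)))
```

Because the literal has no closing delimiter, a real lexer would also need a rule for where it ends; this sketch sidesteps that by taking the whole literal as input.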
Jul 26 2003
"Walter" <walter digitalmars.com> writes:
"Luna Kid" <lunakid neuropolis.org> wrote in message
news:bfuqr0$irj$1 digitaldaemon.com...
 simple rules to allow various (unambiguous) representations
 of a char need to be defined.

 Umm, those rules are already defined as EscapeSequence. :)

 So, how about this?

 Strings:

     'verbatim, as before'

     "escaped,\x20as\x20before"

 Chars:

     #c    <-- letter 'c'
     ##    <-- is this problematic in D?
     #\t   <-- Tab
     #\x20 <-- Space
     #\\   <-- Backslash

It's a good idea, but # has a problem with making it impractical to pass the source through a C preprocessor first!
Jul 26 2003
"Luna Kid" <lunakid neuropolis.org> writes:
 Strings:

     'verbatim, as before'

     "escaped,\x20as\x20before"

 Chars:

     #c    <-- letter 'c'
     ##    <-- is this problematic in D?
     #\t   <-- Tab
     #\x20 <-- Space
     #\\   <-- Backslash

It's a good idea, but # has a problem with making it impractical to pass the source through a C preprocessor first!

I see. Well, for me, any other prefix would do, if not too obscure (should be easy to type). @ perhaps? (Familiar to the fingers from emails. ;) )

Sz.
Jul 26 2003
John Reimer <jjreimer telus.net> writes:
I like backquotes:

`string`

I don't think it's too confusing.
Jul 26 2003
"Matthew Wilson" <matthew stlsoft.org> writes:
Make that 3 then. :)

"John Reimer" <jjreimer telus.net> wrote in message
news:bfur97$ieh$1 digitaldaemon.com...
 I like backquotes:

 `string`

 I don't think it's too confusing.

Jul 26 2003
Ilya Minkov <midiclub 8ung.at> writes:
The difference to single quotes is hardly noticeable if you have a 160dpi
display and space is tight. :) Like, my main programming machine is a
high-performance handheld notebook with a *really* tiny display. :) You
can guess that i don't have sausage-like fingers. :)

The backquote is better left for something, which cannot be easily 
confused with strings.

Better a sane prefix (like Burton's r -- whatever it means it looks 
good) or """ """.

-i.

John Reimer wrote:
 I like backquotes:
 
 `string`
 
 I don't think it's too confusing.
 

Jul 26 2003
John Reimer <jjreimer telus.net> writes:
Ilya Minkov wrote:

 The difference to single quotes is hardly noticeable if you have a 160dpi
 display and space is tight. :) Like, my main programming machine is a
 high-performance handheld notebook with a *really* tiny display. :) You
 can guess that i don't have sausage-like fingers. :)
 
 The backquote is better left for something, which cannot be easily 
 confused with strings.
 
 Better a sane prefix (like Burton's r -- whatever it means it looks 
 good) or """ """.
 

What's sane? That's subjective ;-).

I like `string` because it's simple, clean, and easy. The other ways look ugly, I think. Not everybody uses a nasty, cramped laptop :-).

With all those syntax highlighting text editors out there, I don't think it's much of a problem setting special colours for the string to make it more obvious for those with small/dense displays, no?

Later,

John
Jul 26 2003
"Matthew Wilson" <matthew stlsoft.org> writes:
 With all those syntax highlighting text editors out there, I don't think
 it's much of a problem setting special colours for the string to make it
 more obvious for those with small/dense displays, no?

Excellent point
Jul 27 2003
"Walter" <walter digitalmars.com> writes:
"Matthew Wilson" <matthew stlsoft.org> wrote in message
news:bg02ei$1qsh$1 digitaldaemon.com...
 With all those syntax highlighting text editors out there, I don't think
 it's much of a problem setting special colours for the string to make it
 more obvious for those with small/dense displays, no?

Excellent point

But when people write books on D, they'll be using monochrome text. Most people still use monochrome printers to print source code.
Jul 27 2003
"Sean L. Palmer" <palmer.sean verizon.net> writes:
You can still use a different font, or make them bold, or italicized, or
something.

Sean

"Walter" <walter digitalmars.com> wrote in message
news:bg0uo4$2nkk$3 digitaldaemon.com...
 "Matthew Wilson" <matthew stlsoft.org> wrote in message
 news:bg02ei$1qsh$1 digitaldaemon.com...
 With all those syntax highlighting text editors out there, I don't think
 it's much of a problem setting special colours for the string to make it
 more obvious for those with small/dense displays, no?

Excellent point

But when people write books on D, they'll be using monochrome text. Most people still use monochrome printers to print source code.

Jul 27 2003
Ilya Minkov <midiclub 8ung.at> writes:
Sean L. Palmer wrote:

But when people write books on D, they'll be using monochrome text. Most
people still use monochrome printers to print source code.


Go figure out whether this tiny thing is bold, or maybe italics? Serif or Sans-Serif? It's next to impossible.

Besides, there are so few usable formatting styles that they are better used for something else. Unless you want yOUR cODE tO lOOK lIKE mADMAN,S pARTY iNVITATION fLYER! :)

-i.
Jul 27 2003
John Reimer <jjreimer telus.net> writes:
Walter wrote:
 "Matthew Wilson" <matthew stlsoft.org> wrote in message
 news:bg02ei$1qsh$1 digitaldaemon.com...
 
With all those syntax highlighting text editors out there, I don't think
it's much of a problem setting special colours for the string to make it
more obvious for those with small/dense displays, no?

Excellent point

But when people write books on D, they'll be using monochrome text. Most people still use monochrome printers to print source code.

Ok, counter to counter argument coming up... :-)

But when people write books, they seem to still use visual aids of different kinds to amplify what that code means -- like a switch in font or a bold face or something. They could always do something like that. People are creative when they need to be :-). People overcome problems and that one seems to be a minor one. We just have to know if the problems outweigh the benefits.

The greatest argument against `string` to me would perhaps be the "cultural" one or the fact that it's a non-standard/unusual character. I still vote for it, but I understand the drawbacks people see in it.

I still don't think the fact that it may be hard to render on monochrome cell phones is a strong argument against it. At that point, there would be other characters that would be hard to see anyway.

Later,

John
Jul 27 2003
Ilya Minkov <midiclub 8ung.at> writes:
John Reimer wrote:

 With all those syntax highlighting text editors out there, I don't think 
 it's much of a problem setting special colours for the string to make it 
 more obvious for those with small/dense displays, no?

Legible colors get used up very fast. They are simply too few. And when working in the sun, one can hardly distinguish even those. Besides, there must be someone with monochrome displays, no? Like, some people use an old 486-based Nokia organiser & cell phone, IIRC it's monochrome.

-i.
Jul 27 2003
"Walter" <walter digitalmars.com> writes:
"Ilya Minkov" <midiclub 8ung.at> wrote in message
news:bg0vsn$2opm$1 digitaldaemon.com...
 Legible colors use up very fast. They are simply too few. And when
 working in the sun, one can hardly distinguish even those. Besides,
 there must be someone with monochrome displays, no? Like, some people
 use an old 486-based Nokia organiser& cell phone, IIRC it's monochrome.

While I agree that working in monochrome is an important issue (heck, *I* work in monochrome, longtime habits die really, really hard!), I just can't see anyone writing programs on their cell phone!
Jul 27 2003
Ilya Minkov <midiclub 8ung.at> writes:
Walter wrote:
 While I agree that working in monochrome is an important issue (heck, *I*
 work in monochrome, longtime habits die really, really hard!), I just can't
 see anyone writing programs on their cell phone!

One friend of mine tried to debug my piece of C code using it, as he was in their "summer house" in northern Finland. It was the middle of winter, they had very little electricity, so this thing was the only choice. They let the generator run a couple of hours a day.

But my notebook comes quite close to it in size. :) That old Nokia phone is much larger than usual, and about the size of a typical old Windows CE handheld and half the length of my notebook.

-i.
Jul 27 2003
Burton Radons <loth users.sourceforge.net> writes:
Walter wrote:
 Currently, there are 3 kinds of string literals:
 'string' : wysiwyg strings
 "string" : escaped strings
 \ : single character strings
 
 There is no character literal syntax; 1 character long strings are
 implicitly converted to character literals based on context. Unfortunately,
 this leads to ambiguities with no reasonable way out (other than crafting
 arbitrary and confusing rules).
 
 So, I've been thinking of going back to the C way and having ' ' for
 character literals. That means that wysiwyg strings are left without a
 lexical syntax. Any ideas for something that would look nice? How about
 using back quotes ` `, or is that just too hard to distinguish in certain
 fonts? One thing to keep in mind is that wysiwyg strings are not going to be
 used with nearly the same frequency as escaped strings, so the syntax can be
 a bit less convenient for them.

I like r"string", which is used in a number of languages already.
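
A minimal Python sketch of how a prefix such as r could separate the two kinds of literal at the lexical level. The regular expressions, the small escape table and `literal_value` are assumptions made for this example and do not describe any actual D or Python lexer:

```python
import re

# Hypothetical rules: r"..." is taken verbatim, "..." has escapes decoded.
RAW = re.compile(r'r"([^"]*)"')
ESCAPED = re.compile(r'"((?:[^"\\]|\\.)*)"')
ESCAPES = {"n": "\n", "t": "\t", "\\": "\\", '"': '"'}

def literal_value(src):
    """Return the string value denoted by a single literal."""
    m = RAW.fullmatch(src)
    if m:
        return m.group(1)  # wysiwyg: backslashes stay exactly as typed
    m = ESCAPED.fullmatch(src)
    if m is None:
        raise ValueError("not a string literal")
    # Unknown escapes fall back to the escaped character itself.
    return re.sub(r"\\(.)", lambda e: ESCAPES.get(e.group(1), e.group(1)), m.group(1))

print(repr(literal_value('r"a\\tb"')))  # wysiwyg form
print(repr(literal_value('"a\\tb"')))   # escaped form
```

Since the prefix letter is immediately followed by the opening quote, the lexer can decide which rule applies after looking at only two characters.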
Jul 26 2003
"Walter" <walter digitalmars.com> writes:
"Burton Radons" <loth users.sourceforge.net> wrote in message
news:bfusdh$k5e$1 digitaldaemon.com...
 I like r"string", which is used in a number of languages already.

Which ones?
Jul 26 2003
Burton Radons <loth users.sourceforge.net> writes:
Walter wrote:
 "Burton Radons" <loth users.sourceforge.net> wrote in message
 news:bfusdh$k5e$1 digitaldaemon.com...
 
I like r"string", which is used in a number of languages already.

Which ones?

I can only find Python right now, but I've seen it used in some other languages, prototypes exclusively I think (Xana is one).

I don't think the problem really exists. There's no reason to have single-character variants of functions like string.find; if the benefits of such a search are significant, then the cost of matching a check and running a special loop is insignificant. I have never had the problem come up outside of calling string module functions.
Jul 26 2003
Russ Lewis <spamhole-2001-07-16 deming-os.org> writes:
(grin) Php uses this ?>   <? to "quote" literal HTML.  It would look a 
little odd in D, though.

I was about to send this as a joke, but it occurs to me, especially if 
we plan to sometimes embed D in HTML, that HTML tags aren't necessarily 
a bad idea.  What about:

<wysiwyg>string</wysiwyg>

????

Burton Radons wrote:
 Walter wrote:
 
 Currently, there are 3 kinds of string literals:
 'string' : wysiwyg strings
 "string" : escaped strings
 \ : single character strings

 There is no character literal syntax; 1 character long strings are
 implicitly converted to character literals based on context. 
 Unfortunately,
 this leads to ambiguities with no reasonable way out (other than crafting
 arbitrary and confusing rules).

 So, I've been thinking of going back to the C way and having ' ' for
 character literals. That means that wysiwyg strings are left without a
 lexical syntax. Any ideas for something that would look nice? How about
 using back quotes ` `, or is that just too hard to distinguish in certain
 fonts? One thing to keep in mind is that wysiwyg strings are not going 
 to be
 used with nearly the same frequency as escaped strings, so the syntax 
 can be
 a bit less convenient for them.

I like r"string", which is used in a number of languages already.

Jul 26 2003
"Vathix" <vathix dprogramming.com> writes:
 I like r"string", which is used in a number of languages already.

I've finally decided I vote for this one.
Jul 28 2003
"J. Daniel Smith" <J_Daniel_Smith HoTMaiL.com> writes:
C# is similar:  @"string".

   Dan

"Burton Radons" <loth users.sourceforge.net> wrote in message
news:bfusdh$k5e$1 digitaldaemon.com...
 Walter wrote:
 Currently, there are 3 kinds of string literals:
 'string' : wysiwyg strings
 "string" : escaped strings
 \ : single character strings

 There is no character literal syntax; 1 character long strings are
 implicitly converted to character literals based on context. Unfortunately,
 this leads to ambiguities with no reasonable way out (other than crafting
 arbitrary and confusing rules).

 So, I've been thinking of going back to the C way and having ' ' for
 character literals. That means that wysiwyg strings are left without a
 lexical syntax. Any ideas for something that would look nice? How about
 using back quotes ` `, or is that just too hard to distinguish in certain
 fonts? One thing to keep in mind is that wysiwyg strings are not going to be
 used with nearly the same frequency as escaped strings, so the syntax can be
 a bit less convenient for them.

I like r"string", which is used in a number of languages already.

Jul 28 2003
"Matthew Wilson" <matthew stlsoft.org> writes:
 1) prefixing the " with a letter or a character, as in:
     W"string"
     %"string"

Neither of these attract me. The % will be confusing with printf() format strings. The W ... just don't like it.
     !"string"

Definitely not. !"message string" is useful in asserts in C++. I've not given it any thought, but it seems that this could/would work similarly in D.
 2) using a character not used in C, such as:
     `string`

I like this, but probably in a minority of 1 (or 2)
     $string$
     @string@
     #string#

Hate all of these.

Here's an unpopular thought: why not use @"string" and be consistent with C#? Easier on the brain even if it doffs one's cap to M$
Jul 26 2003
"Walter" <walter digitalmars.com> writes:
"Matthew Wilson" <matthew stlsoft.org> wrote in message
news:bfv0or$oeh$2 digitaldaemon.com...
 Here's an unpopular thought: why not use @"string" and be consistent with
 C#? Easier on the brain even if it doffs one's cap to M$

It's aesthetically unpleasing.
Jul 26 2003
"Matthew Wilson" <matthew stlsoft.org> writes:
LOL. What isn't? :)

"Walter" <walter digitalmars.com> wrote in message
news:bfv2s8$ql4$1 digitaldaemon.com...
 "Matthew Wilson" <matthew stlsoft.org> wrote in message
 news:bfv0or$oeh$2 digitaldaemon.com...
 Here's an unpopular thought: why not use @"string" and be consistent with
 C#? Easier on the brain even if it doffs one's cap to M$

It's aesthetically unpleasing.

Jul 26 2003
"Walter" <walter digitalmars.com> writes:
That's what I'm trying to find!

"Matthew Wilson" <matthew stlsoft.org> wrote in message
news:bfv9m8$118q$1 digitaldaemon.com...
 LOL. What isn't? :)

 "Walter" <walter digitalmars.com> wrote in message
 news:bfv2s8$ql4$1 digitaldaemon.com...
 "Matthew Wilson" <matthew stlsoft.org> wrote in message
 news:bfv0or$oeh$2 digitaldaemon.com...
 Here's an unpopular thought: why not use @"string" and be consistent with
 C#? Easier on the brain even if it doffs one's cap to M$

It's aesthetically unpleasing.


Jul 26 2003
"Matthew Wilson" <matthew stlsoft.org> writes:
`string`

"Walter" <walter digitalmars.com> wrote in message
news:bfvihp$19sq$1 digitaldaemon.com...
 That's what I'm trying to find!

 "Matthew Wilson" <matthew stlsoft.org> wrote in message
 news:bfv9m8$118q$1 digitaldaemon.com...
 LOL. What isn't? :)

 "Walter" <walter digitalmars.com> wrote in message
 news:bfv2s8$ql4$1 digitaldaemon.com...
 "Matthew Wilson" <matthew stlsoft.org> wrote in message
 news:bfv0or$oeh$2 digitaldaemon.com...
 Here's an unpopular thought: why not use @"string" and be consistent with
 C#? Easier on the brain even if it doffs one's cap to M$

It's aesthetically unpleasing.



Jul 27 2003
"Walter" <walter digitalmars.com> writes:
"Matthew Wilson" <matthew stlsoft.org> wrote in message
news:bg02ej$1qsh$2 digitaldaemon.com...
 `string`

The three frontrunners are at the moment:

`string`
"""string"""
r"string"
Jul 27 2003
"Matthew Wilson" <matthew stlsoft.org> writes:
 `string`
 """string"""

Instinct tells me that this is a parsing nightmare, no?
 r"string"

Jul 27 2003
"Walter" <walter digitalmars.com> writes:
"Matthew Wilson" <matthew stlsoft.org> wrote in message
news:bg29h0$12sg$1 digitaldaemon.com...
 `string`
 """string"""

Instinct tells me that this is a parsing nightmare, no?

Not that bad, actually. The clunky ones are the arbitrary lookahead problems.
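
A rough Python sketch of why the triple-quote form stays tractable: finding the closing delimiter never requires inspecting more than the next three characters. `scan_triple_quoted` is a hypothetical helper, not part of any D lexer, and escape handling is left out:

```python
def scan_triple_quoted(src, start=0):
    """Scan a triple-quoted literal that begins at src[start].

    Returns (value, end), where end is the offset just past the closing quotes.
    """
    quote = '"""'
    if not src.startswith(quote, start):
        raise ValueError("no opening triple quote")
    pos = start + 3
    while pos < len(src):
        # Fixed lookahead: only the next three characters are ever inspected.
        if src.startswith(quote, pos):
            return src[start + 3:pos], pos + 3
        pos += 1
    raise ValueError("unterminated literal")

text = '"""say "hi" now""" + x'
value, end = scan_triple_quoted(text)
print(repr(value), repr(text[end:]))
```

Single and double quotes inside the literal are passed over without special treatment, so the only state the scanner keeps is its current position.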
Jul 28 2003
Mark T <Mark_member pathlink.com> writes:
The three frontrunners are at the moment:

`string`
"""string"""
r"string"

I vote:

THIS
Python uses r"string" for raw (wysiwyg) strings.
but how about raw"string" - isn't that much clearer?

OR
C# uses @"string"

since they are already in use in another language

"""string""" is absolutely horrible
Jul 29 2003
Helmut Leitner <helmut.leitner chello.at> writes:
Walter wrote:
 
 Currently, there are 3 kinds of string literals:
 'string' : wysiwyg strings
 "string" : escaped strings
 \ : single character strings
 
 There is no character literal syntax; 1 character long strings are
 implicitly converted to character literals based on context. Unfortunately,
 this leads to ambiguities with no reasonable way out (other than crafting
 arbitrary and confusing rules).

Just a few thoughts.

I like
   'string' : wysiwyg strings
   "string" : escaped strings
the way it is (like Perl).

Perl uses an alternative syntax for various types of quoting
("any nonwhitespace delimiter may be used in place of / "):
   q/.../    single quote
   qq/.../   double quote
   qr/.../   quote regex expression
   qx/.../   quote execution (instead of backtick)
   qw/.../   word lists
which seems quite handy and open for extensions.

On the other hand what would happen if you changed the semantics of
   \n
to single character literal? The problems or ambiguities would
be different and perhaps easier to solve.

One may also note that VB uses ' and " identically. This is very
handy to build strings containing the opposite type of
quoting.
E. g. SQL: "SELECT * FROM Table WHERE Name='Peter';"
       or: 'SELECT * FROM Table WHERE Name="Peter";'
This would also simplify building HTML
  typical Perl: "<td width=\"1%\" align=\"right\">"
         VBish: '<td width="1%" align="right">'
(of course no-one will ever use VB to generate HTML for other reasons)

--
Helmut Leitner    leitner hls.via.at
Graz, Austria   www.hls-software.com
Jul 26 2003
parent "Walter" <walter digitalmars.com> writes:
"Helmut Leitner" <helmut.leitner chello.at> wrote in message
news:3F236984.85F075D1 chello.at...
 Walter wrote:
 Currently, there are 3 kinds of string literals:
 'string' : wysiwyg strings
 "string" : escaped strings
 \ : single character strings

 There is no character literal syntax; 1 character long strings are
 implicitly converted to character literals based on context. Unfortunately,
 this leads to ambiguities with no reasonable way out (other than crafting
 arbitrary and confusing rules).

 Just a few thoughts.

 I like
   'string' : wysiwyg strings
   "string" : escaped strings
 the way it is (like Perl).

 Perl uses an alternative Syntax for various types of quoting
 ("any nonwhitespace delimiter may be used in place of / "):
   q/.../    single quote
   qq/.../   double quote
   qr/.../   quote regex expression
   qx/.../   quote execution (instead of backtick)
   qw/.../   word lists
 which seems quite handy and open for extensions

This is nice, but it won't work due to lexer confusion with:
    a = q/3;
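To make the confusion concrete, here is a rough sketch of a toy regex tokenizer in Python. It is not D's lexer; the token classes and the `allow_q_quotes` switch are invented for illustration. It only shows how a rule that treats `q/` as the start of a literal can swallow an ordinary division.

```python
import re

# Hypothetical token classes for a toy language (not D's real grammar).
PERL_QUOTE = r"q/[^/]*/"        # Perl-style q/.../ literal
IDENT = r"[A-Za-z_]\w*"
NUMBER = r"\d+"
OP = r"[=/;]"


def tokenize(src, allow_q_quotes):
    """Split src into tokens; whitespace between tokens is skipped."""
    parts = ([PERL_QUOTE] if allow_q_quotes else []) + [IDENT, NUMBER, OP]
    pattern = re.compile("|".join(f"({p})" for p in parts))
    return [m.group(0) for m in pattern.finditer(src)]


# Without the q/.../ rule, q is an identifier and / is division.
print(tokenize("a = q/3;", allow_q_quotes=False))

# With the rule, the lexer runs on to the next slash and eats half of
# the following statement as "string" contents.
print(tokenize("a = q/3; b = x/2;", allow_q_quotes=True))
```

A lexer could work around this by asking the parser whether an operand or an operator is expected at that point, but then the lexer is no longer independent of the parser.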
 On the other hand what would happen if you changed the semantics of
   \n
 to single character literal? The problems or ambiguities would
 be different and perhaps easier to solve.

It would work lexically, but I think it just wouldn't look right.
 One may also note that VB uses ' and " identically. This is very
 handy to build strings containing the opposite type of
 quoting.
 E. g. SQL: "SELECT * FROM Table WHERE Name='Peter';"
        or: 'SELECT * FROM Table WHERE Name="Peter";'
 This would also simplify building HTML
   typical Perl: "<td width=\"1%\" align=\"right\">"
          VBish: '<td width="1%" align="right">'
 (of course no-one will ever use VB to generate HTML for other reasons)

Using D to build HTML will happen with CGI apps. A single ' seems to be rare in HTML, so using 'string' for wysiwyg strings should work well.
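As a point of comparison only: Python happens to treat ' and " as interchangeable delimiters, which is the convenience described in the quoted post. A small sketch (both delimiters still process escapes in Python, so this is not the wysiwyg behaviour D gives 'string'):

```python
# Double-quoted form: every embedded double quote needs a backslash.
escaped = "<td width=\"1%\" align=\"right\">"

# Single-quoted form: the embedded double quotes are ordinary characters.
other_quote = '<td width="1%" align="right">'

# Both spellings produce the same text.
print(escaped == other_quote)

# The SQL case is the mirror image: single quotes inside double quotes.
sql = "SELECT * FROM Table WHERE Name='Peter';"
print(sql)
```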
Jul 27 2003
prev sibling next sibling parent reply "Vathix" <vathix dprogramming.com> writes:
`string` seems good. It doesn't really matter to me. I'm fine with the way
it is now, too.

"Walter" <walter digitalmars.com> wrote in message
news:bfu9kp$3f1$1 digitaldaemon.com...
 Currently, there are 3 kinds of string literals:
 'string' : wysiwyg strings
 "string" : escaped strings
 \ : single character strings

 There is no character literal syntax; 1 character long strings are
 implicitly converted to character literals based on context. Unfortunately,
 this leads to ambiguities with no reasonable way out (other than crafting
 arbitrary and confusing rules).

 So, I've been thinking of going back to the C way and having ' ' for
 character literals. That means that wysiwyg strings are left without a
 lexical syntax. Any ideas for something that would look nice? How about
 using back quotes ` `, or is that just too hard to distinguish in certain
 fonts? One thing to keep in mind is that wysiwyg strings are not going to be
 used with nearly the same frequency as escaped strings, so the syntax can be
 a bit less convenient for them.

 I'd like to use /string/, but that leads to too many lexical ambiguities.

 Some possibilities are:
 1) prefixing the " with a letter or a character, as in:
     W"string"
     %"string"
     !"string"
 2) using a character not used in C, such as:
     `string`
     $string$
     @string@
     #string#

Jul 27 2003
parent reply <sorry no.spam> writes:
 'string' : wysiwyg strings
 "string" : escaped strings

Should be as they are I think.

What about a BASIC-like syntax? (flames to /dev/null :-)

char( ) || char(' ')
char(0x20)
char(\t)

Can of course be made less verbose by shortening to
chr() or c().

Just an idea.

Roald
Jul 27 2003
parent "Walter" <walter digitalmars.com> writes:
<sorry no.spam> wrote in message news:bg0ktu$2cpc$1 digitaldaemon.com...
 What about a basic like syntax? (flames to /dev/null :-)

 char( ) || char(' ')
 char(0x20)
 char(\t)

 Can of course be made less verbose by shortening to
 chr() or c().

 Just an idea.

It'll work, but too much typing!
Jul 27 2003
prev sibling next sibling parent reply Burton Radons <loth users.sourceforge.net> writes:
Walter wrote:

 Currently, there are 3 kinds of string literals:
 'string' : wysiwyg strings
 "string" : escaped strings
 \ : single character strings
 
 There is no character literal syntax; 1 character long strings are
 implicitly converted to character literals based on context. Unfortunately,
 this leads to ambiguities with no reasonable way out (other than crafting
 arbitrary and confusing rules).

All implicit casting rules are arbitrary.  Incorrect even; the value of:

    ubyte a = 64;
    ubyte b = 16;
    ubyte c = a * b / b;

Should be zero by a strict reading of intention, not sixty-four.  I'm
jiggy with it because its utility trumps its arbitrary and confusing
rules in my book.  Likewise with preferring a function which takes a
single character argument over a function which takes a string.
 So, I've been thinking of going back to the C way and having ' ' for
 character literals. That means that wysiwyg strings are left without a
 lexical syntax. Any ideas for something that would look nice? How about
 using back quotes ` `, or is that just too hard to distinguish in certain
 fonts? One thing to keep in mind is that wysiwyg strings are not going to be
 used with nearly the same frequency as escaped strings, so the syntax can be
 a bit less convenient for them.

Just a note.  The "`" character is not a back quote, it's a grave accent.
Like the tilde and circumflex accent (the caret), it was apparently
originally intended as a character modifier, although I can't find any
history on computer keyboard layout.  That's why it looks nothing like a
quote character, and more like the window system has made a rendering error.

It's likely to be on every Romantic language's keyboard; you can look at
their layouts at
(http://www.microsoft.com/globaldev/reference/keyboards.aspx) using
Internet Explorer only.  But I wouldn't use the symbol; I don't think
the problem, even if I agreed that it exists, merits the use of a new
symbol.  There hasn't been a new symbol in C in thirty years.  C++
didn't add any new symbols.  This conservatism can get out-of-hand, but
unless if there is absolutely no way to do otherwise I don't think there
should be new symbols.
 I'd like to use /string/, but that leads to too many lexical ambiguities.
 
 Some possibilities are:
 1) prefixing the " with a letter or a character, as in:
     W"string"
     %"string"
     !"string"
 2) using a character not used in C, such as:
     `string`
     $string$
     @string@
     #string#

Dollar, "at", and octothorpe are all language- or culture-specific. I see pretty good keyboard layout coverage though.
Jul 27 2003
next sibling parent reply "Walter" <walter digitalmars.com> writes:
"Burton Radons" <loth users.sourceforge.net> wrote in message
news:bg0l6c$2d3h$1 digitaldaemon.com...
 All implicit casting rules are arbitrary.  Incorrect even; the value of:

     ubyte a = 64;
     ubyte b = 16;
     ubyte c = a * b / b;

 Should be zero by a strict reading of intention, not sixty-four.  I'm
 jiggy with it because its utility trumps its arbitrary and confusing
 rules in my book.

I know. However, those are the C rules for expression evaluation. They are very, very ingrained in C programmers, and subtly changing them would be a disaster for D.
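For anyone who wants to see the two readings side by side, here is a small sketch in Python. Python integers never overflow, so the 8-bit wraparound is simulated with a mask; `as_ubyte` is a helper invented for this example.

```python
def as_ubyte(value):
    """Keep only the low 8 bits, as storing into a ubyte would."""
    return value & 0xFF


a, b = 64, 16

# C-style evaluation: operands are promoted to int, so a * b is 1024
# and the division happens before anything is narrowed.
promoted = as_ubyte((a * b) // b)

# "Strict" 8-bit evaluation: a * b is truncated to a ubyte first,
# 1024 wraps to 0, and 0 / 16 is 0.
strict = as_ubyte(as_ubyte(a * b) // b)

print(promoted, strict)  # 64 0
```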
 Likewise with preferring a function which takes a
 single character argument over a function which takes a string.

I can see many uses for such overloading. Also, there's:

    char[23] a;
    a[] = "b";

If "b" is a string, then a[] is set to "b\0\0\0\0\0....". If "b" is a single
character, then a[] is set to "bbbbbb...";
 It's likely to be on every Romantic language's keyboard; you can look at
 their layouts at
 (http://www.microsoft.com/globaldev/reference/keyboards.aspx) using
 Internet Explorer only.  But I wouldn't use the symbol; I don't think
 the problem, even if I agreed that it exists, merits the use of a new
 symbol.  There hasn't been a new symbol in C in thirty years.  C++
 didn't add any new symbols.  This conservatism can get out-of-hand, but
 unless if there is absolutely no way to do otherwise I don't think there
 should be new symbols.

I feel the same way.
Jul 27 2003
parent Burton Radons <loth users.sourceforge.net> writes:
Walter wrote:

Likewise with preferring a function which takes a
single character argument over a function which takes a string.

I can see many uses for such overloading. Also, there's:

    char[23] a;
    a[] = "b";

If "b" is a string, then a[] is set to "b\0\0\0\0\0....". If "b" is a single
character, then a[] is set to "bbbbbb...";

No, the first's a length mismatch; "Error: lengths don't match for array copy". It's a good error, results in a little overspecification at times, but keeps the nonsense code down.
Jul 30 2003
prev sibling next sibling parent Ilya Minkov <midiclub 8ung.at> writes:
Burton Radons wrote:

 Dollar, "at", and octothorpe are all language- or culture-specific.  I 
 see pretty good keyboard layout coverage though.

Being language- or culture-specific is OK. In Russia, Israel, and other
countries with non-roman characters, dual layout keyboards are used: you
switch between the native language and U.S.-English layout. It is accepted
that you need nearly all of its symbols for programming, everyday
configuration and suchlike.

http://www.cyrillicstore.com/kbd/ru-btc.gif
http://www.translation.net/keyshots.html

-i.
Jul 27 2003
prev sibling parent "Matthew Wilson" <matthew stlsoft.org> writes:
 It's likely to be on every Romantic language's keyboard; you can look at
 their layouts at
 (http://www.microsoft.com/globaldev/reference/keyboards.aspx) using
 Internet Explorer only.  But I wouldn't use the symbol; I don't think
 the problem, even if I agreed that it exists, merits the use of a new
 symbol.  There hasn't been a new symbol in C in thirty years.  C++
 didn't add any new symbols.  This conservatism can get out-of-hand, but
 unless if there is absolutely no way to do otherwise I don't think there
 should be new symbols.

That is a good point.
Jul 27 2003
prev sibling next sibling parent reply "Daniel Yokomiso" <daniel_yokomiso yahoo.com.br> writes:
"Walter" <walter digitalmars.com> escreveu na mensagem
news:bfu9kp$3f1$1 digitaldaemon.com...
 Currently, there are 3 kinds of string literals:
 'string' : wysiwyg strings
 "string" : escaped strings
 \ : single character strings

 There is no character literal syntax; 1 character long strings are
 implicitly converted to character literals based on context. Unfortunately,
 this leads to ambiguities with no reasonable way out (other than crafting
 arbitrary and confusing rules).

 So, I've been thinking of going back to the C way and having ' ' for
 character literals. That means that wysiwyg strings are left without a
 lexical syntax. Any ideas for something that would look nice? How about
 using back quotes ` `, or is that just too hard to distinguish in certain
 fonts? One thing to keep in mind is that wysiwyg strings are not going to be
 used with nearly the same frequency as escaped strings, so the syntax can be
 a bit less convenient for them.

 I'd like to use /string/, but that leads to too many lexical ambiguities.

 Some possibilities are:
 1) prefixing the " with a letter or a character, as in:
     W"string"
     %"string"
     !"string"
 2) using a character not used in C, such as:
     `string`
     $string$
     @string@
     #string#

Hi,

    Couldn't we use a symbol for character literals, like other languages,
like #a or $a for the letter "a"? It would probably be simpler, and we could
keep '' and "" for strings. Also \ would be character literals, instead of
single character strings.

    Best regards,
    Daniel Yokomiso.

"Whenever I climb I am followed by a dog called 'Ego'."
- Friedrich Nietzsche (1844-1900)
Jul 27 2003
parent reply "Carlos Santander B." <carlos8294 msn.com> writes:
"Daniel Yokomiso" <daniel_yokomiso yahoo.com.br> wrote in message
news:bg0q5t$2ieb$1 digitaldaemon.com...
|
| Hi,
|
|     Couldn't we use a symbol for character literals, like other languages,
| like #a or $a for the letter "a"? It would probably be simpler, and we
could
| keep '' and "" for strings. Also \ would be character literals, instead of
| single character strings.
|
|     Best regards,
|     Daniel Yokomiso.
|

I was going to vote for $"wysiwyg string", but now this seems like a better
idea, IMHO.

-------------------------
Carlos Santander


Jul 27 2003
next sibling parent reply "Sean L. Palmer" <palmer.sean verizon.net> writes:
I don't see the point of wysiwyg string.  You will need escapes for the
quotes anyway, and in order to have escapes you have to have an escape for
the escape character.  Next thing you know you're back at the "normal" C
string.

Why do we need two kinds of string again?  So people can embed control
characters in the string or something?

Sean

"Carlos Santander B." <carlos8294 msn.com> wrote in message
news:bg2469$ta2$1 digitaldaemon.com...
 I was going to vote for $"wysiwyg string", but now this seems like a better
 idea, IMHO.

Jul 28 2003
parent reply "Charles Sanders" <sanders-consulting comcast.net> writes:
I don't see the point either, I vote not to include it.

Charles

"Sean L. Palmer" <palmer.sean verizon.net> wrote in message
news:bg2kii$1fs0$1 digitaldaemon.com...
 I don't see the point of wysiwyg string.  You will need escapes for the
 quotes anyway, and in order to have escapes you have to have an escape for
 the escape character.  Next thing you know you're back at the "normal" C
 string.

 Why do we need two kinds of string again?  So people can embed control
 characters in the string or something?

 Sean

 "Carlos Santander B." <carlos8294 msn.com> wrote in message
 news:bg2469$ta2$1 digitaldaemon.com...
 I was going to vote for $"wysiwyg string", but now this seems like a better
 idea, IMHO.


Jul 28 2003
next sibling parent reply Andy Friesen <andy ikagames.com> writes:
Charles Sanders wrote:
 I don't see the point either, I vote not to include it.
 
 Charles
 
 "Sean L. Palmer" <palmer.sean verizon.net> wrote in message
 news:bg2kii$1fs0$1 digitaldaemon.com...
 
I don't see the point of wysiwyg string.  You will need escapes for the
quotes anyway, and in order to have escapes you have to have an escape for
the escape character.  Next thing you know you're back at the "normal" C
string.

Why do we need two kinds of string again?  So people can embed control
characters in the string or something?

Sean

"Carlos Santander B." <carlos8294 msn.com> wrote in message
news:bg2469$ta2$1 digitaldaemon.com...

I was going to vote for $"wysiwyg string", but now this seems like a better
idea, IMHO.



Regexps. :)

If you want to match "c:\con\*.exe" you'd have to write the regexp as
"c\\:\\\\con\\\\.*\\.\\*" if raw strings were not available.
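The doubling is easy to see in a language that already has both forms. A minimal sketch using Python's `re` module, with a pattern chosen for this example (any `.exe` file directly under `c:\con\`) rather than the exact pattern quoted above:

```python
import re

# Escaped string: each regex backslash must itself be escaped, so a
# literal path separator costs four characters in the source.
escaped = "c:\\\\con\\\\.*\\.exe"

# Raw string: the regex is written exactly as the regex engine sees it.
raw = r"c:\\con\\.*\.exe"

# Both spellings describe the same pattern text.
print(escaped == raw)

# The pattern matches a backslash-separated path, not a slash-separated one.
print(bool(re.fullmatch(raw, r"c:\con\setup.exe")))
print(bool(re.fullmatch(raw, "c:/con/setup.exe")))
```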
Jul 28 2003
next sibling parent "Sean L. Palmer" <palmer.sean verizon.net> writes:
It strikes me that perhaps backslash is a poor choice for regexp escape
character, since it conflicts with both C escapes and path separators.

Sean

"Andy Friesen" <andy ikagames.com> wrote in message
news:bg3gqj$2c6t$1 digitaldaemon.com...
 Regexps. :)

 If you want to match "c:\con\*.exe" you'd have to write the regexp as
 "c\\:\\\\con\\\\.*\\.\\*" if raw strings were not available.

Jul 28 2003
prev sibling parent Mark Evans <Mark_member pathlink.com> writes:
Regexps. :)

If you want to match "c:\con\*.exe" you'd have to write the regexp as 
"c\\:\\\\con\\\\.*\\.\\*" if raw strings were not available.

So true, and reason to avoid regex-meaningful chars when inventing a string
syntax.  It might be smart to follow the C# convention just on user numbers.

Mark
Jul 28 2003
prev sibling next sibling parent Ilya Minkov <midiclub 8ung.at> writes:
Charles Sanders wrote:
 I don't see the point either, I vote not to include it.

However, a way to conveniently include tabs and linebreaks without
cluttering text with escape characters would be good.

Actually, D has no parsing problems with all of this, so why not make ""
and '' be semi-escaped strings, with a difference that with one the single
and with the other the double quotes don't have to be escaped. Besides,
since D faces no severe parsing problems (unlike Python), these *both* can
be made to include tabs and linebreaks as plain text without replacing them
by escapes. And yet we need some character designator - why not use a "c"
prefix for any kind of string which would cast it into a character?

Yet another possibility would be """string""" and '''string''' for strings
completely without escape characters - being basically both the same. This
clunky syntax could only get better if you allow to ignore (or force) the
first linebreak directly after the opening """, as well as the last one just
before the closing ones? These things are intended for writing blocks of
text anyway.

-i.
Jul 28 2003
prev sibling parent Bill Cox <bill viasic.com> writes:
Charles Sanders wrote:
 I don't see the point either, I vote not to include it.
 
 Charles

Ditto.
Jul 28 2003
prev sibling next sibling parent Frank Wills <name host.com> writes:
I definitely go for something like this, using
single quotes '' and double quotes, with something
else, for single characters. I would hate to see
the syntax get cluttered up just to differentiate
single characters from strings. The syntax of ''
and "" is simple, clean, and attractive.

Carlos Santander B. wrote:
 "Daniel Yokomiso" <daniel_yokomiso yahoo.com.br> wrote in message
 news:bg0q5t$2ieb$1 digitaldaemon.com...
 |
 | Hi,
 |
 |     Couldn't we use a symbol for character literals, like other languages,
 | like #a or $a for the letter "a"? It would probably be simpler, and we
 could
 | keep '' and "" for strings. Also \ would be character literals, instead of
 | single character strings.
 |
 |     Best regards,
 |     Daniel Yokomiso.
 |
 
 I was going to vote for $"wysiwyg string", but now this seems like a better
 idea, IMHO.
 
 -------------------------
 Carlos Santander
 
 
 
 

Jul 28 2003
prev sibling parent BenjiSmith <BenjiSmith_member pathlink.com> writes:
I agree. I use WYSIWYG strings and escaped strings all the time. I very rarely
use character literals. So, I'd prefer a syntax that uses traditional '' and "",
respectively for WYSIWYG and escaped strings, giving an oddball syntax to the
character literals.

And the nice thing about character literals is that, by definition, they are
only one character long, so there is no need for a closing delimiter, only a
starting delimiter.

My favorite options for character literals are:

\c
\\n     (for an escaped single-character, in this case a newline)
@c
#c
{c}
<c>

--Benji Smith


"Daniel Yokomiso" <daniel_yokomiso yahoo.com.br> wrote in message
news:bg0q5t$2ieb$1 digitaldaemon.com...

 Hi,

     Couldn't we use a symbol for character literals, like other languages,
 like #a or $a for the letter "a"? It would probably be simpler, and we
could
 keep '' and "" for strings. Also \ would be character literals, instead of
 single character strings.

     Best regards,
     Daniel Yokomiso.

Jul 28 2003
prev sibling next sibling parent reply Karl Bochert <kbochert copper.net> writes:
 
 Some possibilities are:
 1) prefixing the " with a letter or a character, as in:
     W"string"
     %"string"
     !"string"
 2) using a character not used in C, such as:
     `string`
     $string$
     @string@
     #string#

I vote for method 1)
It has the advantage of changing from one to the other more easily --
as in "oops - that should be the other kind"

*"string" might work.

Karl Bochert
Jul 28 2003
next sibling parent w <w_member pathlink.com> writes:
I vote for ( a bit verbose ) 
W"string"
C"string"
E"string"

In article <1103_1059382594 bose>, Karl Bochert says..
Jul 28 2003
prev sibling parent w <w_member pathlink.com> writes:
I vote for ( a bit verbose ) 
Ws"string"
Cs"string"
Es"string"
Sorry about the previous post

In article <1103_1059382594 bose>, Karl Bochert says..
Jul 28 2003
prev sibling next sibling parent reply Mark Evans <Mark_member pathlink.com> writes:
Walter,

Do things the C way as your intuition suggests.  Revert 'x' to char literal and
find something else for WYSIWYGs.  In a language so heavily based on C it makes
no sense to confuse end users about the meaning of 'x'.

I dislike # or $ or @ because # is a comment in some files, $ reminds me of
Perl, and @ makes me think about the Internet.  These are ugly solutions.

There is always «French» or Python triple-quotes for WYSIWYGs.  I do not think
they will be as rare as you imagine.

Mark
Jul 28 2003
next sibling parent Ilya Minkov <midiclub 8ung.at> writes:
Mark Evans wrote:
 There is always «French» or Python triple-quotes for WYSIWYGs.  I do not think
 they will be as rare as you imagine.

How do i enter these "french" ones with my keyboard? "<< >>"? Ugly.

-i.
Jul 28 2003
prev sibling parent reply "Walter" <walter digitalmars.com> writes:
"Mark Evans" <Mark_member pathlink.com> wrote in message
news:bg42f2$2ucm$1 digitaldaemon.com...
 Walter,

 Do things the C way as your intuition suggests.  Revert 'x' to char literal and
 find something else for WYSIWYGs.  In a language so heavily based on C it makes
 no sense to confuse end users about the meaning of 'x'.

Yes. Excellent point. I will do away with, however, the C notion of 'abcd' to stuff 4 characters into a long. Not only is it poorly supported in C, it is never clear what order the chars appear in the long.
 I dislike # or $ or @ because # is a comment in some files, $ reminds me of
 Perl, and @ makes me think about the Internet.  These are ugly solutions.

Yes.
 There is always «French» or Python triple-quotes for WYSIWYGs.  I do not think
 they will be as rare as you imagine.

I find the """ rather awful looking. For example,

    struct A { char[] a, b, c; }
    A a = { """a""","""b""", """c""" }

It looks like some sort of funky micro-barcode that would make sense if you
looked at it with a magnifying glass <g>.

It's beginning to look like to me the only solutions are bad and worse.
Jul 28 2003
next sibling parent reply "Sean L. Palmer" <palmer.sean verizon.net> writes:
"Walter" <walter digitalmars.com> wrote in message
news:bg4b89$57s$1 digitaldaemon.com...
 Yes. Excellent point. I will do away with, however, the C notion of 'abcd'
 to stuff 4 characters into a long. Not only is it poorly supported in C, it
 is never clear what order the chars appear in the long.

I actually use this sometimes to make FourCC codes or signature markers for
file headers.  It can be convenient.

Which is best?

cast(uint) 'FHdr'

or

('F' | ('H' << 8) | ('d' << 16) | ('r' << 24))

or

0x72644846

?

Yes, it'd be great if there were some way to know for sure the endianness of
the ordering.

That said, it's a minor feature, and has its problems, and I wouldn't cry
much if it were lost.

Sean
Jul 28 2003
parent "Walter" <walter digitalmars.com> writes:
"Sean L. Palmer" <palmer.sean verizon.net> wrote in message
news:bg5351$sp1$1 digitaldaemon.com...
 "Walter" <walter digitalmars.com> wrote in message
 news:bg4b89$57s$1 digitaldaemon.com...
 Yes. Excellent point. I will do away with, however, the C notion of 'abcd'
 to stuff 4 characters into a long. Not only is it poorly supported in C, it
 is never clear what order the chars appear in the long.

 I actually use this sometimes to make FourCC codes or signature markers for
 file headers.  It can be convenient.

 Which is best?

 cast(uint) 'FHdr'

 or

 ('F' | ('H' << 8) | ('d' << 16) | ('r' << 24))

 or

 0x72644846

 ?

 Yes, it'd be great if there were some way to know for sure the endianness of
 the ordering.

 That said, it's a minor feature, and has its problems, and I wouldn't cry
 much if it were lost.

I tend to use such things only once every 30,000 lines of code or so <g> and the shift example you give above will serve nicely and reliably.
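A quick sketch of the shift form, written in Python so it can be checked by hand. The `fourcc` helper is a name made up for this example; it puts the first character in the low byte, which is the ordering the shift expression above spells out.

```python
import struct


def fourcc(tag):
    """Pack a 4-character ASCII tag, first character in the low byte."""
    if len(tag) != 4:
        raise ValueError("a FourCC needs exactly four characters")
    a, b, c, d = (ord(ch) for ch in tag)
    return a | (b << 8) | (c << 16) | (d << 24)


value = fourcc("FHdr")
print(hex(value))  # 0x72644846

# The byte order in a file depends on how the integer is written out,
# which is exactly the ambiguity of the 'abcd' notation.
print(struct.pack("<I", value))  # b'FHdr' when stored little-endian
print(struct.pack(">I", value))  # b'rdHF' when stored big-endian
```

Spelling out the shifts fixes the value regardless of the machine; only the explicit choice of byte order at write time decides what lands in the file.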
Jul 28 2003
prev sibling next sibling parent Dario <Dario_member pathlink.com> writes:
What about dropping the escaped strings?
You can always write escape sequences out of the string, can't you?
E.G. "Hello" \n "World" \n instead of "Hello\nWorld\n".
Yes, it's more typing but IMO it looks prettier and cleaner,
especially considering that editors show strings with a different color,
so escape sequences would be more visible.
-Dario

Walter:
It's beginning to look like to me the only solutions are bad and worse.

Jul 29 2003
prev sibling parent reply Mark Evans <Mark_member pathlink.com> writes:
OK, Walter, here is my final answer.  Do not use @ even though C# uses it.  That
was a bad choice on Microsoft's part.  They really should have known better.
Too many preprocessors and languages use @ for special purposes.  Think SWIG,
JavaDoc, etc.

Use r"string"r or raw"string"raw.  The advantage here is that you can later
define new types of strings with a new letter (Unicode? a string of bits,
b"101010101111111000000" or hexadecimal bit groups, x"ABCD12340000FFFF11111"?).
So in a sense it's extensible and future-proof.  This syntax is also reminiscent
of C's numeric prefix and suffix notations, 0xABCD, 0b1010, 1.234L, etc.

The numeric string concept is convenient for static pre-assignment of memory.
The alternatives in C are not pretty:  arrays of smaller things (comma, comma,
comma, comma, another comma,...) or an unreadable string ("#$~H*G_# jdkBG$*&").
So this notation is extra candy on top for embedded programming work.

The redundant closing letter is optional but recommended.  It solves the
meta-escape problem very cleanly.  (Actually it's dumbed-down XML.)  The b and x
variants would not require closing letters, as their contents are intrinsically
limited.  Whitespace should be allowed in them of course, x"ABCD 1234 FF00".
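To illustrate what the b and x variants would have to do, here is a hedged sketch in Python. The two function names are invented for this example and say nothing about how D would implement such literals; they only show that the contents are self-limiting and that whitespace can be tolerated.

```python
def parse_hex_literal(body):
    """Turn the body of a hypothetical x"..." literal into raw bytes.

    bytes.fromhex skips whitespace between byte pairs, so
    "ABCD 1234 FF00" and "ABCD1234FF00" give the same result.
    """
    return bytes.fromhex(body)


def parse_bit_literal(body):
    """Turn the body of a hypothetical b"..." literal into an integer."""
    digits = "".join(body.split())  # allow whitespace between bit groups
    if not digits or set(digits) - {"0", "1"}:
        raise ValueError("a bit literal may contain only 0 and 1")
    return int(digits, 2)


print(parse_hex_literal("ABCD 1234 FF00").hex())  # abcd1234ff00
print(parse_bit_literal("1010"))                  # 10
```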

Mark
Jul 29 2003
parent reply "j anderson" <anderson badmama.com.au.REMOVE> writes:
"Mark Evans" <Mark_member pathlink.com> wrote in message
news:bg6rkb$2k4p$1 digitaldaemon.com...
 OK, Walter, here is my final answer.  Do not use @ even though C# uses it.  That
 was a bad choice on Microsoft's part.  They really should have known better.
 Too many preprocessors and languages use @ for special purposes.  Think SWIG,
 JavaDoc, etc.

 Use r"string"r or raw"string"raw.  The advantage here is that you can later
 define new types of strings with a new letter (Unicode? a string of bits,
 b"101010101111111000000" or hexadecimal bit groups, x"ABCD12340000FFFF11111"?).
 So in a sense it's extensible and future-proof.  This syntax is also reminiscent
 of C's numeric prefix and suffix notations, 0xABCD, 0b1010, 1.234L, etc.

 The numeric string concept is convenient for static pre-assignment of memory.
 The alternatives in C are not pretty:  arrays of smaller things (comma, comma,
 comma, comma, another comma,...) or an unreadable string ("#$~H*G_# jdkBG$*&").
 So this notation is extra candy on top for embedded programming work.

 The redundant closing letter is optional but recommended.  It solves the
 meta-escape problem very cleanly.  (Actually it's dumbed-down XML.)  The b and x
 variants would not require closing letters, as their contents are intrinsically
 limited.  Whitespace should be allowed in them of course, x"ABCD 1234 FF00".

 Mark

I came to a similar conclusion. The other suggested syntaxes (I haven't read
all the replies) would make it much harder for the transition from C to D.
Explaining that a character is put in front of the quote is easier than
explaining that you need to use a particular symbol instead of a quote.

There may be some use for this syntax on arrays as well.

u{10,12,16}; //enforce unsigned.
r{10,12,16}; //read only.
o{10,12,16}; //ordered.
fi{c:\data.txt}; //file to import integer array from.

I can't think of anything *good* right now, but it's an option for later
down the track.
Aug 07 2003
parent "j anderson" <anderson badmama.com.au.REMOVE> writes:
"j anderson" <anderson badmama.com.au.REMOVE> wrote in message
news:bgte8b$2cs$1 digitaldaemon.com...
 "Mark Evans" <Mark_member pathlink.com> wrote in message
 news:bg6rkb$2k4p$1 digitaldaemon.com...
 OK, Walter, here is my final answer.  Do not use @ even though C# uses it.  That
 was a bad choice on Microsoft's part.  They really should have known better.
 Too many preprocessors and languages use @ for special purposes.  Think SWIG,
 JavaDoc, etc.

 Use r"string"r or raw"string"raw.  The advantage here is that you can later
 define new types of strings with a new letter (Unicode? a string of bits,
 b"101010101111111000000" or hexadecimal bit groups, x"ABCD12340000FFFF11111"?).
 So in a sense it's extensible and future-proof.  This syntax is also reminiscent
 of C's numeric prefix and suffix notations, 0xABCD, 0b1010, 1.234L, etc.

 The numeric string concept is convenient for static pre-assignment of memory.
 The alternatives in C are not pretty:  arrays of smaller things (comma, comma,
 comma, comma, another comma,...) or an unreadable string ("#$~H*G_# jdkBG$*&").
 So this notation is extra candy on top for embedded programming work.

 The redundant closing letter is optional but recommended.  It solves the
 meta-escape problem very cleanly.  (Actually it's dumbed-down XML.)  The b and x
 variants would not require closing letters, as their contents are intrinsically
 limited.  Whitespace should be allowed in them of course, x"ABCD 1234 FF00".

 Mark

 I came to a similar conclusion. The other suggested syntaxes (I haven't read
 all the replies) would make it much harder for the transition from C to D.
 Explaining that a character is put in front of the quote is easier than
 explaining that you need to use a particular symbol instead of a quote.

 There may be some use for this syntax on arrays as well.

 u{10,12,16}; //enforce unsigned.
 r{10,12,16}; //read only.
 o{10,12,16}; //ordered.
 fi{c:\data.txt}; //file to import integer array from.

 I can't think of anything *good* right now, but it's an option for later
 down the track.

Another idea would be {" "}

The {" "} would imply it's an array of chars.  Of course " would still need
to be overloaded, but that could be done with a double quote.

i.e.
char a[] = {"This is an in-text quote ""."};

or perhaps even better

char a[] = {"This is an in-text quote ", 34, "."};
char a[] = {"This is an in-text quote ", '"', "."};

All would print
This is an in-text quote ".

Although I still don't mind the append char "r" to front of string
technique.

- J Anderson
Aug 07 2003
prev sibling next sibling parent reply "Yeric" <REMOVEamigabloke yahoo.co.ukREMOVE> writes:
Just to put my two penneth worth in why not

^string^

that is the hat symbol above key 6 on a standard UK keyboard.


"Walter" <walter digitalmars.com> wrote in message
news:bfu9kp$3f1$1 digitaldaemon.com...
 Currently, there are 3 kinds of string literals:
 'string' : wysiwyg strings
 "string" : escaped strings
 \ : single character strings

 There is no character literal syntax; 1 character long strings are
 implicitly converted to character literals based on context. Unfortunately,
 this leads to ambiguities with no reasonable way out (other than crafting
 arbitrary and confusing rules).

 So, I've been thinking of going back to the C way and having ' ' for
 character literals. That means that wysiwyg strings are left without a
 lexical syntax. Any ideas for something that would look nice? How about
 using back quotes ` `, or is that just too hard to distinguish in certain
 fonts? One thing to keep in mind is that wysiwyg strings are not going to be
 used with nearly the same frequency as escaped strings, so the syntax can be
 a bit less convenient for them.

 I'd like to use /string/, but that leads to too many lexical ambiguities.

 Some possibilities are:
 1) prefixing the " with a letter or a character, as in:
     W"string"
     %"string"
     !"string"
 2) using a character not used in C, such as:
     `string`
     $string$
     @string@
     #string#

Jul 28 2003
parent "Walter" <walter digitalmars.com> writes:
"Yeric" <REMOVEamigabloke yahoo.co.ukREMOVE> wrote in message
news:bg4afp$4io$1 digitaldaemon.com...
 Just to put my two penneth worth in why not

 ^string^

Since ^ is also a binary operator, the lexer won't be independent from the parser.
Jul 28 2003
prev sibling parent reply "Walter" <walter digitalmars.com> writes:
This has been a most entertaining discussion, with a lot of great ideas. But
I have to pick one. I personally liked the `string`, but there are too many
problems with it, such as unreadable fonts, using a non-C character, etc.
The """string""" was just too ugly for me (sorry).

That leaves r"string" for wysiwyg string literals. The justifications are
(thanks to Mark, etc., who pointed them all out):

1) It sticks to the C character set.
2) No problems with different fonts.
3) Establishes a precedent for new types of special strings.
4) Easy to tokenize.
5) There's precedent experience with it in other languages, such as Python.
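
For readers who have not met the Python precedent mentioned in point 5, here is a small sketch of how Python's own r"..." literals behave. It shows Python semantics only; whether D's r"string" matches it in every detail is not something this thread settles.

```python
# Python raw-string literals, the precedent cited in point 5.
escaped = "c:\\data.txt"   # escaped string: \\ stands for one backslash
raw = r"c:\data.txt"       # raw string: backslash is an ordinary character

assert raw == escaped
assert len(raw) == 11

# In a raw string, \n stays as two characters instead of becoming a newline.
assert r"a\nb" == "a\\nb"
assert len(r"a\nb") == 4
```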
Jul 29 2003
next sibling parent reply "Matthew Wilson" <matthew stlsoft.org> writes:
Only """string""" would have had me talking to God on the big white
telephone (I'm learning Australian, and that means "puking"), so am
perfectly content. r"string" is a good one.  @"string" was also good, but on
balance I think it's nicer to ape Python than C#. ;)

"Walter" <walter digitalmars.com> wrote in message
news:bg70ml$2oo8$1 digitaldaemon.com...
 This has been a most entertaining discussion, with a lot of great ideas. But
 I have to pick one. I personally liked the `string`, but there are too many
 problems with it, such as unreadable fonts, using a non-C character, etc.
 The """string""" was just too ugly for me (sorry).

 That leaves r"string" for wysiwyg string literals. The justifications are
 (thanks to Mark, etc., who pointed them all out):

 1) It sticks to the C character set.
 2) No problems with different fonts.
 3) Establishes a precedent for new types of special strings.
 4) Easy to tokenize.
 5) There's precedent experience with it in other languages, such as Python.

Jul 29 2003
next sibling parent John Reimer <jjreimer telus.net> writes:
Matthew Wilson wrote:

 Only """string""" would have had me talking to God on the big white
 telephone (I'm learning Australian, and that means "puking"), so am
 perfectly content. r"string" is a good one.  @"string" was also good, but on
 balance I think it's nicer to ape Python than C#. ;)

I agree.
Jul 29 2003
prev sibling parent reply Mark T <Mark_member pathlink.com> writes:
...would have had me talking to God on the big white
telephone (I'm learning Australian, and that means "puking"), 

I think that is "calling God on the big white phone". I'm American and we used that expression back in the 1970s.
Jul 30 2003
parent reply "Walter" <walter digitalmars.com> writes:
"Mark T" <Mark_member pathlink.com> wrote in message
news:bg9g0a$2b7v$1 digitaldaemon.com...
...would have had me talking to God on the big white
telephone (I'm learning Australian, and that means "puking"),

I think that is "calling God on the big white phone" I'm American and we used that expression back in the 1970's.

Hmm. I always heard it as "praying to the porcelain gods."
Jul 31 2003
next sibling parent Mark T <Mark_member pathlink.com> writes:
In article <bgblpp$1d3a$1 digitaldaemon.com>, Walter says...
"Mark T" <Mark_member pathlink.com> wrote in message
news:bg9g0a$2b7v$1 digitaldaemon.com...
...would have had me talking to God on the big white
telephone (I'm learning Australian, and that means "puking"),

I think that is "calling God on the big white phone" I'm American and we used that expression back in the 1970's.

Hmm. I always heard it as "praying to the porcelain gods."

I also did some of that, it helped to clear my head for doing programming homework in Algol, with all those BEGIN-END pairs
Aug 01 2003
prev sibling parent "Rich C" <no spam.com> writes:
"Walter" <walter digitalmars.com> wrote in message
news:bgblpp$1d3a$1 digitaldaemon.com...

 Hmm. I always heard it as "praying to the porcelain gods."

My favorite was always "driving the big white bus." Rich C.
Aug 01 2003
prev sibling next sibling parent reply Ilya Minkov <midiclub 8ung.at> writes:
Walter wrote:
 That leaves r"string" for wysiwyg string literals. The justifications are
 (thanks to Mark, etc., who pointed them all out):

Good. How do I insert a " in such a string? Perhaps with ""?

-i.
Jul 29 2003
parent reply "Walter" <walter digitalmars.com> writes:
"Ilya Minkov" <midiclub 8ung.at> wrote in message
news:bg729k$2qfc$1 digitaldaemon.com...
 Walter wrote:
 That leaves r"string" for wysiwyg string literals. The justifications are
 (thanks to Mark, etc., who pointed them all out):


Two ways:

r"string" \" r"more"
"string\"more"
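
The two approaches can be mirrored in Python, which also joins adjacent literals. This is only an analogy in another language, not D code: Python has no standalone \" token, so an ordinary escaped one-character string stands in for it here.

```python
# Way 1: raw piece + escaped quote + raw piece, joined by juxtaposition.
joined = r"string" "\"" r"more"

# Way 2: give up on the raw form and use an ordinary escaped string.
escaped = "string\"more"

assert joined == escaped == 'string"more'

# Python-specific caveat: inside a raw string, \" does not end the literal,
# but the backslash stays in the value.
assert r"string\"more" == 'string\\"more'
```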
Jul 29 2003
parent reply Ilya Minkov <midiclub 8ung.at> writes:
Walter wrote:
 Two ways:
 
 r"string" \" r"more"
 "string\"more"

Both being totally awful. One would run over many " when outputting any kind
of markup or source. 'string' had solved that problem quite neatly.

-i.
Jul 30 2003
parent reply Mark Evans <Mark_member pathlink.com> writes:
Walter wrote:
 Two ways:
 
 r"string" \" r"more"
 "string\"more"


The second idea is wrong. In a WYSIWYG string backslash means backslash, not
meta-escape. Raw strings should have no meta-escapes.

That's one reason I suggested r"string"more"r with a tail delimiter. It
alleviates some of the problem. The only way to deal with the whole problem
(e.g. now you want to embed "r in the string) is to use a smarter (less greedy)
lexer with (possibly, but not ideally) semantic feedback from the parser. That
gets messy.

I think "r is a reasonable compromise in that it permits embedded quote marks,
the most common need. Beyond that point, one should realize that static strings
are just a small subset of any program, and a little manual divide-and-conquer
is not that hard for the really hairy ones. There aren't that many to begin
with. If a lot of them are staring at you, then at that point you should use a
Python script to output proper D code, and automate the labor of implementing
divide-and-conquer as per Walter's idea #1.

Mark
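
A rough sketch of the kind of Python helper script suggested above. The function name is made up, and the escape choices (\xHH for ASCII control characters, \uXXXX and \UXXXXXXXX for everything outside ASCII) are assumptions about what the target compiler accepts, so check them against the D lexer before relying on them.

```python
def to_d_string_literal(text):
    """Turn arbitrary text into an escaped double-quoted literal.

    Hypothetical helper; the escape set is an assumption, not a D reference.
    """
    simple = {"\\": "\\\\", '"': '\\"', "\n": "\\n", "\r": "\\r", "\t": "\\t"}
    parts = ['"']
    for ch in text:
        code = ord(ch)
        if ch in simple:
            parts.append(simple[ch])
        elif 0x20 <= code < 0x7F:
            parts.append(ch)                    # printable ASCII passes through
        elif code < 0x80:
            parts.append("\\x%02X" % code)      # remaining ASCII control chars
        elif code <= 0xFFFF:
            parts.append("\\u%04X" % code)
        else:
            parts.append("\\U%08X" % code)
    parts.append('"')
    return "".join(parts)


markup = '<a href="x">\n'
assert to_d_string_literal(markup) == '"<a href=\\"x\\">\\n"'
assert to_d_string_literal("c:\\data.txt") == '"c:\\\\data.txt"'
```

Run over a file of markup or regular expressions, a script like this produces the escaped form once, so nobody has to count backslashes by hand.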
Jul 30 2003
parent reply Burton Radons <loth users.sourceforge.net> writes:
Mark Evans wrote:

Walter wrote:

Two ways:

r"string" \" r"more"
"string\"more"


The second idea is wrong. In a WYSIWYG string backslash means backslash, not meta-escape. Raw strings should have no meta-escapes. That's one reason I suggested r"string"more"r with a tail delimiter. It alleviates some of the problem.

Which is why he didn't USE a raw string.

I think using a character terminator is a good recipe for confusing lexing. I
think about the consequence of the symbols I write. Characters, no.

This should be solved in the text editor. Hit Control-Quote to enter a quote,
hit Control-Quote to exit one, and handle control characters automatically. The
only real solution at the language level is to put in a count indicator before
the string which is then read as raw UTF-8; everything else is just an inferior
simulacrum. But the IDE can do it correctly.
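
To make the count-indicator idea concrete, here is a minimal Python sketch. The surface syntax (a decimal byte count, a colon, then exactly that many bytes) is an assumption chosen for illustration; the post does not specify one. Because the reader never scans for a terminator, quotes and backslashes in the payload need no special treatment.

```python
def read_counted_string(src, start=0):
    """Read `<decimal byte count>:<that many bytes>` from src at start.

    Returns (decoded text, index just past the payload).
    Hypothetical syntax, for illustration only.
    """
    colon = src.index(b":", start)
    count = int(src[start:colon].decode("ascii"))
    begin = colon + 1
    end = begin + count
    if end > len(src):
        raise ValueError("count runs past the end of the source")
    return src[begin:end].decode("utf-8"), end


source = b'5:a"b\\c;'          # payload is the five bytes  a " b \ c
text, end = read_counted_string(source)
assert text == 'a"b\\c'
assert source[end:] == b";"
```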
Jul 30 2003
next sibling parent Burton Radons <loth users.sourceforge.net> writes:
Burton Radons wrote:

 This should be solved in the text editor.  Hit Control-Quote to enter a 
 quote, hit Control-Quote to exit one, and handle control characters 
 automatically.  The only real solution at the language level is to put 
 in a count indicator before the string which is then read as raw UTF-8; 
 everything else is just an inferior simulacrum.  But the IDE can do it 
 correctly.

Ack, ambiguous. I mean that if you put in a control character - an escape, an
out-of-range value - while in an enforced quote, the IDE will simply show the
string as what it really means, not with escapes. It could use any number of
schemes for indicating a string; something that would change when the caret is
within the string. When saving, this would be unpacked into escaped data.

The IDE would also be careful of rendering characters like NUL, which should be
drawn with a special symbol. \n would be transformed into a special symbol
(typed using Control-Enter?), while a newline would be rendered normally.

I think it would work. I'll play with it in dedit.
Jul 30 2003
prev sibling parent reply Mark Evans <Mark_member pathlink.com> writes:
Oh, I see now, it was a regular string wherein meta-escape is allowed.  Thanks
Burton.

Still that leaves open how you put " into a raw string.  Maybe the lexer will be
smart enough to search for the outermost feasible close-quote token.

I think using a character 
terminator is a good recipe for confusing lexing.  I think about the 
consequence of the symbols I write.  Characters, no.

I don't follow this statement. If it means "I'm a sloppy thinker when I see
characters so please no characters as terminator tokens," then why are they
valid as initiator tokens? And why not complain about 1.23L in D? I can buy
the argument both ways, but not endwise only. Considering that perspective,
initiator r" should presumably become raw" instead. Now we're on the path to
XML...

What might align with good D taste and simpler lexing is,

r["string"more"]
x["ABCD FFFF 0000"]

The important thing to me is not how characters as tokens feel, but what buys
maximum code readability and minimum forbidden embeddings for the minimum
keystrokes and lexing hassle.

Mark
Jul 30 2003
parent reply "Sean L. Palmer" <palmer.sean verizon.net> writes:
r["string"] conflicts with syntax for associative arrays.

To be honest, I really don't give a damn about raw strings.  If you want a
string in D, run it through a teeny program that escapes it properly and
paste it in.  The IDE can assist with this.

As far as the language goes, I see it as a non-issue, one already solved
perfectly well in C.

Sean

"Mark Evans" <Mark_member pathlink.com> wrote in message
news:bg9lu5$2gpt$1 digitaldaemon.com...
 Oh, I see now, it was a regular string wherein meta-escape is allowed.  Thanks
 Burton.

 Still that leaves open how you put " into a raw string.  Maybe the lexer will be
 smart enough to search for the outermost feasible close-quote token.

I think using a character
terminator is a good recipe for confusing lexing.  I think about the
consequence of the symbols I write.  Characters, no.

 I don't follow this statement. If it means "I'm a sloppy thinker when I see
 characters so please no characters as terminator tokens," then why are they
 valid as initiator tokens.  And why not complain about 1.23L in D.  I can buy
 the argument both ways, but not endwise only.  Considering that perspective,
 initiator r" should presumably become raw" instead.  Now we're on the path to
 XML...

 What might align with good D taste and simpler lexing is,

 r["string"more"]
 x["ABCD FFFF 0000"]

 The important thing to me is not how characters as tokens feel, but what buys
 maximum code readability and minimum forbidden embeddings for the minimum
 keystrokes and lexing hassle.

 Mark

Jul 30 2003
parent reply Mark Evans <Mark_member pathlink.com> writes:
Sean L. Palmer says...
r["string"] conflicts with syntax for associative arrays.

To be honest, I really don't give a damn about raw strings.  If you want a
string in D, run it through a teeny program that escapes it properly and
paste it in.

You are right about the square brackets. They just came off the top of my head.

Raw strings help in specialized niches. Embedded work is one. C offers no way
to declare a readable block of hexadecimal digits larger than one integer word.
You either have lots of commas, or lots of backslashes, or cryptic gibberish,
depending how you go about it. Variable-width spacing of the hex is hard if not
impossible, though often desirable in embedded work, e.g.
x"04EAC AB CD FAF 1234FFFFDDEE".

Not only have I written the translator scripts you suggest (only too many
times), but I have programmed dynamic regex construction for communications
protocols. That means the program creates and uses regular expressions which
are unknowable at compile time. These expressions are heavy with escape
characters. Writing and debugging such code is hard without raw strings.

Mark

P.S. Walter I don't know what you mean about Unicode but consider
utf8"string" vs. utf16"string" vs. utf32"string"...just thinking out loud.
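
Both points in this message, freely spaced hex blocks and escape-heavy regular expressions, can be sketched in Python. The helper name hex_block is an assumption, and it takes the view that whitespace may fall anywhere, even inside a byte, which is what the x"04EAC AB CD FAF 1234FFFFDDEE" example appears to need.

```python
import re


def hex_block(text):
    """Decode hex digits with arbitrary whitespace into bytes (hypothetical helper)."""
    digits = "".join(text.split())      # drop all whitespace, wherever it falls
    if len(digits) % 2:
        raise ValueError("odd number of hex digits")
    return bytes.fromhex(digits)        # raises ValueError on non-hex characters


data = hex_block("04EAC AB CD FAF 1234FFFFDDEE")
assert data == bytes([0x04, 0xEA, 0xCA, 0xBC, 0xDF, 0xAF,
                      0x12, 0x34, 0xFF, 0xFF, 0xDD, 0xEE])

# The regex half of the argument: the raw form needs half the backslashes.
pattern = r"\d+\.\d+"
assert pattern == "\\d+\\.\\d+"
assert re.fullmatch(pattern, "1.23") is not None
```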
Jul 31 2003
next sibling parent reply "Walter" <walter digitalmars.com> writes:
"Mark Evans" <Mark_member pathlink.com> wrote in message
news:bgbr92$1j3j$1 digitaldaemon.com...
 P.S. Walter I don't know what you mean about Unicode but consider
 utf8"string" vs. utf16"string" vs. utf32"string"...just thinking out loud.

1) Unicode source text will be accepted.
2) Use of \uXXXX and \UXXXXXXXX will be accepted in strings.
3) String literals can be converted at compile time between UTF-8, UTF-16
and UCS-32 all by doing the appropriate cast.
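
To illustrate what converting one literal between encodings involves, here is a small Python sketch. It converts at run time, whereas the post describes a compile-time cast in D, and it assumes "UCS-32" means the four-byte-per-character encoding usually called UTF-32. The sample text is arbitrary.

```python
text = "na\u00efve \u20ac"      # 7 characters, two of them written as \uXXXX
clef = "\U0001D11E"             # one character outside the 16-bit range

utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")
utf32 = text.encode("utf-32-le")

assert len(text) == 7
assert len(utf8) == 10          # 5 ASCII bytes + 2 for U+00EF + 3 for U+20AC
assert len(utf16) == 14         # 7 code units of 2 bytes
assert len(utf32) == 28         # 7 code units of 4 bytes

# A \UXXXXXXXX character needs a surrogate pair in UTF-16.
assert len(clef.encode("utf-16-le")) == 4
assert len(clef.encode("utf-32-le")) == 4

# The conversions are lossless in every direction.
assert utf16.decode("utf-16-le") == utf32.decode("utf-32-le") == text
```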
Jul 31 2003
parent reply Ilya Minkov <midiclub 8ung.at> writes:
Walter wrote:
 1) Unicode source text will be accepted.
 2) Use of \uXXXX and \UXXXXXXXX will be accepted in strings.
 3) String literals can be converted at compile time between UTF-8, UTF-16
 and UCS-32 all by doing the appropriate cast.

Then why not convert single-character-strings into character literals using a cast?

-i.
Jul 31 2003
next sibling parent "Matthew Wilson" <matthew stlsoft.org> writes:
Overloadability

"Ilya Minkov" <midiclub 8ung.at> wrote in message
news:bgc8ut$210h$1 digitaldaemon.com...
 Walter wrote:
 1) Unicode source text will be accepted.
 2) Use of \uXXXX and \UXXXXXXXX will be accepted in strings.
 3) String literals can be converted at compile time between UTF-8, UTF-16
 and UCS-32 all by doing the appropriate cast.

Then why not convert single-character-strings into character literals using a cast? -i.

Jul 31 2003
prev sibling parent "Walter" <walter digitalmars.com> writes:
"Ilya Minkov" <midiclub 8ung.at> wrote in message
news:bgc8ut$210h$1 digitaldaemon.com...
 Walter wrote:
 1) Unicode source text will be accepted.
 2) Use of \uXXXX and \UXXXXXXXX will be accepted in strings.
 3) String literals can be converted at compile time between UTF-8, UTF-16
 and UCS-32 all by doing the appropriate cast.

Then why not convert single-character-strings into character literals using a cast?

Because it's too much typing: cast(char)"a" for a list of them.
Jul 31 2003
prev sibling parent "Sean L. Palmer" <palmer.sean verizon.net> writes:
Perhaps I was a bit too hasty.  On reflection, it would be nice to be able
to declare large binary or hex arrays without so many extra 0x and commas.
And I guess for paths and regex it'd be nice to disable escapes.

What if you put the "raw" signifier *into* the string instead?  As an
escape.

"string\n"
"\:Rraw string"
"\:Xdeadbeef"
"\:B01001110"

"\:" would be the escape trigger for string mode in the above (I chose
backslash colon because I believe it to be unused currently, but it could be
anything).  This escape code doesn't emit any character at all, just changes
string mode to raw, hex, or binary (or whatever).

Would that work?  Seems more C compatible than the other alternatives.
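
A minimal Python sketch of how the "\:" mode escape described above might be decoded. Everything here is an assumption made for illustration: the function name, the rule that the mode marker may only appear at the very start of the string, and the tiny escape set used when no marker is present.

```python
def decode_mode_string(body):
    """Decode the text between the quotes under the hypothetical \\: scheme."""
    if body.startswith("\\:") and len(body) >= 3:
        mode, rest = body[2], body[3:]
        if mode == "R":                       # raw: no escape processing at all
            return rest.encode("utf-8")
        if mode == "X":                       # hex digits, two per byte
            return bytes.fromhex(rest)
        if mode == "B":                       # binary digits, eight per byte
            if len(rest) % 8:
                raise ValueError("binary mode needs a multiple of 8 bits")
            return bytes(int(rest[i:i + 8], 2) for i in range(0, len(rest), 8))
        raise ValueError("unknown mode letter: " + mode)

    # No mode marker: ordinary escaped string (small subset of escapes only).
    escapes = {"n": "\n", "t": "\t", "\\": "\\", '"': '"'}
    out = []
    i = 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            if nxt not in escapes:
                raise ValueError("unsupported escape: \\" + nxt)
            out.append(escapes[nxt])
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out).encode("utf-8")


assert decode_mode_string("string\\n") == b"string\n"
assert decode_mode_string("\\:Rraw string") == b"raw string"
assert decode_mode_string("\\:Xdeadbeef") == b"\xde\xad\xbe\xef"
assert decode_mode_string("\\:B01001110") == b"\x4e"
```

One consequence of this design is that the lexer still finds the closing quote the usual way, so a raw-mode string could not itself contain a quote unless further rules were added.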

Sean

"Mark Evans" <Mark_member pathlink.com> wrote in message
news:bgbr92$1j3j$1 digitaldaemon.com...
 Sean L. Palmer says...
r["string"] conflicts with syntax for associative arrays.

To be honest, I really don't give a damn about raw strings.  If you want a
string in D, run it through a teeny program that escapes it properly and
paste it in.

 You are right about the square brackets. They just came off the top of my head.

 Raw strings help in specialized niches.  Embedded work is one.  C offers no way
 to declare a readable block of hexadecimal digits larger than one integer word.
 You either have lots of commas, or lots of backslashes, or cryptic gibberish,
 depending how you go about it.  Variable-width spacing of the hex is hard if not
 impossible, though often desirable in embedded work, e.g.
 x"04EAC AB CD FAF 1234FFFFDDEE".

 Not only have I written the translator scripts you suggest (only too many
 times), but I have programmed dynamic regex construction for communications
 protocols.  That means the program creates and uses regular expressions which
 are unknowable at compile time.  These expressions are heavy with escape
 characters.  Writing and debugging such code is hard without raw strings.

 Mark
 P.S. Walter I don't know what you mean about Unicode but consider
 utf8"string" vs. utf16"string" vs. utf32"string"...just thinking out loud.

Jul 31 2003
prev sibling next sibling parent reply Frank Wills <name host.com> writes:
I'm glad this is the choice. I was worried (just a little) that
we were going to end up with something ugly and complicated.

Walter wrote:
 This has been a most entertaining discussion, with a lot of great ideas. But
 I have to pick one. I personally liked the `string`, but there are too many
 problems with it, such as unreadable fonts, using a non-C character, etc.
 The """string""" was just too ugly for me (sorry).
 
 That leaves r"string" for wysiwyg string literals. The justifications are
 (thanks to Mark, etc., who pointed them all out):
 
 1) It sticks to the C character set.
 2) No problems with different fonts.
 3) Establishes a precedent for new types of special strings.
 4) Easy to tokenize.
 5) There's precedent experience with it in other languages, such as Python.
 
 

Jul 30 2003
parent "Walter" <walter digitalmars.com> writes:
"Frank Wills" <name host.com> wrote in message
news:bg84t6$ujs$1 digitaldaemon.com...
 I'm glad this is the choice. I was worried (just a little) that
 we were going to end up with something ugly and complicated.

I hate ugly and complicated, that's why I'm doing D <G>.
Jul 30 2003
prev sibling next sibling parent Burton Radons <loth users.sourceforge.net> writes:
Walter wrote:

 That leaves r"string" for wysiwyg string literals. The justifications are
 (thanks to Mark, etc., who pointed them all out):

If RE had an escape for the double-quote that would be even more helpful; \q is available and not used in Python RE or Perl RE.
Jul 30 2003
prev sibling parent reply Mark Evans <Mark_member pathlink.com> writes:
Walter wrote,

1) It sticks to the C character set.
2) No problems with different fonts.
3) Establishes a precedent for new types of special strings.
4) Easy to tokenize.
5) There's precedent experience with it in other languages, such as Python.

6) Permits qualifiers such as n (null), hN (length header of size N bytes),
and pN (pad to next Nth byte). These fine-tuning controls could become
important without C's single-quote 'abcd' construct. Here are some C
language translations.

D proposed              ANSI C

r"string"          --> 'string'
rn"string"         --> 'string\0'
rh2"string"        --> '\0\6string'
rh4"string"        --> '\0\0\0\6string'
rh7"string"        --> '\0\0\0\0\0\0\6string'
rh4n"string"       --> '\0\0\0\6string\0'
rp4"string"        --> 'string\0\0'
rnp4"string"       --> 'string\0\0'

Python also has u for Unicode, which I would simply copy like r.

Maybe going over the top here, I suggest that all of these have
command line default settings so that the meaning of r can be
set once and forgotten. The - symbol could be used to override
in the source code for special cases.

rn- means turn off null even if defaulted 'on'
rh- means turn off header
rp- means turn off padding
rn-h-p- means turn off all

The b and x strings would address a serious need in embedded work and could
benefit from the header and padding qualifiers.

Strings should concatenate by simple juxtaposition. That behavior enables
embedded comments:

// comment
myVar = x"ABCD 0000"
// another comment
x"FFFF BCDA"
// a final comment
;

means

myVar = x"ABCD0000FFFFBCDA";

Word alignment issues would be decided after concatenation, not before. Strings
concatenate bit by bit.

Mark
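
The n, hN and pN qualifiers in the table can be modelled with a short Python function. This is a sketch under stated assumptions: the function and parameter names are invented, the header is big-endian and counts the text bytes only (which is what the table rows show), and padding is applied to the text plus terminator but not to the header, a case the table does not cover.

```python
def build_literal(text, null=False, header=0, pad=0):
    """Model the proposed qualifiers: null -> n, header -> hN, pad -> pN."""
    payload = text.encode("utf-8")
    length = len(payload)               # header counts the text only
    if null:
        payload += b"\0"
    if pad:
        payload += b"\0" * (-len(payload) % pad)   # pad up to a multiple of N
    prefix = length.to_bytes(header, "big") if header else b""
    return prefix + payload


# Each assertion reproduces one row of the table above.
assert build_literal("string") == b"string"                                # r
assert build_literal("string", null=True) == b"string\0"                   # rn
assert build_literal("string", header=2) == b"\0\6string"                  # rh2
assert build_literal("string", header=4) == b"\0\0\0\6string"              # rh4
assert build_literal("string", header=7) == b"\0\0\0\0\0\0\6string"        # rh7
assert build_literal("string", header=4, null=True) == b"\0\0\0\6string\0" # rh4n
assert build_literal("string", pad=4) == b"string\0\0"                     # rp4
assert build_literal("string", null=True, pad=4) == b"string\0\0"          # rnp4
```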
Jul 30 2003
next sibling parent reply Mark Evans <Mark_member pathlink.com> writes:
Different string types concatenate, too:

myVar = x"0123" r"string";  -->  myVar = '\0\1\2\3string';

while

myVar = x"0123" r"string" b"101";

creates an 83-bit entity whose alignment issues are open to discussion, but
should be controllable.  Then D will be a real systems language.

Mark
Jul 30 2003
parent reply Dario <Dario_member pathlink.com> writes:
Mark Evans:
Different string types concatenate, too:
myVar = x"0123" r"string";  -->  myVar = '\0\1\2\3string';

What should x"0123" be? A byte array like [0x01, 0x23] or like
[0x0, 0x1, 0x2, 0x3]? This seems strange to me. It's not that intuitive.

-Dario
Jul 31 2003
parent Mark Evans <Mark_member pathlink.com> writes:
Dario says...
Mark Evans:
Different string types concatenate, too:
myVar = x"0123" r"string";  -->  myVar = '\0\1\2\3string';

What should x"0123" be? A byte array like [0x01, 0x23] or like
[0x0, 0x1, 0x2, 0x3]? This seems strange to me. It's not that intuitive.
-Dario

Sorry that my memory for C escape syntax is getting rusty. x"0123" would be a
16-bit data chunk with C equivalents

'\x01\x23'
{0x01,0x23}
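
A small Python sketch of bit-by-bit concatenation under this corrected reading, using text strings of '0' and '1' to stand for bits. The helper names are invented. With x"0123" taken as two bytes, the x + r + b example from the earlier message comes to 16 + 48 + 3 = 67 bits; the 83-bit figure quoted there corresponds to the four-byte reading (32 + 48 + 3).

```python
def hex_bits(digits):
    """Four bits per hex digit; whitespace between digits is ignored."""
    digits = "".join(digits.split())
    return "".join(format(int(d, 16), "04b") for d in digits)


def text_bits(text):
    """Eight bits per UTF-8 byte of the text."""
    return "".join(format(b, "08b") for b in text.encode("utf-8"))


def binary_bits(bits):
    """A b"..." literal is already a bit string; just validate it."""
    if set(bits) - {"0", "1"}:
        raise ValueError("binary literal may contain only 0 and 1")
    return bits


assert hex_bits("0123") == "0000000100100011"      # the bytes 0x01, 0x23

combined = hex_bits("0123") + text_bits("string") + binary_bits("101")
assert len(combined) == 16 + 48 + 3 == 67
```

Since 67 is not a multiple of 8, some alignment or padding rule is needed before the result can be stored, which is the open question raised in the earlier message.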
Jul 31 2003
prev sibling parent "Walter" <walter digitalmars.com> writes:
Some good ideas here.

Some nits:
1) D already concatenates strings that are juxtaposed.
2) Embedded unicode in strings will be fully supported in the next release,
no special prefix needed.

"Mark Evans" <Mark_member pathlink.com> wrote in message
news:bg96tv$21cf$1 digitaldaemon.com...
 Walter wrote,

1) It sticks to the C character set.
2) No problems with different fonts.
3) Establishes a precedent for new types of special strings.
4) Easy to tokenize.
5) There's precedent experience with it in other languages, such as Python.

 6) Permits qualifiers such as n (null), hN (length header of size N bytes),
 and pN (pad to next Nth byte).  These fine-tuning controls could become
 important without C's single-quote 'abcd' construct.  Here are some C
 language translations.

 D proposed              ANSI C

 r"string"          --> 'string'
 rn"string"         --> 'string\0'
 rh2"string"        --> '\0\6string'
 rh4"string"        --> '\0\0\0\6string'
 rh7"string"        --> '\0\0\0\0\0\0\6string'
 rh4n"string"       --> '\0\0\0\6string\0'
 rp4"string"        --> 'string\0\0'
 rnp4"string"       --> 'string\0\0'

 Python also has u for Unicode, which I would simply copy like r.

 Maybe going over the top here, I suggest that all of these have
 command line default settings so that the meaning of r can be
 set once and forgotten.  The - symbol could be used to override
 in the source code for special cases.

 rn- means turn off null even if defaulted 'on'
 rh- means turn off header
 rp- means turn off padding
 rn-h-p- means turn off all

 The b and x strings would address a serious need in embedded work and could
 benefit from the header and padding qualifiers.

 Strings should concatenate by simple juxtaposition.  That behavior enables
 embedded comments:

 // comment
 myVar = x"ABCD 0000"
 // another comment
 x"FFFF BCDA"
 // a final comment
 ;

 means

 myVar = x"ABCD0000FFFFBCDA";

 Word alignment issues would be decided after concatenation, not before.  Strings
 concatenate bit by bit.

 Mark

Jul 31 2003