digitalmars.D.learn - Encoding problems...
Hi all,
Quick question: I want to use some unicode identifiers, but I get
"unsupported char 0xe2", both with using and not using a BOM. The
characters in question are the superset/subset-equals operators: ⊇ and
⊆... Perhaps these are just unsupported by DMD (in which case, I'll file
a bug)?
Thanks,
Robert
On Wed, May 27, 2009 at 8:55 PM, Robert Fraser
<fraserofthenight gmail.com> wrote:
Hi all,
Quick question: I want to use some unicode identifiers, but I get
"unsupported char 0xe2", both with using and not using a BOM. The characters
in question are the superset/subset-equals operators: ⊇ and ⊆... Perhaps
these are just unsupported by DMD (in which case, I'll file a bug)?
Thanks,
Robert
If they're not classified as "universal alpha" I don't think you can
use them in identifiers.
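A note on the error message: 0xe2 is what one would expect to see if the lexer rejects the first byte of the UTF-8 encoding of either character, which would also explain why the BOM makes no difference. A minimal Python sketch (Python is used purely for illustration, and the helper name `utf8_bytes` is made up for this example) shows the bytes involved:

```python
def utf8_bytes(ch: str) -> list[int]:
    """Return the UTF-8 encoding of a single character as a list of byte values."""
    return list(ch.encode("utf-8"))


for ch in ("⊇", "⊆"):
    hex_bytes = " ".join(f"0x{b:02x}" for b in utf8_bytes(ch))
    # Both characters encode to three bytes, and both start with 0xe2.
    print(f"U+{ord(ch):04X}: {hex_bytes}")
```

Both sequences begin with 0xe2, so the message most likely points at the first byte of the symbol rather than at a file-encoding problem.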
Jarrett Billingsley wrote:
On Wed, May 27, 2009 at 8:55 PM, Robert Fraser
<fraserofthenight gmail.com> wrote:
Hi all,
Quick question: I want to use some unicode identifiers, but I get
"unsupported char 0xe2", both with using and not using a BOM. The characters
in question are the superset/subset-equals operators: ⊇ and ⊆... Perhaps
these are just unsupported by DMD (in which case, I'll file a bug)?
Thanks,
Robert
If they're not classified as "universal alpha" I don't think you can
use them in identifiers.
Lame. K; thanks.
Robert Fraser wrote:
Jarrett Billingsley wrote:
On Wed, May 27, 2009 at 8:55 PM, Robert Fraser
<fraserofthenight gmail.com> wrote:
Hi all,
Quick question: I want to use some unicode identifiers, but I get
"unsupported char 0xe2", both with using and not using a BOM. The characters
in question are the superset/subset-equals operators: ⊇ and ⊆... Perhaps
these are just unsupported by DMD (in which case, I'll file a bug)?
Thanks,
Robert
If they're not classified as "universal alpha" I don't think you can
use them in identifiers.
How the hell did your news client switch from UTF-8 to
Japanese-something? (charset=UTF-8 => charset=ISO-2022-JP)
Lame. K; thanks.
Don't worry, people working with your code will be thankful!
grauzone wrote:
Robert Fraser wrote:
Jarrett Billingsley wrote:
On Wed, May 27, 2009 at 8:55 PM, Robert Fraser
<fraserofthenight gmail.com> wrote:
Hi all,
Quick question: I want to use some unicode identifiers, but I get
"unsupported char 0xe2", both with using and not using a BOM. The characters
in question are the superset/subset-equals operators: ⊇ and ⊆... Perhaps
these are just unsupported by DMD (in which case, I'll file a bug)?
Thanks,
Robert
If they're not classified as "universal alpha" I don't think you can
use them in identifiers.
How the hell did your news client switch from UTF-8 to
Japanese-something? (charset=UTF-8 => charset=ISO-2022-JP)
Lame. K; thanks.
Don't worry, people working with your code will be thankful!
Hmm... I'd say x.⊆(y) is preferable x.isSubsetOf(y), but it's not a huge
deal.
Reply to Robert,
Hmm... I'd say x.⊆(y) is preferable x.isSubsetOf(y), but it's not a
huge deal.
Only until you have to type it. I think universal alpha includes only the
union of things that can be easily typed on standard keyboards. I don't think
any keyboard (ok maybe an APL keyboard) has the subset symbol on it.
BCS wrote:
Reply to Robert,
Hmm... I'd say x.⊆(y) is preferable x.isSubsetOf(y), but it's not a
huge deal.
Only until you have to type it. I think universal alpha includes only
the union of things that can be easily typed on standard keyboards.
What inspired you to form that opinion?
My impression was that it's some standard list of Unicode characters
that are letters (or logogram or ideogram or whatever) in some language
somewhere in the world.
Anyway....
http://www.digitalmars.com/d/1.0/lex.html
"Identifiers start with a letter, _, or universal alpha, and are
followed by any number of letters, _, digits, or universal alphas.
Universal alphas are as defined in ISO/IEC 9899:1999(E) Appendix D.
(This is the C99 Standard.)"
I eventually managed to find this:
http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf
Stewart.
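The identifier rule quoted from lex.html can be sketched roughly as follows. This is an illustration in Python, not how DMD implements it: the function names are invented, keywords are not checked, and `UNIVERSAL_ALPHA_EXCERPT` is a deliberately tiny excerpt (two Latin ranges) standing in for the full table in the C99 document, which should be consulted for the real list.

```python
# Hypothetical, deliberately incomplete stand-in for the C99 universal-alpha
# table: only two Latin ranges are listed here for illustration.
UNIVERSAL_ALPHA_EXCERPT = [(0x00C0, 0x00D6), (0x00D8, 0x00F6)]


def is_universal_alpha(ch: str) -> bool:
    """True if ch falls in one of the (excerpted) universal-alpha ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in UNIVERSAL_ALPHA_EXCERPT)


def is_ascii_letter(ch: str) -> bool:
    return "a" <= ch <= "z" or "A" <= ch <= "Z"


def is_ascii_digit(ch: str) -> bool:
    return "0" <= ch <= "9"


def is_identifier(text: str) -> bool:
    """Apply the quoted rule: start with a letter, _ or universal alpha,
    then any number of letters, _, digits or universal alphas."""
    if not text:
        return False
    first, rest = text[0], text[1:]
    if not (is_ascii_letter(first) or first == "_" or is_universal_alpha(first)):
        return False
    return all(
        is_ascii_letter(c) or c == "_" or is_ascii_digit(c) or is_universal_alpha(c)
        for c in rest
    )


print(is_identifier("isSubsetOf"))  # ordinary ASCII name
print(is_identifier("⊆"))           # U+2286 is in none of the listed ranges
```

Under this sketch, a name is rejected as soon as one character falls outside the listed ranges, which matches the behaviour described at the top of the thread.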
Reply to Stewart,
BCS wrote:
Only until you have to type it. I think universal alpha includes only
the union of things that can be easily typed on standard keyboards.
What inspired you to form that opinion?
My impression was that it's some standard list of Unicode characters
that are letters (or logogram or ideogram or whatever) in some
language somewhere in the world.
That's more or less the same thing (although I'll admit, my original comment
is not well stated). I'm not just talking about standard QWERTY keyboard
but also standard keyboards for other languages and alphabets. I rather suspect
that for every char in universal alpha, there is a standard keyboard somewhere
that has it.
BCS wrote:
Reply to Stewart,
My impression was that it's some standard list of Unicode characters
that are letters (or logogram or ideogram or whatever) in some
language somewhere in the world.
That's more or less the same thing (although I'll admit, my original
comment is not well stated).
Indeed, my keyboard has a number of punctuation characters, most of
which aren't valid in identifiers.
I'm not just talking about standard QWERTY
keyboard but also standard keyboards for other languages and alphabets.
I'd got that far.
I rather suspect that for every char in universal alpha, there is a
standard keyboard somewhere that has it.
So I guess it's therefore likely to exclude ancient scripts with not
enough modern use to have warranted the invention of a standard keyboard
therefor. (One omission I noticed is Phoenician, though that may be
also due to its later arrival in Unicode.)
Stewart.
Hello Stewart,
So I guess it's therefore likely to exclude ancient scripts with not
enough modern use to have warranted the invention of a standard
keyboard therefor. (One omission I noticed is Phoenician, though that
may be also due to its later arrival in Unicode.)
Anyone who really wants to use Phoenician for symbol names should be taken
out and shot (with a nerf gun).
Stewart.
BCS wrote:
Reply to Robert,
Hmm... I'd say x.⊆(y) is preferable x.isSubsetOf(y), but it's not a
huge deal.
Only until you have to type it. I think universal alpha includes only
the union of things that can be easily typed on standard keyboards. I
don't think any keyboard (ok maybe an APL keyboard) has the subset
symbol on it.
I have 10 configurable keys on my keyboard, none of which are in use. I
could also remap my numpad (cause, seriously, who uses this?) Also, many
editors can be configured so that a sequence of characters converts to a
single one.
There appears to be no reason that mathematical symbols aren't allowed
in identifiers... Think of how awesome it would be to write
assert(x⊇y→∀a∈x∃b∈y(a⊇b)) ... Okay, that would require overloading of
those operators (and instantiating variables in a new way), but still!
Hello Robert,
BCS wrote:
Reply to Robert,
Hmm... I'd say x.⊆(y) is preferable x.isSubsetOf(y), but it's not a
huge deal.
Only until you have to type it. I think universal alpha includes only
the union of things that can be easily typed on standard keyboards. I
don't think any keyboard (ok maybe an APL keyboard) has the subset
symbol on it.
I could also remap my numpad (cause, seriously, who uses this?) Also,
many editors can be configured so that a sequence of characters
converts to a single one.
There appears to be no reason that mathematical symbols aren't allowed
in identifiers... Think of how awesome it would be to write
assert(x⊇y→∀a∈x∃b∈y(a⊇b)) ... Okay, that would require overloading of
those operators (and instantiating variables in a new way), but still!
Allowing them as operators would be cool (and won't happen for another whole
host of reasons that have nothing to do with this) but in identifiers? Not
a chance. I don't care what you can type, what matters is what /I/ can type
(the generic 'I', assuming I can read your comments -> I use your language
-> I use your alphabet).
Robert Fraser wrote:
BCS wrote:
Reply to Robert,
Hmm... I'd say x.⊆(y) is preferable x.isSubsetOf(y), but it's not a
huge deal.
Only until you have to type it. I think universal alpha includes only
the union of things that can be easily typed on standard keyboards. I
don't think any keyboard (ok maybe an APL keyboard) has the subset
symbol on it.
I have 10 configurable keys on my keyboard, none of which are in use. I
could also remap my numpad (cause, seriously, who uses this?) Also, many
editors can be configured so that a sequence of characters converts to a
single one.
Which would possibly make D the first language to *require* a
specialised keyboard and/or editor since APL.
Not a good precedent.
Oh, and don't try to argue it isn't mandatory. If you can overload
those operators, people WILL use them and WILL complain that it's too hard.
There appears to be no reason that mathematical symbols aren't allowed
in identifiers... Think of how awesome it would be to write
assert(x⊇y→∀a∈x∃b∈y(a⊇b)) ... Okay, that would require overloading of
those operators (and instantiating variables in a new way), but still!
I think that example you gave is an excellent reason not to allow them. :D
It would be nice, but it's really not feasible without widespread editor
and/or keyboard support for extra symbols, which I just don't see happening.
Daniel Keep Wrote:
It would be nice, but it's really not feasible without widespread editor
and/or keyboard support for extra symbols, which I just don't see happening.
http://www.microsoft.com/globaldev/tools/msklc.mspx
:)))
Robert Fraser wrote:
Hi all,
Quick question: I want to use some unicode identifiers, but I get
"unsupported char 0xe2", both with using and not using a BOM. The
characters in question are the superset/subset-equals operators: ⊇ and
⊆... Perhaps these are just unsupported by DMD (in which case, I'll file
a bug)?
Thanks,
Robert
www.open-std.org/JTC1/SC22/wg14/www/docs/n1124.pdf
(As an aside, Google's link obfuscation is hella annoying.)
The relevant range is U+2200 to U+22FF (specifically U+2286, U+2287).
It's not included.
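The code points mentioned above can be double-checked with Python's standard `unicodedata` module. This is only a quick sketch, and the helper name `describe` is made up for the example. It reports Unicode's own classification of the characters, which is a separate thing from the fixed list in the C standard, so treat it as supporting evidence rather than the rule itself:

```python
import unicodedata


def describe(ch: str) -> tuple[str, str, str]:
    """Return (code point, Unicode name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


for ch in ("⊆", "⊇"):
    code_point, name, category = describe(ch)
    # Both land in the U+2200..U+22FF block mentioned above.
    in_block = 0x2200 <= ord(ch) <= 0x22FF
    print(code_point, name, category, in_block)
```

Both characters carry the general category Sm (math symbol) rather than one of the letter categories, which fits with their absence from a list of alphabetic characters.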
Hello Christopher,
(As an aside, Google's link obfuscation is hella annoying.)
??
Reply to Christopher,
BCS wrote:
Hello Christopher,
(As an aside, Google's link obfuscation is hella annoying.)
http://www.google.com/url?sa=t&source=web&ct=res&cd=4&url=http%3A%2F%2Fwww.open-std.org%2FJTC1%2FSC22%2Fwg14%2Fwww%2Fdocs%2Fn1124.pdf&ei=IQofSs23FNjXlAeJmeXGBQ&usg=AFQjCNGZNITNpxvZKard5pSr7RQvxmTDkQ&sig2=8T5gS1aSODl4KdKmy2jp_w
Eugh.
only if you are logged in to a google account. The mangling is so they can
tell what you click on for ( you.are(paranoid) ? "stalking you" : "creating
better personalized search results" )
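For anyone stuck with one of these mangled links, the real target is still sitting in the `url` query parameter, percent-encoded. A small Python sketch using only the standard library (the function name `redirect_target` is invented here, and it assumes the link has the same shape as the one quoted above):

```python
from urllib.parse import parse_qs, urlsplit


def redirect_target(mangled_url: str) -> str:
    """Extract and percent-decode the 'url' query parameter of a redirect link."""
    query = parse_qs(urlsplit(mangled_url).query)
    return query["url"][0]


# The link quoted above, split across string literals only to keep lines short.
mangled = (
    "http://www.google.com/url?sa=t&source=web&ct=res&cd=4"
    "&url=http%3A%2F%2Fwww.open-std.org%2FJTC1%2FSC22%2Fwg14%2Fwww%2Fdocs%2Fn1124.pdf"
    "&ei=IQofSs23FNjXlAeJmeXGBQ&usg=AFQjCNGZNITNpxvZKard5pSr7RQvxmTDkQ"
    "&sig2=8T5gS1aSODl4KdKmy2jp_w"
)
print(redirect_target(mangled))
```

`parse_qs` does the percent-decoding, so the result is the plain open-std.org address posted earlier.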