digitalmars.D.announce - Adding Unicode operators to D

Andrei Alexandrescu (3/3) Oct 22 2008 Please vote up before the haters take it down, and discuss:

Andrei Alexandrescu (4/11) Oct 22 2008 Correx:

Steven Schveighoffer (24/27) Oct 22 2008 No thanks. Please let's only use operators that are on the keys of my

Jarrett Billingsley (3/4) Oct 22 2008 Beeeecause not everyone uses emacs?

Steven Schveighoffer (6/10) Oct 22 2008 Including myself ;)

Bill Baxter (18/23) Oct 22 2008 Actually, the solutions aren't that far apart. Andrei's solution
Bill Baxter (10/14) Oct 22 2008 In fact, I think there are only like three of us using emacs. :-) So
Steven Schveighoffer (14/81) Oct 23 2008 All that is being proposed right now is syntax sugar. Cross product, do...
Sergey Gromov (6/8) Oct 23 2008 I think an editor is not the only thing that displays your program's

KennyTM~ (6/15) Oct 23 2008 I agree.

Paul D. Anderson (5/19) Oct 22 2008 Java allows unicode variable names. The Greek letter 'pi' is a valid var...
Spacen Jasset (11/26) Oct 23 2008 I haven't really ever felt the need for such things. It would require

Bill Baxter (19/29) Oct 23 2008 I think that's the conclusion I'm coming too as well. While the use

Walter Bright (3/21) Oct 23 2008 Unfortunately, you might be right in that D is not currently in a

Nick Sabalausky (34/55) Oct 23 2008 My various thoughts:

Don (32/47) Oct 28 2008 Entering this debate late:

Sergey Gromov (5/7) Oct 28 2008 I'd use dot "⋅" and cross "×" products for 3D, union "∪" and

bearophile (17/20) Oct 28 2008 I just want to note that the whole thread is almost unreadable on the di...

KennyTM~ (3/31) Oct 28 2008 If the two sets are incomparable, just return NaN... We need an opCmp

KennyTM~ (4/40) Oct 28 2008 Actually I've made a working solution. Even the exotic operators like

Andrei Alexandrescu (9/18) Oct 28 2008 In my opinion, a workable feature is this:

Bill Baxter (20/20) Oct 28 2008 On Wed, Oct 29, 2008 at 4:12 AM, Andrei Alexandrescu <...
Don (18/38) Oct 29 2008 Do we really need to do that? How many Unicode binary operators are ther...
Walter Bright (3/5) Oct 29 2008 That throws out the ability to parse without semantic analysis. It's not...

Andrei Alexandrescu (3/9) Oct 29 2008 It doesn't per a previous post of mine, but I agree it's still not worth...

Benji Smith (5/14) Oct 28 2008 I have pretty much the same list.

Moritz Warning (5/11) Oct 22 2008 It would be very nice to have unicode operators.

Moritz Warning (2/16) Oct 22 2008 sorry posted in d.announce by .. accident. :/
Nick Sabalausky (3/14) Oct 22 2008 I'd certainly like opIntersection and maybe opUnion.

Bill Baxter (24/26) Oct 22 2008 (My comment cross posted here from reddit)

Jesse Phillips (11/45) Oct 22 2008 I don't find this terribly appealing. Walter mentions having thrown out
Don (16/37) Oct 23 2008 I agree.

Sergey Gromov (3/4) Oct 23 2008 Lots of question marks here. This sucks.

Spacen Jasset (36/71) Oct 25 2008 I am not entirely sure that 30 or (x amount) of new operators would be a...

Andrei Alexandrescu (11/64) Oct 25 2008 I have noticed that in pretty much all scientific code, the f(a, b) and

Spacen Jasset (13/83) Oct 25 2008 Yes, that is indeed a fair point and I agree. D is a "systems

Bill Baxter (28/45) Oct 25 2008 Yes, heavy math code is hard to read in the current situation.

bearophile (5/8) Oct 25 2008 I have seen many scientific programs that use numpy, so sometimes it's f...

Bill Baxter (28/35) Oct 25 2008 Yep C/D/C++ is easier. The SciPy.org site has a growing section of

bearophile (11/14) Oct 25 2008 Mixing languages isn't nice, I agree. That's why I too use D for several...

Andrei Alexandrescu (9/67) Oct 25 2008 Surprisingly there's not a lot of choice, witnessed by the prevalence of

Bruno Medeiros (9/59) Oct 26 2008 But what operators would be added? Some mathematician programmers might

KennyTM~ (9/70) Oct 26 2008 Composition may be useful for functional programming (I've never used
Andrei Alexandrescu (21/80) Oct 26 2008 I was thinking of allowing a general way of defining one Unicode

KennyTM~ (15/104) Oct 26 2008 LaTeX in D? :p

Charles Hixson (39/100) Oct 26 2008 Perhaps what needs to be added is a syntax for defining character to

Simen Kjaeraas (23/41) Oct 26 2008 This made me think. What if we /could/ define arbitrary infix operators ...

Bill Baxter (10/53) Oct 26 2008 uk>

Simen Kjaeraas (5/67) Oct 26 2008 Yup, I realized this myself as well. Seemed like such a great idea when ...

Bill Baxter (13/79) Oct 26 2008 :

Simen Kjaeraas (12/19) Oct 26 2008 An interesting read, though I have looked at downs' code before. It

Andrei Alexandrescu (4/73) Oct 26 2008 An operator could always be defined to have the same precedent as an

Bill Baxter (6/14) Oct 26 2008 Walter said in a previous post a few days ago when I suggested it that

Andrei Alexandrescu (7/21) Oct 26 2008 It can be done, but it's kinda involved. You define a grammar in which

Bill Baxter (8/32) Oct 26 2008 I see. So the price you pay is that you defer more decisions till

Andrei Alexandrescu (3/32) Oct 26 2008 Yah. Something tells me Walter won't embark on that soon.

Walter Bright (3/4) Oct 26 2008 Not a chance . Producing an amorphous list of tokens isn't what I'd

Max Samukha (5/8) Oct 22 2008 I'm already having problems with unicode: the news reader I'm using
bearophile (12/12) Oct 23 2008 Andrei Alexandrescu:

Robert Fraser (2/3) Oct 23 2008 So does D.

Max Samukha (6/9) Oct 23 2008 I'd like to note that identifiers in a non-English language are

Yigal Chripun (9/21) Oct 23 2008 isn't that something that should be decided upon on a per-project basis?

bearophile (4/4) Oct 23 2008 I always use English for variable names, instead of my language, because...

Max Samukha (3/7) Oct 23 2008 Keep children away from Python. Let them have happy lives :)

Walter Bright (3/8) Oct 23 2008 D currently allows Unicode in identifiers, comments, and strings. In

Andrei Alexandrescu (5/6) Oct 23 2008 [snip]

Yigal Chripun (22/29) Oct 23 2008 A few thoughts on the subject:
KennyTM~ (78/85) Oct 23 2008 I suggest not. There are problems if you adopt Unicode as operators:

Bruno Medeiros (5/9) Oct 24 2008 Then I suggest a change in career... ^^'

Simen Kjaeraas (30/33) Oct 23 2008 I really like the idea of having more unicode in the language, but I fee...

Bill Baxter (9/9) Oct 23 2008 On Fri, Oct 24, 2008 at 5:48 AM, Simen Kjaeraas <simen...

Simen Kjaeraas (4/13) Oct 23 2008 That seems to make sense indeed.

Bruno Medeiros (8/15) Oct 24 2008 Hum, interesting example, it actually made me realize that 'null' would

Simen Kjaeraas (6/16) Oct 24 2008 Well, we norwegians got the Ø (html entity Ø, Latin-1 character ...
KennyTM~ (3/19) Oct 24 2008 auto Ø = null; // \Ø

Bruno Medeiros (8/29) Oct 26 2008 It's an interesting and effective way to save some typing, and it might

Bruno Medeiros (30/37) Oct 24 2008 I'm unsure about this idea.

Simen Kjaeraas (6/33) Oct 24 2008 I'd guess this oughtta do it:

Bruno Medeiros (6/46) Oct 26 2008 Yes, exactly that! I had the impression there was such a program for
Robert Fraser (5/12) Oct 26 2008 I remember this same question being asked on a Microsoft DL when I was

bearophile (35/36) Oct 24 2008 Fortress uses pairs of symbols to denote various sequence literarls. Som...

ore-sama (2/4) Oct 24 2008 Console is a legacy technology (you even still call it "DOS"), why expec...

Bill Baxter (5/9) Oct 24 2008 So tell me what the alternative is? I had trouble with running D

Sergey Gromov (6/18) Oct 24 2008 A regular Windows console supports UTF-8 to some extent:

Bill Baxter (3/21) Oct 24 2008 I did that but "type " still prints garbage.

Yigal Chripun (10/33) Oct 24 2008 so don't use type. use notepad instead...

Benji Smith (9/43) Oct 24 2008 Oh, and one of my favorite tricks in Windows is to install cygwin

Bill Baxter (6/60) Oct 24 2008 But that has the same problem. Cygtools don't understand windows

Benji Smith (20/75) Oct 24 2008 Wha???

Bill Baxter (12/104) Oct 24 2008 Oh, I didn't realize that. There is one thing that doesn't work,

Benji Smith (3/19) Oct 24 2008 Glad I could be of service!

Steven Schveighoffer (14/26) Oct 24 2008 It's not the paths with wildcards that is the problem. In this case, it...

Bill Baxter (8/24) Oct 24 2008 Read again. Particularly this part:

Steven Schveighoffer (9/35) Oct 24 2008 Then that must be something grep is doing extra. Or perhaps the Windows...

Bill Baxter (17/57) Oct 24 2008 Yep, that was what I said.

Benji Smith (18/37) Oct 25 2008 Interesting.

Bill Baxter (8/19) Oct 24 2008 No, that's how it works with the Bash shell and most Unix shells, but
Steven Schveighoffer (17/87) Oct 24 2008 No, grep accepts either input. The shell does not change paths to windo...

Bill Baxter (6/95) Oct 24 2008 Yeh, I love the bash shell. Really the only thing keeping me from

Steven Schveighoffer (9/122) Oct 24 2008 It's ugly, but can be aliased or scripted, look into cygpath:

Benji Smith (4/6) Oct 25 2008 Definitely!

Bill Baxter (10/43) Oct 24 2008 Ok what about grep and sort and uniq then? Can notepad do that?

Benji Smith (5/14) Oct 24 2008 That's weird. My machine (WinXp Sp3) has no problem printing UTF-8 to

Bill Baxter (4/19) Oct 24 2008 Ok. Thanks for the info. Knowing that it has actually worked for at

Benji Smith (10/28) Oct 24 2008 Write a tiny little D program and see what you get on the console:

Bill Baxter (10/46) Oct 24 2008 Ah, I see. I guess more what I want to know is if I had utf-8 source

Yigal Chripun (9/54) Oct 24 2008 Msys does autocomplete. it's not perfect but it works. the path will

Bill Baxter (10/64) Oct 24 2008 Right that's what Cygwin does too, and it's useless if I want to call

Robert Fraser (9/79) Oct 25 2008 PowerShell is MS's concession that there are things better done in a

Sergey Gromov (8/51) Oct 27 2008 They all work for me: type, cat, less. The file is UTF-8 with BOM.

Steven Schveighoffer (17/28) Oct 24 2008 Any text-based program uses the same Windows console (unless it's a GUI

Yigal Chripun (9/44) Oct 25 2008 windows console AKA DOS Box *is* in fact legacy technology. It is

Bill Baxter (8/45) Oct 25 2008 After downloading it and giving it a try, I find this claim somewhat

Steven Schveighoffer (11/65) Oct 25 2008 I've never used powershell, but most likely you are correct. I think th...

ore-sama (2/5) Oct 26 2008 One important feature of legacy technology is it must not change for com...

Robert Fraser (7/18) Oct 25 2008 It uses the same console application to do the displaying/execution.

Bill Baxter (6/26) Oct 25 2008 I'm using "Console2" as my facade on the console window.
KennyTM~ (2/22) Oct 25 2008 Hey, they do have fixed MSPaint and WordPad! :)
torhu (3/5) Oct 26 2008 That works fine for me if I enable Quick edit mode in the options. Then...

Bill Baxter (4/10) Oct 26 2008 Except it only does block-oriented rectangular selection, which is odd

torhu (2/13) Oct 26 2008 Yeah, that's true. Pretty stupid.

Robert Fraser (5/21) Oct 26 2008 My main problem is that you can't do it just with the keyboard, which is...

Bill Baxter (5/28) Oct 26 2008 By the way I tried running powershell as a tab inside the Console2

Yigal Chripun (5/51) Oct 27 2008 I've just checked (it's been a long time since I used it) and you're

Andrei Alexandrescu (7/50) Oct 25 2008 Windows has gotten a lot better in the recent times - ever since it

ore-sama (2/12) Oct 25 2008 gui of course. MSYS's console is gui in fact.

ore-sama (2/10) Oct 25 2008 It's not windows, it's program's standard startup module gets command li...
ore-sama (2/5) Oct 25 2008 if application prints garbage, this indicates that it's implemented inco...
Kevin Bealer (10/16) Oct 25 2008 I think this is a bad idea -- there are a lot of places that don't use U...
Alix Pexton (22/29) Oct 26 2008 I've been following this thread without really having an opinion to

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Please vote up before the haters take it down, and discuss:

http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/


Andrei

Oct 22 2008

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Correx:

http://www.reddit.com/r/programming/comments/78rmc/allowing_unicode_operators_in_d_similarly_to/

Andrei

Andrei Alexandrescu wrote:
 Please vote up before the haters take it down, and discuss:
 
 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
 
 
 
 Andrei

Oct 22 2008

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

"Andrei Alexandrescu"  wrote
 Correx:

 http://www.reddit.com/r/programming/comments/78rmc/allowing_unicode_operators_in_d_similarly_to/

 Andrei

No thanks.  Please let's only use operators that are on the keys of my 
keyboard. I don't fancy having to type key digraphs or trigraphs to try and 
write code.

I understand that others already have this problem, but I don't.  This would 
be a huge detractor from D for me.  I'd definitely support a language fork 
at that point, or at least refuse to deal with any code that has unicode 
operators.  I think you'd find others feel the same way.

Why can't the emacs module solution work that was used for the chevrons? 
That is, when emacs sees:

x opCross(y);

display it as

x x y

(of course, assume the middle x is the cross symbol, I have no idea how to 
type it).

And upon save, regenerate the correct code.

I see no issue with something like that.  This is all the compiler is doing 
anyways...

Note that any operators for unicode would be user-defined anyways, the 
standard operator symbols already cover what actually gets generated to 
machine code.  That is, unicode operator X is invariably going to map to 
opX, so there is no benefit to the compiler performing this step instead of 
an editor.

-Steve
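
A minimal sketch of the display-layer idea, written in Python rather than as an editor macro. It assumes call sites have the shape a.opCross(b) with plain identifiers as operands; the dotted form, the regular expressions, and the names to_display and to_source are all assumptions made for illustration, not something an existing editor module is known to do.

```python
import re

# Hypothetical patterns: a call site "a.opCross(b)" and its displayed form "a × b".
CALL_SITE = re.compile(r"\b(\w+)\.opCross\((\w+)\)")
DISPLAYED = re.compile(r"\b(\w+) × (\w+)\b")


def to_display(source: str) -> str:
    """What the editor would show: the call rewritten with the cross symbol."""
    return CALL_SITE.sub(r"\1 × \2", source)


def to_source(shown: str) -> str:
    """What the editor would save: the ASCII-only call restored."""
    return DISPLAYED.sub(r"\1.opCross(\2)", shown)


line = "n = a.opCross(b);"
shown = to_display(line)
# Saving the displayed text must give back exactly what was on disk.
assert to_source(shown) == line
```

Because the pattern requires a receiver and a dot before opCross, a declaration such as "Vec opCross(Vec rhs)" is left untouched. Nested or parenthesised operands are beyond what regular expressions can handle, so a real module would probably need an actual parser.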

Oct 22 2008

"Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:

On Wed, Oct 22, 2008 at 9:36 PM, Steven Schveighoffer
<schveiguy yahoo.com> wrote:
 Why can't the emacs module solution work that was used for the chevrons?

Beeeecause not everyone uses emacs?

Oct 22 2008

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

"Jarrett Billingsley" wrote
 On Wed, Oct 22, 2008 at 9:36 PM, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:
 Why can't the emacs module solution work that was used for the chevrons?

 Beeeecause not everyone uses emacs?

Including myself ;)

But I really meant the same *type* of solution.  If you use another editor, 
especially if it is used for coding, it probably has a macro feature that 
you can use for doing this.

-Steve

Oct 22 2008

"Bill Baxter" <wbaxter gmail.com> writes:

On Thu, Oct 23, 2008 at 10:36 AM, Steven Schveighoffer
<schveiguy yahoo.com> wrote:
 No thanks.  Please let's only use operators that are on the keys of my
 keyboard. I don't fancy having to type key digraphs or trigraphs to try and
 write code.
 [...]
 Why can't the emacs module solution work that was used for the chevrons?

Actually, the solutions aren't that far apart.  Andrei's solution
displays XXX as YYY.  With the actual Unicode version you'd still type
XXX; it would just actually be replaced by YYY instead of merely being
displayed as YYY.

The nice thing about getting such AutoCorrect replacements working
well across a wide range of editors is that it has benefits beyond
just typing unicode characters.  You can have it insert code snippets
when you type [[main]] for example, or some people have said that some
of the existing characters are hard to type on their non-US keyboards.
 You could define replacements for those.

I'm certainly not saying going Unicode is the right thing to do right
now.  More like trying to explore what has to change (if anything)
before it really becomes viable to introduce Unicode.  The topic seems
to keep coming up in a lot of places, so I think eventually it is
inevitable that we will see more and more languages start using it.

---bb

Oct 22 2008

"Bill Baxter" <wbaxter gmail.com> writes:

On Thu, Oct 23, 2008 at 10:45 AM, Jarrett Billingsley
<jarrett.billingsley gmail.com> wrote:
 On Wed, Oct 22, 2008 at 9:36 PM, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:
 Why can't the emacs module solution work that was used for the chevrons?

 Beeeecause not everyone uses emacs?

In fact, I think there are only like three of us using emacs.  :-)  So
it's not a very general solution.

But I think the point is that you should be able to implement
something similar in many editors.
Although I think the trick of showing one thing but saving another is
more tricky for most editors than just replacing the strings outright
a la AutoCorrect.

--bb

Oct 22 2008

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

"davidl" wrote
 On Thu, 23 Oct 2008 09:36:29 +0800, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:

 "Andrei Alexandrescu" wrote
 Correx:

 http://www.reddit.com/r/programming/comments/78rmc/allowing_unicode_operators_in_d_similarly_to/

 Andrei

 No thanks. Please let's only use operators that are on the keys of my
 keyboard. I don't fancy having to type key digraphs or trigraphs to try
 and write code.

 I understand that others already have this problem, but I don't. This
 would be a huge detractor from D for me. I'd definitely support a language
 fork at that point, or at least refuse to deal with any code that has
 unicode operators. I think you'd find others feel the same way.

 Why can't the emacs module solution work that was used for the chevrons?
 That is, when emacs sees:

 x opCross(y);

 display it as

 x x y

 (of course, assume the middle x is the cross symbol, I have no idea how
 to type it).

 And upon save, regenerate the correct code.

 I see no issue with something like that. This is all the compiler is
 doing anyways...

 Everything you worry about is just poor editor. Why do you think an editor
 can affect the language?

All that is being proposed right now is syntax sugar. Cross product, dot
product, union, etc. All of these will map to a function, so there is no
reason to require compiler support (that is, they don't translate directly
to assembly/machine code). I'm proposing the editor be used to do the sugar
instead of the compiler.

Right now Unicode is not universally accepted by all editors, ASCII is.
Right now, I don't have cross product symbol on my keyboard, all currently
supported symbols I do have. Why should my experience with D be severely
affected by your desire for syntax sugar?

 And it complicates the language, if it's not converted beforehand by the
 programmer. It may also set up future restrictions on extending the
 language in the correct direction!

Today, I can call opX functions instead of using the appropriate operator.
This is no different.

 In your case, x opCross(y): why is identifier opCross(identifier)
 considered as identifier x identifier?
 Should a typical operator overload function declaration be considered
 that way too?

 x opCross(y)
 {
 }

 x x y
 {
 }

 or even

 x opCross(y, m){}

 --->

 x x y, m {}

 also consider a template declaration

 Matrix opCross(T)(T a)
 {
 }

 should it be considered as Matrix x T (T a)?

 If not, how do you distinguish in all those circumstances (and not all
 possible "shouldn't be" situations are listed here)?

The editor module would have to be (and can be) smarter than that.

-Steve

Oct 23 2008

Sergey Gromov <snake.scaly gmail.com> writes:

Thu, 23 Oct 2008 18:21:18 +0800,
davidl wrote:
 Everything you worry about is just poor editor. Why do you think an
 editor can affect the language?

I think an editor is not the only thing that displays your program's 
source.  I think that compiler's error message should be readable over a 
TTY terminal.  Otherwise you're limited to working with fancy graphical 
shells.

Oct 23 2008

KennyTM~ <kennytm gmail.com> writes:

Sergey Gromov wrote:
 Thu, 23 Oct 2008 18:21:18 +0800,
 davidl wrote:
 Everything you worry about is just poor editor. Why do you think an
 editor can affect the language?

 
 I think an editor is not the only thing that displays your program's 
 source.  I think that compiler's error message should be readable over a 
 TTY terminal.  Otherwise you're limited to working with fancy graphical 
 shells.

I agree.

My real world experience: Sometimes I need to code over ssh. The server 
admin only installed vim (which I don't use) and nano, no emacs.

Probably there could be a vim module too (is it possible?), but those 
are just palliatives.

Oct 23 2008

Paul D. Anderson <paul.d.removethis.anderson comcast.andthis.net> writes:

Andrei Alexandrescu Wrote:

 Correx:
 
 http://www.reddit.com/r/programming/comments/78rmc/allowing_unicode_operators_in_d_similarly_to/
 
 Andrei
 
 Andrei Alexandrescu wrote:
 Please vote up before the haters take it down, and discuss:
 
 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
 
 
 
 Andrei


Java allows unicode variable names. The Greek letter 'pi' is a valid variable
name in Java (see www.jscience.org for an example). Having said that, I've had
Java IDEs choke on these.

An opportunity may exist here for someone to create/modify a D language IDE
that supports same. [Although Descent (being Eclipse-based and therefore
Java-based) should have a leg up already.]

I know projects exist that intend to be 'the' D IDE (written in D, for D,
etc.). Maybe this could be a discriminator that makes one stand out.

Paul

Oct 22 2008

Spacen Jasset <spacenjasset yahoo.co.uk> writes:

Andrei Alexandrescu wrote:
 Correx:
 
 http://www.reddit.com/r/programming/comments/78rmc/allowing_unicode_operators_in_d_similarly_to/
 
 
 Andrei
 
 Andrei Alexandrescu wrote:
 Please vote up before the haters take it down, and discuss:

 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/



 Andrei


I haven't really ever felt the need for such things. It would require 
editor support, and I think that it could hinder readability, as one would 
have to know that symbol 'x' is, say, cross product. It isn't always; 
it depends on the mathematical domain.

There are, I believe, far more pressing matters, and this feature would 
make editor support a bit more difficult, and we are currently in the 
days where there isn't enough editor and/or IDE support for D. I would 
personally prefer it not be added to the language in the near future. 
This is of course only my preference, which in honesty may be biased but 
isn't entirely for selfish reasons.

Oct 23 2008

"Bill Baxter" <wbaxter gmail.com> writes:

On Fri, Oct 24, 2008 at 3:42 AM, Spacen Jasset <spacenjasset yahoo.co.uk> wrote:
 I haven't really ever felt the need for such things. It would require editor
 support, and I think that it could hinder readability, as one would have to
 know that symbol 'x' is, say, cross product. It isn't always; it depends on
 the mathematical domain.

 There are, I believe, far more pressing matters, and this feature would make
 editor support a bit more difficult, and we are currently in the days where
 there isn't enough editor and/or IDE support for D. I would personally
 prefer it not be added to the language in the near future. This is of course
 only my preference, which in honesty may be biased but isn't entirely for
 selfish reasons.

I think that's the conclusion I'm coming to as well.  While the use
of Unicode would have some advantages, there are various technical
issues with it (like I haven't been able to figure out how to get the
DOS console in Windows to display UTF-8).  I think those issues can
all be solved, but it would be a large distraction for the D
community.  Better to let some big, well-funded, massively popular
language pioneer in this area.  If some language with a billion
programmers decided to use Unicode, then you can bet that most of
these infrastructure problems would start to disappear quickly as
annoyed programmers start scratching their own itches and as they
start complaining to the people who write the tools they use.

Realistically, if I complain to any software vendor now that their
editor doesn't work well with D because they don't have funky Unicode
functionality, the response is likely to be "Sounds like a problem
with D, whatever that is".  If the language were Java or C++, though,
they would have little choice but to take the complaint seriously,
regardless of the effort required.

--bb

Oct 23 2008

Walter Bright <newshound1 digitalmars.com> writes:

Bill Baxter wrote:
 I think that's the conclusion I'm coming to as well.  While the use
 of Unicode would have some advantages, there are various technical
 issues with it (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8).  I think those issues can
 all be solved, but it would be a large distraction for the D
 community.  Better to let some big, well-funded, massively popular
 language pioneer in this area.  If some language with a billion
 programmers decided to use Unicode, then you can bet that most of
 these infrastructure problems would start to disappear quickly as
 annoyed programmers start scratching their own itches and as they
 start complaining to the people who write the tools they use.
 
 Realistically, if I complain to any software vendor now that their
 editor doesn't work well with D because they don't have funky Unicode
 functionality, the response is likely to be "Sounds like a problem
 with D, whatever that is".  If the language were Java or C++, though,
 they would have little choice but to take the complaint seriously,
 regardless of the effort required.

Unfortunately, you might be right in that D is not currently in a 
position to force the issue.

Oct 23 2008

"Nick Sabalausky" <a a.a> writes:

"Walter Bright" <newshound1 digitalmars.com> wrote in message 
news:gdr4pe$2uje$1 digitalmars.com...
 Bill Baxter wrote:
 I think that's the conclusion I'm coming to as well.  While the use
 of Unicode would have some advantages, there are various technical
 issues with it (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8).  I think those issues can
 all be solved, but it would be a large distraction for the D
 community.  Better to let some big, well-funded, massively popular
 language pioneer in this area.  If some language with a billion
 programmers decided to use Unicode, then you can bet that most of
 these infrastructure problems would start to disappear quickly as
 annoyed programmers start scratching their own itches and as they
 start complaining to the people who write the tools they use.

 Realistically, if I complain to any software vendor now that their
 editor doesn't work well with D because they don't have funky Unicode
 functionality, the response is likely to be "Sounds like a problem
 with D, whatever that is".  If the language were Java or C++, though,
 they would have little choice but to take the complaint seriously,
 regardless of the effort required.

 Unfortunately, you might be right in that D is not currently in a position 
 to force the issue.

My various thoughts:

Whatever language does end up forcing the issue is going to come up against 
(inertial) resistance, either successfully or unsuccessfully. If D, right 
now, were to be the language to attempt to force the issue, then like you 
two have said, it would probably be unsuccessful. So, in order for the 
unicode transition to ever be successful, it would have to be some other 
language (or a version of D later down the road) that forces the issue.

However, if D and/or other similarly less-than-mainstream (I hate referring 
to D that way, BTW) languages already had useful unicode support in a way 
that *wasn't* trying to force the issue (ie, purely optional, with perfectly 
acceptable ASCII fallbacks) when that "force the issue" language does come 
along, then that can help cut down on the resistance that the "force the 
issue" language encounters. We might not be able to crack the 
chicken-and-the-egg, but we could help weaken it by providing a little extra 
incentive of our own (again, as long as it was in a way that wasn't 
forceful).

I do agree, though, with the people who have said that D has more important 
things to focus on right now than unicode. And I would add that I see most 
of D's biggest strengths as things where it cleans up and fixes the mistakes 
made by the more pioneering languages like C++ or Java. So I think it would 
be in true D style (in a good way) to wait for something else, like maybe 
Fortress, to go muck around in unicode, and then we can design our unicode 
to clean up the mistakes those languages will inevitably end up making 
(instead of leading our own language into a corner by making those "pioneer" 
mistakes ourselves). Plus, hopefully by that time we'll have finally taken 
care of the more pressing issues that we're currently facing. (Like 
eliminating forward reference issues!! Please!!)

I hope that all made sense. I guess my summary is: Hold off on official 
unicode stuff for now and learn from others' unicode mistakes. But, if we do 
put official unicode stuff in right now, keep it in a way that doesn't force 
the issue. And as for unofficial unicode stuff, I say go ahead, play around 
with it, post it, do whatever.

Oct 23 2008

Don <nospam nospam.com.au> writes:

Andrei Alexandrescu wrote:
 Correx:
 
 http://www.reddit.com/r/programming/comments/78rmc/allowing_unicode_operators_in_d_similarly_to/
 
 
 Andrei
 
 Andrei Alexandrescu wrote:
 Please vote up before the haters take it down, and discuss:

 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/



 Andrei


Entering this debate late:

I think that operator overloading itself is syntactic sugar, and 
primarily exists for numerical programmers. So it's not so unreasonable 
to support operator overloading that is not hugely intelligible to 
non-mathematicians.
"Funny" operators should never be seen by anyone without a mathematical 
background. However, I'm not so sure how common they'd actually be.

The strongest use case seems to me to be the situation where multiple 
related operations exist, but only one operator is available.
The classic example is vector products, where we have:
- vector dot vector
- vector cross vector
- Elementwise product of two vectors.
But we only have one opMul. So it would be useful to have alternate 
multiplication signs available.
Adding × (opCross) as a multiplication which is non-associative would, I 
think, be quite generally useful.
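
To make the one-opMul problem concrete, here is a small sketch in Python (used only because it is easy to run; the Vec3 class and its method names are hypothetical). The single overloadable * is spent on the elementwise product, so dot and cross have to be named methods, and the last lines show why cross is non-associative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vec3:
    """Hypothetical 3D vector with one overloaded '*' and two named products."""
    x: float
    y: float
    z: float

    def __mul__(self, other: "Vec3") -> "Vec3":
        # The only multiplication operator goes to the elementwise product.
        return Vec3(self.x * other.x, self.y * other.y, self.z * other.z)

    def dot(self, other: "Vec3") -> float:
        return self.x * other.x + self.y * other.y + self.z * other.z

    def cross(self, other: "Vec3") -> "Vec3":
        return Vec3(
            self.y * other.z - self.z * other.y,
            self.z * other.x - self.x * other.z,
            self.x * other.y - self.y * other.x,
        )


i, j = Vec3(1, 0, 0), Vec3(0, 1, 0)

# Grouping matters for the cross product: (i x i) x j differs from i x (i x j).
left_grouped = i.cross(i).cross(j)
right_grouped = i.cross(i.cross(j))
assert left_grouped != right_grouped
```

With a dedicated cross operator the last two expressions would read much like the textbook formulas; with named methods the grouping is at least explicit.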

But, I think there aren't actually very many other operators which are 
easy to justify on mathematical grounds. Largely because most unary 
operations look quite OK when implemented as functions, and 
mathematicians don't have a huge number of binary operators.
Other than dot product, cross product, and convolution, there's the 
exclusive or symbol (+ with a circle around it), and everything else is 
pretty obscure.

Apart from the dot and cross product, the inability to have superscripts 
and subscripts in variable names (and comments!) is a much bigger issue, 
in my experience.
Oh. And the lack of an exponentiation operator. I miss the old Commodore 
64 up-arrow for power <g>

If you could completely ignore keyboard and display issues, and use any 
unicode character as an operator, which ones would you actually use?

Oct 28 2008

Sergey Gromov <snake.scaly gmail.com> writes:

Don wrote:
 If you could completely ignore keyboard and display issues, and use any
 unicode character as an operator, which ones would you actually use?

I'd use dot "⋅" and cross "×" products for 3D, union "∪" and
intersection "∩", subset "⊂" and superset "⊃" and their negative forms.
 I don't think I'd use anything else.

Well, comparisons look better when converted into appropriate unicode.

Oct 28 2008

bearophile <bearophileHUGS lycos.com> writes:

Sergey Gromov:
 I'd use dot "⋅" and cross "×" products for 3D, union "∪" and
 intersection "∩", subset "⊂" and superset "⊃" and their negative forms.
  I don't think I'd use anything else.

I just want to note that the whole thread is almost unreadable on the
digitalmars.com/webnews/, because it doesn't digest unicode chars at all. So
adding unicode to D will give problems to show code.

Unrelated to the unicode, but related to those opSubset, opSuperset, etc:
while implementing a set() class with the same API of the Python sets, I have
seen there are the following operators/methods too:

issubset(other) 
set <= other 
Test whether every element in the set is in other.

set < other 
Test whether the set is a true subset of other, that is, set <= other and set
!= other.

issuperset(other) 
set >= other 
Test whether every element in other is in the set.

set > other 
Test whether the set is a true superset of other, that is, set >= other and set
!= other.

A full opCmp can't be defined on sets, so I think in D1 we can't overload <= >=
among sets... I think this is a problem that has to be solved in D2, because sets
are important enough.
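
For reference, this is how the comparisons described above behave on
Python's built-in sets. The relation() helper is a hypothetical name added
for this sketch; the fourth outcome, "incomparable", is the case a
three-way opCmp has no value for:

```python
def relation(s: set, t: set) -> str:
    """Classify how s relates to t under set inclusion."""
    if s == t:
        return "equal"
    if s < t:
        return "proper subset"
    if s > t:
        return "proper superset"
    return "incomparable"


a, b, c = {1, 2}, {1, 2, 3}, {3, 4}

print(a <= b, a.issubset(b))    # True True
print(a < b, a < a)             # True False
print(b >= a, b.issuperset(a))  # True True
print(a <= c, a >= c)           # False False
print(relation(a, c))           # incomparable
```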

Bye,
bearophile

Oct 28 2008

KennyTM~ <kennytm gmail.com> writes:

bearophile wrote:
 Sergey Gromov:
 I'd use dot "⋅" and cross "×" products for 3D, union "∪" and
 intersection "∩", subset "⊂" and superset "⊃" and their negative forms.
  I don't think I'd use anything else.

 
 I just want to note that the whole thread is almost unreadable on the
digitalmars.com/webnews/, because it doesn't digest unicode chars at all. So
adding unicode to D will give problems to show code.
 
 Unrelated to the unicode, but related to those opSubset, opSuperset, etc:
 while implementing a set() class with the same API of the Python sets, I have
seen there are the following operators/methods too:
 
 issubset(other) 
 set <= other 
 Test whether every element in the set is in other.
 
 set < other 
 Test whether the set is a true subset of other, that is, set <= other and set
!= other.
 
 issuperset(other) 
 set >= other 
 Test whether every element in other is in the set.
 
 set > other 
 Test whether the set is a true superset of other, that is, set >= other and
set != other.
 
 A full opCmp can't be defined on sets, so I think in D1 we can't overload <=
>= among sets... I think this is a problem that has to be solved in D2, because sets
are important enough.
 
 Bye,
 bearophile

If the two sets are incomparable, just return NaN... We need an opCmp 
that returns a float :)
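
A rough sketch of why the NaN idea would work, written in Python for
illustration. It assumes the usual D rewrite of `a < b` into
`a.opCmp(b) < 0` (and likewise for the other relational operators), and it
relies on the IEEE rule that every ordered comparison against NaN is
false. The function names are hypothetical:

```python
import math


def set_cmp(s: set, t: set) -> float:
    """opCmp-style result: negative, zero, positive, or NaN if unordered."""
    if s == t:
        return 0.0
    if s < t:        # proper subset
        return -1.0
    if s > t:        # proper superset
        return 1.0
    return math.nan  # neither set contains the other


def is_subset(s: set, t: set) -> bool:
    # Models  s <= t  rewritten as  opCmp(s, t) <= 0
    return set_cmp(s, t) <= 0


def is_not_subset(s: set, t: set) -> bool:
    # Models  s !<= t  ("not a subset of")
    return not (set_cmp(s, t) <= 0)


a, b, c = {1, 2}, {1, 2, 3}, {3, 4}
print(is_subset(a, b))                   # True
print(is_subset(a, c), is_subset(c, a))  # False False
print(is_not_subset(a, c))               # True
```

With a NaN result, `<`, `<=`, `>` and `>=` all come out false for
incomparable sets, and only the negated forms are true.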

Oct 28 2008

KennyTM~ <kennytm gmail.com> writes:

KennyTM~ wrote:
 bearophile wrote:
 Sergey Gromov:
 I'd use dot "⋅" and cross "×" products for 3D, union "∪" and
 intersection "∩", subset "⊂" and superset "⊃" and their 
 negative forms.
  I don't think I'd use anything else.

 I just want to note that the whole thread is almost unreadable on the 
 digitalmars.com/webnews/, because it doesn't digest unicode chars at 
 all. So adding unicode to D will give problems to show code.

 Unrelated to the unicode, but related to those opSubset, opSuperset, etc:
 while implementing a set() class with the same API of the Python sets, 
 I have seen there are the following operators/methods too:

 issubset(other) set <= other Test whether every element in the set is 
 in other.

 set < other Test whether the set is a true subset of other, that is, 
 set <= other and set != other.

 issuperset(other) set >= other Test whether every element in other is 
 in the set.

 set > other Test whether the set is a true superset of other, that is, 
 set >= other and set != other.

 A full opCmp can't be defined on sets, so I think in D1 we can't 
 overload <= >= among sets... I think this is a problem that has to be 
 solved in D2, because sets are important enough.

 Bye,
 bearophile

 
 If the two sets are incomparable, just return NaN... We need an opCmp 
 that returns a float :)

Actually I've made a working solution. Even the exotic operators like 
!<= (not a subset of, ⊈) work too. It's designed for demonstration, not 
performance, though.

Oct 28 2008

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Sergey Gromov wrote:
 Don wrote:
 If you could completely ignore keyboard and display issues, and use any
 unicode character as an operator, which ones would you actually use?

 
 I'd use dot "⋅" and cross "×" products for 3D, union "∪" and
 intersection "∩", subset "⊂" and superset "⊃" and their negative forms.
  I don't think I'd use anything else.
 
 Well, comparisons look better when converted into appropriate unicode.

In my opinion, a workable feature is this:

* Functions can be defined with a leading backspace. They will be usable 
with the infix notation.

* There is a way of specifying that precedence of a function defined as 
above is the same as precedence of a built-in operator.

* Functions of which name is the same as an HTML entity name for a 
symbol can be replaced with the actual symbol.


Andrei

Oct 28 2008

"Bill Baxter" <wbaxter gmail.com> writes:

On Wed, Oct 29, 2008 at 4:12 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 Sergey Gromov wrote:
 Don wrote:
 If you could completely ignore keyboard and display issues, and use any
 unicode character as an operator, which ones would you actually use?

 I'd use dot "⋅" and cross "×" products for 3D, union "∪" and
 intersection "∩", subset "⊂" and superset "⊃" and their negative forms.
  I don't think I'd use anything else.

 Well, comparisons look better when converted into appropriate unicode.

 In my opinion, a workable feature is this:

 * Functions can be defined with a leading backspace. They will be usable
 with the infix notation.

Did you mean backslash?  I hope you're not suggesting we write
^HinfixOperator. :-)

 * There is a way of specifying that precedence of a function defined as
 above is the same as precedence of a built-in operator.

Workable, but it ain't what Walter calls parsing.

 * Functions of which name is the same as an HTML entity name for a symbol
 can be replaced with the actual symbol.

--bb

Oct 28 2008

Don <nospam nospam.com.au> writes:

Andrei Alexandrescu wrote:
 Sergey Gromov wrote:
 Don wrote:
 If you could completely ignore keyboard and display issues, and use any
 unicode character as an operator, which ones would you actually use?

 I'd use dot "⋅" and cross "×" products for 3D, union "∪" and
 intersection "∩", subset "⊂" and superset "⊃" and their negative forms.
  I don't think I'd use anything else.

 Well, comparisons look better when converted into appropriate unicode.

 
 In my opinion, a workable feature is this:
 
 * Functions can be defined with a leading backspace. They will be usable 
 with the infix notation.
 
 * There is a way of specifying that precedence of a function defined as 
 above is the same as precedence of a built-in operator.

Do we really need to do that? How many Unicode binary operators are there?

This list of symbols which work in web browsers is very short.
http://en.wikipedia.org/wiki/Wikipedia:Mathematical_symbols

The interesting thing about this second list is just how short it is, 
and how many of the items in it are comparison operators.
Any of the unicode comparison operators could be given the same 
precedence as <,> and 'in'.
Cross should be given the same precedence as opMul and opDiv.
That just leaves oplus, otimes, which would probably get the same precedence as 
plus and mul.

You can do the same thing with this list:
http://en.wikipedia.org/wiki/Unicode_Mathematical_Operators
And you find that the precedence of almost everything is easy to 
determine. Seems like 90% of them are relational operators.

Specifying the precedence of each unicode operator (eg by a lookup 
table) would be adequate for any use case I can imagine, and it wouldn't 
make syntactic analysis any more ambiguous.
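
A small Python sketch of the lookup-table idea. The table contents follow
the suggestions above (relational symbols next to < and >, cross and
otimes next to mul, oplus next to plus); the numeric levels, the
tokenize() and parse() names, and the whitespace-separated input are
assumptions made for this example:

```python
# Hypothetical fixed table: each Unicode operator borrows the precedence
# level of an existing ASCII operator.
PRECEDENCE = {
    "<": 1, ">": 1, "⊂": 1, "⊃": 1,   # relational
    "+": 2, "-": 2, "⊕": 2,           # additive
    "*": 3, "/": 3, "×": 3, "⊗": 3,   # multiplicative
}


def tokenize(src: str) -> list:
    """Split on whitespace; a real lexer would not need the spaces."""
    return src.split()


def parse(tokens: list, min_prec: int = 1):
    """Precedence climbing over a token list; returns a nested tuple tree."""
    left = tokens.pop(0)  # operand: an identifier or a number
    while tokens and PRECEDENCE.get(tokens[0], 0) >= min_prec:
        op = tokens.pop(0)
        # Every operator is treated as left-associative in this sketch.
        right = parse(tokens, PRECEDENCE[op] + 1)
        left = (op, left, right)
    return left


print(parse(tokenize("a + b × c ⊂ d")))
```

The parser only ever consults the fixed table, never the types or
declarations of a, b, c or d, so the grammar stays decidable from the
token stream alone.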

 * Functions of which name is the same as an HTML entity name for a 
 symbol can be replaced with the actual symbol.

Oct 29 2008

Walter Bright <newshound1 digitalmars.com> writes:

Andrei Alexandrescu wrote:
 * There is a way of specifying that precedence of a function defined as 
 above is the same as precedence of a built-in operator.

That throws out the ability to parse without semantic analysis. It's not 
worth it.

Oct 29 2008

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Walter Bright wrote:
 Andrei Alexandrescu wrote:
 * There is a way of specifying that precedence of a function defined 
 as above is the same as precedence of a built-in operator.

 
 That throws out the ability to parse without semantic analysis. It's not 
 worth it.

It doesn't per a previous post of mine, but I agree it's still not worth it.

Andrei

Oct 29 2008

Benji Smith <dlanguage benjismith.net> writes:

Sergey Gromov wrote:
 Don wrote:
 If you could completely ignore keyboard and display issues, and use any
 unicode character as an operator, which ones would you actually use?

 
 I'd use dot "⋅" and cross "×" products for 3D, union "∪" and
 intersection "∩", subset "⊂" and superset "⊃" and their negative forms.
  I don't think I'd use anything else.
 
 Well, comparisons look better when converted into appropriate unicode.

I have pretty much the same list.

For me the really compelling case for unicode characters isn't in 
finding more operators. It's the brackets!!

--benji

Oct 28 2008

Moritz Warning <moritzwarning web.de> writes:

On Wed, 22 Oct 2008 17:27:58 -0500, Andrei Alexandrescu wrote:

 Please vote up before the haters take it down, and discuss:
 
 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
 
 
 Andrei

It would be very nice to have unicode operators.
But what opFooBar functions do users need (most)?

opDotProduct and opCrossProduct would be definitely cool.

Oct 22 2008

Moritz Warning <moritzwarning web.de> writes:

On Wed, 22 Oct 2008 23:37:43 +0000, Moritz Warning wrote:

 On Wed, 22 Oct 2008 17:27:58 -0500, Andrei Alexandrescu wrote:
 
 Please vote up before the haters take it down, and discuss:
 
 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
 
 
 Andrei

 
 It would be very nice to have unicode operators. But what opFooBar
 functions do users need (most)?
 
 opDotProduct and opCrossProduct would be definitely cool.

Sorry, posted in d.announce by accident. :/

Oct 22 2008

"Nick Sabalausky" <a a.a> writes:

"Moritz Warning" <moritzwarning web.de> wrote in message 
news:gdodg7$1f5o$1 digitalmars.com...
 On Wed, 22 Oct 2008 17:27:58 -0500, Andrei Alexandrescu wrote:

 Please vote up before the haters take it down, and discuss:

 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
 Andrei

 It would be very nice to have unicode operators.
 But what opFooBar functions do users need (most)?

 opDotProduct and opCrossProduct would be definitely cool.

I'd certainly like opIntersection and maybe opUnion.

Oct 22 2008

"Bill Baxter" <wbaxter gmail.com> writes:

On Thu, Oct 23, 2008 at 7:27 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
Please vote up before the haters take it down, and discuss:

http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/

(My comment cross posted here from reddit)

I think the right way to do it is not to make everything Unicode. All
the pressure on the existing symbols would be dramatically relieved by
the addition of just a handful of new symbols.

The truth is keyboards aren't very good for inputting Unicode. That
isn't likely to change. Yes they've dealt with the problem in Asian
languages by using IMEs but in my opinion IMEs are horrible to use.

Some people seem to argue it's a waste to go to Unicode only for a few
symbols. If you're going to go Unicode, you should go whole hog. I'd
argue the exact opposite. If you're going to go Unicode, it should be
done in moderation. Use as little Unicode as necessary and no more.

As for how to input unicode -- Microsoft Word solved that problem ages
ago, assuming we're talking about small numbers of special characters.
It's called AutoCorrect. You just register your unicode symbol as a
misspelling for "(X)" or something unique like that and then every
time you type "(X)" a funky unicode character instantly replaces those
chars.

Yeh, not many editors support such a feature. But it's very easy to
implement. And with that one generic mechanism, your editor is ready
to support input of Unicode chars in any language just by adding the
right definitions.
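
As a rough illustration of how little machinery this needs, here is a
Python sketch of the substitution step. Only the "(X)" trigger comes from
the text above; the other triggers, the AUTOCORRECT table and the
autocorrect() name are made up for the example, and a real editor would
apply the replacement as you type rather than on a whole string:

```python
# Hypothetical trigger table mapping typed ASCII sequences to symbols.
AUTOCORRECT = {
    "(X)": "×",
    "(.)": "⋅",
    "(U)": "∪",
    "(^)": "∩",
}


def autocorrect(text: str) -> str:
    """Replace every registered trigger sequence with its Unicode symbol."""
    for trigger, symbol in AUTOCORRECT.items():
        text = text.replace(trigger, symbol)
    return text


print(autocorrect("m3 = m1 (X) m2"))  # m3 = m1 × m2
print(autocorrect("s = a (U) b (^) c"))  # s = a ∪ b ∩ c
```

The triggers have to be sequences that never occur in ordinary code,
otherwise something like a call f(X) would be rewritten too.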

--bb

Oct 22 2008

Jesse Phillips <jessekphillips gmail.com> writes:

On Thu, 23 Oct 2008 09:52:34 +0900, Bill Baxter wrote:

 On Thu, Oct 23, 2008 at 7:27 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Please vote up before the haters take it down, and discuss:

 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/

 (My comment cross posted here from reddit)
 
 I think the right way to do it is not to make everything Unicode. All
 the pressure on the existing symbols would be dramatically relieved by
 the addition of just a handful of new symbols.
 
 The truth is keyboards aren't very good for inputting Unicode. That
 isn't likely to change. Yes they've dealt with the problem in Asian
 languages by using IMEs but in my opinion IMEs are horrible to use.
 
 Some people seem to argue it's a waste to go to Unicode only for a few
 symbols. If you're going to go Unicode, you should go whole hog. I'd
 argue the exact opposite. If you're going to go Unicode, it should be
 done in moderation. Use as little Unicode as necessary and no more.
 
 As for how to input unicode -- Microsoft Word solved that problem ages
 ago, assuming we're talking about small numbers of special characters.
 It's called AutoCorrect. You just register your unicode symbol as a
 misspelling for "(X)" or something unique like that and then every time
 you type "(X)" a funky unicode character instantly replaces those chars.
 
 Yeh, not many editors support such a feature. But it's very easy to
 implement. And with that one generic mechanism, your editor is ready to
 support input of Unicode chars in any language just by adding the right
 definitions.
 
 --bb

I don't find this terribly appealing. Walter mentions having thrown out 
support for 16bit processors and such. Why not throw out 32bit too? 
Those are going out of style.

The point is, it's not the language's job to force change of hardware. And 
support via a text editor is also not acceptable. Going the software 
support route relies on the OS to support a universal easy method to 
enter unicode.

As for D's case, I say support unicode for these new operators, but 
provide the same function with keyboard provided symbols.

Oct 22 2008

Don <nospam nospam.com.au> writes:

Bill Baxter wrote:
On Thu, Oct 23, 2008 at 7:27 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
Please vote up before the haters take it down, and discuss:

http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/

(My comment cross posted here from reddit)

I think the right way to do it is not to make everything Unicode. All
the pressure on the existing symbols would be dramatically relieved by
the addition of just a handful of new symbols.

I agree.
There is in fact a fairly defensible subset of Unicode: those characters
which are easy to type on some keyboard. This would include chevrons,
currency symbols (especially pound, euro, yen); European accented
characters (not terribly useful) and a couple of other punctuation
marks. After all, if it's painful to type a Euro symbol on your
keyboard, you're heading for oblivion.

The list is pretty much equivalent to the US-International keyboard
layout in Windows. There aren't many useful characters in there, but it
might be enough.

« » � � ¶ § ¬ � � � � � ¤ � © ®

The chevrons and the inverted ? and ! are perhaps the most interesting,
since they are paired. The multiply sign isn't bad, though.
With the German keyboards I have to use, some of these are less painful
to type than {}.

Oct 23 2008

Sergey Gromov <snake.scaly gmail.com> writes:

Thu, 23 Oct 2008 09:36:39 +0200,
Don wrote:
 « » ? ? ¶ § ¬ ? ? ? ? ? ¤ ? © ®

Lots of question marks here.  This sucks.

Oct 23 2008

Spacen Jasset <spacenjasset yahoo.co.uk> writes:

Bill Baxter wrote:
On Thu, Oct 23, 2008 at 7:27 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
Please vote up before the haters take it down, and discuss:

http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/

(My comment cross posted here from reddit)

I think the right way to do it is not to make everything Unicode. All
the pressure on the existing symbols would be dramatically relieved by
the addition of just a handful of new symbols.

--bb

I am not entirely sure that 30 or (x amount) of new operators would be a
good thing anyway. How hard is it to say m3 = m1.crossProduct(m2) ? vs
m3 = m1 X m2 ? and how often will that happen? It's also going to make
the language more difficult to learn and understand.

If a set membership test operator and a few others are introduced, then
really to be "complete" all the set operators must be added, and
implemented.

Furthermore, the introduction of set operators should really mean that
you can use them on something by default, that means implementing sets
that presumably are usable, quick, and are worth using, otherwise people
will roll their own (all the time) in many different ways.

Unicode symbol 'x' may look better, but is it really more readable? I
think it is -- a bit, and it may be cool, but I don't think it's one of
the things that is going to make developing software significantly easier.

Why unicode anyway? In the same way that editor support is required to
actually type them in, why not let the editor render them. So instead of
symbol 'x' in the source code, say:

m3 = m1 cross_product m2

as an infix notation in a similar way to the (unary) sizeof operator.

While cross_product is a bit long and unwieldy any editor capable can
replace the rendition of that keyword with a symbol for it. But in
editors that don't it means that it still can be typed in and/or
displayed easily.

Another option includes providing cross_product as an 'alias' and 'X'
as well.

Which then leads on to the introduction of a facility to add arbitrary
operators, which could be interesting because you can supply any
operator you see fit for the domains that you use that require it. --
This provide exactly the right solution though as all the additions
would be 'non standard' and I can see books in the future recommending
people not use unicode operators, because editors don't have support for
them.

If D is to be used on a wide variety of platforms, which would be
desirable if it is to gain traction, then editor support barriers like
this could impede its progress.

Oct 25 2008

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Spacen Jasset wrote:
 Bill Baxter wrote:
 On Thu, Oct 23, 2008 at 7:27 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Please vote up before the haters take it down, and discuss:

 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/

 (My comment cross posted here from reddit)

 I think the right way to do it is not to make everything Unicode. All
 the pressure on the existing symbols would be dramatically relieved by
 the addition of just a handful of new symbols.

 The truth is keyboards aren't very good for inputting Unicode. That
 isn't likely to change. Yes they've dealt with the problem in Asian
 languages by using IMEs but in my opinion IMEs are horrible to use.

 Some people seem to argue it's a waste to go to Unicode only for a few
 symbols. If you're going to go Unicode, you should go whole hog. I'd
 argue the exact opposite. If you're going to go Unicode, it should be
 done in moderation. Use as little Unicode as necessary and no more.

 As for how to input unicode -- Microsoft Word solved that problem ages
 ago, assuming we're talking about small numbers of special characters.
 It's called AutoCorrect. You just register your unicode symbol as a
 misspelling for "(X)" or something unique like that and then every
 time you type "(X)" a funky unicode character instantly replaces those
 chars.

 Yeh, not many editors support such a feature. But it's very easy to
 implement. And with that one generic mechanism, your editor is ready
 to support input of Unicode chars in any language just by adding the
 right definitions.

 --bb

 I am not entirely sure that 30 or (x amount) of new operators would be a 
 good thing anyway. How hard is it to say m3 = m1.crossProduct(m2) ? vs 
 m3 = m1 X m2 ? and how often will that happen? It's also going to make 
 the language more difficult to learn and understand.

I have noticed that in pretty much all scientific code, the f(a, b) and 
a.f(b) notations fall off a readability cliff when the number of 
operators grows only to a handful. Lured by simple examples like yours, 
people don't see that as a problem until they actually have to read or 
write such code. Adding temporaries and such is not that great because 
it further takes the algorithm away from its mathematical form just for 
serving a notation that was the problem in the first place.
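
A small Python sketch of the effect, using the reflection formula
r = d - 2(d⋅n)n as a stand-in for "a handful of operators". The Vec class,
its method names, and the use of Python's `@` for the dot product are
choices made for this example only:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vec:
    """Hypothetical 3D vector offering both notations."""
    x: float
    y: float
    z: float

    # Named-method notation.
    def sub(self, o: "Vec") -> "Vec":
        return Vec(self.x - o.x, self.y - o.y, self.z - o.z)

    def scale(self, k: float) -> "Vec":
        return Vec(self.x * k, self.y * k, self.z * k)

    def dot(self, o: "Vec") -> float:
        return self.x * o.x + self.y * o.y + self.z * o.z

    # Operator notation, delegating to the methods above.
    def __sub__(self, o: "Vec") -> "Vec":
        return self.sub(o)

    def __rmul__(self, k: float) -> "Vec":
        return self.scale(k)

    def __matmul__(self, o: "Vec") -> float:
        return self.dot(o)


def reflect_methods(d: Vec, n: Vec) -> Vec:
    # r = d - 2(d.n)n written with method calls only.
    return d.sub(n.scale(2 * d.dot(n)))


def reflect_operators(d: Vec, n: Vec) -> Vec:
    # The same formula written with operators.
    return d - 2 * (d @ n) * n


d = Vec(1, -1, 0)
n = Vec(0, 1, 0)
print(reflect_methods(d, n))
print(reflect_operators(d, n))
```

Both functions compute the same vector; the difference is only in how
closely the source follows the formula, and the gap widens as formulas
get longer.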

 If a set membership test operator and a few others are introduced, then 
 really to be "complete" all the set operators must be added, and 
 implemented.
 
 Furthermore, the introduction of set operators should really mean that 
 you can use them on something by default, that means implementing sets 
 that presumably are usable, quick, and are worth using, otherwise people 
 will roll their own (all the time) in many different ways.
 
 Unicode symbol 'x' may look better, but is it really more readable? I 
 think it is -- a bit, and it may be cool, but I don't think it's one of 
 the things that is going to make developing software significantly easier.

I think "cool" has not a lot to do with it. For scientific code, it's 
closer to a necessity.


Andrei

Oct 25 2008

Spacen Jasset <spacenjasset yahoo.co.uk> writes:

Andrei Alexandrescu wrote:
 Spacen Jasset wrote:
 Bill Baxter wrote:
 On Thu, Oct 23, 2008 at 7:27 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Please vote up before the haters take it down, and discuss:

 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/

 (My comment cross posted here from reddit)

 I think the right way to do it is not to make everything Unicode. All
 the pressure on the existing symbols would be dramatically relieved by
 the addition of just a handful of new symbols.

 The truth is keyboards aren't very good for inputting Unicode. That
 isn't likely to change. Yes they've dealt with the problem in Asian
 languages by using IMEs but in my opinion IMEs are horrible to use.

 Some people seem to argue it's a waste to go to Unicode only for a few
 symbols. If you're going to go Unicode, you should go whole hog. I'd
 argue the exact opposite. If you're going to go Unicode, it should be
 done in moderation. Use as little Unicode as necessary and no more.

 As for how to input unicode -- Microsoft Word solved that problem ages
 ago, assuming we're talking about small numbers of special characters.
 It's called AutoCorrect. You just register your unicode symbol as a
 misspelling for "(X)" or something unique like that and then every
 time you type "(X)" a funky unicode character instantly replaces those
 chars.

 Yeh, not many editors support such a feature. But it's very easy to
 implement. And with that one generic mechanism, your editor is ready
 to support input of Unicode chars in any language just by adding the
 right definitions.

 --bb

 I am not entirely sure that 30 or (x amount) of new operators would be 
 a good thing anyway. How hard is it to say m3 = m1.crossProduct(m2) ? 
 vs m3 = m1 X m2 ? and how often will that happen? It's also going to 
 make the language more difficult to learn and understand.

 
 I have noticed that in pretty much all scientific code, the f(a, b) and 
 a.f(b) notations fall off a readability cliff when the number of 
 operators grows only to a handful. Lured by simple examples like yours, 
 people don't see that as a problem until they actually have to read or 
 write such code. Adding temporaries and such is not that great because 
 it further takes the algorithm away from its mathematical form just for 
 serving a notation that was the problem in the first place.
 

Yes, that is indeed a fair point and I agree. D is a "systems 
programming language." [sic] though; and so what will people use it for 
in the main? I suggest that communities that require scientific code 
have options now, and that they can and do choose languages for the 
purpose which have better support for their needs than D might achieve.


 If a set membership test operator and a few others are introduced, then 
 really to be "complete" all the set operators must be added, and 
 implemented.

 Furthermore, the introduction of set operators should really mean that 
 you can use them on something by default, that means implementing sets 
 that presumably are usable, quick, and are worth using, otherwise 
 people will roll their own (all the time) in many different ways.

 Unicode symbol 'x' may look better, but is it really more readable? I 
 think it is -- a bit, and it may be cool, but I don't think it's one 
 of the things that is going to make developing software significantly 
 easier.

 
 I think "cool" has not a lot to do with it. For scientific code, it's 
 closer to a necessity.

On my use of "cool" I only brought it up as this thread has a few 
mentions of the word and it's a bit nebulous. I, personally, am more 
concerned with practicality than "cool".

 
 
 Andrei

What I think of unicode symbols therefore depends on whether D should be 
more scientifically oriented or not. If it should be, then unicode symbols 
would undoubtedly be a benefit. My responses were guided by the 
assumption that D was more generic in nature, though.

Oct 25 2008

"Bill Baxter" <wbaxter gmail.com> writes:

On Sun, Oct 26, 2008 at 3:46 AM, Spacen Jasset <spacenjasset yahoo.co.uk> wrote:
 I am not entirely sure that 30 or (x amount) of new operators would be a
 good thing anyway. How hard is it to say m3 = m1.crossProduct(m2) ? vs m3 =
 m1 X m2 ? and how often will that happen? It's also going to make the
 language more difficult to learn and understand.

 I have noticed that in pretty much all scientific code, the f(a, b) and
 a.f(b) notations fall off a readability cliff when the number of operators
 grows only to a handful. Lured by simple examples like yours, people don't
 see that as a problem until they actually have to read or write such code.
 Adding temporaries and such is not that great because it further takes the
 algorithm away from its mathematical form just for serving a notation that
 was the problem in the first place.


Yes, heavy math code is hard to read in the current situation.
I almost always prefix any significant math with a comment giving the
equations being implemented in a more compact notation.
Having to write the same thing in two different ways like that is a
waste of effort.
It would be very cool if I could just write it once and have it look
like it does in my notebook.


 Yes, that is indeed a fair point and I agree. D is a "systems programming
 language." [sic] though; and so what will people use it for in the main?

D is a compile-to-the-metal language that is of interest to anyone who
ranks performance high on their list of priorities.  Mathematicians
and scientists are among the few remaining groups where maximum speed
is still needed.  Games are another area, and games are becoming more
and more sophisticated mathematically under the hood.

 I suggest that communities that require scientific code have options now, and
 that they can and do choose languages for the purpose which have better
 support for their needs than D might achieve.

The traditional math languages suck at doing anything besides math.
Want to do a bit of math then display the results interactively in an
OpenGL window?  With Fortran?!  Ha!

On the other end there are the Matlab and NumPy-type solutions.  They
are convenient for tinkering around and displaying some results, but
these are not good for performance.

D has both.  So I think D has potential to gain traction in the world
of math-heavy computing.

But anyway, I got convinced several posts back that the time is not
yet ripe for Unicode in D.  So I'm not gonna argue that D go Unicode
now.   I'm just saying that math code is hard to read, and that heavy
math users are a good target audience for D because they need
performance, but don't necessarily want to give up
general-purposeness.

--bb

Oct 25 2008

bearophile <bearophileHUGS lycos.com> writes:

Bill Baxter:
 On the other end there are the Matlab and NumPy-type solutions.  They
 are convenient for tinkering around and displaying some results, but
 these are not good for performance.

I have seen many scientific programs that use numpy, so sometimes it's fast
enough. But it forces you to write everything in a vector programming style,
that a procedural programmer needs time to learn. Normal C/D/C++ code is more
flexible, you can work on single items too in a fast way, while in numpy you
can go fast only when you work in bulk, on vectors.

On the other hand numpy offers you some higher level operations on arrays that
are currently missing in D, like certain complex slicing operations, that may
reduce your code length significantly, increasing code readability (because it
looks more like formulas); I can show you some examples if you want. Note that
in D there are no built-in rectangular dynamic arrays, that are basic stuff in
numpy/matlab.

Bye,
bearophile

Oct 25 2008

"Bill Baxter" <wbaxter gmail.com> writes:

On Sun, Oct 26, 2008 at 5:10 AM, bearophile <bearophileHUGS lycos.com> wrote:
 Bill Baxter:
 On the other end there are the Matlab and NumPy-type solutions.  They
 are convenient for tinkering around and displaying some results, but
 these are not good for performance.

 I have seen many scientific programs that use numpy, so sometimes it's fast
enough. But it forces you to write everything in a vector programming style,
that a procedural programmer needs time to learn. Normal C/D/C++ code is more
flexible, you can work on single items too in a fast way, while in numpy you
can go fast only when you work in bulk, on vectors.

Yep  C/D/C++ is easier.  The SciPy.org site has a growing section of
their wiki devoted to how to make your code fast using various levels
of python/native hybrids.  I was using python heavily for numerical
stuff for a while and it got to the point where I realized that the
time I spent trying to figure out how to vectorize things and use
other tricks to make things fast, and to make python modules out of
external code I wanted to call,  etc.  was actually more work than it
would be to just use D for everything.   Sure Python does have some
nice features as a language that D lacks, but from 10,000 ft  D is a
lot closer to Python than C++ in terms of ease of use.  Also, while
Python is nice for arrays and number crunching, I found the lack of
typing to be a liability when it comes to complicated graph
structures.  Instead of nicely typed pointers that the compiler can
tell apart, you end up with 23 different integer index variables that
you have to keep straight.  And finally, also type related, there's
the annoyance that you have to actually run your app to detect typos.

I'm sure there are ways to work around all those issues, but to me D's
a lot easier.  I simply don't need the workarounds.

I still fire up NumPy and Matplotlib for analyzing the results
from my D programs.  And SymPy is great too.  I just don't use it as
my main development language any more.

 On the other hand numpy offers you some higher level operations on arrays that
are currently missing in D, like certain complex slicing operations, that may
reduce your code length significantly, increasing code readability (because it
looks more like formulas); I can show you some examples if you want.

No thanks!  Been there, done that!

 Note that in D there's no built-in rectangular dynamic arrays, that are basic
stuff in numpy/matlab.

I've got my dflat and gobo
(http://www.dsource.org/projects/multiarray) that are working for me
pretty well.  They could use some full-time loving to make more
operations work intuitively, but the basics work ok.
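
As a rough illustration of what a library-level rectangular dynamic array can look like in D, here is a minimal sketch. The `Matrix` type and its `row` helper are hypothetical names made up for this example; this is not the API of the multiarray project linked above, and it assumes a D2 compiler.

```d
// Hypothetical minimal rectangular array; NOT the multiarray project's API.
struct Matrix
{
    size_t rows, cols;
    double[] data; // row-major storage, rows * cols elements

    this(size_t rows, size_t cols)
    {
        this.rows = rows;
        this.cols = cols;
        data = new double[rows * cols];
        data[] = 0.0; // doubles default to NaN in D, so zero them explicitly
    }

    // m[r, c] reads or writes a single element.
    ref double opIndex(size_t r, size_t c)
    {
        assert(r < rows && c < cols, "index out of bounds");
        return data[r * cols + c];
    }

    // A row is a contiguous slice, so built-in array operations apply to it.
    double[] row(size_t r)
    {
        assert(r < rows, "row out of bounds");
        return data[r * cols .. (r + 1) * cols];
    }
}

void main()
{
    auto m = Matrix(2, 3);
    m[1, 2] = 5.0;        // single-item access
    auto r = m.row(1);
    r[] *= 2.0;           // bulk operation on a whole row
    assert(m[1, 2] == 10.0);
    assert(m[0, 0] == 0.0);
}
```

Single-item access and bulk row operations both work on the same storage, which is the flexibility being contrasted with the vector-only style above.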

--bb

Oct 25 2008

bearophile <bearophileHUGS lycos.com> writes:

Bill Baxter:

was actually more work than it would be to just use D for everything.<

Mixing languages isn't nice, I agree. That's why I too use D for several
purposes.

But if you have to change your code very often (and if your problems are of a
certain kind that allows a natural vectorization), then having short, vectorized
code may have some advantages. Think about how much C++ code you would need to
write to implement the programs of this book:
http://wiki.deductivethinking.com/wiki/Python_Programs_for_Modelling_Infectious_Diseases_book
So it allows a more exploratory way of coding.

Sure Python does have some nice features as a language that D lacks, but from
10,000 ft D is a lot closer to Python than C++ in terms of ease of use.<

My experience with the ShedSkin compiler shows me that most of those features
that D lacks (complex slices, list comps, generators, short syntax, some
near-zero-cost safeties, etc) are absent because of cultural or inertial
reasons present in the brain of people used to C/C++, and not because they
can't be present/added in a language like D.
ShedSkin translates Python code to clean C++ code, showing that it can be done,
that it gives advantages, and that it's not too difficult to do. It shows once
and for all that you can have a C++-class language with a short and nice syntax,
etc.
Hopefully the Delight language has less of the cultural inertia coming from
C/C++, so it may become a better compromise than D itself.

I've got my dflat and gobo (http://www.dsource.org/projects/multiarray) that
are working for me pretty well. They could use some full-time loving to make
more operations work intuitively, but the basics work ok.<

Nice stuff, lot of stuff. More comments require more study of that code. D
(Tango) may gain from having more batteries.

Bye,
bearophile

Oct 25 2008

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Spacen Jasset wrote:
 Andrei Alexandrescu wrote:
 Spacen Jasset wrote:
 Bill Baxter wrote:
 On Thu, Oct 23, 2008 at 7:27 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Please vote up before the haters take it down, and discuss:

 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operator_in_d_similarly_to/

 (My comment cross posted here from reddit)

 I think the right way to do it is not to make everything Unicode. All
 the pressure on the existing symbols would be dramatically relieved by
 the addition of just a handful of new symbols.

 The truth is keyboards aren't very good for inputting Unicode. That
 isn't likely to change. Yes they've dealt with the problem in Asian
 languages by using IMEs but in my opinion IMEs are horrible to use.

 Some people seem to argue it's a waste to go to Unicode only for a few
 symbols. If you're going to go Unicode, you should go whole hog. I'd
 argue the exact opposite. If you're going to go Unicode, it should be
 done in moderation. Use as little Unicode as necessary and no more.

 As for how to input unicode -- Microsoft Word solved that problem ages
 ago, assuming we're talking about small numbers of special characters.
 It's called AutoCorrect. You just register your unicode symbol as a
 misspelling for "(X)" or something unique like that and then every
 time you type "(X)" a funky unicode character instantly replaces those
 chars.

 Yeh, not many editors support such a feature. But it's very easy to
 implement. And with that one generic mechanism, your editor is ready
 to support input of Unicode chars in any language just by adding the
 right definitions.

 --bb

 I am not entirely sure that 30 or (x amount) of new operators would 
 be a good thing anyway. How hard is it to say m3 = 
 m1.crossProduct(m2) ? vs m3 = m1 X m2 ? and how often will that 
 happen? It's also going to make the language more difficult to learn 
 and understand.

 I have noticed that in pretty much all scientific code, the f(a, b) 
 and a.f(b) notations fall off a readability cliff when the number of 
 operators grows only to a handful. Lured by simple examples like 
 yours, people don't see that as a problem until they actually have to 
 read or write such code. Adding temporaries and such is not that great 
 because it further takes the algorithm away from its mathematical form 
 just for serving a notation that was the problem in the first place.

 Yes, that is indeed a fair point and I agree. D is a "systems 
 programming language." [sic] though; and so what will people use it for 
 in the main? I suggest that communities that require scientific code 
 have options now, and that they can and do choose languages for the 
 purpose which have better support for their needs than D might achieve.

Surprisingly there's not a lot of choice, witnessed by the prevalence of
Fortran for scientific code. One interesting thing is that quite a few
scientific coders mess with D and hang out around here, such as Don
Clugston, Bill Baxter, bearophile, Benji Smith (he's doing machine 
learning if I remember correctly) and, if I may aspire to the status,
yours truly.

(I remain with an unformed opinion regarding Unicode operators.)

Andrei

Oct 25 2008

Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:

Andrei Alexandrescu wrote:
 Spacen Jasset wrote:
 Bill Baxter wrote:
 On Thu, Oct 23, 2008 at 7:27 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Please vote up before the haters take it down, and discuss:

 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operator_in_d_similarly_to/

 (My comment cross posted here from reddit)

 I think the right way to do it is not to make everything Unicode. All
 the pressure on the existing symbols would be dramatically relieved by
 the addition of just a handful of new symbols.

 The truth is keyboards aren't very good for inputting Unicode. That
 isn't likely to change. Yes they've dealt with the problem in Asian
 languages by using IMEs but in my opinion IMEs are horrible to use.

 Some people seem to argue it's a waste to go to Unicode only for a few
 symbols. If you're going to go Unicode, you should go whole hog. I'd
 argue the exact opposite. If you're going to go Unicode, it should be
 done in moderation. Use as little Unicode as necessary and no more.

 As for how to input unicode -- Microsoft Word solved that problem ages
 ago, assuming we're talking about small numbers of special characters.
 It's called AutoCorrect. You just register your unicode symbol as a
 misspelling for "(X)" or something unique like that and then every
 time you type "(X)" a funky unicode character instantly replaces those
 chars.

 Yeh, not many editors support such a feature. But it's very easy to
 implement. And with that one generic mechanism, your editor is ready
 to support input of Unicode chars in any language just by adding the
 right definitions.

 --bb

 I am not entirely sure that 30 or (x amount) of new operators would be 
 a good thing anyway. How hard is it to say m3 = m1.crossProduct(m2) ? 
 vs m3 = m1 X m2 ? and how often will that happen? It's also going to 
 make the language more difficult to learn and understand.

 
 I have noticed that in pretty much all scientific code, the f(a, b) and 
 a.f(b) notations fall off a readability cliff when the number of 
 operators grows only to a handful. Lured by simple examples like yours, 
 people don't see that as a problem until they actually have to read or 
 write such code. Adding temporaries and such is not that great because 
 it further takes the algorithm away from its mathematical form just for 
 serving a notation that was the problem in the first place.

But what operators would be added? Some mathematician programmers might 
want vector and matrix operators, others set operators, others still 
derivation/integration operators, and so on. Where would we stop?
I don't deny it might be useful for them, but it does seem like too 
specific a need to integrate in the language.


-- 
Bruno Medeiros - Software Developer, MSc. in CS/E graduate
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D

Oct 26 2008

KennyTM~ <kennytm gmail.com> writes:

Bruno Medeiros wrote:
 Andrei Alexandrescu wrote:
 Spacen Jasset wrote:
 Bill Baxter wrote:
 On Thu, Oct 23, 2008 at 7:27 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Please vote up before the haters take it down, and discuss:

 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operator_in_d_similarly_to/

 (My comment cross posted here from reddit)

 I think the right way to do it is not to make everything Unicode. All
 the pressure on the existing symbols would be dramatically relieved by
 the addition of just a handful of new symbols.

 The truth is keyboards aren't very good for inputting Unicode. That
 isn't likely to change. Yes they've dealt with the problem in Asian
 languages by using IMEs but in my opinion IMEs are horrible to use.

 Some people seem to argue it's a waste to go to Unicode only for a few
 symbols. If you're going to go Unicode, you should go whole hog. I'd
 argue the exact opposite. If you're going to go Unicode, it should be
 done in moderation. Use as little Unicode as necessary and no more.

 As for how to input unicode -- Microsoft Word solved that problem ages
 ago, assuming we're talking about small numbers of special characters.
 It's called AutoCorrect. You just register your unicode symbol as a
 misspelling for "(X)" or something unique like that and then every
 time you type "(X)" a funky unicode character instantly replaces those
 chars.

 Yeh, not many editors support such a feature. But it's very easy to
 implement. And with that one generic mechanism, your editor is ready
 to support input of Unicode chars in any language just by adding the
 right definitions.

 --bb

 I am not entirely sure that 30 or (x amount) of new operators would 
 be a good thing anyway. How hard is it to say m3 = 
 m1.crossProduct(m2) ? vs m3 = m1 X m2 ? and how often will that 
 happen? It's also going to make the language more difficult to learn 
 and understand.

 I have noticed that in pretty much all scientific code, the f(a, b) 
 and a.f(b) notations fall off a readability cliff when the number of 
 operators grows only to a handful. Lured by simple examples like 
 yours, people don't see that as a problem until they actually have to 
 read or write such code. Adding temporaries and such is not that great 
 because it further takes the algorithm away from its mathematical form 
 just for serving a notation that was the problem in the first place.

 
 But what operators would be added? Some mathematician programmers might 
 want vector and matrix operators, others set operators, others still 
 derivation/integration operators, and so on. Where would we stop?
 I don't deny it might be useful for them, but it does seem like too 
 specific a need to integrate in the language.
 
 

Composition may be useful for functional programming (I've never used 
any functional programming paradigm except "reduce".)

Matrix operations: + - * .tr() .inv() .det() etc are already sufficient 
for most jobs.

Vector operations: Maybe an operator for cross product.

Set operators: Just use + - * (| ~ &) instead like Pascal.

So only 2 Unicode operators I see are really useful and the replacements 
are ugly: Composition (o) and cross product (×).
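
To make the "reuse existing operators for sets" idea concrete, here is a small sketch. It assumes D2-style `opBinary` operator overloading (newer than the `opAdd`-style overloads discussed elsewhere in this thread), and `SmallSet` is a hypothetical type invented for the example. It uses `|`, `&` and `-` for union, intersection and difference.

```d
// Hypothetical set of small integers (0 .. 63) stored as a bit mask.
struct SmallSet
{
    ulong bits; // element e is present if bit e is set

    // Build a set from a list of elements, e.g. SmallSet.of(1, 2, 3).
    static SmallSet of(int[] elems...)
    {
        SmallSet s;
        foreach (e; elems)
        {
            assert(e >= 0 && e < 64, "element out of range");
            s.bits |= 1UL << e;
        }
        return s;
    }

    bool contains(int e) const
    {
        return ((bits >> e) & 1) != 0;
    }

    // Union (|), intersection (&) and difference (-) on ordinary ASCII operators.
    SmallSet opBinary(string op)(SmallSet rhs) const
        if (op == "|" || op == "&" || op == "-")
    {
        static if (op == "|")
            return SmallSet(bits | rhs.bits);
        else static if (op == "&")
            return SmallSet(bits & rhs.bits);
        else
            return SmallSet(bits & ~rhs.bits);
    }
}

void main()
{
    auto a = SmallSet.of(1, 2, 3);
    auto b = SmallSet.of(3, 4);
    assert((a | b) == SmallSet.of(1, 2, 3, 4));
    assert((a & b) == SmallSet.of(3));
    assert((a - b) == SmallSet.of(1, 2));
    assert((a | b).contains(4));
}
```

Nothing here needs a new symbol, which is the point being made: only operations with no reasonable ASCII stand-in, such as composition or the cross product, would benefit from Unicode.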

Oct 26 2008

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Bruno Medeiros wrote:
 Andrei Alexandrescu wrote:
 Spacen Jasset wrote:
 Bill Baxter wrote:
 On Thu, Oct 23, 2008 at 7:27 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Please vote up before the haters take it down, and discuss:

 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operator_in_d_similarly_to/

 (My comment cross posted here from reddit)

 I think the right way to do it is not to make everything Unicode. All
 the pressure on the existing symbols would be dramatically relieved by
 the addition of just a handful of new symbols.

 The truth is keyboards aren't very good for inputting Unicode. That
 isn't likely to change. Yes they've dealt with the problem in Asian
 languages by using IMEs but in my opinion IMEs are horrible to use.

 Some people seem to argue it's a waste to go to Unicode only for a few
 symbols. If you're going to go Unicode, you should go whole hog. I'd
 argue the exact opposite. If you're going to go Unicode, it should be
 done in moderation. Use as little Unicode as necessary and no more.

 As for how to input unicode -- Microsoft Word solved that problem ages
 ago, assuming we're talking about small numbers of special characters.
 It's called AutoCorrect. You just register your unicode symbol as a
 misspelling for "(X)" or something unique like that and then every
 time you type "(X)" a funky unicode character instantly replaces those
 chars.

 Yeh, not many editors support such a feature. But it's very easy to
 implement. And with that one generic mechanism, your editor is ready
 to support input of Unicode chars in any language just by adding the
 right definitions.

 --bb

 I am not entirely sure that 30 or (x amount) of new operators would 
 be a good thing anyway. How hard is it to say m3 = 
 m1.crossProduct(m2) ? vs m3 = m1 X m2 ? and how often will that 
 happen? It's also going to make the language more difficult to learn 
 and understand.

 I have noticed that in pretty much all scientific code, the f(a, b) 
 and a.f(b) notations fall off a readability cliff when the number of 
 operators grows only to a handful. Lured by simple examples like 
 yours, people don't see that as a problem until they actually have to 
 read or write such code. Adding temporaries and such is not that great 
 because it further takes the algorithm away from its mathematical form 
 just for serving a notation that was the problem in the first place.

 
 But what operators would be added? Some mathematician programmers might 
 want vector and matrix operators, others set operators, others still 
 derivation/integration operators, and so on. Where would we stop?
 I don't deny it might be useful for them, but it does seem like too 
 specific a need to integrate in the language.

I was thinking of allowing a general way of defining one Unicode 
character to stand in as one operator, and then have libraries implement 
  the actual operators.

There's the remaining problem of different libraries defining the same 
character to mean different operators. This may not be huge as math 
subdomains tend to be rather consistent in their use of operators. 
Across math subdomains, types and overloading can take care of things.

Also, ascii representation should be allowed for operators, and one nice 
thing about Unicode characters is that many have HTML ascii and 
human-readable names, see 
http://www.fileformat.info/format/w3c/htmlentity.htm. So 
\unicodecharname may be a good alternate way to enter these operators. 
For example, the empty set could be \empty, and the cross-product could 
be written as \times. So

c = a \times b;

doesn't quite look bad to me.

One nice thing about this is that we don't need to pore over naming and 
such, we just use stuff that others (creators and users alike) have 
already pored over. Saves on documentation writing too :o).
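
A rough sketch of how the ASCII spellings could map onto the Unicode characters, written as a plain text substitution rather than as a compiler feature. The function name `expandOperatorNames` and the two-entry mapping are assumptions made for this illustration only; a real lexer would match whole tokens instead of doing a naive string replace.

```d
import std.array : replace;

// Hypothetical helper: rewrite \name spellings into the Unicode characters
// they stand for. Naive substring replacement, for illustration only.
string expandOperatorNames(string src)
{
    src = src.replace(`\times`, "\u00D7"); // MULTIPLICATION SIGN
    src = src.replace(`\empty`, "\u2205"); // EMPTY SET
    return src;
}

void main()
{
    assert(expandOperatorNames(`c = a \times b;`) == "c = a \u00D7 b;");
    assert(expandOperatorNames(`s = \empty;`) == "s = \u2205;");
}
```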


Andrei

Oct 26 2008

KennyTM~ <kennytm gmail.com> writes:

Andrei Alexandrescu wrote:
 Bruno Medeiros wrote:
 Andrei Alexandrescu wrote:
 Spacen Jasset wrote:
 Bill Baxter wrote:
 On Thu, Oct 23, 2008 at 7:27 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Please vote up before the haters take it down, and discuss:

 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operator_in_d_similarly_to/

 (My comment cross posted here from reddit)

 I think the right way to do it is not to make everything Unicode. All
 the pressure on the existing symbols would be dramatically relieved by
 the addition of just a handful of new symbols.

 The truth is keyboards aren't very good for inputting Unicode. That
 isn't likely to change. Yes they've dealt with the problem in Asian
 languages by using IMEs but in my opinion IMEs are horrible to use.

 Some people seem to argue it's a waste to go to Unicode only for a few
 symbols. If you're going to go Unicode, you should go whole hog. I'd
 argue the exact opposite. If you're going to go Unicode, it should be
 done in moderation. Use as little Unicode as necessary and no more.

 As for how to input unicode -- Microsoft Word solved that problem ages
 ago, assuming we're talking about small numbers of special characters.
 It's called AutoCorrect. You just register your unicode symbol as a
 misspelling for "(X)" or something unique like that and then every
 time you type "(X)" a funky unicode character instantly replaces those
 chars.

 Yeh, not many editors support such a feature. But it's very easy to
 implement. And with that one generic mechanism, your editor is ready
 to support input of Unicode chars in any language just by adding the
 right definitions.

 --bb

 I am not entirely sure that 30 or (x amount) of new operators would 
 be a good thing anyway. How hard is it to say m3 = 
 m1.crossProduct(m2) ? vs m3 = m1 X m2 ? and how often will that 
 happen? It's also going to make the language more difficult to learn 
 and understand.

 I have noticed that in pretty much all scientific code, the f(a, b) 
 and a.f(b) notations fall off a readability cliff when the number of 
 operators grows only to a handful. Lured by simple examples like 
 yours, people don't see that as a problem until they actually have to 
 read or write such code. Adding temporaries and such is not that 
 great because it further takes the algorithm away from its 
 mathematical form just for serving a notation that was the problem in 
 the first place.

 But what operators would be added? Some mathematician programmers 
 might want vector and matrix operators, others set operators, others 
 still derivation/integration operators, and so on. Where would we stop?
 I don't deny it might be useful for them, but it does seem like too 
 specific a need to integrate in the language.

 
 I was thinking of allowing a general way of defining one Unicode 
 character to stand in as one operator, and then have libraries implement 
  the actual operators.
 
 There's the remaining problem of different libraries defining the same 
 character to mean different operators. This may not be huge as math 
 subdomains tend to be rather consistent in their use of operators. 
 Across math subdomains, types and overloading can take care of things.
 
 Also, ascii representation should be allowed for operators, and one nice 
 thing about Unicode characters is that many have HTML ascii and 
 human-readable names, see 
 http://www.fileformat.info/format/w3c/htmlentity.htm. So 
 \unicodecharname may be a good alternate way to enter these operators. 
 For example, the empty set could be \empty, and the cross-product could 
 be written as \times. So
 
 c = a \times b;
 
 doesn't quite look bad to me.
 
 One nice thing about this is that we don't need to pore over naming and 
 such, we just use stuff that others (creators and users alike) have 
 already pored over. Saves on documentation writing too :o).
 
 
 Andrei

LaTeX in D? :p

Anyway we already have \&times; and \&empty; so we could reuse them in 
source code level as I've described somewhere in this thread.


   auto torque = position \&times; force;

This is uglier than

   auto torque = position \times force;

but it gives a uniform syntax between escape sequences inside and 
outside strings.

The problem is you may have to invent some names, i.e. the composition 
operator ∘ (U+2218 ring operator) has no name in SGML entities. In LaTeX 
it is represented as \circ but \&circ; is already taken by ˆ (U+02C6 
modifier letter circumflex accent).

And you'll need to predefine the associativity and operation precedence 
too. ;) See my other entry in this thread.
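
For reference, this is how the named entities mentioned above already behave inside D string literals. It is a small check, assuming a compiler that accepts named character entities in strings; it shows that the `circ` entity denotes the circumflex modifier and not the ring operator.

```d
void main()
{
    // Named character entities inside string literals.
    static assert("\&times;" == "\u00D7"); // multiplication sign
    static assert("\&empty;" == "\u2205"); // empty set
    static assert("\&circ;"  == "\u02C6"); // modifier letter circumflex accent

    // The ring operator (U+2218) is a different character, so the
    // composition operator would need a name of its own.
    static assert("\&circ;" != "\u2218");
}
```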

Oct 26 2008

Charles Hixson <charleshixsn earthlink.net> writes:

Bruno Medeiros wrote:
 Andrei Alexandrescu wrote:
 Spacen Jasset wrote:
 Bill Baxter wrote:
 On Thu, Oct 23, 2008 at 7:27 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Please vote up before the haters take it down, and discuss:

 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operator_in_d_similarly_to/

 (My comment cross posted here from reddit)

 I think the right way to do it is not to make everything Unicode. All
 the pressure on the existing symbols would be dramatically relieved by
 the addition of just a handful of new symbols.

 The truth is keyboards aren't very good for inputting Unicode. That
 isn't likely to change. Yes they've dealt with the problem in Asian
 languages by using IMEs but in my opinion IMEs are horrible to use.

 Some people seem to argue it's a waste to go to Unicode only for a few
 symbols. If you're going to go Unicode, you should go whole hog. I'd
 argue the exact opposite. If you're going to go Unicode, it should be
 done in moderation. Use as little Unicode as necessary and no more.

 As for how to input unicode -- Microsoft Word solved that problem ages
 ago, assuming we're talking about small numbers of special characters.
 It's called AutoCorrect. You just register your unicode symbol as a
 misspelling for "(X)" or something unique like that and then every
 time you type "(X)" a funky unicode character instantly replaces those
 chars.

 Yeh, not many editors support such a feature. But it's very easy to
 implement. And with that one generic mechanism, your editor is ready
 to support input of Unicode chars in any language just by adding the
 right definitions.

 --bb

 I am not entirely sure that 30 or (x amount) of new operators would 
 be a good thing anyway. How hard is it to say m3 = 
 m1.crossProduct(m2) ? vs m3 = m1 X m2 ? and how often will that 
 happen? It's also going to make the language more difficult to learn 
 and understand.

 I have noticed that in pretty much all scientific code, the f(a, b) 
 and a.f(b) notations fall off a readability cliff when the number of 
 operators grows only to a handful. Lured by simple examples like 
 yours, people don't see that as a problem until they actually have to 
 read or write such code. Adding temporaries and such is not that great 
 because it further takes the algorithm away from its mathematical form 
 just for serving a notation that was the problem in the first place.

 
 But what operators would be added? Some mathematician programmers might 
 want vector and matrix operators, others set operators, others still 
 derivation/integration operators, and so on. Where would we stop?
 I don't deny it might be useful for them, but it does seem like too 
 specific a need to integrate in the language.
 
 

Perhaps what needs to be added is a syntax for defining character to 
function correspondence?  That way people could define the binary 
functions that they need, and then define a corresponding character 
string that represented it.  I once recommended that Eiffel include a
means of defining user operators (i.e., binary functions that sit
between the terms on which they operate) using the name syntax thusly:

Starts and ends with '|' and doesn't contain any whitespace.  Must be 
surrounded by whitespace when used.  I.e. 1 |X|-3 would be forbidden, as 
there is no whitespace following the |X| operator.

That still seems like a good rule to me.  If you want to include 
unicode, that's no problem.  And the function could also be used as:
    X(1, -3)
with identical meaning.  I.e., marking a function as an operator by 
surrounding it with pipes would be purely syntax sugar.  Note that such 
operators would have a precedence higher than assignment, but lower than 
everything else, so in practice the choice would be between writing:
   X (1, -3)
and writing:
   (1 |X| -3)
unless all one were doing is making an assignment.  This is analogous to 
the class member variable in object methods, or the class name in class 
methods, except that that is often understood.

OTOH, I'm not certain how much such syntax buys you.
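
For what it's worth, the `|X|` spelling can be approximated with ordinary operator overloading, without any language change. The sketch below assumes D2-style `opBinary`/`opBinaryRight`; the names `Infix`, `Partial` and `times` are hypothetical. Because it rides on the built-in `|` operator, it inherits that operator's precedence instead of the "just above assignment" rule described above, so parentheses are still advisable.

```d
// Result of `lhs | X`: remembers the left operand until the right one arrives.
struct Partial(alias fn, L)
{
    L lhs;

    // (lhs | X) | rhs  becomes  fn(lhs, rhs)
    auto opBinary(string op, R)(R rhs) const
        if (op == "|")
    {
        return fn(lhs, rhs);
    }
}

// Wraps a two-argument function so that it can sit between its operands.
struct Infix(alias fn)
{
    // lhs | X  captures the left operand
    auto opBinaryRight(string op, L)(L lhs) const
        if (op == "|")
    {
        return Partial!(fn, L)(lhs);
    }
}

// Hypothetical binary function standing in for X.
int times(int a, int b)
{
    return a * b;
}

enum X = Infix!times();

void main()
{
    // The infix form and the plain call mean the same thing.
    assert((1 |X| -3) == times(1, -3));
    assert((4 |X| 5) == 20);
}
```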

P.S.:  another possibility, which is more in line with current D syntax 
requires an assignment of the operator character to a function that 
starts with op.  As in '+' is associated with opAdd.  However even 
though this is more in line with current D syntax, it seems to buy you a 
lot less.  And it seems to require that the operator be a single 
character.  This appears to me to be more work than it's worth for the 
return.  Even the approach that I suggested is probably marginal.

P.P.S:  Any system that requires that a specific IDE or editor be used
is not going to work.  Not unless the IDE were provided with the
language, and even then the most successful examples I can think of are
EMACS and Smalltalk.  (I'm excluding programs that don't run on Linux, 
as I have no familiarity with either how they function or how popular 
they are.  Probably, though, one could include Visual Basic and maybe 
some others.  But one certainly couldn't include Basic, merely one 
dialect of it.)

Oct 26 2008

"Simen Kjaeraas" <simen.kjaras gmail.com> writes:

On Sat, 25 Oct 2008 12:14:47 +0200, Spacen Jasset  
<spacenjasset yahoo.co.uk> wrote:

 Why unicode anyway? In the same way that editor support is required to  
 actually type them in, why not let the editor render them. So instead of  
 symbol 'x' in the source code, say:

 m3 = m1 cross_product m2

 as an infix notation in a similar way to the (unary) sizeof operator.


 While cross_product is a bit long and unwieldy any editor capable can  
 replace the rendition of that keyword with a symbol for it. But in  
 editors that don't it means that it still can be typed in and/or  
 displayed easily.

 Another option includes providing cross_product as an 'alias' and 'X'
 as well.

 Which then leads on to the introduction of a facility to add arbitrary
 operators, which could be interesting because you can supply any
 operator you see fit for the domains that you use that require it. --
 This provide exactly the right solution though as all the additions
 would be 'non standard' and I can see books in the future recommending
 people not use unicode operators, because editors don't have support for
 them.

This made me think. What if we /could/ define arbitrary infix operators in  
D? I'm thinking something along the lines of:


operator cross_product(T, U)
{
   static if (T.opCross)
   {
     T.opCross(T)
   }
   else static if (U.opCross)
   {
     U.opCross_r(T);
   }
   else
   {
     static assert(false, "Operator not applicable to operands.");
   }
}

alias cross_product ×;


I'm not sure if this is possible, but it sure would please downs. :P
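
The dispatch part of this idea can be written in D as it stands, just without the infix spelling. A minimal sketch, assuming D2 templates and `__traits(hasMember, ...)`; `Vec3`, `opCross` and `crossProduct` are hypothetical names used only for illustration.

```d
// Hypothetical 3D vector that opts in by providing an opCross member.
struct Vec3
{
    double x, y, z;

    Vec3 opCross(Vec3 o) const
    {
        return Vec3(y * o.z - z * o.y,
                    z * o.x - x * o.z,
                    x * o.y - y * o.x);
    }
}

// Free function doing the same compile-time dispatch as the sketch above:
// prefer the left operand's opCross, then the right operand's opCross_r.
auto crossProduct(T, U)(T a, U b)
{
    static if (__traits(hasMember, T, "opCross"))
        return a.opCross(b);
    else static if (__traits(hasMember, U, "opCross_r"))
        return b.opCross_r(a);
    else
        static assert(false, "Operator not applicable to operands.");
}

void main()
{
    auto c = crossProduct(Vec3(1, 0, 0), Vec3(0, 1, 0));
    assert(c.x == 0 && c.y == 0 && c.z == 1);

    // A type with neither member is rejected at compile time.
    static assert(!__traits(compiles, crossProduct(1, 2)));
}
```

What this cannot express is the part that makes the proposal an operator: the infix position, and with it the precedence and associativity.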

-- 
Simen

Oct 26 2008

"Bill Baxter" <wbaxter gmail.com> writes:

On Sun, Oct 26, 2008 at 11:02 PM, Simen Kjaeraas <simen.kjaras gmail.com> wrote:
 On Sat, 25 Oct 2008 12:14:47 +0200, Spacen Jasset <spacenjasset yahoo.co.uk>
 wrote:

 Why unicode anyway? In the same way that editor support is required to
 actually type them in, why not let the editor render them. So instead of
 symbol 'x' in the source code, say:

 m3 = m1 cross_product m2

 as an infix notation in a similar way to the (unary) sizeof operator.


 While cross_product is a bit long and unwieldy any editor capable can
 replace the rendition of that keyword with a symbol for it. But in editors
 that don't it means that it still can be typed in and/or displayed easily.

 Another option includes providing cross_product as an 'alias' and 'X'
 as well.

 Which then leads on to the introduction of a facility to add arbitrary
 operators, which could be interesting because you can supply any operator
 you see fit for the domains that you use that require it. -- This provide
 exactly the right solution though as all the additions would be 'non
 standard' and I can see books in the future recommending people not use
 unicode operators, because editors don't have support for them.

 This made me think. What if we /could/ define arbitrary infix operators in
 D? I'm thinking something along the lines of:


 operator cross_product(T, U)
 {
  static if (T.opCross)
  {
    T.opCross(T)
  }
  else static if (U.opCross)
  {
    U.opCross_r(T);
  }
  else
  {
    static assert(false, "Operator not applicable to operands.");
  }
 }

 alias cross_product ×;


 I'm not sure if this is possible, but it sure would please downs. :P

What's the precedence of your user-defined in-fix operator?

--bb

Oct 26 2008

"Simen Kjaeraas" <simen.kjaras gmail.com> writes:

On Sun, 26 Oct 2008 22:28:16 +0100, Bill Baxter <wbaxter gmail.com> wrote:

 On Sun, Oct 26, 2008 at 11:02 PM, Simen Kjaeraas  
 <simen.kjaras gmail.com> wrote:
 On Sat, 25 Oct 2008 12:14:47 +0200, Spacen Jasset  
 <spacenjasset yahoo.co.uk>
 wrote:

 Why unicode anyway? In the same way that editor support is required to
 actually type them in, why not let the editor render them. So instead  
 of
 symbol 'x' in the source code, say:

 m3 = m1 cross_product m2

 as an infix notation in a similar way to the (unary) sizeof
 operator.


 While cross_product is a bit long and unwieldy any editor capable can
 replace the rendition of that keyword with a symbol for it. But in  
 editors
 that don't it means that it still can be typed in and/or displayed  
 easily.

 Another option includes providing cross_product as an 'alias' and 'X'
 as well.

 Which then leads on to the introduction of a facility to add arbitrary
 operators, which could be interesting because you can supply any
 operator
 you see fit for the domains that you use that require it. -- This
 provide
 exactly the right solution though as all the additions would be 'non
 standard' and I can see books in the future recommending people not use
 unicode operators, because editors don't have support for them.

 This made me think. What if we /could/ define arbitrary infix operators  
 in
 D? I'm thinking something along the lines of:


 operator cross_product(T, U)
 {
  static if (T.opCross)
  {
    T.opCross(T)
  }
  else static if (U.opCross)
  {
    U.opCross_r(T);
  }
  else
  {
    static assert(false, "Operator not applicable to operands.");
  }
 }

 alias cross_product ×;


 I'm not sure if this is possible, but it sure would please downs. :P

 What's the precedence of your user-defined in-fix operator?

 --bb

Yup, I realized this myself as well. Seemed like such a great idea when I  
only thought of it for three seconds. :p

-- 
Simen

Oct 26 2008

"Bill Baxter" <wbaxter gmail.com> writes:

On Mon, Oct 27, 2008 at 8:23 AM, Simen Kjaeraas <simen.kjaras gmail.com> wrote:
 On Sun, 26 Oct 2008 22:28:16 +0100, Bill Baxter <wbaxter gmail.com> wrote:
 On Sun, Oct 26, 2008 at 11:02 PM, Simen Kjaeraas <simen.kjaras gmail.com>
 wrote:
 On Sat, 25 Oct 2008 12:14:47 +0200, Spacen Jasset
 <spacenjasset yahoo.co.uk>
 wrote:

 Why unicode anyway? In the same way that editor support is required to
 actually type them in, why not let the editor render them. So instead of
 symbol 'x' in the source code, say:

 m3 = m1 cross_product m2

 as an infix notation in a similar way to the (unary) sizeof operator.

 While cross_product is a bit long and unwieldy any editor capable can
 replace the rendition of that keyword with a symbol for it. But in
 editors
 that don't it means that it still can be typed in and/or displayed
 easily.

 Another option includes providing cross_product as an 'alias' and 'X'
 as well.

 Which then leads on to the introduction of a facility to add arbitrary
 operators, which could be interesting because you can supply any
 operator
 you see fit for the domains that you use that require it. -- This
 provide
 exactly the right solution though as all the additions would be 'non
 standard' and I can see books in the future recommending people not use
 unicode operators, because editors don't have support for them.

 This made me think. What if we /could/ define arbitrary infix operators
 in
 D? I'm thinking something along the lines of:


 operator cross_product(T, U)
 {
  static if (T.opCross)
  {
   T.opCross(T)
  }
  else static if (U.opCross)
  {
   U.opCross_r(T);
  }
  else
  {
   static assert(false, "Operator not applicable to operands.");
  }
 }

 alias cross_product ×;


 I'm not sure if this is possible, but it sure would please downs. :P

 What's the precedence of your user-defined in-fix operator?

 --bb

 Yup, I realized this myself as well. Seemed like such a great idea when I
 only thought of it for three seconds. :p

Same thing goes for downs' in-fix operators.  I think his syntax is
/infix/ which means that his ops always have the same precedence as
division.
I'm guessing this Python Cookbook recipe is very similar to Downs'
technique.  It discusses pros and cons and such.
http://code.activestate.com/recipes/384122/
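
For readers who have not seen the trick, here is a minimal sketch of how a /name/ pseudo-operator can be built in D. It is an assumption about the general technique, not downs' actual code, and it uses the opBinary/opBinaryRight templates of current D rather than the opDiv/opDiv_r of 2008. The names Infix, Partial, larger and maxOf are made up for illustration.

```d
import std.stdio : writeln;

/// Holds the left operand after `lhs / op` has been evaluated.
struct Partial(alias fn, L)
{
    L lhs;

    /// `partial / rhs` finishes the call.
    auto opBinary(string op, R)(R rhs) if (op == "/")
    {
        return fn(lhs, rhs);
    }
}

/// The named pseudo-operator: `lhs / op` yields a Partial.
struct Infix(alias fn)
{
    auto opBinaryRight(string op, L)(L lhs) if (op == "/")
    {
        return Partial!(fn, L)(lhs);
    }
}

int larger(int a, int b) { return a > b ? a : b; }

// Hypothetical operator name; any function of two arguments would do.
enum maxOf = Infix!larger();

void main()
{
    writeln(2 /maxOf/ 5);       // 5
    // Parsed as 1 + ((2 / maxOf) / 5): the pseudo-operator binds like '/'.
    writeln(1 + 2 /maxOf/ 5);   // 6
}
```

The second line of output shows the limitation: the grouping is fixed by the borrowed '/' token, so a pseudo-operator that ought to bind more loosely than + still needs explicit parentheses.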

--bb

Oct 26 2008

"Simen Kjaeraas" <simen.kjaras gmail.com> writes:

On Mon, 27 Oct 2008 00:41:26 +0100, Bill Baxter <wbaxter gmail.com> wrote:
 Same thing goes for downs' in-fix operators.  I think his syntax is
 /infix/ which means that his ops always have the same precedence as
 division.
 I'm guessing this Python Cookbook recipe is very similar to Downs'
 technique.  It discusses pros and cons and such.
 http://code.activestate.com/recipes/384122/

 --bb

An interesting read, though I have looked at downs' code before. It
occurred to me now that this could sorta have been fixed with a
preprocessor: just define an operator to have the same precedence as an
already existing operator, then define an alias that gets replaced with
/foo/, +foo+, or whatever operator you chose.
I guess we're stuck waiting for macros in the meantime.

-- 
Simen

Oct 26 2008

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Simen Kjaeraas wrote:
 On Sun, 26 Oct 2008 22:28:16 +0100, Bill Baxter <wbaxter gmail.com> wrote:
 
 On Sun, Oct 26, 2008 at 11:02 PM, Simen Kjaeraas 
 <simen.kjaras gmail.com> wrote:
 On Sat, 25 Oct 2008 12:14:47 +0200, Spacen Jasset 
 <spacenjasset yahoo.co.uk>
 wrote:

 Why unicode anyway? In the same way that editor support is required to
 actually type them in, why not let the editor render them. So 
 instead of
 symbol 'x' in the source code, say:

 m3 = m1 cross_product m2

 as an infix notation in a similar way to the (unary) sizeof
 operator.


 While cross_product is a bit long and unwieldy any editor capable can
 replace the rendition of that keyword with a symbol for it. But in 
 editors
 that don't it means that it still can be typed in and/or displayed 
 easily.

 Another option includes providing cross_product as an 'alias' and 'X'
 as well.

 Which then leads on to the introduction of a facility to add arbitrary
 operators, which could be interesting because you can supply any
 operator
 you see fit for the domains that you use that require it. -- This 
 provide
 exactly the right solution though as all the additions would be 'non
 standard' and I can see books in the future recommending people not use
 unicode operators, because editors don't have support for them.

 This made me think. What if we /could/ define arbitrary infix 
 operators in
 D? I'm thinking something along the lines of:


 operator cross_product(T, U)
 {
  static if (T.opCross)
  {
    T.opCross(T)
  }
  else static if (U.opCross)
  {
    U.opCross_r(T);
  }
  else
  {
    static assert(false, "Operator not applicable to operands.");
  }
 }

 alias cross_product ×;


 I'm not sure if this is possible, but it sure would please downs. :P

 What's the precedence of your user-defined in-fix operator?

 --bb

 
 Yup, I realized this myself as well. Seemed like such a great idea when 
 I only thought of it for three seconds. :p

An operator could always be defined to have the same precedence as an
existing operator, which it has to specify.

Andrei

Oct 26 2008

"Bill Baxter" <wbaxter gmail.com> writes:

On Mon, Oct 27, 2008 at 9:04 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 What's the precedence of your user-defined in-fix operator?

 --bb

 Yup, I realized this myself as well. Seemed like such a great idea when I
 only thought of it for three seconds. :p

 An operator could always be defined to have the same precedence as an
 existing operator, which it has to specify.

Walter said in a previous post a few days ago when I suggested it that
that would kill D's easy parsability.
You say no?  I'm no parser expert, so hard for me to say.

--bb

Oct 26 2008

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Bill Baxter wrote:
 On Mon, Oct 27, 2008 at 9:04 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 
 What's the precedence of your user-defined in-fix operator?

 --bb

 Yup, I realized this myself as well. Seemed like such a great idea when I
 only thought of it for three seconds. :p

 An operator could always be defined to have the same precedence as an
 existing operator, which it has to specify.

 
 Walter said in a previous post a few days ago when I suggested it that
 that would kill D's easy parsability.
 You say no?  I'm no parser expert, so hard for me to say.

It can be done, but it's kinda involved. You define a grammar in which 
all operators have the same precedence. Consequently you compile any 
expression into a list of operands and operators. That makes the 
language parsable without semantic info. Then the semantic stage
transforms the list into a tree. Cecil does that.
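
A minimal sketch of that two-stage idea in D, under stated assumptions: the parser has already produced a flat list of strings, the precedence table in precedenceOf is hypothetical, every operator is left-associative, and the "tree" is shown as a parenthesized string. The semantic stage here uses precedence climbing; how Cecil actually does it is not described in this thread.

```d
import std.stdio : writeln;

/// Precedence lookup (hypothetical table). A user-defined operator such as
/// "cross" simply declares that it binds like an existing one ("*").
int precedenceOf(string op)
{
    switch (op)
    {
        case "+", "-":          return 1;
        case "*", "/", "cross": return 2;
        default: throw new Exception("not an operator: " ~ op);
    }
}

/// Semantic stage: turn the flat list (operand, operator, operand, ...)
/// into a fully parenthesized expression, built here as a string.
string climb(const string[] flat, ref size_t pos, int minPrec)
{
    string lhs = flat[pos++];                   // operand
    while (pos < flat.length && precedenceOf(flat[pos]) >= minPrec)
    {
        immutable op = flat[pos++];             // operator
        // Left-associative: the right side must bind strictly tighter.
        string rhs = climb(flat, pos, precedenceOf(op) + 1);
        lhs = "(" ~ lhs ~ " " ~ op ~ " " ~ rhs ~ ")";
    }
    return lhs;
}

void main()
{
    // What the parser would hand over: no structure, just tokens in order.
    auto flat = ["a", "+", "b", "cross", "c", "-", "d"];
    size_t pos = 0;
    writeln(climb(flat, pos, 0));   // ((a + (b cross c)) - d)
}
```

Note that the check for "is this token really an operator?" only happens inside precedenceOf, that is, after parsing.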

Andrei

Oct 26 2008

"Bill Baxter" <wbaxter gmail.com> writes:

On Mon, Oct 27, 2008 at 11:43 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 Bill Baxter wrote:
 On Mon, Oct 27, 2008 at 9:04 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 What's the precedence of your user-defined in-fix operator?

 --bb

 Yup, I realized this myself as well. Seemed like such a great idea when
 I
 only thought of it for three seconds. :p

 An operator could always be defined to have the same precedence as an
 existing operator, which it has to specify.

 Walter said in a previous post a few days ago when I suggested it that
 that would kill D's easy parsability.
 You say no?  I'm no parser expert, so hard for me to say.

 It can be done, but it's kinda involved. You define a grammar in which all
 operators have the same precedence. Consequently you compile any expression
 into a list of operands and operators. That makes the language parsable
 without semantic info. Then the semantic stage transforms the list into a
 tree. Cecil does that.

I see.  So the price you pay is that you defer more decisions till
semantic stage.

I.e. "a b c d e" is allowed to parse into an amorphous list, then in
the semantic pass you decide if 'b' and 'd' are actually legal
operators or not.

--bb

Oct 26 2008

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Bill Baxter wrote:
 On Mon, Oct 27, 2008 at 11:43 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Bill Baxter wrote:
 On Mon, Oct 27, 2008 at 9:04 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 What's the precedence of your user-defined in-fix operator?

 --bb

 Yup, I realized this myself as well. Seemed like such a great idea when
 I
 only thought of it for three seconds. :p

 An operator could always be defined to have the same precedence as an
 existing operator, which it has to specify.

 Walter said in a previous post a few days ago when I suggested it that
 that would kill D's easy parsability.
 You say no?  I'm no parser expert, so hard for me to say.

 It can be done, but it's kinda involved. You define a grammar in which all
 operators have the same precedence. Consequently you compile any expression
 into a list of operands and operators. That makes the language parsable
 without semantic info. Then the semantic stage transforms the list into a
 tree. Cecil does that.

 
 I see.  So the price you pay is that you defer more decisions till
 semantic stage.
 
 I.e. "a b c d e" is allowed to parse into an amorphous list, then in
 the semantic pass you decide if 'b' and 'd' are actually legal
 operators or not.

Yah. Something tells me Walter won't embark on that soon.

Andrei

Oct 26 2008

Walter Bright <newshound1 digitalmars.com> writes:

Andrei Alexandrescu wrote:
 Yah. Something tells me Walter won't embark on that soon.

Not a chance <g>. Producing an amorphous list of tokens isn't what I'd 
call "parsing".

Oct 26 2008

Max Samukha <samukha voliacable.com.removethis> writes:

On Wed, 22 Oct 2008 17:27:58 -0500, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

Please vote up before the haters take it down, and discuss:

http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/


Andrei

I'm already having problems with unicode: the news reader I'm using
doesn't display the characters correctly (maybe it's time to update).
If unicode can be avoided, please avoid it.

Oct 22 2008

bearophile <bearophileHUGS lycos.com> writes:

Andrei Alexandrescu:

Few random thoughts on the subject:
- Someday programming languages will probably use some Unicode symbols. I don't
know if Fortress will succeed, but I think someday some language will.
Probably Unicode symbols will be used as in Fortress, to improve the
readability of the code, and not as in APL to transform the code into
hieroglyphics.
- Another good thing that Fortress does is that there are always *nice* looking
ways to write the same code in pure ASCII. So there are usually intuitive 2 or
3 char long translations of all the accepted Unicode symbols. This is very
positive, so you can write/read Fortress with a normal ASCII editor too.
- My editor, programming font, newsreader, IDEs, and probably more things,
currently have problems with Unicode texts.
- Novels in English and other languages show that you can express very complex
and refined thoughts with just very few characters. But you need some space to
write a novel/short story. Mathematics shows that a judicious usage of standard
and widely used symbols helps a lot in decreasing the space used to represent
formulas, etc.
- Fortress and the Mathematica language are designed for physics and
mathematics. D language can be used for that, but it's mostly a system
language. So symbols are more used and more important in Fortress than D. So
their purposes and targets are different.
- I like the idea of using *few* Unicode symbols in my programs, they can
reduce code size and they may even improve readability.
- Python3 allows Unicode identifiers, mostly to allow people in all parts of the
world to write variable names in their languages.
- But seeing the disadvantages in the end I think that in practice adopting
Unicode for D programs is currently bad.

Bye,
bearophile

Oct 23 2008

Robert Fraser <fraserofthenight gmail.com> writes:

bearophile wrote:
 - Python3 allows Unicode identifiers, mostly to allow people in all parts of
the world to write variable names in their languages.

So does D.

Oct 23 2008

Max Samukha <samukha voliacable.com.removethis> writes:

On Thu, 23 Oct 2008 04:23:29 -0700, Robert Fraser
<fraserofthenight gmail.com> wrote:

bearophile wrote:
 - Python3 allows Unicode identifiers, mostly to allow people in all parts of
the world to write variable names in their languages.

So does D.

I'd like to note that identifiers in a non-English language are
considered bad style by many programmers. Besides, a large share of
software projects nowadays are international. Imagine every participant
in the Linux project writing identifiers in their own language.

Oct 23 2008

Yigal Chripun <yigal100 gmail.com> writes:

Max Samukha wrote:
 On Thu, 23 Oct 2008 04:23:29 -0700, Robert Fraser 
 <fraserofthenight gmail.com> wrote:
 
 bearophile wrote:
 - Python3 allows Unicode identifiers, mostly to allow people in
 all parts of the world to write variable names in their languages.
 

 So does D.

 I'd like to note that identifiers in a non-English language are 
 considered bad style by many programmers. Besides, big part of 
 software projects nowadays are international. Imagine participants of
  linux project writing identifiers in his language.

isn't that something that should be decided upon on a per-project basis?
I agree that it'll be bad for Linux, but each project has its own
objectives. for example, what if you're teaching a programming course
for kids? it'll be easier for them writing in their own native language.
I could easily imagine a small start-up writing in their own native
language (let's say Hebrew) as one way for obfuscating the source code,
so as to protect their IP.
there are, I'm sure, more use-cases.

Oct 23 2008

bearophile <bearophileHUGS lycos.com> writes:

I always use English for variable names, instead of my language, because I've
had my share of debugging code with variables in other languages and it's not a
nice thing to do.

Regarding Python code, its std libs keep identifiers in English only, but when
they created the One Laptop per Child machine, which uses Python a lot, they
decided that 'kids' may enjoy using variable names in their language...

Bye,
bearophile

Oct 23 2008

Max Samukha <samukha voliacable.com.removethis> writes:

On Thu, 23 Oct 2008 08:33:16 -0400, bearophile
<bearophileHUGS lycos.com> wrote:

I always use English for variable names, instead of my language, because I've
had my share of debugging code with variables in other languages and it's not a
nice thing to do.

Regarding Python code, its std libs keeps identifiers in English only, but when
they have invented the OneLaptopForChild that uses Python a lot, they have
decided that 'kids' may enjoy using variable names in their language...

Bye,
bearophile

Keep children away from Python. Let them have happy lives :)

Oct 23 2008

Walter Bright <newshound1 digitalmars.com> writes:

Robert Fraser wrote:
 bearophile wrote:
 - Python3 allows Unicode identifiers, mostly to allow people in all 
 parts of the world to write variable names in their languages.

 
 So does D.

D currently allows Unicode in identifiers, comments, and strings. In 
fact, D source text is defined to be Unicode.

Oct 23 2008

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

bearophile wrote:
 Andrei Alexandrescu:

[snip]

(No need to single me out. It's Walter's post, and besides I don't have 
a formed opinion on Unicode symbols.)

Andrei

Oct 23 2008

Yigal Chripun <yigal100 gmail.com> writes:

Andrei Alexandrescu wrote:
 Please vote up before the haters take it down, and discuss:
 
 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
 
 
 
 Andrei

A few thoughts on the subject:

- others already mentioned, i think, smalltalk as an example. smalltalk
bundles as part of the language also the complete environment and IDE so
they can add Unicode chars without worrying much about editor support.
in D this is an issue as D doesn't provide an "official" D editor. The
support for Unicode largely exists - even plain notepad supports Unicode
fully - but that doesn't mean people are using any of the many editors
that have this feature.

- smalltalk uses left-arrow as assignment op. the way you enter it is by
typing "<_" so this is similar to Bill's suggestion, i.e. define a short
sequence of chars to be replaced by a Unicode char in the file source.

- why not generalize the concept? a few ideas: syntax is not important
here, just the idea itself..
1) bool compare as == (A a, A b) {}
you can add an op alias to your function, maybe define anonymous
function with alias to be used only as op.
2) provide a way to specify which functions can be used as infix
functions (Scala does that IIRC) and maybe even specify precedence
somehow, so that downs' map function could be written as :
infix void map(...) {}
and used as: dg map array;

Oct 23 2008

KennyTM~ <kennytm gmail.com> writes:

Andrei Alexandrescu wrote:
 Please vote up before the haters take it down, and discuss:
 
 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
 
 
 
 Andrei

I suggest not. There are problems if you adopt Unicode as operators:

======

1) My editor supports Unicode, but my keyboard doesn't. So how do I type ∩
and ∪ for a set«T»?

1.1) What if the library writer forgets to provide an alternative,
ASCII-only name? [This is also a problem of using Unicode in identifiers
in general.]

1.2) Some suggested auto-correction in the IDE. Again what if I used 
notepad/nano/TextEdit to code?



I had suggested once before, but let me put it formally here. If you 
really want to support Unicode operators in source code,

  - Firstly, ditch the ability to replace \xxx with '\xxx' when it 
appears without the quotes (so “char x = \n;” won't compile).
  - Then, replace \xxx with the character represented in source level, so

      Vector3D«real» τ = r × F;

    can be written as

      Vector3D!(real) \&tau; = r \&times; F;

  - You don't need to introduce a separate trigraph.
  - But this suggestion does trigger some people's trigraph-phobia. [Yell no!
Now! :) ]
  - It may make the source code difficult to parse grammatically.
  - It will make the source code difficult to read, just look at the 
number of semicolons in the ASCII encoded version.
  - But at least you can compile your code.
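
The replacement step proposed above can be sketched as a tiny source-level preprocessor. This is only an illustration: expandEntities and its three-entry table are hypothetical, and nothing like it exists in the compiler.

```d
import std.stdio : writeln;
import std.string : indexOf;

/// Replace every `\&name;` whose name is in the table with the character
/// it stands for; anything else is copied through unchanged.
string expandEntities(string src)
{
    // Illustrative subset only; a real table would cover all named entities.
    string[string] table = ["tau": "τ", "times": "×", "pi": "π"];

    string result;
    size_t i = 0;
    while (i < src.length)
    {
        // Look for the two-character introducer `\&`.
        if (src[i] == '\\' && i + 1 < src.length && src[i + 1] == '&')
        {
            auto end = indexOf(src[i .. $], ';');   // relative to i, -1 if absent
            if (end > 2)
            {
                auto name = src[i + 2 .. i + end];
                if (auto p = name in table)
                {
                    result ~= *p;
                    i += end + 1;
                    continue;
                }
            }
        }
        result ~= src[i];
        i++;
    }
    return result;
}

void main()
{
    writeln(expandEntities(`Vector3D!(real) \&tau; = r \&times; F;`));
    // Vector3D!(real) τ = r × F;
}
```

Unknown names are deliberately left alone, so a typo such as `\&tua;` survives into the output where the compiler can complain about it.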

======

2) This is regarding the rejection of « & » to be supported even if the 
emacs module goes official. Of course it turns out it is not, but let's 
think of these scenarios:

2.1) OK it turns out ∩ and ∪ and «T» were just .opIntersect(x) and
.opUnion(x) and !(T) pretty-printed in emacs; the compiler won't
accept these characters anyway. But sometimes I forget and just copy a
portion of this code to nano/geany/whatever and then it stops compiling!

2.2) Well this copy&paste problem has been solved in the IDE level by 
inverting the pretty printing while copying. But now I publish my 
fantastic, pretty-printed D program in a web page/PDF/whatever, and 
people just complain the compiler won't accept it!



I still believe if you're going to transform D code to Unicode visually,
the compiler must accept these visual replacements as well.

May I also take Mathematica as an example. The programming language 
itself uses a heavy load of non-ASCII characters, and the IDE also 
pretty-printed them as nice mathematical formulas, but in the “source 
code” level they are just escape sequences. So on screen you see

    E^(I π) + 1

but in the source code you'll see

    E^(I \[Pi]) + 1

However, if you type in “E^(I π) + 1” in a plain .nb file and open with 
the Get[] function (think of it as “import xx.d”) it can still correctly 
display the result “0”.

======

3) There are over 800 unary or binary operators in Unicode[1]. How are 
you going to opXXX all them? Assume your blog entry doesn't mean the 
simple “!=” ↦ “≠” transformation.




======

4) These are regarding if you are going to support overloading for all 
these 800 operators, how to define:

4.1) [Big problem] Operator precedence? (One person may want ∧ to mean 
the wedge product (so they have higher precedence than + and -) but 
another want it to mean logical AND (so lower than + and -).)

4.2) Associativity? How to determine if an operator is left-associative, 
right-associative or both? (∧ as wedge product is both, while ∧ as a 
power function pow(a,b) is right-assoc.)

4.3) [Minor problem] Commutativity? Or we'll need to write opXXX and 
opXXX_r all the time?
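
Precedence and associativity change results even for built-in operators. The sketch below uses the ^^ power operator that later versions of D gained (it did not exist when this thread was written); it is right-associative, while - is left-associative and ^ remains XOR.

```d
import std.math;    // harmless here; older compilers wanted it for ^^
import std.stdio : writeln;

void main()
{
    writeln(2 ^^ 3 ^^ 2);     // 512: parsed as 2 ^^ (3 ^^ 2)
    writeln((2 ^^ 3) ^^ 2);   // 64
    writeln(10 - 4 - 3);      // 3: parsed as (10 - 4) - 3
    writeln(6 ^ 3);           // 5: ^ is bitwise XOR
}
```

A user-defined ∧ would need the same two decisions made for it, and the two meanings mentioned in 4.1 and 4.2 call for different answers.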



introduce some attributes like

   [Associative, Commutative]
   FuzzyBool operator∧ (FuzzyBool x, FuzzyBool y) { return min(x,y); }



but it's not D. :)

Or predefine the meaning, precedence and associativity for each
operator, so e.g. ∧ always means the wedge product and not logical AND, 
just like now ^ always means XOR and not power function.

Or just require the programmer to always put the parenthesis.




Ref: [1] A rough word count in 
http://www.unicode.org/Public/math/revision-11/MathClass-11.txt. The 
actual number is higher than this.

Oct 23 2008

Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:

KennyTM~ wrote:
 
 1.2) Some suggested auto-correction in the IDE. Again what if I used 
 notepad/nano/TextEdit to code?
 

Then I suggest a change in career... ^^'


-- 
Bruno Medeiros - Software Developer, MSc. in CS/E graduate
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D

Oct 24 2008

"Simen Kjaeraas" <simen.kjaras gmail.com> writes:

On Thu, 23 Oct 2008 00:27:58 +0200, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Please vote up before the haters take it down, and discuss:

 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/


 Andrei

I really like the idea of having more unicode in the language, but I feel  
these should be fairly limited.

There are times I feel that more operators (especially, as has been  
mentioned, opCross and opDotProduct) would be nice to have, but it's just  
sugar, really.

As an example, while I'd enjoy seeing code like this, I'm not sure I'd  
enjoy writing it (Note that I am prone to exaggerations):

int a = ∅; //empty set, same as "= void"
int[] b = [1,2,3,4,5,6];
a = readInt();

if (a ∈ b) // Element of - "in"
{
   float c = 2.00001;
   float d =  readInt();
   writefln(c ≈ ⌈d⌉ ); // Approximately equal, ceil

   myClass c = getInstance();
   if (∃c) // c exists, i.e. "!is null"
   {
     // I thought this should work in D today, using "alias sqrt √;",
     // but it seems the compiler chokes on it. :(
     writefln(√(c.foo));
   }

   ∀element∈b // New foreach syntax!
   {
     element *= ¼;
   }
}
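
For comparison, most of the example can already be written with ASCII names in current D. This is a rough equivalent, not a translation: readInt() is replaced by constants, the 1e-3 tolerance is an arbitrary choice, and the loop doubles each element because multiplying an int by ¼ would need a floating-point array.

```d
import std.algorithm.searching : canFind;
import std.math : ceil, isClose, sqrt;
import std.stdio : writeln;

void main()
{
    int[] b = [1, 2, 3, 4, 5, 6];
    int a = 4;                      // stands in for readInt()

    if (b.canFind(a))               // a ∈ b
    {
        float c = 2.00001;
        float d = 1.5;              // stands in for readInt()
        writeln(isClose(c, ceil(d), 1e-3));   // c ≈ ⌈d⌉
        writeln(sqrt(16.0));                  // √16

        foreach (ref element; b)    // ∀ element ∈ b
            element *= 2;
    }
    writeln(b);
}
```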

-- 
Simen

Oct 23 2008

"Bill Baxter" <wbaxter gmail.com> writes:

On Fri, Oct 24, 2008 at 5:48 AM, Simen Kjaeraas <simen.kjaras gmail.com> wrote:

    writefln(√(c.foo)); // I thought this should work in D today, using
 "alias sqrt √;", but it seems the compiler chokes on it. :(

According to the spec, you can only use "UniversalAlpha" Unicode
characters in your identifiers.  Supposedly those are defined in
ISO/IEC 9899:1999(E) Appendix D.  But I'm guessing the ISO did not
define square-root-symbol as an alpha character.

--bb

Oct 23 2008

"Simen Kjaeraas" <simen.kjaras gmail.com> writes:

On Thu, 23 Oct 2008 23:47:59 +0200, Bill Baxter <wbaxter gmail.com> wrote:

 On Fri, Oct 24, 2008 at 5:48 AM, Simen Kjaeraas <simen.kjaras gmail.com>  
 wrote:

    writefln(√(c.foo)); // I thought this should work in D today, using
 "alias sqrt √;", but it seems the compiler chokes on it. :(

 According to the spec, you can can only use "UniversalAlpha" Unicode
 characters in your identifiers.  Supposedly those are defined in
 ISO/IEC 9899:1999(E) Appendix D.  But I'm guessing the ISO did not
 define square-root-symbol as an alpha character.

 --bb

That seems to make sense indeed.
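
The identifier rule can be tried out directly. A small sketch, assuming a compiler that follows the letter-only rule described above; the names π and квадрат are arbitrary, and the failing alias is left commented out.

```d
import std.stdio : writeln;

// Letters from non-Latin scripts are accepted in identifiers.
enum real π = 3.14159265358979L;

real квадрат(real x) { return x * x; }   // "square" in Russian

void main()
{
    writeln(квадрат(3.0L));   // 9
    writeln(π > 3);           // true
    // alias √ = sqrt;   // reported in this thread as rejected: √ is not a letter
}
```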

-- 
Simen

Oct 23 2008

Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:

Simen Kjaeraas wrote:
 
 As an example, while I'd enjoy seeing code like this, I'm not sure I'd 
 enjoy writing it (Note that I am prone to exaggerations):
 
 int a = ∅; //empty set, same as "= void"
 int[] b = [1,2,3,4,5,6];
 a = readInt();


Hum, interesting example, it actually made me realize that 'null' would
be an ideal candidate for having a Unicode symbol of its own. Does
anyone have suggestions for a possible one? Preferably somewhat
circle-shaped.


-- 
Bruno Medeiros - Software Developer, MSc. in CS/E graduate
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D

Oct 24 2008

"Simen Kjaeraas" <simen.kjaras gmail.com> writes:

On Fri, 24 Oct 2008 18:52:03 +0200, Bruno Medeiros  
<brunodomedeiros+spam com.gmail> wrote:

 Simen Kjaeraas wrote:
  As an example, while I'd enjoy seeing code like this, I'm not sure I'd  
 enjoy writing it (Note that I am prone to exaggerations):
  int a = ∅; //empty set, same as "= void"
 int[] b = [1,2,3,4,5,6];
 a = readInt();


 Hum, interesting example, it actually made me realize that 'null' would  
 be an ideal candidate for having a Unicode symbol of its own. Does
 anyone have suggestions for a possible one? Preferably somewhat  
 circle-shaped.

Well, we Norwegians got the Ø (html entity &Oslash;, Latin-1 character
216) - looks a lot like the empty set symbol.

-- 
Simen

Oct 24 2008

KennyTM~ <kennytm gmail.com> writes:

Bruno Medeiros wrote:
 Simen Kjaeraas wrote:
 As an example, while I'd enjoy seeing code like this, I'm not sure I'd 
 enjoy writing it (Note that I am prone to exaggerations):

 int a = ∅; //empty set, same as "= void"
 int[] b = [1,2,3,4,5,6];
 a = readInt();

 
 
 Hum, interesting example, it actually made me realize that 'null' would 
 be an ideal candidate for having a Unicode symbol of its own. Does
 anyone have suggestions for a possible one? Preferably somewhat 
 circle-shaped.
 
 

   auto Ø = null; // \&Oslash;

I assume you're not serious...

Oct 24 2008

Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:

KennyTM~ wrote:
 Bruno Medeiros wrote:
 Simen Kjaeraas wrote:
 As an example, while I'd enjoy seeing code like this, I'm not sure 
 I'd enjoy writing it (Note that I am prone to exaggerations):

 int a = ∅; //empty set, same as "= void"
 int[] b = [1,2,3,4,5,6];
 a = readInt();


 Hum, interesting example, it actually made me realize that 'null' 
 would be an ideal candidate for having a Unicode symbol of its own.
 Does anyone have suggestions for a possible one? Preferably somewhat 
 circle-shaped.

 
   auto Ø = null; // \&Oslash;
 
 I assume you're not serious...

It's an interesting and effective way to save some typing, and it might 
be even more readable (but with a symbol other than Ø).
But I probably would not use it anyway, since I like to write very 
standardized code, that other people can easily recognize and read.

-- 
Bruno Medeiros - Software Developer, MSc. in CS/E graduate
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D

Oct 26 2008

Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:

Andrei Alexandrescu wrote:
 Please vote up before the haters take it down, and discuss:
 
 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
 
 
 
 Andrei

I'm unsure about this idea.

I don't know if it would be worthwhile, but I would say there are two 
aspects that likely would need to be observed for this to work out 
favorably:

* Having non-Unicode versions of the symbols/keywords available in
Unicode, such that non-Unicode editing and viewing is always possible
as a fallback. This has some important consequences though, such as 
making Unicode-symbol-usage unable to solve the shortage of brackets 
for, for example, the template instantiation syntax (because an 
alternative ASCII notation would still be necessary).

* Having a way to directly input the Unicode symbols in the keyboard. 
One reason is because of typing succinctness, and another, is because I 
find the alternative (have the editor/IDE automatically change an ASCII 
character sequence into a Unicode symbol) to have several disadvantages: 
First is that it doesn't work outside the editors/IDEs configured to do 
so, (which is a bummer, there is actually plenty of code written outside 
that: newsgroups, articles, forums, bug reports, IRC, etc.). Second, I 
personally like that the editor always requires exactly N backspaces to
erase N typed characters[*].

So, does anyone know if it is possible on Windows (I believe in Unix it is)
to configure your keyboard mapping with custom settings? For example, if 
I press AltGr-O, it inputs some Unicode character of my choosing?



[*] As a sidenote, this is also why I don't like having my editor 
configured to insert 4 spaces on TAB-press. Unless, the editor is also 
smart enough to delete the 4 spaces on one backspace/delete and move 4 
spaces on one move cursor operation (arrow key press).

-- 
Bruno Medeiros - Software Developer, MSc. in CS/E graduate
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D

Oct 24 2008

"Simen Kjaeraas" <simen.kjaras gmail.com> writes:

On Fri, 24 Oct 2008 18:28:51 +0200, Bruno Medeiros  
<brunodomedeiros+spam com.gmail> wrote:

 Andrei Alexandrescu wrote:
 Please vote up before the haters take it down, and discuss:
   
 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
    Andrei

 I'm unsure about this idea.

 I don't know if it would be worthwhile, but I would say there are two  
 aspects that likely would need to be observed for this to work out  
 favorably:

 * Having non-unicode versions of the symbols/keywords available in  
 Unicode, such that non-Unicode editing and viewing is always possible
 as a fallback. This has some important consequences though, such as  
 making Unicode-symbol-usage unable to solve the shortage of brackets  
 for, for example, the template instantiation syntax (because an  
 alternative ASCII notation would still be necessary).

 * Having a way to directly input the Unicode symbols in the keyboard.  
 One reason is because of typing succinctness, and another, is because I  
 find the alternative (have the editor/IDE automatically change an ASCII  
 character sequence into a Unicode symbol) to have several disadvantages:  
 First is that it doesn't work outside the editors/IDEs configured to do  
 so, (which is a bummer, there is actually plenty of code written outside  
 that: newsgroups, articles, forums, bug reports, IRC, etc.). Second, I  
 personally like that the editor always require exactly N backspaces to  
 erase N typed characters[*].

 So, anyone knows if it is possible on Windows (I believe in Unix it is)  
 to configure your keyboard mapping with custom settings? For example, if  
 I press AltGr-O, it inputs some Unicode character of my choosing?

I'd guess this oughtta do it:
http://www.microsoft.com/globaldev/tools/msklc.mspx


-- 
Simen

Oct 24 2008

Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:

Simen Kjaeraas wrote:
 On Fri, 24 Oct 2008 18:28:51 +0200, Bruno Medeiros 
 <brunodomedeiros+spam com.gmail> wrote:
 
 Andrei Alexandrescu wrote:
 Please vote up before the haters take it down, and discuss:
 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
    Andrei

 I'm unsure about this idea.

 I don't know if it would be worthwhile, but I would say there are two 
 aspects that likely would need to be observed for this to work out 
 favorably:

 * Having non-Unicode versions of the symbols/keywords available in
 Unicode, such that non-Unicode editing and viewing is always possible
 as a fallback. This has some important consequences though, such as 
 making Unicode-symbol-usage unable to solve the shortage of brackets 
 for, for example, the template instantiation syntax (because an 
 alternative ASCII notation would still be necessary).

 * Having a way to directly input the Unicode symbols in the keyboard. 
 One reason is because of typing succinctness, and another, is because 
 I find the alternative (have the editor/IDE automatically change an 
 ASCII character sequence into a Unicode symbol) to have several 
 disadvantages: First is that it doesn't work outside the editors/IDEs 
 configured to do so, (which is a bummer, there is actually plenty of 
 code written outside that: newsgroups, articles, forums, bug reports, 
 IRC, etc.). Second, I personally like that the editor always requires
 exactly N backspaces to erase N typed characters[*].

 So, does anyone know if it is possible on Windows (I believe in Unix it
 is) to configure your keyboard mapping with custom settings? For 
 example, if I press AltGr-O, it inputs some Unicode character of my 
 choosing?

 
 I'd guess this oughtta do it:
 http://www.microsoft.com/globaldev/tools/msklc.mspx
 
 

Yes, exactly that! I had the impression there was such a program for 
Windows, but couldn't remember the name.

-- 
Bruno Medeiros - Software Developer, MSc. in CS/E graduate
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D

Oct 26 2008

Robert Fraser <fraserofthenight gmail.com> writes:

Simen Kjaeraas wrote:
 So, does anyone know if it is possible on Windows (I believe in Unix it
 is) to configure your keyboard mapping with custom settings? For 
 example, if I press AltGr-O, it inputs some Unicode character of my 
 choosing?

 
 I'd guess this oughtta do it:
 http://www.microsoft.com/globaldev/tools/msklc.mspx

I remember this same question being asked on a Microsoft DL when I was 
working there, and all the answers given were for third-party tools like 
KeyTweak ( http://webpages.charter.net/krumsick/ ) ;-P . Good to know 
there's an MS one.

Oct 26 2008

bearophile <bearophileHUGS lycos.com> writes:

Bruno Medeiros:
* Having non-Unicode versions of the symbols/keywords available in Unicode,
such that non-Unicode editing and viewing is always possible as a fallback.
This has some important consequences though, such as making
Unicode-symbol-usage unable to solve the shortage of brackets for, for example,
the template instantiation syntax (because an alternative ASCII notation would
still be necessary).<

Fortress uses pairs of symbols to denote various sequence literals. Some of

http://a6systems.com/fsharpsheet.pdf

Creates the list:
let lsgen2 = [0 .. 2 .. 8]
Gives:
[0;2;4;6;8]
Note:  0 .. 2 .. 8  roughly corresponds to the Python slice-with-stride syntax 0:8:2, except that the F# upper bound is inclusive

Create the array:
let argen2 = [|0 .. 2 .. 8|]
Gives:
[|0;2;4;6;8|]

Creating a seq (that is lazy):
let s = seq { for i in 0 .. 10 do yield i }  



more functional (as they are useful in Scala too, which is partially functional.

functional-procedural-OOP hybrids almost like D2 will want to become, D2 is so

languages like Haskell are functional all the way), this is an Augmented
Discriminated Union:

type BinTree<'a> =
    | Node of
        BinTree<'a> * 'a *
        BinTree<'a>
    | Leaf
  with member self.Depth() =
        match self with
        | Leaf -> 0
        | Node(l, _, r) -> 1 +
            l.Depth() + r.Depth()


lazy/nonlazy collection generators too, this is the third iteration of my ideas
on this topic (if you think succinctness in (partially) functional languages is
useless, think again. It lets you use certain things instead of falling back
to more procedural idioms):

auto flat = (abs(el) for(row: mat) for(el: row) if (el % 2)); // lazy
auto multi = [c:mulIter(c, i) for(i,c: "abcdef")]; // AA
auto squares = void[x*x for(x: 0..100)]; // set
void[int] squares = [x*x for(x: 0..100)];// set, alternative syntax
auto squares = {x*x for x in xrange(100)}; // set, alternative syntax
auto squares = {| x*x for(x: 0..100) |}; // list?
auto squares = [| x*x for(x: 0..100) |]; // multiset? something else?

Bye,
bearophile

Oct 24 2008

ore-sama <spam here.lot> writes:

Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)

Console is a legacy technology (you even still call it "DOS"), why expect
features from it?

Oct 24 2008

"Bill Baxter" <wbaxter gmail.com> writes:

On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)

 Console is a legacy technology (you even still call it "DOS"), why expect
features from it?

So tell me what the alternative is?  I had trouble with running D
tools from a Cygwin shell.  Can't remember if I tried MSYS or not.

Anyone using a shell for Windows that works and supports UTF-8 properly?

--bb

Oct 24 2008

Sergey Gromov <snake.scaly gmail.com> writes:

Sat, 25 Oct 2008 06:43:19 +0900,
Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)

 Console is a legacy technology (you even still call it "DOS"), why expect
features from it?

 
 So tell me what the alternative is?  I had trouble with running D
 tools from a Cygwin shell.  Can't remember if I tried MSYS or not.
 
 Anyone using a shell for Windows that works and supports UTF-8 properly?

A regular Windows console supports UTF-8 to some extent:

* Change console font to Lucida Console
* issue "chcp 65001"

You can even get more fonts into there with a bit of hackery.

Oct 24 2008

"Bill Baxter" <wbaxter gmail.com> writes:

On Sat, Oct 25, 2008 at 9:15 AM, Sergey Gromov <snake.scaly gmail.com> wrote:
 Sat, 25 Oct 2008 06:43:19 +0900,
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)

 Console is a legacy technology (you even still call it "DOS"), why expect
features from it?

 So tell me what the alternative is?  I had trouble with running D
 tools from a Cygwin shell.  Can't remember if I tried MSYS or not.

 Anyone using a shell for Windows that works and supports UTF-8 properly?

 A regular Windows console supports UTF-8 to some extent:

 * Change console font to Lucida Console
 * issue "chcp 65001"

 You can even get more fonts into there with a bit of hackery.

I did that but "type <filewith-utf8.txt>"  still prints garbage.

--bb

Oct 24 2008

Yigal Chripun <yigal100 gmail.com> writes:

Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 9:15 AM, Sergey Gromov <snake.scaly gmail.com> wrote:
 Sat, 25 Oct 2008 06:43:19 +0900,
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)

 Console is a legacy technology (you even still call it "DOS"), why expect
features from it?

 So tell me what the alternative is?  I had trouble with running D
 tools from a Cygwin shell.  Can't remember if I tried MSYS or not.

 Anyone using a shell for Windows that works and supports UTF-8 properly?

 A regular Windows console supports UTF-8 to some extent:

 * Change console font to Lucida Console
 * issue "chcp 65001"

 You can even get more fonts into there with a bit of hackery.

 
 I did that but "type <filewith-utf8.txt>"  still prints garbage.
 
 --bb

so don't use type. use notepad instead...
notepad <filewith-utf8.txt>
also, MSYS gives you all the linux tools if you really need to be shell
only.
last resort: nothing stops you from implementing your own "cat"
application in D with full Unicode support.

most if not all linux shell tools are separate executables anyway and if
any still do not support unicode it'll be trivial to roll your own
replacements for the bad ones.
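
A minimal sketch of such a "cat" replacement follows. It assumes a present-day Phobos and druntime (std.file.readText, File.rawWrite, core.sys.windows.wincon), not the 2008 libraries; the name catFiles and the "ucat" prefix in the error message are made up for illustration. It copies the bytes through untouched and only checks that they are valid UTF-8; on Windows it switches the console output code page the same way "chcp 65001" does. Whether the glyphs actually show up still depends on the console font.

```d
import std.file : readText;
import std.stdio : stderr, stdout;

/// Writes each file to stdout unchanged. readText throws on invalid UTF-8
/// or on an unreadable file, in which case an exit code of 1 is returned.
int catFiles(string[] paths)
{
    foreach (path; paths)
    {
        try
        {
            // rawWrite bypasses newline translation, so bytes pass through as-is.
            stdout.rawWrite(readText(path));
        }
        catch (Exception e)
        {
            stderr.writeln("ucat: ", path, ": ", e.msg);
            return 1;
        }
    }
    return 0;
}

int main(string[] args)
{
    version (Windows)
    {
        // Assumption: a UTF-8 console code page is wanted (same as "chcp 65001").
        import core.sys.windows.wincon : SetConsoleOutputCP;
        SetConsoleOutputCP(65001);
    }
    return catFiles(args[1 .. $]);
}
```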

Oct 24 2008

Benji Smith <dlanguage benjismith.net> writes:

Yigal Chripun wrote:
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 9:15 AM, Sergey Gromov <snake.scaly gmail.com> wrote:
 Sat, 25 Oct 2008 06:43:19 +0900,
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)

 Console is a legacy technology (you even still call it "DOS"), why expect
features from it?

 So tell me what the alternative is?  I had trouble with running D
 tools from a Cygwin shell.  Can't remember if I tried MSYS or not.

 Anyone using a shell for Windows that works and supports UTF-8 properly?

 A regular Windows console supports UTF-8 to some extent:

 * Change console font to Lucida Console
 * issue "chcp 65001"

 You can even get more fonts into there with a bit of hackery.

 I did that but "type <filewith-utf8.txt>"  still prints garbage.

 --bb

 
 so don't use type. use notepad instead...
 notepad <filewith-utf8.txt>
 also, MSYS gives you all the linux tools if you really need to be shell
 only.
 last resort: nothing stops you from implementing your own "cat"
 application in D with full Unicode support.
 
 most if not all linux shell tools are separate executables anyway and if
 any still do not support unicode it'll be trivial to roll your own
 replacements for the bad ones.

Oh, and one of my favorite tricks in Windows is to install cygwin 
(usually at "C:\cygwin" or whatever their boneheaded installer insists 
on using) and then add the bin path ("C:\cygwin\bin") to the windows PATH.

That way, I can continue using the ordinary windows shell (which I 
prefer, since it doesn't force me to use the nutty directory names that 
the cygwin shell uses), but I can still access all the linux commands.

Calling grep from a windows shell is the bestest!

--benji

Oct 24 2008

"Bill Baxter" <wbaxter gmail.com> writes:

On Sat, Oct 25, 2008 at 10:31 AM, Benji Smith <dlanguage benjismith.net> wrote:
 Yigal Chripun wrote:
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 9:15 AM, Sergey Gromov <snake.scaly gmail.com>
 wrote:
 Sat, 25 Oct 2008 06:43:19 +0900,
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)

 Console is a legacy technology (you even still call it "DOS"), why
 expect features from it?

 So tell me what the alternative is?  I had trouble with running D
 tools from a Cygwin shell.  Can't remember if I tried MSYS or not.

 Anyone using a shell for Windows that works and supports UTF-8
 properly?

 A regular Windows console supports UTF-8 to some extent:

 * Change console font to Lucida Console
 * issue "chcp 65001"

 You can even get more fonts into there with a bit of hackery.

 I did that but "type <filewith-utf8.txt>"  still prints garbage.

 --bb

 so don't use type. use notepad instead...
 notepad <filewith-utf8.txt>
 also, MSYS gives you all the linux tools if you really need to be shell
 only.
 last resort: nothing stops you from implementing your own "cat"
 application in D with full Unicode support.

 most if not all linux shell tools are separate executables anyway and if
 any still do not support unicode it'll be trivial to roll your own
 replacements for the bad ones.

 Oh, and one of my favorite tricks in Windows is to install cygwin (usually
 at "C:\cygwin" or whatever their boneheaded installer insists on using) and
 then add the bin path ("C:\cygwin\bin") to the windows PATH.

 That way, I can continue using the ordinary windows shell (which I prefer,
 since it doesn't force me to use the nutty directory names that the cygwin
 shell uses), but I can still access all the linux commands.

 Calling grep from a windows shell is the bestest!

But that has the same problem.  Cygtools don't understand Windows
paths, so they barf when you say "grep c:\foo.txt".  But the Windows shell
will only autocomplete Windows-style paths.

I've found the gnuwin32 tools to work a little better on that front.

--bb

Oct 24 2008

Benji Smith <dlanguage benjismith.net> writes:

Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 10:31 AM, Benji Smith <dlanguage benjismith.net> wrote:
 Yigal Chripun wrote:
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 9:15 AM, Sergey Gromov <snake.scaly gmail.com>
 wrote:
 Sat, 25 Oct 2008 06:43:19 +0900,
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)

 Console is a legacy technology (you even still call it "DOS"), why
 expect features from it?

 So tell me what the alternative is?  I had trouble with running D
 tools from a Cygwin shell.  Can't remember if I tried MSYS or not.

 Anyone using a shell for Windows that works and supports UTF-8
 properly?

 A regular Windows console supports UTF-8 to some extent:

 * Change console font to Lucida Console
 * issue "chcp 65001"

 You can even get more fonts into there with a bit of hackery.

 I did that but "type <filewith-utf8.txt>"  still prints garbage.

 --bb

 so don't use type. use notepad instead...
 notepad <filewith-utf8.txt>
 also, MSYS gives you all the linux tools if you really need to be shell
 only.
 last resort: nothing stops you from implementing your own "cat"
 application in D with full Unicode support.

 most if not all linux shell tools are separate executables anyway and if
 any still do not support unicode it'll be trivial to roll your own
 replacements for the bad ones.

 Oh, and one of my favorite tricks in Windows is to install cygwin (usually
 at "C:\cygwin" or whatever their boneheaded installer insists on using) and
 then add the bin path ("C:\cygwin\bin") to the windows PATH.

 That way, I can continue using the ordinary windows shell (which I prefer,
 since it doesn't force me to use the nutty directory names that the cygwin
 shell uses), but I can still access all the linux commands.

 Calling grep from a windows shell is the bestest!

 
 But that has the same problem.  Cygtools don't understand Windows
 paths, so they barf when you say "grep c:\foo.txt".  But the Windows shell
 will only autocomplete Windows-style paths.
 
 I've found the gnuwin32 tools to work a little better on that front.
 
 --bb

Wha???

The "grep" tool doesn't read the path. The *shell* interprets the path 
and passes the text to the program. That's how all the gnu tools are 
able to pipe their results from one tool to the other.

Or at least, that's how I assume it works.

Cuz I use grep like every single day. On the "cmd.exe" shell. With 
windows paths.

In fact, just for you, I tested this:

    grep -i "SHAZZAM" "C:\Documents and Settings\benji\Desktop\my filename with spaces.txt"

Worked like a charm.

If the path doesn't have spaces, I have no problem with this:

    grep -i "SHAZZAM" C:\file.txt

I tried it in both "command.com" and in "cmd.exe" and didn't experience 
any problem in either environment.

The key is to never never never use the cygwin shell. It's a piece of 
garbage. But using the executables from the "cygwin\bin" directory 
within the windows shell... Priceless!

--benji

Oct 24 2008

"Bill Baxter" <wbaxter gmail.com> writes:

On Sat, Oct 25, 2008 at 11:39 AM, Benji Smith <dlanguage benjismith.net> wrote:
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 10:31 AM, Benji Smith <dlanguage benjismith.net>
 wrote:
 Yigal Chripun wrote:
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 9:15 AM, Sergey Gromov <snake.scaly gmail.com>
 wrote:
 Sat, 25 Oct 2008 06:43:19 +0900,
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)

 Console is a legacy technology (you even still call it "DOS"), why
 expect features from it?

 So tell me what the alternative is?  I had trouble with running D
 tools from a Cygwin shell.  Can't remember if I tried MSYS or not.

 Anyone using a shell for Windows that works and supports UTF-8
 properly?

 A regular Windows console supports UTF-8 to some extent:

 * Change console font to Lucida Console
 * issue "chcp 65001"

 You can even get more fonts into there with a bit of hackery.

 I did that but "type <filewith-utf8.txt>"  still prints garbage.

 --bb

 so don't use type. use notepad instead...
 notepad <filewith-utf8.txt>
 also, MSYS gives you all the linux tools if you really need to be shell
 only.
 last resort: nothing stops you from implementing your own "cat"
 application in D with full Unicode support.

 most if not all linux shell tools are separate executables anyway and if
 any still do not support unicode it'll be trivial to roll your own
 replacements for the bad ones.

 Oh, and one of my favorite tricks in Windows is to install cygwin
 (usually
 at "C:\cygwin" or whatever their boneheaded installer insists on using)
 and
 then add the bin path ("C:\cygwin\bin") to the windows PATH.

 That way, I can continue using the ordinary windows shell (which I
 prefer,
 since it doesn't force me to use the nutty directory names that the
 cygwin
 shell uses), but I can still access all the linux commands.

 Calling grep from a windows shell is the bestest!

 But that has the same problem.  Cygtools don't understand Windows
 paths, so they barf when you say "grep c:\foo.txt".  But the Windows shell
 will only autocomplete Windows-style paths.

 I've found the gnuwin32 tools to work a little better on that front.

 --bb

 Wha???

 The "grep" tool doesn't read the path. The *shell* interprets the path and
 passes the text to the program. That's how all the gnu tools are able to
 pipe their results from one tool to the other.

 Or at least, that's how I assume it works.

 Cuz I use grep like every single day. On the "cmd.exe" shell. With windows
 paths.

 In fact, just for you, I tested this:

   grep -i "SHAZZAM" "C:\Documents and Settings\benji\Desktop\my filename with spaces.txt"

 Worked like a charm.

 If the path doesn't have spaces, I have no problem with this:

   grep -i "SHAZZAM" C:\file.txt

 I tried it in both "command.com" and in "cmd.exe" and didn't experience any
 problem in either environment.

 The key is to never never never use the cygwin shell. It's a piece of
 garbage. But using the executables from the "cygwin\bin" directory within
 the windows shell... Priceless!

Oh, I didn't realize that.  There is one thing that doesn't work,
which is probably what gave me the impression it was broken -- Windows
paths with wildcards don't work.   Like "grep c:\Windows\*.txt".   But
you're right that it does seem to work for both windows paths, and
local wildcards, just not Windows paths with wildcards.

But that's great.  Thanks for the info.  Actually I used to put
cygwin\bin on my path years ago, but stopped doing it at some point
and switched to gnuwin32.  I was under the impression that it worked
better then, but actually I've had some trouble with gnuwin32
recently.

--bb

Oct 24 2008

Benji Smith <dlanguage benjismith.net> writes:

Bill Baxter wrote:
 Benji Smith wrote:
 The key is to never never never use the cygwin shell. It's a piece of
 garbage. But using the executables from the "cygwin\bin" directory within
 the windows shell... Priceless!

 
 Oh, I didn't realize that.  There is one thing that doesn't work,
 which is probably what gave me the impression it was broken -- Windows
 paths with wildcards don't work.   Like "grep c:\Windows\*.txt".   But
 you're right that it does seem to work for both windows paths, and
 local wildcards, just not Windows paths with wildcards.
 
 But that's great.  Thanks for the info.  Actually I used to put
 cygwin\bin on my path years ago, but stopped doing it at some point
 and switched to gnuwin32.  I was under the impression that it worked
 better then, but actually I've had some trouble with gnuwin32
 recently.

Glad I could be of service!

--benji

Oct 24 2008

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

"Benji Smith" wrote
 Bill Baxter wrote:
 Benji Smith wrote:
 The key is to never never never use the cygwin shell. It's a piece of
 garbage. But using the executables from the "cygwin\bin" directory 
 within
 the windows shell... Priceless!

 Oh, I didn't realize that.  There is one thing that doesn't work,
 which is probably what gave me the impression it was broken -- Windows
 paths with wildcards don't work.   Like "grep c:\Windows\*.txt".   But
 you're right that it does seem to work for both windows paths, and
 local wildcards, just not Windows paths with wildcards.


It's not the paths with wildcards that is the problem.  In this case, it is 
the shell.  Grep is expecting the shell to expand the wildcards, as it does 
on unix.

For example, you can use this old trick if ls suddenly becomes unavailable 
to list all files in the current directory:

echo *

This is all a shell builtin; no executables are run.

If you ran this from a windows shell you get the same error:

grep text /cygdrive/c/Windows/*.txt

The windows shell expects the application to handle wildcard expansion, 
which is why windows command line programs don't always work the same way. 
Every program has to build in wildcard expansion to support it.

-Steve

Oct 24 2008

"Bill Baxter" <wbaxter gmail.com> writes:

On Sat, Oct 25, 2008 at 1:40 PM, Steven Schveighoffer
<schveiguy yahoo.com> wrote:
 "Benji Smith" wrote
 Bill Baxter wrote:
 Benji Smith wrote:
 The key is to never never never use the cygwin shell. It's a piece of
 garbage. But using the executables from the "cygwin\bin" directory
 within
 the windows shell... Priceless!

 Oh, I didn't realize that.  There is one thing that doesn't work,
 which is probably what gave me the impression it was broken -- Windows
 paths with wildcards don't work.   Like "grep c:\Windows\*.txt".   But
 you're right that it does seem to work for both windows paths, and
 local wildcards, just not Windows paths with wildcards.


 It's not the paths with wildcards that is the problem.  In this case, it is
 the shell.  Grep is expecting the shell to expand the wildcards, as it does
 on unix.

Read again.  Particularly this part:

"it does seem to work for both windows paths, **and local wildcards**,
just not Windows paths with wildcards".
(emphasis added)

"grep Foo *.txt"  works just fine.  "grep Foo c:\*.txt"  does not.

--bb

Oct 24 2008

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

"Bill Baxter" wrote
 On Sat, Oct 25, 2008 at 1:40 PM, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:
 "Benji Smith" wrote
 Bill Baxter wrote:
 Benji Smith wrote:
 The key is to never never never use the cygwin shell. It's a piece of
 garbage. But using the executables from the "cygwin\bin" directory
 within
 the windows shell... Priceless!

 Oh, I didn't realize that.  There is one thing that doesn't work,
 which is probably what gave me the impression it was broken -- Windows
 paths with wildcards don't work.   Like "grep c:\Windows\*.txt".   But
 you're right that it does seem to work for both windows paths, and
 local wildcards, just not Windows paths with wildcards.


 It's not the paths with wildcards that is the problem.  In this case, it 
 is
 the shell.  Grep is expecting the shell to expand the wildcards, as it 
 does
 on unix.

 Read again.  Particularly this part:

 "it does seem to work for both windows paths, **and local wildcards**,
 just not Windows paths with wildcards".
 (emphasis added)

 "grep Foo *.txt"  works just fine.  "grep Foo c:\*.txt"  does not.

Then that must be something grep is doing extra.  Or perhaps the Windows 
console selectively expands wildcards?  I have no idea.  It seems weird that 
grep would expand only current-directory wildcards (try grep Foo *, and see 
if it works.  Windows normally only expands *.* to mean 'all files').  But 
in the case of using a cygwin shell, the shell expands all wildcards before 
passing arguments to grep.  That much I do know.  I haven't really had a 
need to use the windows shell in a long time ;)

-Steve

Oct 24 2008

"Bill Baxter" <wbaxter gmail.com> writes:

On Sat, Oct 25, 2008 at 2:09 PM, Steven Schveighoffer
<schveiguy yahoo.com> wrote:
 "Bill Baxter" wrote
 On Sat, Oct 25, 2008 at 1:40 PM, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:
 "Benji Smith" wrote
 Bill Baxter wrote:
 Benji Smith wrote:
 The key is to never never never use the cygwin shell. It's a piece of
 garbage. But using the executables from the "cygwin\bin" directory
 within
 the windows shell... Priceless!

 Oh, I didn't realize that.  There is one thing that doesn't work,
 which is probably what gave me the impression it was broken -- Windows
 paths with wildcards don't work.   Like "grep c:\Windows\*.txt".   But
 you're right that it does seem to work for both windows paths, and
 local wildcards, just not Windows paths with wildcards.


 It's not the paths with wildcards that is the problem.  In this case, it
 is
 the shell.  Grep is expecting the shell to expand the wildcards, as it
 does
 on unix.

 Read again.  Particularly this part:

 "it does seem to work for both windows paths, **and local wildcards**,
 just not Windows paths with wildcards".
 (emphasis added)

 "grep Foo *.txt"  works just fine.  "grep Foo c:\*.txt"  does not.

 Then that must be something grep is doing extra.

Yep, that was what I said.

 Or perhaps the Windows
 console selectively expands wildcards?  I have no idea.

Don't think so.   "echo *" still dutifully prints a "*" to the
console.  Cygwin grep is doing it, probably in an attempt to be more
useful when used from the DOS prompt.

 It seems weird that
 grep would expand only current-directory wildcards (try grep Foo *, and see
 if it works.

Yep that works.

 Windows normally only expands *.* to mean 'all files').

If by that you mean Windows command line programs usually expand *.*, then yeh.

 But in the case of using a cygwin shell, the shell expands all wildcards before
 passing arguments to grep.  That much I do know.  I haven't really had a
 need to use the windows shell in a long time ;)

Yep that's true for Bash.

An easy way to tell the Windows shell does nothing is by compiling and running:

import std.stdio;
void main(string[] args) {  writefln("Args: %s", args); }

And passing it some wildcards.  It never expands anything.  Only thing
it does do is mess with quotes some.  Here's an example:

C:\> args.exe * "C:\Program Files" *.* c:\*
Args: [args,*,C:\Program Files,*.*,c:\*]
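
Since the Windows shell hands the pattern over untouched, a program that wants Unix-like behaviour has to glob for itself. The following is only a sketch of that idea, assuming a current Phobos (std.file.dirEntries with a glob pattern, std.path) rather than the 2008 library; expandArg is a hypothetical helper name. It only handles wildcards in the last path component and keeps the pattern literally when nothing matches, as bash does by default. It is not a claim about how Cygwin's grep does it.

```d
import std.algorithm : canFind, map, sort;
import std.array : array;
import std.file : dirEntries, SpanMode;
import std.path : baseName, dirName;
import std.stdio : writeln;

/// Expands one argument the way a Unix shell would have done beforehand.
/// Arguments without '*' or '?' in their last component are returned as-is.
string[] expandArg(string arg)
{
    immutable pattern = baseName(arg);
    if (!pattern.canFind('*') && !pattern.canFind('?'))
        return [arg];

    // dirName("*.txt") is ".", so bare patterns search the current directory.
    auto matches = dirEntries(dirName(arg), pattern, SpanMode.shallow)
        .map!(e => e.name)
        .array;
    sort(matches);

    // Nothing matched: pass the pattern through unchanged.
    return matches.length ? matches : [arg];
}

void main(string[] args)
{
    string[] expanded;
    foreach (arg; args[1 .. $])
        expanded ~= expandArg(arg);
    writeln("Args: ", expanded);
}
```

Because the directory part is split off with dirName, a full path such as c:\*.txt goes through the same code path as a local *.txt.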

--bb

Oct 24 2008

Benji Smith <dlanguage benjismith.net> writes:

Bill Baxter wrote:
 "it does seem to work for both windows paths, **and local wildcards**,
 just not Windows paths with wildcards".
 (emphasis added)

 "grep Foo *.txt"  works just fine.  "grep Foo c:\*.txt"  does not.

 Then that must be something grep is doing extra.

 
 Yep, that was what I said.
 
 Or perhaps the Windows
 console selectively expands wildcards?  I have no idea.

 
 Don't think so.   "echo *" still dutifully prints a "*" to the
 console.  Cygwin grep is doing it, probably in an attempt to be more
 useful when used from the DOS prompt.
 
 It seems weird that
 grep would expand only current-directory wildcards (try grep Foo *, and see
 if it works.


Interesting.

About 90% of the time, I run grep with the "recursion" flag, so I 
haven't thought about wildcard expansion in ages.

   grep -R "some text" .

I do know that "wc" does wildcard expansion, even with paths, but you 
have to use forward slashes. So, to count lines in D programs from the 
windows shell:

   wc -l /dev/*.d

Unfortunately, there's no "recursion" flag for wc, so I end up doing 
something dumb like this:

   wc -l /dev/*.d
   wc -l /dev/*/*.d
   wc -l /dev/*/*/*.d

Etc.

Hmmmmmm. I really should just compile my own wc. After all, Walter's 
already written the sample code.
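
For what it's worth, a recursive line counter along those lines might look like the sketch below. It is not Walter's sample; it assumes a current Phobos (std.file.dirEntries, File.byLine, std.range.walkLength), and countLines and countTree are names invented here. One difference from "wc -l": a final line without a trailing newline is still counted.

```d
import std.file : dirEntries, SpanMode;
import std.range : walkLength;
import std.stdio : File, writefln;

/// Counts the lines of a single file.
size_t countLines(string path)
{
    return File(path).byLine.walkLength;
}

/// Walks root recursively and sums the line counts of all files
/// whose name matches pattern. Prints one line per file, like wc.
size_t countTree(string root, string pattern = "*.d")
{
    size_t total;
    foreach (entry; dirEntries(root, pattern, SpanMode.depth))
    {
        if (!entry.isFile)
            continue; // skip directories that happen to match the pattern
        immutable n = countLines(entry.name);
        writefln("%8d %s", n, entry.name);
        total += n;
    }
    return total;
}

void main(string[] args)
{
    immutable root = args.length > 1 ? args[1] : ".";
    writefln("%8d total", countTree(root));
}
```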

--benji

Oct 25 2008

"Bill Baxter" <wbaxter gmail.com> writes:

 But that has the same problem.  Cygtools don't understand Windows
 paths, so they barf when you say "grep c:\foo.txt".  But the Windows shell
 will only autocomplete Windows-style paths.

 I've found the gnuwin32 tools to work a little better on that front.

 Wha???

 The "grep" tool doesn't read the path. The *shell* interprets the path and
 passes the text to the program. That's how all the gnu tools are able to
 pipe their results from one tool to the other.

 Or at least, that's how I assume it works.

No, that's how it works with the Bash shell and most Unix shells, but
the Windows console doesn't do that stuff.  It's up to each app to
interpret and expand wildcards like *.txt.  So the cygwin progs must
be explicitly checking to see if they got a * from a stupid DOS
console and doing the glob themselves.  But the implementation is
apparently imperfect since it doesn't work on full DOS paths with
wildcards.

--bb

Oct 24 2008

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

"Benji Smith" wrote
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 10:31 AM, Benji Smith <dlanguage benjismith.net> 
 wrote:
 Yigal Chripun wrote:
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 9:15 AM, Sergey Gromov <snake.scaly gmail.com>
 wrote:
 Sat, 25 Oct 2008 06:43:19 +0900,
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)

 Console is a legacy technology (you even still call it "DOS"), why
 expect features from it?

 So tell me what the alternative is?  I had trouble with running D
 tools from a Cygwin shell.  Can't remember if I tried MSYS or not.

 Anyone using a shell for Windows that works and supports UTF-8
 properly?

 A regular Windows console supports UTF-8 to some extent:

 * Change console font to Lucida Console
 * issue "chcp 65001"

 You can even get more fonts into there with a bit of hackery.

 I did that but "type <filewith-utf8.txt>"  still prints garbage.

 --bb

 so don't use type. use notepad instead...
 notepad <filewith-utf8.txt>
 also, MSYS gives you all the linux tools if you really need to be shell
 only.
 last resort: nothing stops you from implementing your own "cat"
 application in D with full Unicode support.

 most if not all linux shell tools are separate executables anyway and 
 if
 any still do not support unicode it'll be trivial to roll your own
 replacements for the bad ones.

 Oh, and one of my favorite tricks in Windows is to install cygwin 
 (usually
 at "C:\cygwin" or whatever their boneheaded installer insists on using) 
 and
 then add the bin path ("C:\cygwin\bin") to the windows PATH.

 That way, I can continue using the ordinary windows shell (which I 
 prefer,
 since it doesn't force me to use the nutty directory names that the 
 cygwin
 shell uses), but I can still access all the linux commands.

 Calling grep from a windows shell is the bestest!

 But that has the same problem.  Cygtools don't understand Windows
 paths, so they barf when you say "grep c:\foo.txt".  But the Windows shell
 will only autocomplete Windows-style paths.

 I've found the gnuwin32 tools to work a little better on that front.

 --bb

 Wha???

 The "grep" tool doesn't read the path. The *shell* interprets the path and 
 passes the text to the program. That's how all the gnu tools are able to 
 pipe their results from one tool to the other.

 Or at least, that's how I assume it works.

No, grep accepts either input.  The shell does not change paths to windows 
style, that is what cygpath is for.  But it does interpret backslashes, so 
you have to double all those.

So for instance, in a cygwin shell, this works also:

grep -i "SHAZZAM" C:\\Documents\ and\ Settings\\benji\\Desktop\\my\ 
filename\ with\ spaces.txt

The arguments are passed as they are, grep just is smart enough to use 
either one.  Probably many tools are that way, I wouldn't know because I 
usually do the /cygdrive/c/... form.
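
To make the backslash point concrete, here is a minimal sketch of what bash hands to a program. show_arg is a hypothetical helper that just echoes its first argument; nothing here is specific to grep or cygwin.

```shell
#!/usr/bin/env bash
# show_arg is a hypothetical helper: it prints the first argument exactly
# as the program receives it, after bash has processed quoting.
show_arg() {
    printf '%s\n' "$1"
}

# Unquoted: each \\ collapses to a single backslash, and "\ " becomes a
# literal space inside one argument.
show_arg C:\\Documents\ and\ Settings\\benji

# Single-quoted: backslashes and spaces pass through untouched.
show_arg 'C:\Documents and Settings\benji'
```

Both calls print the same Windows-style path, so single quotes are an alternative to doubling every backslash.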

 The key is to never never never use the cygwin shell. It's a piece of 
 garbage. But using the executables from the "cygwin\bin" directory within 
 the windows shell... Priceless!

Without the cygwin shell, you lose all bash features, like for loops, or backticks 
to execute a command and use its output.  The paths are a minor annoyance 
IMO.  Using the cmd.exe shell is ok for simple tasks, but it pales severely 
in comparison to the power of bash.

So piece of garbage it is not.  Something you don't understand how to use 
properly? definitely ;)

-Steve

Oct 24 2008

"Bill Baxter" <wbaxter gmail.com> writes:

On Sat, Oct 25, 2008 at 1:33 PM, Steven Schveighoffer
<schveiguy yahoo.com> wrote:
 "Benji Smith" wrote
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 10:31 AM, Benji Smith <dlanguage benjismith.net>
 wrote:
 Yigal Chripun wrote:
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 9:15 AM, Sergey Gromov <snake.scaly gmail.com>
 wrote:
 Sat, 25 Oct 2008 06:43:19 +0900,
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)

 Console is a legacy technology (you even still call it "DOS"), why
 expect features from it?

 So tell me what the alternative is?  I had trouble with running D
 tools from a Cygwin shell.  Can't remember if I tried MSYS or not.

 Anyone using a shell for Windows that works and supports UTF-8
 properly?

 A regular Windows console supports UTF-8 to some extent:

 * Change console font to Lucida Console
 * issue "chcp 65001"

 You can even get more fonts into there with a bit of hackery.

 I did that but "type <filewith-utf8.txt>"  still prints garbage.

 --bb

 so don't use type. use notepad instead...
 notepad <filewith-utf8.txt>
 also, MSYS gives you all the linux tools if you really need to be shell
 only.
 last resort: nothing stops you from implementing your own "cat"
 application in D with full Unicode support.

 most if not all linux shell tools are separate executables anyway and
 if
 any still do not support unicode it'll be trivial to roll your own
 replacements for the bad ones.

 Oh, and one of my favorite tricks in Windows is to install cygwin
 (usually
 at "C:\cygwin" or whatever their boneheaded installer insists on using)
 and
 then add the bin path ("C:\cygwin\bin") to the windows PATH.

 That way, I can continue using the ordinary windows shell (which I
 prefer,
 since it doesn't force me to use the nutty directory names that the
 cygwin
 shell uses), but I can still access all the linux commands.

 Calling grep from a windows shell is the bestest!

 But that has the same problem.  Cygtools don't understand windows
 paths so barf when you say "grep c:\foo.txt"  But the Windows shell
 will only autocomplete Windows-style paths.

 I've found the gnuwin32 tools to work a little better on that front.

 --bb

 Wha???

 The "grep" tool doesn't read the path. The *shell* interprets the path and
 passes the text to the program. That's how all the gnu tools are able to
 pipe their results from one tool to the other.

 Or at least, that's how I assume it works.

 No, grep accepts either input.  The shell does not change paths to windows
 style, that is what cygpath is for.  But it does interpret backslashes, so
 you have to double all those.

 So for instance, in a cygwin shell, this works also:

 grep -i "SHAZZAM" C:\\Documents\ and\ Settings\\benji\\Desktop\\my\
 filename\ with\ spaces.txt

 The arguments are passed as they are, grep just is smart enough to use
 either one.  Probably many tools are that way, I wouldn't know because I
 usually do the /cygdrive/c/... form.

 The key is to never never never use the cygwin shell. It's a piece of
 garbage. But using the executables from the "cygwin\bin" directory within
 the windows shell... Priceless!

 Without the cygwin shell, you lose all bash features, like for, or backticks
 to execute a command and use its output.  The paths are a minor annoyance
 IMO.  Using the cmd.exe shell is ok for simple tasks, but it pales severely
 in comparison to the power of bash.

 So piece of garbage it is not.  Something you don't understand how to use
 properly? definitely ;)

Yeh, I love the bash shell.  Really the only thing keeping me from
using it for D work is the fact that it won't auto-complete Windows
filenames.

--bb

Oct 24 2008

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

"Bill Baxter" wrote
 On Sat, Oct 25, 2008 at 1:33 PM, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:
 "Benji Smith" wrote
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 10:31 AM, Benji Smith 
 <dlanguage benjismith.net>
 wrote:
 Yigal Chripun wrote:
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 9:15 AM, Sergey Gromov 
 <snake.scaly gmail.com>
 wrote:
 Sat, 25 Oct 2008 06:43:19 +0900,
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)

 Console is a legacy technology (you even still call it "DOS"), 
 why
 expect features from it?

 So tell me what the alternative is?  I had trouble with running D
 tools from a Cygwin shell.  Can't remember if I tried MSYS or not.

 Anyone using a shell for Windows that works and supports UTF-8
 properly?

 A regular Windows console supports UTF-8 to some extent:

 * Change console font to Lucida Console
 * issue "chcp 65001"

 You can even get more fonts into there with a bit of hackery.

 I did that but "type <filewith-utf8.txt>"  still prints garbage.

 --bb

 so don't use type. use notepad instead...
 notepad <filewith-utf8.txt>
 also, MSYS gives you all the linux tools if you really need to be 
 shell
 only.
 last resort: nothing stops you from implementing your own "cat"
 application in D with full Unicode support.

 most if not all linux shell tools are separate executables anyway and
 if
 any still do not support unicode it'll be trivial to roll your own
 replacements for the bad ones.

 Oh, and one of my favorite tricks in Windows is to install cygwin
 (usually
 at "C:\cygwin" or whatever their boneheaded installer insists on 
 using)
 and
 then add the bin path ("C:\cygwin\bin") to the windows PATH.

 That way, I can continue using the ordinary windows shell (which I
 prefer,
 since it doesn't force me to use the nutty directory names that the
 cygwin
 shell uses), but I can still access all the linux commands.

 Calling grep from a windows shell is the bestest!

 But that has the same problem.  Cygtools don't understand windows
 paths so barf when you say "grep c:\foo.txt"  But the Windows shell
 will only autocomplete Windows-style paths.

 I've found the gnuwin32 tools to work a little better on that front.

 --bb

 Wha???

 The "grep" tool doesn't read the path. The *shell* interprets the path 
 and
 passes the text to the program. That's how all the gnu tools are able to
 pipe their results from one tool to the other.

 Or at least, that's how I assume it works.

 No, grep accepts either input.  The shell does not change paths to 
 windows
 style, that is what cygpath is for.  But it does interpret backslashes, 
 so
 you have to double all those.

 So for instance, in a cygwin shell, this works also:

 grep -i "SHAZZAM" C:\\Documents\ and\ Settings\\benji\\Desktop\\my\
 filename\ with\ spaces.txt

 The arguments are passed as they are, grep just is smart enough to use
 either one.  Probably many tools are that way, I wouldn't know because I
 usually do the /cygdrive/c/... form.

 The key is to never never never use the cygwin shell. It's a piece of
 garbage. But using the executables from the "cygwin\bin" directory 
 within
 the windows shell... Priceless!

 Without the cygwin shell, you lose all bash features, like for, or 
 backticks
 to execute a command and use its output.  The paths are a minor 
 annoyance
 IMO.  Using the cmd.exe shell is ok for simple tasks, but it pales 
 severely
 in comparison to the power of bash.

 So piece of garbage it is not.  Something you don't understand how to use
 properly? definitely ;)

 Yeh, I love the bash shell.  Really the only thing keeping me from
 using it for D work is the fact that it won't auto-complete Windows
 filenames.

It's ugly, but it can be aliased or scripted; look into cygpath:

cygpath -w /cygdrive/c/filename.txt
outputs:

C:\filename.txt

so you can use dmd combined with cygpath:

dmd `cygpath -w /cygdrive/c/path/to/d/files/*.d`

It wouldn't take much to write a bash script to do this for you...
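
A minimal sketch of such a script follows. The names to_win_path and dmdw are hypothetical, and to_win_path is a pure-bash stand-in for `cygpath -w` that only handles /cygdrive/<letter>/... paths; in a real cygwin shell you would probably call cygpath itself instead.

```shell
#!/usr/bin/env bash
# to_win_path (hypothetical): convert /cygdrive/<letter>/... to <LETTER>:\...
# Anything else (flags, relative paths) is returned unchanged.
to_win_path() {
    local p=$1 drive rest
    if [[ $p =~ ^/cygdrive/([A-Za-z])(/.*)?$ ]]; then
        # Upper-case the drive letter, as cygpath -w does in the example above.
        drive=$(printf '%s' "${BASH_REMATCH[1]}" | tr 'a-z' 'A-Z')
        # Turn every forward slash in the remainder into a backslash.
        rest=${BASH_REMATCH[2]//\//\\}
        printf '%s:%s\n' "$drive" "${rest:-\\}"
    else
        printf '%s\n' "$p"
    fi
}

# dmdw (hypothetical): run dmd with every argument passed through to_win_path.
dmdw() {
    local args=() a
    for a in "$@"; do
        args+=("$(to_win_path "$a")")
    done
    dmd "${args[@]}"
}
```

With this sourced, something like `dmdw /cygdrive/c/path/to/d/files/*.d` lets bash complete and expand the cygwin-style paths while dmd only ever sees Windows-style ones.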

-Steve

Oct 24 2008

Benji Smith <dlanguage benjismith.net> writes:

Steven Schveighoffer wrote:
 So piece of garbage it is not.  Something you don't understand how to use 
 properly? definitely ;)

Definitely!

I hope you'll agree that hyperbole is the best thing in the world :)

--benji

Oct 25 2008

"Bill Baxter" <wbaxter gmail.com> writes:

On Sat, Oct 25, 2008 at 10:23 AM, Yigal Chripun <yigal100 gmail.com> wrote:
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 9:15 AM, Sergey Gromov <snake.scaly gmail.com> wrote:
 Sat, 25 Oct 2008 06:43:19 +0900,
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)

 Console is a legacy technology (you even still call it "DOS"), why expect
 features from it?

 So tell me what the alternative is?  I had trouble with running D
 tools from a Cygwin shell.  Can't remember if I tried MSYS or not.

 Anyone using a shell for Windows that works and supports UTF-8 properly?

 A regular Windows console supports UTF-8 to some extent:

 * Change console font to Lucida Console
 * issue "chcp 65001"

 You can even get more fonts into there with a bit of hackery.

 I did that but "type <filewith-utf8.txt>"  still prints garbage.

 --bb

 so don't use type. use notepad instead...
 notepad <filewith-utf8.txt>

Ok what about grep and sort and uniq then?  Can notepad do that?
I have all these tools that work fine in my DOS shell.  I never use
"type".  It was simply meant as the most basic possible tool -- as in
if "type" doesn't work nothing will.

 also, MSYS gives you all the linux tools if you really need to be shell
 only.

I think part of the problem I had with Cygwin shell was that it can't
auto-complete dos filenames, but D programs on Windows can't accept
Cygwin paths.  So it was a pain to work with command-line tools (like
DMD itself) that take filenames.   So I don't think MSYS helps there
either.

 last resort: nothing stops you from implementing your own "cat"
 application in D with full Unicode support.

 most if not all linux shell tools are separate executables anyway and if
 any still do not support unicode it'll be trivial to roll your own
 replacements for the bad ones.

Oct 24 2008

Benji Smith <dlanguage benjismith.net> writes:

Bill Baxter wrote:
 Anyone using a shell for Windows that works and supports UTF-8 properly?

 A regular Windows console supports UTF-8 to some extent:

 * Change console font to Lucida Console
 * issue "chcp 65001"

 You can even get more fonts into there with a bit of hackery.

 
 I did that but "type <filewith-utf8.txt>"  still prints garbage.

That's weird. My machine (WinXP SP3) has no problem printing UTF-8 to 
the console. The only special thing I did was change the font to Lucida 
Console.

--benji

Oct 24 2008

"Bill Baxter" <wbaxter gmail.com> writes:

On Sat, Oct 25, 2008 at 10:23 AM, Benji Smith <dlanguage benjismith.net> wrote:
 Bill Baxter wrote:
 Anyone using a shell for Windows that works and supports UTF-8 properly?

 A regular Windows console supports UTF-8 to some extent:

 * Change console font to Lucida Console
 * issue "chcp 65001"

 You can even get more fonts into there with a bit of hackery.

 I did that but "type <filewith-utf8.txt>"  still prints garbage.

 That's weird. My machine (WinXp Sp3) has no problem printing UTF-8 to the
 console. The only special thing I did was change the font to Lucida
 Console.

Ok.  Thanks for the info.  Knowing that it has actually worked for at
least one person gives me motivation to try again.

--bb

Oct 24 2008

Benji Smith <dlanguage benjismith.net> writes:

Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 10:23 AM, Benji Smith <dlanguage benjismith.net> wrote:
 Bill Baxter wrote:
 Anyone using a shell for Windows that works and supports UTF-8 properly?

 A regular Windows console supports UTF-8 to some extent:

 * Change console font to Lucida Console
 * issue "chcp 65001"

 You can even get more fonts into there with a bit of hackery.

 I did that but "type <filewith-utf8.txt>"  still prints garbage.

 That's weird. My machine (WinXp Sp3) has no problem printing UTF-8 to the
 console. The only special thing I did was change the font to Lucida
 Console.

 
 Ok.  Thanks for the info.  Knowing that it has actually worked for at
 least one person gives me motivation to try again.
 
 --bb

Write a tiny little D program and see what you get on the console:

    import tango.io.Stdout;
    void main() {
       Stdout("spade, club, heart, diamond: \u2660\u2663\u2665\u2666");
    }

I don't know anything about the "type" command, and whether it supports 
UTF-8. But the console itself ought to be able to handle it. Try 
compiling the above code and see what happens.
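
If you want to rule out the program itself, here is a rough shell sketch that emits the same four characters as raw UTF-8 bytes and dumps them in hex. The function names suits and suits_hex are made up for illustration; the byte values are just the UTF-8 encodings of U+2660, U+2663, U+2665 and U+2666.

```shell
#!/usr/bin/env bash
# suits (hypothetical): write spade, club, heart, diamond as raw UTF-8 bytes,
# followed by a newline.
suits() {
    printf '\xe2\x99\xa0\xe2\x99\xa3\xe2\x99\xa5\xe2\x99\xa6\n'
}

# suits_hex (hypothetical): the same bytes as one hex string, so they can be
# compared against whatever another program writes to stdout.
suits_hex() {
    suits | od -An -tx1 | tr -d ' \n'
}
```

If a program's redirected output matches these bytes but the console still shows garbage, the problem is in the console setup (font or codepage) rather than in the program.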

--benji

Oct 24 2008

"Bill Baxter" <wbaxter gmail.com> writes:

On Sat, Oct 25, 2008 at 10:37 AM, Benji Smith <dlanguage benjismith.net> wrote:
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 10:23 AM, Benji Smith <dlanguage benjismith.net>
 wrote:
 Bill Baxter wrote:
 Anyone using a shell for Windows that works and supports UTF-8
 properly?

 A regular Windows console supports UTF-8 to some extent:

 * Change console font to Lucida Console
 * issue "chcp 65001"

 You can even get more fonts into there with a bit of hackery.

 I did that but "type <filewith-utf8.txt>"  still prints garbage.

 That's weird. My machine (WinXp Sp3) has no problem printing UTF-8 to the
 console. The only special thing I did was change the font to Lucida
 Console.

 Ok.  Thanks for the info.  Knowing that it has actually worked for at
 least one person gives me motivation to try again.

 --bb

 Write a tiny little D program and see what you get on the console:

   import tango.io.Stdout;
   void main() {
      Stdout("spade, club, heart, diamond: \u2660\u2663\u2665\u2666");
   }

 I don't know anything about the "type" command, and whether it supports
 UTF-8. But the console itself ought to be able to handle it. Try compiling
 the above code and see what happens.

 --benji

Ah, I see.  I guess what I really want to know is: if I had UTF-8 source
code and the D compiler spit out a message about one of the lines,
would that error message come out as garbage?  Same for ddbg -- if I'm
debugging and say "ps" for "print source", will the result be garbage?
I was thinking that "type" would be a simple test of whether that sort of
thing would work.

But maybe type is just borked.  I did try "cat" and "more" too I
think, with same result, though.

--bb

Oct 24 2008

Yigal Chripun <yigal100 gmail.com> writes:

Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 10:37 AM, Benji Smith <dlanguage benjismith.net> wrote:
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 10:23 AM, Benji Smith <dlanguage benjismith.net>
 wrote:
 Bill Baxter wrote:
 Anyone using a shell for Windows that works and supports UTF-8
 properly?

 A regular Windows console supports UTF-8 to some extent:

 * Change console font to Lucida Console
 * issue "chcp 65001"

 You can even get more fonts into there with a bit of hackery.

 I did that but "type <filewith-utf8.txt>"  still prints garbage.

 That's weird. My machine (WinXp Sp3) has no problem printing UTF-8 to the
 console. The only special thing I did was change the font to Lucida
 Console.

 Ok.  Thanks for the info.  Knowing that it has actually worked for at
 least one person gives me motivation to try again.

 --bb

 Write a tiny little D program and see what you get on the console:

   import tango.io.Stdout;
   void main() {
      Stdout("spade, club, heart, diamond: \u2660\u2663\u2665\u2666");
   }

 I don't know anything about the "type" command, and whether it supports
 UTF-8. But the console itself ought to be able to handle it. Try compiling
 the above code and see what happens.

 --benji

 
 Ah, I see.  I guess more what I want to know is if I had utf-8 source
 code and the D compiler spit out a message about one of the lines,
 would that error message come out as garbage?  Same for ddbg -- if I'm
 debugging and say "ps" for "print source" will the result be garbage.
   I was thinking that "type" would be a simple test if that sort of
 thing would work.
 
 But maybe type is just borked.  I did try "cat" and "more" too I
 think, with same result, though.
 
 --bb

MSYS does autocomplete. It's not perfect, but it works. The path will
look Unix-like, though, i.e.
/c/program files/...

From what I know (WinXP SP2), the console works for Unicode except for RTL
languages like Hebrew. As someone else already noted, this is legacy
tech which you shouldn't be using anyway. I don't know if it's fixed in

there are also other 3rd party stuff as well..

Oct 24 2008

"Bill Baxter" <wbaxter gmail.com> writes:

On Sat, Oct 25, 2008 at 11:53 AM, Yigal Chripun <yigal100 gmail.com> wrote:
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 10:37 AM, Benji Smith <dlanguage benjismith.net> wrote:
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 10:23 AM, Benji Smith <dlanguage benjismith.net>
 wrote:
 Bill Baxter wrote:
 Anyone using a shell for Windows that works and supports UTF-8
 properly?

 A regular Windows console supports UTF-8 to some extent:

 * Change console font to Lucida Console
 * issue "chcp 65001"

 You can even get more fonts into there with a bit of hackery.

 I did that but "type <filewith-utf8.txt>"  still prints garbage.

 That's weird. My machine (WinXp Sp3) has no problem printing UTF-8 to the
 console. The only special thing I did was change the font to Lucida
 Console.

 Ok.  Thanks for the info.  Knowing that it has actually worked for at
 least one person gives me motivation to try again.

 --bb

 Write a tiny little D program and see what you get on the console:

   import tango.io.Stdout;
   void main() {
      Stdout("spade, club, heart, diamond: \u2660\u2663\u2665\u2666");
   }

 I don't know anything about the "type" command, and whether it supports
 UTF-8. But the console itself ought to be able to handle it. Try compiling
 the above code and see what happens.

 --benji

 Ah, I see.  I guess more what I want to know is if I had utf-8 source
 code and the D compiler spit out a message about one of the lines,
 would that error message come out as garbage?  Same for ddbg -- if I'm
 debugging and say "ps" for "print source" will the result be garbage.
   I was thinking that "type" would be a simple test if that sort of
 thing would work.

 But maybe type is just borked.  I did try "cat" and "more" too I
 think, with same result, though.

 --bb

 Msys does autocomplete. it's not perfect but it works. the path will
 look unix like though.. i.e.
 /c/program files/...

Right that's what Cygwin does too, and it's useless if I want to call
the DMD compiler.

     dmd foo.d /c/libs/mydlib.lib

"Error:  what do you think this is, Linux?"


 from what I know (winXP sp 2) - console works for unicode Except for RTL
 languages like Hebrew. as someone else already noted, this is legacy
 tech which you shouldn't be using anyway. I don't know if it's fixed in

 there are also other 3rd party stuff as well..

Yeh, I've heard of that.  Do you (or anyone) have any actual
experience with PowerShell?  It doesn't seem to be standard equipment
on my new Vista box even.  Does it require a separate download?
Strange if it really is supposed to be "the new way".

--bb

Oct 24 2008

Robert Fraser <fraserofthenight gmail.com> writes:

Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 11:53 AM, Yigal Chripun <yigal100 gmail.com> wrote:
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 10:37 AM, Benji Smith <dlanguage benjismith.net> wrote:
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 10:23 AM, Benji Smith <dlanguage benjismith.net>
 wrote:
 Bill Baxter wrote:
 Anyone using a shell for Windows that works and supports UTF-8
 properly?

 A regular Windows console supports UTF-8 to some extent:

 * Change console font to Lucida Console
 * issue "chcp 65001"

 You can even get more fonts into there with a bit of hackery.

 I did that but "type <filewith-utf8.txt>"  still prints garbage.

 That's weird. My machine (WinXp Sp3) has no problem printing UTF-8 to the
 console. The only special thing I did was change the font to Lucida
 Console.

 Ok.  Thanks for the info.  Knowing that it has actually worked for at
 least one person gives me motivation to try again.

 --bb

 Write a tiny little D program and see what you get on the console:

   import tango.io.Stdout;
   void main() {
      Stdout("spade, club, heart, diamond: \u2660\u2663\u2665\u2666");
   }

 I don't know anything about the "type" command, and whether it supports
 UTF-8. But the console itself ought to be able to handle it. Try compiling
 the above code and see what happens.

 --benji

 Ah, I see.  I guess more what I want to know is if I had utf-8 source
 code and the D compiler spit out a message about one of the lines,
 would that error message come out as garbage?  Same for ddbg -- if I'm
 debugging and say "ps" for "print source" will the result be garbage.
   I was thinking that "type" would be a simple test if that sort of
 thing would work.

 But maybe type is just borked.  I did try "cat" and "more" too I
 think, with same result, though.

 --bb

 Msys does autocomplete. it's not perfect but it works. the path will
 look unix like though.. i.e.
 /c/program files/...

 
 Right that's what Cygwin does too, and it's useless if I want to call
 the DMD compiler.
 
      dmd foo.d /c/libs/mydlib.lib
 
 "Error:  what do you think this is, Linux?"
 
 
 from what I know (winXP sp 2) - console works for unicode Except for RTL
 languages like Hebrew. as someone else already noted, this is legacy
 tech which you shouldn't be using anyway. I don't know if it's fixed in

 there are also other 3rd party stuff as well..

 
 Yeh, i've heard of that.  Do you (or anyone) have any actual
 experience with PowerShell?  It doesn't seem to be standard equipment
 on my new Vista box even.  Does it require a separate download?
 Strange if it really is supposed to be "the new way".
 
 --bb

PowerShell is MS's concession that there are things better done in a 
console environment, especially for developers & power users. And, yes, 
it works very well (I'm a fan...). It also contains aliases for many 
common Unix commands (e.g. ls works like dir).

It doesn't come as the default on most OSes simply because Microsoft 
doesn't expect the average home user to need it. It does come default on 
Windows Server 2008, because Microsoft expects it to be a useful utility 
to server admins.

Oct 25 2008

Sergey Gromov <snake.scaly gmail.com> writes:

Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 10:37 AM, Benji Smith <dlanguage benjismith.net> wrote:
 Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 10:23 AM, Benji Smith <dlanguage benjismith.net>
 wrote:
 Bill Baxter wrote:
 Anyone using a shell for Windows that works and supports UTF-8
 properly?

 A regular Windows console supports UTF-8 to some extent:

 * Change console font to Lucida Console
 * issue "chcp 65001"

 You can even get more fonts into there with a bit of hackery.

 I did that but "type <filewith-utf8.txt>"  still prints garbage.

 That's weird. My machine (WinXp Sp3) has no problem printing UTF-8 to the
 console. The only special thing I did was change the font to Lucida
 Console.

 Ok.  Thanks for the info.  Knowing that it has actually worked for at
 least one person gives me motivation to try again.

 --bb

 Write a tiny little D program and see what you get on the console:

   import tango.io.Stdout;
   void main() {
      Stdout("spade, club, heart, diamond: \u2660\u2663\u2665\u2666");
   }

 I don't know anything about the "type" command, and whether it supports
 UTF-8. But the console itself ought to be able to handle it. Try compiling
 the above code and see what happens.

 --benji

 
 Ah, I see.  I guess more what I want to know is if I had utf-8 source
 code and the D compiler spit out a message about one of the lines,
 would that error message come out as garbage?  Same for ddbg -- if I'm
 debugging and say "ps" for "print source" will the result be garbage.
   I was thinking that "type" would be a simple test if that sort of
 thing would work.
 
 But maybe type is just borked.  I did try "cat" and "more" too I
 think, with same result, though.

They all work for me: type, cat, less.  The file is UTF-8 with BOM. 
Error messages are printed correctly displaying all the characters in a 
buggy symbol.
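
Since the BOM may be what makes the difference, here is a small sketch for checking whether a file starts with one. has_utf8_bom is a hypothetical helper name; it only compares the first three bytes against EF BB BF and assumes head and od are available (cygwin, MSYS or gnuwin32).

```shell
#!/usr/bin/env bash
# has_utf8_bom (hypothetical): succeed if the file given as $1 begins with
# the UTF-8 byte order mark EF BB BF, fail otherwise.
has_utf8_bom() {
    [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}
```

Usage would be along the lines of `has_utf8_bom foo.d && echo "has BOM"`.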

But now I remember.  It fails to execute any batch files when it's in the 
65001 codepage.  More precisely, it executes exactly one line from a 
batch file, as if there were no more lines.  So this pseudo-Unicode 
mode is useless.

Oct 27 2008

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

"Bill Baxter" wrote
 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)

 Console is a legacy technology (you even still call it "DOS"), why expect 
 features from it?

 So tell me what the alternative is?  I had trouble with running D
 tools from a Cygwin shell.  Can't remember if I tried MSYS or not.

Any text-based program uses the same Windows console (unless it's a GUI 
application, and it uses controls to create a text box, etc).  Including 
cygwin shell.

To say it's a legacy technology is like saying Linux is a legacy technology 
because it's command line based.  It's a false experience promoted by 
Microsoft to try and spread FUD about OSes that mainly support command line 
tools, like Linux.  But command line tools are extremely useful and 
powerful, much easier to develop, and IMO easier to use.  For instance, if 
you want to find all files that contain a certain text, grep -R text / and 
you're done.  On windows it's 'click the start menu, select search, wait for 
the search window to pop up, click on the dog, etc'.  Freaking annoying if 
you ask me ;)


 Anyone using a shell for Windows that works and supports UTF-8 properly?

I would guess it should work properly, most everything in windows supports 
unicode.  Perhaps you have some configuration setting not set properly?  I'd 
suggest searching msdn.

-Steve

Oct 24 2008

Yigal Chripun <yigal100 gmail.com> writes:

Steven Schveighoffer wrote:
 "Bill Baxter" wrote
 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)

 Console is a legacy technology (you even still call it "DOS"), why expect 
 features from it?

 So tell me what the alternative is?  I had trouble with running D
 tools from a Cygwin shell.  Can't remember if I tried MSYS or not.

 
 Any text-based program uses the same Windows console (unless it's a GUI 
 application, and it uses controls to create a text box, etc).  Including 
 cygwin shell.
 
 To say it's a legacy technology is like saying Linux is a legacy technology 
 because it's command line based.  It's a false experience promoted by 
 Microsoft to try and spread FUD about OSes that mainly support command line 
 tools, like Linux.  But command line tools are extremely useful and 
 powerful, much easier to develop, and IMO easier to use.  For instance, if 
 you want to find all files that contain a certain text, grep -R text / and 
 you're done.  On windows it's 'click the start menu, select search, wait for 
 the search window to pop up, click on the dog, etc'.  Freaking annoying if 
 you ask me ;)
 
 
 Anyone using a shell for Windows that works and supports UTF-8 properly?

 
 I would guess it should work properly, most everything in windows supports 
 unicode.  Perhaps you have some configuration setting not set properly?  I'd 
 suggest searching msdn.
 
 -Steve 
 
 

windows console AKA DOS Box *is* in fact legacy technology. It is

ideas from Linux and incorporated in it.

Also, it doesn't have to be an either/or situation regarding CLI vs GUI.
There's Quicksilver on the Mac (IIRC the name), which is a GUI app with a
CLI-like interface. It has the best of both worlds. PowerShell is GUI
based as well. IMO, CLI should be provided as just a widget in the GUI
world and not a separate entity.

Oct 25 2008

"Bill Baxter" <wbaxter gmail.com> writes:

On Sat, Oct 25, 2008 at 8:57 PM, Yigal Chripun <yigal100 gmail.com> wrote:
 Steven Schveighoffer wrote:
 "Bill Baxter" wrote
 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)

 Console is a legacy technology (you even still call it "DOS"), why expect
 features from it?

 So tell me what the alternative is?  I had trouble with running D
 tools from a Cygwin shell.  Can't remember if I tried MSYS or not.

 Any text-based program uses the same Windows console (unless it's a GUI
 application, and it uses controls to create a text box, etc).  Including
 cygwin shell.

 To say it's a legacy technology is like saying Linux is a legacy technology
 because it's command line based.  It's a false experience promoted by
 Microsoft to try and spread FUD about OSes that mainly support command line
 tools, like Linux.  But command line tools are extremely useful and
 powerful, much easier to develop, and IMO easier to use.  For instance, if
 you want to find all files that contain a certain text, grep -R text / and
 you're done.  On windows it's 'click the start menu, select search, wait for
 the search window to pop up, click on the dog, etc'.  Freaking annoying if
 you ask me ;)


 Anyone using a shell for Windows that works and supports UTF-8 properly?

 I would guess it should work properly, most everything in windows supports
 unicode.  Perhaps you have some configuration setting not set properly?  I'd
 suggest searching msdn.

 -Steve


 PowerShell is GUI based as well.

After downloading it and giving it a try, I find this claim somewhat
suspect.  What makes you say it's GUI based?  It has the exact same
decorations and goofy menu options as a regular non-GUI Windows
console.  If it were really a GUI, I doubt they would go through the
extra programming effort required to make it look *exactly* like a
console app.

--bb

Oct 25 2008

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

"Bill Baxter" wrote
 On Sat, Oct 25, 2008 at 8:57 PM, Yigal Chripun <yigal100 gmail.com> wrote:
 Steven Schveighoffer wrote:
 "Bill Baxter" wrote
 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)

 Console is a legacy technology (you even still call it "DOS"), why 
 expect
 features from it?

 So tell me what the alternative is?  I had trouble with running D
 tools from a Cygwin shell.  Can't remember if I tried MSYS or not.

 Any text-based program uses the same Windows console (unless it's a GUI
 application, and it uses controls to create a text box, etc).  Including
 cygwin shell.

 To say it's a legacy technology is like saying Linux is a legacy technology
 because it's command line based.  It's a false experience promoted by
 Microsoft to try and spread FUD about OSes that mainly support command line
 tools, like Linux.  But command line tools are extremely useful and
 powerful, much easier to develop, and IMO easier to use.  For instance, if
 you want to find all files that contain a certain text, grep -R text / and
 you're done.  On windows it's 'click the start menu, select search, wait for
 the search window to pop up, click on the dog, etc'.  Freaking annoying if
 you ask me ;)


 Anyone using a shell for Windows that works and supports UTF-8 properly?

 I would guess it should work properly, most everything in windows supports
 unicode.  Perhaps you have some configuration setting not set properly?  I'd
 suggest searching msdn.

 -Steve


 PowerShell is GUI based as well.

 After downloading it and giving it a try, I find this claim somewhat
 suspect.  What makes you say it's GUI based?  It has the exact same
 decorations and goofy menu options as a regular non-GUI Windows
 console.  If it were really a GUI, I doubt they would go through the
 extra programming effort required to make it look *exactly* like a
 console app.

I've never used powershell, but most likely you are correct.  I think there 
is a confusion of terms here.

Windows Console is the GUI that comes up with the black window, and displays 
text.  It serves as a terminal, not a shell.  This is not 'old' technology, 
it's just an integral piece of the OS.

cmd.exe is the command interpreter, which is definitely crappy technology 
(and somewhat old).

The responsible party for displaying UTF properly is the console, not the 
shell.

-Steve

Oct 25 2008

ore-sama <spam here.lot> writes:

Steven Schveighoffer Wrote:

 The responsible party for displaying UTF properly is the console, not the 
 shell.
 

One important feature of legacy technology is that it must not change, for
compatibility with legacy code. stdout is just an opaque pipe with no means
to specify the text encoding, and legacy applications write OCP-encoded (OEM
code page) text to it. That's why the console expects OCP output; breaking
this convention would break legacy applications, piping, etc. BTW, cmd.exe
can in fact produce UTF-16 output.
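
To make the point above concrete, here is a minimal sketch (in Python, not D) of what happens when UTF-8 bytes go down a byte pipe to a console that decodes them with an OEM code page. The function name is made up for illustration, and cp437 is only an assumed code page; the real one depends on the Windows locale.

```python
def as_seen_by_console(text, written_as="utf-8", console_expects="cp437"):
    """Encode text the way the program writes it, then decode it the way
    the console would. cp437 is an assumed OEM code page, not a given."""
    raw = text.encode(written_as)  # the bytes that actually travel through stdout
    return raw.decode(console_expects, errors="replace")


sample = "a × b"
garbled = as_seen_by_console(sample)
# ascii() keeps this printable even on a terminal that cannot show the result.
print(ascii(sample), "->", ascii(garbled))
```

The pipe carries no encoding information, so the two-byte UTF-8 sequence for the multiplication sign is shown as two unrelated code page characters, while plain ASCII passes through unchanged.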

Oct 26 2008

Robert Fraser <fraserofthenight gmail.com> writes:

Bill Baxter wrote:
 Yigal Chripun wrote:
 PowerShell is GUI based as well.

 
 After downloading it and giving it a try, I find this claim somewhat
 suspect.  What makes you say it's GUI based?  It has the exact same
 decorations and goofy menu options as a regular non-GUI Windows
 console.  If it were really a GUI, I doubt they would go through the
 extra programming effort required to make it look *exactly* like a
 console app.
 
 --bb

It uses the same console application to do the displaying/execution. 
And, yes, this application sucks (ever done any serious copy/paste in it?)

There's PoshConsole ( http://www.codeplex.com/PoshConsole ), but that 
TODO list is a bit extensive ;-P. Hopefully by Win7 time, the Windows 
group gets around to fixing the console, but that's like hoping they'll 
fix Paint or Notepad ;-P.

Oct 25 2008

"Bill Baxter" <wbaxter gmail.com> writes:

On Sun, Oct 26, 2008 at 9:18 AM, Robert Fraser
<fraserofthenight gmail.com> wrote:
 Bill Baxter wrote:
 Yigal Chripun wrote:
 PowerShell is GUI based as well.

 After downloading it and giving it a try, I find this claim somewhat
 suspect.  What makes you say it's GUI based?  It has the exact same
 decorations and goofy menu options as a regular non-GUI Windows
 console.  If it were really a GUI, I doubt they would go through the
 extra programming effort required to make it look *exactly* like a
 console app.

 --bb

 It uses the same console application to do the displaying/execution. And,
 yes, this application sucks (ever done any serious copy/paste in it?)

 There's PoshConsole ( http://www.codeplex.com/PoshConsole ), but that TODO
 list is a bit extensive ;-P. Hopefully by Win7 time, the Windows group gets
 around to fixing the console, but that's like hoping they'll fix Paint or
 Notepad ;-P.

I'm using "Console2" as my facade on the console window.
Works pretty nicely.
http://sourceforge.net/projects/console/

--bb

Oct 25 2008

KennyTM~ <kennytm gmail.com> writes:

Robert Fraser wrote:
 Bill Baxter wrote:
 Yigal Chripun wrote:
 PowerShell is GUI based as well.

 After downloading it and giving it a try, I find this claim somewhat
 suspect.  What makes you say it's GUI based?  It has the exact same
 decorations and goofy menu options as a regular non-GUI Windows
 console.  If it were really a GUI, I doubt they would go through the
 extra programming effort required to make it look *exactly* like a
 console app.

 --bb

 
 It uses the same console application to do the displaying/execution. 
 And, yes, this application sucks (ever done any serious copy/paste in it?)
 
 There's PoshConsole ( http://www.codeplex.com/PoshConsole ), but that 
 TODO list is a bit extensive ;-P. Hopefully by Win7 time, the Windows 
 group gets around to fixing the console, but that's like hoping they'll 
 fix Paint or Notepad ;-P.

Hey, they have fixed MSPaint and WordPad! :)

Oct 25 2008

torhu <no spam.invalid> writes:

Robert Fraser wrote:
 It uses the same console application to do the displaying/execution. 
 And, yes, this application sucks (ever done any serious copy/paste in it?)

That works fine for me if I enable Quick edit mode in the options.  Then 
the right mouse button will do both copy and paste.

Oct 26 2008

"Bill Baxter" <wbaxter gmail.com> writes:

On Mon, Oct 27, 2008 at 1:51 AM, torhu <no spam.invalid> wrote:
 Robert Fraser wrote:
 It uses the same console application to do the displaying/execution. And,
 yes, this application sucks (ever done any serious copy/paste in it?)

 That works fine for me if I enable Quick edit mode in the options.  Then the
 right mouse button will do both copy and paste.

Except it only does block-oriented rectangular selection, which is odd
for something that is primarily line-oriented.

--bb

Oct 26 2008

torhu <no spam.invalid> writes:

Bill Baxter wrote:
 On Mon, Oct 27, 2008 at 1:51 AM, torhu <no spam.invalid> wrote:
 Robert Fraser wrote:
 It uses the same console application to do the displaying/execution. And,
 yes, this application sucks (ever done any serious copy/paste in it?)

 That works fine for me if I enable Quick edit mode in the options.  Then the
 right mouse button will do both copy and paste.

 
 Except it only does block-oriented rectangular selection, which is odd
 for something that is primarily line-oriented.

Yeah, that's true.  Pretty stupid.

Oct 26 2008

Robert Fraser <fraserofthenight gmail.com> writes:

torhu wrote:
 Bill Baxter wrote:
 On Mon, Oct 27, 2008 at 1:51 AM, torhu <no spam.invalid> wrote:
 Robert Fraser wrote:
 It uses the same console application to do the displaying/execution. And,
 yes, this application sucks (ever done any serious copy/paste in it?)

 That works fine for me if I enable Quick edit mode in the options.  Then the
 right mouse button will do both copy and paste.

 Except it only does block-oriented rectangular selection, which is odd
 for something that is primarily line-oriented.

 
 Yeah, that's true.  Pretty stupid.

My main problem is that you can't do it just with the keyboard, which is 
my standard method. I also take issue with the fact you can't copy more 
than is visible on a single screen, which goes along with the block 
selection mode.

Oct 26 2008

"Bill Baxter" <wbaxter gmail.com> writes:

On Mon, Oct 27, 2008 at 1:52 PM, Robert Fraser
<fraserofthenight gmail.com> wrote:
 torhu wrote:
 Bill Baxter wrote:
 On Mon, Oct 27, 2008 at 1:51 AM, torhu <no spam.invalid> wrote:
 Robert Fraser wrote:
 It uses the same console application to do the displaying/execution. And,
 yes, this application sucks (ever done any serious copy/paste in it?)

 That works fine for me if I enable Quick edit mode in the options.  Then the
 right mouse button will do both copy and paste.

 Except it only does block-oriented rectangular selection, which is odd
 for something that is primarily line-oriented.

 Yeah, that's true.  Pretty stupid.

 My main problem is that you can't do it just with the keyboard, which is my
 standard method. I also take issue with the fact you can't copy more than is
 visible on a single screen, which goes along with the block selection mode.

By the way I tried running powershell as a tab inside the Console2
prog I mentioned before and it does work fine.

--bb

Oct 26 2008

Yigal Chripun <yigal100 gmail.com> writes:

Bill Baxter wrote:
 On Sat, Oct 25, 2008 at 8:57 PM, Yigal Chripun <yigal100 gmail.com> wrote:
 Steven Schveighoffer wrote:
 "Bill Baxter" wrote
 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)

 Console is a legacy technology (you even still call it "DOS"), why expect
 features from it?

 So tell me what the alternative is?  I had trouble with running D
 tools from a Cygwin shell.  Can't remember if I tried MSYS or not.

 Any text-based program uses the same Windows console (unless it's a GUI
 application, and it uses controls to create a text box, etc).  Including
 cygwin shell.

 To say it's a legacy technology is like saying Linux is a legacy technology
 because it's command line based.  It's a false experience promoted by
 Microsoft to try and spread FUD about OSes that mainly support command line
 tools, like Linux.  But command line tools are extremely useful and
 powerful, much easier to develop, and IMO easier to use.  For instance, if
 you want to find all files that contain a certain text, grep -R text / and
 you're done.  On windows it's 'click the start menu, select search, wait for
 the search window to pop up, click on the dog, etc'.  Freaking annoying if
 you ask me ;)


 Anyone using a shell for Windows that works and supports UTF-8 properly?

 I would guess it should work properly, most everything in windows supports
 unicode.  Perhaps you have some configuration setting not set properly?  I'd
 suggest searching msdn.

 -Steve


 
 PowerShell is GUI based as well.

 
 After downloading it and giving it a try, I find this claim somewhat
 suspect.  What makes you say it's GUI based?  It has the exact same
 decorations and goofy menu options as a regular non-GUI Windows
 console.  If it were really a GUI, I doubt they would go through the
 extra programming effort required to make it look *exactly* like a
 console app.
 
 --bb

I've just checked (it's been a long time since I used it) and you're
correct. I don't know why I remembered it as being GUI based; maybe the
blue color threw me off. Sorry for the confusion. But I'm sure that
there are 3rd party GUI based shells for Windows.

Oct 27 2008

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Yigal Chripun wrote:
 Steven Schveighoffer wrote:
 "Bill Baxter" wrote
 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)

 Console is a legacy technology (you even still call it "DOS"), why expect 
 features from it?

 So tell me what the alternative is?  I had trouble with running D
 tools from a Cygwin shell.  Can't remember if I tried MSYS or not.

 Any text-based program uses the same Windows console (unless it's a GUI 
 application, and it uses controls to create a text box, etc).  Including 
 cygwin shell.

 To say it's a legacy technology is like saying Linux is a legacy technology 
 because it's command line based.  It's a false experience promoted by 
 Microsoft to try and spread FUD about OSes that mainly support command line 
 tools, like Linux.  But command line tools are extremely useful and 
 powerful, much easier to develop, and IMO easier to use.  For instance, if 
 you want to find all files that contain a certain text, grep -R text / and 
 you're done.  On windows it's 'click the start menu, select search, wait for 
 the search window to pop up, click on the dog, etc'.  Freaking annoying if 
 you ask me ;)


 Anyone using a shell for Windows that works and supports UTF-8 properly?

 I would guess it should work properly, most everything in windows supports 
 unicode.  Perhaps you have some configuration setting not set properly?  I'd 
 suggest searching msdn.

 -Steve 

 
 windows console AKA DOS Box *is* in fact legacy technology. It is

 ideas from Linux and incorporated in it.

Windows has gotten a lot better in recent times - ever since it 
finally started to imitate Unix :o).

 Also, it doesn't have to be either/or situation regarding CLI vs GUI.
 There's Apple's quicksilver (IIRC the name) which is a gui app with CLI
 like interface. it has the best from both worlds. PowerShell is GUI
 based as well. IMO, CLI should be provided as just a widget in the GUI
 world and not a separate entity.

I'm not sure I understand. Widget in the GUI = a window with text in it 
living side by side, or embedded with, graphical windows? That's been 
the case for a long time.


Andrei

Oct 25 2008

ore-sama <spam here.lot> writes:

Bill Baxter Wrote:

 On Sat, Oct 25, 2008 at 6:37 AM, ore-sama <spam here.lot> wrote:
 Bill Baxter Wrote:

 (like I haven't been able to figure out how to get the
 DOS console in Windows to display UTF-8)

 Console is a legacy technology (you even still call it "DOS"), why expect
 features from it?

 
 So tell me what the alternative is?  I had trouble with running D
 tools from a Cygwin shell.  Can't remember if I tried MSYS or not.

gui of course. MSYS's console is gui in fact.

Oct 25 2008

ore-sama <spam here.lot> writes:

Bill Baxter Wrote:

 import std.stdio;
 void main(string[] args) {  writefln("Args: %s", args); }
 
 And passing it some wildcards.  It never expands anything.  Only thing
 it does do is mess with quotes some.  Here's an example:
 
 C:\> args.exe * "C:\Program Files" *.* c:\*
 Args: [args,*,C:\Program Files,*.*,c:\*]

It's not Windows: the program's standard startup module gets the command line
with GetCommandLine() and parses it into string[] args.
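
Since the wildcards reach the program unexpanded, a program that wants Unix-style behaviour has to expand them itself. A rough sketch of that idea, in Python rather than D for brevity; `expand_args` is a made-up helper name, not part of any runtime:

```python
import glob

WILDCARD_CHARS = "*?["


def expand_args(args):
    """Expand shell-style wildcards inside the program itself.

    Arguments without wildcard characters are passed through untouched;
    a pattern that matches nothing is kept literally.
    """
    expanded = []
    for arg in args:
        if any(ch in arg for ch in WILDCARD_CHARS):
            matches = sorted(glob.glob(arg))
            expanded.extend(matches if matches else [arg])
        else:
            expanded.append(arg)
    return expanded
```

Calling `expand_args(sys.argv[1:])` at the top of main would then give roughly what a Unix shell hands over, with the caveat that quoting information is already gone by the time the arguments arrive.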

Oct 25 2008

ore-sama <spam here.lot> writes:

Bill Baxter Wrote:

 I did that but "type <filewith-utf8.txt>"  still prints garbage.
 
 --bb

If an application prints garbage, this indicates that it's implemented
incorrectly or that it's not encoding-aware. A correctly implemented
application should transcode text to OCP before printing to the console. This
is what std.stdio.writef is supposed to do.
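
The transcoding step described here can be sketched as follows, again in Python rather than D. The helper name is hypothetical and cp437 is only an assumed console code page; characters the code page cannot represent are replaced with "?".

```python
def transcode_for_console(utf8_bytes, console_encoding="cp437"):
    """Turn UTF-8 input into bytes in the console's code page.

    console_encoding is an assumption; a real program would ask the
    system for the active console code page instead of hard-coding it.
    """
    text = utf8_bytes.decode("utf-8")
    return text.encode(console_encoding, errors="replace")


# Accented Latin letters usually survive; most math operators do not.
latin = transcode_for_console("é".encode("utf-8"))
union = transcode_for_console("a ∪ b".encode("utf-8"))
```

This also shows the limit of the approach for the operators under discussion: transcoding avoids garbage, but a symbol missing from the code page still cannot be displayed.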

Oct 25 2008

Kevin Bealer <kevinbealer gmail.com> writes:

Andrei Alexandrescu Wrote:

Please vote up before the haters take it down, and discuss:

http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/

Andrei

I think this is a bad idea -- there are a lot of places that don't use Unicode
or don't support 8 bit clean translation, and the operators in question would
be a pain to use every time they were needed, since there is no obvious way to
type them. And I don't just mean organizations that drag their feet, but also
special cases within every new technology that have these blind spots. Does
your cell phone web browser correctly display these symbols? Does the program
"less" display these correctly? If you think it's just a matter of time,
maybe, but consider that IBM still uses EBCDIC internally in mainframes.

A lot of languages using only punctuation based syntax are already hard to read
because of it, e.g. Perl can be very hard to read in some cases. Using the
word "and" would make a lot of languages easier to read than using "&&". The
standardized meanings should be kept, but I would favor something like $( stuff
)$, $[ more stuff ]$ and so on rather than using special unicode tokens.

modify bracket usage and
"#text" to indicate special symbols as an extension of the #line and #function
directives. If ".operation" is good enough for every method call, then why

rather than importing thousands of individual extension operators that are only
readable in the unicode-speaking contexts.

Kevin

Oct 25 2008

Alix Pexton <alixD.TpextonNO SPAMgmailD.Tcom> writes:

Andrei Alexandrescu wrote:
 Please vote up before the haters take it down, and discuss:
 
 http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
 
 
 
 Andrei

I've been following this thread without really having an opinion to 
offer, but I just had a thought...

We already know that D's CTFE and templates can be used together to 
parse DSLs (matrix ops, regular expressions and IIRC Scheme too) and 
turn them into optimal native code. That suggests to me that it is 
already possible to write D code that can turn an expression written in 
established mathematical/scientific notation (complete with unicode 
symbols) into either conventional D code, or machine code.

What I am not sure of is whether it would be possible to make it general 
enough to work with all mathematical dialects (I seem to remember some 
overlapping in ways that might be problematic). A complete solution 
would have to be able to define new operators (including their 
associativity and precedence) in such a way that they can be looked up 
by the templates that evaluate the expression.
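
As a rough illustration of the "operator table with associativity and precedence" idea, here is a small runtime sketch in Python. It is not CTFE and not D: the table contents, the choice of symbols and all function names are assumptions made up for the example. It uses precedence climbing to evaluate an expression against whatever operators the table defines.

```python
import operator
import re

# Hypothetical operator table: symbol -> (precedence, associativity, function).
# New operators are added by adding entries; the parser itself does not change.
OPERATORS = {
    "+": (1, "left", operator.add),
    "-": (1, "left", operator.sub),
    "×": (2, "left", operator.mul),
    "÷": (2, "left", operator.truediv),
    "↑": (3, "right", operator.pow),
}

TOKEN = re.compile(r"\s*(\d+(?:\.\d+)?|.)")


def tokenize(source):
    """Split the source into numbers and single-character symbols."""
    return TOKEN.findall(source.strip())


def evaluate(source, operators=OPERATORS):
    """Evaluate an infix expression using precedence climbing."""
    tokens = tokenize(source)
    pos = 0

    def parse_primary():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = parse_expression(0)
            if pos >= len(tokens) or tokens[pos] != ")":
                raise SyntaxError("expected ')'")
            pos += 1
            return value
        try:
            return float(token)
        except ValueError:
            raise SyntaxError(f"unexpected token {token!r}") from None

    def parse_expression(min_prec):
        nonlocal pos
        left = parse_primary()
        while pos < len(tokens) and tokens[pos] in operators:
            prec, assoc, func = operators[tokens[pos]]
            if prec < min_prec:
                break
            pos += 1
            # Left-associative operators require strictly higher precedence
            # on their right-hand side; right-associative ones do not.
            right = parse_expression(prec + 1 if assoc == "left" else prec)
            left = func(left, right)
        return left

    result = parse_expression(0)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return result
```

With this table, `evaluate("2 + 3 × 4")` groups the multiplication first and `evaluate("2 ↑ 3 ↑ 2")` groups to the right. A compile-time version in D would need the same lookup done by templates, which is the open question raised above.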

Another related thought I had: Would it be possible to write a 
compile-time parser that turned MathML into code? I'm not even sure if 
MathML is structured enough to represent the underlying meaning of an 
expression rather than just its graphical form. Perhaps it would be more 
interesting to write the code that did the transformation in the opposite 
direction, turning expressions written in D into MathML ^^

A...

Oct 26 2008
