## digitalmars.D.announce - Adding Unicode operators to D

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Please vote up before the haters take it down, and discuss:

Andrei

Oct 22 2008
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Correx:

Andrei

Andrei Alexandrescu wrote:
Please vote up before the haters take it down, and discuss:

_in_d_similarly_to/

Andrei


Oct 22 2008
"Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Andrei Alexandrescu"  wrote
Correx:

Andrei

No thanks.  Please let's only use operators that are on the keys of my
keyboard. I don't fancy having to type key digraphs or trigraphs to try and
write code.

I understand that others already have this problem, but I don't.  This would
be a huge detractor from D for me.  I'd definitely support a language fork
at that point, or at least refuse to deal with any code that has unicode
operators.  I think you'd find others feel the same way.

Why can't the emacs module solution work that was used for the chevrons?
That is, when emacs sees:

x opCross(y);

display it as

x x y

(of course, assume the middle x is the cross symbol, I have no idea how to
type it).

And upon save, regenerate the correct code.

I see no issue with something like that.  This is all the compiler is doing
anyways...

Note that any operators for unicode would be user-defined anyways, the
standard operator symbols already cover what actually gets generated to
machine code.  That is, unicode operator X is invariably going to map to
opX, so there is no benefit to the compiler performing this step instead of
an editor.
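
A minimal sketch of that display/save round trip, written in Python rather than any editor's macro language. It assumes the stored form is the method call `a.opCross(b)` with plain identifiers as operands; the `SYMBOLS` table and the `opDot` name are hypothetical, not anything D defines.

```python
import re

# Hypothetical mapping from operator-function names to display symbols.
SYMBOLS = {"opCross": "×", "opDot": "⋅"}
NAMES = {symbol: name for name, symbol in SYMBOLS.items()}

# Matches only simple calls such as `a.opCross(b)` with identifier operands.
CALL = re.compile(r"\b(\w+)\.(" + "|".join(SYMBOLS) + r")\((\w+)\)")
# Matches the displayed form `a × b`.
INFIX = re.compile(r"\b(\w+) (" + "|".join(map(re.escape, NAMES)) + r") (\w+)\b")


def to_display(source: str) -> str:
    """What the editor shows: operator calls rendered as infix symbols."""
    return CALL.sub(
        lambda m: f"{m.group(1)} {SYMBOLS[m.group(2)]} {m.group(3)}", source
    )


def to_source(display: str) -> str:
    """What the editor saves: infix symbols turned back into calls."""
    return INFIX.sub(
        lambda m: f"{m.group(1)}.{NAMES[m.group(2)]}({m.group(3)})", display
    )


src = "auto n = a.opCross(b);"
shown = to_display(src)
print(shown)             # auto n = a × b;
print(to_source(shown))  # auto n = a.opCross(b);
```

Because the pattern requires a leading `.`, a declaration such as `Vector opCross(Vector rhs)` is left untouched. Nested or parenthesised operands would need a real parser rather than a regular expression.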

-Steve

Oct 22 2008
"Steven Schveighoffer" <schveiguy yahoo.com> writes:
"davidl" wrote
On Thu, 23 Oct 2008 09:36:29 +0800, Steven Schveighoffer
<schveiguy yahoo.com> wrote:

"Andrei Alexandrescu"  wrote
Correx:

Andrei

No thanks.  Please let's only use operators that are on the keys of my
keyboard. I don't fancy having to type key digraphs or trigraphs to try
and
write code.

I understand that others already have this problem, but I don't.  This
would
be a huge detractor from D for me.  I'd definitely support a language
fork
at that point, or at least refuse to deal with any code that has unicode
operators.  I think you'd find others feel the same way.

Why can't the emacs module solution work that was used for the chevrons?
That is, when emacs sees:

x opCross(y);

display it as

x x y

(of course, assume the middle x is the cross symbol, I have no idea how
to
type it).

And upon save, regenerate the correct code.

I see no issue with something like that.  This is all the compiler is
doing
anyways...

Everything you worry about is just poor editor. Why do you think an editor
can affect the language?

All that is being proposed right now is syntax sugar.  Cross product, dot
product, union, etc.  All of these will map to a function, so there is no
reason to require compiler support  (that is, they don't translate directly
to assembly/machine code).  I'm proposing the editor be used to do the sugar
instead of the compiler.

Right now Unicode is not universally accepted by all editors, ASCII is.
Right now, I don't have cross product symbol on my keyboard, all currently
supported symbols I do have.  Why should my experience with D be severely
affected by your desire for syntax sugar?

And it complicates the language if the programmer has not converted it
beforehand. It may also set up
future restrictions on extending the language in the right direction!

Today, I can call opX functions instead of using the appropriate operator.
This is no different.

In your case, x opCross(y): why is identifier opCross(identifier)
considered as identifier x identifier?
Should a typical operator overload function declaration be
considered that way too?

x opCross(y)
{
}

x x y
{
}

or even

x opCross(y, m){}

--->

x x y, m  {}

also consider a template declaration

Matrix opCross(T)(T a)
{
}

should it be considered as Matrix x T (T a)?

If not, how do you distinguish all those circumstances (and not all
possible "shouldn't be" situations are listed here)?

The editor module would have to be (and can be) smarter than that.

-Steve

Oct 23 2008
Sergey Gromov <snake.scaly gmail.com> writes:
Thu, 23 Oct 2008 18:21:18 +0800,
davidl wrote:
Everything you worry about is just poor editor. Why do you think an
editor can affect the language?

I think an editor is not the only thing that displays your program's
source.  I think that compiler's error message should be readable over a
TTY terminal.  Otherwise you're limited to working with fancy graphical
shells.

Oct 23 2008
KennyTM~ <kennytm gmail.com> writes:
Sergey Gromov wrote:
Thu, 23 Oct 2008 18:21:18 +0800,
davidl wrote:
Everything you worry about is just poor editor. Why do you think an
editor can affect the language?

I think an editor is not the only thing that displays your program's
source.  I think that compiler's error message should be readable over a
TTY terminal.  Otherwise you're limited to working with fancy graphical
shells.

I agree.

My real world experience: Sometimes I need to code over ssh. The server
admin only installed vim (which I don't use) and nano, no emacs.

Probably there could be a vim module also (is it possible?), but that's
just palliatives.

Oct 23 2008
"Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:
On Wed, Oct 22, 2008 at 9:36 PM, Steven Schveighoffer
<schveiguy yahoo.com> wrote:
Why can't the emacs module solution work that was used for the chevrons?

Beeeecause not everyone uses emacs?

Oct 22 2008
"Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Jarrett Billingsley" wrote
On Wed, Oct 22, 2008 at 9:36 PM, Steven Schveighoffer
<schveiguy yahoo.com> wrote:
Why can't the emacs module solution work that was used for the chevrons?

Beeeecause not everyone uses emacs?

Including myself ;)

But I really meant the same *type* of solution.  If you use another editor,
especially if it is used for coding, it probably has a macro feature that
you can use for doing this.

-Steve

Oct 22 2008
"Bill Baxter" <wbaxter gmail.com> writes:
On Thu, Oct 23, 2008 at 10:36 AM, Steven Schveighoffer
<schveiguy yahoo.com> wrote:
No thanks.  Please let's only use operators that are on the keys of my
keyboard. I don't fancy having to type key digraphs or trigraphs to try and
write code.
[...]
Why can't the emacs module solution work that was used for the chevrons?

Actually, the solutions aren't that far apart.  Andrei's solution
displays XXX as YYY, the actual Unicode version you'd still type XXX
just it would actually be replaced by YYY instead of just being
displayed as YYY.

The nice thing about getting such AutoCorrect replacements working
well across a wide range of editors is that it has benefits beyond
just typing unicode characters.  You can have it insert code snippets
when you type [[main]] for example, or some people have said that some
of the existing characters are hard to type on their non-US keyboards.
You could define replacements for those.

I'm certainly not saying going Unicode is the right thing to do right
now.  More like trying to explore what has to change (if anything)
before it really becomes viable to introduce Unicode.  The topic seems
to keep coming up in a lot of places, so I think eventually it is
inevitable that we will see more and more languages start using it.

---bb

Oct 22 2008
"Bill Baxter" <wbaxter gmail.com> writes:
On Thu, Oct 23, 2008 at 10:45 AM, Jarrett Billingsley
<jarrett.billingsley gmail.com> wrote:
On Wed, Oct 22, 2008 at 9:36 PM, Steven Schveighoffer
<schveiguy yahoo.com> wrote:
Why can't the emacs module solution work that was used for the chevrons?

Beeeecause not everyone uses emacs?

In fact, I think there are only like three of us using emacs.  :-)  So
it's not a very general solution.

But I think the point is that you should be able to implement
something similar in many editors.
Although I think the trick of showing one thing but saving another is
more tricky for most editors than just replacing the strings outright
a la AutoCorrect.

--bb

Oct 22 2008
Paul D. Anderson <paul.d.removethis.anderson comcast.andthis.net> writes:
Andrei Alexandrescu Wrote:

Correx:

Andrei

Andrei Alexandrescu wrote:
Please vote up before the haters take it down, and discuss:

_in_d_similarly_to/

Andrei

Java allows unicode variable names. The Greek letter 'pi' is a valid variable
name in Java (see www.jscience.org for an example). Having said that, I've had
Java IDEs choke on these.

An opportunity may exist here for someone to create/modify a D language IDE
that supports same. [Although Descent (being Eclipse-based and therefore
Java-based) should have a leg up already.]

I know projects exist that intend to be 'the' D IDE (written in D, for D,
etc.). Maybe this could be a discriminator that makes one stand out.

Paul

Oct 22 2008
Spacen Jasset <spacenjasset yahoo.co.uk> writes:
Andrei Alexandrescu wrote:
Correx:

_in_d_similarly_to/

Andrei

Andrei Alexandrescu wrote:
Please vote up before the haters take it down, and discuss:

_in_d_similarly_to/

Andrei

I haven't really ever felt the need for such things. It would require
editor support and I think that it could hinder readability as one would
have to know that symbol 'x' is say, crossproduct. -- It isn't always,
it depends on the mathematical domain.

There are, I believe, far more pressing matters, and this feature would
make editor support a bit more difficult, and we are currently in the
days where there isn't enough editor and/or ide support for D. I would
personally prefer it not be added to the language in the near future,
this is of course only my preference, which in honesty may be biased but
isn't entirely for self reasons.

Oct 23 2008
Walter Bright <newshound1 digitalmars.com> writes:
Bill Baxter wrote:
I think that's the conclusion I'm coming to as well.  While the use
of Unicode would have some advantages, there are various technical
issues with it (like I haven't been able to figure out how to get the
DOS console in Windows to display UTF-8).  I think those issues can
all be solved, but it would be a large distraction for the D
community.  Better to let some big, well-funded, massively popular
language pioneer in this area.  If some language with a billion
programmers decided to use Unicode, then you can bet that most of
these infrastructure problems would start to disappear quickly as
annoyed programmers start scratching their own itches and as they
start complaining to the people who write the tools they use.

Realistically, if I complain to any software vendor now that their
editor doesn't work well with D because they don't have funky Unicode
functionality, the response is likely to be "Sounds like a problem
with D, whatever that is".  If the language were Java or C++, though,
they would have little choice but to take the complaint seriously,
regardless of the effort required.

Unfortunately, you might be right in that D is not currently in a
position to force the issue.

Oct 23 2008
"Nick Sabalausky" <a a.a> writes:
"Walter Bright" <newshound1 digitalmars.com> wrote in message
news:gdr4pe$2uje$1 digitalmars.com...
Bill Baxter wrote:
I think that's the conclusion I'm coming to as well.  While the use
of Unicode would have some advantages, there are various technical
issues with it (like I haven't been able to figure out how to get the
DOS console in Windows to display UTF-8).  I think those issues can
all be solved, but it would be a large distraction for the D
community.  Better to let some big, well-funded, massively popular
language pioneer in this area.  If some language with a billion
programmers decided to use Unicode, then you can bet that most of
these infrastructure problems would start to disappear quickly as
annoyed programmers start scratching their own itches and as they
start complaining to the people who write the tools they use.

Realistically, if I complain to any software vendor now that their
editor doesn't work well with D because they don't have funky Unicode
functionality, the response is likely to be "Sounds like a problem
with D, whatever that is".  If the language were Java or C++, though,
they would have little choice but to take the complaint seriously,
regardless of the effort required.

Unfortunately, you might be right in that D is not currently in a position
to force the issue.

My various thoughts:

Whatever language does end up forcing the issue is going to come up against
(inertial) resistance, either successfully or unsuccessfully. If D, right
now, were to be the language to attempt to force the issue, then like you
two have said, it would probably be unsuccessful. So, in order for the
unicode transition to ever be successful, it would have to be some other
language (or a version of D later down the road) that forces the issue.

However, if D and/or other similarly less-than-mainstream (I hate referring
to D that way, BTW) languages already had useful unicode support in a way
that *wasn't* trying to force the issue (ie, purely optional, with perfectly
acceptable ASCII fallbacks) when that "force the issue" language does come
along, then that can help cut down on the resistance that the "force the
issue" language encounters. We might not be able to crack the
chicken-and-the-egg, but we could help weaken it by providing a little extra
incentive of our own (again, as long as it was in a way that wasn't
forceful).

I do agree, though, with the people who have said that D has more important
things to focus on right now than unicode. And I would add that I see most
of D's biggest strengths as things where it cleans up and fixes the mistakes
made by the more pioneering languages like C++ or Java. So I think it would
be in true D style (in a good way) to wait for something else, like maybe
Fortress, to go muck around in unicode, and then we can design our unicode
to clean up the mistakes those languages will inevitably end up making
(instead of leading our own language into a corner by making those "pioneer"
mistakes ourselves). Plus, hopefully by that time we'll have finally taken
care of the more pressing issues that we're currently facing. (Like
eliminating forward reference issues!! Please!!)

I hope that all made sense. I guess my summary is: Hold off on official
unicode stuff for now and learn from others' unicode mistakes. But, if we do
put official unicode stuff in right now, keep it in a way that doesn't force
the issue. And as for unofficial unicode stuff, I say go ahead, play around
with it, post it, do whatever.

Oct 23 2008
"Bill Baxter" <wbaxter gmail.com> writes:
On Fri, Oct 24, 2008 at 3:42 AM, Spacen Jasset <spacenjasset yahoo.co.uk> wrote:
I haven't really ever felt the need for such things. It would require editor
support and I think that it could hinder readability as one would have to
know that symbol 'x' is say, crossproduct. -- It isn't always, it depends on
the mathematical domain.

There are, I believe, far more pressing matters, and this feature would make
editor support a bit more difficult, and we are currently in the days where
there isn't enough editor and/or ide support for D. I would personally
prefer it not be added to the language in the near future, this is of course
only my preference, which in honesty may be biased but isn't entirely for
self reasons.

I think that's the conclusion I'm coming to as well.  While the use
of Unicode would have some advantages, there are various technical
issues with it (like I haven't been able to figure out how to get the
DOS console in Windows to display UTF-8).  I think those issues can
all be solved, but it would be a large distraction for the D
community.  Better to let some big, well-funded, massively popular
language pioneer in this area.  If some language with a billion
programmers decided to use Unicode, then you can bet that most of
these infrastructure problems would start to disappear quickly as
annoyed programmers start scratching their own itches and as they
start complaining to the people who write the tools they use.

Realistically, if I complain to any software vendor now that their
editor doesn't work well with D because they don't have funky Unicode
functionality, the response is likely to be "Sounds like a problem
with D, whatever that is".  If the language were Java or C++, though,
they would have little choice but to take the complaint seriously,
regardless of the effort required.

--bb

Oct 23 2008
Don <nospam nospam.com.au> writes:
Andrei Alexandrescu wrote:
Correx:

_in_d_similarly_to/

Andrei

Andrei Alexandrescu wrote:
Please vote up before the haters take it down, and discuss:

_in_d_similarly_to/

Andrei

Entering this debate late:

I think that operator overloading itself is syntactic sugar, and
primarily exists for numerical programmers. So it's not so unreasonable
to support operator overloading which is not hugely intelligible to
non-mathematicians.
"Funny" operators should never be seen by anyone without a mathematical
background. However, I'm not so sure how common they'd actually be.

The strongest use case seems to me to be the situation where multiple
related operations exist, but only one operator is available.
The classic example is vector products, where we have:
- vector dot vector
- vector cross vector
- Elementwise product of two vectors.
But we only have one opMul. So it would be useful to have alternate
multiplication signs available.
Adding × (opCross) as a multiplication which is non-associative would, I
think, be quite generally useful.

But, I think there aren't actually very many other operators which are
easy to justify on mathematical grounds. Largely because most unary
operations look quite OK when implemented as functions, and
mathematicians don't have a huge number of binary operators.
Other than dot product, cross product, and convolution, there's the
exclusive or symbol (+ with a circle around it), and everything else is
pretty obscure.

Apart from the dot and cross product, the inability to have superscripts
and subscripts in variable names (and comments!) is a much bigger issue,
in my experience.
Oh. And the lack of an exponentiation operator. I miss the old Commodore
64 up-arrow for power <g>

If you could completely ignore keyboard and display issues, and use any
unicode character as an operator, which ones would you actually use?

Oct 28 2008
Sergey Gromov <snake.scaly gmail.com> writes:
Don wrote:
If you could completely ignore keyboard and display issues, and use any
unicode character as an operator, which ones would you actually use?

I'd use dot "⋅" and cross "×" products for 3D, union "∪" and
intersection "∩", subset "⊂" and superset "⊃" and their negative forms.
I don't think I'd use anything else.

Well, comparisons look better when converted into appropriate unicode.

Oct 28 2008
bearophile <bearophileHUGS lycos.com> writes:
Sergey Gromov:
I'd use dot "⋅" and cross "×" products for 3D, union "∪" and
intersection "∩", subset "⊂" and superset "⊃" and their negative forms.
I don't think I'd use anything else.

I just want to note that the whole thread is almost unreadable on the
digitalmars.com/webnews/, because it doesn't digest unicode chars at all. So
adding unicode to D will give problems to show code.

Unrelated to the unicode, but related on those opSubset, opSuperset, etc:
while implementing a set() class with the same API of the Python sets, I have
seen there are the following operators/methods too:

issubset(other)
set <= other
Test whether every element in the set is in other.

set < other
Test whether the set is a true subset of other, that is, set <= other and set
!= other.

issuperset(other)
set >= other
Test whether every element in other is in the set.

set > other
Test whether the set is a true superset of other, that is, set >= other and set
!= other.

A full opCmp can't be defined on sets, so I think in D1 we can't overload <= >=
among sets... I think this is a problem that has to be solved in D2, because sets
are important enough.
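
To make the problem concrete, here is a small Python sketch (Python's built-in `set` already provides these operators). The helper name `relation` is made up for illustration. Two sets can be incomparable, and a three-way integer comparison has no value left to express that.

```python
def relation(a: set, b: set) -> str:
    """Classify two sets under the subset ordering."""
    if a == b:
        return "equal"
    if a < b:
        return "proper subset"
    if a > b:
        return "proper superset"
    # Neither set contains the other: a three-way integer result
    # (negative, zero, positive) cannot express this case.
    return "incomparable"


a, b = {1, 2}, {2, 3}
print(relation({1}, {1, 2}))  # proper subset
print(relation(a, b))         # incomparable
print(a <= b, a >= b)         # False False
```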

Bye,
bearophile

Oct 28 2008
KennyTM~ <kennytm gmail.com> writes:
bearophile wrote:
Sergey Gromov:
I'd use dot "⋅" and cross "×" products for 3D, union "∪" and
intersection "∩", subset "⊂" and superset "⊃" and their
negative forms.
I don't think I'd use anything else.

I just want to note that the whole thread is almost unreadable on the
digitalmars.com/webnews/, because it doesn't digest unicode chars at all. So
adding unicode to D will give problems to show code.

Unrelated to the unicode, but related on those opSubset, opSuperset, etc:
while implementing a set() class with the same API of the Python sets, I have
seen there are the following operators/methods too:

issubset(other)
set <= other
Test whether every element in the set is in other.

set < other
Test whether the set is a true subset of other, that is, set <= other and set
!= other.

issuperset(other)
set >= other
Test whether every element in other is in the set.

set > other
Test whether the set is a true superset of other, that is, set >= other and
set != other.

A full opCmp can't be defined on sets, so I think in D1 we can't overload <=
>= among sets... I think this is a problem that has to be solved in D2, because sets
are important enough.

Bye,
bearophile

If the two sets are incomparable, just return NaN... We need an opCmp
that returns a float :)
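
Read literally, that idea could look like the following Python sketch. `set_cmp`, `is_subset`, `is_superset` and `is_not_subset` are hypothetical names. The only point is that every ordered comparison against NaN is false, so incomparable sets fail both `<=` and `>=`.

```python
import math


def set_cmp(a: set, b: set) -> float:
    """Four-way comparison: -1.0, 0.0, 1.0, or NaN when incomparable."""
    if a == b:
        return 0.0
    if a < b:
        return -1.0
    if a > b:
        return 1.0
    return math.nan


def is_subset(a: set, b: set) -> bool:
    return set_cmp(a, b) <= 0      # false when the result is NaN


def is_superset(a: set, b: set) -> bool:
    return set_cmp(a, b) >= 0      # also false when the result is NaN


def is_not_subset(a: set, b: set) -> bool:
    return not set_cmp(a, b) <= 0  # the "!<=" case: true for NaN
```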

Oct 28 2008
KennyTM~ <kennytm gmail.com> writes:

KennyTM~ wrote:
bearophile wrote:
Sergey Gromov:
I'd use dot "⋅" and cross "×" products for 3D, union "∪" and
intersection "∩", subset "⊂" and superset "⊃" and their
negative forms.
I don't think I'd use anything else.

I just want to note that the whole thread is almost unreadable on the
digitalmars.com/webnews/, because it doesn't digest unicode chars at
all. So adding unicode to D will give problems to show code.

Unrelated to the unicode, but related on those opSubset, opSuperset, etc:
while implementing a set() class with the same API of the Python sets,
I have seen there are the following operators/methods too:

issubset(other) set <= other Test whether every element in the set is
in other.

set < other Test whether the set is a true subset of other, that is,
set <= other and set != other.

issuperset(other) set >= other Test whether every element in other is
in the set.

set > other Test whether the set is a true superset of other, that is,
set >= other and set != other.

A full opCmp can't be defined on sets, so I think in D1 we can't
overload <= >= among sets... I think this is a problem that has to be
solved in D2, because sets are important enough.

Bye,
bearophile

If the two sets are incomparable, just return NaN... We need an opCmp
that returns a float :)

Actually I've made a working solution. Even the exotic operators like
!<= (not a subset of, ⊈) work too. It's designed for demonstration, not
performance, though.

Oct 28 2008
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Sergey Gromov wrote:
Don wrote:
If you could completely ignore keyboard and display issues, and use any
unicode character as an operator, which ones would you actually use?

I'd use dot "⋅" and cross "×" products for 3D, union "∪" and
intersection "∩", subset "⊂" and superset "⊃" and their negative forms.
I don't think I'd use anything else.

Well, comparisons look better when converted into appropriate unicode.

In my opinion, a workable feature is this:

* Functions can be defined with a leading backspace. They will be usable
with the infix notation.

* There is a way of specifying that precedence of a function defined as
above is the same as precedence of a built-in operator.

* Functions of which name is the same as an HTML entity name for a
symbol can be replaced with the actual symbol.
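
The third point can be sketched with Python's standard `html.entities` table. `symbol_for` is a hypothetical helper, and which entity names a compiler would actually accept is an open assumption; this only shows the name-to-symbol lookup.

```python
from html.entities import name2codepoint


def symbol_for(function_name: str):
    """Return the symbol whose HTML entity name equals the function name."""
    code = name2codepoint.get(function_name)
    return chr(code) if code is not None else None


for name in ("times", "sdot", "cup", "cap", "sub", "sup", "opCross"):
    print(name, symbol_for(name))
```

Under this scheme a function named `times` could be written as `×` and one named `cup` as `∪`, while a name such as `opCross` has no entity and therefore no symbol.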

Andrei

Oct 28 2008
Don <nospam nospam.com.au> writes:
Andrei Alexandrescu wrote:
Sergey Gromov wrote:
Don wrote:
If you could completely ignore keyboard and display issues, and use any
unicode character as an operator, which ones would you actually use?

I'd use dot "⋅" and cross "×" products for 3D, union "∪" and
intersection "∩", subset "⊂" and superset "⊃" and their negative forms.
I don't think I'd use anything else.

Well, comparisons look better when converted into appropriate unicode.

In my opinion, a workable feature is this:

* Functions can be defined with a leading backspace. They will be usable
with the infix notation.

* There is a way of specifying that precedence of a function defined as
above is the same as precedence of a built-in operator.

Do we really need to do that? How many Unicode binary operators are there?

This list of symbols which work in web browsers is very short.
http://en.wikipedia.org/wiki/Wikipedia:Mathematical_symbols

The interesting thing about this second list is just how short it is,
and how many of the items in it are comparison operators.
Any of the unicode comparison operators could be given the same
precedence as <,> and 'in'.
Cross should be given the same precedence as opMul and opDiv.
That just leaves oplus, otimes, which probably have the same precedence as
plus and mul.

You can do the same thing with this list:
http://en.wikipedia.org/wiki/Unicode_Mathematical_Operators
And you find that the precedence of almost everything is easy to
determine. Seems like 90% of them are relational operators.

Specifying the precedence of each unicode operator (eg by a lookup
table) would be adequate for any use case I can imagine, and it wouldn't
make syntactic analysis any more ambiguous.
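
A rough sketch of what such a lookup table buys, in Python. The numeric levels, the choice of symbols and the left-associative handling are illustrative assumptions, not a proposal for D's grammar. The parser consults only the fixed table, never the types of the operands.

```python
import re

# Fixed precedence per operator symbol (higher binds tighter).
# Levels follow the grouping suggested above: relational operators
# together, "⊕" with "+", and "×"/"⊗" with "*".
PRECEDENCE = {
    "<": 1, ">": 1, "⊂": 1, "⊃": 1,
    "+": 2, "-": 2, "⊕": 2,
    "*": 3, "/": 3, "×": 3, "⊗": 3,
}


def tokenize(text: str) -> list:
    """Split into identifiers and single-character symbols."""
    return re.findall(r"\w+|[^\w\s]", text)


def parse(tokens: list, min_prec: int = 1):
    """Precedence climbing; returns nested tuples (op, lhs, rhs)."""
    lhs = tokens.pop(0)
    if lhs == "(":
        lhs = parse(tokens, 1)
        tokens.pop(0)  # discard the closing ")"
    while tokens and PRECEDENCE.get(tokens[0], 0) >= min_prec:
        op = tokens.pop(0)
        # Parse the right operand one level tighter: left-associative.
        rhs = parse(tokens, PRECEDENCE[op] + 1)
        lhs = (op, lhs, rhs)
    return lhs


print(parse(tokenize("a + b × c")))
print(parse(tokenize("a ⊂ b ⊕ c")))
```

Since every symbol's precedence is known before any declaration is seen, the expression can be parsed without semantic analysis.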

* Functions of which name is the same as an HTML entity name for a
symbol can be replaced with the actual symbol.


Oct 29 2008
Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
* There is a way of specifying that precedence of a function defined as
above is the same as precedence of a built-in operator.

That throws out the ability to parse without semantic analysis. It's not
worth it.

Oct 29 2008
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Walter Bright wrote:
Andrei Alexandrescu wrote:
* There is a way of specifying that precedence of a function defined
as above is the same as precedence of a built-in operator.

That throws out the ability to parse without semantic analysis. It's not
worth it.

It doesn't per a previous post of mine, but I agree it's still not worth it.

Andrei

Oct 29 2008
Benji Smith <dlanguage benjismith.net> writes:
Sergey Gromov wrote:
Don wrote:
If you could completely ignore keyboard and display issues, and use any
unicode character as an operator, which ones would you actually use?

I'd use dot "⋅" and cross "×" products for 3D, union "∪" and
intersection "∩", subset "⊂" and superset "⊃" and their negative forms.
I don't think I'd use anything else.

Well, comparisons look better when converted into appropriate unicode.

I have pretty much the same list.

For me the really compelling case for unicode characters isn't in
finding more operators. It's the brackets!!

--benji

Oct 28 2008
"Bill Baxter" <wbaxter gmail.com> writes:
On Wed, Oct 29, 2008 at 4:12 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
Sergey Gromov wrote:
Don wrote:
If you could completely ignore keyboard and display issues, and use any
unicode character as an operator, which ones would you actually use?

I'd use dot "⋅" and cross "×" products for 3D, union "∪" and
intersection "∩", subset "⊂" and superset "⊃" and their negative forms.
I don't think I'd use anything else.

Well, comparisons look better when converted into appropriate unicode.

In my opinion, a workable feature is this:

* Functions can be defined with a leading backspace. They will be usable
with the infix notation.

[...] you're not suggesting we write
^HinfixOperator. :-)

* There is a way of specifying that precedence of a function defined as
above is the same as precedence of a built-in operator.

Workable, but it ain't what Walter calls parsing.

* Functions of which name is the same as an HTML entity name for a symbol
can be replaced with the actual symbol.

--bb

Oct 28 2008
Moritz Warning <moritzwarning web.de> writes:
On Wed, 22 Oct 2008 17:27:58 -0500, Andrei Alexandrescu wrote:

Please vote up before the haters take it down, and discuss:

Andrei

It would be very nice to have unicode operators.
But what opFooBar functions do users need (most)?

opDotProduct and opCrossProduct would be definitely cool.

Oct 22 2008
Moritz Warning <moritzwarning web.de> writes:
On Wed, 22 Oct 2008 23:37:43 +0000, Moritz Warning wrote:

On Wed, 22 Oct 2008 17:27:58 -0500, Andrei Alexandrescu wrote:

Please vote up before the haters take it down, and discuss:

Andrei

It would be very nice to have unicode operators. But what opFooBar
functions do users need (most)?

opDotProduct and opCrossProduct would be definitely cool.

sorry posted in d.announce by .. accident. :/

Oct 22 2008
"Nick Sabalausky" <a a.a> writes:
"Moritz Warning" <moritzwarning web.de> wrote in message
news:gdodg7$1f5o$1 digitalmars.com...
On Wed, 22 Oct 2008 17:27:58 -0500, Andrei Alexandrescu wrote:

Please vote up before the haters take it down, and discuss:

Andrei

It would be very nice to have unicode operators.
But what opFooBar functions do users need (most)?

opDotProduct and opCrossProduct would be definitely cool.

I'd certainly like opIntersection and maybe opUnion.

Oct 22 2008
"Bill Baxter" <wbaxter gmail.com> writes:
On Thu, Oct 23, 2008 at 7:27 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
Please vote up before the haters take it down, and discuss:

(My comment cross posted here from reddit)

I think the right way to do it is not to make everything Unicode. All
the pressure on the existing symbols would be dramatically relieved by
the addition of just a handful of new symbols.

The truth is keyboards aren't very good for inputting Unicode. That
isn't likely to change. Yes they've dealt with the problem in Asian
languages by using IMEs but in my opinion IMEs are horrible to use.

Some people seem to argue it's a waste to go to Unicode only for a few
symbols. If you're going to go Unicode, you should go whole hog. I'd
argue the exact opposite. If you're going to go Unicode, it should be
done in moderation. Use as little Unicode as necessary and no more.

As for how to input unicode -- Microsoft Word solved that problem ages
ago, assuming we're talking about small numbers of special characters.
It's called AutoCorrect. You just register your unicode symbol as a
misspelling for "(X)" or something unique like that and then every
time you type "(X)" a funky unicode character instantly replaces those
chars.

Yeh, not many editors support such a feature. But it's very easy to
implement. And with that one generic mechanism, your editor is ready
to support input of Unicode chars in any language just by adding the
right definitions.
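
A minimal sketch of that mechanism in Python, simulating keystrokes one at a time. Only the "(X)" trigger comes from the text above; the other table entries and the function names are invented for illustration.

```python
# Hypothetical trigger table: typed sequence -> replacement symbol.
REPLACEMENTS = {"(X)": "×", "(.)": "⋅", "(U)": "∪"}


def type_char(buffer: str, ch: str) -> str:
    """Append one typed character; replace a trigger it just completed."""
    buffer += ch
    for trigger, symbol in REPLACEMENTS.items():
        if buffer.endswith(trigger):
            return buffer[: -len(trigger)] + symbol
    return buffer


def type_text(text: str) -> str:
    """Feed a whole string through the editor one keystroke at a time."""
    buffer = ""
    for ch in text:
        buffer = type_char(buffer, ch)
    return buffer


print(type_text("c = a (X) b;"))  # c = a × b;
print(type_text("f(x)"))          # f(x)
```

The replacement happens at input time, so the file on disk contains the Unicode character itself, unlike a display-only substitution.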

--bb

Oct 22 2008
Don <nospam nospam.com.au> writes:
Bill Baxter wrote:
On Thu, Oct 23, 2008 at 7:27 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
Please vote up before the haters take it down, and discuss:

(My comment cross posted here from reddit)

I think the right way to do it is not to make everything Unicode. All
the pressure on the existing symbols would be dramatically relieved by
the addition of just a handful of new symbols.

The truth is keyboards aren't very good for inputting Unicode. That
isn't likely to change. Yes they've dealt with the problem in Asian
languages by using IMEs but in my opinion IMEs are horrible to use.

Some people seem to argue it's a waste to go to Unicode only for a few
symbols. If you're going to go Unicode, you should go whole hog. I'd
argue the exact opposite. If you're going to go Unicode, it should be
done in moderation. Use as little Unicode as necessary and no more.

I agree.
There is in fact a fairly defensible subset of Unicode: those characters
which are easy to type on some keyboard.  This would include chevrons,
currency symbols (especially pound, euro, yen), European accented
characters (not terribly useful), and a couple of other punctuation
marks. After all, if it's painful to type a Euro symbol on your
keyboard, you're heading for oblivion.

The list is pretty much equivalent to the US-International keyboard
layout in Windows. There aren't many useful characters in there, but it
might be enough.

The chevrons and the inverted ? and ! are perhaps the most interesting,
since they are paired. The multiply sign isn't bad, though.
With the German keyboards I have to use, some of these are less painful
to type than {}.

Oct 23 2008
Sergey Gromov <snake.scaly gmail.com> writes:
Thu, 23 Oct 2008 09:36:39 +0200,
Don wrote:
« » ? ? ¶ § ¬ ? ? ? ? ? ¤ ? © ®

Lots of question marks here.  This sucks.

Oct 23 2008
Spacen Jasset <spacenjasset yahoo.co.uk> writes:
Bill Baxter wrote:
On Thu, Oct 23, 2008 at 7:27 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
Please vote up before the haters take it down, and discuss:

(My comment cross posted here from reddit)

I think the right way to do it is not to make everything Unicode. All
the pressure on the existing symbols would be dramatically relieved by
the addition of just a handful of new symbols.

The truth is keyboards aren't very good for inputting Unicode. That
isn't likely to change. Yes they've dealt with the problem in Asian
languages by using IMEs but in my opinion IMEs are horrible to use.

Some people seem to argue it's a waste to go to Unicode only for a few
symbols. If you're going to go Unicode, you should go whole hog. I'd
argue the exact opposite. If you're going to go Unicode, it should be
done in moderation. Use as little Unicode as necessary and no more.

As for how to input unicode -- Microsoft Word solved that problem ages
ago, assuming we're talking about small numbers of special characters.
It's called AutoCorrect. You just register your unicode symbol as a
misspelling for "(X)" or something unique like that and then every
time you type "(X)" a funky unicode character instantly replaces those
chars.

Yeh, not many editors support such a feature. But it's very easy to
implement. And with that one generic mechanism, your editor is ready
to support input of Unicode chars in any language just by adding the
right definitions.

--bb

good thing anyway. How hard is it to say m3 = m1.crossProduct(m2) ? vs
m3 = m1 X m2 ? and how often will that happen? It's also going to make
the language more difficult to learn and understand.

If a set membership test operator and a few others are introduced, then
really to be "complete" all the set operators must be added, and
implemented.

Furthermore, the introduction of set operators should really mean that
you can use them on something by default, that means implementing sets
that presumably are usable, quick, and are worth using, otherwise people
will roll their own (all the time) in many different ways.

Unicode symbol 'x' may look better, but is it really more readable? I
think it is -- a bit, and it may be cool, but I don't think it's one of
the things that is going to make developing software significantly easier.

Why unicode anyway? In the same way that editor support is required to
actually type them in, why not let the editor render them. So instead of
symbol 'x' in the source code, say:

m3 = m1 cross_product m2

as an infix notation, in a similar way to the (unary) sizeof operator.

While cross_product is a bit long and unwieldy any editor capable can
replace the rendition of that keyword with a symbol for it. But in
editors that don't it means that it still can be typed in and/or
displayed easily.
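
A rough sketch of that display-only substitution, in Python: the file on disk keeps the ASCII keyword and the editor swaps in a glyph when drawing. The table and the function names render/unrender are hypothetical; only cross_product comes from the post. Note that this naive version would also touch keywords and glyphs inside string literals and comments.

```python
import re

# Hypothetical keyword-to-glyph table; only cross_product is from the post.
GLYPHS = {"cross_product": "\u00d7"}

# Match keywords only as whole words, so my_cross_product_x is left alone.
_KEYWORDS = re.compile(r"\b(" + "|".join(map(re.escape, GLYPHS)) + r")\b")


def render(source):
    """What the editor displays: keywords drawn as glyphs."""
    return _KEYWORDS.sub(lambda m: GLYPHS[m.group(1)], source)


def unrender(display):
    """What gets saved: glyphs turned back into ASCII keywords."""
    for keyword, glyph in GLYPHS.items():
        display = display.replace(glyph, keyword)
    return display


line = "m3 = m1 cross_product m2"
print(render(line))
print(unrender(render(line)) == line)
```

An editor without this feature simply shows the keyword, so the source stays typeable everywhere.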

Another option is to provide cross_product as an 'alias', and 'X'
as well.

Which then leads on to the introduction of a facility to add arbitrary
operators, which could be interesting because you can supply any
operator you see fit for the domains that you use that require it.
This does not provide exactly the right solution, though, as all the
additions would be 'non-standard', and I can see books in the future
recommending people not use unicode operators because editors don't have
support for them.

If D is to be used on a wide variety of platforms, which would be
desirable if it is to gain traction, then editor support barriers like
this could impede its progress.

Oct 25 2008
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Spacen Jasset wrote:
good thing anyway. How hard is it to say m3 = m1.crossProduct(m2) ? vs
m3 = m1 X m2 ? and how often will that happen? It's also going to make
the language more difficult to learn and understand.

I have noticed that in pretty much all scientific code, the f(a, b) and
a.f(b) notations fall off a readability cliff when the number of
operators grows only to a handful. Lured by simple examples like yours,
people don't see that as a problem until they actually have to read or
write such code. Adding temporaries and such is not that great because
it further takes the algorithm away from its mathematical form just for
serving a notation that was the problem in the first place.

If a set membership test operator and a few others are introduced, then
really to be "complete" all the set operators must be added, and
implemented.

Furthermore, the introduction of set operators should really mean that
you can use them on something by default, that means implementing sets
that presumably are usable, quick, and are worth using, otherwise people
will roll their own (all the time) in many different ways.

Unicode symbol 'x' may look better, but is it really more readable? I
think it is -- a bit, and it may be cool, but I don't think it's one of
the things that is going to make developing software significantly easier.

I think "cool" has not a lot to do with it. For scientific code, it's
closer to a necessity.

Andrei

Oct 25 2008
Spacen Jasset <spacenjasset yahoo.co.uk> writes:
Andrei Alexandrescu wrote:
Spacen Jasset wrote:
a good thing anyway. How hard is it to say m3 = m1.crossProduct(m2) ?
vs m3 = m1 X m2 ? and how often will that happen? It's also going to
make the language more difficult to learn and understand.

I have noticed that in pretty much all scientific code, the f(a, b) and
a.f(b) notations fall off a readability cliff when the number of
operators grows only to a handful. Lured by simple examples like yours,
people don't see that as a problem until they actually have to read or
write such code. Adding temporaries and such is not that great because
it further takes the algorithm away from its mathematical form just for
serving a notation that was the problem in the first place.

programming language." [sic] though; and so what will people use it for
in the main? I suggest that communities that require scientific code
have options now, and that they can and do choose languages for the
purpose which have better support for their needs than D might achieve.

If a set membership test operator and a few others are introduced, then
really to be "complete" all the set operators must be added, and
implemented.

Furthermore, the introduction of set operators should really mean that
you can use them on something by default, that means implementing sets
that presumably are usable, quick, and are worth using, otherwise
people will roll their own (all the time) in many different ways.

Unicode symbol 'x' may look better, but is it really more readable? I
think it is -- a bit, and it may be cool, but I don't think it's one
of the things that is going to make developing software significantly
easier.

I think "cool" has not a lot to do with it. For scientific code, it's
closer to a necessity.

mentions of the word and it's a bit nebulous. I, personally, am more
concerned with practicality than "cool".

Andrei

What I think of unicode symbols therefore depends on whether D should be
more scientifically oriented or not. If it should be, then unicode symbols
would undoubtedly be a benefit. My responses were guided by the
assumption that D was more generic in nature, though.

Oct 25 2008
bearophile <bearophileHUGS lycos.com> writes:
Bill Baxter:
On the other end there are the Matlab and NumPy-type solutions.  They
are convenient for tinkering around and displaying some results, but
these are not good for performance.

I have seen many scientific programs that use numpy, so sometimes it's fast
enough. But it forces you to write everything in a vector programming style,
which a procedural programmer needs time to learn. Normal C/D/C++ code is more
flexible: you can work on single items in a fast way too, while in numpy you
can go fast only when you work in bulk, on vectors.

On the other hand numpy offers you some higher level operations on arrays that
are currently missing in D, like certain complex slicing operations, that may
reduce your code length significantly, increasing code readability (because it
looks more like formulas); I can show you some examples if you want. Note that
in D there are no built-in rectangular dynamic arrays, which are basic stuff in
numpy/matlab.

Bye,
bearophile

Oct 25 2008
bearophile <bearophileHUGS lycos.com> writes:
Bill Baxter:

was actually more work than it would be to just use D for everything.<

Mixing languages isn't nice, I agree. That's why I too use D for several
purposes.

But if you have to change your code very often (and if your problems are of a
certain kind that allows a natural vectorization), then having vectorial (short)
code may have some advantages. Think about how much C++ code you need to write
to implement the programs of this book:
http://wiki.deductivethinking.com/wiki/Python_Programs_for_Modelling_Infectious_Diseases_book
So it allows a more explorative way of coding.

Sure Python does have some nice features as a language that D lacks, but from
10,000 ft D is a lot closer to Python than C++ in terms of ease of use.<

My experience with the ShedSkin compiler shows me that most of those features
that D lacks (complex slices, list comps, generators, short syntax, some
near-zero-cost safeties, etc) are absent because of cultural or inertial
reasons present in the brain of people used to C/C++, and not because they
can't be present/added in a language like D.
ShedSkin translates Python code to clean C++ code, showing that it can be done,
that it gives advantages, and that it's not too difficult to do. It shows once
and for all that you can have a C++-class language with a short and nice
syntax, etc.
Hopefully the Delight language has less of the cultural inertia coming from
C/C++, so it may become a better compromise than D itself.

I've got my dflat and gobo (http://www.dsource.org/projects/multiarray) that
are working for me pretty well.  They could use some full-time loving to make
more operations work intuitively, but the basics work ok.<

Nice stuff, lot of stuff. More comments require more study of that code. D
(Tango) may gain from having more batteries.

Bye,
bearophile

Oct 25 2008
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Spacen Jasset wrote:
Andrei Alexandrescu wrote:
Spacen Jasset wrote:
be a good thing anyway. How hard is it to say m3 =
m1.crossProduct(m2) ? vs m3 = m1 X m2 ? and how often will that
happen? It's also going to make the language more difficult to learn
and understand.

I have noticed that in pretty much all scientific code, the f(a, b)
and a.f(b) notations fall off a readability cliff when the number of
operators grows only to a handful. Lured by simple examples like
yours, people don't see that as a problem until they actually have to
read or write such code. Adding temporaries and such is not that great
because it further takes the algorithm away from its mathematical form
just for serving a notation that was the problem in the first place.

programming language." [sic] though; and so what will people use it for
in the main? I suggest that communities that require scientific code
have options now, and that they can and do choose languages for the
purpose which have better support for their needs than D might achieve.

Surprisingly there's not a lot of choice, witnessed by the prevalence of
Fortran for scientific code. One interesting thing is that quite a few
scientific coders mess with D and hang out around here, such as Don
Clugston, Bill Baxter, bearophile, Benji Smith (he's doing machine
learning if I remember correctly) and, if I may aspire to the status,
yours truly.

(I remain with an unformed opinion regarding Unicode operators.)

Andrei

Oct 25 2008
Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
Andrei Alexandrescu wrote:
Spacen Jasset wrote:
a good thing anyway. How hard is it to say m3 = m1.crossProduct(m2) ?
vs m3 = m1 X m2 ? and how often will that happen? It's also going to
make the language more difficult to learn and understand.

I have noticed that in pretty much all scientific code, the f(a, b) and
a.f(b) notations fall off a readability cliff when the number of
operators grows only to a handful. Lured by simple examples like yours,
people don't see that as a problem until they actually have to read or
write such code. Adding temporaries and such is not that great because
it further takes the algorithm away from its mathematical form just for
serving a notation that was the problem in the first place.

But what operators would be added? Some mathematician programmers might
want vector and matrix operators, others set operators, others still
derivation/integration operators, and so on. Where would we stop?
I don't deny it might be useful for them, but it does seem like too
specific a need to integrate in the language.

--
Bruno Medeiros - Software Developer, MSc. in CS/E graduate
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D

Oct 26 2008
KennyTM~ <kennytm gmail.com> writes:
Bruno Medeiros wrote:
Andrei Alexandrescu wrote:
Spacen Jasset wrote:
be a good thing anyway. How hard is it to say m3 =
m1.crossProduct(m2) ? vs m3 = m1 X m2 ? and how often will that
happen? It's also going to make the language more difficult to learn
and understand.

I have noticed that in pretty much all scientific code, the f(a, b)
and a.f(b) notations fall off a readability cliff when the number of
operators grows only to a handful. Lured by simple examples like
yours, people don't see that as a problem until they actually have to
read or write such code. Adding temporaries and such is not that great
because it further takes the algorithm away from its mathematical form
just for serving a notation that was the problem in the first place.

But what operators would be added? Some mathematician programmers might
want vector and matrix operators, others set operators, others still
derivation/integration operators, and so on. Where would we stop?
I don't deny it might be useful for them, but it does seem like too
specific a need to integrate in the language.

Composition may be useful for functional programming (I've never used
any functional programming paradigm except "reduce".)

Matrix operations: + - * .tr() .inv() .det() etc are already sufficient
for most jobs.

Vector operations: Maybe an operator for cross product.

Set operators: Just use + - * (| ~ &) instead like Pascal.

So I see only two Unicode operators that are really useful and whose
replacements are ugly: composition (o) and cross product (×).

Oct 26 2008
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Bruno Medeiros wrote:
Andrei Alexandrescu wrote:
Spacen Jasset wrote:
be a good thing anyway. How hard is it to say m3 =
m1.crossProduct(m2) ? vs m3 = m1 X m2 ? and how often will that
happen? It's also going to make the language more difficult to learn
and understand.

I have noticed that in pretty much all scientific code, the f(a, b)
and a.f(b) notations fall off a readability cliff when the number of
operators grows only to a handful. Lured by simple examples like
yours, people don't see that as a problem until they actually have to
read or write such code. Adding temporaries and such is not that great
because it further takes the algorithm away from its mathematical form
just for serving a notation that was the problem in the first place.

But what operators would be added? Some mathematician programmers might
want vector and matrix operators, others set operators, others still
derivation/integration operators, and so on. Where would we stop?
I don't deny it might be useful for them, but it does seem like too
specific a need to integrate in the language.

I was thinking of allowing a general way of defining one Unicode
character to stand in as one operator, and then have libraries implement
the actual operators.

There's the remaining problem of different libraries defining the same
character to mean different operators. This may not be huge as math
subdomains tend to be rather consistent in their use of operators.
Across math subdomains, types and overloading can take care of things.

Also, ASCII representation should be allowed for operators, and one nice
thing about Unicode characters is that many have HTML ASCII names; see
http://www.fileformat.info/format/w3c/htmlentity.htm. So
\unicodecharname may be a good alternate way to enter these operators.
For example, the empty set could be \empty, and the cross-product could
be written as \times. So

c = a \times b;

doesn't quite look bad to me.

One nice thing about this is that we don't need to pore over naming and
such, we just use stuff that others (creators and users alike) have
already pored over. Saves on documentation writing too :o).
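
As a rough illustration of the \name idea (not a proposal for how a D lexer would do it), here is a small Python sketch that expands \name into the character of the HTML 4 entity with that name, using the standard html.entities table. The function name is hypothetical, unknown names are left untouched, and the sketch makes no attempt to skip string literals, where escapes such as \n already have a meaning.

```python
import re
from html.entities import name2codepoint

# A backslash followed by an identifier-like name, e.g. \times or \empty.
_NAMED = re.compile(r"\\([A-Za-z][A-Za-z0-9]*)")


def expand_named_operators(source):
    """Replace each known entity name written after a backslash with its
    character; anything not in the HTML 4 table is kept as written."""
    def repl(match):
        name = match.group(1)
        if name in name2codepoint:
            return chr(name2codepoint[name])
        return match.group(0)

    return _NAMED.sub(repl, source)


print(expand_named_operators(r"c = a \times b;"))
print(expand_named_operators(r"auto s = \empty;"))
```

Reusing the entity table means the names are already documented elsewhere, which is the saving on naming and documentation mentioned above.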

Andrei

Oct 26 2008
KennyTM~ <kennytm gmail.com> writes:
Andrei Alexandrescu wrote:
Bruno Medeiros wrote:
Andrei Alexandrescu wrote:
Spacen Jasset wrote:
be a good thing anyway. How hard is it to say m3 =
m1.crossProduct(m2) ? vs m3 = m1 X m2 ? and how often will that
happen? It's also going to make the language more difficult to learn
and understand.

I have noticed that in pretty much all scientific code, the f(a, b)
and a.f(b) notations fall off a readability cliff when the number of
operators grows only to a handful. Lured by simple examples like
yours, people don't see that as a problem until they actually have to
read or write such code. Adding temporaries and such is not that
great because it further takes the algorithm away from its
mathematical form just for serving a notation that was the problem in
the first place.

But what operators would be added? Some mathematician programmers
might want vector and matrix operators, others set operators, others
still derivation/integration operators, and so on. Where would we stop?
I don't deny it might be useful for them, but it does seem like too
specific a need to integrate in the language.

I was thinking of allowing a general way of defining one Unicode
character to stand in as one operator, and then have libraries implement
the actual operators.

There's the remaining problem of different libraries defining the same
character to mean different operators. This may not be huge as math
subdomains tend to be rather consistent in their use of operators.
Across math subdomains, types and overloading can take care of things.

Also, ASCII representation should be allowed for operators, and one nice
thing about Unicode characters is that many have HTML ASCII names; see
http://www.fileformat.info/format/w3c/htmlentity.htm. So
\unicodecharname may be a good alternate way to enter these operators.
For example, the empty set could be \empty, and the cross-product could
be written as \times. So

c = a \times b;

doesn't quite look bad to me.

One nice thing about this is that we don't need to pore over naming and
such, we just use stuff that others (creators and users alike) have
already pored over. Saves on documentation writing too :o).

Andrei

LaTeX in D? :p

Anyway we already have \&times; and \&empty; so we could reuse them in
source code level as I've described somewhere in this thread.

auto torque = position \&times; force;

This is uglier than

auto torque = position \times force;

but it gives a uniform syntax between escape sequences inside and
outside strings.

The problem is you may have to invent some names, i.e. the composition
operator ∘ (U+2218 ring operator) has no name in SGML entities. In LaTeX
it is represented as \circ but \&circ; is already taken by ˆ (U+02C6
modifier letter circumflex accent).
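
The name clash is easy to check against Python's copy of the HTML 4 entity table. This is only a quick sketch; the helper names describe and entity_names are made up for illustration.

```python
import unicodedata
from html.entities import name2codepoint


def describe(ch):
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def entity_names(ch):
    """All HTML 4 entity names that map to this character."""
    return sorted(n for n, cp in name2codepoint.items() if cp == ord(ch))


# The entity called "circ" is the circumflex modifier, not the ring operator.
print(describe(chr(name2codepoint["circ"])))

# The ring operator itself has no name in the HTML 4 table.
print(describe("\u2218"), entity_names("\u2218"))
```

So a \name scheme built purely on HTML 4 entity names would need its own name for composition.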

And you'll need to predefine the associativity and operator precedence
too. ;) See my other entry in this thread.

Oct 26 2008
Charles Hixson <charleshixsn earthlink.net> writes:
Bruno Medeiros wrote:
Andrei Alexandrescu wrote:
Spacen Jasset wrote:
be a good thing anyway. How hard is it to say m3 =
m1.crossProduct(m2) ? vs m3 = m1 X m2 ? and how often will that
happen? It's also going to make the language more difficult to learn
and understand.

I have noticed that in pretty much all scientific code, the f(a, b)
and a.f(b) notations fall off a readability cliff when the number of
operators grows only to a handful. Lured by simple examples like
yours, people don't see that as a problem until they actually have to
read or write such code. Adding temporaries and such is not that great
because it further takes the algorithm away from its mathematical form
just for serving a notation that was the problem in the first place.

But what operators would be added? Some mathematician programmers might
want vector and matrix operators, others set operators, others still
derivation/integration operators, and so on. Where would we stop?
I don't deny it might be useful for them, but it does seem like too
specific a need to integrate in the language.

Perhaps what needs to be added is a syntax for defining character to
function correspondence?  That way people could define the binary
functions that they need, and then define a corresponding character
string that represented it.  I once recommended that Eiffel include a
means of defining user operators (i.e., binary functions that sit
between the terms on which they operate) using the name syntax thusly:

Starts and ends with '|' and doesn't contain any whitespace.  Must be
surrounded by whitespace when used.  I.e. 1 |X|-3 would be forbidden, as
there is no whitespace following the |X| operator.

That still seems like a good rule to me.  If you want to include
unicode, that's no problem.  And the function could also be used as:
X(1, -3)
with identical meaning.  I.e., marking a function as an operator by
surrounding it with pipes would be purely syntax sugar.  Note that such
operators would have a precedence higher than assignment, but lower than
everything else, so in practice the choice would be between writing:
X (1, -3)
and writing:
(1 |X| -3)
unless all one were doing is making an assignment.  This is analogous to
the class member variable in object methods, or the class name in class
methods, except that that is often understood.
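
To make the |X| rule above concrete, here is a minimal sketch in D. It only
handles a single "lhs |name| rhs" expression, and the helper names
isPipeOperator and desugar are made up for illustration; nothing like this
exists in D or Eiffel. It shows the whitespace requirement and the rewrite
into an ordinary call, not the precedence rule.

```d
import std.array : split;
import std.stdio : writeln;

/// True if `tok` looks like a pipe-delimited operator such as |X|.
/// (Hypothetical helper; the name is an assumption of this sketch.)
bool isPipeOperator(string tok)
{
    return tok.length >= 3 && tok[0] == '|' && tok[$ - 1] == '|';
}

/// Rewrites "lhs |name| rhs" into "name(lhs, rhs)".
/// Returns null when the input does not consist of exactly three
/// whitespace-separated parts with a pipe operator in the middle.
string desugar(string expr)
{
    auto parts = expr.split(); // split on whitespace
    if (parts.length != 3 || !isPipeOperator(parts[1]))
        return null;
    return parts[1][1 .. $ - 1] ~ "(" ~ parts[0] ~ ", " ~ parts[2] ~ ")";
}

void main()
{
    writeln(desugar("1 |X| -3"));        // X(1, -3)
    writeln(desugar("1 |X|-3") is null); // true: no whitespace after |X|
}
```

Because the operator is recognised purely by its delimiters, a Unicode name
such as |×| goes through the same path without any extra handling.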

OTOH, I'm not certain how much such syntax buys you.

P.S.:  another possibility, which is more in line with current D syntax
requires an assignment of the operator character to a function that
starts with op.  As in '+' is associated with opAdd.  However even
though this is more in line with current D syntax, it seems to buy you a
lot less.  And it seems to require that the operator be a single
character.  This appears to me to be more work than it's worth for the
return.  Even the approach that I suggested is probably marginal.

P.P.S:  Any system that requires that a specific IDE or editor be used
is not going to work.  Not unless the IDE were provided with the
language, and even then the most successful examples I can think of are
EMACS and Smalltalk.  (I'm excluding programs that don't run on Linux,
as I have no familiarity with either how they function or how popular
they are.  Probably, though, one could include Visual Basic and maybe
some others.  But one certainly couldn't include Basic, merely one
dialect of it.)

Oct 26 2008
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Simen Kjaeraas wrote:
On Sun, 26 Oct 2008 22:28:16 +0100, Bill Baxter <wbaxter gmail.com> wrote:

On Sun, Oct 26, 2008 at 11:02 PM, Simen Kjaeraas
<simen.kjaras gmail.com> wrote:
On Sat, 25 Oct 2008 12:14:47 +0200, Spacen Jasset
<spacenjasset yahoo.co.uk>
wrote:

Why unicode anyway? In the same way that editor support is required to
actually type them in, why not let the editor render them. So
symbol 'x' in the source code, say:

m3 = m1 cross_product m2

as an infix notation in a similar way to the (unary) sizeof
operator.

While cross_product is a bit long and unwieldy, any capable editor can
replace the rendition of that keyword with a symbol for it. But in
editors that can't, it can still be typed in and/or displayed easily.

Another option includes providing cross_product as an 'alias' and 'X'
as well.

Which then leads on to the introduction of a facility to add arbitrary
operators, which could be interesting because you can supply any
operator you see fit for the domains that you use that require it. --
This doesn't provide exactly the right solution though, as all the
additions would be 'non standard' and I can see books in the future
recommending people not use unicode operators, because editors don't
have support for them.

This made me think. What if we /could/ define arbitrary infix
operators in D? I'm thinking something along the lines of:

operator cross_product(T, U)
{
    static if (T.opCross)
    {
        T.opCross(U);
    }
    else static if (U.opCross)
    {
        U.opCross_r(T);
    }
    else
    {
        static assert(false, "Operator not applicable to operands.");
    }
}

alias cross_product ×;

I'm not sure if this is possible, but it sure would please downs. :P

What's the precedence of your user-defined in-fix operator?

--bb

Yup, I realized this myself as well. Seemed like such a great idea when
I only thought of it for three seconds. :p

An operator could always be defined to have the same precedence as an
existing operator, which it has to specify.

Andrei

Oct 26 2008
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Bill Baxter wrote:
On Mon, Oct 27, 2008 at 9:04 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

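What's the precedence of your user-defined in-fix operator?

--bb

Yup, I realized this myself as well. Seemed like such a great idea when I
only thought of it for three seconds. :p

An operator could always be defined to have the same precedence as an
existing operator, which it has to specify.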

Walter said in a previous post a few days ago when I suggested it that
that would kill D's easy parsability.
You say no?  I'm no parser expert, so hard for me to say.

It can be done, but it's kinda involved. You define a grammar in which
all operators have the same precedence. Consequently you compile any
expression into a list of operands and operators. That makes the
language parsable without semantic info. Then the semantic stage
transforms the list into a tree. Cecil does that.
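
The two-stage scheme described above (flat list first, tree later) can be
sketched as follows. This is an illustration under assumptions, not how DMD
or Cecil actually work: the token list is assumed to be already split, every
operator is binary and left-associative, and the names Node, precedence,
buildTree and render are invented here. The precedence table stands in for
whatever user declarations the semantic stage would have collected.

```d
import std.exception : enforce;
import std.stdio : writeln;

/// Tree node produced by the (hypothetical) semantic pass.
class Node
{
    string text;      // operand text or operator symbol
    Node left, right; // both null for operands

    this(string text, Node left = null, Node right = null)
    {
        this.text = text;
        this.left = left;
        this.right = right;
    }
}

/// Operator -> precedence (higher binds tighter). Assumed to be filled
/// from user declarations before buildTree runs.
int[string] precedence;

/// Precedence climbing over the flat list of operands and operators.
Node buildTree(string[] tokens, ref size_t pos, int minPrec)
{
    Node lhs = new Node(tokens[pos++]); // an operand
    while (pos < tokens.length)
    {
        string op = tokens[pos];
        auto p = op in precedence;
        enforce(p !is null, "not a known operator: " ~ op);
        if (*p < minPrec)
            break;
        pos++;
        Node rhs = buildTree(tokens, pos, *p + 1); // +1: left-associative
        lhs = new Node(op, lhs, rhs);
    }
    return lhs;
}

/// Fully parenthesised rendering, to make the tree shape visible.
string render(Node n)
{
    if (n.left is null)
        return n.text;
    return "(" ~ render(n.left) ~ " " ~ n.text ~ " " ~ render(n.right) ~ ")";
}

void main()
{
    // "×" is declared with the same precedence as the existing "*".
    precedence = ["+": 1, "*": 2, "×": 2];
    string[] flat = ["a", "+", "b", "×", "c", "*", "d"];
    size_t pos = 0;
    writeln(render(buildTree(flat, pos, 0))); // (a + ((b × c) * d))
}
```

Note that whether a token is a legal operator is only discovered inside
buildTree, which is the deferred decision the scheme implies.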

Andrei

Oct 26 2008
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Bill Baxter wrote:
On Mon, Oct 27, 2008 at 11:43 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
Bill Baxter wrote:
On Mon, Oct 27, 2008 at 9:04 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

What's the precedence of your user-defined in-fix operator?

--bb

Yup, I realized this myself as well. Seemed like such a great idea when I
only thought of it for three seconds. :p

An operator could always be defined to have the same precedence as an
existing operator, which it has to specify.

Walter said in a previous post a few days ago when I suggested it that
that would kill D's easy parsability.
You say no?  I'm no parser expert, so hard for me to say.

It can be done, but it's kinda involved. You define a grammar in which all
operators have the same precedence. Consequently you compile any expression
into a list of operands and operators. That makes the language parsable
without semantic info. Then the semantic stage transforms the list into a
tree. Cecil does that.

I see.  So the price you pay is that you defer more decisions till
semantic stage.

I.e. "a b c d e" is allowed to parse into an amorphous list, then in
the semantic pass you decide if 'b' and 'd' are actually legal
operators or not.

Yah. Something tells me Walter won't embark on that soon.

Andrei

Oct 26 2008
Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
Yah. Something tells me Walter won't embark on that soon.

Not a chance <g>. Producing an amorphous list of tokens isn't what I'd
call "parsing".

Oct 26 2008
Jesse Phillips <jessekphillips gmail.com> writes:
On Thu, 23 Oct 2008 09:52:34 +0900, Bill Baxter wrote:

On Thu, Oct 23, 2008 at 7:27 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
Please vote up before the haters take it down, and discuss:

I think the right way to do it is not to make everything Unicode. All
the pressure on the existing symbols would be dramatically relieved by
the addition of just a handful of new symbols.

The truth is keyboards aren't very good for inputting Unicode. That
isn't likely to change. Yes they've dealt with the problem in Asian
languages by using IMEs but in my opinion IMEs are horrible to use.

Some people seem to argue it's a waste to go to Unicode only for a few
symbols. If you're going to go Unicode, you should go whole hog. I'd
argue the exact opposite. If you're going to go Unicode, it should be
done in moderation. Use as little Unicode as necessary and no more.

As for how to input unicode -- Microsoft Word solved that problem ages
ago, assuming we're talking about small numbers of special characters.
It's called AutoCorrect. You just register your unicode symbol as a
misspelling for "(X)" or something unique like that and then every time
you type "(X)" a funky unicode character instantly replaces those chars.

Yeh, not many editors support such a feature. But it's very easy to
implement. And with that one generic mechanism, your editor is ready to
support input of Unicode chars in any language just by adding the right
definitions.

--bb

I don't find this terribly appealing. Walter mentions having thrown out
support for 16bit processors and such. Why not throw out 32bit too?
Those are going out of style.

The point is, it's not the language's job to force a change of hardware. And
support via a text editor is also not acceptable. Going the software
support route relies on the OS to support a universal easy method to
enter unicode.

As for D's case, I say support unicode for these new operators, but
provide the same function with keyboard provided symbols.

Oct 22 2008
Max Samukha <samukha voliacable.com.removethis> writes:
On Wed, 22 Oct 2008 17:27:58 -0500, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

Please vote up before the haters take it down, and discuss:

Andrei

I'm already having problems with unicode: the news reader I'm using
doesn't display the characters correctly (maybe it's time to update).
If unicode can be avoided, please avoid it.

Oct 22 2008
bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

Few random thoughts on the subject:
- Someday programming languages will probably use some Unicode symbols. I don't
know if Fortress will succeed, but I think someday some language will do it.
Probably Unicode symbols will be used as in Fortress, to improve the
readability of the code, and not as in APL to transform the code into
hieroglyphics.
- Another good thing that Fortress does is that there are always *nice* looking
ways to write the same code in pure ASCII. So there are usually intuitive 2 or
3 char long translations of all the accepted Unicode symbols. This is very
positive, so you can write/read Fortress with a normal ASCII editor too.
- My editor, programming font, newsreader, IDEs, and probably more things,
currently have problems with Unicode texts.
- Novels in English and other languages show that you can express very complex
and refined thoughts with just very few characters. But you need some space to
write a novel/short story. Mathematics shows that a judicious usage of standard
and widely used symbols helps a lot in decreasing the space used to represent
formulas, etc.
- Fortress and the Mathematica language are designed for physics and
mathematics. D language can be used for that, but it's mostly a system
language. So symbols are more used and more important in Fortress than D. So
their purposes and targets are different.
- I like the idea of using *few* Unicode symbols in my programs, they can
reduce code size and they may even improve readability.
- Python3 allows Unicode identifiers, mostly to allow people in all parts of the
world to write variable names in their languages.
- But seeing the disadvantages in the end I think that in practice adopting
Unicode for D programs is currently bad.

Bye,
bearophile

Oct 23 2008
Robert Fraser <fraserofthenight gmail.com> writes:
bearophile wrote:
- Python3 allows Unicode identifiers, mostly to allow people in all part of
the world to write variable names in their languages.

So does D.

Oct 23 2008
Max Samukha <samukha voliacable.com.removethis> writes:
On Thu, 23 Oct 2008 04:23:29 -0700, Robert Fraser
<fraserofthenight gmail.com> wrote:

bearophile wrote:
- Python3 allows Unicode identifiers, mostly to allow people in all part of
the world to write variable names in their languages.

So does D.

considered bad style by many programmers. Besides, a big part of
software projects nowadays are international. Imagine participants of
the Linux project each writing identifiers in their own language.

Oct 23 2008
Yigal Chripun <yigal100 gmail.com> writes:
Max Samukha wrote:
On Thu, 23 Oct 2008 04:23:29 -0700, Robert Fraser
<fraserofthenight gmail.com> wrote:

bearophile wrote:
- Python3 allows Unicode identifiers, mostly to allow people in
all part of the world to write variable names in their languages.

considered bad style by many programmers. Besides, big part of
software projects nowadays are international. Imagine participants of
linux project writing identifiers in his language.

isn't that something that should be decided upon on a per-project basis?
I agree that it'll be bad for Linux, but each project has its own
objectives. for example, what if you're teaching a programming course
for kids? it'll be easier for them writing in their own native language.
I could easily imagine a small start-up writing in their own native
language (let's say Hebrew) as one way for obfuscating the source code,
so as to protect their IP.
there are, I'm sure, more use-cases.

Oct 23 2008
bearophile <bearophileHUGS lycos.com> writes:
I always use English for variable names, instead of my language, because I've
had my share of debugging code with variables in other languages and it's not a
nice thing to do.

Regarding Python code, its std lib keeps identifiers in English only, but when
they invented the One Laptop per Child machine, which uses Python a lot, they
decided that 'kids' may enjoy using variable names in their language...

Bye,
bearophile

Oct 23 2008
Max Samukha <samukha voliacable.com.removethis> writes:
On Thu, 23 Oct 2008 08:33:16 -0400, bearophile
<bearophileHUGS lycos.com> wrote:

I always use English for variable names, instead of my language, because I've
had my share of debugging code with variables in other languages and it's not a
nice thing to do.

Regarding Python code, its std libs keeps identifiers in English only, but when
they have invented the OneLaptopForChild that uses Python a lot, they have
decided that 'kids' may enjoy using variable names in their language...

Bye,
bearophile

Keep children away from Python. Let them have happy lives :)

Oct 23 2008
Walter Bright <newshound1 digitalmars.com> writes:
Robert Fraser wrote:
bearophile wrote:
- Python3 allows Unicode identifiers, mostly to allow people in all
part of the world to write variable names in their languages.

So does D.

D currently allows Unicode in identifiers, comments, and strings. In
fact, D source text is defined to be Unicode.

Oct 23 2008
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
bearophile wrote:
Andrei Alexandrescu:

(No need to single me out. It's Walter's post, and besides I don't have
a formed opinion on Unicode symbols.)

Andrei

Oct 23 2008
Yigal Chripun <yigal100 gmail.com> writes:
Andrei Alexandrescu wrote:
Please vote up before the haters take it down, and discuss:

Andrei

A few thoughts on the subject:

- others already mentioned, i think, smalltalk as an example. smalltalk
bundles as part of the language also the complete environment and IDE so
they can add Unicode chars without worrying much about editor support.
in D this is an issue as D doesn't provide an "official" D editor. The
support for Unicode largely exists - even plain notepad supports Unicode
fully - but that doesn't mean people are using any of the many editors
that have this feature.

- smalltalk uses left-arrow as assignment op. the way you enter it is by
typing "<_" so this is similar to Bill's suggestion, i.e. define a short
sequence of chars to be replaced by a Unicode char in the file source.

- why not generalize the concept? a few ideas: syntax is not important
here, just the idea itself..
1) bool compare as == (A a, A b) {}
you can add an op alias to your function, maybe define anonymous
function with alias to be used only as op.
2) provide a way to specify which functions can be used as infix
functions (Scala does that IIRC) and maybe even specify precedence
somehow, so that downs' map function could be written as :
infix void map(...) {}
and used as: dg map array;

Oct 23 2008
KennyTM~ <kennytm gmail.com> writes:
Andrei Alexandrescu wrote:
Please vote up before the haters take it down, and discuss:

_in_d_similarly_to/

Andrei

I suggest not. There are problems if you adopt Unicode as operators:

======

1) My editor supports Unicode, but my keyboard doesn't. So how do I type ∩
and ∪ for a set«T»?

1.1) What if the library writer forgets to provide an alternative,
ASCII-only name? [This is also a problem of using Unicode in identifiers
in general.]

1.2) Some suggested auto-correction in the IDE. Again what if I used

I had suggested once before, but let me put it formally here. If you
really want to support Unicode operators in source code,

- Firstly, ditch the ability to replace \xxx with '\xxx' when it
appears without the quotes (so “char x = \n;” won't compile).
- Then, replace \xxx with the character represented in source level, so

Vector3D«real» τ = r × F;

can be written as

Vector3D!(real) \&tau; = r \&times; F;

- You don't need to introduce a separate trigraph.
- But this suggestion does trigger some people's trigraph-phobia. [Yell no!
Now! :) ]
- It may make the source code difficult to parse grammatically.
- It will make the source code difficult to read, just look at the
number of semicolons in the ASCII encoded version.
- But at least you can compile your code.
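
A rough sketch of the substitution step proposed above, written as a
source-to-source filter rather than as part of the lexer. The entity table
is a three-entry subset chosen for the example, and entityChar, expandOne
and expandEntities are hypothetical names; only std.regex from Phobos is
real. Unknown entity names are left untouched.

```d
import std.regex : Captures, regex, replaceAll;
import std.stdio : writeln;

/// Illustrative subset of the HTML entity table; null means "unknown".
string entityChar(string name)
{
    switch (name)
    {
        case "times": return "×";
        case "tau":   return "τ";
        case "empty": return "∅";
        default:      return null;
    }
}

/// Replacement callback: m[1] is the entity name without "\&" and ";".
string expandOne(Captures!string m)
{
    auto ch = entityChar(m[1]);
    return ch is null ? m.hit : ch; // keep unknown escapes as written
}

/// Replaces every \&name; in `source` with the character it names.
string expandEntities(string source)
{
    return replaceAll!expandOne(source, regex(r"\\&(\w+);"));
}

void main()
{
    // Prints: Vector3D!(real) τ = r × F;
    writeln(expandEntities(r"Vector3D!(real) \&tau; = r \&times; F;"));
}
```

A filter like this does not distinguish escapes inside string literals from
those outside, which a real lexer-level implementation would have to do.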

======

2) This is regarding the rejection of « & » to be supported even if the
emacs module goes official. Of course it turns out it is not, but let's
think of these scenarios:

2.1) OK it turns out ∪ and ∩ and «T» were just .opUnion(x) and
.opIntersect(x) and !(T) pretty-printed in emacs; the compiler won't
accept these characters anyway. But sometimes I forget and just copy a
portion of this code to nano/geany/whatever and then it stops compiling!

2.2) Well this copy&paste problem has been solved in the IDE level by
inverting the pretty printing while copying. But now I publish my
fantastic, pretty-printed D program in a web page/PDF/whatever, and
people just complain the compiler won't accept it!

I still believe if you're going to transform D code to Unicode visually,
the compiler must accept these visual replacements as well.

May I also take Mathematica as an example. The programming language
itself uses a heavy load of non-ASCII characters, and the IDE also
pretty-prints them as nice mathematical formulas, but at the “source
code” level they are just escape sequences. So on screen you see

E^(I π) + 1

but in the source code you'll see

E^(I \[Pi]) + 1

However, if you type in “E^(I π) + 1” in a plain .nb file and open with
the Get[] function (think of it as “import xx.d”) it can still correctly
display the result “0”.

======

3) There are over 800 unary or binary operators in Unicode[1]. How are
you going to opXXX all them? Assume your blog entry doesn't mean the
simple “!=” ↦ “≠” transformation.

Use the C++/C# approach? But I heard that's no good.

======

4) These are regarding if you are going to support overloading for all
these 800 operators, how to define:

4.1) [Big problem] Operator precedence? (One person may want ∧ to mean
the wedge product (so they have higher precedence than + and -) but
another want it to mean logical AND (so lower than + and -).)

4.2) Associativity? How to determine if an operator is left-associative,
right-associative or both? (∧ as wedge product is both, while ∧ as a
power function pow(a,b) is right-assoc.)

4.3) [Minor problem] Commutativity? Or we'll need to write opXXX and
opXXX_r all the time?

I don't have solutions for D on these. For 4.2 & 4.3 in C# we can
introduce some attributes like

[Associative, Commutative]
FuzzyBool operator∧ (FuzzyBool x, FuzzyBool y) { return min(x,y); }

(Not actual C# code.)

but it's not D. :)

Or predefine the meaning, precedence and associativity for each
operator, so e.g. ∧ always means the wedge product and not logical AND,
just like now ^ always means XOR and not power function.

Or just require the programmer to always put the parenthesis.

Ref: [1] A rough word count in
http://www.unicode.org/Public/math/revision-11/MathClass-11.txt. The
actual number is higher than this.

Oct 23 2008
Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
KennyTM~ wrote:

1.2) Some suggested auto-correction in the IDE. Again what if I used

Then I suggest a change in career... ^^'

--
Bruno Medeiros - Software Developer, MSc. in CS/E graduate
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D

Oct 24 2008
"Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On Thu, 23 Oct 2008 00:27:58 +0200, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

Please vote up before the haters take it down, and discuss:

Andrei

I really like the idea of having more unicode in the language, but I feel
these should be fairly limited.

There are times I feel that more operators (especially, as has been
mentioned, opCross and opDotProduct) would be nice to have, but it's just
sugar, really.

As an example, while I'd enjoy seeing code like this, I'm not sure I'd
enjoy writing it (Note that I am prone to exaggerations):

int a = ∅; //empty set, same as "= void"
int[] b = [1,2,3,4,5,6];

if (a ∈ b) // Element of - "in"
{
    float c = 2.00001;
    float d = readInt();
    writefln(c ≈ ⌈d⌉ ); // Approximately equal, ceil

    myClass c = getInstance();
    if (∃c) // c exists, i.e. "!is null"
    {
        // I thought this should work in D today, using
        // "alias sqrt √;", but it seems the compiler chokes on it. :(
        writefln(√(c.foo));
    }

    ∀element∈b // New foreach syntax!
    {
        element *= ¼;
    }
}

--
Simen

Oct 23 2008
Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
Simen Kjaeraas wrote:

As an example, while I'd enjoy seeing code like this, I'm not sure I'd
enjoy writing it (Note that I am prone to exaggerations):

int a = ∅; //empty set, same as "= void"
int[] b = [1,2,3,4,5,6];

Hum, interesting example, it actually made me realize that 'null' would
be an ideal candidate for having a Unicode symbol of its own. Does
anyone have suggestions for a possible one? Preferably somewhat
circle-shaped.

--
Bruno Medeiros - Software Developer, MSc. in CS/E graduate
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D

Oct 24 2008
KennyTM~ <kennytm gmail.com> writes:
Bruno Medeiros wrote:
Simen Kjaeraas wrote:
As an example, while I'd enjoy seeing code like this, I'm not sure I'd
enjoy writing it (Note that I am prone to exaggerations):

int a = ∅; //empty set, same as "= void"
int[] b = [1,2,3,4,5,6];

Hum, interesting example, it actually made me realize that 'null' would
be an ideal candidate for having a Unicode symbol of it's own. Does
anyone have suggestions for a possible one? Preferably somewhat
circle-shaped.

auto Ø = null; // \&Oslash;

I assume you're not serious...

Oct 24 2008
Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
KennyTM~ wrote:
Bruno Medeiros wrote:
Simen Kjaeraas wrote:
As an example, while I'd enjoy seeing code like this, I'm not sure
I'd enjoy writing it (Note that I am prone to exaggerations):

int a = ∅; //empty set, same as "= void"
int[] b = [1,2,3,4,5,6];

Hum, interesting example, it actually made me realize that 'null'
would be an ideal candidate for having a Unicode symbol of it's own.
Does anyone have suggestions for a possible one? Preferably somewhat
circle-shaped.

auto Ø = null; // \&Oslash;

I assume you're not serious...

It's an interesting and effective way to save some typing, and it might
be even more readable (but with a symbol other than Ø).
But I probably would not use it anyway, since I like to write very
standardized code, that other people can easily recognize and read.

--
Bruno Medeiros - Software Developer, MSc. in CS/E graduate
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D

Oct 26 2008
"Bill Baxter" <wbaxter gmail.com> writes:
On Fri, Oct 24, 2008 at 5:48 AM, Simen Kjaeraas <simen.kjaras gmail.com> wrote:

writefln(√(c.foo)); // I thought this should work in D today, using
"alias sqrt √;", but it seems the compiler chokes on it. :(

According to the spec, you can only use "UniversalAlpha" Unicode
characters in your identifiers.  Supposedly those are defined in
ISO/IEC 9899:1999(E) Appendix D.  But I'm guessing the ISO did not
define square-root-symbol as an alpha character.

--bb

Oct 23 2008
"Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On Thu, 23 Oct 2008 23:47:59 +0200, Bill Baxter <wbaxter gmail.com> wrote:

On Fri, Oct 24, 2008 at 5:48 AM, Simen Kjaeraas <simen.kjaras gmail.com>
wrote:

writefln(√(c.foo)); // I thought this should work in D today, using
"alias sqrt √;", but it seems the compiler chokes on it. :(

According to the spec, you can only use "UniversalAlpha" Unicode
characters in your identifiers.  Supposedly those are defined in
ISO/IEC 9899:1999(E) Appendix D.  But I'm guessing the ISO did not
define square-root-symbol as an alpha character.

--bb

That seems to make sense indeed.
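
As a rough check of the point above, std.uni.isAlpha from Phobos can stand
in for the identifier rule. That is an assumption made for illustration: the
D spec refers to the C99 table, which is not the same thing as Unicode's
Alphabetic property, so treat the output as indicative only.

```d
import std.stdio : writefln;
import std.uni : isAlpha;

void main()
{
    // Candidate "identifier" characters that came up in this thread.
    foreach (dchar c; "Øτ√∅×")
    {
        // isAlpha tests the Unicode Alphabetic property, used here only
        // as an approximation of the spec's UniversalAlpha set.
        writefln("U+%04X %s -> isAlpha: %s", cast(uint) c, c, isAlpha(c));
    }
}
```

This should report Ø and τ as alphabetic and √, ∅ and × as not, which is
consistent with Ø being usable as an identifier while √ is rejected.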

--
Simen

Oct 23 2008
Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
Andrei Alexandrescu wrote:
Please vote up before the haters take it down, and discuss:

_in_d_similarly_to/

Andrei

I don't know if it would be worthwhile, but I would say there are two
aspects that likely would need to be observed for this to work out
favorably:

* Having non-Unicode versions of the symbols/keywords available in
Unicode, such that non-Unicode editing and viewing is always possible
as a fallback. This has some important consequences though, such as
making Unicode-symbol-usage unable to solve the shortage of brackets
for, for example, the template instantiation syntax (because an
alternative ASCII notation would still be necessary).

* Having a way to directly input the Unicode symbols from the keyboard.
One reason is typing succinctness, and another is that I
find the alternative (have the editor/IDE automatically change an ASCII
character sequence into a Unicode symbol) to have several disadvantages:
First is that it doesn't work outside the editors/IDEs configured to do
so (which is a bummer, there is actually plenty of code written outside
that: newsgroups, articles, forums, bug reports, IRC, etc.). Second, I
personally like that the editor always requires exactly N backspaces to
erase N typed characters[*].

So, does anyone know if it is possible on Windows (I believe in Unix it is)
to configure your keyboard mapping with custom settings? For example, if
I press AltGr-O, it inputs some Unicode character of my choosing?

[*] As a sidenote, this is also why I don't like having my editor
configured to insert 4 spaces on TAB-press. Unless, the editor is also
smart enough to delete the 4 spaces on one backspace/delete and move 4
spaces on one move cursor operation (arrow key press).

--
Bruno Medeiros - Software Developer, MSc. in CS/E graduate
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D

Oct 24 2008
"Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On Fri, 24 Oct 2008 18:28:51 +0200, Bruno Medeiros
<brunodomedeiros+spam com.gmail> wrote:

So, anyone knows if it is possible on Windows (I believe in Unix it is)
to configure your keyboard mapping with custom settings? For example, if
I press AltGr-O, it inputs some Unicode character of my choosing?

I'd guess this oughtta do it:
http://www.microsoft.com/globaldev/tools/msklc.mspx

--
Simen

Oct 24 2008
Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
Simen Kjaeraas wrote:
On Fri, 24 Oct 2008 18:28:51 +0200, Bruno Medeiros
<brunodomedeiros+spam com.gmail> wrote:

So, anyone knows if it is possible on Windows (I believe in Unix it
is) to configure your keyboard mapping with custom settings? For
example, if I press AltGr-O, it inputs some Unicode character of my
choosing?

I'd guess this oughtta do it:
http://www.microsoft.com/globaldev/tools/msklc.mspx

Yes, exactly that! I had the impression there was such a program for
Windows, but couldn't remember the name.

--
Bruno Medeiros - Software Developer, MSc. in CS/E graduate
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D

Oct 26 2008
Robert Fraser <fraserofthenight gmail.com> writes:
Simen Kjaeraas wrote:
So, does anyone know if it is possible on Windows (I believe in Unix it
is) to configure your keyboard mapping with custom settings? For
example, if I press AltGr-O, it inputs some Unicode character of my
choosing?

I'd guess this oughtta do it:
http://www.microsoft.com/globaldev/tools/msklc.mspx

I remember this same question being asked on a Microsoft DL when I was
working there, and all the answers given were for third-party tools like
KeyTweak ( http://webpages.charter.net/krumsick/ ) ;-P . Good to know
there's an MS one.

Oct 26 2008
bearophile <bearophileHUGS lycos.com> writes:
Bruno Medeiros:
* Having non-unicode versions of the symbols/keywords available in Unicode,
such that non-Unicode editing and viewing is always possible as a fallback.
This has some important consequences though, such as making
Unicode-symbol-usage unable to solve the shortage of brackets for, for example,
the template instantiation syntax (because an alternative ASCII notation would
still be necessary).<

Fortress uses pairs of symbols to denote various sequence literals. Some of
them can be seen in F# too; you can see some here:
http://a6systems.com/fsharpsheet.pdf

Creates the list:
let lsgen2 = [0 .. 2 .. 8]
Gives:
[0;2;4;6;8]
Note:  0 .. 2 .. 8  corresponds to the Python slice-with-stride syntax 0:9:2 (F#'s upper bound is inclusive, Python's is exclusive)
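
The inclusive-versus-exclusive bound is easy to get wrong, so here is a minimal Python sketch of the comparison. The helper name fsharp_range is made up for illustration and only handles positive steps; it is not part of F# or Python.

```python
def fsharp_range(start, step, stop):
    """Inclusive stepped range, mimicking F#'s [start .. step .. stop]
    for positive steps (hypothetical helper, for illustration only)."""
    return list(range(start, stop + 1, step))

digits = list(range(10))

print(fsharp_range(0, 2, 8))  # [0, 2, 4, 6, 8]
print(digits[0:8:2])          # [0, 2, 4, 6]     (8 itself is excluded)
print(digits[0:9:2])          # [0, 2, 4, 6, 8]
```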

Create the array:
let argen2 = [|0 .. 2 .. 8|]
Gives:
[|0;2;4;6;8|]

Creating a seq (that is lazy):
let s = seq { for i in 0 .. 10 do yield i }

F# also has algebraic types, which will become very useful in D2 as it becomes
more functional (as they are in Scala too, which is partially functional).
F# and Scala are languages to copy from because they are
functional-procedural-OOP hybrids, almost what D2 will want to become: D2 is so
far just a bit functional, Scala is more functional, F# even more, and
languages like Haskell are functional all the way. This is an Augmented
Discriminated Union:

type BinTree<'a> =
    | Node of BinTree<'a> * 'a * BinTree<'a>
    | Leaf
    with member self.Depth() =
            match self with
            | Leaf -> 0
            | Node(l, _, r) -> 1 + l.Depth() + r.Depth()

So D2 could use collection literals similar to those in F# to implement
lazy/nonlazy collection generators too. This is the third iteration of my ideas
on this topic (if you think succinctness in (partially) functional languages is
useless, think again: it lets you use certain constructs instead of falling back
to more procedural idioms):

auto flat = (abs(el) for(row: mat) for(el: row) if (el % 2)); // lazy
auto multi = [c:mulIter(c, i) for(i,c: "abcdef")]; // AA
auto squares = void[x*x for(x: 0..100)]; // set
void[int] squares = [x*x for(x: 0..100)];// set, alternative syntax
auto squares = {x*x for x in xrange(100)}; // set, alternative syntax
auto squares = {| x*x for(x: 0..100) |}; // list?
auto squares = [| x*x for(x: 0..100) |]; // multiset? something else?
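
For comparison, here is a rough sketch of how the first three forms look in Python, which already has lazy generator expressions plus dict and set comprehensions. The sample matrix is made up, and since mulIter is not defined in the post, repeating the character i times stands in for it as an assumption.

```python
mat = [[1, -2, 3], [-4, 5, -6]]  # sample data, made up for illustration

# Lazy: a generator expression computes nothing until it is iterated.
flat = (abs(el) for row in mat for el in row if el % 2)

# Associative array: a dict comprehension. Repeating the character is an
# assumed stand-in for the post's unspecified mulIter(c, i).
multi = {c: c * i for i, c in enumerate("abcdef")}

# Set comprehension.
squares = {x * x for x in range(100)}

flat_items = list(flat)   # forces the lazy generator
print(flat_items)         # [1, 3, 5]
print(multi["c"])         # cc
print(len(squares))       # 100
```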

Bye,
bearophile

Oct 24 2008
"Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On Fri, 24 Oct 2008 18:52:03 +0200, Bruno Medeiros
<brunodomedeiros+spam com.gmail> wrote:

Simen Kjaeraas wrote:
As an example, while I'd enjoy seeing code like this, I'm not sure I'd
enjoy writing it (Note that I am prone to exaggerations):
int a = ∅; //empty set, same as "= void"
int[] b = [1,2,3,4,5,6];

Hum, interesting example, it actually made me realize that 'null' would
be an ideal candidate for having a Unicode symbol of its own. Does
anyone have suggestions for a possible one? Preferably somewhat
circle-shaped.

Well, we Norwegians got the Ø (html entity &Oslash;, Latin-1 character
216) - looks a lot like the empty set symbol.

--
Simen

Oct 24 2008
"Bill Baxter" <wbaxter gmail.com> writes:
On Sun, Oct 26, 2008 at 3:46 AM, Spacen Jasset <spacenjasset yahoo.co.uk> wrote:
I am not entirely sure that 30 or (x amount) of new operators would be a
good thing anyway. How hard is it to say m3 = m1.crossProduct(m2) ? vs m3 =
m1 X m2 ? and how often will that happen? It's also going to make the
language more difficult to learn and understand.

I have noticed that in pretty much all scientific code, the f(a, b) and
a.f(b) notations fall off a readability cliff when the number of operators
grows only to a handful. Lured by simple examples like yours, people don't
see that as a problem until they actually have to read or write such code.
Adding temporaries and such is not that great because it further takes the
algorithm away from its mathematical form just for serving a notation that
was the problem in the first place.

Yes, heavy math code is hard to read in the current situation.
I almost always prefix any significant math with a comment giving the
equations being implemented in a more compact notation.
Having to write the same thing in two different ways like that is a
waste of effort.
It would be very cool if I could just write it once and have it look
like it does in my notebook.

Yes, that is indeed a fair point and I agree. D is a "systems programming
language." [sic] though; and so what will people use it for in the main?

D is a compile-to-the-metal language that is of interest to anyone who
ranks performance high on their list of priorities.  Mathematicians
and scientists are among the few remaining groups where maximum speed
is still needed.  Games are another area, and games are becoming more
and more sophisticated mathematically under the hood.

I suggest that communities that require scientific code have options now, and
that they can and do choose languages for the purpose which have better
support for their needs than D might achieve.

The traditional math languages suck at doing anything besides math.
Want to do a bit of math then display the results interactively in an
OpenGL window?  With Fortran?!  Ha!

On the other end there are the Matlab and NumPy-type solutions.  They
are convenient for tinkering around and displaying some results, but
these are not good for performance.

D has both.  So I think D has potential to gain traction in the world
of math-heavy computing.

But anyway, I got convinced several posts back that the time is not
yet ripe for Unicode in D.  So I'm not gonna argue that D go Unicode
now.   I'm just saying that math code is hard to read, and that heavy
math users are a good target audience for D because they need
performance, but don't necessarily want to give up
general-purposeness.

--bb

Oct 25 2008
"Bill Baxter" <wbaxter gmail.com> writes:
On Sun, Oct 26, 2008 at 5:10 AM, bearophile <bearophileHUGS lycos.com> wrote:
Bill Baxter:
On the other end there are the Matlab and NumPy-type solutions.  They
are convenient for tinkering around and displaying some results, but
these are not good for performance.

I have seen many scientific programs that use numpy, so sometimes it's fast
enough. But it forces you to write everything in a vector programming style,
that a procedural programmer needs time to learn. Normal C/D/C++ code is more
flexible, you can work on single items too in a fast way, while in numpy you
can go fast only when you work in bulk, on vectors.

Yep  C/D/C++ is easier.  The SciPy.org site has a growing section of
their wiki devoted to how to make your code fast using various levels
of python/native hybrids.  I was using python heavily for numerical
stuff for a while and it got to the point where I realized that the
time I spent trying to figure out how to vectorize things and use
other tricks to make things fast, and to make python modules out of
external code I wanted to call,  etc.  was actually more work than it
would be to just use D for everything.   Sure Python does have some
nice features as a language that D lacks, but from 10,000 ft  D is a
lot closer to Python than C++ in terms of ease of use.  Also, while
Python is nice for arrays and number crunching, I found the lack of
typing to be a liability when it comes to complicated graph
structures.  Instead of nicely typed pointers that the compiler can
tell apart, you end up with 23 different integer index variables that
you have to keep straight.  And finally, also type related, there's
the annoyance that you have to actually run your app to detect typos.

I'm sure there are ways to work around all those issues, but to me D's
a lot easier.  I simply don't need the workarounds.

I still fire up NumPy and Matplotlib for analyzing the results
from my D programs.  And SymPy is great too.  I just don't use it as
my main development language any more.

On the other hand numpy offers you some higher level operations on arrays that
are currently missing in D, like certain complex slicing operations, that may
reduce your code length significantly, increasing code readability (because it
looks more like formulas); I can show you some examples if you want.

No thanks!  Been there, done that!

Note that in D there are no built-in rectangular dynamic arrays, which are basic
stuff in numpy/matlab.

I've got my dflat and gobo
(http://www.dsource.org/projects/multiarray) that are working for me
pretty well.  They could use some full-time loving to make more
operations work intuitively, but the basics work ok.

--bb

Oct 25 2008
Alix Pexton <alixD.TpextonNO SPAMgmailD.Tcom> writes:
Andrei Alexandrescu wrote:
Please vote up before the haters take it down, and discuss:

_in_d_similarly_to/

Andrei

I've been following this thread without really having an opinion to
offer, but I just had a thought...

We already know that D's CTFE and templates can be used together to
parse DSLs (matrix ops, regular expressions and IIRC Scheme too) and
turn them into optimal native code. That suggests to me that it is
already possible to write D code that can turn an expression written in
established mathematical/scientific notation (complete with unicode
symbols) into either conventional D code, or machine code.

What I am not sure of is whether it would be possible to make it general
enough to work with all mathematical dialects (I seem to remember some
overlapping in ways that might be problematic). A complete solution
would have to be able to define new operators (including their
associativity and precedence) in such a way that they can be looked up
by the templates that evaluate the expression.

Another related thought I had: Would it be possible to write a
compile-time parser that turned MathML into code? I'm not even sure if
MathML is structured enough to represent the underlying meaning of an
expression rather than just its graphical form. Perhaps it would be more
interesting to write the code that did the transformation in the opposite
direction, turning expressions written in D into MathML ^^

A...

Oct 26 2008
"Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On Sat, 25 Oct 2008 12:14:47 +0200, Spacen Jasset
<spacenjasset yahoo.co.uk> wrote:

Why unicode anyway? In the same way that editor support is required to
actually type them in, why not let the editor render them. So instead of
symbol 'x' in the source code, say:

m3 = m1 cross_product m2

as an infix notation in a similar way to the (unary) sizeof operator.

While cross_product is a bit long and unwieldy any editor capable can
replace the rendition of that keyword with a symbol for it. But in
editors that don't it means that it still can be typed in and/or
displayed easily.

Another option includes providing cross_product as an 'alias' and 'X'
as well.

Which then leads on to the introduction of a facility to add arbitrary
operators, which could be interesting because you can supply any
operator you see fit for the domains that you use that require it. --
This provide exactly the right solution though as all the additions
would be 'non standard' and I can see books in the future recommending
people not use unicode operators, because editors don't have support for
them.

This made me think. What if we /could/ define arbitrary infix operators in
D? I'm thinking something along the lines of:

operator cross_product(T, U)
{
static if (T.opCross)
{
T.opCross(T)
}
else static if (U.opCross)
{
U.opCross_r(T);
}
else
{
static assert(false, "Operator not applicable to operands.");
}
}

alias cross_product ×;

I'm not sure if this is possible, but it sure would please downs. :P
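
A rough runtime analogue of the dispatch above, sketched in Python rather than in the proposed (hypothetical) D syntax: try the left operand's method first, then fall back to the right operand's reversed method. The names op_cross and op_cross_r, and the Vec3 and RawTriple classes, are invented for illustration, and the check happens at run time, unlike static if.

```python
def cross_product(a, b):
    """Dispatch like the proposed operator: left operand first,
    then the right operand's reversed method, else an error."""
    if hasattr(a, "op_cross"):
        return a.op_cross(b)
    if hasattr(b, "op_cross_r"):
        return b.op_cross_r(a)
    raise TypeError("Operator not applicable to operands.")


class Vec3:
    """Hypothetical 3-vector that knows how to be the left operand."""
    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z

    def op_cross(self, o):
        return Vec3(self.y * o.z - self.z * o.y,
                    self.z * o.x - self.x * o.z,
                    self.x * o.y - self.y * o.x)

    def as_tuple(self):
        return (self.x, self.y, self.z)


class RawTriple(Vec3):
    """Hypothetical type that only supports being the right operand."""
    op_cross = None  # placeholder; removed below so only op_cross_r remains

    def op_cross_r(self, left):
        x, y, z = left  # the left operand is a plain tuple here
        return Vec3(x, y, z).op_cross(Vec3(self.x, self.y, self.z))


del RawTriple.op_cross  # drop the placeholder attribute
RawTriple.__getattr__ = lambda self, name: (_ for _ in ()).throw(AttributeError(name))

left_first = cross_product(Vec3(1, 0, 0), Vec3(0, 1, 0)).as_tuple()
print(left_first)  # (0, 0, 1)
```

Only the left-operand path is exercised in the usage line; the fallback branch and the error branch follow the same shape as the static if / else chain in the post.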

--
Simen

Oct 26 2008
"Bill Baxter" <wbaxter gmail.com> writes:
On Sun, Oct 26, 2008 at 11:02 PM, Simen Kjaeraas <simen.kjaras gmail.com> wrote:
On Sat, 25 Oct 2008 12:14:47 +0200, Spacen Jasset <spacenjasset yahoo.co.uk>
wrote:

Why unicode anyway? In the same way that editor support is required to
actually type them in, why not let the editor render them. So instead of
symbol 'x' in the source code, say:

m3 = m1 cross_product m2

as an infix notation in a similar way to the (unary) sizeof operator.

While cross_product is a bit long and unwieldy any editor capable can
replace the rendition of that keyword with a symbol for it. But in editors
that don't it means that it still can be typed in and/or displayed easily.

Another option includes providing cross_product as an 'alias' and 'X'
as well.

Which then leads on to the introduction of a facility to add arbitrary
operators, which could be interesting because you can supply any operator
you see fit for the domains that you use that require it. -- This provide
exactly the right solution though as all the additions would be 'non
standard' and I can see books in the future recommending people not use
unicode operators, because editors don't have support for them.

This made me think. What if we /could/ define arbitrary infix operators in
D? I'm thinking something along the lines of:

operator cross_product(T, U)
{
static if (T.opCross)
{
T.opCross(T)
}
else static if (U.opCross)
{
U.opCross_r(T);
}
else
{
static assert(false, "Operator not applicable to operands.");
}
}

alias cross_product ×;

I'm not sure if this is possible, but it sure would please downs. :P

What's the precedence of your user-defined in-fix operator?

--bb

Oct 26 2008
"Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On Sun, 26 Oct 2008 22:28:16 +0100, Bill Baxter <wbaxter gmail.com> wrote:

On Sun, Oct 26, 2008 at 11:02 PM, Simen Kjaeraas
<simen.kjaras gmail.com> wrote:
On Sat, 25 Oct 2008 12:14:47 +0200, Spacen Jasset
<spacenjasset yahoo.co.uk>
wrote:

Why unicode anyway? In the same way that editor support is required to
actually type them in, why not let the editor render them. So instead of
symbol 'x' in the source code, say:

m3 = m1 cross_product m2

as an infix notation in a similar way to the (unary) sizeof operator.

While cross_product is a bit long and unwieldy any editor capable can
replace the rendition of that keyword with a symbol for it. But in editors
that don't it means that it still can be typed in and/or displayed easily.

Another option includes providing cross_product as an 'alias' and 'X'
as well.

Which then leads on to the introduction of a facility to add arbitrary
operators, which could be interesting because you can supply any operator
you see fit for the domains that you use that require it. -- This provide
exactly the right solution though as all the additions would be 'non
standard' and I can see books in the future recommending people not use
unicode operators, because editors don't have support for them.

This made me think. What if we /could/ define arbitrary infix operators in
D? I'm thinking something along the lines of:

operator cross_product(T, U)
{
static if (T.opCross)
{
T.opCross(T)
}
else static if (U.opCross)
{
U.opCross_r(T);
}
else
{
static assert(false, "Operator not applicable to operands.");
}
}

alias cross_product ×;

I'm not sure if this is possible, but it sure would please downs. :P

What's the precedence of your user-defined in-fix operator?

--bb

Yup, I realized this myself as well. Seemed like such a great idea when I
only thought of it for three seconds. :p

--
Simen

Oct 26 2008
"Bill Baxter" <wbaxter gmail.com> writes:
On Mon, Oct 27, 2008 at 8:23 AM, Simen Kjaeraas <simen.kjaras gmail.com> wrote:
On Sun, 26 Oct 2008 22:28:16 +0100, Bill Baxter <wbaxter gmail.com> wrote:

On Sun, Oct 26, 2008 at 11:02 PM, Simen Kjaeraas <simen.kjaras gmail.com>
wrote:
On Sat, 25 Oct 2008 12:14:47 +0200, Spacen Jasset
<spacenjasset yahoo.co.uk>
wrote:

Why unicode anyway? In the same way that editor support is required to
actually type them in, why not let the editor render them. So instead of
symbol 'x' in the source code, say:

m3 = m1 cross_product m2

as an infix notation in a similar way to the (unary) sizeof operator.

While cross_product is a bit long and unwieldy any editor capable can
replace the rendition of that keyword with a symbol for it. But in editors
that don't it means that it still can be typed in and/or displayed easily.

Another option includes providing cross_product as an 'alias' and 'X'
as well.

Which then leads on to the introduction of a facility to add arbitrary
operators, which could be interesting because you can supply any operator
you see fit for the domains that you use that require it. -- This provide
exactly the right solution though as all the additions would be 'non
standard' and I can see books in the future recommending people not use
unicode operators, because editors don't have support for them.

This made me think. What if we /could/ define arbitrary infix operators in
D? I'm thinking something along the lines of:

operator cross_product(T, U)
{
static if (T.opCross)
{
T.opCross(T)
}
else static if (U.opCross)
{
U.opCross_r(T);
}
else
{
static assert(false, "Operator not applicable to operands.");
}
}

alias cross_product ×;

I'm not sure if this is possible, but it sure would please downs. :P

What's the precedence of your user-defined in-fix operator?

--bb

Yup, I realized this myself as well. Seemed like such a great idea when I
only thought of it for three seconds. :p

Same thing goes for downs' in-fix operators.  I think his syntax is
/infix/ which means that his ops always have the same precedence as
division.
I'm guessing this Python Cookbook recipe is very similar to Downs'
technique.  It discusses pros and cons and such.
http://code.activestate.com/recipes/384122/
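
To make the precedence point concrete, here is a minimal Python sketch of the general trick: wrap a two-argument function in an object that overloads an existing operator on both sides. It is written in the spirit of such recipes and is not copied from the recipe or from downs' D code; the Infix class and the cross and add names are made up for illustration.

```python
class Infix:
    """Wrap a two-argument function so it can be written as  a /op/ b."""
    def __init__(self, func):
        self.func = func

    def __rtruediv__(self, left):
        # Handles `left / op`: remember the left operand.
        return Infix(lambda right: self.func(left, right))

    def __truediv__(self, right):
        # Handles `(left / op) / right`: apply the stored function.
        return self.func(right)


cross = Infix(lambda a, b: (a[1] * b[2] - a[2] * b[1],
                            a[2] * b[0] - a[0] * b[2],
                            a[0] * b[1] - a[1] * b[0]))
add = Infix(lambda a, b: a + b)

print((1, 0, 0) /cross/ (0, 1, 0))  # (0, 0, 1)

# The fake operator inherits the precedence of '/', whatever it "means":
print(10 - 2 /add/ 3)  # 5, parsed as 10 - (2 /add/ 3)
```

The last line is the catch: an "addition" spelled with slashes binds as tightly as division, so the result is 5 rather than the 11 a left-to-right reading would suggest.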

--bb

Oct 26 2008
"Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On Mon, 27 Oct 2008 00:41:26 +0100, Bill Baxter <wbaxter gmail.com> wrote:
Same thing goes for downs' in-fix operators.  I think his syntax is
/infix/ which means that his ops always have the same precedence as
division.
I'm guessing this Python Cookbook recipe is very similar to Downs'
technique.  It discusses pros and cons and such.
http://code.activestate.com/recipes/384122/

--bb

An interesting read, though I have looked at downs' code before. It
occurred to me now that this could sorta have been fixed with a
preprocessor: just define an operator to have the same precedence as an
already existing operator, then define an alias that gets replaced with
/foo/, +foo+, or whatever operator you chose.
I guess we're stuck waiting for macros in the meantime.

--
Simen

Oct 26 2008
"Bill Baxter" <wbaxter gmail.com> writes:
On Mon, Oct 27, 2008 at 9:04 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

What's the precedence of your user-defined in-fix operator?

--bb

Yup, I realized this myself as well. Seemed like such a great idea when I
only thought of it for three seconds. :p

An operator could always be defined to have the same precedence as an
existing operator, which it has to specify.

Walter said in a previous post a few days ago when I suggested it that
that would kill D's easy parsability.
You say no?  I'm no parser expert, so hard for me to say.

--bb

Oct 26 2008
"Bill Baxter" <wbaxter gmail.com> writes:
On Mon, Oct 27, 2008 at 11:43 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
Bill Baxter wrote:
On Mon, Oct 27, 2008 at 9:04 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

What's the precedence of your user-defined in-fix operator?

--bb

Yup, I realized this myself as well. Seemed like such a great idea when I
only thought of it for three seconds. :p

An operator could always be defined to have the same precedence as an
existing operator, which it has to specify.

Walter said in a previous post a few days ago when I suggested it that
that would kill D's easy parsability.
You say no?  I'm no parser expert, so hard for me to say.

It can be done, but it's kinda involved. You define a grammar in which all
operators have the same precedence. Consequently you compile any expression
into a list of operands and operators. That makes the language parsable
without semantic info. Then the semantic stage transforms the list into a
tree. Cecil does that.

I see.  So the price you pay is that you defer more decisions till
semantic stage.

I.e. "a b c d e" is allowed to parse into an amorphous list, then in
the semantic pass you decide if 'b' and 'd' are actually legal
operators or not.
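
A small Python sketch of that two-stage idea, under stated assumptions: the "parser" has already produced a flat list alternating operands and operators, and a later pass consults an operator table to reject unknown operators and build the tree by precedence climbing. The table contents, define_operator and build_tree are hypothetical names for illustration; this is not how Cecil or any D compiler actually does it.

```python
# Hypothetical operator table: name -> (precedence, left_associative).
OPERATORS = {"+": (10, True), "*": (20, True)}


def define_operator(name, same_as):
    """Register a user-defined operator with an existing one's precedence."""
    OPERATORS[name] = OPERATORS[same_as]


def build_tree(tokens):
    """Turn a flat [operand, op, operand, ...] list into nested tuples."""
    if len(tokens) % 2 == 0:
        raise SyntaxError("expected operands separated by operators")
    for op in tokens[1::2]:
        if op not in OPERATORS:
            raise SyntaxError(f"{op!r} is not a known operator")

    pos = 0

    def climb(min_prec):
        nonlocal pos
        left = tokens[pos]
        pos += 1
        while pos < len(tokens):
            op = tokens[pos]
            prec, left_assoc = OPERATORS[op]
            if prec < min_prec:
                break
            pos += 1
            right = climb(prec + 1 if left_assoc else prec)
            left = (op, left, right)
        return left

    return climb(0)


define_operator("cross", same_as="*")
print(build_tree(["a", "+", "b", "cross", "c"]))
# ('+', 'a', ('cross', 'b', 'c'))
```

With this split, "a b c d e" gets through the first stage as a list, and it is only build_tree that complains that 'b' is not an operator.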

--bb

Oct 26 2008