digitalmars.D - Array literals' default type

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Consider:

auto c = [ 2.71, 3.14, 6.023e22 ];
c ~= 2.21953167;

Should this work? Currently it doesn't because c's type is deduced as 
double[3].

The literal can initialize either double[3] or double[], so the question 
is only what the default should be when "auto" is used.


Thoughts?

Andrei
Oct 08 2009
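
A minimal sketch of the two initializations under discussion, assuming a present-day D2 compiler, where `auto` deduces a dynamic array from the literal (the thread describes an older compiler that deduced double[3]). The function name `literalPlusOne` is made up for illustration.

```d
// Builds the literal from the post and appends one element to it.
// Assumption: the compiler deduces double[] (a dynamic array) for `auto`.
double[] literalPlusOne()
{
    auto c = [2.71, 3.14, 6.023e22];
    static assert(is(typeof(c) == double[]));
    c ~= 2.21953167; // appending is legal on a dynamic array
    return c;
}

void main()
{
    assert(literalPlusOne().length == 4);

    // The same literal can still initialize a fixed-size array explicitly.
    double[3] fixed = [2.71, 3.14, 6.023e22];
    static assert(is(typeof(fixed) == double[3]));
    // fixed ~= 1.0; // would not compile: a static array cannot grow
}
```

Spelling out the type on the left-hand side is what selects the fixed-size form here; only the `auto` default is in question.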
next sibling parent reply "Denis Koroskin" <2korden gmail.com> writes:
On Thu, 08 Oct 2009 22:07:32 +0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Consider:

 auto c = [ 2.71, 3.14, 6.023e22 ];
 c ~= 2.21953167;

 Should this work? Currently it doesn't because c's type is deduced as  
 double[3].

 The literal can initialize either double[3] or double[], so the question  
 is only what the default should be when "auto" is used.


 Thoughts?

 Andrei

I was just about to bump a similar topic. I strongly believe typeof(c) must be immutable(double)[3].

There are 2 problems in D with the current design:

1) The following code:

    auto c = [ 2.71, 3.14, 6.023e22 ];

always allocates memory from the heap. In many cases, a read-only view of that array would suffice. In case a mutation is needed, no one stops you from dup'ing that array:

    auto c = [ 2.71, 3.14, 6.023e22 ].dup;

2) There is an inconsistency with strings:

    auto c1 = "Hello"; // immutable
    auto c2 = ['H', 'e', 'l', 'l', 'o']; // mutable

I don't like hidden allocations like this being present in D. I believe this is not what most users expect to happen.

Back to your question, I believe it should be fixed-size and immutable. Once again, no one stops you from using "[]":

    auto c = [ 2.71, 3.14, 6.023e22 ][]; // a slice is returned

The opposite is impossible - you can't get a fixed-size array from a dynamic one.

Another question is, what type should .dup return? T[new] or T[]? (The former one, probably, since the latter one is accessible via .dup[] syntax...)

Anyway, what is the state of T[new] in the future of D?
Oct 08 2009
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 08 Oct 2009 15:10:46 -0400, Denis Koroskin <2korden gmail.com>  
wrote:

 On Thu, 08 Oct 2009 22:07:32 +0400, Andrei Alexandrescu  
 <SeeWebsiteForEmail erdani.org> wrote:

 Consider:

 auto c = [ 2.71, 3.14, 6.023e22 ];
 c ~= 2.21953167;

 Should this work? Currently it doesn't because c's type is deduced as  
 double[3].

 The literal can initialize either double[3] or double[], so the  
 question is only what the default should be when "auto" is used.


 Thoughts?

 Andrei

I was just about to bump a similar topic. I strongly believe typeof(c) must be immutable(double)[3].

You're half right. It should be immutable(double)[]. Remember, double[3] is allocated *on the stack*, so there is no point in making it immutable. So the only two logical choices are:

double[3] - The compiler stores a copy of this array somewhere, and initializes the stack variable with the contents each time the literal is used (or generates code to make the array in the function itself).

immutable(double)[] - The compiler stores a copy of this array somewhere in ROM and initializes the stack variable with the immutable pointer to the data.

The second choice is almost certainly more useful than the first, and if you truly don't need it mutable, much better performing.

This then leaves the quandary -- what if you *do* want a static array, but you don't want to have to match the size? I think bearophile's suggestion is a good one:

    double[$] c = ...

Or another option:

    auto[$] c = ...

if you want true type inference from the literal.
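
The double[$] and auto[$] forms are proposed syntax only. As a rough library-side sketch of the same idea, assuming a recent Phobos that provides std.array.staticArray (which postdates this thread), the length can be inferred from the literal. The helper `sumOfThree` is hypothetical.

```d
import std.array : staticArray;

// Sums a fixed-size array whose length was inferred from the literal.
double sumOfThree()
{
    // Assumption: staticArray deduces both element type and length.
    auto c = [2.71, 3.14, 6.023e22].staticArray;
    static assert(is(typeof(c) == double[3]));

    double total = 0;
    foreach (v; c)
        total += v;
    return total;
}

void main()
{
    assert(sumOfThree() > 6.0e22);
}
```

This keeps the size out of the declaration, which is the point of the [$] suggestion, without requiring new syntax.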
 2) There is an inconsistency with strings:

      auto c1 = "Hello"; // immutable
      auto c2 = ['H', 'e', 'l', 'l', 'o']; // mutable

Note that the type of c1 is not immutable(char)[5], it's immutable(char)[] (this is different from D1, where the "Hello" literal would be typed char[5u]). -Steve
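
The difference between the two literal forms can be checked directly. This is a small sketch assuming a present-day D2 compiler; the function name `sameDeducedType` is invented for the example.

```d
// Reports whether a string literal and a char array literal
// deduce to the same type under `auto`.
bool sameDeducedType()
{
    auto c1 = "Hello";
    auto c2 = ['H', 'e', 'l', 'l', 'o'];

    // The string literal is a slice of immutable chars, not char[5].
    static assert(is(typeof(c1) == immutable(char)[]));
    // Assumption: the array literal deduces to a mutable dynamic array.
    static assert(is(typeof(c2) == char[]));

    c2[0] = 'J';    // allowed: elements are mutable
    // c1[0] = 'J'; // would not compile: elements are immutable

    return is(typeof(c1) == typeof(c2));
}

void main()
{
    assert(!sameDeducedType());
}
```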
Oct 08 2009
parent reply grauzone <none example.net> writes:
Steven Schveighoffer wrote:
 immutable(double)[] - The compiler stores a copy of this array somewhere 
 in ROM and initializes the stack variable with the immutable pointer to 
 the data.

And what about

    void foo(int x) {
        auto a = [1, x, 2];
    }

? Should it create an immutable array on the heap? And if the user happens to need a mutable one, he has to dup it? (Causing one unnecessary memory allocation.)

(Wait, is .dup even enough to get a mutable array, or will it return just another immutable one?)

Anyway, returning an array allocated on the stack is unsafe. If someone wants it, he should write:

    int[3] a = [1, x, 2];

Now the only problem is that right now, array initializers only work for static variables. Which is very stupid. Currently, this code creates an array literal, and copies it into the static array a (which is very very stupid), and if you add a "static" in front of the variable a, the thing beyond "=" is interpreted as an array initializer (and x must be const).

Ideally, the line of code above would not cause a heap allocation.
Oct 09 2009
parent reply Don <nospam nospam.com> writes:
Steven Schveighoffer wrote:
 On Fri, 09 Oct 2009 06:11:50 -0400, grauzone <none example.net> wrote:
 
 Steven Schveighoffer wrote:
 immutable(double)[] - The compiler stores a copy of this array 
 somewhere in ROM and initializes the stack variable with the 
 immutable pointer to the data.

And what about

    void foo(int x) {
        auto a = [1, x, 2];
    }

? Should it create an immutable array on the heap? And if the user happens to need a mutable one, he has to dup it? (Causing one unnecessary memory allocation.)

This is an interesting question. This literal obviously cannot be ROM-allocated, so it must be heap allocated. But do we want heap allocated literals forced into being immutable?

I think array literals need to go through a major overhaul. The type of the literal is highly dependent on both how you want to use it, and the values given to it. Similar to how a 1 can be interpreted as a byte, maybe an array literal needs to generate different code depending on how you assign it.

Here's a stab at some rules I'd like to see implemented (when I say assigned to variable, I mean assigned, or cast, or passed as argument, etc):

1. If an array literal is assigned to a variable of type immutable(T)[] or const(T)[]:
   a. If any of the elements of the array are runtime-decided, then the array is allocated on the heap.
   b. Otherwise, the array is set in ROM, and an alias to the array is returned.
2. If an array literal is assigned to a variable of type T[] where T is not immutable or const, it is *always* allocated on the heap.
3. If an array literal is assigned to a variable of type T[], and all of the elements can be either interpreted as type T or implicitly cast to type T, the literal shall be interpreted as if it were written [cast(T)e1, cast(T)e2, ...]
4. If an array literal is assigned to a variable with the auto specifier, the type is immutable(T)[] where T is the most basic type that all the elements can be interpreted as. Then rule 1 is followed.
5. If an array literal is assigned to a variable that is a static array, no heap allocation shall occur.
6. If an array literal is .dup'd, it will be treated as if it were assigned to a T[] where T is mutable. If it's assigned to an auto, then T is the most basic type that all elements can be interpreted as.
7. If an array literal is .idup'd, it will be treated as if it were assigned to an immutable(T)[]. If it's assigned to an auto, then T is the most basic type that all elements can be interpreted as.

I suck at writing rules :) Here are some examples of what I think should happen, and the types interpreted:

    int[] x = [1,2,3]; // type: int[], on heap.
    auto x = [1,2,3]; // type: immutable(int)[], ROM.
    int y = 2;
    auto x = [1,y,3]; // type: immutable(int)[], heap.
    int[] x = [1,y,3]; // type: int[], heap.
    auto x = [1,y,3].dup; // type: int[], heap, only one allocation.
    auto x = [1,2,3].dup; // type: int[], heap, only one allocation.
    auto x = [1,y,3].idup; // type: immutable(int)[], heap, only one allocation.
    auto x = [1,2,3].idup; // type: immutable(int)[], ROM.
    auto x = [1,2.2,3]; // type: immutable(double)[], ROM.
    immutable(double)[] x = [1,2,3]; // type: immutable(double)[], ROM.
    int[3] x = [1,2,3]; // type: int[3u], on stack, no heap allocation.
    auto x = "hello"; // type: immutable(char)[], ROM.
    char[] x = "hello"; // type: char[], on heap.

This is all principle of least surprise. Do people think this is a good idea?
 (Wait, is .dup even enough to get a mutable array, or will it return 
 just another immutable one?)

.dup always returns a mutable array, regardless of the immutability of the source. .idup returns an array of immutable data.
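
The .dup/.idup behaviour described here can be sketched as follows. The helper name `mutableCopy` is an assumption made for the example; .dup and .idup are the built-in array properties.

```d
// Returns a mutable copy of immutable data, with its first element changed.
int[] mutableCopy(immutable(int)[] src)
{
    auto copy = src.dup; // .dup yields mutable elements
    static assert(is(typeof(copy) == int[]));
    copy[0] = 42;        // modifies the copy only
    return copy;
}

void main()
{
    immutable(int)[] a = [1, 2, 3];
    auto m = mutableCopy(a);
    assert(m == [42, 2, 3]);
    assert(a == [1, 2, 3]); // the source is untouched

    auto i = m.idup;     // .idup yields immutable elements
    static assert(is(typeof(i) == immutable(int)[]));
}
```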
 int[3] a = [1, x, 2];

 Ideally, the line of code above would not cause a heap allocation.

I agree. Automatic heap allocations for array literals that don't need to be heap allocated is crappy. -Steve

I don't understand why runtime-determined array literals even exist. They're not literals!!! They cause no end of trouble. IMHO we'd be *much* better off without them.
Oct 09 2009
next sibling parent Max Samukha <spambox d-coding.com> writes:
On Fri, 09 Oct 2009 14:34:31 +0200, Don <nospam nospam.com> wrote:

I don't understand why runtime-determined array literals even exist.
They're not literals!!!
They cause no end of trouble. IMHO we'd be *much* better off without them.

I agree. They are an endless source of confusion. The same applies to struct literals.
Oct 09 2009
prev sibling next sibling parent reply Don <nospam nospam.com> writes:
Steven Schveighoffer wrote:
 On Fri, 09 Oct 2009 08:34:31 -0400, Don <nospam nospam.com> wrote:
 
 I don't understand why runtime-determined array literals even exist.
 They're not literals!!!
 They cause no end of trouble. IMHO we'd be *much* better off without 
 them.

I don't agree. Here is a runtime decided array literal:

    void foo(int a, int b, int c)
    {
        auto x = [a, b, c];
    }

The alternatives are:

 // template function
 
 auto x = createArray(a, b, c);
 
 // mixin?
 
 Although the template function looks nice, it adds bloat.

There's no bloat. You just need a type-safe variadic.

    T[] createArray(T)(T[] args...);

One function per type. That's the best you're ever going to do with run-time construction anyway.

Actually, there's horrific bloat present right now. Look at the code generated when you use an array literal.
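
A minimal sketch of what a body for that signature might look like. The signature comes from the post; the body is an assumption, not something the thread specifies. The copy via .dup is there because the storage behind a typesafe variadic parameter may live on the caller's stack.

```d
// Type-safe variadic: accepts either a T[] or a list of T arguments.
T[] createArray(T)(T[] args...)
{
    // Assumption: copy to the heap so the result outlives the call.
    return args.dup;
}

void main()
{
    int a = 1, b = 2, c = 3;
    auto x = createArray(a, b, c); // T is deduced as int
    static assert(is(typeof(x) == int[]));
    assert(x == [1, 2, 3]);
}
```

One instantiation is generated per element type, which is the "one function per type" cost mentioned in the post.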
 Why shouldn't the compiler just do what you want it to do?  I 
 don't see a  lot of ambiguity in the statement above, "I want an array 
 of a, b, and c together please".  It's obvious what to do, and the code 
 looks clean.

There's ambiguity once you leave the simplest cases. See below.
 
 On top of that, what if a, b, and c are runtime decided, then during 
 development, or with a new compiler, they can now be CTFE decided?  Now 
 you are calling some function when they *could* be in a literal.

This is exactly the problem. They should ALWAYS require CTFE evaluation. E.g.:

    immutable(double)[] tableOfSines = [ sin(0.0), sin(PI/4), sin(PI/2), sin(3*PI/4), sin(1)];

Obviously, these values should be compile-time evaluated. But how does the compiler know that? It can't. Right now, this is done at run-time.

Runtime array creation is a prime candidate for moving from language to libraries.
Oct 09 2009
parent Don <nospam nospam.com> writes:
Steven Schveighoffer wrote:
 On Fri, 09 Oct 2009 09:27:01 -0400, Don <nospam nospam.com> wrote:
 
 Steven Schveighoffer wrote:
 On Fri, 09 Oct 2009 08:34:31 -0400, Don <nospam nospam.com> wrote:

 I don't understand why runtime-determined array literals even exist.
 They're not literals!!!
 They cause no end of trouble. IMHO we'd be *much* better off without 
 them.

    void foo(int a, int b, int c)
    {
        auto x = [a, b, c];
    }

The alternatives are:

 // template function
  auto x = createArray(a, b, c);
  // mixin?
  Although the template function looks nice, it adds bloat.

There's no bloat. You just need a type-safe variadic. T[] createArray(T)(T[] args...); One function per type. That's the best you're ever going to do with run-time construction anyway. Actually, there's horrific bloat present right now. Look at the code generated when you use an array literal.

If you have a function that takes a typesafe variadic array, what is the compiler going to do to pass that data into the function? Push it on the stack, call a function, and then the function is going to do the same thing a literal would do, reading the data off the stack? How is that not worse than an array literal generating code to build an array?

That's exactly what the compiler does right now. It pushes all the values onto the stack, then calls a function to create a literal <g>.
 Not to mention the added symbol bloat.

That's the only kind of bloat the template solution could give you.
 Generated code isn't bloat if it's the minimal work that has to be done 
 to get what you want.

Yes, but at present it is always generating code for the worst case.
 
  On top of that, what if a, b, and c are runtime decided, then during 
 development, or with a new compiler, they can now be CTFE decided?  
 Now you are calling some function when they *could* be in a literal.

This is exactly the problem. They should ALWAYS require CTFE evaluation. E.g.:

    immutable(double)[] tableOfSines = [ sin(0.0), sin(PI/4), sin(PI/2), sin(3*PI/4), sin(1)];

Obviously, these values should be compile-time evaluated. But how does the compiler know that? It can't. Right now, this is done at run-time.

I'm not extremely well-versed in what triggers CTFE, but it seems logical to me that the compiler can determine that it can be evaluated at compile-time, assuming sin is marked as pure (or maybe even if it isn't). What am I missing?

A function can be pure even if it does a huge calculation that takes days. CTFE is only triggered if used in a situation where a compile-time constant is _mandatory_. You have to explicitly ask for CTFE somehow.
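
To illustrate the point about having to ask for CTFE, here is a small sketch. The function `cube` is a hypothetical stand-in for sin, chosen so the example does not depend on which library functions a given compiler can evaluate at compile time.

```d
// A pure function; purity alone does not trigger compile-time evaluation.
int cube(int x) pure
{
    return x * x * x;
}

// Built as an ordinary local: the language does not require CTFE here.
int[] runtimeTable()
{
    return [cube(1), cube(2), cube(3)];
}

// A static immutable initializer must be a compile-time constant,
// so the calls to cube are evaluated by CTFE.
static immutable int[] ctfeTable = [cube(1), cube(2), cube(3)];

// enum likewise demands a constant and therefore forces CTFE.
enum int second = cube(2);
static assert(second == 8);

void main()
{
    assert(runtimeTable() == ctfeTable);
}
```

The same expression appears in both tables; only the context (a local value versus a static or enum initializer) decides when it runs.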
 Runtime array creation is a prime candidate for moving from language 
 to libraries.

It is a solution, but I think the better solution is you just write what you want and the compiler figures out the best move. Whether it's heap allocated or not, created at runtime or not, is an implementation detail I don't think the user needs to worry about.

I think it's really misleading to have an expensive operation masquerading as a free one. Suppose you have a 20000 element array literal, all constants, and then you change one element to 'x+2' where x is a local variable. Suddenly, instead of just getting a pointer to statically-loaded data, you're pushing 20000 things onto the stack!

Creating an array at run-time seems to be a kind of constructor call to me. Using array literal syntax for runtime initialization gives the same problems Andrei discussed in the 'new' thread. E.g., it's not polymorphic.

I suspect that uses of run-time array literals are really rare. My code is full of compile-time array literals, but I've never seen a run-time usage.
Oct 09 2009
prev sibling parent reply Christopher Wright <dhasenan gmail.com> writes:
Don wrote:
 I don't understand why runtime-determined array literals even exist.
 They're not literals!!!
 They cause no end of trouble. IMHO we'd be *much* better off without them.

You don't see the use. I do. I would go on a murderous rampage if that feature were removed from the language.

For example, one thing I recently wrote involved creating a process with a large number of arguments. The invocation looked like:

    exec("description", [procName, arg1, arg2] ~ generatedArgs ~ [arg3, arg4] ~ moreGeneratedArgs);

There were about ten or fifteen lines like that. You'd suggest I rewrite that how?

    char[][] args;
    args ~= procName;
    args ~= arg1;
    args ~= arg2;
    args ~= generatedArgs;
    args ~= arg3;

Just fucking shoot me. Or better yet, whoever removed array literals with non-constant elements from the language.
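
A self-contained sketch of the concatenation pattern from this post. The function `buildArgs` and the flag strings are placeholders invented for illustration; `exec` itself is not shown because its signature is not given in the thread.

```d
// Builds an argument list by splicing run-time array literals
// together with already-built arrays.
string[] buildArgs(string procName, string[] generated, string[] more)
{
    // Each bracketed group is a run-time array literal.
    return [procName, "-a", "-b"] ~ generated ~ ["-c", "-d"] ~ more;
}

void main()
{
    auto args = buildArgs("tool", ["x", "y"], ["z"]);
    assert(args == ["tool", "-a", "-b", "x", "y", "-c", "-d", "z"]);
}
```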
Oct 09 2009
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Christopher Wright wrote:
 Don wrote:
 I don't understand why runtime-determined array literals even exist.
 They're not literals!!!
 They cause no end of trouble. IMHO we'd be *much* better off without 
 them.

You don't see the use. I do. I would go on a murderous rampage if that feature were removed from the language. For example, one thing I recently wrote involved creating a process with a large number of arguments. The invocation looked like: exec("description", [procName, arg1, arg2] ~ generatedArgs ~ [arg3, arg4] ~ moreGeneratedArgs); There were about ten or fifteen lines like that. You'd suggest I rewrite that how? char[][] args; args ~= procName; args ~= arg1; args ~= arg2; args ~= generatedArgs; args ~= arg3; Just fucking shoot me. Or better yet, whoever removed array literals with non-constant elements from the language.

Relax. It's a condition known as literalitis. :o)

Literals only have you write [ a, b, c ] instead of toArray(a, b, c). I wouldn't see it as a big deal one way or another, but the issue is that the former is a one-time decision that pretty much can't be changed, whereas toArray can benefit from the hindsight of experience.

Andrei
Oct 09 2009
parent bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 Relax. It's a condition known as literalitis. :o)

While I guess the opposite illness is literalphobia :-)
 Literals only have you write [ a, b, c ] instead of toArray(a, b, c). I 
 wouldn't see it a big deal one way or another, but the issue is that the 
 former is a one-time decision that pretty much can't be changed, whereas 
 toArray can benefit of the hindsight of experience.

Literals for basic things are a good thing, handy, short, easy to think about, like icons; for things like strings, arrays, associative arrays, sets, bignums, complex numbers, and few more. Such things are common or very common in programs, and simple enough that a good design can be found and used. When you design a language you have to balance the generality with the specificity. Both extrema have disadvantages. Your designs are usually good, but often they risk a little overgeneralisation. Bye, bearophile
Oct 09 2009
prev sibling next sibling parent reply Don <nospam nospam.com> writes:
Christopher Wright wrote:
 Don wrote:
 I don't understand why runtime-determined array literals even exist.
 They're not literals!!!
 They cause no end of trouble. IMHO we'd be *much* better off without 
 them.

You don't see the use. I do. I would go on a murderous rampage if that feature were removed from the language. For example, one thing I recently wrote involved creating a process with a large number of arguments. The invocation looked like: exec("description", [procName, arg1, arg2] ~ generatedArgs ~ [arg3, arg4] ~ moreGeneratedArgs); There were about ten or fifteen lines like that. You'd suggest I rewrite that how? char[][] args; args ~= procName; args ~= arg1; args ~= arg2; args ~= generatedArgs; args ~= arg3;

Of course not. These runtime 'array literals' are just syntax sugar for a constructor call. Really, they are nothing more. At worst, it would be something like:

    exec("description", createArray(procName, arg1, arg2) ~ generatedArgs ~ createArray(arg3, arg4) ~ moreGeneratedArgs);

Depending on what the 'exec' signature is, it could be simpler than that. But that's the absolute worst case.

The language pays a heavy price for that little bit of syntax sugar.
Oct 10 2009
next sibling parent reply Yigal Chripun <yigal100 gmail.com> writes:
On 10/10/2009 10:11, Don wrote:
 Christopher Wright wrote:
 Don wrote:
 I don't understand why runtime-determined array literals even exist.
 They're not literals!!!
 They cause no end of trouble. IMHO we'd be *much* better off without
 them.

You don't see the use. I do. I would go on a murderous rampage if that feature were removed from the language. For example, one thing I recently wrote involved creating a process with a large number of arguments. The invocation looked like: exec("description", [procName, arg1, arg2] ~ generatedArgs ~ [arg3, arg4] ~ moreGeneratedArgs); There were about ten or fifteen lines like that. You'd suggest I rewrite that how? char[][] args; args ~= procName; args ~= arg1; args ~= arg2; args ~= generatedArgs; args ~= arg3;

Of course not. These runtime 'array literals' are just syntax sugar for a constructor call. Really, they are nothing more. At worst, it would be something like: exec("description", createArray(procName, arg1, arg2) ~ generatedArgs ~ createArray(arg3, arg4) ~ moreGeneratedArgs); Depending on what the 'exec' signature is, it could be simpler than that. But that's the absolute worst case. The language pays a heavy price for that little bit of syntax sugar.

You keep calling these literals "constructor calls" and I agree that that's what they are. My question is then why not make them real constructors?

    auto a = new int[](x, y, z);
    auto b = new int[3](x, y, z);
    auto c = new int[]; // empty array
    auto d = new int[3]; // use default ctor: d == [0,0,0]

For arrays of class instances this could be extended to call a constructor for each index, something like:

    // tuples would be very handy here..
    auto e = new Class[2](Tuple!(args1), Tuple!(args2));
Oct 10 2009
next sibling parent reply Yigal Chripun <yigal100 gmail.com> writes:
On 10/10/2009 16:12, Jarrett Billingsley wrote:
 On Sat, Oct 10, 2009 at 7:33 AM, Yigal Chripun<yigal100 gmail.com>  wrote:
 You keep calling these literals "constructor calls' and I agree that that's
 what they are. My question is then why not make them real constructors?

 auto a = new int[](x, y, z);

Teehee, that syntax already has meaning in D. Well, that right there would give you a semantic error, but "new int[][][](x, y, z)" creates a 3-D array with dimensions x, y, and z. That brings up another point. If you *did* use a class-style ctor syntax, how would you list the arguments for a multidimensional array? That is, what would be the equivalent of [[1, 2], [3, 4]]?
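
The existing meaning of that syntax can be sketched as follows, next to the nested literal form asked about. The helper name `makeGrid` is made up for the example.

```d
// `new int[][](rows, cols)` allocates a rows-by-cols jagged array;
// the arguments are dimensions, not element values.
int[][] makeGrid(size_t rows, size_t cols)
{
    return new int[][](rows, cols);
}

void main()
{
    auto grid = makeGrid(2, 3);
    assert(grid.length == 2);
    assert(grid[0].length == 3);
    assert(grid[1][2] == 0); // elements start at int.init

    // The literal form carries the element values directly.
    auto lit = [[1, 2], [3, 4]];
    static assert(is(typeof(lit) == int[][]));
    assert(lit[1][0] == 3);
}
```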

I know about the current meaning, I was suggesting to change it. To answer your question:

a) compile-time literals should remain as is, so your example of [[1, 2], [3, 4]] is still valid.

b) for run-time arrays:

    auto a = new int[][]( new int[](x, y), new int[](z, w) );
    auto a = new int[][]( Tuple!(x, y), Tuple!(z, w) );
    auto a = new int[][]( [1, 2], [3, 4] );

You can construct a regular array with a literal or a tuple:

    int[] a = [1, 2];
    int[] b = new int[](x, y);

int[][] is an array of arrays so the rules apply recursively: both forms can initialize each array in the array of arrays. Tuples of tuples are a shortcut for the second option.

Now, wouldn't it be wonderful if D had provided real tuple support without all the Tuple!() nonsense?
Oct 10 2009
parent reply Don <nospam nospam.com> writes:
language_fan wrote:
 Sat, 10 Oct 2009 17:15:55 +0200, Yigal Chripun thusly wrote:
 
 Now, wouldn't it be wonderful if D had provided real tuple support
 without all the Tuple!() nonsense?

'D has full built-in tuple support' has been the answer each time I've asked. It seems not to be advisable to ask more about this specific feature since the language creators easily get annoyed when asked about this. They see more value in reserving the syntax for the C style sequencing operator which is rarely used. Also they have apparently scientifically proven that the auto-flattening semantics of tuples somehow works better than real product types, and have no intention to make it an explicit controllable operation, which is also easily implementable.

Not so, Andrei has said that he thinks auto-flattening was a bad idea. And AFAIK, Walter doesn't disagree. Andrei and I, and almost everyone else, have tried to persuade Walter to remove the comma operator, but without success. But I doubt you'd be able to use it for tuples, because x, y = foo(); already has meaning in C and tuples would give it a different meaning. I'd LOVE to be proved wrong. It is very difficult to change Walter's mind about many things, but despite what people say, it is not impossible.
Oct 12 2009
next sibling parent reply grauzone <none example.net> writes:
Don wrote:
 Not so, Andrei has said that he thinks auto-flattening was a bad idea. 
 And AFAIK, Walter doesn't disagree.

You should try harder, because if you don't change it soon, it will be there forever due to compatibility requirements.
 Andrei and I, and almost everyone else, have tried to persuade Walter to 
 remove the comma operator, but without success. But I doubt you'd be 
 able to use it for tuples, because   x, y = foo(); already has meaning 
 in C and tuples would give it a different meaning. I'd LOVE to be proved 
 wrong.

Wasn't the comma operator supposed to be important for automatic code generation? Even if it is, you could just pick a different token to implement this operator. It doesn't have to be a comma, does it?
 It is very difficult to change Walter's mind about many things, but 
 despite what people say, it is not impossible.

Oct 12 2009
parent reply Don <nospam nospam.com> writes:
grauzone wrote:
 Don wrote:
 Not so, Andrei has said that he thinks auto-flattening was a bad idea. 
 And AFAIK, Walter doesn't disagree.

You should try harder, because if you don't change it soon, it will be there forever due to compatibility requirements.

The TODO list is very long.
 Andrei and I, and almost everyone else, have tried to persuade Walter 
 to remove the comma operator, but without success. But I doubt you'd 
 be able to use it for tuples, because   x, y = foo(); already has 
 meaning in C and tuples would give it a different meaning. I'd LOVE to 
 be proved wrong.

Wasn't the comma operator supposed to be important for automatic code generation?

It's used frequently in the compiler internals. E.g., given

    int foo(X x = default_value) { return 0; }

then

    foo();

becomes:

    (X tmp = default_value, foo(tmp));

I don't think it sees much use outside of compilers. There are other ways to get the same effect in user code.
 Even if it is, you could just pick a different token to implement this 
 operator. It doesn't have to be a comma, does it?

It doesn't need any syntax at all, when it's inside the compiler. But the problem is that comma has that meaning in C. (OTOH I wonder how much extant C++ code uses the comma operator. I bet there's not much of it. (But more code than uses octal!))

BTW as an asm programmer, I'm completely baffled as to why the concept of 'single return value from a function' became dominant in programming languages. It's a very unnatural restriction from a machine point of view.
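
On the multiple-return-value point, a rough sketch of the library route, assuming Phobos's std.typecons.Tuple as a stand-in for built-in tuples. The function `divmod` is hypothetical.

```d
import std.typecons : Tuple, tuple;

// Returns quotient and remainder together as one library tuple.
Tuple!(int, int) divmod(int a, int b)
{
    return tuple(a / b, a % b);
}

void main()
{
    auto r = divmod(17, 5);
    assert(r[0] == 3); // quotient
    assert(r[1] == 2); // remainder
}
```

The caller still receives a single value and has to index into it, which is the kind of indirection the post is complaining about.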
Oct 12 2009
next sibling parent Ary Borenszweig <ary esperanto.org.ar> writes:
language_fan wrote:
 Mon, 12 Oct 2009 13:04:03 -0400, Jarrett Billingsley thusly wrote:
 
 On Mon, Oct 12, 2009 at 10:47 AM, Don <nospam nospam.com> wrote:

 Wasn't the comma operator to be supposed to be important for automatic
 code generation?

int foo(X x = default_value) { return 0; } then foo(); becomes: (X tmp = default_value, foo(tmp));

that's used internally by the compiler. I mean, we don't have to explicitly mark which brace blocks introduce scopes, but ScopeStatements are alive and well inside the compiler. CommaExp could just become "SequenceExp" or something and it would have the exact same effect. I really don't think there will be a lot of moaning if comma expressions disappeared. And yes, for loop increments can be special-cased, geez..

But it breaks the holy C compatibility. When a C veteran with 40+ years of C development experience under their belt studies D by porting a 1 MLOC library to D 2.0, his code will fail as the precious old comma does not compute sequencing, but instead will produce a nasty compile error. Porting the code in a single go will not be possible anymore and reddit commentators will literally crush D.

What is it with C compatibility? Can't you link C functions and you are done? What's the compulsive need to port everything written in C or C++ to D?
Oct 12 2009
prev sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
Don wrote:
 BTW as an asm programmer, I'm completely baffled as to why the concept 
 of 'single return value from a function' became dominant in programming 
 languages. It's a very unnatural restriction from a machine point of view.

I'd guess it is because of the close relationship between programming and mathematics. -Lars
Oct 12 2009
prev sibling parent reply Yigal Chripun <yigal100 gmail.com> writes:
On 12/10/2009 15:43, Don wrote:
 language_fan wrote:
 Sat, 10 Oct 2009 17:15:55 +0200, Yigal Chripun thusly wrote:

 Now, wouldn't it be wonderful if D had provided real tuple support
 without all the Tuple!() nonsense?

'D has full built-in tuple support' has been the answer each time I've asked. It seems not to be advisable to ask more about this specific feature since the language creators easily get annoyed when asked about this. They see more value in reserving the syntax for the C style sequencing operator which is rarely used. Also they have apparently scientifically proven that the auto-flattening semantics of tuples somehow works better than real product types, and have no intention to make it an explicit controllable operation, which is also easily implementable.

Not so, Andrei has said that he thinks auto-flattening was a bad idea. And AFAIK, Walter doesn't disagree. Andrei and I, and almost everyone else, have tried to persuade Walter to remove the comma operator, but without success. But I doubt you'd be able to use it for tuples, because x, y = foo(); already has meaning in C and tuples would give it a different meaning. I'd LOVE to be proved wrong. It is very difficult to change Walter's mind about many things, but despite what people say, it is not impossible.

what's wrong with enclosing tuples in parentheses?

    (x, y) = foo();

    int foo();
    int bar();

    int a = foo(), bar(); // sequence
    int b, c;
    (b, c) = (foo(), bar()); // tuples
    b, c = foo(), bar(); // sequence
    (b, c) = foo(), bar(); // error assigning int to (int, int)
    b, c = (foo(), bar()); // error assigning (int, int) to int
Oct 12 2009
parent reply Yigal Chripun <yigal100 gmail.com> writes:
On 12/10/2009 23:00, Yigal Chripun wrote:
 On 12/10/2009 15:43, Don wrote:
 language_fan wrote:
 Sat, 10 Oct 2009 17:15:55 +0200, Yigal Chripun thusly wrote:

 Now, wouldn't it be wonderful if D had provided real tuple support
 without all the Tuple!() nonsense?

'D has full built-in tuple support' has been the answer each time I've asked. It seems not to be advisable to ask more about this specific feature since the language creators easily get annoyed when asked about this. They see more value in reserving the syntax for the C style sequencing operator which is rarely used. Also they have apparently scientifically proven that the auto-flattening semantics of tuples somehow works better than real product types, and have no intention to make it an explicit controllable operation, which is also easily implementable.

Not so, Andrei has said that he thinks auto-flattening was a bad idea. And AFAIK, Walter doesn't disagree. Andrei and I, and almost everyone else, have tried to persuade Walter to remove the comma operator, but without success. But I doubt you'd be able to use it for tuples, because x, y = foo(); already has meaning in C and tuples would give it a different meaning. I'd LOVE to be proved wrong. It is very difficult to change Walter's mind about many things, but despite what people say, it is not impossible.

what's wrong with enclosing tuples in parenthesis? (x, y) = foo(); int foo(); int bar(); int a = foo(), bar(); // sequence int b, c; (b, c) = (foo(), bar()); // tuples b, c = foo(), bar(); // sequence (b, c) = foo(), bar(); // error assigning int to (int, int) b, c = (foo(), bar()); // error assigning (int, int) to int

one more note: there's no need to special case the for loop.

    for (int i = 0, j = 0; someCondition; i++, j--) {...}
                                          ~~~^~~~~

the above will continue to work whether it's a tuple or a C sequence.
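
A runnable version of that loop shape, assuming a D compiler that accepts a comma expression in the increment clause. The function `countSteps` and its bounds are invented for the example.

```d
// Counts iterations while i climbs and j falls until they meet.
int countSteps(int lo, int hi)
{
    int steps;
    // `i++, j--` is the comma expression; its value is discarded,
    // so the loop does not care whether it is a sequence or a tuple.
    for (int i = lo, j = hi; i < j; i++, j--)
        ++steps;
    return steps;
}

void main()
{
    assert(countSteps(0, 10) == 5);
}
```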
Oct 12 2009
parent reply Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
Yigal Chripun wrote:
 what's wrong with enclosing tuples in parenthesis?
 (x, y) = foo();


I'm guessing you'd end up with some weird rules

z = foo(); // what is z, a tuple, or y?
auto (a,b) = ((1,2),(3,4)); // eh, what is this?
foo((1,2)); // um, what?

When offered semantic ambiguity, just say no.
 int foo();
 int bar();

 int a = foo(), bar(); // sequence
 int b, c;
 (b, c) = (foo(), bar()); // tuples
 b, c = foo(), bar(); // sequence
 (b, c) = foo(), bar(); // error assigning int to (int, int)
 b, c = (foo(), bar()); // error assigning (int, int) to int

one more note: there's no need to special case the for loop. for (int i = 0, j = 0; someCondition; i++, j--) {...} ~~~^~~~~ the above will continue to work whether it's a tuple or a C sequence.

How about this? OtherTypeIwant foo(out bool flag){...} ... while(guardflag && (x=foo(f), f)){ ... } Just checked, still have something like this in my code.
Oct 12 2009
parent reply Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
Bill Baxter wrote:
 foo((1,2)); // um, what?

You see that kind of thing in Python all the time, with NumPy at least. Array dimensions for example are set with a tuple. So x = array((1,2), dtype=int) And very common to see things like numpy.zeros((10,20)).

This is one place [of many] where you can't say, "Oh, leave off the parens and it's a comma exp". That leaves me in the lurch.
 
 When offered semantic ambiguity, just say no.

You didn't really show any examples of ambiguity that I could see. Some of those examples may have a legal meaning in C, so that's an issue if so.

If you try to add tuple expressions AND keep comma expressions, you either can't recurse from paren exp to comma exp (or any exp, really) or you have semantic ambiguity. If you can't recurse comma exps, they're nothing more than semicolons. Worthless, so you might as well remove them. Or you have to introduce a new syntax. <aside> Actually, making comma exp a little uglier might be an appealing solution. something like CommaExp -> , CommaExp CommaExp -> AsgExp gives you auto a = (,1,2); // a = 2 auto a = (1,2); // a = (1,2) You still have the problem of ( nonCommaExp ), though. And user-friendliness issues. I'm sure others could think of better ideas.
Oct 13 2009
parent Don <nospam nospam.com> writes:
Ellery Newcomer wrote:
 Bill Baxter wrote:
 foo((1,2)); // um, what?

You see that kind of thing in Python all the time, with NumPy at least. Array dimensions for example are set with a tuple. So x = array((1,2), dtype=int) And very common to see things like numpy.zeros((10,20)).

This is one place [of many] where you can't say, "Oh, leave off the parens and it's a comma exp". That leaves me in the lurch.
 When offered semantic ambiguity, just say no.

Some of those examples may have a legal meaning in C, so that's an issue if so.

If you try to add tuple expressions AND keep comma expressions, you either can't recurse from paren exp to comma exp (or any exp, really) or you have semantic ambiguity. If you can't recurse comma exps, they're nothing more than semicolons. Worthless, so you might as well remove them. Or you have to introduce a new syntax. <aside> Actually, making comma exp a little uglier might be an appealing solution. something like CommaExp -> , CommaExp CommaExp -> AsgExp gives you auto a = (,1,2); // a = 2 auto a = (1,2); // a = (1,2) You still have the problem of ( nonCommaExp ), though. And user-friendliness issues. I'm sure others could think of better ideas.

You do NOT need to keep the comma operator. What you do need to do, though, is make sure that any use of , doesn't silently produce different behaviour to what it would do in C.
Oct 14 2009
prev sibling next sibling parent Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Mon, Oct 12, 2009 at 1:21 PM, language_fan <foo bar.com.invalid> wrote:
 Mon, 12 Oct 2009 13:04:03 -0400, Jarrett Billingsley thusly wrote:

 On Mon, Oct 12, 2009 at 10:47 AM, Don <nospam nospam.com> wrote:

 Wasn't the comma operator to be supposed to be important for automatic
 code generation?

It's used frequently in the compiler internals. EG, given int foo(X x = default_value) { return 0; } then foo(); becomes: (X tmp = default_value, foo(tmp));

There doesn't need to be any *syntactic* reservation for something that's used internally by the compiler. I mean, we don't have to explicitly mark which brace blocks introduce scopes, but ScopeStatements are alive and well inside the compiler. CommaExp could just become "SequenceExp" or something and it would have the exact same effect. I really don't think there will be a lot of moaning if comma expressions disappeared. And yes, for loop increments can be special-cased, geez..

But it breaks the holy C compatibility. When a C veteran with 40+ years of C development experience under their belt studies D by porting a 1 MLOC library to D 2.0, his code will fail as the precious old comma does not compute sequencing, but instead will produce a nasty compile error. Porting the code in a single go will not be possible anymore and reddit commentators will literally crush D.

Fuck C.
Oct 12 2009
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 12 Oct 2009 14:27:21 -0400, Jarrett Billingsley  
<jarrett.billingsley gmail.com> wrote:

 On Mon, Oct 12, 2009 at 1:21 PM, language_fan <foo bar.com.invalid>  
 wrote:
 Mon, 12 Oct 2009 13:04:03 -0400, Jarrett Billingsley thusly wrote:

 On Mon, Oct 12, 2009 at 10:47 AM, Don <nospam nospam.com> wrote:

 Wasn't the comma operator to be supposed to be important for  
 automatic
 code generation?

It's used frequently in the compiler internals. EG, given int foo(X x = default_value) { return 0; } then foo(); becomes: (X tmp = default_value, foo(tmp));

There doesn't need to be any *syntactic* reservation for something that's used internally by the compiler. I mean, we don't have to explicitly mark which brace blocks introduce scopes, but ScopeStatements are alive and well inside the compiler. CommaExp could just become "SequenceExp" or something and it would have the exact same effect. I really don't think there will be a lot of moaning if comma expressions disappeared. And yes, for loop increments can be special-cased, geez..

But it breaks the holy C compatibility. When a C veteran with 40+ years of C development experience under their belt studies D by porting a 1 MLOC library to D 2.0, his code will fail as the precious old comma does not compute sequencing, but instead will produce a nasty compile error. Porting the code in a single go will not be possible anymore and reddit commentators will literally crush D.

Fuck C.

Jarrett, I think you lost the sarcasm in his post :) -Steve
Oct 12 2009
prev sibling next sibling parent reply Christopher Wright <dhasenan gmail.com> writes:
Don wrote:
 Christopher Wright wrote:
 Don wrote:
 I don't understand why runtime-determined array literals even exist.
 They're not literals!!!
 They cause no end of trouble. IMHO we'd be *much* better off without 
 them.

You don't see the use. I do. I would go on a murderous rampage if that feature were removed from the language. For example, one thing I recently wrote involved creating a process with a large number of arguments. The invocation looked like: exec("description", [procName, arg1, arg2] ~ generatedArgs ~ [arg3, arg4] ~ moreGeneratedArgs); There were about ten or fifteen lines like that. You'd suggest I rewrite that how? char[][] args; args ~= procName; args ~= arg1; args ~= arg2; args ~= generatedArgs; args ~= arg3;

Of course not. These runtime 'array literals' are just syntax sugar for a constructor call. Really, they are nothing more.

I'm quite surprised that there is a runtime function for this. I would expect codegen to emit something like: array = __d_newarray(nBytes) array[0] = exp0 array[1] = exp1 ...
 At worst, it would be something like:
 
 exec("description", createArray(procName, arg1, arg2) ~ generatedArgs ~ 
 createArray(arg3, arg4) ~ moreGeneratedArgs);

PHP does this. I haven't used PHP enough to hate it.
 Depending on what the 'exec' signature is, it could be simpler than 
 that. But that's the absolute worst case.
 
 The language pays a heavy price for that little bit of syntax sugar.

The price being occasional heap allocation where it's unnecessary? The compiler should be able to detect this in many cases and allocate on the stack instead. Your createArray() suggestion doesn't have that advantage. Or parsing difficulties? It's not an insanely difficult thing to parse, and people writing parsers for D comprise an extremely small segment of your audience. Or just having another construct to know? Except in PHP, you can't use arrays without knowing about the array() function, and in D, you can't easily use arrays without knowing about array literals. So it's the same mental load. You could say array() is more self-documenting, but that's only when you want someone who has no clue what D is to read your code. I think it's reasonable to require people to know what an array literal is. What is the price?
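
For reference, a minimal sketch of the kind of call being defended here, assuming a current D2 compiler. buildArgs and the flag strings are hypothetical placeholders, not the code from the original program:

```d
// Hypothetical helper; the flag strings are placeholders, not from the thread.
string[] buildArgs(string procName, string[] generatedArgs, string[] moreGeneratedArgs)
{
    // Each bracketed piece is an array literal whose elements are only
    // known at run time; ~ concatenates the pieces into a new array.
    return [procName, "--first", "--second"] ~ generatedArgs
        ~ ["--third", "--fourth"] ~ moreGeneratedArgs;
}

void main()
{
    auto args = buildArgs("tool", ["-a", "-b"], ["-c"]);
    assert(args == ["tool", "--first", "--second", "-a", "-b",
                    "--third", "--fourth", "-c"]);
}
```

The whole argument list is built in one expression, which is the readability point made above.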
Oct 10 2009
parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
Christopher Wright wrote:
 Don wrote:
 Christopher Wright wrote:
 Don wrote:
 I don't understand why runtime-determined array literals even exist.
 They're not literals!!!
 They cause no end of trouble. IMHO we'd be *much* better off without 
 them.

You don't see the use. I do. I would go on a murderous rampage if that feature were removed from the language. For example, one thing I recently wrote involved creating a process with a large number of arguments. The invocation looked like: exec("description", [procName, arg1, arg2] ~ generatedArgs ~ [arg3, arg4] ~ moreGeneratedArgs); There were about ten or fifteen lines like that. You'd suggest I rewrite that how? char[][] args; args ~= procName; args ~= arg1; args ~= arg2; args ~= generatedArgs; args ~= arg3;

Of course not. These runtime 'array literals' are just syntax sugar for a constructor call. Really, they are nothing more.

I'm quite surprised that there is a runtime function for this. I would expect codegen to emit something like: array = __d_newarray(nBytes) array[0] = exp0 array[1] = exp1 ...
 At worst, it would be something like:

 exec("description", createArray(procName, arg1, arg2) ~ generatedArgs 
 ~ createArray(arg3, arg4) ~ moreGeneratedArgs);

PHP does this. I haven't used PHP enough to hate it.

I've used PHP a fair bit, and I don't hate its array syntax at all. (There are plenty of other things in PHP to hate, though.) It's easily readable, and not much of a hassle to write. But array() in PHP isn't a function, it's a language construct with special syntax. To create an AA, for instance, you'd write $colours = array("apple" => "red", "pear" => "green"); I'm not sure what the D equivalent of that one should be. -Lars
Oct 10 2009
parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2009-10-10 12:12:27 -0400, "Lars T. Kyllingstad" 
<public kyllingen.NOSPAMnet> said:

 Christopher Wright wrote:
 Don wrote:
 At worst, it would be something like:
 
 exec("description", createArray(procName, arg1, arg2) ~ generatedArgs ~ 
 createArray(arg3, arg4) ~ moreGeneratedArgs);

PHP does this. I haven't used PHP enough to hate it.

I've used PHP a fair bit, and I don't hate its array syntax at all. (There are plenty of other things in PHP to hate, though.) It's easily readable, and not much of a hassle to write. But array() in PHP isn't a function, it's a language construct with special syntax. To create an AA, for instance, you'd write $colours = array("apple" => "red", "pear" => "green"); I'm not sure what the D equivalent of that one should be.

Associative array literals: string[string] s = ["hello": "world", "foo": "bar"]; Note that an "array" in PHP is always a double-linked list indexed by a hash-table. Writing `array(1, 2, 3)` is the same as writing `array(0 => 1, 1 => 2, 2 => 3)`: what gets constructed is identical. That's quite nice as a generic container. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
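
As a small sketch of the D side of that comparison (assuming a current D2 compiler; makeColours is a hypothetical name that reuses the data from the PHP example):

```d
// Hypothetical function wrapping the PHP example's data in a D
// associative array literal.
string[string] makeColours()
{
    return ["apple": "red", "pear": "green"];
}

void main()
{
    auto colours = makeColours();
    assert(colours["apple"] == "red");     // lookup by key
    assert(("pear" in colours) !is null);  // `in` yields a pointer to the value
    assert(colours.length == 2);
}
```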
Oct 10 2009
parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
Michel Fortin wrote:
 On 2009-10-10 12:12:27 -0400, "Lars T. Kyllingstad" 
 <public kyllingen.NOSPAMnet> said:
 
 Christopher Wright wrote:
 Don wrote:
 At worst, it would be something like:

 exec("description", createArray(procName, arg1, arg2) ~ 
 generatedArgs ~ createArray(arg3, arg4) ~ moreGeneratedArgs);

PHP does this. I haven't used PHP enough to hate it.

I've used PHP a fair bit, and I don't hate its array syntax at all. (There are plenty of other things in PHP to hate, though.) It's easily readable, and not much of a hassle to write. But array() in PHP isn't a function, it's a language construct with special syntax. To create an AA, for instance, you'd write $colours = array("apple" => "red", "pear" => "green"); I'm not sure what the D equivalent of that one should be.

Associative array literals: string[string] s = ["hello": "world", "foo": "bar"];

I know that. :) I was just wondering what the equivalent function call should look like if we replaced array literals with functions, cf. the createArray() function above. -Lars
Oct 10 2009
prev sibling next sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Mon, Oct 12, 2009 at 9:42 AM, Phil Deets <pjdeets2 gmail.com> wrote:
 On Mon, 12 Oct 2009 09:47:34 -0500, Don <nospam nospam.com> wrote:

 (OTOH I wonder how much extant C++ code uses the comma operator. I bet
 there's not much of it. (But more code than uses octal!)).


There are quite a few uses out there if you count for-loop clauses and stuff hidden in macros. I think it probably isn't possible to have a, b = foo(); be valid D syntax with the "Don't re-interpret valid C" constraint. But perhaps a slight variation like this could work: (a, b) = foo(); --bb
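
For comparison, a sketch of what returning two values looks like today with the library tuple from std.typecons, assuming a current D2 compiler. The body of foo is hypothetical; the thread never says what it returns:

```d
import std.typecons : Tuple, tuple;

// Hypothetical function returning two values as a library tuple.
Tuple!(int, int) foo()
{
    return tuple(1, 2);
}

void main()
{
    int a, b;
    auto t = foo();
    a = t[0]; // unpacking is done by index, one field at a time
    b = t[1];
    assert(a == 1 && b == 2);
}
```

This is the verbosity that a built-in (a, b) = foo(); form would remove.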
Oct 12 2009
prev sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Mon, Oct 12, 2009 at 5:20 PM, Ellery Newcomer
<ellery-newcomer utulsa.edu> wrote:
 Yigal Chripun wrote:
 what's wrong with enclosing tuples in parenthesis?
 (x, y) = foo();


I'm guessing you'd end up with some weird rules z = foo(); // what is z, a tuple, or y?

What's weird there? z is whatever type it was declared to be.
 auto (a,b) = ((1,2),(3,4)); // eh, what is this?

That's pretty easy to figure out. Why would it be anything other than a=(1,2) b=(3,4) ?
 foo((1,2)); // um, what?

You see that kind of thing in Python all the time, with NumPy at least. Array dimensions for example are set with a tuple. So x = array((1,2), dtype=int) And very common to see things like numpy.zeros((10,20)).
 When offered semantic ambiguity, just say no.

You didn't really show any examples of ambiguity that I could see. Some of those examples may have a legal meaning in C, so that's an issue if so.
 int foo();
 int bar();

 int a = foo(), bar(); // sequence
 int b, c;
 (b, c) = (foo(), bar()); // tuples
 b, c = foo(), bar(); // sequence
 (b, c) = foo(), bar(); // error assigning int to (int, int)
 b, c = (foo(), bar()); // error assigning (int, int) to int

one more note: there's no need to special case the for loop. for (int i = 0, j = 0; someCondition; i++, j--) {...} ~~~^~~~~ the above will continue to work whether it's a tuple or a C sequence.

How about this? OtherTypeIwant foo(out bool flag){...} ... while(guardflag && (x=foo(f), f)){ ... } Just checked, still have something like this in my code.

That actual code would still be ok because I don't think it would be legal to do && on a tuple. Or if so then the result would be a tuple, and then the while would barf. So it wouldn't compile and you'd find the porting error. --bb
Oct 12 2009
prev sibling next sibling parent Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Mon, Oct 12, 2009 at 10:47 AM, Don <nospam nospam.com> wrote:

 Wasn't the comma operator to be supposed to be important for automatic
 code generation?

It's used frequently in the compiler internals. EG, given int foo(X x = default_value) { return 0; } then foo(); becomes: (X tmp = default_value, foo(tmp));

There doesn't need to be any *syntactic* reservation for something that's used internally by the compiler. I mean, we don't have to explicitly mark which brace blocks introduce scopes, but ScopeStatements are alive and well inside the compiler. CommaExp could just become "SequenceExp" or something and it would have the exact same effect. I really don't think there will be a lot of moaning if comma expressions disappeared. And yes, for loop increments can be special-cased, geez..
Oct 12 2009
prev sibling parent language_fan <foo bar.com.invalid> writes:
Mon, 12 Oct 2009 13:04:03 -0400, Jarrett Billingsley thusly wrote:

 On Mon, Oct 12, 2009 at 10:47 AM, Don <nospam nospam.com> wrote:
 
 Wasn't the comma operator to be supposed to be important for automatic
 code generation?

It's used frequently in the compiler internals. EG, given int foo(X x = default_value) { return 0; } then foo(); becomes: (X tmp = default_value, foo(tmp));

There doesn't need to be any *syntactic* reservation for something that's used internally by the compiler. I mean, we don't have to explicitly mark which brace blocks introduce scopes, but ScopeStatements are alive and well inside the compiler. CommaExp could just become "SequenceExp" or something and it would have the exact same effect. I really don't think there will be a lot of moaning if comma expressions disappeared. And yes, for loop increments can be special-cased, geez..

But it breaks the holy C compatibility. When a C veteran with 40+ years of C development experience under their belt studies D by porting a 1 MLOC library to D 2.0, his code will fail as the precious old comma does not compute sequencing, but instead will produce a nasty compile error. Porting the code in a single go will not be possible anymore and reddit commentators will literally crush D.
Oct 12 2009
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Thu, 08 Oct 2009 23:50:31 +0400, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Thu, 08 Oct 2009 15:10:46 -0400, Denis Koroskin <2korden gmail.com>  
 wrote:

 On Thu, 08 Oct 2009 22:07:32 +0400, Andrei Alexandrescu  
 <SeeWebsiteForEmail erdani.org> wrote:

 Consider:

 auto c = [ 2.71, 3.14, 6.023e22 ];
 c ~= 2.21953167;

 Should this work? Currently it doesn't because c's type is deduced as  
 double[3].

 The literal can initialize either double[3] or double[], so the  
 question is only what the default should be when "auto" is used.


 Thoughts?

 Andrei

I was just about to bump a similar topic. I strongly believe typeof(c) must be immutable(double)[3].

You're half right. It should be immutable(double)[]. Remember, double[3] is allocated *on the stack*, so there is no point in making it immutable. So the only two logical choices are: double[3] - The compiler stores a copy of this array somewhere, and initializes the stack variable with the contents each time the literal is used (or generates code to make the array in the function itself). immutable(double)[] - The compiler stores a copy of this array somewhere in ROM and initializes the stack variable with the immutable pointer to the data. The second choice is almost certainly more useful than the first, and if you truly don't need it mutable, much better performing.

Right, that was an oversight. I'm now in favor of immutable(double)[].
 2) There is an inconsistency with strings:

      auto c1 = "Hello"; // immutable
      auto c2 = ['H', 'e', 'l', 'l', 'o']; // mutable

Note that the type of c1 is not immutable(char)[5], it's immutable(char)[] (this is different from D1, where the "Hello" literal would be typed char[5u]). -Steve

I believe the two should be the same (with the tiny difference that c1 is null-terminated under the hood), immutable(char)[] that is.
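
A sketch of how these types can be checked, assuming a current D2 compiler. The results below are what such a compiler deduces, which differs from the double[3] deduction of the 2009 compiler discussed in this thread; appendExample is a hypothetical name:

```d
// Hypothetical function: on a current compiler the literal is a dynamic
// array, so appending compiles.
double[] appendExample()
{
    auto c = [2.71, 3.14, 6.023e22];
    static assert(is(typeof(c) == double[]));
    c ~= 2.21953167;
    return c;
}

void main()
{
    auto c1 = "Hello";
    auto c2 = ['H', 'e', 'l', 'l', 'o'];
    static assert(is(typeof(c1) == immutable(char)[])); // string literal
    static assert(is(typeof(c2) == char[]));            // char array literal
    assert(appendExample().length == 4);
}
```

So the inconsistency between c1 and c2 pointed out above is still visible in the deduced types.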
Oct 08 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 09 Oct 2009 06:11:50 -0400, grauzone <none example.net> wrote:

 Steven Schveighoffer wrote:
 immutable(double)[] - The compiler stores a copy of this array  
 somewhere in ROM and initializes the stack variable with the immutable  
 pointer to the data.

And what about void foo(int x) { auto a = [1, x, 2]; ? Should it create an immutable array on the heap? And if the user happens to need a mutable one, he has to dup it? (Causing one unnecessary memory allocation.)

This is an interesting question. This literal obviously cannot be ROM-allocated, so it must be heap allocated. But do we want heap allocated literals forced into being immutable?

I think array literals need to go through a major overhaul. The type of the literal is highly dependent on both how you want to use it, and the values given to it. Similar to how a 1 can be interpreted as a byte, maybe an array literal needs to generate different code depending on how you assign it.

Here's a stab at some rules I'd like to see implemented (when I say assigned to variable, I mean assigned, or casted, or passed as argument, etc):

1. If an array literal is assigned to a variable of type immutable(T)[] or const(T)[]:
a. If any of the elements of the array are runtime-decided, then the array is allocated on the heap.
b. Otherwise, the array is set in ROM, and an alias to the array is returned.
2. If an array literal is assigned to a variable of type T[] where T is not immutable or const, it is *always* allocated on the heap.
3. If an array literal is assigned to a variable of type T[], and all of the elements can be either interpreted as type T or implicitly casted to type T, the literal shall be interpreted as if it were written [cast(T)e1, cast(T)e2, ...]
4. If an array literal is assigned to a variable with the auto specifier, the type is immutable(T)[] where T is the most basic type that all the elements can be interpreted as. Then rule 1 is followed.
5. If an array literal is assigned to a variable that is a static array, no heap allocation shall occur.
6. If an array literal is .dup'd, it will be treated as if it were assigned to a T[] where T is mutable. If it's assigned to an auto, then T is the most basic type that all elements can be interpreted as.
7. If an array literal is .idup'd, it will be treated as if it were assigned to an immutable(T)[]. If it's assigned to an auto, then T is the most basic type that all elements can be interpreted as.
I suck at writing rules :) Here are some examples of what I think should happen, and the types interpreted:

int[] x = [1,2,3]; // type: int[], on heap.
auto x = [1,2,3]; // type: immutable(int)[], ROM.
int y = 2;
auto x = [1,y,3]; // type: immutable(int)[], heap.
int[] x = [1,y,3]; // type: int[], heap.
auto x = [1,y,3].dup; // type: int[], heap, only one allocation.
auto x = [1,2,3].dup; // type: int[], heap, only one allocation.
auto x = [1,y,3].idup; // type: immutable(int)[], heap, only one allocation.
auto x = [1,2,3].idup; // type: immutable(int)[], ROM.
auto x = [1,2.2,3]; // type: immutable(double)[], ROM.
immutable(double)[] x = [1,2,3]; // type: immutable(double)[], ROM.
int[3] x = [1,2,3]; // type: int[3u], on stack, no heap allocation.
auto x = "hello"; // type: immutable(char)[], ROM.
char[] x = "hello"; // type: char[], on heap.

This is all principle of least surprise. Do people think this is a good idea?
 (Wait, is .dup even enough to get a mutable array, or will it return  
 just another immutable one?)

.dup always returns a mutable array, regardless of the immutability of the source. .idup returns an array of immutable data.
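
A minimal sketch of that behaviour, assuming a current D2 compiler; mutableCopy and immutableCopy are hypothetical names:

```d
// Hypothetical helpers showing the result types of .dup and .idup.
int[] mutableCopy(immutable(int)[] src)
{
    auto copy = src.dup;
    static assert(is(typeof(copy) == int[]));
    return copy;
}

immutable(int)[] immutableCopy(int[] src)
{
    auto copy = src.idup;
    static assert(is(typeof(copy) == immutable(int)[]));
    return copy;
}

void main()
{
    immutable(int)[] ro = [1, 2, 3];
    auto rw = mutableCopy(ro);
    rw[0] = 42;                      // allowed: rw is a separate mutable array
    auto frozen = immutableCopy(rw);
    assert(ro[0] == 1);              // the immutable source is untouched
    assert(frozen == [42, 2, 3]);
}
```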
 int[3] a = [1, x, 2];

 Ideally, the line of code above would not cause a heap allocation.

I agree. Automatic heap allocations for array literals that don't need to be heap allocated is crappy. -Steve
Oct 09 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 09 Oct 2009 08:34:31 -0400, Don <nospam nospam.com> wrote:

 I don't understand why runtime-determined array literals even exist.
 They're not literals!!!
 They cause no end of trouble. IMHO we'd be *much* better off without  
 them.

I don't agree. Here is a runtime decided array literal:

void foo(int a, int b, int c) { auto x = [a, b, c]; }

The alternatives are:

// manual construction and assignment
auto x = new int[3]; // unnecessary initialization to 0!
x[0] = a; x[1] = b; x[2] = c;
// template function
auto x = createArray(a, b, c);
// mixin?

Although the template function looks nice, it adds bloat. The mixin is probably ugly (I don't write them much, so I'm not sure how it would be done). Why shouldn't the compiler just do what you want it to do? I don't see a lot of ambiguity in the statement above, "I want an array of a, b, and c together please". It's obvious what to do, and the code looks clean. On top of that, what if a, b, and c are runtime decided, then during development, or with a new compiler, they can now be CTFE decided? Now you are calling some function when they *could* be in a literal. It's the same thing in my opinion with range detection. It used to be that:

byte b = 1 + 1;

generated an error, now, it works because the compiler realizes that 1+1 doesn't overflow a byte. Why can't we add that same type of smarts into array literals? -Steve
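
The alternatives above can be sketched as compilable code, assuming a current D2 compiler. viaLiteral, viaManual and this particular createArray are hypothetical; the thread only gives createArray's signature, and the .dup body is an assumption:

```d
// Hypothetical type-safe variadic helper. The args slice may refer to
// stack memory, so it is copied before being returned.
T[] createArray(T)(T[] args...)
{
    return args.dup;
}

// The runtime array literal under discussion.
int[] viaLiteral(int a, int b, int c)
{
    return [a, b, c];
}

// Manual construction: elements are zero-initialized, then overwritten.
int[] viaManual(int a, int b, int c)
{
    auto x = new int[3];
    x[0] = a;
    x[1] = b;
    x[2] = c;
    return x;
}

void main()
{
    assert(viaLiteral(1, 2, 3) == viaManual(1, 2, 3));
    assert(viaLiteral(1, 2, 3) == createArray(1, 2, 3));
}
```

All three produce the same array; the disagreement in this thread is only about which spelling the language should favour and what code it generates.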
Oct 09 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 09 Oct 2009 09:27:01 -0400, Don <nospam nospam.com> wrote:

 Steven Schveighoffer wrote:
 On Fri, 09 Oct 2009 08:34:31 -0400, Don <nospam nospam.com> wrote:

 I don't understand why runtime-determined array literals even exist.
 They're not literals!!!
 They cause no end of trouble. IMHO we'd be *much* better off without  
 them.

void foo(int a, int b, int c) { auto x = [a, b, c]; } The alternatives are:

 // template function
  auto x = createArray(a, b, c);
  // mixin?
  Although the template function looks nice, it adds bloat.

There's no bloat. You just need a type-safe variadic. T[] createArray(T)(T[] args...); One function per type. That's the best you're ever going to do with run-time construction anyway. Actually, there's horrific bloat present right now. Look at the code generated when you use an array literal.

If you have a function that takes a typesafe variadic array, what is the compiler going to do to pass that data into the function? Push it on the stack, call a function, and then the function is going to do the same thing a literal would do, reading the data off the stack? How is that not worse than an array literal generating code to build an array? Not to mention the added symbol bloat. Generated code isn't bloat if it's the minimal work that has to be done to get what you want.
  On top of that, what if a, b, and c are runtime decided, then during  
 development, or with a new compiler, they can now be CTFE decided?  Now  
 you are calling some function when they *could* be in a literal.

This is exactly the problem. They should ALWAYS require CTFE evaluation. EG: immutable(double)[] tableOfSines = [ sin(0.0), sin(PI/4), sin(PI/2), sin(3*PI/4), sin(1)]; Obviously, these values should be compile-time evaluated. But how does the compiler know that? It can't. Right now, this is done at run-time.

I'm not extremely well-versed in what triggers CTFE, but it seems logical to me that the compiler can determine that it can be evaluated at compile-time, assuming sin is marked as pure (or maybe even if it isn't). What am I missing?
 Runtime array creation is a prime candidate for moving from language to  
 libraries.

It is a solution, but I think the better solution is you just write what you want and the compiler figures out the best move. Whether it's heap allocated or not, created at runtime or not, is an implementation detail I don't think the user needs to worry about. Come to think of it, the same thing goes for static initializers. What a pain it is to do: int x; static this() { x = foo(); } instead of just int x = foo(); -Steve
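
A sketch of where compile-time evaluation is and is not requested, assuming a current D2 compiler. half is a hypothetical stand-in for sin, chosen so the results are exact:

```d
// Hypothetical pure helper standing in for sin.
double half(double x) pure
{
    return x / 2;
}

// A module-level initializer must be a compile-time constant, so the
// calls to half are evaluated by CTFE.
immutable double[] table = [half(0.0), half(1.0), half(2.0)];

// An enum also forces CTFE.
enum firstEntry = half(4.0);

// Inside a function nothing demands a constant, so this literal is
// built at run time.
double[] runtimeTable()
{
    return [half(0.0), half(1.0), half(2.0)];
}

void main()
{
    static assert(firstEntry == 2.0);
    assert(table == [0.0, 0.5, 1.0]);
    assert(runtimeTable() == table);
}
```

The values are the same either way; what changes is when the work is done.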
Oct 09 2009
prev sibling next sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Fri, Oct 9, 2009 at 8:06 AM, Don <nospam nospam.com> wrote:
 Steven Schveighoffer wrote:
 On Fri, 09 Oct 2009 09:27:01 -0400, Don <nospam nospam.com> wrote:

 Steven Schveighoffer wrote:
 On Fri, 09 Oct 2009 08:34:31 -0400, Don <nospam nospam.com> wrote:

 I don't understand why runtime-determined array literals even exist.
 They're not literals!!!
 They cause no end of trouble. IMHO we'd be *much* better off without
 them.

I don't agree. Here is a runtime decided array literal: void foo(int a, int b, int c) { auto x = [a, b, c]; } The alternatives are:

 // template function
 auto x = createArray(a, b, c);
 // mixin?
 Although the template function looks nice, it adds bloat.

There's no bloat. You just need a type-safe variadic. T[] createArray(T)(T[] args...); One function per type. That's the best you're ever going to do with run-time construction anyway. Actually, there's horrific bloat present right now. Look at the code generated when you use an array literal.

If you have a function that takes a typesafe variadic array, what is the compiler going to do to pass that data into the function? Push it on the stack, call a function, and then the function is going to do the same thing a literal would do, reading the data off the stack? How is that not worse than an array literal generating code to build an array?

That's exactly what the compiler does right now. It pushes all the values onto the stack, then calls a function to create a literal <g>.
 Not to mention the added symbol bloat.

That's the only kind of bloat the template solution could give you.
 Generated code isn't bloat if it's the minimal work that has to be done to get what you want.

Yes, but at present it's always generating code for the worst case.
 On top of that, what if a, b, and c are runtime decided, then during development, or with a new compiler, they can now be CTFE decided? Now you are calling some function when they *could* be in a literal.

This is exactly the problem. They should ALWAYS require CTFE evaluation. EG: immutable(double)[] tableOfSines = [ sin(0.0), sin(PI/4), sin(PI/2), sin(3*PI/4), sin(1)]; Obviously, these values should be compile-time evaluated. But how does the compiler know that? It can't. Right now, this is done at run-time.

I'm not extremely well-versed in what triggers CTFE, but it seems logical to me that the compiler can determine that it can be evaluated at compile-time, assuming sin is marked as pure (or maybe even if it isn't). What am I missing?

A function can be pure even if it does a huge calculation that takes days. CTFE is only triggered if used in a situation where a compile-time constant is _mandatory_. You have to explicitly ask for CTFE somehow.

 Runtime array creation is a prime candidate for moving from language to
 libraries.

It is a solution, but I think the better solution is you just write what you want and the compiler figures out the best move. Whether it's heap allocated or not, created at runtime or not, is an implementation detail I don't think the user needs to worry about.

I think it's really misleading to have an expensive operation masquerading as a free one. Suppose you have a 20000 element array literal, all constants, and then you change one element to 'x+2' where x is a local variable. Suddenly, instead of just getting a pointer to statically-loaded data, you're pushing 20000 things onto the stack!

 Creating an array at run-time seems to be a kind of constructor call to me.
 Using array literal syntax for runtime initialization gives the same
 problems Andrei discussed in the 'new' thread. Eg, it's not polymorphic.

 I suspect that uses of run-time array literals are really rare. My code is full of compile-time array literals, but I've never seen a run-time usage.

The only place I can think of that I use them is when I'm trying to make some set of actions on several different things loop-able. Like

float[] a = [x-2, x*2, y];
foreach(i,v; a) {
    // do something
}

Not a very compelling example, but sometimes you have a small set of computed values and you need to do the same thing to all of them, or access them by number. (So you can use a[i] and a[(i+1)%a.length] for example).

I wouldn't mind having to call some sort of constructor in those cases.

--bb
Oct 09 2009
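
A small runnable version of the loop-able use case from the post above. The function name pairSums and the sample inputs are made up for illustration; only the literal [x-2, x*2, y] and the a[(i+1)%a.length] indexing come from the post.

```d
// Hypothetical example: add each computed value to its cyclic successor.
float[] pairSums(float x, float y)
{
    // Array literal whose elements are only known at run time.
    float[] a = [x - 2, x * 2, y];

    float[] sums;
    foreach (i, v; a)
        sums ~= v + a[(i + 1) % a.length]; // wraps around at the end
    return sums;
}

void main()
{
    // a is [1, 6, 5], so the neighbour sums are 7, 11 and 6.
    assert(pairSums(3, 5) == [7.0f, 11.0f, 6.0f]);
}
```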
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 09 Oct 2009 11:06:18 -0400, Don <nospam nospam.com> wrote:

 Steven Schveighoffer wrote:
 On Fri, 09 Oct 2009 09:27:01 -0400, Don <nospam nospam.com> wrote:

 Steven Schveighoffer wrote:
 On Fri, 09 Oct 2009 08:34:31 -0400, Don <nospam nospam.com> wrote:

 I don't understand why runtime-determined array literals even exist.
 They're not literals!!!
 They cause no end of trouble. IMHO we'd be *much* better off without  
 them.

void foo(int a, int b, int c)
{
    auto x = [a, b, c];
}

The alternatives are:

 // template function
  auto x = createArray(a, b, c);
  // mixin?
  Although the template function looks nice, it adds bloat.

There's no bloat. You just need a type-safe variadic.

T[] createArray(T)(T[] args...);

One function per type. That's the best you're ever going to do with run-time construction anyway. Actually, there's horrific bloat present right now. Look at the code generated when you use an array literal.
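
For what it's worth, the type-safe variadic mentioned above might look roughly like this. createArray is the hypothetical name used in the thread, not an existing library function, and the .dup is an assumption about how such a helper would hand back storage that outlives the call.

```d
// Hypothetical library helper: builds a T[] from its arguments.
// One instantiation is generated per element type T.
T[] createArray(T)(T[] args...)
{
    // args may refer to temporary storage owned by the caller,
    // so copy it to the heap before returning it.
    return args.dup;
}

void main()
{
    int a = 1, b = 2, c = 3;

    auto x = createArray(a, b, c); // T is deduced as int
    static assert(is(typeof(x) == int[]));
    assert(x == [1, 2, 3]);

    x ~= 4; // the result is an ordinary dynamic array
    assert(x.length == 4);
}
```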

the compiler going to do to pass that data into the function? Push it on the stack, call a function, and then the function is going to do the same thing a literal would do, reading the data off the stack? How is that not worse than an array literal generating code to build an array?

That's exactly what the compiler does right now. It pushes all the values onto the stack, then calls a function to create a literal <g>.

Actually, that makes sense :) I stand corrected. The only case it doesn't make sense is for static arrays.
 Generated code isn't bloat if it's the minimal work that has to be done  
 to get what you want.

Yes, but at present it is always generating code for the worst case.

I'm not arguing for the present situation (always heap allocate mutable data).
 On top of that, what if a, b, and c are runtime decided, then during
 development, or with a new compiler, they can now be CTFE decided?
 Now you are calling some function when they *could* be in a literal.

This is exactly the problem. They should ALWAYS require CTFE evaluation. EG:

immutable(double)[] tableOfSines = [ sin(0.0), sin(PI/4), sin(PI/2), sin(3*PI/4), sin(1)];

Obviously, these values should be compile-time evaluated. But how does the compiler know that? It can't. Right now, this is done at run-time.

I'm not extremely well-versed in what triggers CTFE, but it seems logical to me that the compiler can determine that it can be evaluated at compile-time, assuming sin is marked as pure (or maybe even if it isn't). What am I missing?

A function can be pure even if it does a huge calculation that takes days. CTFE is only triggered if used in a situation where a compile-time constant is _mandatory_. You have to explicitly ask for CTFE somehow.
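
To make the "you have to ask for CTFE" point concrete, here is a rough sketch. Sample and sample() are hypothetical names introduced only to record whether a call ran during compilation; std.math.sin and the __ctfe flag are the only real library and language pieces used.

```d
import std.math : PI, sin;

// Hypothetical record: the computed value plus a flag saying whether
// it was produced during compilation.
struct Sample
{
    double value;
    bool atCompileTime;
}

Sample sample(double x) pure
{
    // __ctfe is true only while the function is being run by CTFE.
    return Sample(sin(x), __ctfe);
}

// A static immutable initializer must be a compile-time constant,
// so the compiler is forced to evaluate sample() through CTFE here.
static immutable Sample[] tableCT = [sample(0.0), sample(PI / 4), sample(PI / 2)];

void main()
{
    // The same expression as a local initializer: nothing demands a
    // constant, so the calls are made at run time.
    immutable Sample[] tableRT = [sample(0.0), sample(PI / 4), sample(PI / 2)];

    assert(tableCT[1].atCompileTime);
    assert(!tableRT[1].atCompileTime);
}
```

The only difference between the two tables is the context of the initializer, which is the "explicitly ask" part.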

Yes, you are right. I didn't think about that. It seems like whether a function call should be evaluated via CTFE is really an orthogonal problem (one which probably needs solving also).
 Runtime array creation is a prime candidate for moving from language  
 to libraries.

It is a solution, but I think the better solution is you just write what you want and the compiler figures out the best move. Whether it's heap allocated or not, created at runtime or not, is an implementation detail I don't think the user needs to worry about.

I think it's really misleading to have an expensive operation masquerading as a free one. Suppose you have a 20000 element array literal, all constants, and then you change one element to 'x+2' where x is a local variable. Suddenly, instead of just getting a pointer to statically-loaded data, you're pushing 20000 things onto the stack!

I'm not sure we should cater to preventing cases like this. I've never seen anything like this in real code.
 Creating an array at run-time seems to be a kind of constructor call to  
 me. Using array literal syntax for runtime initialization gives the same  
 problems Andrei discussed in the 'new' thread. Eg, it's not polymorphic.

Arrays aren't polymorphic. I don't think that applies. However, the allocation argument is valid. For example:

int[] x = new int[3];
x[] = [a,b,c];

Should this construct a heap allocated array, then copy it to x? What a waste. I guess that's another special case that would need to be handled by the compiler, but the rules are getting quite complicated.
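
One way to sidestep the question, sketched under the assumption that the element count is known at compile time, is to go through a fixed-size temporary. assign3 is a made-up name, and whether a given compiler also avoids the allocation for x[] = [a,b,c] written directly is not something this sketch claims.

```d
// Hypothetical helper: copies three run-time values into an existing
// slice by way of a fixed-size temporary that lives on the stack.
void assign3(int[] dst, int a, int b, int c)
{
    int[3] tmp = [a, b, c];
    dst[] = tmp[]; // element-wise copy; dst.length must be 3
}

void main()
{
    int[] x = new int[3];
    assign3(x, 10, 20, 30);
    assert(x == [10, 20, 30]);
}
```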
 I suspect that uses of run-time array literals are really rare. My code  
 is full of compile-time array literals, but I've never seen a run-time  
 usage.

I think you are right that they are not as common, but I wouldn't call them really rare. Building an array via the same syntax whether you are using literals or variables seems like the most natural thing to me. If we can't do that, a library call is probably the next best solution. But I think a library solution will be very difficult to implement, especially working around possible type issues (a unique type modifier would be helpful here).

-Steve
Oct 09 2009
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Sat, 10 Oct 2009 02:28:42 +0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Christopher Wright wrote:
 Don wrote:
 I don't understand why runtime-determined array literals even exist.
 They're not literals!!!
 They cause no end of trouble. IMHO we'd be *much* better off without  
 them.

feature were removed from the language. For example, one thing I recently wrote involved creating a process with a large number of arguments. The invocation looked like:

exec("description", [procName, arg1, arg2] ~ generatedArgs ~ [arg3, arg4] ~ moreGeneratedArgs);

There were about ten or fifteen lines like that. You'd suggest I rewrite that how?

char[][] args;
args ~= procName;
args ~= arg1;
args ~= arg2;
args ~= generatedArgs;
args ~= arg3;

Just fucking shoot me. Or better yet, whoever removed array literals with non-constant elements from the language.

Relax. It's a condition known as literalitis. :o) Literals only have you write [ a, b, c ] instead of toArray(a, b, c). I wouldn't see it a big deal one way or another, but the issue is that the former is a one-time decision that pretty much can't be changed, whereas toArray can benefit from the hindsight of experience.

Andrei

I believe it's okay, but the compiler should be able to return static arrays from a function (I'll call it "array", but feel free to substitute any other function name):

foreach (i; [a, b, c]) {    --->    foreach (i; array(a, b, c)) {
    // ...                              // ...
}                                   }

int[3] x = [a, b, c];       --->    int[3] x = array(a, b, c);

The first example could work with any range returned, but can you initialize a static array with a range? Well, some language feature could allow that and translate the latter case into something like this:

int[3] x = void;
foreach (index, value; array(a, b, c)) {
    x[index] = value;
}

But then, what would you do with an array overflow? Disallow it? I.e., make it impossible to create a static array and initialize it at the same time?

If range design is essential in D, perhaps the following could be allowed:

int[] someArray = ...;
someArray[] = someRange(); // initializes an array with elements of a range. Throws an exception if boundaries don't match

And then,

int[3] x = array(a, b, c);

could be translated into

int[3] x = void;
x[] = array(a, b, c);

(unless array doesn't meet range criteria, of course).

You lose compile-time boundaries checking, but that's the best you can do if you drop the [a, b, c] feature, I'm afraid.
Oct 09 2009
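
A sketch of how a library function can hand back a static array while keeping the length in the type. The helper below is hypothetical; it reuses the name "array" from the post but takes a bracketed list rather than loose arguments, which is a simplification made here and not part of the original proposal.

```d
// Hypothetical helper: takes a fixed-size array by value and returns it,
// so the length N stays part of the type and is deduced from the call.
T[N] array(T, size_t N)(T[N] values)
{
    return values;
}

void main()
{
    int a = 1, b = 2, c = 3;

    auto x = array([a, b, c]);
    static assert(is(typeof(x) == int[3])); // a static array, not int[]
    assert(x == [1, 2, 3]);

    int total;
    foreach (v; x)
        total += v;
    assert(total == 6);

    // A length mismatch is still rejected at compile time.
    static assert(!__traits(compiles, { int[2] y = array([a, b, c]); }));
}
```

With this shape the bounds check stays at compile time, at the cost of writing the brackets at the call site.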
prev sibling next sibling parent Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Sat, Oct 10, 2009 at 7:33 AM, Yigal Chripun <yigal100 gmail.com> wrote:
 You keep calling these literals "constructor calls' and I agree that that's
 what they are. My question is then why not make them real constructors?

 auto a = new int[](x, y, z);

Teehee, that syntax already has meaning in D. Well, that right there would give you a semantic error, but "new int[][][](x, y, z)" creates a 3-D array with dimensions x, y, and z.

That brings up another point. If you *did* use a class-style ctor syntax, how would you list the arguments for a multidimensional array? That is, what would be the equivalent of [[1, 2], [3, 4]]?
Oct 10 2009
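
The existing meaning of that syntax can be checked with a few asserts. makeCube and the variable names are arbitrary choices for this sketch; the behaviour shown is what one would expect from a D2 compiler.

```d
// Hypothetical wrapper: the arguments after the type are dimensions,
// not element values.
int[][][] makeCube(size_t x, size_t y, size_t z)
{
    return new int[][][](x, y, z);
}

void main()
{
    auto cube = makeCube(2, 3, 4);
    assert(cube.length == 2);
    assert(cube[0].length == 3);
    assert(cube[0][0].length == 4);
    assert(cube[1][2][3] == 0); // elements start out as int.init

    // A one-dimensional type has room for exactly one length argument.
    size_t x = 2, y = 3, z = 4;
    static assert(!__traits(compiles, new int[](x, y, z)));

    // The nested literal from the post, for comparison.
    auto m = [[1, 2], [3, 4]];
    static assert(is(typeof(m) == int[][]));
    assert(m[1][0] == 3);
}
```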
prev sibling next sibling parent language_fan <foo bar.com.invalid> writes:
Sat, 10 Oct 2009 17:15:55 +0200, Yigal Chripun thusly wrote:

 Now, wouldn't it be wonderful if D had provided real tuple support
 without all the Tuple!() nonsense?

'D has full built-in tuple support' has been the answer each time I've asked. It seems not to be advisable to ask more about this specific feature since the language creators easily get annoyed when asked about this. They see more value in reserving the syntax for the C style sequencing operator which is rarely used. Also they have apparently scientifically proven that the auto-flattening semantics of tuples somehow works better than real product types, and have no intention to make it an explicit controllable operation, which is also easily implementable.
Oct 12 2009
prev sibling parent "Phil Deets" <pjdeets2 gmail.com> writes:
On Mon, 12 Oct 2009 09:47:34 -0500, Don <nospam nospam.com> wrote:

 (OTOH I wonder how much extant C++ code uses the comma operator. I bet
 there's not much of it. (But more code than uses octal!)).

Boost Assign uses the comma operator. http://www.boost.org/doc/libs/1_40_0/libs/assign/doc/index.html#operator+=

However, the C++ code

vector<int> values;
values += 1,2,3,4,5,6,7,8,9;

could be translated to

int[] values;
values ~= [1,2,3,4,5,6,7,8,9];

in D so no comma operator would be needed there.
Oct 12 2009
prev sibling parent Don <nospam nospam.com> writes:
Andrei Alexandrescu wrote:
 Consider:
 
 auto c = [ 2.71, 3.14, 6.023e22 ];
 c ~= 2.21953167;
 
 Should this work? Currently it doesn't because c's type is deduced as 
 double[3].
 
 The literal can initialize either double[3] or double[], so the question 
 is only what the default should be when "auto" is used.
 
 
 Thoughts?
 
 Andrei

I agree with the others who say immutable(double)[].

Also, I don't understand why type inference is determined only by the first member. When patching your ICE bug 3374 just now I was struck by how simple it would be to determine the tightest type in the array.

auto c = [0, 2.5, 5, 7.5, 10]; --> should be immutable(double)[].

I guess Walter was just worried about how complicated the rules would become? There's clearly no implementation difficulty.
Oct 09 2009
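
For readers trying this with a present-day D2 compiler (an assumption; the thread itself describes 2009 behaviour), the following sketch checks what type the literal is given. It only shows that the element type is taken from all members rather than from the first one.

```d
void main()
{
    // The first element is an int, but the literal as a whole is typed
    // from the common type of all its elements.
    auto c = [0, 2.5, 5, 7.5, 10];
    static assert(is(typeof(c) == double[]));

    // The append from the opening post.
    c ~= 2.21953167;
    assert(c.length == 6);
    assert(c[0] == 0.0);
}
```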