digitalmars.D - [:] as empty associative array literal, plus warning for null

reply "bearophile" <bearophileHUGS lycos.com> writes:
Some time ago I opened an enhancement request for empty 
associative array literals:

http://d.puremagic.com/issues/show_bug.cgi?id=7227

This is currently correct code:

void foo(int[int] bb) {}
void main() {
     int[int] aa;
     aa = null;
     foo(null);
     int[int][] aas = [null, null];
}


With the proposal it becomes:

void foo(int[int] bb) {}
void main() {
     int[int] aa;
     aa = [:];
     foo([:]);
     int[int][] aas = [[:], [:]];
}


I am writing about it here because Henning Pohl has written a 
first version of a patch:

https://github.com/D-Programming-Language/dmd/pull/2284

- - - - - - - - - - - - -

The patch by Henning Pohl uses the [] syntax as literal for empty 
associative array, but I prefer the [:] syntax, because it's more 
precise.

Kenji Hara shows the ambiguity of the [] syntax (an ambiguity already 
present when using "null" as a literal), which [:] lacks:

void foo(int[]);
void foo(int[int]);
foo([]);   // prefer int[] overload, or ambiguous?

- - - - - - - - - - - - -

On GitHub yebblies commented:

 Ideally this would not be the same as K[V] aa = null;,
 it would behave like K[V] aa = new K[V]; - an AA would be 
 allocated.

I think this is a bad idea, because then the semantics of D code changes if you use [:] instead of null.

D associative arrays have problems:

void test(int[int] arraya, int x) {
     arraya[x] = x;
}
void main() {
     int[int] d;
     test(d, 0);
     int[int] d0;
     assert(d == d0); // d is empty, 0:0 is lost
     d[1] = 1;
     test(d, 2);
     assert(d == [1: 1, 2: 2]); // now 2:2 is not lost
}


Compared to the output of this Python code:

def test(arraya, x):
     arraya[x] = x
def main():
     d = {}
     test(d, 0)
     assert d == {0: 0}
     d[1] = 1
     test(d, 2)
     assert d == {0: 0, 1: 1, 2: 2}
main()


Such problems should be faced in other ways. Making the associative array literal semantics even more complex is not helping.

This problem is present for dynamic arrays too, and I asked to fix it:

http://d.puremagic.com/issues/show_bug.cgi?id=5788

- - - - - - - - - - - - -

With the introduction of the new empty associative array literal it's a good idea to warn against the usage of the older "null" literal:

void foo(int[int]) {}
void main() {
     foo(null);
     int[int][] aas = [null];
     aas[0] = [1: 2, 2: 3];
}


Should give the warnings:

test.d(3): Warning: explicit [:] empty associative array literal is better than null, that will be deprecated
test.d(4): Warning: explicit [:] empty associative array literal is better than null, that will be deprecated

The wording of such warning message is modelled on another warning message:

test.d(3): Warning: explicit element-wise assignment (a)[] = 2 is better than a = 2

For Jonathan M Davis: this is a new warning, but later it's supposed to become a deprecation message, and then an error. So this is not meant to be a permanent warning.

A long time ago I also proposed to deprecate "null" as literal for dynamic arrays. This goes well with the idea that dynamic arrays are not pointers (the * syntax was disallowed for dynamic arrays, etc), and it is meant to go with the optimization requested in issue 5788:

http://d.puremagic.com/issues/show_bug.cgi?id=3889

Bye,
bearophile
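One way to face the lost-insertion problem shown above without touching literal semantics is to pass the associative array by ref. The following is only a minimal sketch; the names insertByValue and insertByRef are made up for illustration and do not come from the thread or from any library.

```d
// Sketch: a null AA passed by value loses its first insertion; `ref` avoids that.
// insertByValue and insertByRef are illustrative names only.

void insertByValue(int[int] aa, int x) {
    aa[x] = x; // if aa is null here, the new table is visible only to the callee
}

void insertByRef(ref int[int] aa, int x) {
    aa[x] = x; // the caller's variable is updated, even when it started out null
}

void main() {
    int[int] a;
    insertByValue(a, 0);
    assert(a.length == 0);      // 0:0 is lost, as in the example above

    int[int] b;
    insertByRef(b, 0);
    assert(b == [0: 0]);        // 0:0 is kept

    insertByValue(b, 1);        // b is already allocated, so this is visible
    assert(b == [0: 0, 1: 1]);
}
```

The trade-off is that every function that may perform the first insertion has to take the array by ref, which is easy to forget.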
Jul 03 2013
next sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 3 July 2013 at 14:33:02 UTC, bearophile wrote:
 Time ago I have opened an enhancement request for empty 
 associative array literals:

Why not simply [] ? After all, an array is just a very simple type of hashmap, with the identity as hash function.
Jul 03 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/3/13 10:31 AM, Dicebot wrote:
 On Wednesday, 3 July 2013 at 16:55:59 UTC, bearophile wrote:
 ...

Has sounded convincing enough for me. Anything that enforces stronger typing is big win in my opinion.

typeof(null) has quite a few interesting properties. It's the closest type to the bottom of all types (we don't have an actual bottom type), and it subtypes many other types as mentioned. Introducing yet another type works against that nice uniformity and is yet another arbitrary little thing that people who learn the language would need to know about. Andrei
Jul 03 2013
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 07/04/2013 02:28 AM, Andrei Alexandrescu wrote:
 On 7/3/13 10:31 AM, Dicebot wrote:
 On Wednesday, 3 July 2013 at 16:55:59 UTC, bearophile wrote:
 ...

Has sounded convincing enough for me. Anything that enforces stronger typing is big win in my opinion.

typeof(null) has quite a few interesting properties. It's the closest type to the bottom of all types (we don't have an actual bottom type), and it subtypes many other types as mentioned. Introducing yet another type works against that nice uniformity

What is that nice uniformity?
 and is yet another arbitrary little thing that people who learn the language
 would need to know about.
...

A lot less arbitrary than having to initialize AAs into an empty state by adding and removing a mapping. Also, I think [] should have a singleton type as well. Currently it is a void[] with special implicit conversion rules.
Jul 03 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/3/13 5:52 PM, Timon Gehr wrote:
 On 07/04/2013 02:28 AM, Andrei Alexandrescu wrote:
 On 7/3/13 10:31 AM, Dicebot wrote:
 On Wednesday, 3 July 2013 at 16:55:59 UTC, bearophile wrote:
 ...

Has sounded convincing enough for me. Anything that enforces stronger typing is big win in my opinion.

typeof(null) has quite a few interesting properties. It's the closest type to the bottom of all types (we don't have an actual bottom type), and it subtypes many other types as mentioned. Introducing yet another type works against that nice uniformity

What is that nice uniformity?

That it "is the closest type to the bottom of all types" and "subtypes many other types as mentioned" :o).
 and is yet another arbitrary little thing that people who learn the
 language
 would need to know about.
 ...

A lot less arbitrary than having to initialize AAs into an empty state by adding and removing a mapping.

Could be a function call. I find it unnecessary to just add new notation for every single little thing.
 Also, I think [] should have a singleton type as well. Currently it is a
 void[] with special implicit conversion rules.

Agreed. Andrei
Jul 03 2013
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/4/13 3:00 AM, Dicebot wrote:
 I am afraid I don't really follow here. How introducing new array
 literal (and making old one stronger typed) compromises typeof(null)
 features?

If it has a different type, there's unneeded secession. If it has the same type, it's arguably useless as one could always write (T[K]).init in the rare cases that need disambiguation. Andrei
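As a rough illustration of the .init spelling mentioned above, the sketch below picks an overload with a typed empty value. The function name which is hypothetical and exists only for this example.

```d
// Sketch: choosing an overload with a typed empty value instead of a new literal.
// `which` is a hypothetical helper used only for this illustration.

string which(int[] a)     { return "dynamic array"; }
string which(int[int] aa) { return "associative array"; }

void main() {
    // A bare null would match both overloads; a typed .init value does not.
    assert(which((int[]).init) == "dynamic array");
    assert(which((int[int]).init) == "associative array");

    // Both values are still null, so nothing is allocated.
    assert((int[]).init is null);
    assert((int[int]).init is null);
}
```

This works today, at the price of spelling out the full type at every call site.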
Jul 04 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
deadalnix:

 Why not simply [] ? After all, an array is just a very simple 
 type of hashmap, with the identity as hash function.

In a language it's very often handy for different data types to have different textual representations and different literals.

If in D code you see:

foo0(null);

That foo0 can be a function that takes a null pointer, an empty dynamic array, an empty string, an empty associative array (or even a Nullable).

Using [] as empty associative array literal is only a bit better than using null:

foo1([]);

Now foo1 can take an empty dynamic array, an empty string, or an empty associative array.

Instead if you see:

foo2("");
foo3([]);
foo4([:]);

Now you know what you are giving to that function, it's more readable because more explicit.

Currently if you have code like this:

void foo(int[]) {}
void foo(int[int]) {}
void main() {
     foo(null);
}


The compiler gives an error because of the ambiguity:

test.d(4): Error: called with argument types:
     (typeof(null))
matches both:
     temp.d(1): test.foo(int[] _param_0)
and:
     temp.d(2): test.foo(int[int] _param_0)

If you introduce [] as empty associative array literal, the code will give similar errors, the situation is not improved much compared to null, this is what Kenji was saying.

But if you introduce a literal for associative arrays it's better to design it right. [:] avoids the ambiguity, and this code gives no errors:

void foo(int[]) {}
void foo(int[int]) {}
void main() {
     foo([]);
     foo([:]);
}


[:] takes just one more char compared to [], so it doesn't cause too much typing (and it's shorter than null). And I think D programmers are able to infer what [:] means, I think it's not hard to learn.

Bye,
bearophile
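The ambiguity described above can be probed at compile time without breaking the build. This is a small sketch, assuming the two overloads from the post; nullCallCompiles is a name invented here, and the program prints the compiler's answer instead of hard-coding it.

```d
import std.stdio : writeln;

void foo(int[] a) {}
void foo(int[int] aa) {}

// Evaluated at compile time: does a call with an untyped null resolve?
// nullCallCompiles is an illustrative name, not part of any library.
enum nullCallCompiles = __traits(compiles, foo(null));

void main() {
    writeln("foo(null) compiles: ", nullCallCompiles);

    // Typed arguments select one overload each, so these calls resolve.
    foo(cast(int[]) null);
    foo(cast(int[int]) null);
}
```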
Jul 03 2013
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
On Wednesday, 3 July 2013 at 16:55:59 UTC, bearophile wrote:
 ...

Has sounded convincing enough for me. Anything that enforces stronger typing is big win in my opinion.
Jul 03 2013
prev sibling next sibling parent reply "TommiT" <tommitissari hotmail.com> writes:
On Wednesday, 3 July 2013 at 14:33:02 UTC, bearophile wrote:
 [..]

I think it's a good idea to have [:] literal for associative arrays since there's [] literal for dynamic arrays. But I would expect [] to mean "empty dynamic array", not null. And the same for [:].

This is how I'd expect things to work:

int[] nullArray;
assert(nullArray is null);

int[] emptyArray = [];
assert(emptyArray !is null);

int[string] nullAA;
assert(nullAA is null);

int[string] emptyAA = [:];
assert(emptyAA !is null);

Reasoning by extrapolation:

int[] arr = [1, 2, 3]; // Array of 3 ints
int[] arr = [1, 2];    // Array of 2 ints
int[] arr = [1];       // Array of 1 ints
int[] arr = [];        // Array of 0 ints (not null)
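For comparison with these expectations, here is a small sketch of how null and empty dynamic arrays can be told apart today. Only the runtime-slice cases are asserted; the result for the [] literal is printed, since the thread reports it as null for DMD but the spec is described as loose on this point.

```d
import std.stdio : writeln;

void main() {
    int[] nullArray;                   // default-initialized: null pointer, length 0
    assert(nullArray is null);
    assert(nullArray.length == 0);

    int[] source = [1, 2, 3];
    int[] emptySlice = source[0 .. 0]; // empty, but still points into source
    assert(emptySlice.length == 0);
    assert(emptySlice !is null);       // identity also compares the pointer
    assert(emptySlice == null);        // equality compares only the elements

    int[] literal = [];
    // Implementation-dependent according to the thread, so printed, not asserted.
    writeln("[] is null: ", literal is null);
}
```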
Jul 03 2013
next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 07/03/2013 07:53 PM, TommiT wrote:
 On Wednesday, 3 July 2013 at 14:33:02 UTC, bearophile wrote:
 [..]

I think it's a good idea to have [:] literal for associative arrays since there's [] literal for dynamic arrays. But I would expect [] to mean "empty dynamic array", not null. And the same for [:].

This is how I'd expect things to work:

int[] nullArray;
assert(nullArray is null);

int[] emptyArray = [];
assert(emptyArray !is null);

int[string] nullAA;
assert(nullAA is null);

int[string] emptyAA = [:];
assert(emptyAA !is null);

Reasoning by extrapolation:

int[] arr = [1, 2, 3]; // Array of 3 ints
int[] arr = [1, 2];    // Array of 2 ints
int[] arr = [1];       // Array of 1 ints
int[] arr = [];        // Array of 0 ints (not null)

+1.
Jul 03 2013
prev sibling next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 07/03/2013 08:43 PM, TommiT wrote:
 On Wednesday, 3 July 2013 at 18:10:42 UTC, bearophile wrote:
 TommiT:
 But I would expect [] to mean "empty dynamic array", not null. [..]

This means you have to keep "null" in the language to represent an empty associative array, because someone somewhere will surely want a literal that avoids memory allocations. So lot of people will keep using null, and the coding situation is improved about zero.

Okay, I didn't realize an empty array would need to allocate.

Not necessarily if you allow the runtime to share the pointer for all empty arrays.
Jul 03 2013
prev sibling next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 07/03/2013 08:10 PM, bearophile wrote:
 ...

 This means you have to keep "null" in the language to represent an empty
 associative array, because someone somewhere will surely want a literal
 that avoids memory allocations. So lot of people will keep using null,
 and the coding situation is improved about zero.
 ...

It would be improved.

Excerpt from my code:

private static Variant[VarDecl] parseContext(void[] mem,FunctionDef fd){
     // ...
     Variant[VarDecl] r; // (stupid built-in AAs)
     assert(fd !is null);
     r[cast(VarDecl)cast(void*)fd]=Variant(null);
     r.remove(cast(VarDecl)cast(void*)fd);
     // ...
}

That code using an allocating [:] literal:

private static Variant[VarDecl] parseContext(void[] mem, FunctionDef fd){
     // ...
     Variant[VarDecl] r = [:];
     // ...
}

Is there currently a better way to initialize AAs?
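Short of a new literal, the insert-then-remove trick can at least be hidden behind a function. This is a minimal sketch, not an established idiom: makeEmptyAA and fill are hypothetical names, and the helper assumes that K.init is a usable key (fine for value types such as int, not for class references like the VarDecl keys above).

```d
// Sketch: wrapping the insert-then-remove trick in a function.
// makeEmptyAA is a hypothetical name; it assumes K.init is a usable key.
V[K] makeEmptyAA(K, V)() {
    V[K] aa;
    aa[K.init] = V.init; // forces the runtime to allocate the table
    aa.remove(K.init);   // leaves it allocated but empty
    return aa;
}

// Takes the AA by value on purpose, to show that insertions are now shared.
void fill(int[int] aa) { aa[1] = 10; }

void main() {
    auto aa = makeEmptyAA!(int, int)();
    assert(aa.length == 0);
    assert(aa !is null);

    fill(aa);              // passed by value, yet the insertion is visible
    assert(aa == [1: 10]);
}
```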
Jul 03 2013
prev sibling next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 07/04/2013 11:25 AM, Regan Heath wrote:
 ...

 Tommi's suggestion that all empty arrays share a common pointer is a
 good one and more or less solves the "it has to allocate memory" complaint.
 ...

(That was actually my suggestion. :o) ) Sharing a common pointer has some drawbacks as well, because it still makes [] somewhat special. It is probably worth it though.
Jul 04 2013
prev sibling next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 07/04/2013 01:50 PM, Steven Schveighoffer wrote:
 On Thu, 04 Jul 2013 05:25:30 -0400, Regan Heath <regan netmail.co.nz>
 wrote:

 On Wed, 03 Jul 2013 19:10:40 +0100, bearophile
 <bearophileHUGS lycos.com> wrote:
 Telling apart the literal for an empty array from the literal of a
 empty but not null array is a bad idea that muds the language. And
 thankfully this currently fails:

 void main() {
      int[] emptyArray = [];
      assert(emptyArray !is null);
 }

As this comes up often you're probably aware that there are people (like myself) who find the distinction between a null (non-existent) array and an empty array useful.

Nobody questions that. The biggest problem is making if(arr) mean if(arr.ptr) instead of if(arr.length)

if(arr.ptr) is what it means now.
 What [] returns should not be an allocation.  And returning null is a
 reasonable implementation of that.

 -Steve

static __gshared void[1] x; return x[0..0];
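The one-liner above can be expanded into a complete function. The sketch below is an assumption-laden variant: it uses a ubyte anchor instead of void[1], and the names sharedEmpty and anchor are invented here. It also shows that appending to such a slice reallocates, because the anchor is not GC-managed memory.

```d
// Sketch of a non-null, non-allocating empty slice that shares one anchor.
// sharedEmpty and anchor are hypothetical names.
T[] sharedEmpty(T)() @trusted {
    static __gshared ubyte[1] anchor;     // one static byte per instantiation, never read or written
    return (cast(T*) anchor.ptr)[0 .. 0]; // zero-length slice at a non-null address
}

void main() {
    auto a = sharedEmpty!int();
    auto b = sharedEmpty!int();
    assert(a.length == 0);
    assert(a !is null);
    assert(a.ptr is b.ptr);   // every call returns the same pointer

    // Appending cannot grow into the static anchor, so the runtime reallocates.
    auto before = a.ptr;
    a ~= 42;
    assert(a == [42]);
    assert(a.ptr !is before);
}
```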
Jul 04 2013
prev sibling next sibling parent Artur Skawina <art.08.09 gmail.com> writes:
On 07/04/13 14:41, Timon Gehr wrote:
 On 07/04/2013 11:25 AM, Regan Heath wrote:
 ...

 Tommi's suggestion that all empty arrays share a common pointer is a
 good one and more or less solves the "it has to allocate memory" complaint.
 ...

(That was actually my suggestion. :o) )

What's the point? [Mutation will cause reallocation; (void*)0 isn't special] artur
Jul 04 2013
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/4/13 6:32 AM, Steven Schveighoffer wrote:
 I would not be opposed to a pull request that made [] be non-null, as
 long as it doesn't allocate.

What would be the benefits? Andrei
Jul 04 2013
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 07/04/2013 04:35 PM, Andrei Alexandrescu wrote:
 On 7/4/13 6:32 AM, Steven Schveighoffer wrote:
 I would not be opposed to a pull request that made [] be non-null, as
 long as it doesn't allocate.

What would be the benefits? Andrei

- Additional sentinel values at basically no cost.

- No accidental flawed relying on empty array is null or empty array !is null. (i.e. less nondeterminism.)

- One thing less to discuss (this has come up before.)
Jul 04 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/4/13 8:27 AM, Timon Gehr wrote:
 On 07/04/2013 04:35 PM, Andrei Alexandrescu wrote:
 On 7/4/13 6:32 AM, Steven Schveighoffer wrote:
 I would not be opposed to a pull request that made [] be non-null, as
 long as it doesn't allocate.

What would be the benefits? Andrei

- Additional sentinel values at basically no cost.

OK, an "extra null". These can occasionally be useful.
 - No accidental flawed relying on empty array is null or empty array !is
 null.
 (i.e. less nondeterminism.)

But null arrays stay, so this doesn't help with that. It may actually add confusion.
 - One thing less to discuss (this has come up before.)

That would be true if e.g. the null array disappeared. As such, this adds yet another type to the discussion. Andrei
Jul 04 2013
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 07/04/2013 05:51 PM, Andrei Alexandrescu wrote:
 On 7/4/13 8:27 AM, Timon Gehr wrote:
 On 07/04/2013 04:35 PM, Andrei Alexandrescu wrote:
 On 7/4/13 6:32 AM, Steven Schveighoffer wrote:
 I would not be opposed to a pull request that made [] be non-null, as
 long as it doesn't allocate.

What would be the benefits? Andrei

- Additional sentinel values at basically no cost.

OK, an "extra null". These can occasionally be useful.
 - No accidental flawed relying on empty array is null or empty array !is
 null.
 (i.e. less nondeterminism.)

But null arrays stay, so this doesn't help with that. It may actually add confusion.

It is more likely that 'if(array !is null) { }' is valid under the stronger semantics than under the old ones. This removes a potentially bug-prone construct, since it is easy to fall into the trap of expecting that different syntax denotes a different construct.
 - One thing less to discuss (this has come up before.)

That would be true if e.g. the null array disappeared. As such, this adds yet another type to the discussion. ...

What I mean is that there will be no reason to discuss the following atrocity any further:

void main(){ // (Compiles and runs with DMD)
     assert([1].ptr[0..0] !is null);
     assert([1][0..0] is null);
     int i=0;
     assert([1][i..i] !is null);
     assert([] is null);
     auto x=[1];
     assert(x[0..0] !is null);
     auto y=[1][i..i];
}

It's quite obvious to me what the rules are that govern whether or not an empty array is null, but why bother? It's simply not useful. Doing it always the same way makes a lot more sense than what's demonstrated above. And then, going for non-null empty arrays makes most sense to support an efficient implementation of slicing.

Furthermore let's look at it from a syntactic viewpoint.

Currently we have:

- empty arrays of the form cast(T[])null
- other empty arrays which are null, eg. []
- empty arrays which are not null, eg. [1][i..i]

Afterwards we have:

- empty arrays of the form cast(T[])null
- empty arrays which are not null, eg. [] or [1][i..i]

IMO that reduces cognitive load, even if not as much as simply getting rid of cast(T[])null.
Jul 04 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/4/13 9:51 AM, Timon Gehr wrote:
 On 07/04/2013 05:51 PM, Andrei Alexandrescu wrote:
 On 7/4/13 8:27 AM, Timon Gehr wrote:
 On 07/04/2013 04:35 PM, Andrei Alexandrescu wrote:
 On 7/4/13 6:32 AM, Steven Schveighoffer wrote:
 I would not be opposed to a pull request that made [] be non-null, as
 long as it doesn't allocate.

What would be the benefits? Andrei

- Additional sentinel values at basically no cost.

OK, an "extra null". These can occasionally be useful.
 - No accidental flawed relying on empty array is null or empty array !is
 null.
 (i.e. less nondeterminism.)

But null arrays stay, so this doesn't help with that. It may actually add confusion.

It is more likely that 'if(array !is null) { }' is valid under the stronger semantics than under the old ones. This removes a potentially bug-prone construct, since it is easy to fall into the trap of expecting that different syntax denotes a different construct.

Why? Maybe someone returned [] thinking it will be a null array. I don't see how adding an obscure "whoa, this is an empty array, but surprisingly, it's not null!" marks an increase in clarity.
 - One thing less to discuss (this has come up before.)

That would be true if e.g. the null array disappeared. As such, this adds yet another type to the discussion. ...

What I mean is that there will be no reason to discuss the following atrocity any further:

void main(){ // (Compiles and runs with DMD)
     assert([1].ptr[0..0] !is null);
     assert([1][0..0] is null);
     int i=0;
     assert([1][i..i] !is null);
     assert([] is null);
     auto x=[1];
     assert(x[0..0] !is null);
     auto y=[1][i..i];
}

Making [] non-null changes exactly one line there if I understand things correctly, and does not improve anything much.
 Furthermore let's look at it from a syntactic viewpoint.

 Currently we have:

 - empty arrays of the form cast(T[])null
 - other empty arrays which are null, eg. []
 - empty arrays which are not null, eg. [1][i..i]

 Afterwards we have:
 - empty arrays of the form cast(T[])null
 - empty arrays which are not null, eg. [] or [1][i..i]

 IMO that reduces cognitive load, even if not as much as simply getting
 rid of cast(T[])null.

The way I see it is:

- we now have two literals for null arrays
- we also have the ability to create non-null but empty arrays through natural reduction and slicing of arrays

The new setup is:

- we'd have one literal for null arrays
- we'd still have the ability to create non-null but empty arrays through natural reduction and slicing of arrays
- we'd have Nosferatu in the form of []

Andrei
Jul 04 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/4/13 10:58 AM, TommiT wrote:
 On Thursday, 4 July 2013 at 17:32:29 UTC, Andrei Alexandrescu wrote:
 [..] Maybe someone returned [] thinking it will be a null array. [..]

I wouldn't think that [] is null, and I suspect neither would very many other newcomers to the language. To me, the only problem with [] being null is that it doesn't look like null. It looks like an empty array.

A null array _is_ an empty array.
 So, the problem is that [] is not what you'd intuitively expect it to be.

Is the intuition that it's an empty array that's not part of anything, presumably obtained by forging 1 into a pointer? Didn't think so.
 By the way, this must be a bug, right?

 template arr(X_...)
 {
 int[] arr = [X_]; // [1]
 }

 void main()
 {
 auto a2 = arr!(1, 2);
 auto a1 = arr!(1);
 auto a0 = arr!(); // [2]
 }

 [1] Error: initializer must be an expression, not '()'
 [2] Error: template instance main.arr!() error instantiating

 ...because if that's not supposed to work, then I don't see much point
 in having the [] literal in the language.

I also happen to think we don't quite need []. Andrei
Jul 04 2013
next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 07/04/2013 08:03 PM, Andrei Alexandrescu wrote:
 On 7/4/13 10:58 AM, TommiT wrote:
 On Thursday, 4 July 2013 at 17:32:29 UTC, Andrei Alexandrescu wrote:
 [..] Maybe someone returned [] thinking it will be a null array. [..]

I wouldn't think that [] is null, and I suspect neither would very many other newcomers to the language. To me, the only problem with [] being null is that it doesn't look like null. It looks like an empty array.

A null array _is_ an empty array.

Whether null is [] is the subject of this discussion. There are multiple possibilities, such as:

1. null !is [], but [] is []
2. null !is [], and not even [] is []
3. !is(typeof(null is [])) // also, !is(typeof(null == []))
4. null is []

1 is the most useful, 2 is the most wasteful, 3 is the sanest, and 4 is what DMD/druntime implement. The spec is loose enough to allow an implementation to behave like 1, 2 or 4 or in even other ways.
 So, the problem is that [] is not what you'd intuitively expect it to be.

Is the intuition that it's an empty array that's not part of anything, presumably obtained by forging 1 into a pointer? Didn't think so.

Why do multiple persons come up with this suggestion independently?
 By the way, this must be a bug, right?

 template arr(X_...)
 {
 int[] arr = [X_]; // [1]
 }

 void main()
 {
     auto a2 = arr!(1, 2);
     auto a1 = arr!(1);
     auto a0 = arr!(); // [2]
 }

 [1] Error: initializer must be an expression, not '()'
 [2] Error: template instance main.arr!() error instantiating

 ...because if that's not supposed to work, then I don't see much point
 in having the [] literal in the language.

I also happen to think we don't quite need [].

[1,2,3] [1,2] null [1] [] Find Waldo.
Jul 04 2013
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/4/13 11:38 AM, bearophile wrote:
 Andrei Alexandrescu:

 I also happen to think we don't quite need [].

It's the opposite, we don't need null to represent empty dynamic arrays. Hopefully Rust designers will avoid such point of views, they generally prefer a less sloppy design and a stronger typing.

Where does the whole "stronger typing" come in? This is poppycock. We need real arguments here.

Andrei
Jul 04 2013
prev sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 07/04/2013 07:32 PM, Andrei Alexandrescu wrote:
 On 7/4/13 9:51 AM, Timon Gehr wrote:
 On 07/04/2013 05:51 PM, Andrei Alexandrescu wrote:
 On 7/4/13 8:27 AM, Timon Gehr wrote:
 On 07/04/2013 04:35 PM, Andrei Alexandrescu wrote:
 On 7/4/13 6:32 AM, Steven Schveighoffer wrote:
 I would not be opposed to a pull request that made [] be non-null, as
 long as it doesn't allocate.

What would be the benefits? Andrei

- Additional sentinel values at basically no cost.

OK, an "extra null". These can occasionally be useful.
 - No accidental flawed relying on empty array is null or empty array
 !is
 null.
 (i.e. less nondeterminism.)

But null arrays stay, so this doesn't help with that. It may actually add confusion.

It is more likely that 'if(array !is null) { }' is valid under the stronger semantics than under the old ones. This removes a potentially bug-prone construct, since it is easy to fall into the trap of expecting that different syntax denotes a different construct.

Why? Maybe someone returned [] thinking it will be a null array.

Maybe someone returned [] thinking it won't be null.

int[] foo(int[] x){
     if(a) return [1,2,3]; // not null
     if(b) return [1,2]; // not null
     if(c) return [1]; // not null
     if(e) return []; // _
     if(x !is null) return x[0..0]; // not null
     return null; // null
}

void bar(){
     // ...
     auto y=foo(x);
     // ...
}
 I don't see how adding an obscure "whoa, this is an empty array, but
 surprisingly, it's not null!" marks an increase in clarity.

Please justify "obscure".
 - One  thing less to discuss (this has come up before.)

That would be true if e.g. the null array disappeared. As such, this adds yet another type to the discussion. ...

What I mean is that there will be no reason to discuss the following atrocity any further:

void main(){ // (Compiles and runs with DMD)
     assert([1].ptr[0..0] !is null);
     assert([1][0..0] is null);
     int i=0;
     assert([1][i..i] !is null);
     assert([] is null);
     auto x=[1];
     assert(x[0..0] !is null);
     auto y=[1][i..i];
}

Making [] non-null changes exactly one line there if I understand things correctly, and does not improve anything much.

You don't understand things correctly. This would be the new behaviour:

void main(){
     assert([1].ptr[0..0] !is null);
     assert([1][0..0] !is null);
     int i=0;
     assert([1][i..i] !is null);
     assert([] !is null);
     auto x=[1];
     assert(x[0..0] !is null);
     auto y=[1][i..i];
}

Notice the 'nice uniformity' there.
 Furthermore let's look at it from a syntactic viewpoint.

 Currently we have:

 - empty arrays of the form cast(T[])null
 - other empty arrays which are null, eg. []
 - empty arrays which are not null, eg. [1][i..i]

 Afterwards we have:
 - empty arrays of the form cast(T[])null
 - empty arrays which are not null, eg. [] or [1][i..i]

 IMO that reduces cognitive load, even if not as much as simply getting
 rid of cast(T[])null.

The way I see it is:

- we now have two literals for null arrays

Which two? I only see one bulb.
 - we also have the ability to create non-null but empty arrays through
 natural reduction and slicing of arrays

 The new setup is:

 - we'd have one literal for null arrays
 - we'd still have the ability to create non-null but empty arrays
 through natural reduction and slicing of arrays
 - we'd have Nosferatu in the form of []
 ...

Yah, those are IMO fairly unfair classifications, but I really don't want to get into this kind of discussion.
Jul 04 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/4/13 8:37 AM, monarch_dodra wrote:
 On Thursday, 4 July 2013 at 15:27:17 UTC, Timon Gehr wrote:
 On 07/04/2013 04:35 PM, Andrei Alexandrescu wrote:
 On 7/4/13 6:32 AM, Steven Schveighoffer wrote:
 I would not be opposed to a pull request that made [] be non-null, as
 long as it doesn't allocate.

What would be the benefits? Andrei

- Additional sentinel values at basically no cost.

- No accidental flawed relying on empty array is null or empty array !is null. (i.e. less nondeterminism.)

- One thing less to discuss (this has come up before.)

There are no benefits to making "[]" return null either. Implementation wise, instead of returning a void[] with "ptr == 0x0" and "length == 0", it could just as well return a void[] with "ptr == 0x1" and "length == 0". You'd get better behavior at no extra cost.

I'm clear on the no extra cost part, but confused about the benefits. Andrei
Jul 04 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/4/13 8:53 AM, Andrei Alexandrescu wrote:
 On 7/4/13 8:37 AM, monarch_dodra wrote:
 On Thursday, 4 July 2013 at 15:27:17 UTC, Timon Gehr wrote:
 On 07/04/2013 04:35 PM, Andrei Alexandrescu wrote:
 On 7/4/13 6:32 AM, Steven Schveighoffer wrote:
 I would not be opposed to a pull request that made [] be non-null, as
 long as it doesn't allocate.

What would be the benefits? Andrei

- Additional sentinel values at basically no cost.

- No accidental flawed relying on empty array is null or empty array !is null. (i.e. less nondeterminism.)

- One thing less to discuss (this has come up before.)

There are no benefits to making "[]" return null either. Implementation wise, instead of returning a void[] with "ptr == 0x0" and "length == 0", it could just as well return a void[] with "ptr == 0x1" and "length == 0". You'd get better behavior at no extra cost.

I'm clear on the no extra cost part, but confused about the benefits.

BTW you get the no extra cost part with a sheer function:

auto emptyArray(T)() @trusted {
     return (cast(T*) 1)[0 .. 0];
}

Andrei
Jul 04 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/4/13 8:02 AM, Regan Heath wrote:
 On Thu, 04 Jul 2013 15:35:30 +0100, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 7/4/13 6:32 AM, Steven Schveighoffer wrote:
 I would not be opposed to a pull request that made [] be non-null, as
 long as it doesn't allocate.

What would be the benefits?

Being able to naturally specify a non-null empty array (literal) such that...

char[] n = null;
char[] e = [];
assert(n is null);
assert(e !is null);

And what would be the benefit of that? Andrei
Jul 04 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/4/13 9:51 AM, Jonathan M Davis wrote:
 On Thursday, July 04, 2013 08:41:47 Andrei Alexandrescu wrote:
 On 7/4/13 8:02 AM, Regan Heath wrote:
 On Thu, 04 Jul 2013 15:35:30 +0100, Andrei Alexandrescu

 <SeeWebsiteForEmail erdani.org>  wrote:
 On 7/4/13 6:32 AM, Steven Schveighoffer wrote:
 I would not be opposed to a pull request that made [] be non-null, as
 long as it doesn't allocate.

What would be the benefits?

Being able to naturally specify a non-null empty array (literal) such that...

char[] n = null;
char[] e = [];
assert(n is null);
assert(e !is null);

And what would be the benefit of that?

Making the distinction between null and empty cleaner.

I don't see where that derives from at all.
 There are plenty of
 cases in CS in general where distinguishing between null and empty is useful,
 but unfortunately, the way we've gone about implementing null and empty with
 arrays in D tends to blur to the point that it's kind of iffy to use null as
 something distinct from empty.

What code should do is use null on the creation side and .empty on the testing side. Making [] non-null does not change that.
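A minimal sketch of that convention, assuming Phobos' std.range.primitives is available; the function evens is a hypothetical example, not code from the thread.

```d
import std.range.primitives : empty;

// Hypothetical helper: returns null when there is nothing to report.
int[] evens(int[] input) {
    int[] result = null;              // creation side: start from null
    foreach (x; input)
        if (x % 2 == 0) result ~= x;
    return result;
}

void main() {
    // Testing side: .empty does not care whether the pointer is null.
    assert(evens([1, 3, 5]).empty);
    assert(!evens([1, 2, 3]).empty);

    int[] data = [1, 2, 3];
    assert(data[0 .. 0].empty);       // a non-null empty slice tests the same way
    assert(data[0 .. 0] !is null);
}
```

Code written this way behaves the same whichever representation an empty array happens to have.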
 You can do it for simple stuff if you're careful
 (like explicitly returning null from a function), but it's very easy to end up
 with an array that's empty when you want null or vice versa. One prime case of
 this is [] vs null. [] is supposed to indicate an empty array, but it results
 in a null one, which not only helps blur the line between null and empty, but
 it makes it so that (similar to AAs), there's no clean way to simply declare
 an empty array.

Why do you want so much an empty array that's not null? I can't make sense of this entire argument. Andrei
Jul 04 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/5/13 2:05 AM, Regan Heath wrote:
 On Thu, 04 Jul 2013 18:26:09 +0100, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Why do you want so much an empty array that's not null? I can't make
 sense of this entire argument.

Suppose you have a web page, and suppose it has a text field on it called "comment". Suppose you load a pre-existing record from your database and populate the page, that the record had a value for comment, and that you want to set that comment to be blank. You edit and click save. The code backing this page is going to get a string for "comment"; that string should be empty but not null. Why? Because if it were null it would have a different meaning. It would mean that the comment field was not present on the page at all, and should not be altered.

I find the example tenuous. Even assuming it has merit, it does not explain the need for a syntactic _literal_ to fulfill that need.
 There are many such examples.

I am not convinced there are many such examples. I'd call it poor design to make it a cornerstone to distinguish between empty arrays that are null and empty arrays that are not only non-null, but aren't part of any other array! Let me emphasize the last part. Ironically, there is a potential interesting use of empty non-null arrays as anchors: an empty slice referring to the interior of an array may be combined with another slice, pointer, or length, to create a new, meaningful slice. However, the discussed literal does nothing of that kind - it just fabricates a slice of an array that does not exist.
 All of them can be worked around by
 various means but these are all more complex and require additional
 containers or variables to represent state.

I would say those are better designs.
 null - does not exist, was not specified.
 empty - exists and was intentionally set to be empty.

Such empty arrays occur as natural reductions from other arrays. Why yet another literal for such?
 I think arrays will be most useful if we can treat them like safe
 reference types - thin wrappers around the unsafe ptr reference type. To
 do that, we need null/empty to be stable/reliable states.

 If not, then array becomes like 'int' and we have to invent a special
 value to represent the null case (or use other containers/variables to
 represent null) like we do for int.

I am not convinced.

Andrei
Jul 05 2013
prev sibling next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
04-Jul-2013 19:00, Regan Heath wrote:
 In fact, you can generalise further.

 The meaning of if(x) is "compare the value of x with 0" (in C, C++, .. ).

 The value of x for a pointer is the address to which it points.
 The value of x for a class reference is the address of the class to
 which it refers.

 If D's arrays are reference types,

They are not. It's a half-reference no wonder it has a bit of schizophrenia now and then.

 then IMO they should exhibit the same
 behaviour.

The behavior should be the most useful one, and arr.length != 0 is what a programmer wants to check 99% of the time.

-- 
Dmitry Olshansky
Jul 04 2013
next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
05-Jul-2013 13:01, TommiT wrote:
 On Thursday, 4 July 2013 at 19:15:09 UTC, Dmitry Olshansky wrote:
 04-Jul-2013 19:00, Regan Heath wrote:
 In fact, you can generalise further.

 The meaning of if(x) is "compare the value of x with 0" (in C, C++,
 .. ).

 The value of x for a pointer is the address to which it points.
 The value of x for a class reference is the address of the class to
 which it refers.

 If D's arrays are reference types,

They are not. It's a half-reference no wonder it has a bit of schizophrenia now and then.

What do you mean by D's dynamic arrays being half-reference types? And what kind of "schizophrenia" do they exhibit?

For instance, passing a slice by value means passing the length by value and the data pointed to by the pointer (hence half-ref):

void messWith(int[] slice) {
     slice[0] = 45; // changes data pointed to, reference semantics
     slice.length = 0; // doesn't, length is not part of 'reference semantic'
}

void main() {
     auto test = [1, 2, 3];
     messWith(test);
     assert(test.length == 3);
     assert(test[0] == 45);
}

-- 
Dmitry Olshansky
Jul 05 2013
prev sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
05-Jul-2013 12:55, Regan Heath wrote:
 On Thu, 04 Jul 2013 20:15:08 +0100, Dmitry Olshansky
 <dmitry.olsh gmail.com> wrote:

 04-Jul-2013 19:00, Regan Heath wrote:
 In fact, you can generalise further.

 The meaning of if(x) is "compare the value of x with 0" (in C, C++,
 .. ).

 The value of x for a pointer is the address to which it points.
 The value of x for a class reference is the address of the class to
 which it refers.

 If D's arrays are reference types,

They are not. It's a half-reference no wonder it has a bit of schizophrenia now and then.

True. The struct which contains the ptr and length is actually a value type. I think conceptually however we should be thinking of them as reference types, because.. the array struct is effectively a lightweight wrapper (adding length) around a reference type (ptr).
   then IMO they should exhibit the same
 behaviour.

The behavior should be the most useful one, and arr.length != 0 is what a programmer wants to check 99% of the time.

IMO, the behaviour should be consistent. If you code if (x) then the compiler will compare 'x' (not a property of x) to 0. Doing anything else would be inconsistent and unexpected to anyone from a C background.

Then, by your logic, comparing a slice to null means checking that both ptr and length equal 0. That is a completely broken idea, hence I'd simply propose to disable it.

-- 
Dmitry Olshansky
Jul 05 2013
next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
05-Jul-2013 13:24, Regan Heath wrote:
 On Fri, 05 Jul 2013 10:13:11 +0100, Dmitry Olshansky
 <dmitry.olsh gmail.com> wrote:

 05-Jul-2013 12:55, Regan Heath wrote:
 On Thu, 04 Jul 2013 20:15:08 +0100, Dmitry Olshansky
 <dmitry.olsh gmail.com> wrote:

 04-Jul-2013 19:00, Regan Heath wrote:
 In fact, you can generalise further.

 The meaning of if(x) is "compare the value of x with 0" (in C, C++,
 .. ).

 The value of x for a pointer is the address to which it points.
 The value of x for a class reference is the address of the class to
 which it refers.

 If D's arrays are reference types,

They are not. It's a half-reference no wonder it has a bit of schizophrenia now and then.

True. The struct which contains the ptr and length is actually a value type. I think conceptually however we should be thinking of them as reference types, because.. the array struct is effectively a lightweight wrapper (adding length) around a reference type (ptr).
   then IMO they should exhibit the same
 behaviour.

The behavior should be the most useful one, and arr.length != 0 is what a programmer wants to check 99% of the time.

IMO, the behaviour should be consistent. If you code if (x) then the compiler will compare 'x' (not a property of x) to 0. Doing anything else would be inconsistent and unexpected to anyone from a C background.

Then, by your logic, comparing a slice to null means checking that both ptr and length equal 0. That is a completely broken idea, hence I'd simply propose to disable it.

I think I need to clarify. I am making 3 statements here:

1. Arrays are a thin wrapper around a reference type (ptr) which add safety.

Rather it packs 2 pointers (pair: ptr, ptr+len), modeling the region in between. No matter how we look at it, it doesn't overlap with most of pointer operations except for indexing. In my opinion it can't be framed as a thin wrapper around _one_ pointer.
 2. When you have a thin wrapper you should treat operations on the
 wrapper as the wrapped object in the general case.

Continuation of the above stretch. To be a true wrapper it has to support only the same or subset of operations. For instance arrays have slicing operation hence it's more than that.
 3. if (x) should compare x to 0.

This one is consistent.
 Given those statements I have come to the conclusion that if (x) on an
 array should compare x.ptr to 0.

I'd agree if arrays did decay to pointers or integers on demand (implicit conversion).
 Which of my statements do you disagree with, and why?

 R

-- Dmitry Olshansky
Jul 05 2013
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
05-Jul-2013 16:26, Artur Skawina wrote:
 On 07/05/13 12:39, Dmitry Olshansky wrote:
 05-Jul-2013 13:24, Regan Heath wrote:
 1. Arrays are a thin wrapper around a reference type (ptr) which add
 safety.

Rather it packs 2 pointers (pair: ptr, ptr+len), modeling the region in between. No matter how we look at it, it doesn't overlap with most of pointer operations except for indexing. In my opinion it can't be framed as a thin wrapper around _one_ pointer.

The 'array' term is overloaded and confusing, let's call them "slices". Slices are nothing but fat pointers; they just carry more information - the length.

The interesting fact is that a slice introduces a new concept - the empty slice. It bears no relation to a view of a "wrapped" pointer at all. (A 0-pointer is not "empty"; it can't be empty, it's just invalid.)
 2. When you have a thin wrapper you should treat operations on the
 wrapper as the wrapped object in the general case.

Continuation of the above stretch. To be a true wrapper it has to support only the same or subset of operations. For instance arrays have slicing operation hence it's more than that.

ubyte* pointer = null;
auto slice = pointer[0..42];

I'd argue that this creates a slice, but point taken. Slicing aside, there is this notion of empty. My point was that the space between 2 pointers != a pointer semantically, hence it doesn't need to behave like one no matter the statement. It's like saying that an integer is the same as an interval of integers because it only adds an extra length field to a starting point. The concept is different.
 3. if (x) should compare x to 0.

This one is consistent.

Actually, slices should implicitly convert to bool with !!length. "Normal" pointers couldn't do that - because they had no length.

Meta-arguments aside, for me it ends like this:

1) disallow if(arr)
2) or make it the same as if(arr.length), as that is more useful/frequent/meaningful

The first step is required anyway if we are to change things.

-- 
Dmitry Olshansky
Jul 05 2013
prev sibling next sibling parent Artur Skawina <art.08.09 gmail.com> writes:
On 07/05/13 12:39, Dmitry Olshansky wrote:
 05-Jul-2013 13:24, Regan Heath wrote:
 1. Arrays are a thin wrapper around a reference type (ptr) which add
 safety.

Rather it packs 2 pointers (pair: ptr, ptr+len), modeling the region in between. No matter how we look at it, it doesn't overlap with most of pointer operations except for indexing. In my opinion it can't be framed as a thin wrapper around _one_ pointer.

The 'array' term is overloaded and confusing, let's call them "slices". Slices are nothing but fat pointers; they just carry more information - the length.
 2. When you have a thin wrapper you should treat operations on the
 wrapper as the wrapped object in the general case.

Continuation of the above stretch. To be a true wrapper it has to support only the same or subset of operations. For instance arrays have slicing operation hence it's more than that.

ubyte* pointer = null;
auto slice = pointer[0..42];
 3. if (x) should compare x to 0.

This one is consistent.

Actually, slices should implicitly convert to bool with !!length. "Normal" pointers couldn't do that - because they had no length.

artur
Jul 05 2013
prev sibling parent Artur Skawina <art.08.09 gmail.com> writes:
On 07/05/13 14:26, Artur Skawina wrote:
 
 Actually, slices should implicitly convert to bool with !!length.
 "Normal" pointers couldn't do that - because they had no length.

s/implicitly//

artur
Jul 05 2013
prev sibling parent "Daniel Murphy" <yebblies nospamgmail.com> writes:
"TommiT" <tommitissari hotmail.com> wrote in message 
news:cajkpllpdchpphqxhyer forum.dlang.org...
 On Thursday, 4 July 2013 at 13:32:25 UTC, Steven Schveighoffer wrote:
 On Thu, 04 Jul 2013 08:52:12 -0400, Regan Heath wrote:
 Indeed.  IMO if(arr) should mean if(arr.ptr) .. and I thought it did.. 
 or did this change at some point?

No, it should mean if(arr.length). It means if(arr.ptr) now, and this is incorrect. [..]

The meaning of if(x) for all x of nullable types has always been if(x != null) probably in all languages.

I completely agree, these three functions should return the same no matter the input:

bool f(T)(T[] x) {
     if (x) return true;
     return false;
}

bool g(T)(T[] x) {
     if (x != null) return true;
     return false;
}

bool h(T)(T[] x) {
     if (x.length) return true;
     return false;
}

The last two already do.
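A minimal sketch of how the three tests come apart for an empty but non-null slice. The names viaNull, viaLength and viaPtr are hypothetical, and viaPtr spells out the pointer test explicitly (what the thread says if (x) currently means) instead of relying on if (x) itself:

```d
// Hypothetical helpers, one per test discussed in the thread.
bool viaNull(T)(T[] x)   { return x != null; }      // content comparison with null
bool viaLength(T)(T[] x) { return x.length != 0; }  // emptiness test
bool viaPtr(T)(T[] x)    { return x.ptr !is null; } // explicit pointer test

void main()
{
    int[] nullArr;                    // default-initialized: ptr is null, length is 0
    int[] backing = [1, 2, 3];
    int[] emptyArr = backing[0 .. 0]; // length 0, but ptr points into backing

    // All three agree on a null slice and on a slice with elements.
    assert(!viaNull(nullArr) && !viaLength(nullArr) && !viaPtr(nullArr));
    assert(viaNull(backing) && viaLength(backing) && viaPtr(backing));

    // Only the pointer test treats an empty, non-null slice as "true".
    assert(!viaNull(emptyArr) && !viaLength(emptyArr));
    assert(viaPtr(emptyArr));
}
```

The x != null comparison looks at contents, so it cannot tell a null slice from an empty one; only the is operator or the .ptr property can.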
Jul 05 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
TommiT:

 But I would expect [] to mean "empty dynamic array", not null. 
 And the same for [:]. This is how I'd expect things to work:

 int[] nullArray;
 assert(nullArray is null);

 int[] emptyArray = [];
 assert(emptyArray !is null);

 int[string] nullAA;
 assert(nullAA is null);

 int[string] emptyAA = [:];
 assert(emptyAA !is null);

 Reasoning by extrapolation:
 int[] arr = [1, 2, 3]; // Array of 3 ints
 int[] arr = [1, 2]; // Array of 2 ints
 int[] arr = [1]; // Array of 1 ints
 int[] arr = []; // Array of 0 ints (not null)

I have discussed the topic here regarding dynamic arrays:

http://d.puremagic.com/issues/show_bug.cgi?id=3889
http://d.puremagic.com/issues/show_bug.cgi?id=5788

This means you have to keep "null" in the language to represent an empty associative array, because someone somewhere will surely want a literal that avoids memory allocations. So lot of people will keep using null, and the coding situation is improved about zero.

Telling apart the literal for an empty array from the literal of a empty but not null array is a bad idea that muds the language. And thankfully this currently fails:

void main() {
     int[] emptyArray = [];
     assert(emptyArray !is null);
}

Bye,
bearophile
Jul 03 2013
prev sibling next sibling parent "Diggory" <diggsey googlemail.com> writes:
I agree - the current state with both null and non-null empty 
arrays is confusing, and the separate syntax for empty arrays and 
associative arrays is nice and consistent.

I would go as far as to say it should be impossible, or at least 
fairly difficult in safe code to distinguish (both intentionally 
and unintentionally) between an array of size zero with a 
non-null pointer and an array of size zero with a null pointer, 
and similar for associative arrays too.
Jul 03 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
 Telling apart the literal for an empty array from the literal 
 of a empty but not null array is a bad idea that muds the 
 language. And thankfully this currently fails:

 void main() {
     int[] emptyArray = [];
     assert(emptyArray !is null);
 }

But currently this code:

void main() {
     int[] emptyArray = [];
}

produces a call to __d_arrayliteralTX, for reasons unknown to me:

_D4temp10emptyArrayFZAi comdat
L0:     push    EAX
        mov EAX,offset FLAT:_D11TypeInfo_Ai6__initZ
        push    0
        push    EAX
        call    near ptr __d_arrayliteralTX
        mov EDX,EAX
        add ESP,8
        pop ECX
        xor EAX,EAX
        ret

Bye,
bearophile
Jul 03 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
 produces a call to __d_arrayliteralTX, for reasons unknown to 
 me:


 _D4temp10emptyArrayFZAi comdat
 L0:     push    EAX
         mov EAX,offset FLAT:_D11TypeInfo_Ai6__initZ
         push    0
         push    EAX
         call    near ptr __d_arrayliteralTX
         mov EDX,EAX
         add ESP,8
         pop ECX
         xor EAX,EAX
         ret

Sorry, my mistake, I meant:

__Dmain comdat
L0:     push    EAX
        mov EAX,offset FLAT:_D11TypeInfo_Ai6__initZ
        push    0
        push    EAX
        call    near ptr __d_arrayliteralTX
        add ESP,8
        xor EAX,EAX
        pop ECX
        ret

Bye,
bearophile
Jul 03 2013
prev sibling next sibling parent "TommiT" <tommitissari hotmail.com> writes:
On Wednesday, 3 July 2013 at 18:10:42 UTC, bearophile wrote:
 TommiT:
 But I would expect [] to mean "empty dynamic array", not null. 
 [..]

This means you have to keep "null" in the language to represent an empty associative array, because someone somewhere will surely want a literal that avoids memory allocations. So lot of people will keep using null, and the coding situation is improved about zero.

Okay, I didn't realize an empty array would need to allocate.
Jul 03 2013
prev sibling next sibling parent "Henning Pohl" <henning still-hidden.de> writes:
On Thursday, 4 July 2013 at 00:52:13 UTC, Timon Gehr wrote:
 Also, I think [] should have a singleton type as well. 
 Currently it is a void[] with special implicit conversion rules.

+1
Jul 03 2013
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, July 03, 2013 22:00:39 Andrei Alexandrescu wrote:
 A lot less arbitrary than having to initialize AAs into an empty state
 by adding and removing a mapping.

Could be a function call. I find it unnecessary to just add new notation for every single little thing.

True, but we should probably at least add something to the AA implementation to do this, since a function at that level can create an AA that's empty without having to add and remove an element. And actually, I might as well open an enhancement request for that:

http://d.puremagic.com/issues/show_bug.cgi?id=10535

- Jonathan M Davis
Jul 03 2013
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 3 July 2013 at 16:55:59 UTC, bearophile wrote:
 deadalnix:

 Why not simply [] ? After all, an array is just a very simple 
 type of hashmap, with the identity as hash function.

In a language it's very often handy for different data types to have different textual representations and different literals.

[] can be int[] or foo!bar[], so you'll have to justify why it is specifically good to make the difference in this case, while it is generally not needed.
Jul 03 2013
prev sibling next sibling parent "Regan Heath" <regan netmail.co.nz> writes:
On Wed, 03 Jul 2013 19:10:40 +0100, bearophile <bearophileHUGS lycos.com>  
wrote:
 Telling apart the literal for an empty array from the literal of a empty  
 but not null array is a bad idea that muds the language. And thankfully  
 this currently fails:

 void main() {
      int[] emptyArray = [];
      assert(emptyArray !is null);
 }

As this comes up often you're probably aware that there are people (like myself) who find the distinction between a null (non-existent) array and an empty array useful. Granted, it can complicate code but only if you want the distinction, in all other cases you should be checking array.length == 0 which is true for empty and null arrays.

Also, there are ways to get around the issue which all involve having a separate record of whether something exists or not - perhaps by placing items in an associative array and using 'contains' or having a separate existence boolean - but they're all band aids over the issue of not having an actual reference type.

The humble char* pointer can represent null (non-existent), "" (empty), and "value", and I would find it useful if string could do the same, consistently (there are edge cases where arrays change from empty to null and vice-versa).

Tommi's suggestion that all empty arrays share a common pointer is a good one and more or less solves the "it has to allocate memory" complaint. So, that just leaves arguments against complexity or arguments against the whole concept of null/empty being useful.

R
Jul 04 2013
prev sibling next sibling parent "Regan Heath" <regan netmail.co.nz> writes:
On Wed, 03 Jul 2013 19:15:19 +0100, Diggory <diggsey googlemail.com> wrote:
 I would go as far as to say it should be impossible, or at least fairly  
 difficult in safe code to distinguish (both intentionally and  
 unintentionally) between an array of size zero with a non-null pointer  
 and an array of size zero with a null pointer, and similar for  
 associative arrays too.

Being able to represent both non-existent (null) and empty ([]) is a useful property, one that pointers and true references have. If D can give us that power with arrays /and/ do it safely, then why wouldn't we want that?

If you don't care about the distinction between null and empty, you just check array.length == 0. If you do care about the distinction you should be able to consistently tell them apart, and they should not mutate from one to the other as a side effect of other operations (as they currently do). Statements like "if (array is null)"/"if (array !is null)" should be used to check for a null (non-existent) array, and empty arrays should not match.

The main argument against having a distinction in the past has been that to have an empty array you need a non-null array pointer, which means you may have to allocate something, which is a bit wasteful if you don't really care about the difference. Tommi made a suggestion earlier in the thread that all empty arrays could point to the same global/thread local 1 byte block of pre-allocated memory, and this neatly solves the issue/complaint I believe.

R
Jul 04 2013
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
I am afraid I don't really follow here. How does introducing a new array 
literal (and making the old one more strongly typed) compromise 
typeof(null) features?

On Thursday, 4 July 2013 at 00:28:48 UTC, Andrei Alexandrescu 
wrote:
 typeof(null) has quite a few interesting properties. It's the 
 closest type to the bottom of all types (we don't have an 
 actual bottom type), and it subtypes many other types as 
 mentioned.

 Introducing yet another type works against that nice uniformity 
 and is yet another arbitrary little thing that people who learn 
 the language would need to know about.


 Andrei

Jul 04 2013
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 04 Jul 2013 05:25:30 -0400, Regan Heath <regan netmail.co.nz>  
wrote:

 On Wed, 03 Jul 2013 19:10:40 +0100, bearophile  
 <bearophileHUGS lycos.com> wrote:
 Telling apart the literal for an empty array from the literal of a  
 empty but not null array is a bad idea that muds the language. And  
 thankfully this currently fails:

 void main() {
      int[] emptyArray = [];
      assert(emptyArray !is null);
 }

As this comes up often you're probably aware that there are people (like myself) who find the distinction between a null (non-existent) array and an empty array useful.

Nobody questions that. The biggest problem is making if(arr) mean if(arr.ptr) instead of if(arr.length)

What [] returns should not be an allocation. And returning null is a reasonable implementation of that.

-Steve
Jul 04 2013
prev sibling next sibling parent "Regan Heath" <regan netmail.co.nz> writes:
On Thu, 04 Jul 2013 12:50:54 +0100, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Thu, 04 Jul 2013 05:25:30 -0400, Regan Heath <regan netmail.co.nz>  
 wrote:

 On Wed, 03 Jul 2013 19:10:40 +0100, bearophile  
 <bearophileHUGS lycos.com> wrote:
 Telling apart the literal for an empty array from the literal of a  
 empty but not null array is a bad idea that muds the language. And  
 thankfully this currently fails:

 void main() {
      int[] emptyArray = [];
      assert(emptyArray !is null);
 }

As this comes up often you're probably aware that there are people (like myself) who find the distinction between a null (non-existent) array and an empty array useful.

Nobody questions that. The biggest problem is making if(arr) mean if(arr.ptr) instead of if(arr.length)

Indeed. IMO if(arr) should mean if(arr.ptr) .. and I thought it did.. or did this change at some point?
 What [] returns should not be an allocation. And returning null is a  
 reasonable implementation of that.

Whether there is an allocation or not is secondary. The primary goal is for [] to represent empty, not null. We have null, if we want to represent null we pass null. What we lack is a way to represent empty.

So, I would say that what [] returns should be empty, and not null. Secondarily we want to avoid allocation, so .. can we not have [] return a slice of length 0 with ptr set to a global pre-allocated single byte of memory?

R
Jul 04 2013
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Thursday, 4 July 2013 at 12:42:51 UTC, Timon Gehr wrote:
 On 07/04/2013 01:50 PM, Steven Schveighoffer wrote:
 On Thu, 04 Jul 2013 05:25:30 -0400, Regan Heath 
 <regan netmail.co.nz>
 wrote:

 On Wed, 03 Jul 2013 19:10:40 +0100, bearophile
 <bearophileHUGS lycos.com> wrote:
 Telling apart the literal for an empty array from the 
 literal of a
 empty but not null array is a bad idea that muds the 
 language. And
 thankfully this currently fails:

 void main() {
     int[] emptyArray = [];
     assert(emptyArray !is null);
 }

As this comes up often you're probably aware that there are people (like myself) who find the distinction between a null (non-existent) array and an empty array useful.

Nobody questions that. The biggest problem is making if(arr) mean if(arr.ptr) instead of if(arr.length)

if(arr.ptr) is what it means now.
 What [] returns should not be an allocation.  And returning 
 null is a
 reasonable implementation of that.

 -Steve

static __gshared void[1] x;
return x[0..0];

+1. If you really want null, you can just as well use null for typeless, or "T[].init" for strong typing. As stated before:
 Reasoning by extrapolation:
 int[] arr = [1, 2, 3]; // Array of 3 ints
 int[] arr = [1, 2]; // Array of 2 ints
 int[] arr = [1]; // Array of 1 ints
 int[] arr = []; // Array of 0 ints (not null)

As demonstrated this can be done with no allocations either.
Jul 04 2013
prev sibling next sibling parent "Regan Heath" <regan netmail.co.nz> writes:
On Thu, 04 Jul 2013 13:41:48 +0100, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 07/04/2013 11:25 AM, Regan Heath wrote:
 ...

 Tommi's suggestion that all empty arrays share a common pointer is a
 good one and more or less solves the "it has to allocate memory"  
 complaint.
 ...

(That was actually my suggestion. :o) )

Whoops :P
 Sharing a common pointer has some drawbacks as well, because it still  
 makes [] somewhat special. It is probably worth it though.

True, it may require some special casing in the array code on mutations of the array etc.

R
Jul 04 2013
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 04 Jul 2013 08:42:50 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 07/04/2013 01:50 PM, Steven Schveighoffer wrote:
 On Thu, 04 Jul 2013 05:25:30 -0400, Regan Heath <regan netmail.co.nz>
 wrote:

 On Wed, 03 Jul 2013 19:10:40 +0100, bearophile
 <bearophileHUGS lycos.com> wrote:
 Telling apart the literal for an empty array from the literal of a
 empty but not null array is a bad idea that muds the language. And
 thankfully this currently fails:

 void main() {
      int[] emptyArray = [];
      assert(emptyArray !is null);
 }

As this comes up often you're probably aware that there are people (like myself) who find the distinction between a null (non-existent) array and an empty array useful.

Nobody questions that. The biggest problem is making if(arr) mean if(arr.ptr) instead of if(arr.length)

if(arr.ptr) is what it means now.

I know, and that is a problem. -Steve
 What [] returns should not be an allocation.  And returning null is a
 reasonable implementation of that.

static __gshared void[1] x;
return x[0..0];

That's also a valid solution, along with:

void *x = null;
return x[0..0];

-Steve
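A rough sketch of the first variant, generalised to any element type. The helper name emptySlice and the storage variable are hypothetical, not anything in druntime; the point is only that a non-null empty slice can be produced without touching the GC:

```d
// Hypothetical helper: returns an empty, non-null slice without allocating.
T[] emptySlice(T)()
{
    // One statically allocated element per T; it is never read or written,
    // it only provides a non-null address for the slice to point at.
    static __gshared T[1] storage;
    return storage[0 .. 0];
}

void main()
{
    int[] e = emptySlice!int();
    assert(e.length == 0);
    assert(e !is null);                    // ptr is non-null
    assert(e.ptr is emptySlice!int().ptr); // every call shares the same storage

    int[] n = null;
    assert(n is null);
    assert(n == e); // == compares contents, so null and empty still compare equal
}
```

Under this scheme only identity tests (is, .ptr) see the difference; code that checks length or uses == behaves exactly as it does with null.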
Jul 04 2013
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 04 Jul 2013 08:52:12 -0400, Regan Heath <regan netmail.co.nz>  
wrote:

 On Thu, 04 Jul 2013 12:50:54 +0100, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:

 On Thu, 04 Jul 2013 05:25:30 -0400, Regan Heath <regan netmail.co.nz>  
 wrote:

 On Wed, 03 Jul 2013 19:10:40 +0100, bearophile  
 <bearophileHUGS lycos.com> wrote:
 Telling apart the literal for an empty array from the literal of a  
 empty but not null array is a bad idea that muds the language. And  
 thankfully this currently fails:

 void main() {
      int[] emptyArray = [];
      assert(emptyArray !is null);
 }

As this comes up often you're probably aware that there are people (like myself) who find the distinction between a null (non-existent) array and an empty array useful.

Nobody questions that. The biggest problem is making if(arr) mean if(arr.ptr) instead of if(arr.length)

Indeed. IMO if(arr) should mean if(arr.ptr) .. and I thought it did.. or did this change at some point?

No, it should mean if(arr.length). It means if(arr.ptr) now, and this is incorrect. I don't care if an empty array points at null or at some other value, I care if it's empty or not. You can always use if(arr is null) or if(arr.ptr) if you prefer to distinguish null arrays from non-null ones. As of now, it's impossible to not care.
 What [] returns should not be an allocation. And returning null is a  
 reasonable implementation of that.

Whether there is an allocation or not is secondary. The primary goal is for [] to represent empty, not null. We have null, if we want to represent null we pass null. What we lack is a way to represent empty.

What I'm saying is that [] returning null is reasonable. Returning some non-null empty array is also reasonable if it's not an allocation. In the realm of arrays, you are not supposed to care where it's stored, you only care about the elements and the length. I would not be opposed to a pull request that made [] be non-null, as long as it doesn't allocate.

-Steve
Jul 04 2013
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Thursday, 4 July 2013 at 13:32:25 UTC, Steven Schveighoffer 
wrote:
 Indeed.  IMO if(arr) should mean if(arr.ptr) .. and I thought 
 it did.. or did this change at some point?

No, it should mean if(arr.length).

*I* think it should simply mean compilation error.

There is, arguably, a "big" difference between null and empty: null means not yet initialized, whereas empty means initialized, but contains no info. This is an important distinction for things such as lazy initialization, or just plain checking data integrity. It can also mean important distinctions for serialization, such as JSON or xml, to differentiate between a null node, or a node that contains nothing.

A slice is the abstraction of both pointer and container. Wanting to test null pointer or empty are both legitimate operations. IMO, you should have to state which you want. Making arbitrary choices for the user in the face of ambiguity is evil.
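For illustration, stating which test is wanted could look like the sketch below. isNullSlice and isEmptySlice are made-up names, and the "missing field" versus "blank field" reading is just one possible use of the distinction:

```d
// Hypothetical helpers that make the intended test explicit at the call site.
bool isNullSlice(T)(T[] a)  { return a.ptr is null; }
bool isEmptySlice(T)(T[] a) { return a.length == 0; }

void main()
{
    char[] missing = null;         // e.g. a field that was never set
    char[] buffer = "x".dup;
    char[] blank = buffer[0 .. 0]; // set, but holding no characters

    // A null slice is also empty.
    assert(isNullSlice(missing) && isEmptySlice(missing));

    // An empty slice need not be null.
    assert(!isNullSlice(blank) && isEmptySlice(blank));

    // A slice with contents is neither.
    assert(!isNullSlice(buffer) && !isEmptySlice(buffer));
}
```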
Jul 04 2013
prev sibling next sibling parent "TommiT" <tommitissari hotmail.com> writes:
On Thursday, 4 July 2013 at 13:32:25 UTC, Steven Schveighoffer 
wrote:
 On Thu, 04 Jul 2013 08:52:12 -0400, Regan Heath wrote:
 Indeed.  IMO if(arr) should mean if(arr.ptr) .. and I thought 
 it did.. or did this change at some point?

No, it should mean if(arr.length). It means if(arr.ptr) now, and this is incorrect. [..]

The meaning of if(x) for all x of nullable types has always been if(x != null) probably in all languages.
Jul 04 2013
prev sibling next sibling parent "Regan Heath" <regan netmail.co.nz> writes:
On Thu, 04 Jul 2013 15:45:54 +0100, TommiT <tommitissari hotmail.com>  
wrote:

 On Thursday, 4 July 2013 at 13:32:25 UTC, Steven Schveighoffer wrote:
 On Thu, 04 Jul 2013 08:52:12 -0400, Regan Heath wrote:
 Indeed.  IMO if(arr) should mean if(arr.ptr) .. and I thought it did..  
 or did this change at some point?

No, it should mean if(arr.length). It means if(arr.ptr) now, and this is incorrect. [..]

The meaning of if(x) for all x of nullable types has always been if(x != null) probably in all languages.

In fact, you can generalise further. The meaning of if(x) is "compare the value of x with 0" (in C, C++, .. ). The value of x for a pointer is the address to which it points. The value of x for a class reference is the address of the class to which it refers. If D's arrays are reference types, then IMO they should exhibit the same behaviour. R -- Using Opera's revolutionary email client: http://www.opera.com/mail/
Jul 04 2013
prev sibling next sibling parent "Regan Heath" <regan netmail.co.nz> writes:
On Thu, 04 Jul 2013 15:35:30 +0100, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 7/4/13 6:32 AM, Steven Schveighoffer wrote:
 I would not be opposed to a pull request that made [] be non-null, as
 long as it doesn't allocate.

What would be the benefits?

Being able to naturally specify a non-null empty array (literal) such that...

char[] n = null;
char[] e = [];
assert(n is null);
assert(e !is null);

R -- Using Opera's revolutionary email client: http://www.opera.com/mail/
Jul 04 2013
prev sibling next sibling parent "Regan Heath" <regan netmail.co.nz> writes:
On Thu, 04 Jul 2013 15:11:49 +0100, monarch_dodra <monarchdodra gmail.com>  
wrote:

 On Thursday, 4 July 2013 at 13:32:25 UTC, Steven Schveighoffer wrote:
 Indeed.  IMO if(arr) should mean if(arr.ptr) .. and I thought it did..  
 or did this change at some point?

No, it should mean if(arr.length).

*I* think it should simply mean compilation error. There is, arguably, a "big" difference between null and empty: ... IMO, you should have to state which you want. Making arbitrary choices for the user in the face of ambiguity is evil.

That's another option, but I for one would be annoyed by this. I would also argue that it's not ambiguous, if(x) has always meant "compare x with 0" (for me / in C style languages), which in this case means comparing the array reference to null, not comparing the array length to 0. That is.. if array references are really reference types. In which case they should obey the same rules/behaviour. Imagine you had a class Array.

Array a = null;
if (a) // should compare the reference a to 0, not a.length to 0, right?

R -- Using Opera's revolutionary email client: http://www.opera.com/mail/
Jul 04 2013
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Thursday, 4 July 2013 at 15:27:17 UTC, Timon Gehr wrote:
 On 07/04/2013 04:35 PM, Andrei Alexandrescu wrote:
 On 7/4/13 6:32 AM, Steven Schveighoffer wrote:
 I would not be opposed to a pull request that made [] be 
 non-null, as
 long as it doesn't allocate.

What would be the benefits? Andrei

- Additional sentinel values at basically no cost.
- No accidental flawed relying on empty array is null or empty array !is null. (i.e. less nondeterminism.)
- One thing less to discuss (this has come up before.)

There are no benefits to making "[]" return null either. Implementation wise, instead of returning a void[] with "ptr == 0x0" and "length == 0", it could just as well return a void[] with "ptr == 0x1" and "length == 0". You'd get better behavior at no extra cost.
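The "ptr == 0x1, length == 0" idea above can be sketched in user code. This is only an illustration under assumptions: `emptyNonNull` is a hypothetical helper, not a language or library feature, and the sentinel address 1 is never dereferenced.

```d
import std.stdio : writeln;

// Hypothetical helper: builds an empty slice whose ptr is a non-null
// sentinel (address 1). Nothing is allocated and nothing is dereferenced.
T[] emptyNonNull(T)() @system
{
    // Slicing a pointer with [0 .. 0] only records ptr and length.
    return (cast(T*) 1)[0 .. 0];
}

void main()
{
    auto e = emptyNonNull!int();

    assert(e !is null);     // identity: distinguishable from null
    assert(e.length == 0);  // still empty

    // Iterating an empty slice never touches the sentinel address.
    foreach (x; e)
        assert(false);

    writeln("empty, non-null, no allocation");
}
```

Whether a compiler-generated `[]` should behave like this is exactly what the thread is debating; the sketch only shows that such a value is representable.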
Jul 04 2013
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, July 04, 2013 08:41:47 Andrei Alexandrescu wrote:
 On 7/4/13 8:02 AM, Regan Heath wrote:
 On Thu, 04 Jul 2013 15:35:30 +0100, Andrei Alexandrescu
 
 <SeeWebsiteForEmail erdani.org> wrote:
 On 7/4/13 6:32 AM, Steven Schveighoffer wrote:
 I would not be opposed to a pull request that made [] be non-null, as
 long as it doesn't allocate.

What would be the benefits?

Being able to naturally specify a non-null empty array (literal) such that...

char[] n = null;
char[] e = [];
assert(n is null);
assert(e !is null);

And what would be the benefit of that?

Making the distinction between null and empty cleaner. There are plenty of cases in CS in general where distinguishing between null and empty is useful, but unfortunately, the way we've gone about implementing null and empty with arrays in D tends to blur to the point that it's kind of iffy to use null as something distinct from empty. You can do it for simple stuff if you're careful (like explicitly returning null from a function), but it's very easy to end up with an array that's empty when you want null or vice versa. One prime case of this is [] vs null. [] is supposed to indicate an empty array, but it results in a null one, which not only helps blur the line between null and empty, but it makes it so that (similar to AAs), there's no clean way to simply declare an empty array. Now, because of how much empty and null gets blurred with dynamic arrays, you don't generally end up caring much about being able to declare an empty, non-null array as you do with AAs, but a lot of that is simply because it's arguably too risky to try and distinguish between null and empty with arrays even if you wanted to (because of how easy it is to get one when you wanted the other and how they act almost - but not quite - the same). - Jonathan M Davis
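A minimal sketch of the null/empty distinction discussed above, written with explicit tests rather than relying on `if (arr)`. The helper name `describe` is hypothetical and only there for illustration.

```d
import std.stdio : writeln;

// Hypothetical helper: classify a slice by stating which test is meant.
string describe(const(int)[] arr)
{
    if (arr is null)          // identity test: ptr is null and length is 0
        return "null";
    if (arr.length == 0)      // emptiness test: no elements
        return "empty";
    return "non-empty";
}

void main()
{
    int[] uninit = null;
    int[] backing = [1, 2, 3];
    int[] empty = backing[0 .. 0];   // empty, but ptr points into backing

    writeln(describe(uninit));   // null
    writeln(describe(empty));    // empty
    writeln(describe(backing));  // non-empty

    // Equality compares contents, so null and empty compare equal;
    // only identity (or .ptr) tells them apart.
    assert(uninit == empty);
    assert(uninit.ptr is null && empty.ptr !is null);
}
```

This is why the two states are easy to blur: every content-based operation treats them the same, and only `is` or `.ptr` can see the difference.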
Jul 04 2013
prev sibling next sibling parent "Mr. Anonymous" <mailnew4ster gmail.com> writes:
On Thursday, 4 July 2013 at 12:52:14 UTC, Regan Heath wrote:
 Whether there is an allocation or not is secondary.  The 
 primary goal is for [] to represent empty, not null.  We have 
 null, if we want to represent null we pass null.  What we lack 
 is a way to represent empty.

 So, I would say that what [] returns should be empty, and not 
 null.

 Secondarily we want to avoid allocation, so .. can we not have 
 [] return a slice of length 0 with ptr set to a global 
 pre-allocated single byte of memory?

 R

Here's C++'s way of handling it (note the "unique"): --quoting-- There is a special case for a zero-length array (N == 0). In that case, array.begin() == array.end(), which is some unique value. The effect of calling front() or back() on a zero-sized array is undefined. --/quoting-- http://en.cppreference.com/w/cpp/container/array
Jul 04 2013
prev sibling next sibling parent "TommiT" <tommitissari hotmail.com> writes:
On Thursday, 4 July 2013 at 17:32:29 UTC, Andrei Alexandrescu 
wrote:
 [..] Maybe someone returned [] thinking it will be a null 
 array. [..]

I wouldn't think that [] is null, and I suspect neither would very many other newcomers to the language. To me, the only problem with [] being null is that it doesn't look like null. It looks like an empty array. So, the problem is that [] is not what you'd intuitively expect it to be.

By the way, this must be a bug, right?

template arr(X_...)
{
     int[] arr = [X_]; // [1]
}

void main()
{
     auto a2 = arr!(1, 2);
     auto a1 = arr!(1);
     auto a0 = arr!(); // [2]
}

[1] Error: initializer must be an expression, not '()'
[2] Error: template instance main.arr!() error instantiating

...because if that's not supposed to work, then I don't see much point in having the [] literal in the language.
Jul 04 2013
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, July 04, 2013 10:26:09 Andrei Alexandrescu wrote:
 The way code should do is use null on the creation side and .empty on
 the testing side. Making [] non-null does not change that.

True. But if [] is guaranteed to be non-null, then you have a way to explicitly create empty arrays which are non-null, and then the only ways that you end up with an array that's actually null are via its init value and by actually using null. It makes it easier to actually distinguish between null and empty in your code without accidentally getting the wrong one.
 Why do you want so much an empty array that's not null? I can't make
 sense of this entire argument.

To distinguish between the cases where you have a value, but it's empty, and the cases where you don't actually have a value. It's the same reason why we have std.typecons.Nullable for types that don't have null. Arrays have null, so they shouldn't need Nullable, but if you want to be able to distinguish between null and empty, then you need to be able to be certain that something is null when you mean for it to be null and not null when it's supposed to have a value, and arrays tend to treat null and empty as almost the same thing, so it becomes easy to screw it up if you're doing anything more complicated than checking whether a function returned null or not. - Jonathan M Davis
Jul 04 2013
prev sibling next sibling parent "NotYetUsingD" <do_not_reply this_message.com> writes:
On Thursday, 4 July 2013 at 18:03:02 UTC, Andrei Alexandrescu 
wrote:
 A null array _is_ an empty array.

Proof (I'm sorry for not using D):

#include <cstddef>
#include <iostream>

int main()
{
     int* a = nullptr;
     int length = 0;
     int offset = 7;

     bool is_empty_0 = &a[0] - &a[length];
     bool is_empty_1 = a - a;
     bool is_empty_2 = (a + offset) - (a + offset);
     bool is_empty_3 = &(a + offset)[0] - &(a + offset)[length];

     std::cout << is_empty_0 << std::endl;
     std::cout << is_empty_1 << std::endl;
     std::cout << is_empty_2 << std::endl;
     std::cout << is_empty_3 << std::endl;
}
Jul 04 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 I also happen to think we don't quite need [].

It's the opposite, we don't need null to represent empty dynamic arrays. Hopefully Rust designers will avoid such points of view, they generally prefer a less sloppy design and a stronger typing. Bye, bearophile
Jul 04 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 Where does the whole "stronger typing" comes in? This is 
 poppycock. We need real arguments here.

Maybe it's a matter of definitions, for me having "null" as literal for empty array, null pointer, empty associative array, and more is more weakly typed compared to having a literal like [] usable only for empty dynamic arrays (and strings), a literal as [:] usable only for empty associative arrays, and null for pointers, class references (and little else like a Nullable). Bye, bearophile
Jul 04 2013
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Thursday, 4 July 2013 at 14:45:57 UTC, TommiT wrote:
 On Thursday, 4 July 2013 at 13:32:25 UTC, Steven Schveighoffer 
 wrote:
 On Thu, 04 Jul 2013 08:52:12 -0400, Regan Heath wrote:
 Indeed.  IMO if(arr) should mean if(arr.ptr) .. and I thought 
 it did.. or did this change at some point?

No, it should mean if(arr.length). It means if(arr.ptr) now, and this is incorrect. [..]

The meaning of if(x) for all x of nullable types has always been if(x != null) probably in all languages.

D has differentiated values and identity.
Jul 04 2013
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Thursday, 4 July 2013 at 23:52:35 UTC, bearophile wrote:
 Andrei Alexandrescu:

 Where does the whole "stronger typing" comes in? This is 
 poppycock. We need real arguments here.

Maybe it's a matter of definitions, for me having "null" as literal for empty array, null pointer, empty associative array, and more is more weakly typed compared to having a literal like [] usable only for empty dynamic arrays (and strings), a literal as [:] usable only for empty associative arrays, and null for pointers, class references (and little else like a Nullable). Bye, bearophile

[] and [:] aren't even remotely close to be strongly typed. I still don't see any reason to have a distinction.
Jul 04 2013
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
On Thursday, 4 July 2013 at 23:25:38 UTC, Andrei Alexandrescu 
wrote:
 Where does the whole "stronger typing" comes in? This is 
 poppycock. We need real arguments here.

 Andrei

More values typeof([]) can be implicitly converted to - weaker typing, that simple. Actually I'd love to be able to specify precisely typed empty array literal, like [uint] or [uint:bool] but can't find a good syntax to propose. Currently array literals are very special and it does harm the type system.
Jul 05 2013
prev sibling next sibling parent "Nicolas Sicard" <dransic gmail.com> writes:
On Thursday, 4 July 2013 at 23:52:35 UTC, bearophile wrote:
 Andrei Alexandrescu:

 Where does the whole "stronger typing" comes in? This is 
 poppycock. We need real arguments here.

Maybe it's a matter of definitions, for me having "null" as literal for empty array, null pointer, empty associative array, and more is more weakly typed compared to having a literal like [] usable only for empty dynamic arrays (and strings), a literal as [:] usable only for empty associative arrays, and null for pointers, class references (and little else like a Nullable). Bye, bearophile

While I agree with the need to have a literal for non-initialized arrays and another one for initialized but empty arrays, that is null and [] respectively, I can't see the necessity for [:]. The literal should be used to mark the difference between null and empty, not the difference between plain or associative, shouldn't it? For me, having to type

int[string] foo = [:];

instead of

int[string] foo = []; // same semantic

would just be a source of confusion.
Jul 05 2013
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Friday, 5 July 2013 at 08:15:35 UTC, Nicolas Sicard wrote:
 On Thursday, 4 July 2013 at 23:52:35 UTC, bearophile wrote:
 Andrei Alexandrescu:

 Where does the whole "stronger typing" comes in? This is 
 poppycock. We need real arguments here.

Maybe it's a matter of definitions, for me having "null" as literal for empty array, null pointer, empty associative array, and more is more weakly typed compared to having a literal like [] usable only for empty dynamic arrays (and strings), a literal as [:] usable only for empty associative arrays, and null for pointers, class references (and little else like a Nullable). Bye, bearophile

While I agree with the need to have a literal for non-initialized arrays and another one for initialized but empty arrays, that is null and [] respectively, I can't see the necessity for [:]. The literal should be used to mark the difference between null and empty, not the difference between plain or associative, shouldn't it? For me, having to type

int[string] foo = [:];

instead of

int[string] foo = []; // same semantic

would just be a source of confusion.

Keep in mind that arrays are value that *look* like reference types. You can get an array that's non-null with no allocations. Associative arrays, on the other hand, are true reference types. This means initialization *necessarily* implies allocating a payload. Having "[]" "maybe"/"maybe not" allocate is not that good of an idea. Using "[:]", on the other hand, really stresses the fact that "this AA needs to be empty but initialized, and I know I'm going to allocate for it".
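The reference-type behaviour of associative arrays described above can be sketched as follows. The function names `put` and `emptyButAllocated` are hypothetical, and the insert-then-remove trick is an assumption used here as a stand-in for the proposed [:] literal, not something the thread prescribes.

```d
import std.stdio : writeln;

// Hypothetical helper: inserts through a by-value AA parameter.
void put(int[int] aa, int key)
{
    aa[key] = key;
}

// Hypothetical helper: one way to obtain an empty AA whose payload
// already exists, by inserting a key and removing it again.
int[int] emptyButAllocated()
{
    int[int] aa;
    aa[0] = 0;     // first insertion allocates the payload
    aa.remove(0);  // payload stays allocated, AA is empty again
    return aa;
}

void main()
{
    int[int] a;                // null AA: no payload yet
    put(a, 0);                 // callee allocates its own payload
    assert(a.length == 0);     // the insertion is lost for the caller

    auto b = emptyButAllocated();
    assert(b !is null && b.length == 0);
    put(b, 0);                 // caller and callee share the payload
    assert(b == [0: 0]);

    writeln("null AA loses the insert, initialized AA keeps it");
}
```

The difference between `a` and `b` is the "initialized but empty" state that a dedicated literal would make explicit.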
Jul 05 2013
prev sibling next sibling parent "Regan Heath" <regan netmail.co.nz> writes:
On Thu, 04 Jul 2013 20:15:08 +0100, Dmitry Olshansky
<dmitry.olsh gmail.com> wrote:

 04-Jul-2013 19:00, Regan Heath wrote:
 In fact, you can generalise further.

 The meaning of if(x) is "compare the value of x with 0" (in C, C++, .. ).

 The value of x for a pointer is the address to which it points.
 The value of x for a class reference is the address of the class to
 which it refers.

 If D's arrays are reference types,

They are not. It's a half-reference no wonder it has a bit of schizophrenia now and then.

True. The struct which contains the ptr and length is actually a value type. I think conceptually however we should be thinking of them as reference types, because.. the array struct is effectively a lightweight wrapper (adding length) around a reference type (ptr).
   then IMO they should exhibit the same
 behaviour.

The behavior should be the most useful and since arr.length != 0 is what 99% of time a programmer wants to check.

IMO, the behaviour should be consistent. If you code if (x) then the compiler will compare 'x' (not a property of x) to 0. Doing anything else would be inconsistent and unexpected to anyone from a C background. If you mean to check arr.length, then code that explicitly. Coding if (arr) and having it check arr.length hides details which really should be visible for the programmer to see. R -- Using Opera's revolutionary email client: http://www.opera.com/mail/
Jul 05 2013
prev sibling next sibling parent "TommiT" <tommitissari hotmail.com> writes:
On Thursday, 4 July 2013 at 19:15:09 UTC, Dmitry Olshansky wrote:
 04-Jul-2013 19:00, Regan Heath wrote:
 In fact, you can generalise further.

 The meaning of if(x) is "compare the value of x with 0" (in C, 
 C++, .. ).

 The value of x for a pointer is the address to which it points.
 The value of x for a class reference is the address of the 
 class to
 which it refers.

 If D's arrays are reference types,

They are not. It's a half-reference no wonder it has a bit of schizophrenia now and then.

What do you mean by D's dynamic arrays being half-reference types? And what kind of "schizophrenia" do they exhibit?
Jul 05 2013
prev sibling next sibling parent "Regan Heath" <regan netmail.co.nz> writes:
On Thu, 04 Jul 2013 18:26:09 +0100, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:
 Why do you want so much an empty array that's not null? I can't make  
 sense of this entire argument.

Suppose you have a web page, suppose it has a text field on it called "comment". Suppose you load a pre-existing record from your database and populate the page, suppose it had a value for comment, suppose you want to set that comment to be blank. You edit and click save. The code backing this page is going to get a string for "comment", that string should be empty but not null. Why? Because if it were null it would have a different meaning. It would mean that the comment field was not present on the page at all, and should not be altered.

There are many such examples. All of them can be worked around by various means but these are all more complex and require additional containers or variables to represent state.

null - does not exist, was not specified.
empty - exists and was intentionally set to be empty.

I think arrays will be most useful if we can treat them like safe reference types - thin wrappers around the unsafe ptr reference type. To do that, we need null/empty to be stable/reliable states. If not, then array becomes like 'int' and we have to invent a special value to represent the null case (or use other containers/variables to represent null) like we do for int.

R -- Using Opera's revolutionary email client: http://www.opera.com/mail/
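One of the "additional containers" workarounds for the comment-field scenario can be sketched with std.typecons.Nullable, which was mentioned earlier in the thread. The function `applyComment` and its parameter names are hypothetical; this is only an illustration of the not-specified / empty / non-empty states, not a recommended design.

```d
import std.stdio : writeln;
import std.typecons : Nullable, nullable;

// Hypothetical helper: decide what the stored comment becomes after a save.
// A null submission means "field absent, leave the record alone";
// an empty submission means "the user cleared the comment".
string applyComment(string stored, Nullable!string submitted)
{
    return submitted.isNull ? stored : submitted.get;
}

void main()
{
    Nullable!string absent;   // not specified

    assert(applyComment("old text", absent) == "old text");
    assert(applyComment("old text", nullable("")) == "");
    assert(applyComment("old text", nullable("new")) == "new");

    writeln("absent keeps the old value, empty clears it");
}
```

The wrapper carries the "was it specified" bit separately from the string, which is the extra state the post argues a reliable null/empty distinction would make unnecessary.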
Jul 05 2013
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday, July 05, 2013 11:01:17 TommiT wrote:
 On Thursday, 4 July 2013 at 19:15:09 UTC, Dmitry Olshansky wrote:
 04-Jul-2013 19:00, Regan Heath wrote:
 In fact, you can generalise further.

 The meaning of if(x) is "compare the value of x with 0" (in C,
 C++, .. ).

 The value of x for a pointer is the address to which it points.
 The value of x for a class reference is the address of the
 class to
 which it refers.
 If D's arrays are reference types,

They are not. It's a half-reference no wonder it has a bit of schizophrenia now and then.

What do you mean by D's dynamic arrays being half-reference types? And what kind of "schizophrenia" do they exhibit?

Okay. Take this function.

void foo(T)(T bar)
{
     bar.mutateMe();
}

If T is a reference type, then the argument passed to foo will be mutated. If T is a value type, it won't be. But with arrays, it depends. If you alter bar's length in any way, it won't have any effect on the array that you passed to foo. However, if you alter any of bar's elements, then it _will_ alter the array that was passed in. So, arrays are sort of a half-reference type rather than a reference type or a value type.

- Jonathan M Davis
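The "half-reference" behaviour described above can be sketched concretely. The function names `setFirst`, `append` and `appendByRef` are hypothetical and only serve the illustration.

```d
import std.stdio : writeln;

// Element writes go through the shared ptr, so the caller sees them.
void setFirst(int[] arr)
{
    arr[0] = 42;
}

// Appending changes only the callee's copy of (ptr, length).
void append(int[] arr)
{
    arr ~= 99;
}

// With ref, the caller's (ptr, length) pair itself is updated.
void appendByRef(ref int[] arr)
{
    arr ~= 99;
}

void main()
{
    int[] a = [1, 2, 3];

    setFirst(a);
    assert(a == [42, 2, 3]);      // element change is visible

    append(a);
    assert(a == [42, 2, 3]);      // length change is not visible

    appendByRef(a);
    assert(a == [42, 2, 3, 99]);  // visible once passed by ref

    writeln(a);
}
```

Passing by `ref` is the usual way to make length changes propagate, which is consistent with viewing the slice as a small value-type struct holding ptr and length.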
Jul 05 2013
prev sibling next sibling parent "Regan Heath" <regan netmail.co.nz> writes:
On Fri, 05 Jul 2013 10:13:11 +0100, Dmitry Olshansky
<dmitry.olsh gmail.com> wrote:

 05-Jul-2013 12:55, Regan Heath wrote:
 On Thu, 04 Jul 2013 20:15:08 +0100, Dmitry Olshansky
 <dmitry.olsh gmail.com> wrote:

 04-Jul-2013 19:00, Regan Heath wrote:
 In fact, you can generalise further.

 The meaning of if(x) is "compare the value of x with 0" (in C, C++,
 .. ).

 The value of x for a pointer is the address to which it points.
 The value of x for a class reference is the address of the class to
 which it refers.

 If D's arrays are reference types,

They are not. It's a half-reference no wonder it has a bit of schizophrenia now and then.

True. The struct which contains the ptr and length is actually a value
 type.  I think conceptually however we should be thinking of them as
 reference types, because.. the array struct is effectively a lightweight
 wrapper (adding length) around a reference type (ptr).

   then IMO they should exhibit the same
 behaviour.

The behavior should be the most useful and since arr.length != 0 is
 what 99% of time a programmer wants to check.

IMO, the behaviour should be consistent. If you code if (x) then the
 compiler will compare 'x' (not a property of x) to 0.  Doing anything
 else would be inconsistent and unexpected to anyone from a C background.

 Then since slices compared to null by your logic means both ptr and
 length equal 0. Completely broken idea hence I'd simply propose to
 disable it.

I think I need to clarify. I am making 3 statements here:

1. Arrays are a thin wrapper around a reference type (ptr) which add safety.
2. When you have a thin wrapper you should treat operations on the wrapper as the wrapped object in the general case.
3. if (x) should compare x to 0.

Given those statements I have come to the conclusion that if (x) on an array should compare x.ptr to 0.

Which of my statements do you disagree with, and why?

R -- Using Opera's revolutionary email client: http://www.opera.com/mail/
Jul 05 2013
prev sibling next sibling parent "Regan Heath" <regan netmail.co.nz> writes:
On Fri, 05 Jul 2013 10:17:02 +0100, Jonathan M Davis <jmdavisProg gmx.com>
wrote:

 On Friday, July 05, 2013 11:01:17 TommiT wrote:
 On Thursday, 4 July 2013 at 19:15:09 UTC, Dmitry Olshansky wrote:
 04-Jul-2013 19:00, Regan Heath wrote:
 In fact, you can generalise further.

 The meaning of if(x) is "compare the value of x with 0" (in C,
 C++, .. ).

 The value of x for a pointer is the address to which it points.
 The value of x for a class reference is the address of the
 class to
 which it refers.

 If D's arrays are reference types,

They are not. It's a half-reference no wonder it has a bit of schizophrenia now and then.

What do you mean by D's dynamic arrays being half-reference types? And what kind of "schizophrenia" do they exhibit?

Okay. Take this function.

void foo(T)(T bar)
{
     bar.mutateMe();
}

If T is a reference type, then the argument passed to foo will be mutated. If T is a value type, it won't be. But with arrays, it depends. If you alter bar's length in any way, it won't have any effect on the array that you passed to foo. However, if you alter any of bar's elements, then it _will_ alter the array that was passed in. So, arrays are sort of a half-reference type rather than a reference type or a value type.

I think it is simpler to think of arrays as a struct with 2 members (ptr and length) and this struct is/as a value type. Then it all simply makes sense. R -- Using Opera's revolutionary email client: http://www.opera.com/mail/
Jul 05 2013
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday, July 05, 2013 10:26:09 Regan Heath wrote:
 I think it is simpler to think of arrays as a struct with 2 members (ptr
 and length) and this struct is/as a value type.  Then it all simply makes
 sense.

I'm not saying that it can't be understood. It's just that the semantics are a bit unique, and while it works just fine, it does tend to cause confusion for many people (at least at first). - Jonathan M Davis
Jul 05 2013
prev sibling next sibling parent "TommiT" <tommitissari hotmail.com> writes:
On Friday, 5 July 2013 at 10:39:38 UTC, Dmitry Olshansky wrote:
 05-Jul-2013 13:24, Regan Heath пишет:
 Given those statements I have come to the conclusion that if 
 (x) on an
 array should compare x.ptr to 0.

I'd agree if arrays did decay to pointers or integers on demand (implicit conversion).

Are you arguing that if(x) shouldn't even compile if x is a slice? If so, I was thinking the same thing. The fact that if(x) compiles makes sense only if you think of slice as a struct which happens to provide a convenience cast(bool) operator (which slice doesn't provide). The meaning of if(x) is obvious if x is a full/real/proper reference type. But since slice is just a half-reference, the meaning of if(x) for slices is not obvious, and thus, it shouldn't compile.
Jul 05 2013
prev sibling next sibling parent "Regan Heath" <regan netmail.co.nz> writes:
On Fri, 05 Jul 2013 11:39:37 +0100, Dmitry Olshansky
<dmitry.olsh gmail.com> wrote:
 05-Jul-2013 13:24, Regan Heath wrote:
 On Fri, 05 Jul 2013 10:13:11 +0100, Dmitry Olshansky
 <dmitry.olsh gmail.com> wrote:

 05-Jul-2013 12:55, Regan Heath wrote:
 On Thu, 04 Jul 2013 20:15:08 +0100, Dmitry Olshansky
 <dmitry.olsh gmail.com> wrote:

 04-Jul-2013 19:00, Regan Heath wrote:
 In fact, you can generalise further.

 The meaning of if(x) is "compare the value of x with 0" (in C, C++,
 .. ).

 The value of x for a pointer is the address to which it points.
 The value of x for a class reference is the address of the class to
 which it refers.

 If D's arrays are reference types,

They are not. It's a half-reference no wonder it has a bit of schizophrenia now and then.

True. The struct which contains the ptr and length is actually a value
 type.  I think conceptually however we should be thinking of them as
 reference types, because.. the array struct is effectively a lightweight
 wrapper (adding length) around a reference type (ptr).

   then IMO they should exhibit the same
 behaviour.

The behavior should be the most useful and since arr.length != 0 is
 what 99% of time a programmer wants to check.

IMO, the behaviour should be consistent. If you code if (x) then the
 compiler will compare 'x' (not a property of x) to 0.  Doing anything
 else would be inconsistent and unexpected to anyone from a C
 background.

Then since slices compared to null by your logic means both ptr and length equal 0. Completely broken idea hence I'd simply propose to disable it.

I think I need to clarify. I am making 3 statements here: 1. Arrays are a thin wrapper around a reference type (ptr) which add safety.

Rather it packs 2 pointers (pair: ptr, ptr+len), modeling the region in
 between.

True, but the actual implementation isn't the issue. The concept of D's arrays was to wrap pointers in order to make them safer. So, /conceptually/ an array is a thin wrapper over a pointer. Concept defines semantics, rather than implementation, IMO :)
 2. When you have a thin wrapper you should treat operations on the
 wrapper as the wrapped object in the general case.

Continuation of the above stretch. To be a true wrapper it has to support only the same or subset of
 operations. For instance arrays have slicing operation hence it's more
 than that.

To my mind wrappers aren't limited in such a way and can add functionality, the key to me is that they continue to expose the underlying object, which D's arrays do (in various ways). R -- Using Opera's revolutionary email client: http://www.opera.com/mail/
Jul 05 2013
prev sibling next sibling parent "Regan Heath" <reganmheath gmail.com> writes:
On Friday, 5 July 2013 at 16:22:12 UTC, Andrei Alexandrescu wrote:
 On 7/5/13 2:05 AM, Regan Heath wrote:
 Why? Because if it were null it would have a different 
 meaning. It would
 mean that the comment field was not present on the page at 
 all, and
 should not be altered.

I find the example tenuous. Even assuming it has merit, it does not explain the need for a syntactic _literal_ to fulfill that need.

TBH my main concern is the stability of the null/empty states of arrays, not the literal. But, I think the current means by which you can create empty arrays, without slicing them from an existing array or variable, is hackish at best.

Imagine we have a function which accepts an optional array. Passing null means not specified/do nothing, passing non-null means assign/use the value. If we have an existing array or even a local variable of the right type we can slice off it to pass empty, but otherwise we're screwed .. right?

i.e. How can you call 'foo' below from 'bar' passing an empty array for a, b, or c in a generic manner?

import std.stdio;

void foo(T,U,V)(T[] a, U[] b, V[] c)
{
     writefln("a !is null = %s", (a !is null));
     writefln("a.length is = %s", a.length);
     writefln("b !is null = %s", (b !is null));
     writefln("b.length is = %s", b.length);
     writefln("c !is null = %s", (c !is null));
     writefln("c.length is = %s", c.length);
}

void bar(T,U,V)()
{
     V[] c1 = null;
     V[] c2 = new V[0];

     foo!(T,U,V)([],    // null, length = 0
                 null,  // null, length = 0
                 c1);   // null, length = 0
     writefln("");

     U i;
     foo!(T,U,V)(cast(T[])"",          // !null, length = 0
                 cast(U[])(&i)[0..0],  // !null, length = 0
                 c2);                  // *null* length = 0
}

void main()
{
     bar!(dchar,int,bool)();
}

The dchar[] works because I cheated/knew the type, the int[] requires a local variable to slice off, the bool[] is a local array which in the 2nd case is still null! Is there a better way to do it?
 There are many such examples.

I am not convinced there are many such examples. I'd call it poor design to make it a cornerstone to distinguish between empty arrays that are null and empty arrays that are not only non-null, but aren't part of any other array!

To be clear I want to be able to distinguish between not-specified and specified. In the specified case I can tell empty/not empty by using length. This may seem like splitting hairs but from my point of view there is no such thing as "empty arrays that are null". If the array/slice/parameter "is null" then it is not-specified, or non-existent, not empty. For example if you start with a set of 2 items, take 2 items away and you have an empty set, take the set itself away and you have no set at all (non-existent/not-specified).
 Let me emphasize the last part. Ironically, there is a 
 potential interesting use of empty non-null arrays as anchors: 
 an empty slice referring to the interior of an array may be 
 combined with another slice, pointer, or length, to create a 
 new, meaningful slice. However, the discussed literal does 
 nothing of that kind - it just fabricates a slice of an array 
 that does not exist.

Thus allowing us to pass a set which exists but contains no items.
 All of them can be worked around by
 various means but these are all more complex and require 
 additional
 containers or variables to represent state.

I would say those are better designs.

How exactly? They're all more complex and/or indirect. Having a single variable which can reflect the intent directly and without relying on external state is surely "better".
 null - does not exist, was not specified.
 empty - exists and was intentionally set to be empty.

Such empty arrays occur as natural reductions from other arrays. Why yet another literal for such?

Yet another literal? What other literal are we talking about? What if you don't have another array or variable to slice off?
 I think arrays will be most useful if we can treat them like 
 safe
 reference types - thin wrappers around the unsafe ptr 
 reference type. To
 do that, we need null/empty to be stable/reliable states.

 If not, then array becomes like 'int' and we have to invent a 
 special
 value to represent the null case (or use other 
 containers/variables to
 represent null) like we do for int.

I am not convinced.

Fair enough. It just seems .. annoying to me that I cannot do with a char[] array what I can do with a char* pointer (represent non-specified, empty, not empty). That, plus the fact that the D array/slice behaviour is not reliable, or consistent, makes it seem like a rough edge.
Jul 05 2013
prev sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
deadalnix:

 [] and [:] aren't even remotely close to be strongly typed.

It's a matter of degrees of how much "strongly" typed things are. The current situation is significantly more weakly typed than what I have proposed. It doesn't even tell apart a light reference from a fat pointer. Bye, bearophile
Jul 08 2013