digitalmars.D - Empty VS null array?

reply "ProgrammingGhost" <dsioafiseghvfawklncfskzdcf sdifjsdiovgfdisjcisj.com> writes:
How do I find out if null was passed in? As you can guess I 
wasn't happy with the current behavior.

Code:

	import std.stdio;

	void main() {

		fn([1,2]);
		fn(null);
		fn([]);
	}
	void fn(int[] v) {
		writeln("-");
		if(v==null)
			writeln("Use default");
		foreach(e; v)
			writeln(e);
	}

Output

	-
	1
	2
	-
	Use default
	-
	Use default
Oct 17 2013
next sibling parent reply "ProgrammingGhost" <dsioafiseghvfawklncfskzdcf sdifjsdiovgfdisjcisj.com> writes:
Sorry I misspoke. I meant to say empty array or not null passed 
in. The 3rd call to fn is what I didn't like.
Oct 17 2013
parent reply "anonymous" <anonymous example.com> writes:
null implicitly converts to []. You can't distinguish them in fn. You could add an overload for typeof(null), but that only catches the literal null, probably not what you'd expect:

	import std.stdio;

	void fn(typeof(null) v) {
		writeln("-");
		writeln("Use default");
	}
	void fn(int[] v) {
		writeln("-");
		foreach(e; v)
			writeln(e);
	}
	void main() {
		fn([1,2]);
		fn(null);
		fn([]);
		int[] x = null;
		fn(x);
	}

----

	-
	1
	2
	-
	Use default
	-
	-
Oct 17 2013
parent "ProgrammingGhost" <dsioafiseghvfawklncfskzdcf sdifjsdiovgfdisjcisj.com> writes:
Overloads are acceptable. But that behavior is odd, although I do understand it's being passed by value. I guess I have to suck it up and hope this behavior doesn't give me problems.
Oct 17 2013
prev sibling next sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Thursday, 17 October 2013 at 22:50:22 UTC, ProgrammingGhost 
wrote:
 How do I find out if null was passed in?
try if(v is null) { use default }

if all you care about is if there's contents, I like to use if(v.length) {}
Oct 17 2013
next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Thursday, 17 October 2013 at 23:00:12 UTC, Adam D. Ruppe wrote:
 On Thursday, 17 October 2013 at 22:50:22 UTC, ProgrammingGhost 
 wrote:
 How do I find out if null was passed in?
try if(v is null) { use default } if all you care about is if there's contents, I like to use if(v.length) {}
Which is ultimately wrong as equality shouldn't test for identity.
Oct 17 2013
prev sibling parent reply "ProgrammingGhost" <dsioafiseghvfawklncfskzdcf sdifjsdiovgfdisjcisj.com> writes:
On Thursday, 17 October 2013 at 23:00:12 UTC, Adam D. Ruppe wrote:
 On Thursday, 17 October 2013 at 22:50:22 UTC, ProgrammingGhost 
 wrote:
 How do I find out if null was passed in?
try if(v is null) { use default } if all you care about is if there's contents, I like to use if(v.length) {}
is null still treats [] as null. I tried && !is [] for fun and it didn't work either (null is []).
Oct 17 2013
parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Thursday, 17 October 2013 at 23:12:03 UTC, ProgrammingGhost 
wrote:
 is null still treats [] as null.
blah, you're right. It will at least distinguish it from an empty slice though (like arr[$..$]). I don't think there's any way to tell [] from null except typeof(null) at all. At runtime they're both the same: no contents, so null pointer and zero length.
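A small sketch of the behaviour described above, for anyone who wants to try it. The helper name `describe` and the variable names are made up for illustration. The identity test (`is null`) looks at the pointer, while `.length` and `==` look only at the contents, so `null` and a `[]` literal come out the same and an empty slice of a real array does not.

```d
import std.stdio;

// Hypothetical helper: reports how a slice looks to each kind of test.
string describe(int[] v) {
    if (v is null)     return "null pointer, length 0";
    if (v.length == 0) return "non-null pointer, length 0";
    return "has contents";
}

void main() {
    int[] arr = [1, 2, 3];

    writeln(describe(null));        // null pointer, length 0
    writeln(describe([]));          // null pointer, length 0 (same as null)
    writeln(describe(arr[$ .. $])); // non-null pointer, length 0
    writeln(describe(arr));         // has contents

    // Equality compares contents, so any empty slice compares equal to null.
    writeln(arr[$ .. $] == null);   // true
}
```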
Oct 17 2013
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Oct 18, 2013 at 01:27:33AM +0200, Adam D. Ruppe wrote:
 On Thursday, 17 October 2013 at 23:12:03 UTC, ProgrammingGhost
 wrote:
is null still treats [] as null.
blah, you're right. It will at least distinguish it from an empty slice though (like arr[$..$]). I don't think there's any way to tell [] from null except typeof(null) at all. At runtime they're both the same: no contents, so null pointer and zero length.
I think it's a mistake to rely on the distinction between null and non-null but empty arrays in D. They should be regarded as implementation details that user code shouldn't depend on. If you need to distinguish between arrays that are empty and arrays that are null, consider using Nullable!(T[]) instead. T -- Curiosity kills the cat. Moral: don't be the cat.
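One possible shape of the Nullable!(T[]) suggestion, as a rough sketch rather than a recommendation. The alias `MaybeInts` and this variant of `fn` are invented for the example; `Nullable`, `isNull` and `get` come from std.typecons. The "was anything passed" state lives in the wrapper, so an empty array stays an ordinary value.

```d
import std.conv : to;
import std.stdio;
import std.typecons : Nullable;

// Hypothetical alias, used only in this example.
alias MaybeInts = Nullable!(int[]);

// Hypothetical variant of fn from the original post.
string fn(MaybeInts v) {
    if (v.isNull)
        return "Use default";   // nothing was passed at all
    return v.get.to!string;     // a real array, possibly empty
}

void main() {
    int[] empty = [];
    writeln(fn(MaybeInts([1, 2]))); // [1, 2]
    writeln(fn(MaybeInts.init));    // Use default
    writeln(fn(MaybeInts(empty)));  // []
}
```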
Oct 17 2013
parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Fri, 18 Oct 2013 00:32:46 +0100, H. S. Teoh <hsteoh quickfur.ath.cx>  
wrote:

 On Fri, Oct 18, 2013 at 01:27:33AM +0200, Adam D. Ruppe wrote:
 On Thursday, 17 October 2013 at 23:12:03 UTC, ProgrammingGhost
 wrote:
is null still treats [] as null.
blah, you're right. It will at least distinguish it from an empty slice though (like arr[$..$]). I don't think there's any way to tell [] from null except typeof(null) at all. At runtime they're both the same: no contents, so null pointer and zero length.
I think it's a mistake to rely on the distinction between null and non-null but empty arrays in D. They should be regarded as implementation details that user code shouldn't depend on. If you need to distinguish between arrays that are empty and arrays that are null, consider using Nullable!(T[]) instead.
This comes up time and again. The use of, and ability to distinguish empty from null is very useful. Yes, you run the risk of things like null pointer exceptions etc, but we have that risk now without the reward of being able to distinguish these cases.

Take this simple design:

    string readline();

This function would like to be able to:
  - return null for EOF
  - return [] for a blank line

but it cannot, because as soon as you write:

    foo(readline())

the null/[] case merges.

There are plenty of other such design/cases that can be imagined, and while you can work around them all they add complexity for zero gain.

A simple pointer can do this.. string cannot, this is sad.

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
Oct 18 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/18/13 3:44 AM, Regan Heath wrote:
 On Fri, 18 Oct 2013 00:32:46 +0100, H. S. Teoh <hsteoh quickfur.ath.cx>
 wrote:

 On Fri, Oct 18, 2013 at 01:27:33AM +0200, Adam D. Ruppe wrote:
 On Thursday, 17 October 2013 at 23:12:03 UTC, ProgrammingGhost
 wrote:
is null still treats [] as null.
blah, you're right. It will at least distinguish it from an empty slice though (like arr[$..$]). I don't think there's any way to tell [] from null except typeof(null) at all. At runtime they're both the same: no contents, so null pointer and zero length.
I think it's a mistake to rely on the distinction between null and non-null but empty arrays in D. They should be regarded as implementation details that user code shouldn't depend on. If you need to distinguish between arrays that are empty and arrays that are null, consider using Nullable!(T[]) instead.
This comes up time and again. The use of, and ability to distinguish empty from null is very useful.
I disagree.
 Yes, you run the risk of things like
 null pointer exceptions etc, but we have that risk now without the
 reward of being able to distinguish these cases.

 Take this simple design:

    string readline();

 This function would like to be able to:
   - return null for EOF
   - return [] for a blank line
That's bad API design, pure and simple. The function should e.g. return the string including the line terminator, and only return an empty (or null) string upon EOF. Andrei
Oct 18 2013
next sibling parent reply "Max Samukha" <maxsamukha gmail.com> writes:
On Friday, 18 October 2013 at 15:42:56 UTC, Andrei Alexandrescu 
wrote:
 On 10/18/13 3:44 AM, Regan Heath wrote:
 On Fri, 18 Oct 2013 00:32:46 +0100, H. S. Teoh 
 <hsteoh quickfur.ath.cx>
 wrote:

 On Fri, Oct 18, 2013 at 01:27:33AM +0200, Adam D. Ruppe wrote:
 On Thursday, 17 October 2013 at 23:12:03 UTC, 
 ProgrammingGhost
 wrote:
is null still treats [] as null.
blah, you're right. It will at least distinguish it from an empty slice though (like arr[$..$]). I don't think there's any way to tell [] from null except typeof(null) at all. At runtime they're both the same: no contents, so null pointer and zero length.
I think it's a mistake to rely on the distinction between null and non-null but empty arrays in D. They should be regarded as implementation details that user code shouldn't depend on. If you need to distinguish between arrays that are empty and arrays that are null, consider using Nullable!(T[]) instead.
This comes up time and again. The use of, and ability to distinguish empty from null is very useful.
I disagree.
 Yes, you run the risk of things like
 null pointer exceptions etc, but we have that risk now without 
 the
 reward of being able to distinguish these cases.

 Take this simple design:

   string readline();

 This function would like to be able to:
  - return null for EOF
  - return [] for a blank line
That's bad API design, pure and simple. The function should e.g. return the string including the line terminator, and only return an empty (or null) string upon EOF. Andrei
*That's* bad API design. readln should be symmetrical to writeln, not write. And about preserving the exact representation of new lines, readln/writeln shouldn't preserve that, pure and simple.
Oct 18 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/18/13 9:26 AM, Max Samukha wrote:
 *That's* bad API design. readln should be symmetrical to writeln, not
 write. And about preserving the exact representation of new lines,
 readln/writeln shouldn't preserve that, pure and simple.
Fair point. I just gave one possible alternative out of many. Thing is, relying on client code to distinguish subtleties between empty and null strings is fraught with dangers. Andrei
Oct 18 2013
next sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, October 18, 2013 09:55:46 Andrei Alexandrescu wrote:
 On 10/18/13 9:26 AM, Max Samukha wrote:
 *That's* bad API design. readln should be symmetrical to writeln, not
 write. And about preserving the exact representation of new lines,
 readln/writeln shouldn't preserve that, pure and simple.
Fair point. I just gave one possible alternative out of many. Thing is, relying on client code to distinguish subtleties between empty and null strings is fraught with dangers.
Yeah, but the primary reason that it's bad design is the fact that D tries to conflate null and empty instead of keeping them distinct (which is essentially the complaint that was made). Whether that's ultimately good or bad is up for debate, but the side effect is that relying on the difference between null and empty ends up being very bug-prone, whereas in other languages which don't conflate the two, it isn't problematic in the same way, and it's much more reasonable to have the API treat them differently. - Jonathan M Davis
Oct 18 2013
parent reply Shammah Chancellor <s s.com> writes:
On 2013-10-18 17:32:58 +0000, Jonathan M Davis said:

 On Friday, October 18, 2013 09:55:46 Andrei Alexandrescu wrote:
 On 10/18/13 9:26 AM, Max Samukha wrote:
 *That's* bad API design. readln should be symmetrical to writeln, not
 write. And about preserving the exact representation of new lines,
 readln/writeln shouldn't preserve that, pure and simple.
Fair point. I just gave one possible alternative out of many. Thing is, relying on client code to distinguish subtleties between empty and null strings is fraught with dangers.
Yeah, but the primary reason that it's bad design is the fact that D tries to conflate null and empty instead of keeping them distinct (which is essentially the complaint that was made). Whether that's ultimately good or bad is up for debate, but the side effect is that relying on the difference between null and empty ends up being very bug-prone, whereas in other languages which don't conflate the two, it isn't problematic in the same way, and it's much more reasonable to have the API treat them differently. - Jonathan M Davis
Null and the Empty Set are different entities. A set containing exactly nothing, vs undefined. However, null is not handled properly in D or any other systems language since it's simply a pointer with value = 0. if (null == 0) is a true statement in C, C++, and D, but is not in fact true. Null is neither equal to zero, nor not equal to zero.
Oct 25 2013
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 10/25/2013 11:02 PM, Shammah Chancellor wrote:
 ... null == 0 ... in C, C++, and D,
Check again.
Oct 26 2013
prev sibling next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Oct 18, 2013 at 01:32:58PM -0400, Jonathan M Davis wrote:
 On Friday, October 18, 2013 09:55:46 Andrei Alexandrescu wrote:
 On 10/18/13 9:26 AM, Max Samukha wrote:
 *That's* bad API design. readln should be symmetrical to writeln,
 not write. And about preserving the exact representation of new
 lines, readln/writeln shouldn't preserve that, pure and simple.
Fair point. I just gave one possible alternative out of many. Thing is, relying on client code to distinguish subtleties between empty and null strings is fraught with dangers.
Yeah, but the primary reason that it's bad design is the fact that D tries to conflate null and empty instead of keeping them distinct (which is essentially the complaint that was made). Whether that's ultimately good or bad is up for debate, but the side effect is that relying on the difference between null and empty ends up being very bug-prone, whereas in other languages which don't conflate the two, it isn't problematic in the same way, and it's much more reasonable to have the API treat them differently.
[...] IMO, distinguishing between null and empty arrays is bad abstraction. I agree with D's "conflation" of null with empty, actually. Conceptually speaking, an array is a sequence of values of non-negative length. An array with non-zero length contains at least one element, and is therefore non-empty, whereas an array with zero length is empty. Same thing goes with a slice. A slice is a view into zero or more array elements. A slice with zero length is empty, and a slice with non-zero length contains at least one element. There's nowhere in this conceptual scheme for such a thing as a "null array" that's distinct from an empty array. This distinction only crops up in implementation, and IMO leads to code smells because code should be operating based on the conceptual behaviour of arrays rather than on the implementation details. T -- The most powerful one-line C program: #include "/dev/tty" -- IOCCC
Oct 18 2013
parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Fri, 18 Oct 2013 18:38:12 +0100, H. S. Teoh <hsteoh quickfur.ath.cx>  
wrote:

 On Fri, Oct 18, 2013 at 01:32:58PM -0400, Jonathan M Davis wrote:
 On Friday, October 18, 2013 09:55:46 Andrei Alexandrescu wrote:
 On 10/18/13 9:26 AM, Max Samukha wrote:
 *That's* bad API design. readln should be symmetrical to writeln,
 not write. And about preserving the exact representation of new
 lines, readln/writeln shouldn't preserve that, pure and simple.
Fair point. I just gave one possible alternative out of many. Thing is, relying on client code to distinguish subtleties between empty and null strings is fraught with dangers.
Yeah, but the primary reason that it's bad design is the fact that D tries to conflate null and empty instead of keeping them distinct (which is essentially the complaint that was made). Whether that's ultimately good or bad is up for debate, but the side effect is that relying on the difference between null and empty ends up being very bug-prone, whereas in other languages which don't conflate the two, it isn't problematic in the same way, and it's much more reasonable to have the API treat them differently.
[...] Conceptually speaking, an array is a sequence of values of non-negative length. An array with non-zero length contains at least one element, and is therefore non-empty, whereas an array with zero length is empty. Same thing goes with a slice. A slice is a view into zero or more array elements. A slice with zero length is empty, and a slice with non-zero length contains at least one element.
This describes the empty/not empty distinction.
 There's nowhere in this conceptual
 scheme for such a thing as a "null array" that's distinct from an empty
 array.
And this is the problem/complaint. You cannot represent specified/not specified, you can only represent empty/not empty. I agree you cannot logically have an existing array that is somehow a "null array" and distinct/different from an empty array, but that's not what I want/am asking for. I want to use an array 'reference' to represent that the array is non existent, has not been set, has not been defined, etc. This is what null is for.
 This distinction only crops up in implementation, and IMO leads
 to code smells because code should be operating based on the conceptual
 behaviour of arrays rather than on the implementation details.
It is not an implementation detail, it's a conceptual difference. A reference type has the power to represent specified/not specified in addition to referring to an array which is empty/not empty. A value type, like int, cannot do the same thing without either boxing (into a reference type, whose reference can be null) or by giving up one of its values (i.e. 0) and pretending it's something special.

This is what D's string has done with empty, it is pretending that it is special and means "not specified", and because it converts null into empty, that means we cannot rely on empty really being empty (as in the user wants the value set to empty), as it might also be a value the user did not specify.

It's actually a fairly simple distinction I want to be able to make. If you get input from a user a field called "foo" may be:
  - not specified
  - specified

and if specified, may be:
  - empty
  - not empty

null allows us the specified/not specified distinction.

Regan

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
Oct 21 2013
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Oct 21, 2013 at 11:53:44AM +0100, Regan Heath wrote:
 On Fri, 18 Oct 2013 18:38:12 +0100, H. S. Teoh
 <hsteoh quickfur.ath.cx> wrote:
[...]
Conceptually speaking, an array is a sequence of values of
non-negative length. An array with non-zero length contains at least
one element, and is therefore non-empty, whereas an array with zero
length is empty. Same thing goes with a slice. A slice is a view into
zero or more array elements. A slice with zero length is empty, and a
slice with non-zero length contains at least one element.
This describes the empty/not empty distinction.
There's nowhere in this conceptual scheme for such a thing as a "null
array" that's distinct from an empty array.
And this is the problem/complaint. You cannot represent specified/not specified, you can only represent empty/not empty. I agree you cannot logically have an existing array that is somehow a "null array" and distinct/different from an empty array, but that's not what I want/am asking for. I want to use an array 'reference' to represent that the array is non existent, has not been set, has not been defined, etc. This is what null is for.
The thing is, D slices are value types even though the elements they point to are pointed to by reference. If you treat slices (slices themselves, that is, not the elements they refer to) as value types, then the problem goes away. If you want to have a *reference* to a slice, then you simply write T[]* and then it becomes nullable as expected. I do agree that the current situation is confusing, though, mainly because you can write `if (arr is null)`, which then makes you think of it as a reference type. I think that should be prohibited, and slices should be treated as pure value types, and all comparisons should be checked with .length (or .empty if you import std.range). T -- Кто везде - тот нигде.
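A minimal sketch of the T[]* idea from this post, assuming the caller keeps the slice in a variable so its address can be taken. This variant of `fn` and the variable names are invented for the example. A null pointer means "no array given", while a pointer to an empty slice is still a real (empty) array.

```d
import std.conv : to;
import std.stdio;

// Hypothetical variant of fn that takes a pointer to a slice.
string fn(int[]* v) {
    if (v is null)
        return "Use default"; // no slice was supplied
    return to!string(*v);     // a slice was supplied, possibly empty
}

void main() {
    int[] some = [1, 2];
    int[] none = [];

    writeln(fn(&some)); // [1, 2]
    writeln(fn(null));  // Use default
    writeln(fn(&none)); // []
}
```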
Oct 21 2013
parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Mon, 21 Oct 2013 15:01:04 +0100, H. S. Teoh <hsteoh quickfur.ath.cx>  
wrote:

 On Mon, Oct 21, 2013 at 11:53:44AM +0100, Regan Heath wrote:
 On Fri, 18 Oct 2013 18:38:12 +0100, H. S. Teoh
 <hsteoh quickfur.ath.cx> wrote:
[...]
Conceptually speaking, an array is a sequence of values of
non-negative length. An array with non-zero length contains at least
one element, and is therefore non-empty, whereas an array with zero
length is empty. Same thing goes with a slice. A slice is a view into
zero or more array elements. A slice with zero length is empty, and a
slice with non-zero length contains at least one element.
This describes the empty/not empty distinction.
There's nowhere in this conceptual scheme for such a thing as a "null
array" that's distinct from an empty array.
And this is the problem/complaint. You cannot represent specified/not specified, you can only represent empty/not empty. I agree you cannot logically have an existing array that is somehow a "null array" and distinct/different from an empty array, but that's not what I want/am asking for. I want to use an array 'reference' to represent that the array is non existent, has not been set, has not been defined, etc. This is what null is for.
The thing is, D slices are value types even though the elements they point to are pointed to by reference. If you treat slices (slices themselves, that is, not the elements they refer to) as value types, then the problem goes away. If you want to have a *reference* to a slice, then you simply write T[]* and then it becomes nullable as expected.
True, and that's a pointer, and I am comfortable using pointers.. however I worry this will limit the compiler's ability to optimise somehow.. and doesn't it make the code immediately un-"@safe"?
 I do agree that the current situation is confusing, though, mainly
 because you can write `if (arr is null)`, which then makes you think of
 it as a reference type. I think that should be prohibited, and slices
 should be treated as pure value types, and all comparisons should be
 checked with .length (or .empty if you import std.range).
IMO, this would be preferable to the current situation even though I would rather go the other way and have a reference type. I can see the argument that it would be safer and easier for most users, even though I do not believe I am in that category. R -- Using Opera's revolutionary email client: http://www.opera.com/mail/
Oct 21 2013
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Oct 21, 2013 at 04:41:23PM +0100, Regan Heath wrote:
 On Mon, 21 Oct 2013 15:01:04 +0100, H. S. Teoh
 <hsteoh quickfur.ath.cx> wrote:
 
On Mon, Oct 21, 2013 at 11:53:44AM +0100, Regan Heath wrote:
[...]
I agree you cannot logically have an existing array that is somehow
a "null array" and distinct/different from an empty array, but
that's not what I want/am asking for.  I want to use an array
'reference' to represent that the array is non existent, has not
been set, has not been defined, etc.  This is what null is for.
The thing is, D slices are value types even though the elements they point to are pointed to by reference. If you treat slices (slices themselves, that is, not the elements they refer to) as value types, then the problem goes away. If you want to have a *reference* to a slice, then you simply write T[]* and then it becomes nullable as expected.
True, and that's a pointer, and I am comfortable using pointers.. however I worry this will limit the compiler's ability to optimise somehow.. and doesn't it make the code immediately un-"@safe"?
No, pointers are allowed in @safe code. What is not allowed is pointer *arithmetic* and casting pointers into pointers of different types.
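To illustrate the point about @safe, here is a short sketch. The function name `lengthOrZero` is made up for the example. The null check and the dereference are accepted inside a @safe function; per the post above, arithmetic on the pointer is what would be refused.

```d
import std.stdio;

// Hypothetical helper: length of the slice behind the pointer, or 0 if none.
@safe size_t lengthOrZero(int[]* v) {
    // Comparing against null and dereferencing are both fine in @safe code.
    // Something like `v + 1` (pointer arithmetic) would not be accepted here.
    return v is null ? 0 : (*v).length;
}

void main() {
    int[] a = [1, 2, 3];
    writeln(lengthOrZero(&a));   // 3
    writeln(lengthOrZero(null)); // 0
}
```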
I do agree that the current situation is confusing, though, mainly
because you can write `if (arr is null)`, which then makes you think
of it as a reference type. I think that should be prohibited, and
slices should be treated as pure value types, and all comparisons
should be checked with .length (or .empty if you import std.range).
IMO, this would be preferable to the current situation even though I would rather go the other way and have a reference type. I can see the argument that it would be safer and easier for most users, even though I do not believe I am in that category.
[...] Well, either way would work, though I do prefer treating slices as value types. It's just cleaner conceptually, IMO. But I suppose this is one of those things in which reasonable people may disagree. T -- Sometimes the best solution to morale problems is just to fire all of the unhappy people. -- despair.com
Oct 21 2013
parent "Regan Heath" <regan netmail.co.nz> writes:
On Mon, 21 Oct 2013 17:34:51 +0100, H. S. Teoh <hsteoh quickfur.ath.cx>  
wrote:

 On Mon, Oct 21, 2013 at 04:41:23PM +0100, Regan Heath wrote:
 On Mon, 21 Oct 2013 15:01:04 +0100, H. S. Teoh
 <hsteoh quickfur.ath.cx> wrote:

On Mon, Oct 21, 2013 at 11:53:44AM +0100, Regan Heath wrote:
[...]
I agree you cannot logically have an existing array that is somehow
a "null array" and distinct/different from an empty array, but
that's not what I want/am asking for.  I want to use an array
'reference' to represent that the array is non existent, has not
been set, has not been defined, etc.  This is what null is for.
The thing is, D slices are value types even though the elements they point to are pointed to by reference. If you treat slices (slices themselves, that is, not the elements they refer to) as value types, then the problem goes away. If you want to have a *reference* to a slice, then you simply write T[]* and then it becomes nullable as expected.
True, and that's a pointer, and I am comfortable using pointers.. however I worry this will limit the compiler's ability to optimise somehow.. and doesn't it make the code immediately un-"@safe"?
No, pointers are allowed in @safe code. What is not allowed is pointer *arithmetic* and casting pointers into pointers of different types.
Ah, thanks.
I do agree that the current situation is confusing, though, mainly
because you can write `if (arr is null)`, which then makes you think
of it as a reference type. I think that should be prohibited, and
slices should be treated as pure value types, and all comparisons
should be checked with .length (or .empty if you import std.range).
IMO, this would be preferable to the current situation even though I would rather go the other way and have a reference type. I can see the argument that it would be safer and easier for most users, even though I do not believe I am in that category.
[...] Well, either way would work, though I do prefer treating slices as value types. It's just cleaner conceptually, IMO. But I suppose this is one of those things in which reasonable people may disagree.
I agree that conceptually if you slice something, you cannot get a 'null' reference. So, a null state for slices makes no sense. However, most people see arrays as slices, slices as arrays - do you? If so, for arrays the same conceptual argument does not apply. If not, how do we tell we have a slice, or an array? If we can't tell, then we have to check for null with both anyway.. R -- Using Opera's revolutionary email client: http://www.opera.com/mail/
Oct 22 2013
prev sibling next sibling parent reply "Max Samukha" <maxsamukha gmail.com> writes:
On Friday, 18 October 2013 at 16:55:19 UTC, Andrei Alexandrescu 
wrote:
 Fair point. I just gave one possible alternative out of many. 
 Thing is, relying on client code to distinguish subtleties 
 between empty and null strings is fraught with dangers.

 Andrei
I agree. Thinking about your variant of readln - it's ok to use [] as the value indicating EOF, since it is not included in the value set of type "line" as you define it. But generally, neither cast(T[])[] nor cast(T[])null should be used like that, because both of them are in the set of T[]'s values, i.e. a generic stream returning [] to signify its end would be a bad idea - that should be either a side effect or a value outside T[]'s set. Hm, I've just said nothing with many words. Never mind.
Oct 18 2013
parent reply "Kagamin" <spam here.lot> writes:
On Friday, 18 October 2013 at 17:59:17 UTC, Max Samukha wrote:
 On Friday, 18 October 2013 at 16:55:19 UTC, Andrei Alexandrescu 
 wrote:
 Fair point. I just gave one possible alternative out of many. 
 Thing is, relying on client code to distinguish subtleties 
 between empty and null strings is fraught with dangers.

 Andrei
I agree. Thinking about your variant of readln - it's ok to use [] as the value indicating EOF, since it is not included in the value set of type "line" as you define it.
No, if the last line is empty, it has no new line character(s) at the end, and is as empty, as it can get.
Oct 19 2013
parent "Max Samukha" <maxsamukha gmail.com> writes:
On Saturday, 19 October 2013 at 12:04:43 UTC, Kagamin wrote:
 On Friday, 18 October 2013 at 17:59:17 UTC, Max Samukha wrote:
 On Friday, 18 October 2013 at 16:55:19 UTC, Andrei 
 Alexandrescu wrote:
 Fair point. I just gave one possible alternative out of many. 
 Thing is, relying on client code to distinguish subtleties 
 between empty and null strings is fraught with dangers.

 Andrei
I agree. Thinking about your variant of readln - it's ok to use [] as the value indicating EOF, since it is not included in the value set of type "line" as you define it.
No, if the last line is empty, it has no new line character(s) at the end, and is as empty, as it can get.
Right. Then readln is broken.
Oct 19 2013
prev sibling parent "Regan Heath" <regan netmail.co.nz> writes:
On Fri, 18 Oct 2013 17:55:46 +0100, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 10/18/13 9:26 AM, Max Samukha wrote:
 *That's* bad API design. readln should be symmetrical to writeln, not
 write. And about preserving the exact representation of new lines,
 readln/writeln shouldn't preserve that, pure and simple.
Fair point. I just gave one possible alternative out of many. Thing is, relying on client code to distinguish subtleties between empty and null strings is fraught with dangers.
My code does not need to distinguish between empty and null. null is checked for, and empty is just a normal value for a string.

The "problem" you're referring to is /caused/ by conflating null and empty, by making empty strings "special" in the same way someone might make 0 a special value for an int (meaning not specified - for example).

If you stop using empty string as a special case of null, then empty does not need special handling - it's just a normal string value handled like any other - you can read it, write it, append it, etc etc etc. null is the /only/ case which needs special handling - just like any other reference type.

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
Oct 21 2013
prev sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Oct 18, 2013 at 06:26:05PM +0200, Max Samukha wrote:
 On Friday, 18 October 2013 at 15:42:56 UTC, Andrei Alexandrescu
 wrote:
On 10/18/13 3:44 AM, Regan Heath wrote:
[...]
Take this simple design:

  string readline();

This function would like to be able to:
 - return null for EOF
 - return [] for a blank line
That's bad API design, pure and simple. The function should e.g. return the string including the line terminator, and only return an empty (or null) string upon EOF. Andrei
*That's* bad API design. readln should be symmetrical to writeln, not write. And about preserving the exact representation of new lines, readln/writeln shouldn't preserve that, pure and simple.
I agree. A better solution is to provide an eof() method (or better, .empty) that tells you when readln() will succeed, and readln() should throw upon EOF.

The problem is analogous to reading from an input range: you always check whether the range is .empty before you call .front, since when the range is empty .front has no meaningful value to return. Relying on some kind of special sentinel value to represent the absence of a value is a code smell.

T

-- 
The slower you go, the farther you'll get.
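The range analogy can be sketched in a few lines. This is only an illustration: it uses std.string.lineSplitter over an in-memory string as a stand-in for a real file, which is an assumption made to keep the example self-contained.

```d
import std.stdio : writeln;
import std.string : lineSplitter;

void main()
{
    // In-memory stand-in for a file; the middle line is blank.
    auto lines = "first\n\nthird".lineSplitter;

    // Input-range protocol: test .empty before touching .front.
    while (!lines.empty)
    {
        string line = lines.front; // a blank line is simply ""
        writeln("[", line, "]");
        lines.popFront();
    }
}
```

A blank line shows up as an ordinary empty string, and the end of input is reported by .empty rather than by any special value of the line itself.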
Oct 18 2013
prev sibling next sibling parent reply "Dicebot" <public dicebot.lv> writes:
On Friday, 18 October 2013 at 15:42:56 UTC, Andrei Alexandrescu 
wrote:
 That's bad API design, pure and simple. The function should 
 e.g. return the string including the line terminator, and only 
 return an empty (or null) string upon EOF.
I'd say it should throw upon EOF as it is pretty high-level convenience function.
Oct 18 2013
parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Fri, 18 Oct 2013 17:36:28 +0100, Dicebot <public dicebot.lv> wrote:

 On Friday, 18 October 2013 at 15:42:56 UTC, Andrei Alexandrescu wrote:
 That's bad API design, pure and simple. The function should e.g. return  
 the string including the line terminator, and only return an empty (or  
 null) string upon EOF.
I'd say it should throw upon EOF as it is pretty high-level convenience function.
I disagree. Exceptions should never be used for flow control so the rule is to throw on exceptional occurrences ONLY not on something that will ALWAYS eventually happen.

R
Oct 21 2013
next sibling parent "Dicebot" <public dicebot.lv> writes:
On Monday, 21 October 2013 at 09:40:13 UTC, Regan Heath wrote:
 I disagree.  Exceptions should never be used for flow control 
 so the rule is to throw on exceptional occurrences ONLY not on 
 something that will ALWAYS eventually happen.
For such function it is exceptional situation. For precise reading different API is required anyway (==different function).
Oct 21 2013
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Oct 21, 2013 at 10:40:14AM +0100, Regan Heath wrote:
 On Fri, 18 Oct 2013 17:36:28 +0100, Dicebot <public dicebot.lv> wrote:
 
On Friday, 18 October 2013 at 15:42:56 UTC, Andrei Alexandrescu wrote:
That's bad API design, pure and simple. The function should e.g.
return the string including the line terminator, and only return
an empty (or null) string upon EOF.
I'd say it should throw upon EOF as it is pretty high-level convenience function.
I disagree. Exceptions should never be used for flow control so the rule is to throw on exceptional occurrences ONLY not on something that will ALWAYS eventually happen.
[...]

	while (!file.eof) {
		auto line = file.readln(); // never throws
		...
	}

T

-- 
There are two ways to write error-free programs; only the third one works.
Oct 21 2013
parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Mon, 21 Oct 2013 15:02:35 +0100, H. S. Teoh <hsteoh quickfur.ath.cx>  
wrote:

 On Mon, Oct 21, 2013 at 10:40:14AM +0100, Regan Heath wrote:
 On Fri, 18 Oct 2013 17:36:28 +0100, Dicebot <public dicebot.lv> wrote:

On Friday, 18 October 2013 at 15:42:56 UTC, Andrei Alexandrescu wrote:
That's bad API design, pure and simple. The function should e.g.
return the string including the line terminator, and only return
an empty (or null) string upon EOF.
I'd say it should throw upon EOF as it is pretty high-level convenience function.
I disagree. Exceptions should never be used for flow control so the rule is to throw on exceptional occurrences ONLY not on something that will ALWAYS eventually happen.
[...]
	while (!file.eof) {
		auto line = file.readln(); // never throws
		...
	}
For a file this is implementable (without a buffer) but not for a socket or similar source/stream where a read MUST be performed to detect EOF. So, if you're implementing a line reader over multiple sources, you would need to buffer. Not the end of the world, but definitely more complicated than just returning a null, no?

R
Oct 21 2013
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Oct 21, 2013 at 04:47:05PM +0100, Regan Heath wrote:
 On Mon, 21 Oct 2013 15:02:35 +0100, H. S. Teoh
 <hsteoh quickfur.ath.cx> wrote:
 
On Mon, Oct 21, 2013 at 10:40:14AM +0100, Regan Heath wrote:
On Fri, 18 Oct 2013 17:36:28 +0100, Dicebot <public dicebot.lv> wrote:

On Friday, 18 October 2013 at 15:42:56 UTC, Andrei Alexandrescu wrote:
That's bad API design, pure and simple. The function should e.g.
return the string including the line terminator, and only return
an empty (or null) string upon EOF.
I'd say it should throw upon EOF as it is pretty high-level convenience function.
I disagree. Exceptions should never be used for flow control so the rule is to throw on exceptional occurrences ONLY not on something that will ALWAYS eventually happen.
[...]
	while (!file.eof) {
		auto line = file.readln(); // never throws
		...
	}
For a file this is implementable (without a buffer) but not for a socket or similar source/stream where a read MUST be performed to detect EOF. So, if you're implementing a line reader over multiple sources, you would need to buffer. Not the end of the world, but definitely more complicated than just returning a null, no?
[...]

This is actually a very interesting issue to me, and one which I've thought about a lot in the past. There are two incompatible (albeit with much overlap) approaches here. One is the Unix approach where EOF is unknown until you try to read past the end of a file (socket, etc.), and the other is where EOF is known *before* you perform a read.

Personally, I prefer the second approach as being conceptually cleaner: an input stream should "know" when it doesn't have any more data, so that its EOF state can be queried at any time. Conceptually speaking one shouldn't need to (try to) read from it before realizing there's nothing left.

However, I understand that the Unix approach is easier to implement, in the sense that if you have a network socket, it may be the case that when you attempt to read from it, it is still connected, but before any further data is received, the remote end disconnects. In this case, the OS can't reasonably predict when there will be more incoming data, so you do have to read the socket before finding out that the remote end is going to disconnect without sending anything more.

In terms of API design, though, I still lean towards the approach where EOF is always query-able, because it leads to cleaner code. This can be implemented on Posix by having .eof read a single byte (or whatever unit is expected) and buffering it, and the subsequent readln() takes this buffering into account. This slight complication in implementation is worth achieving the nicer user-facing API, IMO.

T

-- 
I've been around long enough to have seen an endless parade of magic new techniques du jour, most of which purport to remove the necessity of thought about your programming problem. In the end they wind up contributing one or two pieces to the collective wisdom, and fade away in the rearview mirror. -- Walter Bright
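The read-ahead idea can be sketched roughly as follows. Everything here is hypothetical: LookaheadReader and its delegate-based source are illustrative names, not an existing Phobos API, and the sketch buffers a whole line rather than a single byte to keep it short.

```d
import std.stdio : writeln;

/// Hypothetical wrapper: turns a "read to find out" source into one whose
/// EOF state can be queried up front, by reading one line ahead.
struct LookaheadReader
{
    // Unix-style source: returns false once a read hits end of input.
    bool delegate(out string line) source;

    private string pending;
    private bool hasPending;
    private bool exhausted;

    /// Performs (and buffers) one read if nothing is pending yet.
    bool eof()
    {
        if (!hasPending && !exhausted)
        {
            if (source(pending))
                hasPending = true;
            else
                exhausted = true;
        }
        return !hasPending;
    }

    /// Hands out the buffered line; throws only if called at EOF.
    string readln()
    {
        if (eof)
            throw new Exception("readln called at end of input");
        hasPending = false;
        return pending;
    }
}

void main()
{
    string[] data = ["alpha", "", "omega"];
    size_t next;
    auto reader = LookaheadReader((out string line) {
        if (next == data.length)
            return false;
        line = data[next++];
        return true;
    });

    while (!reader.eof)
        writeln("[", reader.readln(), "]");
}
```

The cost of the nicer loop is the extra state (pending, hasPending, exhausted), and on a socket the call to eof may block until data or a disconnect arrives.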
Oct 21 2013
parent "Regan Heath" <regan netmail.co.nz> writes:
On Mon, 21 Oct 2013 17:49:43 +0100, H. S. Teoh <hsteoh quickfur.ath.cx>  
wrote:

 On Mon, Oct 21, 2013 at 04:47:05PM +0100, Regan Heath wrote:
 On Mon, 21 Oct 2013 15:02:35 +0100, H. S. Teoh
 <hsteoh quickfur.ath.cx> wrote:

On Mon, Oct 21, 2013 at 10:40:14AM +0100, Regan Heath wrote:
On Fri, 18 Oct 2013 17:36:28 +0100, Dicebot <public dicebot.lv> wrote:

On Friday, 18 October 2013 at 15:42:56 UTC, Andrei Alexandrescu  
wrote:
That's bad API design, pure and simple. The function should e.g.
return the string including the line terminator, and only return
an empty (or null) string upon EOF.
I'd say it should throw upon EOF as it is pretty high-level convenience function.
I disagree. Exceptions should never be used for flow control so the rule is to throw on exceptional occurrences ONLY not on something that will ALWAYS eventually happen.
[...]
	while (!file.eof) {
		auto line = file.readln(); // never throws
		...
	}
For a file this is implementable (without a buffer) but not for a socket or similar source/stream where a read MUST be performed to detect EOF. So, if you're implementing a line reader over multiple sources, you would need to buffer. Not the end of the world, but definitely more complicated than just returning a null, no?
[...] This is actually a very interesting issue to me, and one which I've thought about a lot in the past. There are two incompatible (albeit with much overlap) approaches here. One is the Unix approach where EOF is unknown until you try to read past the end of a file (socket, etc.), and the other is where EOF is known *before* you perform a read. Personally, I prefer the second approach as being conceptually cleaner: an input stream should "know" when it doesn't have any more data, so that its EOF state can be queried at any time. Conceptually speaking one shouldn't need to (try to) read from it before realizing there's nothing left. However, I understand that the Unix approach is easier to implement, in the sense that if you have a network socket, it may be the case that when you attempt to read from it, it is still connected, but before any further data is received, the remote end disconnects. In this case, the OS can't reasonably predict when there will be more incoming data, so you do have to read the socket before finding out that the remote end is going to disconnect without sending anything more. In terms of API design, though, I still lean towards the approach where EOF is always query-able, because it leads to cleaner code. This can be implemented on Posix by having .eof read a single byte (or whatever unit is expected) and buffering it, and the subsequent readln() takes this buffering into account. This slight complication in implementation is worth achieving the nicer user-facing API, IMO.
I don't agree the user-facing API is nicer. It is more complex both in concept and implementation.

API #1: 1 function, readline(), returns null on EOF. You call readline() and check the result for null. The check naturally follows the attempt to read, which is the task you are trying to accomplish. Simple, straightforward.

API #2: 2 functions, readline() throws on EOF, isEof() checks for EOF. Your purpose is to read lines, so you call readline(), and it is naturally easy to forget to call isEof(). Coding the example loop above requires you to think about EOF /before/ you read a line, which is not how people think. This API is therefore more complex and less intuitive, for no gain.

So, having a usable null state allows the simpler, more direct API. Lack of it requires a more complicated design and a more complicated implementation.

R
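For comparison, API #1 can be approximated in D today, as a rough sketch. The readline() function and the pendingLines array below are hypothetical stand-ins for a real source. The sketch relies on the empty string literal "" being non-null (unlike []), and on the caller testing with `is null` rather than `== null`.

```d
import std.stdio : writeln;

// Hypothetical in-memory source; the middle entry is a blank line.
string[] pendingLines = ["alpha", "", "omega"];

/// Hypothetical API #1: returns null at EOF, and a (possibly empty)
/// non-null string for every line read.
string readline()
{
    if (pendingLines.length == 0)
        return null;
    string line = pendingLines[0];
    pendingLines = pendingLines[1 .. $];
    return line;
}

void main()
{
    string line;
    // `is null` tests identity; `== null` would also match the blank
    // line and end the loop early.
    while ((line = readline()) !is null)
        writeln("[", line, "]");
}
```

The distinction survives only as long as nobody along the way compares with == or rebuilds the string in a manner that yields a null slice, which is the fragility the other side of this thread is pointing at.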
Oct 22 2013
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Friday, 18 October 2013 at 15:42:56 UTC, Andrei Alexandrescu 
wrote:
 This comes up time and again.  The use of, and ability to 
 distinguish
 empty from null is very useful.
I disagree.
That's what if does by default.
Oct 18 2013
prev sibling parent "Regan Heath" <regan netmail.co.nz> writes:
On Fri, 18 Oct 2013 16:43:23 +0100, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 10/18/13 3:44 AM, Regan Heath wrote:
 On Fri, 18 Oct 2013 00:32:46 +0100, H. S. Teoh <hsteoh quickfur.ath.cx>
 wrote:

 On Fri, Oct 18, 2013 at 01:27:33AM +0200, Adam D. Ruppe wrote:
 On Thursday, 17 October 2013 at 23:12:03 UTC, ProgrammingGhost
 wrote:
is null still treats [] as null.
blah, you're right. It will at least distinguish it from an empty slice though (like arr[$..$]). I don't think there's any way to tell [] from null except typeof(null) at all. At runtime they're both the same: no contents, so null pointer and zero length.
I think it's a mistake to rely on the distinction between null and non-null but empty arrays in D. They should be regarded as implementation details that user code shouldn't depend on. If you need to distinguish between arrays that are empty and arrays that are null, consider using Nullable!(T[]) instead.
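A minimal sketch of the Nullable!(T[]) suggestion, using std.typecons from Phobos. The describe() helper is a hypothetical name introduced only for this example.

```d
import std.stdio : writeln;
import std.typecons : Nullable, nullable;

/// Hypothetical helper: reports which of the three cases was passed in.
string describe(Nullable!(int[]) v)
{
    if (v.isNull)
        return "Use default";
    return v.get.length == 0 ? "empty array" : "array with data";
}

void main()
{
    int[] empty = [];
    writeln(describe(Nullable!(int[]).init)); // no value supplied
    writeln(describe(nullable(empty)));       // present, but empty
    writeln(describe(nullable([1, 2])));      // present, with data
}
```

Here "no value" is carried by the wrapper, so the check no longer depends on whether the array's pointer happens to be null.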
This comes up time and again. The use of, and ability to distinguish empty from null is very useful.
I disagree.
Because.. the risk of a null pointer exception is not worth the gain? If so, why not go the whole hog and prevent string from ever being null? Then, at least we'd gain something from the loss of the null/empty distinction/limitation. D strings ought to decide whether they're reference types or value types, if the former then I want consistent null back, if the latter then I want to be rid of null for good. This middle ground sucks.
 Yes, you run the risk of things like
 null pointer exceptions etc, but we have that risk now without the
 reward of being able to distinguish these cases.

 Take this simple design:

    string readline();

 This function would like to be able to:
   - return null for EOF
   - return [] for a blank line
That's bad API design, pure and simple. The function should e.g. return the string including the line terminator, and only return an empty (or null) string upon EOF.
It's the C# ReadLine() design and I've never once had a bug because of it.

R
Oct 21 2013
prev sibling parent reply "Kagamin" <spam here.lot> writes:
On Friday, 18 October 2013 at 10:44:11 UTC, Regan Heath wrote:
 This comes up time and again.  The use of, and ability to 
 distinguish empty from null is very useful.  Yes, you run the 
 risk of things like null pointer exceptions etc, but we have 
 that risk now without the reward of being able to distinguish 
 these cases.
In C# code null strings are a plague. Most of the time you don't need them, but still must check for them just in order to not get an exception. Also business logic makes no difference between null and empty - both of them are just "no data", so you end up typing if(string.IsNullOrEmpty(mystr)) every time everywhere. And, yeah, only one small feature in this big mess ever needs to differentiate between null and empty. I found this one case trivially implementable, but nulls still plague all remaining code.
 Take this simple design:

   string readline();

 This function would like to be able to:
  - return null for EOF
  - return [] for a blank line

 but it cannot, because as soon as you write:

   foo(readline())

 the null/[] case merges.
This is a horrible design. You better throw an exception on eof instead of null: this null will break the caller anyway possibly in a contrived way. It works if you read one line per loop cycle, but if you read several lines and assume they're not null (some multiline data format), you're screwed or your code becomes littered with null checks, but who accounts for all alternative scenarios from the start? If readline returns empty string on eof, I don't expect it to break any business logic. If the empty string doesn't match, ok, no match, continue. You can check for eof equally, but at *your* discretion, not when the external data wants you to do it.
 There are plenty of other such design/cases that can be 
 imagined, and while you can work around them all they add 
 complexity for zero gain.
I believe there's no problem domain, which would like to differentiate between null and empty string instead of treating them as "no data".
Oct 19 2013
parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Sat, 19 Oct 2013 10:56:02 +0100, Kagamin <spam here.lot> wrote:

 On Friday, 18 October 2013 at 10:44:11 UTC, Regan Heath wrote:
 This comes up time and again.  The use of, and ability to distinguish  
 empty from null is very useful.  Yes, you run the risk of things like  
 null pointer exceptions etc, but we have that risk now without the  
 reward of being able to distinguish these cases.
In C# code null strings are a plague.
I code in C# every day for work and I never have any problems with null strings. The conflated empty/null cases are the real nightmare for me (more below). null strings are no different to null class references, they're not a special case. People seem to have this odd idea that null is somehow an invalid state for a string /reference/ (c# strings are reference types), it's not. People also seem to elevate empty strings to some sort of special status, that's like saying 0 has some special status for int - it doesn't it's just one of a number of possible values. In fact, int having no null like state is a "problem" causing solutions like boxing to elevate the value type to a reference in order to allow a null state for int. Yet, in D we've decided to inconsistently remove that functionality from string for no gain. If string could not actually be null then we'd gain something from the limitation, instead we lose functionality and gain nothing - you still have to check your strings for null in D. We ought to go one way or the other, this middle ground is worse than either of the other options. In my code I don't have to check for or treat empty strings any differently to other values. I simply have to check for null. Remembering to check for null on reference types is automatic for me, strings are not special in this regard.
 Most of the time you don't need them
Sure, and if I don't have access to null (like when using a value type like int), I can code around that lack, but it's never as straight forward a solution.
 but still must check for them just in order to not get an exception.
Sure, you must check for the possible states of a reference type.
 Also business logic makes no difference between null and empty
This is simply not true. Example at the end.
 both of them are just "no data", so you end up typing  
 if(string.IsNullOrEmpty(mystr)) every time everywhere.
I only have to code like this when I use 3rd party code which has conflated empty and null. In my code when it's null it means not specified, and empty is just one type of value - for which I do no special handling.
 And, yeah, only one small feature in this big mess ever needs to  
 differentiate between null and empty.
Untrue, null allows many alternate and IMO more direct/obvious designs.
 I found this one case trivially implementable, but nulls still plague  
 all remaining code.
Which one case? The readline() one below?
 Take this simple design:

   string readline();

 This function would like to be able to:
  - return null for EOF
  - return [] for a blank line

 but it cannot, because as soon as you write:

   foo(readline())

 the null/[] case merges.
This is a horrible design. You better throw an exception on eof instead of null:
No, no, no. You should only throw in exceptional circumstances or you risk using exceptions for flow control, and that is just plain horrid.
 this null will break the caller anyway possibly in a contrived way.
Never a contrived way, always a blatantly obvious one and only if you're not doing your job properly. If you want a contrived, unpredictable and difficult to debug breakage look no further than heap or stack corruption. Null is never a difficult bug to find and fix, and is no different to forgetting to handle one of the integer return values of a function. I use this all the time: http://msdn.microsoft.com/en-us/library/system.io.streamreader.readline.aspx It has never caused me any issues. It explicitly states that null is a possible output, and so I check for it - doing anything less is simply bad programming.
 It works if you read one line per loop cycle, but if you read several  
 lines and assume they're not null (some multiline data format),
There is your problem, never "assume" - the documentation is very clear on the issue.
 you're screwed or your code becomes littered with null checks, but who  
 accounts for all alternative scenarios from the start?
Me, and IMO any competent programmer. It is misguided to think you can ignore valid states, null is a valid state in C, C++, C#, and D. You should be thinking about and handling it. You don't have to check for it on every access to the variable, but you do need to check for it once where the variable is assigned, or passed (in private functions you can skip this). From that point onward you can assume non-null, valid, job done.
 There are plenty of other such design/cases that can be imagined, and  
 while you can work around them all they add complexity for zero gain.
I believe there's no problem domain, which would like to differentiate between null and empty string instead of treating them as "no data".
null means not specified, non-existent, was not there. empty means present but set to empty/blank. Databases have this distinction for a reason.

If you get input from a user, a field called "foo" may be:
 - not specified
 - specified

and if specified, may be:
 - empty
 - not empty

If foo is not specified you may want to assign a default value for it. If your business logic is using empty to mean "not specified", you prevent the user from actually setting foo to empty, and that limitation is a right pain in many cases. You can code around this by using a boolean or a dictionary to indicate the specified/not specified distinction, but this is less direct than simply using null.

If we have null, let's use it; if we want to remove null then let's remove it, but can we get out of this horrid middle ground please.

Regan
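The dictionary workaround mentioned above might look like this in D, as a sketch only. The fieldOr() helper and the form contents are hypothetical; the `in` operator on associative arrays is standard D.

```d
import std.stdio : writeln;

/// Hypothetical helper: returns the submitted value of `name`, or
/// `fallback` if the field was not submitted at all. A submitted but
/// empty value is returned as-is.
string fieldOr(string[string] form, string name, string fallback)
{
    if (auto p = name in form)
        return *p;
    return fallback;
}

void main()
{
    string[string] form = ["foo": ""]; // foo specified, but blank

    writeln("[", form.fieldOr("foo", "default"), "]"); // prints "[]"
    writeln("[", form.fieldOr("bar", "default"), "]"); // prints "[default]"
}
```

Presence is tracked by the key rather than by the string value, so "specified but empty" and "not specified" stay distinct without relying on null strings.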
Oct 21 2013
parent reply "Kagamin" <spam here.lot> writes:
On Monday, 21 October 2013 at 10:33:01 UTC, Regan Heath wrote:
 null strings are no different to null class references, they're 
 not a special case.
True. That's an implementation detail which has no meaning for business logic. When implementation deviates from business logic, one ends up fixing the implementation details everywhere in order to implement business logic. That's why string.IsNullOrEmpty is used.
 People seem to have this odd idea that null is somehow an 
 invalid state for a string /reference/ (c# strings are 
 reference types), it's not.
That's the very problem: null and empty are valid states and must be treated equally as "no data", but they can't for purely technical reasons.
 People also seem to elevate empty strings to some sort of 
 special status, that's like saying 0 has some special status 
 for int - it doesn't it's just one of a number of possible 
 values.

 In fact, int having no null like state is a "problem" causing 
 solutions like boxing to elevate the value type to a reference 
 in order to allow a null state for int.
You want to check ints for null everywhere too?
 Yet, in D we've decided to inconsistently remove that 
 functionality from string for no gain.  If string could not 
 actually be null then we'd gain something from the limitation, 
 instead we lose functionality and gain nothing - you still have 
 to check your strings for null in D.
Huh? Null slices work just like empty ones - that's why this topic was started in the first place. One doesn't have to check slices for nulls, only for length. If you want clear nullable semantics, you have Nullable, it works for everything, including strings and ints. You would want this feature only in rare cases, so it doesn't make sense to make it default, or it will be a nuisance.
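A short sketch of that behaviour, with variable names chosen only for illustration: equality and length cannot tell a null slice from an empty one, ordinary operations need no null check, and only an identity test with `is` exposes the difference.

```d
import std.stdio : writeln;

void main()
{
    int[] fromNull = null;
    int[] fromLiteral = [];
    int[] source = [1, 2];
    int[] emptySlice = source[0 .. 0]; // zero length, but points into `source`

    // Equality and length cannot tell the three apart.
    assert(fromNull.length == 0 && fromLiteral.length == 0 && emptySlice.length == 0);
    assert(fromNull == fromLiteral && fromLiteral == emptySlice);

    // Only identity can: [] is null, a slice of real memory is not.
    assert(fromNull is null);
    assert(fromLiteral is null);
    assert(emptySlice !is null);

    // Ordinary use needs no null check.
    foreach (e; fromNull)
        writeln(e);     // never executes
    fromNull ~= 42;     // appending to a null slice simply allocates
    assert(fromNull == [42]);

    writeln("all checks passed");
}
```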
 both of them are just "no data", so you end up typing 
 if(string.IsNullOrEmpty(mystr)) every time everywhere.
I only have to code like this when I use 3rd party code which has conflated empty and null. In my code when it's null it means not specified, and empty is just one type of value - for which I do no special handling.
Equivalence between null and empty is a business logic's requirement, that's why it's done.
 And, yeah, only one small feature in this big mess ever needs 
 to differentiate between null and empty.
Untrue, null allows many alternate and IMO more direct/obvious designs.
The need for those designs is rare and trivially implementable for all value types.
 I found this one case trivially implementable, but nulls still 
 plague all remaining code.
Which one case? The readline() one below?
No, it was an authentication system in third-party code for one special case. I also had to specify this null value in app.config - guess how, explicitly specify, not substitute missing parameter with a default. Another possibility for readline is to return a tuple {bool eof, string line(non-null)} - this way you have easy check for eof and don't have to check for null when you don't need it.
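The tuple-returning variant could be sketched like this, using std.typecons.Tuple. ReadResult, readline() and pendingLines are hypothetical names for the example, not an existing API.

```d
import std.stdio : writeln;
import std.typecons : Tuple;

// Hypothetical result type: an explicit eof flag plus a line that is never null.
alias ReadResult = Tuple!(bool, "eof", string, "line");

// Hypothetical in-memory source; the middle entry is a blank line.
string[] pendingLines = ["alpha", "", "omega"];

ReadResult readline()
{
    if (pendingLines.length == 0)
        return ReadResult(true, "");
    auto result = ReadResult(false, pendingLines[0]);
    pendingLines = pendingLines[1 .. $];
    return result;
}

void main()
{
    for (auto r = readline(); !r.eof; r = readline())
        writeln("[", r.line, "]");
}
```

A caller that does not care about EOF can use r.line directly, since it is always a valid (possibly empty) string.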
 I use this all the time:
 http://msdn.microsoft.com/en-us/library/system.io.streamreader.readline.aspx

 It has never caused me any issues.  It explicitly states that 
 null is a possible output, and so I check for it - doing 
 anything less is simply bad programming.

 It works if you read one line per loop cycle, but if you read 
 several lines and assume they're not null (some multiline data 
 format),
There is your problem, never "assume" - the documentation is very clear on the issue.
 you're screwed or your code becomes littered with null checks, 
 but who accounts for all alternative scenarios from the start?
Me, and IMO any competent programmer. It is misguided to think you can ignore valid states, null is a valid state in C, C++, C#, and D. You should be thinking about and handling it.
Here null is a valid state for readline, not for the caller: if the caller parses a multiline data format, unexpected end of file is an invalid state. And what do you gain by littering your code with those null checks? Just making runtime happy and adding noise to the code? You could use that time to improve the code or add features or even relax. It's exactly nullable strings, which gain you only a time waste.
 You don't have to check for it on every access to the variable, 
 but you do need to check for it once where the variable is 
 assigned, or passed (in private functions you can skip this).  
 From that point onward you can assume non-null, valid, job done.
You just said "never assume". The assumption may fail, because the string type is still nullable, compiler doesn't save you here, this sucks. And in order to check for everything everywhere on a level near that of the compiler, you must be not just competent, but perfect.
 I believe there's no problem domain, which would like to 
 differentiate between null and empty string instead of 
 treating them as "no data".
null means not specified, non existent, was not there. empty means, present but set to empty/blank. Databases have this distinction for a reason.
Oracle makes no distinction between null and empty string. For a reason? A database is an implementation detail of a data storage, it doesn't implement business logic, it only provides features, which can be used with more or less success to implement business logic. Ever heard of advantages of OO databases over relational ones? That's an illustration of technical details, which don't precisely map to business logic.
 If you get input from a user a field called "foo" may be:
  - not specified
  - specified

 and if specified, may be:
  - empty
  - not empty
If the user doesn't fill a text box, it's both empty and not specified - there's just no difference. And it doesn't matter how you store it in the database - as null or as empty string - both are presented in the same way. Heck, we use these optional text boxes everywhere - can you tell if their content is empty or not specified?

And what if the value is required? Would you accept an empty value? And if your database treats empty string as not null, would you allow a user to register with an empty login name? And how to express this constraint in the database? In SQL "not null" means "required value", but it's not equivalent to the business logic's notion of a required value. I wouldn't be surprised if Oracle did that in order to reject empty strings in not null fields.

Let's consider a process of specifying user's data. What text fields do we have?
1. Login. No difference between null and empty - both invalid - "no data", must enter something.
2. First name. No difference between null and empty - both are "no data" and are presented as empty text box.
3. Middle name. ditto.
4. Last name. ditto.
5. Country. ditto.
6. State. ditto.
7. City. ditto.
8. Address. ditto.
9. Building. ditto.
10. Flat. ditto.
11. Zip code. ditto.
12. Phone. ditto.
13. Fax. ditto.
14. E-mail. ditto.
15. Site. ditto.
16. Passport number. ditto.
17. Birth place. ditto.
18. Comment. Hell! Comment!

See? Not a single field in the list requires distinction between null and empty. And slices don't differentiate between them. Just as planned.
 If we have null, let's use it, if we want to remove null then 
 let's remove it, but can we get out of this horrid middle ground 
 please.
*sigh* people just don't buy the KISS principle...
Oct 25 2013
next sibling parent reply "Wyatt" <wyatt.epp gmail.com> writes:
On Friday, 25 October 2013 at 11:41:38 UTC, Kagamin wrote:
 That's an implementation detail which has no meaning for 
 business logic.
I've no real truck in this, but I do find it pretty bizarre to see _anyone_ using "business logic" as justification for anything here when D's own documentation is pretty explicit about not catering exclusively to that domain. -Wyatt
Oct 25 2013
parent reply "Kagamin" <spam here.lot> writes:
On Friday, 25 October 2013 at 12:35:44 UTC, Wyatt wrote:
 On Friday, 25 October 2013 at 11:41:38 UTC, Kagamin wrote:
 That's an implementation detail which has no meaning for 
 business logic.
I've no real truck in this, but I do find it pretty bizarre to see _anyone_ using "business logic" as justification for anything here when D's own documentation is pretty explicit about not catering exclusively to that domain.
Dunno about D documentation, I use tools to get shit done. If they help, that's good, if they don't, that's bad. And by "shit" I don't mean a product, not a heap of text files.
Oct 25 2013
parent "Kagamin" <spam here.lot> writes:
*fix* I mean a product.
Oct 25 2013
prev sibling next sibling parent reply "Max Samukha" <maxsamukha gmail.com> writes:
On Friday, 25 October 2013 at 11:41:38 UTC, Kagamin wrote:
 On Monday, 21 October 2013 at 10:33:01 UTC, Regan Heath wrote:
 null strings are no different to null class references, 
 they're not a special case.
True. That's an implementation detail which has no meaning for business logic. When implementation deviates from business logic, one ends up fixing the implementation details everywhere in order to implement business logic. That's why string.IsNullOrEmpty is used.
That's not an implementation detail. Whether "null" is in the set of values of a string type and whether it is identical to "empty" are fundamental properties of that type. If you define the string type to include "null", then "null" should be either identical to "empty" in *all cases* or distinct from that in all cases. D chose to fuse "null" and "empty" together in an inconsistent manner, which is a mistake. If we include "null" in the set, then either the [] literal should be non-null (and "null" and "empty" properly disjoint), or "null" and "empty" should always represent the same value. If we exclude it - *then* "null" becomes an implementation detail and should be dealt with only via .ptr.
 People seem to have this odd idea that null is somehow an 
 invalid state for a string /reference/ (c# strings are 
 reference types), it's not.
That's the very problem: null and empty are valid states and must be treated equally as "no data", but they can't for purely technical reasons.
Whether they are valid states is irrelevant. What matters is whether they represent identical values. In D, they are unhealthily mixed.
Oct 25 2013
next sibling parent "Kagamin" <spam here.lot> writes:
On Friday, 25 October 2013 at 16:31:54 UTC, Max Samukha wrote:
 D chose to fuse "null" and "empty" together in an inconsistent 
 manner, which is a mistake.
Slices are reasonably consistent and perfectly working with reasonable code, so I see no merit in fixing them, but you can try, why not.
Oct 25 2013
prev sibling parent "Kagamin" <spam here.lot> writes:
On Friday, 25 October 2013 at 16:31:54 UTC, Max Samukha wrote:
 If you define the string type to include "null", then "null" 
 should be either identical to "empty" in *all cases* or 
 distinct from that in all cases.
AFAIK, that's how equality operator works, use it and you will get the desired semantics. Should be no problem.
Oct 28 2013
prev sibling next sibling parent Shammah Chancellor <S S.com> writes:
On 2013-10-25 11:41:36 +0000, Kagamin said:

 Oracle makes no distinction between null and empty string. For a reason?
 A database is an implementation detail of a data storage, it doesn't 
 implement business logic, it only provides features, which can be used 
 with more or less success to implement business logic. Ever heard of 
 advantages of OO databases over relational ones? That's an illustration 
 of technical details, which don't precisely map to business logic.
That's poor friggin design, and it's for a bad reason. Oracle is not the example you want to be following. Sql Server does *NOT* follow their example for GOOD reason. My middle name is not null, it is NOTHING. There are lots of places where Oracle made bad design decisions and they cannot escape them due to requiring backwards compatibility.
Oct 25 2013
prev sibling next sibling parent "ProgrammingGhost" <dsioafiseghvfawklncfskzdcf sdifjsdiovgfdisjcisj.com> writes:
As the OP of this thread I want to say that I think Nullable is 
the solution (http://dlang.org/phobos/std_typecons.html), but I 
dislike that I can't pass 5 or null directly to a parameter that is 
Nullable!int or Nullable!string.
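A rough sketch of what this looks like with `std.typecons.Nullable`, applied to the `int[]` case from the original question. `describe` is a hypothetical helper, not anything from Phobos. Note that the caller has to wrap the argument explicitly, which is the inconvenience mentioned above:

```d
import std.conv : to;
import std.typecons : Nullable;

// Hypothetical helper: reports what fn would do with its argument.
string describe(Nullable!(int[]) v)
{
    if (v.isNull)
        return "Use default";
    return v.get.to!string;
}

void main()
{
    int[] none = [];
    // The value must be wrapped by hand; a bare [1, 2] is not accepted.
    assert(describe(Nullable!(int[])([1, 2])) == "[1, 2]");
    // An empty array is a present value, distinct from "no value".
    assert(describe(Nullable!(int[])(none)) == "[]");
    assert(describe(Nullable!(int[]).init) == "Use default");
}
```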
Oct 25 2013
prev sibling parent "Regan Heath" <regan netmail.co.nz> writes:
I find that I have repeated myself a lot in each section/reply below. I am  
not sure whether you'd prefer I just reply with those points once, or  
inline; I chose inline so as to make it clear I was not ignoring your  
points, and to make it clear which of my arguments apply to which point...

:)

On Fri, 25 Oct 2013 12:41:36 +0100, Kagamin <spam here.lot> wrote:
 On Monday, 21 October 2013 at 10:33:01 UTC, Regan Heath wrote:
 null strings are no different to null class references, they're not a  
 special case.
True. That's an implementation detail which has no meaning for business logic.
This argument applies both ways. If D conflates null and empty, then this restricts business logic with an implementation detail. We agree that D has no place in defining business logic, therefore it follows that the more flexible option is preferable as it is neutral in its effect on business logic. However, this decision, like most is a cost/benefit analysis and in the case of strings the case can be made that they should be a value type, and never null. I can get behind such a decision, as it would mean D was taking a side, finally. If strings cannot be null then we actually benefit from the current conflation of the two, by avoiding having to do null reference checking, and the associated exception/crash. I would prefer to go the other way and allow a consistent null/empty distinction but either option is better than the status quo where we have to check for null ("cost") but gain no benefit from this, because we cannot use the null state consistently.
 When implementation deviates from business logic, one ends up fixing the  
 implementation details everywhere in order to implement business logic.  
 That's why string.IsNullOrEmpty is used.
I almost never need to use string.IsNullOrEmpty. The reason why is simple. An empty string is just one value a string may hold, and my code does not "generally" treat it as special except in certain specific cases where I make that additional check (your blank username example, for one). Null is the only "special" state a string reference can have, so I check for this and this alone.
 People seem to have this odd idea that null is somehow an invalid state  
 for a string /reference/ (c# strings are reference types), it's not.
That's the very problem: null and empty are valid states and must be treated equally as "no data", but they can't for purely technical reasons.
I never treat null and empty "equally as "no data"" that is my whole point. They are not the same thing conceptually, you should never treat them as the same thing. null means "no data", empty is just one possible state of "data". You might make the business logic decision of disallowing empty values, of treating an empty value as if no value was given. The two would still be conceptually separate, but your code would be making the decision to treat them in the same way. You encode this decision in the function which accesses the input, once, and your problems are all solved. If you make the mistake of conflating null and empty in your input layer then you restrict your "business logic" and create the very problem you're complaining about here, stop conflating them and the problem simply vanishes. If your input mechanism or a 3rd party library is conflating them, then you can add a business/conversion layer to convert empty to null and all your code can ignore the empty case and simply concentrate on checking for null, as it should already do - because this is unavoidable in any case. This is KISS, collapse the 2 possible "error" states into 1 and check for that.
 People also seem to elevate empty strings to some sort of special  
 status, that's like saying 0 has some special status for int - it  
 doesn't it's just one of a number of possible values.

 In fact, int having no null like state is a "problem" causing solutions  
 like boxing to elevate the value type to a reference in order to allow  
 a null state for int.
You want to check ints for null everywhere too?
No. (Strawman). There are some cases where people wrap int in nullable however as there are some use cases where you do want to be able to indicate "no data" using a single variable. This is the flexibility of a reference type, and the cost is the check for null. If you do cost/benefit analysis for int with this in mind it is clearly not a type we want as a reference type - the performance penalty alone kills this.
 Yet, in D we've decided to inconsistently remove that functionality  
 from string for no gain.  If string could not actually be null then  
 we'd gain something from the limitation, instead we lose functionality  
 and gain nothing - you still have to check your strings for null in D.
Huh? Null slices work just like empty ones - that's why this topic was started in the first place. One doesn't have to check slices for nulls, only for length.
Slices are not strings, as slices cannot be null. However "if (slice is null)" can still be true - this is just plain wrong/inconsistent. Let's pick a side and handle it consistently, above all else. We can argue about which side, but can we at least agree the inconsistency is a bad thing?
 If you want clear nullable semantics, you have Nullable, it works for  
 everything, including strings and ints. You would want this feature only  
 in rare cases, so it doesn't make sense to make it default, or it will  
 be a nuisance.
Strings can be null, not checking for null is fatal. You cannot easily tell if you have a string or a slice so you currently have to check for null in most/all cases already. We're paying that "cost" already and yet not getting the full benefit from it. It's simply a bad investment. D should pick a side and conform to it, either we have nullable strings or we don't. The current middle ground is just worse.
 both of them are just "no data", so you end up typing  
 if(string.IsNullOrEmpty(mystr)) every time everywhere.
I only have to code like this when I use 3rd party code which has conflated empty and null. In my code when it's null it means not specified, and empty is just one type of value - for which I do no special handling.
Equivalence between null and empty is a business logic's requirement, that's why it's done.
Whose business logic? This is perhaps my secondary point here. D has no grounds to define business logic for all possible applications, this is something each application must have the flexibility to define for itself. A library ought to provide the tools to do it - converting "" to null for you - but the language should not mandate it.
 And, yeah, only one small feature in this big mess ever needs to  
 differentiate between null and empty.
Untrue, null allows many alternate and IMO more direct/obvious designs.
The need for those designs is rare and trivially implementable for all value types.
Rare; untrue, I use null all the time to good effect. Trivially implementable, debatable - if you have to do more work you're paying a price, if you get no reward for that price then you're wasting resources. The current situation in D has you paying the price for no reward.
 I found this one case trivially implementable, but nulls still plague  
 all remaining code.
Which one case? The readline() one below?
No, it was an authentication system in third-party code for one special case.
No-one is trying to say you cannot code around it, even trivially in some cases, but the null design would likely have been simpler still. And, this means less wasted effort, and worse still it gained you nothing.
 I also had to specify this null value in app.config - guess how,  
 explicitly specify, not substitute missing parameter with a default.
Seems to me that if you want a config to be null, you simply omit it from the configuration file. Then have the code return null for its value, to indicate "no data". If it's present, and set to "" then you would be able to differentiate these two cases, which is essential if your business logic requires that "" is a valid value for the config. D should not place restrictions on your business logic - with an implementation detail.
 Another possibility for readline is to return a tuple
 {bool eof, string line(non-null)} - this way you have easy check for eof  
 and don't have to check for null when you don't need it.
Yet another more complex design, for no gain. The additional boolean buys us nothing over the string reference, it costs more in terms of memory and complexity and you still have to remember to check it, as you have to remember to check for null in the original design.
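For reference, the tuple design under discussion might look roughly like this. `LineResult` and `nextLine` are hypothetical names, and the reader works over an in-memory list of lines rather than a real file, purely to keep the sketch self-contained:

```d
import std.typecons : Tuple;

// Hypothetical result type: an eof flag plus a line that is never null.
alias LineResult = Tuple!(bool, "eof", string, "line");

// Hypothetical reader over an in-memory list of lines.
LineResult nextLine(ref string[] pending)
{
    if (pending.length == 0)
        return LineResult(true, "");
    auto line = pending[0];
    pending = pending[1 .. $];
    return LineResult(false, line);
}

void main()
{
    string[] input = ["first", "", "third"];
    size_t blank;
    // The caller still has to remember to test r.eof on every read.
    for (auto r = nextLine(input); !r.eof; r = nextLine(input))
        if (r.line.length == 0)
            ++blank;
    assert(blank == 1);
    assert(nextLine(input).eof);
}
```

Either way there is exactly one check per read; the designs differ only in whether that check is on a flag or on the reference itself.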
 you're screwed or your code becomes littered with null checks, but who  
 accounts for all alternative scenarios from the start?
Me, and IMO any competent programmer. It is misguided to think you can ignore valid states, null is a valid state in C, C++, C#, and D.. You should be thinking about and handling it.
Here null is a valid state for readline, not for the caller: if the caller parses a multiline data format, unexpected end of file is an invalid state.
If they pass a multi-line data format, and they have counted the number of lines prior to passing it (to verify that they can call readline() N times safely) then yes, calling readline and getting EOF would be unexpected and worthy of an exception. But, why would you want to pay the cost of processing the lines twice (to count them and ensure no EOF)? Why not just have readline do that for you, by returning null on EOF. Simpler, more direct.
 And what do you gain by littering your code with those null checks? Just  
 making runtime happy and adding noise to the code? You could use that  
 time to improve the code or add features or even relax. It's exactly  
 nullable strings, which gain you only a time waste.
In D, you already have to "litter your code with null checks" so you're already paying the cost, you're just not getting any benefit.
 You don't have to check for it on every access to the variable, but you  
 do need to check for it once where the variable is assigned, or passed  
 (in private functions you can skip this).  From that point onward you  
 can assume non-null, valid, job done.
You just said "never assume". The assumption may fail, because the string type is still nullable, compiler doesn't save you here, this sucks. And in order to check for everything everywhere on a level near that of the compiler, you must be not just competent, but perfect.
Play on words. If you've filtered out null, you're not "assuming" you're "ensuring" it's non-null. The only way to get null from that point is either "by design" or via memory corruption. D does protect you from memory corruption by avoiding the need for raw pointers etc. And, if you're setting string variables to null "by design" then you will need to check them again, of course. Yes, if you want to write good code you need to develop good habits WRT using null, it's unavoidable. Unless we remove null and the power/flexibility it affords - which is a valid option. So, can we just pick an option for D and go with it, I don't really mind which way we go - tho my preference should be obvious :)
 I believe there's no problem domain, which would like to differentiate  
 between null and empty string instead of treating them as "no data".
null means not specified, non existent, was not there. empty means, present but set to empty/blank. Databases have this distinction for a reason.
Oracle makes no distinction between null and empty string. For a reason?
Looks like it was (ultimately) a mistake: http://docs.oracle.com/cd/B19306_01/server.102/b14200/sql_elements005.htm <quote>Note: Oracle Database currently treats a character value with a length of zero as null. However, this may not continue to be true in future releases, and Oracle recommends that you do not treat empty strings the same as nulls.</quote> To repeat the important part.. "Oracle recommends that you do not treat empty strings the same as nulls". For. A. Reason. The database has no right to define business logic - this restriction in oracle database has no doubt caused people to have to work around it, by using a specific "value" as null.
 A database is an implementation detail of a data storage, it doesn't  
 implement business logic
Agree 100% conflating null and empty string is a business logic decision, it has no place in a database or other base level - like a language or standard library.
 If you get input from a user a field called "foo" may be:
  - not specified
  - specified

 and if specified, may be:
  - empty
  - not empty
If the user doesn't fill a text box, it's both empty and not specified - there's just no difference.
There is a clear and important difference. Let's say the text box represents the user's middle name, let's presume they have given a value for it at some stage, and let's assume they would like to remove it. They load the page, erase the value and click submit. Your business logic will ignore the empty value, and not update the user's middle name. My business logic will detect the text box was present (not null) and apply the given value "" to the user's middle name (in the database for example).
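The middle-name scenario can be sketched as follows. `User` and `applyMiddleName` are hypothetical, and `Nullable!string` stands in for a string type with a reliable null state, which is an assumption of this sketch rather than how D strings behave:

```d
import std.typecons : Nullable;

struct User
{
    string middleName;
}

// Hypothetical update step: an absent field leaves the stored value
// alone, while a present field (even "") overwrites it.
void applyMiddleName(ref User user, Nullable!string submitted)
{
    if (!submitted.isNull)
        user.middleName = submitted.get;
}

void main()
{
    auto user = User("Quentin");

    // Field not on the form at all: nothing changes.
    applyMiddleName(user, Nullable!string.init);
    assert(user.middleName == "Quentin");

    // Field present but erased: the middle name is removed.
    applyMiddleName(user, Nullable!string(""));
    assert(user.middleName == "");
}
```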
 And it doesn't matter how you store it in the database - as null or as  
 empty string - both are presented in the same way.
They don't have to be, that is my point. The decision of how to display them is a business logic decision and having a clear distinction between null and empty allows you to display them differently. Not having the distinction, ties your hands.
 Heck, we use these optional text boxes everywhere - can you tell if  
 their content is empty or not specified?
http is one such input mechanism which conflates null and empty, there are numerous ways to code around it. D is making the same mistake, with the same consequences, this is my central point.
 And what if the value is required? Would you accept an empty value?
This is a business logic decision, which D, and the database have no right to make. Yes, if the user could input an empty value and yes if my business logic wanted to detect and disallow it - I would. If not, I would not. The point is that null gives you the power to express both, rather than restricting you and forcing an indirect solution to code around the lack.
 If we have null, let's use it; if we want to remove null then let's remove  
 it, but can we get out of this horrid middle ground please.
*sigh* people just don't buy the KISS principle...
No kidding. From my perspective null /is/ KISS and having to code around the lack with a more complex design is not. :P R -- Using Opera's revolutionary email client: http://www.opera.com/mail/
Oct 28 2013
prev sibling next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 10/18/2013 12:50 AM, ProgrammingGhost wrote:
 How do I find out if null was passed in? As you can guess I wasn't happy
 with the current behavior.
 ...
http://forum.dlang.org/thread/rkdzdxygpflpnaznxxnl@forum.dlang.org?page=5
Oct 18 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, October 18, 2013 10:38:12 H. S. Teoh wrote:
 On Fri, Oct 18, 2013 at 01:32:58PM -0400, Jonathan M Davis wrote:
 On Friday, October 18, 2013 09:55:46 Andrei Alexandrescu wrote:
 On 10/18/13 9:26 AM, Max Samukha wrote:
 *That's* bad API design. readln should be symmetrical to writeln,
 not write. And about preserving the exact representation of new
 lines, readln/writeln shouldn't preserve that, pure and simple.
Fair point. I just gave one possible alternative out of many. Thing is, relying on client code to distinguish subtleties between empty and null strings is fraught with dangers.
Yeah, but the primary reason that it's bad design is the fact that D tries to conflate null and empty instead of keeping them distinct (which is essentially the complaint that was made). Whether that's ultimately good or bad is up for debate, but the side effect is that relying on the difference between null and empty ends up being very bug-prone, whereas in other languages which don't conflate the two, it isn't problematic in the same way, and it's much more reasonable to have the API treat them differently.
[...] IMO, distinguishing between null and empty arrays is bad abstraction. I agree with D's "conflation" of null with empty, actually. Conceptually speaking, an array is a sequence of values of non-negative length. An array with non-zero length contains at least one element, and is therefore non-empty, whereas an array with zero length is empty. Same thing goes with a slice. A slice is a view into zero or more array elements. A slice with zero length is empty, and a slice with non-zero length contains at least one element. There's nowhere in this conceptual scheme for such a thing as a "null array" that's distinct from an empty array. This distinction only crops up in implementation, and IMO leads to code smells because code should be operating based on the conceptual behaviour of arrays rather than on the implementation details.
In most languages, an array is a reference type, so there's the question of whether it's even _there_. There's a clear distinction between having a null reference to an array and having a reference to an empty array. This is particularly clear in C++ where an array is just a pointer, but it's true in plenty of other languages that don't treat arrays as pointers (e.g. Java).

The problem is that D put the length on the stack alongside the pointer, making it so that D arrays are sort of reference types and sort of not. The pointer is a reference type, but the length is a value type, making the dynamic array half and half. If it were fully a reference type, then there would be no problem with distinguishing between null and empty arrays. A null array is simply a null reference to an array. But since D arrays aren't quite reference types, that doesn't work.

I see no problem in the abstraction of arrays with having null arrays, because a null array is simply a null reference to an array, which is exactly the same as having a null object or null pointer. It's the reference that's null, not what it points to. It's just D's implementation that's weird. It would be like taking some of the member variables of a class and putting them in the reference instead of in the object and then discussing how much a null object makes sense. It's just bizarre.

Now, D arrays end up working great overall in spite of their semantic weirdness, but it does mean that you can't really have proper null arrays in the same way that most languages with arrays can, forcing you to either be extremely careful when dealing with null and arrays or to waste space doing stuff to keep track of nullability separately from the array itself like Nullable does. - Jonathan M Davis
Oct 18 2013
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Oct 18, 2013 at 02:04:41PM -0400, Jonathan M Davis wrote:
 On Friday, October 18, 2013 10:38:12 H. S. Teoh wrote:
[...]
 IMO, distinguishing between null and empty arrays is bad
 abstraction. I agree with D's "conflation" of null with empty,
 actually. Conceptually speaking, an array is a sequence of values of
 non-negative length. An array with non-zero length contains at least
 one element, and is therefore non-empty, whereas an array with zero
 length is empty. Same thing goes with a slice. A slice is a view
 into zero or more array elements. A slice with zero length is empty,
 and a slice with non-zero length contains at least one element.
 There's nowhere in this conceptual scheme for such a thing as a
 "null array" that's distinct from an empty array. This distinction
 only crops up in implementation, and IMO leads to code smells
 because code should be operating based on the conceptual behaviour
 of arrays rather than on the implementation details.
In most languages, an array is a reference type, so there's the question of whether it's even _there_. There's a clear distinction between having a null reference to an array and having a reference to an empty array. This is particularly clear in C++ where an array is just a pointer, but it's true in plenty of other languages that don't treat arrays as pointers (e.g. Java).
To me, these are just implementation details. Conceptually speaking, D arrays are actually slices, so that gives them reference semantics. Being slices, they refer to zero or more elements, so either their length is zero, or not. There is no concept of nullity here. That only comes because we chose to implement slices as pointer + length, so implementation-wise we can distinguish between a null .ptr and a non-null .ptr. But from the conceptual POV, if we consider slices as a whole, they are just a sequence of zero or more elements. Null has no meaning here. Put another way, slices themselves are value types, but they refer to their elements by reference. It's a subtle but important difference.
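A small sketch of the pointer + length picture, assuming nothing beyond the built-in `.ptr` and `.length` properties. Copying a slice copies the pointer and length (value semantics for the slice itself), while the elements stay shared (reference semantics for the contents):

```d
void main()
{
    int[] a = [1, 2, 3];
    int[] b = a;             // copies pointer and length, not the elements
    b[0] = 42;
    assert(a[0] == 42);      // the elements are shared

    b = b[1 .. $];           // changes only b's own pointer and length
    assert(a.length == 3 && b.length == 2);

    int[] n = null;
    int[] e = a[$ .. $];     // zero elements, but points into a's memory
    assert(n.ptr is null && n.length == 0);
    assert(e.ptr !is null && e.length == 0);
}
```

The last two assertions are the only place where "null" and "empty" differ, and both involve looking at `.ptr` directly.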
 The problem is that D put the length on the stack alongside the
 pointer, making it so that D arrays are sort of reference types and
 sort of not. The pointer is a reference type, but the length is a
 value type, making the dynamic array half and half. If it were fully a
 reference type, then there would be no problem with distinguishing
 between null and empty arrays. A null array is simply a null reference
 to an array. But since D arrays aren't quite reference types, that
 doesn't work.
[...] I think the issue comes from the preconceived notion acquired from other languages that arrays are some kind of object floating somewhere out there on the heap, for which we have a handle here. Thus we have the notion of null, being the case when we have a handle here but there's actually nothing out there. But if we consider the slice as being a thing right *here* and now, referencing some sequence of elements out there, then we arrive at D's notion of null and empty being the same thing, because while there may be no elements out there being referenced, the handle (i.e. slice) is always *here*. In that sense, there's no distinction between an empty slice and a null slice: either there are elements out there that we're referring to, or there are none. There is no third "null" case. There's no reason why we should adopt the previous notion if this one works just as well, if not better. I argue that the second notion is conceptually cleaner, because it eliminates an unnecessary distinction between an empty sequence and a non-existent sequence (which then leads to similar issues one encounters with null pointers). T -- Answer: Because it breaks the logical sequence of discussion. / Question: Why is top posting bad?
Oct 18 2013
next sibling parent reply "Meta" <jared771 gmail.com> writes:
On Friday, 18 October 2013 at 19:59:26 UTC, H. S. Teoh wrote:
 ...because it eliminates an unnecessary distinction between an 
 empty sequence and a non-existent sequence (which then leads to 
 similar issues one encounters with null pointers).
That just seems silly. Surely we all recognize that there's a difference between the empty set and having no set at all, and that it's valuable to be able to distinguish between the two. The empty set is still a set, while nothing is... nothing.
Oct 18 2013
next sibling parent reply "Blake Anderton" <rbanderton gmail.com> writes:
I agree a null value and empty array are separate concepts, but 
from my very anecdotal/non rigorous point of view I really 
appreciate D's ability to treat them as equivalent.

My day job mostly involves C# and array code almost always 
follows the pattern if(arr == null || arr.Length == 0) ...

In D just doing if(arr.length) feels much nicer and less error 
prone. I'm all for correctness but would hate to throw the baby 
out with the bathwater.
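A minimal sketch of the D side of this comparison. `hasItems` is a hypothetical helper. Reading `.length` on a null slice is safe and yields 0, so one check covers both cases:

```d
import std.array; // makes the range primitive `empty` available for slices

// Hypothetical helper: true only when there is at least one element.
bool hasItems(int[] arr)
{
    return arr.length != 0;
}

void main()
{
    int[] nothing = null;
    int[] backing = [7];

    assert(!hasItems(nothing));          // null slice: length is 0
    assert(!hasItems(backing[0 .. 0]));  // empty, non-null slice
    assert(hasItems(backing));

    // The range primitive gives the same answer for both.
    assert(nothing.empty && backing[0 .. 0].empty);
}
```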
Oct 18 2013
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 10/18/2013 10:09 PM, Blake Anderton wrote:
 I agree a null value and empty array are separate concepts, but from my
 very anecdotal/non rigorous point of view I really appreciate D's
 ability to treat them as equivalent.

 My day job mostly involves C# and array code almost always follows the
 pattern if(arr == null || arr.Length == 0) ...

 In D just doing if(arr.length) feels much nicer and less error prone.
 I'm all for correctness but would hate to throw the baby out with the
 bathwater.
(This will work either way.)
Oct 18 2013
parent "Meta" <jared771 gmail.com> writes:
On Friday, 18 October 2013 at 20:15:31 UTC, Timon Gehr wrote:
 (This will work either way.)
Speaking of that, it's really annoying to have to import std.array just to use range primitives with slices. Would these be better in druntime, or is that a bad idea?
Oct 18 2013
prev sibling next sibling parent reply "ProgrammingGhost" <dsioafiseghvfawklncfskzdcf sdifjsdiovgfdisjcisj.com> writes:
On Friday, 18 October 2013 at 20:09:37 UTC, Blake Anderton wrote:
 I agree a null value and empty array are separate concepts, but 
 from my very anecdotal/non rigorous point of view I really 
 appreciate D's ability to treat them as equivalent.

 My day job mostly involves C# and array code almost always 
 follows the pattern if(arr == null || arr.Length == 0) ...

 In D just doing if(arr.length) feels much nicer and less error 
 prone. I'm all for correctness but would hate to throw the baby 
 out with the bathwater.
Really? I NEVER write that pattern. I may check whether an array is null, or I don't, because the function shouldn't be receiving nulls (maybe it's bad but idc). I just write LINQ and never bother to see if something is empty.
Oct 18 2013
parent "Blake Anderton" <rbanderton gmail.com> writes:
On Friday, 18 October 2013 at 20:32:48 UTC, ProgrammingGhost 
wrote:
 Really? I NEVER write that pattern. I may check whether an array is 
 null, or I don't, because the function shouldn't be receiving nulls 
 (maybe it's bad but idc). I just write LINQ and never bother to 
 see if something is empty.
Yeah, LINQ makes it a lot easier, but I usually take IEnumerable<T> instead of coding directly against arrays in that case. I find most of the time I use arrays directly is when using "params" parameters. It's very easy to not null check that and cause heartache down the line.
Oct 18 2013
prev sibling next sibling parent "David Nadlinger" <code klickverbot.at> writes:
On Friday, 18 October 2013 at 20:09:37 UTC, Blake Anderton wrote:
 I agree a null value and empty array are separate concepts […]
Yes, null values are a different concept, and slices being value types, there isn't really one for them. I'm torn on whether allowing conversion of arrays to pointers for the purpose of null comparison was a good idea or not. David
Oct 18 2013
prev sibling parent "Regan Heath" <regan netmail.co.nz> writes:
On Fri, 18 Oct 2013 21:09:35 +0100, Blake Anderton <rbanderton gmail.com>  
wrote:

 I agree a null value and empty array are separate concepts, but from my  
 very anecdotal/non rigorous point of view I really appreciate D's  
 ability to treat them as equivalent.

 My day job mostly involves C# and array code almost always follows the  
 pattern if(arr == null || arr.Length == 0) ...
Interesting. My day job is C# and I almost never do that. I check for null and treat empty as any other string value. The /only/ time I have to check for empty is when I have interfaced with 3rd party code which has decided to conflate empty and null to mean the same thing. Regan -- Using Opera's revolutionary email client: http://www.opera.com/mail/
Oct 21 2013
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Oct 18, 2013 at 10:04:52PM +0200, Meta wrote:
 On Friday, 18 October 2013 at 19:59:26 UTC, H. S. Teoh wrote:
...because it eliminates an unnecessary distinction between an
empty sequence and a non-existent sequence (which then leads to
similar issues one encounters with null pointers).
That just seems silly. Surely we all recognize that there's a difference between the empty set and having no set at all, and that it's valuable to be able to distinguish between the two. The empty set is still a set, while nothing is... nothing.
Yes, but if you declare a variable to contain a set, then by definition there is *something*, even if it's an empty set. For there to be nothing, there shouldn't even be a variable in the first place. The fact that the variable exists and has an identifier means that there is *something*. So your argument is moot. T -- Computers shouldn't beep through the keyhole.
Oct 18 2013
next sibling parent reply "Meta" <jared771 gmail.com> writes:
On Friday, 18 October 2013 at 21:15:32 UTC, H. S. Teoh wrote:
 Yes, but if you declare a variable to contain a set, then by 
 definition there is *something*, even if it's an empty set.
Exactly. There is still *something*, even though the set is empty. That is, the set itself.
 For there to be nothing, there shouldn't even be a variable in 
 the first place. The fact that the variable exists and has an 
 identifer means that there is *something*. So your argument is 
 moot.
Not really. Null is a special marker to indicate the absence of a value. There is nothing, as opposed to the previous case.
Oct 18 2013
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Oct 19, 2013 at 12:04:47AM +0200, Meta wrote:
 On Friday, 18 October 2013 at 21:15:32 UTC, H. S. Teoh wrote:
Yes, but if you declare a variable to contain a set, then by
definition there is *something*, even if it's an empty set.
Exactly. There is still *something*, even though the set is empty. That is, the set itself.
For there to be nothing, there shouldn't even be a variable in the
first place. The fact that the variable exists and has an
identifier means that there is *something*. So your argument is
moot.
Not really. Null is a special marker to indicate the absence of a value. There is nothing, as opposed to the previous case.
That's if you consider a set to be a reference type. Then you can say that the reference may be referring to something (which may be empty or not), or it can refer to nothing (null). But if the set is a value type, then there is no such thing as null, only empty. T -- INTEL = Only half of "intelligence".
Oct 18 2013
prev sibling next sibling parent reply "ProgrammingGhost" <dsioafiseghvfawklncfskzdcf sdifjsdiovgfdisjcisj.com> writes:
On Friday, 18 October 2013 at 21:15:32 UTC, H. S. Teoh wrote:
 On Fri, Oct 18, 2013 at 10:04:52PM +0200, Meta wrote:
 On Friday, 18 October 2013 at 19:59:26 UTC, H. S. Teoh wrote:
...because it eliminates an unnecessary distinction between an
empty sequence and a non-existent sequence (which then leads 
to
similar issues one encounters with null pointers).
That just seems silly. Surely we all recognize that there's a difference between the empty set and having no set at all, and that it's valuable to be able to distinguish between the two. The empty set is still a set, while nothing is... nothing.
Yes, but if you declare a variable to contain a set, then by definition there is *something*, even if it's an empty set. For there to be nothing, there shouldn't even be a variable in the first place. The fact that the variable exists and has an identifier means that there is *something*. So your argument is moot. T
I was simply thinking about SDL, where you pass in a rect for the coordinates to blit one surface onto another. Null/0 means copy the whole thing. Rect is an object, but I was wondering about arrays (empty vs. pulling a default from somewhere). That's how I came up with this question, and the point is that I WANT to NOT specify a value so that a DYNAMIC, SUITABLE default value can be used.
Oct 18 2013
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Oct 19, 2013 at 12:45:02AM +0200, ProgrammingGhost wrote:
 On Friday, 18 October 2013 at 21:15:32 UTC, H. S. Teoh wrote:
On Fri, Oct 18, 2013 at 10:04:52PM +0200, Meta wrote:
On Friday, 18 October 2013 at 19:59:26 UTC, H. S. Teoh wrote:
...because it eliminates an unnecessary distinction between an
empty sequence and a non-existent sequence (which then leads to
similar issues one encounters with null pointers).
That just seems silly. Surely we all recognize that there's a difference between the empty set and having no set at all, and that it's valuable to be able to distinguish between the two. The empty set is still a set, while nothing is... nothing.
Yes, but if you declare a variable to contain a set, then by definition there is *something*, even if it's an empty set. For there to be nothing, there shouldn't even be a variable in the first place. The fact that the variable exists and has an identifier means that there is *something*. So your argument is moot. T
I was simply thinking about SDL, where you pass in a rect for the coordinates to blit one surface onto another. Null/0 means copy the whole thing. Rect is an object, but I was wondering about arrays (empty vs. pulling a default from somewhere). That's how I came up with this question, and the point is that I WANT to NOT specify a value so that a DYNAMIC, SUITABLE default value can be used.
You could use T[]* and pass a null pointer as default? T -- What is Matter, what is Mind? Never Mind, it doesn't Matter.
Oct 18 2013
parent "ProgrammingGhost" <dsioafiseghvfawklncfskzdcf sdifjsdiovgfdisjcisj.com> writes:
 You could use T[]* and pass a null pointer as default?
Yet this answer wasn't on the first page. I see I can't write fn([1,2]) anymore so I'm unsure how this solution compares to using Nullable (I can't write fn([1,2]) with nullable either).
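
For comparison, here is a minimal sketch of both options side by side. It assumes a current Phobos (`Nullable` and the `nullable` helper from `std.typecons`); the function names `viaPointer` and `viaNullable` and the `[0, 0]` default are made up for illustration, and a real default would be computed at run time.

```d
import std.stdio;
import std.typecons : Nullable, nullable;

// Option 1: pointer to a slice; a null pointer means "use the default".
int[] viaPointer(int[]* v = null)
{
    // [0, 0] is a hypothetical stand-in for a dynamically chosen default.
    return v is null ? [0, 0] : *v;
}

// Option 2: Nullable wrapper; an unset Nullable means "use the default".
int[] viaNullable(Nullable!(int[]) v = Nullable!(int[]).init)
{
    return v.isNull ? [0, 0] : v.get;
}

void main()
{
    int[] data = [1, 2];
    int[] none = [];

    writeln(viaPointer());        // default is used
    writeln(viaPointer(&data));   // caller's elements
    writeln(viaPointer(&none));   // empty, but explicitly passed

    writeln(viaNullable());                // default is used
    writeln(viaNullable(nullable(data)));  // caller's elements
    writeln(viaNullable(nullable(none)));  // empty, but explicitly passed
}
```

In both cases an explicitly passed empty array stays distinguishable from "nothing passed". The cost is the same in both: the caller needs an lvalue to take the address of, or has to wrap the argument with `nullable(...)`, so a bare `fn([1,2])` no longer works.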
Oct 18 2013
prev sibling parent reply "Jesse Phillips" <Jesse.K.Phillips+D gmail.com> writes:
On Friday, 18 October 2013 at 21:15:32 UTC, H. S. Teoh wrote:
 Yes, but if you declare a variable to contain a set, then by 
 definition
 there is *something*, even if it's an empty set. For there to be
 nothing, there shouldn't even be a variable in the first place. 
 The fact
 that the variable exists and has an identifier means that there 
 is
 *something*. So your argument is moot.


 T
We can declare a variable to contain an object, and there can still be no object there. You're trying to make arrays non-nullable, which I suppose isn't so bad; it is a structure after all. Why do we even allow checking against null? We can't do it with int or bool. (OK, I know, it breaks code.)
Oct 18 2013
parent "bearophile" <bearophileHUGS lycos.com> writes:
Jesse Phillips:

 Why do we even allow checking against null, can't do it
 with int or bool. (ok, I know, breaks code).
Sometimes breaking code is acceptable. Bye, bearophile
Oct 19 2013
prev sibling next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 10/18/2013 09:58 PM, H. S. Teoh wrote:
 To me, these are just implementation details. Conceptually speaking, D
 arrays are actually slices, so that gives them reference semantics.
 Being slices, they refer to zero or more elements, so either their
 length is zero, or not. There is no concept of nullity here. That only
 comes because we chose to implement slices as pointer + length, so
 implementation-wise we can distinguish between a null .ptr and a
 non-null .ptr. But from the conceptual POV, if we consider slices as a
 whole, they are just a sequence of zero or more elements. Null has no
 meaning here.
int[] a = null; // <- :(
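
To make the pointer + length point concrete, here is a small sketch of what the language lets you observe. It uses only built-in slice operations (`is`, `==`, `.length`); the helper name `describe` is hypothetical.

```d
import std.stdio;

// Hypothetical helper: classifies a slice by what `is null` and `.length` report.
string describe(const int[] v)
{
    if (v is null)     return "null";             // null pointer, zero length
    if (v.length == 0) return "empty, non-null";  // zero length, but a real pointer
    return "has elements";
}

void main()
{
    int[] a = null;       // null converts implicitly to an empty slice
    int[] b = [1, 2];
    int[] c = b[0 .. 0];  // empty, but still points into b's storage

    writeln(describe(a)); // null
    writeln(describe(b)); // has elements
    writeln(describe(c)); // empty, non-null

    // `== null` compares contents, so every empty slice matches it.
    writeln(a == null);   // true
    writeln(c == null);   // true
}
```

So `is null` exposes the implementation-level difference, while `== null` and `.length == 0` treat both as simply empty.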
Oct 18 2013
prev sibling parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Fri, 18 Oct 2013 20:58:07 +0100, H. S. Teoh <hsteoh quickfur.ath.cx>  
wrote:

 On Fri, Oct 18, 2013 at 02:04:41PM -0400, Jonathan M Davis wrote:
 On Friday, October 18, 2013 10:38:12 H. S. Teoh wrote:
[...]
 IMO, distinguishing between null and empty arrays is bad
 abstraction. I agree with D's "conflation" of null with empty,
 actually. Conceptually speaking, an array is a sequence of values of
 non-negative length. An array with non-zero length contains at least
 one element, and is therefore non-empty, whereas an array with zero
 length is empty. Same thing goes with a slice. A slice is a view
 into zero or more array elements. A slice with zero length is empty,
 and a slice with non-zero length contains at least one element.
 There's nowhere in this conceptual scheme for such a thing as a
 "null array" that's distinct from an empty array. This distinction
 only crops up in implementation, and IMO leads to code smells
 because code should be operating based on the conceptual behaviour
 of arrays rather than on the implementation details.
In most languages, an array is a reference type, so there's the question of whether it's even _there_. There's a clear distinction between having a null reference to an array and having a reference to an empty array. This is particularly clear in C++ where an array is just a pointer, but it's true in plenty of other languages that don't treat arrays as pointers (e.g. Java).
To me, these are just implementation details. Conceptually speaking, D arrays are actually slices, so that gives them reference semantics. Being slices, they refer to zero or more elements, so either their length is zero, or not. There is no concept of nullity here. That only comes because we chose to implement slices as pointer + length, so implementation-wise we can distinguish between a null .ptr and a non-null .ptr. But from the conceptual POV, if we consider slices as a whole, they are just a sequence of zero or more elements. Null has no meaning here. Put another way, slices themselves are value types, but they refer to their elements by reference. It's a subtle but important difference.
 The problem is that D put the length on the stack alongside the
 pointer, making it so that D arrays are sort of reference types and
 sort of not. The pointer is a reference type, but the length is a
 value type, making the dynamic array half and half. If it were fully a
 reference type, then there would be no problem with distinguishing
 between null and empty arrays. A null array is simply a null reference
 to an array. But since D arrays aren't quite reference types, that
 doesn't work.
[...] I think the issue comes from the preconceived notion acquired from other languages that arrays are some kind of object floating somewhere out there on the heap, for which we have a handle here. Thus we have the notion of null, being the case when we have a handle here but there's actually nothing out there. But if we consider the slice as being a thing right *here* and now, referencing some sequence of elements out there, then we arrive at D's notion of null and empty being the same thing, because while there may be no elements out there being referenced, the handle (i.e. slice) is always *here*. In that sense, there's no distinction between an empty slice and a null slice: either there are elements out there that we're referring to, or there are none. There is no third "null" case. There's no reason why we should adopt the previous notion if this one works just as well, if not better. I argue that the second notion is conceptually cleaner, because it eliminates an unnecessary distinction between an empty sequence and a non-existent sequence (which then leads to similar issues one encounters with null pointers).
If what you say is true then slices would and could never be null... If that were the case I would stop complaining and simply "box" them with Nullable when I wanted a reference type. But, D's strings/slices are some kind of mutant half reference half value type, and that's the underlying problem here. Regan -- Using Opera's revolutionary email client: http://www.opera.com/mail/
Oct 21 2013
next sibling parent "Regan Heath" <regan netmail.co.nz> writes:
On Mon, 21 Oct 2013 11:58:07 +0100, Regan Heath <regan netmail.co.nz>  
wrote:

 On Fri, 18 Oct 2013 20:58:07 +0100, H. S. Teoh <hsteoh quickfur.ath.cx>  
 wrote:

 On Fri, Oct 18, 2013 at 02:04:41PM -0400, Jonathan M Davis wrote:
 On Friday, October 18, 2013 10:38:12 H. S. Teoh wrote:
[...]
 IMO, distinguishing between null and empty arrays is bad
 abstraction. I agree with D's "conflation" of null with empty,
 actually. Conceptually speaking, an array is a sequence of values of
 non-negative length. An array with non-zero length contains at least
 one element, and is therefore non-empty, whereas an array with zero
 length is empty. Same thing goes with a slice. A slice is a view
 into zero or more array elements. A slice with zero length is empty,
 and a slice with non-zero length contains at least one element.
 There's nowhere in this conceptual scheme for such a thing as a
 "null array" that's distinct from an empty array. This distinction
 only crops up in implementation, and IMO leads to code smells
 because code should be operating based on the conceptual behaviour
 of arrays rather than on the implementation details.
In most languages, an array is a reference type, so there's the question of whether it's even _there_. There's a clear distinction between having a null reference to an array and having a reference to an empty array. This is particularly clear in C++ where an array is just a pointer, but it's true in plenty of other languages that don't treat arrays as pointers (e.g. Java).
To me, these are just implementation details. Conceptually speaking, D arrays are actually slices, so that gives them reference semantics. Being slices, they refer to zero or more elements, so either their length is zero, or not. There is no concept of nullity here. That only comes because we chose to implement slices as pointer + length, so implementation-wise we can distinguish between a null .ptr and a non-null .ptr. But from the conceptual POV, if we consider slices as a whole, they are just a sequence of zero or more elements. Null has no meaning here. Put another way, slices themselves are value types, but they refer to their elements by reference. It's a subtle but important difference.
 The problem is that D put the length on the stack alongside the
 pointer, making it so that D arrays are sort of reference types and
 sort of not. The pointer is a reference type, but the length is a
 value type, making the dynamic array half and half. If it were fully a
 reference type, then there would be no problem with distinguishing
 between null and empty arrays. A null array is simply a null reference
 to an array. But since D arrays aren't quite reference types, that
 doesn't work.
[...] I think the issue comes from the preconceived notion acquired from other languages that arrays are some kind of object floating somewhere out there on the heap, for which we have a handle here. Thus we have the notion of null, being the case when we have a handle here but there's actually nothing out there. But if we consider the slice as being a thing right *here* and now, referencing some sequence of elements out there, then we arrive at D's notion of null and empty being the same thing, because while there may be no elements out there being referenced, the handle (i.e. slice) is always *here*. In that sense, there's no distinction between an empty slice and a null slice: either there are elements out there that we're referring to, or there are none. There is no third "null" case. There's no reason why we should adopt the previous notion if this one works just as well, if not better. I argue that the second notion is conceptually cleaner, because it eliminates an unnecessary distinction between an empty sequence and a non-existent sequence (which then leads to similar issues one encounters with null pointers).
If what you say is true then slices would and could never be null...
Aargh, my apologies, I misread your post. Ignore my first reply.

I agree that slices that can never be null are like a pre-null-checked array, which is a good thing. The issue I have had in the past is with strings (not slices) mutating from null to empty and/or vice versa.

Also, it's not at all clear when you're dealing with a pre-checked, non-null slice and when you're dealing with a possibly null array. For example:

	import std.stdio;

	void foo(string arr)
	{
		if (arr is null)
			writefln("null");
		else
			writefln("not null");

		if (arr.length == 0)
			writefln("empty");
		else
			writefln("not empty");
	}

	void main()
	{
		string arr;
		foo(arr);
		foo(arr[0..$]);

		arr = "";
		foo(arr);
		foo(arr[0..$]);
	}

Output:

	null
	empty
	null
	empty
	not null
	empty
	not null
	empty

Which of those are strings/arrays and which are slices? Why are the ones formed by actually slicing coming up as "is null"?

(This last, not directed at you, just venting..)

I can understand arguing against null from a safety point of view. I can understand arguing against designs that use null, for the same reasons. I disagree, but then I have comfortably used null for a long time, so the cost/benefit of using null is heavily on the benefit side for me. I can understand that for others this may not be the case.

But I cannot understand someone who says they have no use for the concept of non-existence, or that no code will ever want to make the distinction; that is just plainly incorrect. Implementing a singleton pattern (probably a bad example :p) relies on being able to check for non-existence, using null as the indicator; we do it all the time.

Regan
Oct 21 2013
prev sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday, October 21, 2013 11:58:07 Regan Heath wrote:
 If what you say is true then slices would and could never be null... If
 that were the case I would stop complaining and simply "box" them with
 Nullable when I wanted a reference type.  But, D's strings/slices are some
 kind of mutant half reference half value type, and that's the underlying
 problem here.
Yeah, dynamic arrays in D are just plain weird. They're halfway between reference types and value types, and it definitely causes confusion, and it totally screws with null (which definitely sucks). But they mostly work really well the way that they are, and in general, the way that slices work works really well. So, I don't know if what we have is ultimately the right design or not. I definitely don't like how null works for arrays though. Given how they work, we probably would have been better off if they couldn't be null. The ptr obviously could be null, but the array itself arguably shouldn't be able to be null. If we did that, then it would be clear that null wouldn't work with arrays, and no one would try. It would still kind of suck, since you wouldn't have null, but then at least it would be clear that null wouldn't work with arrays instead of having a situation where it kind of does and kind of doesn't. - Jonathan M Davis
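
A short sketch of the "halfway" behaviour described above, using only built-in slice operations. The function name `modify` is made up for illustration.

```d
import std.stdio;

// The slice itself (pointer + length) is copied into `s`;
// the elements it points at are shared with the caller.
void modify(int[] s)
{
    s[0] = 99;  // writes through the shared pointer, so the caller sees it
    s ~= 4;     // grows only the local copy; the caller's length is untouched
}

void main()
{
    int[] a = [1, 2, 3];
    modify(a);
    writeln(a);        // [99, 2, 3]
    writeln(a.length); // 3
}
```

The element write behaves like a reference type and the append behaves like a value type, which is the same split that makes null on arrays so awkward.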
Oct 21 2013
parent "Regan Heath" <regan netmail.co.nz> writes:
On Mon, 21 Oct 2013 12:54:56 +0100, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 On Monday, October 21, 2013 11:58:07 Regan Heath wrote:
 If what you say is true then slices would and could never be null... If
 that were the case I would stop complaining and simply "box" them with
 Nullable when I wanted a reference type.  But, D's strings/slices are  
 some
 kind of mutant half reference half value type, and that's the underlying
 problem here.
Yeah, dynamic arrays in D are just plain weird. They're halfway between reference types and value types, and it definitely causes confusion, and it totally screws with null (which definitely sucks). But they mostly work really well the way that they are, and in general, the way that slices work works really well. So, I don't know if what we have is ultimately the right design or not. I definitely don't like how null works for arrays though. Given how they work, we probably would have been better off if they couldn't be null. The ptr obviously could be null, but the array itself arguably shouldn't be able to be null. If we did that, then it would be clear that null wouldn't work with arrays, and no one would try. It would still kind of suck, since you wouldn't have null, but then at least it would be clear that null wouldn't work with arrays instead of having a situation where it kind of does and kind of doesn't.
Agreed. This is preferable to the current situation, even if it's not my personal preferred solution. R -- Using Opera's revolutionary email client: http://www.opera.com/mail/
Oct 21 2013