digitalmars.D - Empty VS null array?

"ProgrammingGhost" <dsioafiseghvfawklncfskzdcf sdifjsdiovgfdisjcisj.com> writes:

How do I find out if null was passed in? As you can guess I 
wasn't happy with the current behavior.

Code:

	import std.stdio;

	void main() {

		fn([1,2]);
		fn(null);
		fn([]);
	}
	void fn(int[] v) {
		writeln("-");
		if(v==null)
			writeln("Use default");
		foreach(e; v)
			writeln(e);
	}

Output

	-
	1
	2
	-
	Use default
	-
	Use default

Oct 17 2013

"ProgrammingGhost" <dsioafiseghvfawklncfskzdcf sdifjsdiovgfdisjcisj.com> writes:

Sorry I misspoke. I meant to say empty array or not null passed 
in. The 3rd call to fn is what I didn't like.

Oct 17 2013

"anonymous" <anonymous example.com> writes:

On Thursday, 17 October 2013 at 22:50:22 UTC, ProgrammingGhost 
wrote:
 How do I find out if null was passed in? As you can guess I 
 wasn't happy with the current behavior.

 Code:

 	import std.stdio;

 	void main() {

 		fn([1,2]);
 		fn(null);
 		fn([]);
 	}
 	void fn(int[] v) {
 		writeln("-");
 		if(v==null)
 			writeln("Use default");
 		foreach(e; v)
 			writeln(e);
 	}

 Output

 	-
 	1
 	2
 	-
 	Use default
 	-
 	Use default

On Thursday, 17 October 2013 at 22:51:24 UTC, ProgrammingGhost 
wrote:
 Sorry I misspoke. I meant to say empty array or not null passed 
 in. The 3rd call to fn is what I didn't like.

null implicitly converts to []. You can't distinguish them in fn.

You could add an overload for typeof(null), but that only catches 
the literal null, probably not what you'd expect:

import std.stdio;
void fn(typeof(null) v) {
	writeln("-");
	writeln("Use default");
}
void fn(int[] v) {
	writeln("-");
	foreach(e; v)
		writeln(e);
}
void main() {
	fn([1,2]);
	fn(null);
	fn([]);
	int[] x = null;
	fn(x);
}
----
-
1
2
-
Use default
-
-

Oct 17 2013

"ProgrammingGhost" <dsioafiseghvfawklncfskzdcf sdifjsdiovgfdisjcisj.com> writes:

On Thursday, 17 October 2013 at 23:14:51 UTC, anonymous wrote:
 On Thursday, 17 October 2013 at 22:50:22 UTC, ProgrammingGhost 
 wrote:
 How do I find out if null was passed in? As you can guess I 
 wasn't happy with the current behavior.

 Code:

 	import std.stdio;

 	void main() {

 		fn([1,2]);
 		fn(null);
 		fn([]);
 	}
 	void fn(int[] v) {
 		writeln("-");
 		if(v==null)
 			writeln("Use default");
 		foreach(e; v)
 			writeln(e);
 	}

 Output

 	-
 	1
 	2
 	-
 	Use default
 	-
 	Use default

 On Thursday, 17 October 2013 at 22:51:24 UTC, ProgrammingGhost 
 wrote:
 Sorry I misspoke. I meant to say empty array or not null 
 passed in. The 3rd call to fn is what I didn't like.

 null implicitly converts to []. You can't distinguish them in 
 fn.

 You could add an overload for typeof(null), but that only 
 catches the literal null, probably not what you'd expect:

 import std.stdio;
 void fn(typeof(null) v) {
 	writeln("-");
 	writeln("Use default");
 }
 void fn(int[] v) {
 	writeln("-");
 	foreach(e; v)
 		writeln(e);
 }
 void main() {
 	fn([1,2]);
 	fn(null);
 	fn([]);
 	int[] x = null;
 	fn(x);
 }
 ----
 -
 1
 2
 -
 Use default
 -
 -

Overloads are acceptable. But that behavior is odd, although I do 
understand it's being passed by value. I guess I have to suck it 
up and hope this behavior doesn't give me problems.

Oct 17 2013

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Thursday, 17 October 2013 at 22:50:22 UTC, ProgrammingGhost 
wrote:
 How do I find out if null was passed in?

try if(v is null) { use default }

if all you care about is if there's contents, I like to use 
if(v.length) {}

Oct 17 2013

"deadalnix" <deadalnix gmail.com> writes:

On Thursday, 17 October 2013 at 23:00:12 UTC, Adam D. Ruppe wrote:
 On Thursday, 17 October 2013 at 22:50:22 UTC, ProgrammingGhost 
 wrote:
 How do I find out if null was passed in?

 try if(v is null) { use default }

 if all you care about is if there's contents, I like to use 
 if(v.length) {}

Which is ultimately wrong as equality shouldn't test for identity.

Oct 17 2013

"ProgrammingGhost" <dsioafiseghvfawklncfskzdcf sdifjsdiovgfdisjcisj.com> writes:

On Thursday, 17 October 2013 at 23:00:12 UTC, Adam D. Ruppe wrote:
 On Thursday, 17 October 2013 at 22:50:22 UTC, ProgrammingGhost 
 wrote:
 How do I find out if null was passed in?

 try if(v is null) { use default }

 if all you care about is if there's contents, I like to use 
 if(v.length) {}

is null still treats [] as null. I tried && !is [] for fun and it 
didn't work either (null is [])

Oct 17 2013

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Thursday, 17 October 2013 at 23:12:03 UTC, ProgrammingGhost 
wrote:
 is null still treats [] as null.

blah, you're right. It will at least distinguish it from an empty 
slice though (like arr[$..$]). I don't think there's any way to 
tell [] from null except typeof(null) at all. At runtime they're 
both the same: no contents, so null pointer and zero length.
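
A minimal sketch of the three checks mentioned so far (`is null`, `== null` and `.length`), applied to null, `[]` and an empty slice such as `arr[$..$]`. The helper name `describe` is made up for illustration and is not from the thread.

```d
import std.stdio;

/// Hypothetical helper: classifies a slice the ways discussed above.
string describe(const int[] v)
{
    if (v is null) return "null";        // identity check: null pointer, zero length
    if (v.length == 0) return "empty";   // non-null pointer, but no elements
    return "has elements";
}

void main()
{
    int[] arr = [1, 2];
    writeln(describe(null));        // null
    writeln(describe([]));          // null: the empty literal is the same as null here
    writeln(describe(arr[$ .. $])); // empty
    writeln(describe(arr));         // has elements

    // `==` compares contents, so any empty slice compares equal to null,
    // even one that `is` can tell apart from null.
    assert(arr[$ .. $] == null);
    assert(arr[$ .. $] !is null);
}
```

So `is null` separates null from a zero-length slice of an existing array, but not from the `[]` literal, while `== null` and `.length` only answer "are there any elements?".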

Oct 17 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Oct 18, 2013 at 01:27:33AM +0200, Adam D. Ruppe wrote:
 On Thursday, 17 October 2013 at 23:12:03 UTC, ProgrammingGhost
 wrote:
is null still treats [] as null.

 
 blah, you're right. It will at least distinguish it from an empty
 slice though (like arr[$..$]). I don't think there's any way to tell
 [] from null except typeof(null) at all. At runtime they're both the
 same: no contents, so null pointer and zero length.

I think it's a mistake to rely on the distinction between null and
non-null but empty arrays in D. They should be regarded as
implementation details that user code shouldn't depend on. If you need
to distinguish between arrays that are empty and arrays that are null,
consider using Nullable!(T[]) instead.
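
A minimal sketch of the Nullable!(T[]) suggestion, assuming std.typecons.Nullable and reusing the name `fn` from the original question. The return strings are illustrative only.

```d
import std.stdio;
import std.typecons : Nullable;

/// Hypothetical variant of fn: "nothing passed" is tracked by the
/// Nullable wrapper rather than by the slice's pointer.
string fn(Nullable!(int[]) v)
{
    if (v.isNull)
        return "use default";
    if (v.get.length == 0)
        return "empty";
    return "has elements";
}

void main()
{
    int[] empty = [];
    writeln(fn(Nullable!(int[]).init));     // use default
    writeln(fn(Nullable!(int[])(empty)));   // empty
    writeln(fn(Nullable!(int[])([1, 2])));  // has elements
}
```

The wrapper keeps its own "is null" flag, so an explicitly passed empty array stays distinguishable even though the array itself compares equal to null.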


T

-- 
Curiosity kills the cat. Moral: don't be the cat.

Oct 17 2013

"Regan Heath" <regan netmail.co.nz> writes:

On Fri, 18 Oct 2013 00:32:46 +0100, H. S. Teoh <hsteoh quickfur.ath.cx>  
wrote:

 On Fri, Oct 18, 2013 at 01:27:33AM +0200, Adam D. Ruppe wrote:
 On Thursday, 17 October 2013 at 23:12:03 UTC, ProgrammingGhost
 wrote:
is null still treats [] as null.

 blah, you're right. It will at least distinguish it from an empty
 slice though (like arr[$..$]). I don't think there's any way to tell
 [] from null except typeof(null) at all. At runtime they're both the
 same: no contents, so null pointer and zero length.

 I think it's a mistake to rely on the distinction between null and
 non-null but empty arrays in D. They should be regarded as
 implementation details that user code shouldn't depend on. If you need
 to distinguish between arrays that are empty and arrays that are null,
 consider using Nullable!(T[]) instead.

This comes up time and again.  The use of, and ability to distinguish  
empty from null is very useful.  Yes, you run the risk of things like null  
pointer exceptions etc, but we have that risk now without the reward of  
being able to distinguish these cases.

Take this simple design:

   string readline();

This function would like to be able to:
  - return null for EOF
  - return [] for a blank line

but it cannot, because as soon as you write:

   foo(readline())

the null/[] case merges.

There are plenty of other such design/cases that can be imagined, and  
while you can work around them all they add complexity for zero gain.

A simple pointer can do this.. string cannot, this is sad.

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Oct 18 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 10/18/13 3:44 AM, Regan Heath wrote:
 On Fri, 18 Oct 2013 00:32:46 +0100, H. S. Teoh <hsteoh quickfur.ath.cx>
 wrote:

 On Fri, Oct 18, 2013 at 01:27:33AM +0200, Adam D. Ruppe wrote:
 On Thursday, 17 October 2013 at 23:12:03 UTC, ProgrammingGhost
 wrote:
is null still treats [] as null.

 blah, you're right. It will at least distinguish it from an empty
 slice though (like arr[$..$]). I don't think there's any way to tell
 [] from null except typeof(null) at all. At runtime they're both the
 same: no contents, so null pointer and zero length.

 I think it's a mistake to rely on the distinction between null and
 non-null but empty arrays in D. They should be regarded as
 implementation details that user code shouldn't depend on. If you need
 to distinguish between arrays that are empty and arrays that are null,
 consider using Nullable!(T[]) instead.

 This comes up time and again.  The use of, and ability to distinguish
 empty from null is very useful.

I disagree.

 Yes, you run the risk of things like
 null pointer exceptions etc, but we have that risk now without the
 reward of being able to distinguish these cases.

 Take this simple design:

    string readline();

 This function would like to be able to:
   - return null for EOF
   - return [] for a blank line

That's bad API design, pure and simple. The function should e.g. return 
the string including the line terminator, and only return an empty (or 
null) string upon EOF.
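
A rough sketch of that alternative, reading from an in-memory string rather than a file so that it stays self-contained. The helper `nextLine` is hypothetical and is not a Phobos function.

```d
import std.stdio;
import std.string : indexOf;

/// Hypothetical helper: removes one line from the front of `input` and
/// returns it with its terminator. Returns an empty string only when
/// no input is left.
string nextLine(ref string input)
{
    auto i = input.indexOf('\n');
    auto end = (i < 0) ? input.length : cast(size_t) i + 1;
    auto line = input[0 .. end];
    input = input[end .. $];
    return line;
}

void main()
{
    string text = "first\n\nlast";
    string line;
    // A blank line comes back as "\n" (length 1), so length 0 can only mean EOF.
    while ((line = nextLine(text)).length != 0)
        writeln(line.length);
}
```

With this shape the caller never has to tell null from empty; the cost is that every line carries its terminator.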


Andrei

Oct 18 2013

"Max Samukha" <maxsamukha gmail.com> writes:

On Friday, 18 October 2013 at 15:42:56 UTC, Andrei Alexandrescu 
wrote:
 On 10/18/13 3:44 AM, Regan Heath wrote:
 On Fri, 18 Oct 2013 00:32:46 +0100, H. S. Teoh 
 <hsteoh quickfur.ath.cx>
 wrote:

 On Fri, Oct 18, 2013 at 01:27:33AM +0200, Adam D. Ruppe wrote:
 On Thursday, 17 October 2013 at 23:12:03 UTC, 
 ProgrammingGhost
 wrote:
is null still treats [] as null.

 blah, you're right. It will at least distinguish it from an 
 empty
 slice though (like arr[$..$]). I don't think there's any way 
 to tell
 [] from null except typeof(null) at all. At runtime they're 
 both the
 same: no contents, so null pointer and zero length.

 I think it's a mistake to rely on the distinction between 
 null and
 non-null but empty arrays in D. They should be regarded as
 implementation details that user code shouldn't depend on. If 
 you need
 to distinguish between arrays that are empty and arrays that 
 are null,
 consider using Nullable!(T[]) instead.

 This comes up time and again.  The use of, and ability to 
 distinguish
 empty from null is very useful.

 I disagree.

 Yes, you run the risk of things like
 null pointer exceptions etc, but we have that risk now without 
 the
 reward of being able to distinguish these cases.

 Take this simple design:

   string readline();

 This function would like to be able to:
  - return null for EOF
  - return [] for a blank line

 That's bad API design, pure and simple. The function should 
 e.g. return the string including the line terminator, and only 
 return an empty (or null) string upon EOF.


 Andrei

*That's* bad API design. readln should be symmetrical to writeln, 
not write. And about preserving the exact representation of new 
lines, readln/writeln shouldn't preserve that, pure and simple.

Oct 18 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 10/18/13 9:26 AM, Max Samukha wrote:
 *That's* bad API design. readln should be symmetrical to writeln, not
 write. And about preserving the exact representation of new lines,
 readln/writeln shouldn't preserve that, pure and simple.

Fair point. I just gave one possible alternative out of many. Thing is, 
relying on client code to distinguish subtleties between empty and null 
strings is fraught with dangers.

Andrei

Oct 18 2013

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Friday, October 18, 2013 09:55:46 Andrei Alexandrescu wrote:
 On 10/18/13 9:26 AM, Max Samukha wrote:
 *That's* bad API design. readln should be symmetrical to writeln, not
 write. And about preserving the exact representation of new lines,
 readln/writeln shouldn't preserve that, pure and simple.

 
 Fair point. I just gave one possible alternative out of many. Thing is,
 relying on client code to distinguish subtleties between empty and null
 strings is fraught with dangers.

Yeah, but the primary reason that it's bad design is the fact that D tries to 
conflate null and empty instead of keeping them distinct (which is essentially 
the complaint that was made). Whether that's ultimately good or bad is up for 
debate, but the side effect is that relying on the difference between null and 
empty ends up being very bug-prone, whereas in other languages which don't 
conflate the two, it isn't problematic in the same way, and it's much more 
reasonable to have the API treat them differently.

- Jonathan M Davis

Oct 18 2013

Shammah Chancellor <s s.com> writes:

On 2013-10-18 17:32:58 +0000, Jonathan M Davis said:

 On Friday, October 18, 2013 09:55:46 Andrei Alexandrescu wrote:
 On 10/18/13 9:26 AM, Max Samukha wrote:
 *That's* bad API design. readln should be symmetrical to writeln, not
 write. And about preserving the exact representation of new lines,
 readln/writeln shouldn't preserve that, pure and simple.

 
 Fair point. I just gave one possible alternative out of many. Thing is,
 relying on client code to distinguish subtleties between empty and null
 strings is fraught with dangers.

 
 Yeah, but the primary reason that it's bad design is the fact that D tries to
 conflate null and empty instead of keeping them distinct (which is essentially
 the complaint that was made). Whether that's ultimately good or bad is up for
 debate, but the side effect is that relying on the difference between null and
 empty ends up being very bug-prone, whereas in other languages which don't
 conflate the two, it isn't problematic in the same way, and it's much more
 reasonable to have the API treat them differently.
 
 - Jonathan M Davis

Null and the Empty Set are different entities.   A set containing 
exactly nothing, vs undefined.   However, null is not handled properly 
in D or any other systems language since it's simply a pointer with 
value = 0.  if (null == 0) is a true statement in C, C++, and D, but is 
not in fact true.  Null is neither equal to zero, nor not equal to zero.

Oct 25 2013

Timon Gehr <timon.gehr gmx.ch> writes:

On 10/25/2013 11:02 PM, Shammah Chancellor wrote:
 ... null == 0 ... in C, C++, and D,

Check again.

Oct 26 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Oct 18, 2013 at 01:32:58PM -0400, Jonathan M Davis wrote:
 On Friday, October 18, 2013 09:55:46 Andrei Alexandrescu wrote:
 On 10/18/13 9:26 AM, Max Samukha wrote:
 *That's* bad API design. readln should be symmetrical to writeln,
 not write. And about preserving the exact representation of new
 lines, readln/writeln shouldn't preserve that, pure and simple.

 
 Fair point. I just gave one possible alternative out of many. Thing
 is, relying on client code to distinguish subtleties between empty
 and null strings is fraught with dangers.

 
 Yeah, but the primary reason that it's bad design is the fact that D
 tries to conflate null and empty instead of keeping them distinct
 (which is essentially the complaint that was made). Whether that's
 ultimately good or bad is up for debate, but the side effect is that
 relying on the difference between null and empty ends up being very
 bug-prone, whereas in other languages which don't conflate the two, it
 isn't problematic in the same way, and it's much more reasonable to
 have the API treat them differently.

[...]

IMO, distinguishing between null and empty arrays is bad abstraction.  I
agree with D's "conflation" of null with empty, actually. Conceptually
speaking, an array is a sequence of values of non-negative length. An
array with non-zero length contains at least one element, and is
therefore non-empty, whereas an array with zero length is empty. Same
thing goes with a slice. A slice is a view into zero or more array
elements. A slice with zero length is empty, and a slice with non-zero
length contains at least one element. There's nowhere in this conceptual
scheme for such a thing as a "null array" that's distinct from an empty
array. This distinction only crops up in implementation, and IMO leads
to code smells because code should be operating based on the conceptual
behaviour of arrays rather than on the implementation details.


T

-- 
The most powerful one-line C program: #include "/dev/tty" -- IOCCC

Oct 18 2013

"Regan Heath" <regan netmail.co.nz> writes:

On Fri, 18 Oct 2013 18:38:12 +0100, H. S. Teoh <hsteoh quickfur.ath.cx>  
wrote:

 On Fri, Oct 18, 2013 at 01:32:58PM -0400, Jonathan M Davis wrote:
 On Friday, October 18, 2013 09:55:46 Andrei Alexandrescu wrote:
 On 10/18/13 9:26 AM, Max Samukha wrote:
 *That's* bad API design. readln should be symmetrical to writeln,
 not write. And about preserving the exact representation of new
 lines, readln/writeln shouldn't preserve that, pure and simple.

 Fair point. I just gave one possible alternative out of many. Thing
 is, relying on client code to distinguish subtleties between empty
 and null strings is fraught with dangers.

 Yeah, but the primary reason that it's bad design is the fact that D
 tries to conflate null and empty instead of keeping them distinct
 (which is essentially the complaint that was made). Whether that's
 ultimately good or bad is up for debate, but the side effect is that
 relying on the difference between null and empty ends up being very
 bug-prone, whereas in other languages which don't conflate the two, it
 isn't problematic in the same way, and it's much more reasonable to
 have the API treat them differently.

 [...]

 Conceptually
 speaking, an array is a sequence of values of non-negative length. An
 array with non-zero length contains at least one element, and is
 therefore non-empty, whereas an array with zero length is empty. Same
 thing goes with a slice. A slice is a view into zero or more array
 elements. A slice with zero length is empty, and a slice with non-zero
 length contains at least one element.

This describes the empty/not empty distinction.

 There's nowhere in this conceptual
 scheme for such a thing as a "null array" that's distinct from an empty
 array.

And this is the problem/complaint.  You cannot represent specified/not  
specified, you can only represent empty/not empty.

I agree you cannot logically have an existing array that is somehow a  
"null array" and distinct/different from an empty array, but that's not  
what I want/am asking for.  I want to use an array 'reference' to  
represent that the array is non existent, has not been set, has not been  
defined, etc.  This is what null is for.

 This distinction only crops up in implementation, and IMO leads
 to code smells because code should be operating based on the conceptual
 behaviour of arrays rather than on the implementation details.

It is not an implementation detail, it's a conceptual difference.  A  
reference type has the power to represent specified/not specified in  
addition to referring to an array which is empty/not empty.  A value type,  
like int, cannot do the same thing without either boxing (into a reference  
type, whose reference can be null) or by giving up one of it's values  
(i.e. 0) and pretending it's something special.

This is what D's string has done with empty, it is pretending that it is  
special and means "not specified", and because it converts null into  
empty, that means we cannot rely on empty really being empty (as in the  
user wants the value set to empty), as it might also be a value the user  
did not specify.

It's actually a fairly simple distinction I want to be able to make.  If  
you get input from a user a field called "foo" may be:
  - not specified
  - specified

and if specified, may be:
  - empty
  - not empty

null allows us the specified/not specified distinction.

Regan

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Oct 21 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Mon, Oct 21, 2013 at 11:53:44AM +0100, Regan Heath wrote:
 On Fri, 18 Oct 2013 18:38:12 +0100, H. S. Teoh
 <hsteoh quickfur.ath.cx> wrote:

[...]
Conceptually speaking, an array is a sequence of values of
non-negative length. An array with non-zero length contains at least
one element, and is therefore non-empty, whereas an array with zero
length is empty. Same thing goes with a slice. A slice is a view into
zero or more array elements. A slice with zero length is empty, and a
slice with non-zero length contains at least one element.

 
 This describes the empty/not empty distinction.
 
There's nowhere in this conceptual scheme for such a thing as a "null
array" that's distinct from an empty array.

 
 And this is the problem/complaint.  You cannot represent specified/not
 specified, you can only represent empty/not empty.
 
 I agree you cannot logically have an existing array that is somehow a
 "null array" and distinct/different from an empty array, but that's
 not what I want/am asking for.  I want to use an array 'reference' to
 represent that the array is non existent, has not been set, has not
 been defined, etc.  This is what null is for.

The thing is, D slices are value types even though the elements they
point to are pointed to by reference. If you treat slices (slices
themselves, that is, not the elements they refer to) as value types,
then the problem goes away. If you want to have a *reference* to a
slice, then you simply write T[]* and then it becomes nullable as
expected.

I do agree that the current situation is confusing, though, mainly
because you can write `if (arr is null)`, which then makes you think of
it as a reference type. I think that should be prohibited, and slices
should be treated as pure value types, and all comparisons should be
checked with .length (or .empty if you import std.range).
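
A minimal sketch of the T[]* idea, reusing the name `fn` from the original question with a null pointer as the default argument. The return strings are illustrative only.

```d
import std.stdio;

/// Hypothetical fn: a null pointer means "no array given"; a non-null
/// pointer may still refer to an empty slice.
string fn(int[]* v = null)
{
    if (v is null)
        return "use default";
    if ((*v).length == 0)
        return "empty";
    return "has elements";
}

void main()
{
    int[] none;
    int[] some = [1, 2];
    writeln(fn());        // use default
    writeln(fn(&none));   // empty
    writeln(fn(&some));   // has elements
}
```

Here the slice itself is only ever tested with .length; the "was anything passed?" question is answered entirely by the pointer.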


T

-- 
Кто везде - тот нигде.

Oct 21 2013

"Regan Heath" <regan netmail.co.nz> writes:

On Mon, 21 Oct 2013 15:01:04 +0100, H. S. Teoh <hsteoh quickfur.ath.cx>  
wrote:

 On Mon, Oct 21, 2013 at 11:53:44AM +0100, Regan Heath wrote:
 On Fri, 18 Oct 2013 18:38:12 +0100, H. S. Teoh
 <hsteoh quickfur.ath.cx> wrote:

 [...]
Conceptually speaking, an array is a sequence of values of
non-negative length. An array with non-zero length contains at least
one element, and is therefore non-empty, whereas an array with zero
length is empty. Same thing goes with a slice. A slice is a view into
zero or more array elements. A slice with zero length is empty, and a
slice with non-zero length contains at least one element.

 This describes the empty/not empty distinction.

There's nowhere in this conceptual scheme for such a thing as a "null
array" that's distinct from an empty array.

 And this is the problem/complaint.  You cannot represent specified/not
 specified, you can only represent empty/not empty.

 I agree you cannot logically have an existing array that is somehow a
 "null array" and distinct/different from an empty array, but that's
 not what I want/am asking for.  I want to use an array 'reference' to
 represent that the array is non existent, has not been set, has not
 been defined, etc.  This is what null is for.

 The thing is, D slices are value types even though the elements they
 point to are pointed to by reference. If you treat slices (slices
 themselves, that is, not the elements they refer to) as value types,
 then the problem goes away. If you want to have a *reference* to a
 slice, then you simply write T[]* and then it becomes nullable as
 expected.

True, and that's a pointer, and I am comfortable using pointers.. however  
I worry this will limit the compiler's ability to optimise somehow.. and  
doesn't it make the code immediately un"safe"?

 I do agree that the current situation is confusing, though, mainly
 because you can write `if (arr is null)`, which then makes you think of
 it as a reference type. I think that should be prohibited, and slices
 should be treated as pure value types, and all comparisons should be
 checked with .length (or .empty if you import std.range).

IMO, this would be preferable to the current situation even though I  
would rather go the other way and have a reference type.  I can see the  
argument that it would be safer and easier for most users, even though I  
do not believe I am in that category.

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Oct 21 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Mon, Oct 21, 2013 at 04:41:23PM +0100, Regan Heath wrote:
 On Mon, 21 Oct 2013 15:01:04 +0100, H. S. Teoh
 <hsteoh quickfur.ath.cx> wrote:
 
On Mon, Oct 21, 2013 at 11:53:44AM +0100, Regan Heath wrote:


[...]
I agree you cannot logically have an existing array that is somehow
a "null array" and distinct/different from an empty array, but
that's not what I want/am asking for.  I want to use an array
'reference' to represent that the array is non existent, has not
been set, has not been defined, etc.  This is what null is for.

The thing is, D slices are value types even though the elements they
point to are pointed to by reference. If you treat slices (slices
themselves, that is, not the elements they refer to) as value types,
then the problem goes away. If you want to have a *reference* to a
slice, then you simply write T[]* and then it becomes nullable as
expected.

 
 True, and that's a pointer, and I am comfortable using pointers..
 however I worry this will limit the compiler's ability to optimise
 somehow.. and doesn't it make the code immediately un"safe"?

No, pointers are allowed in @safe. What is not allowed is pointer
*arithmetic* and casting pointers into pointers of different types.
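
A small sketch of a @safe function that accepts and dereferences a pointer to a slice. The name `lengthOrZero` is made up for illustration.

```d
import std.stdio;

/// Hypothetical helper: checking and dereferencing a pointer is
/// accepted in @safe code. Pointer arithmetic is the part that is
/// not allowed, as noted above.
size_t lengthOrZero(const(int[])* p) @safe
{
    return p is null ? 0 : (*p).length;
}

void main()
{
    int[] data = [1, 2, 3];
    // The address of the local is taken here in main, which is not marked @safe.
    writeln(lengthOrZero(&data)); // 3
    writeln(lengthOrZero(null));  // 0
}
```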


I do agree that the current situation is confusing, though, mainly
because you can write `if (arr is null)`, which then makes you think
of it as a reference type. I think that should be prohibited, and
slices should be treated as pure value types, and all comparisons
should be checked with .length (or .empty if you import std.range).

 
 IMO, this would be preferable to the current situation even though I
 would rather go the other way and have a reference type.  I can see
 the argument that it would be safer and easier for most users, even
 though I do not believe I am in that category.

[...]

Well, either way would work, though I do prefer treating slices as value
types. It's just cleaner conceptually, IMO. But I suppose this is one of
those things in which reasonable people may disagree.


T

-- 
Sometimes the best solution to morale problems is just to fire all of the
unhappy people. -- despair.com

Oct 21 2013

"Regan Heath" <regan netmail.co.nz> writes:

On Mon, 21 Oct 2013 17:34:51 +0100, H. S. Teoh <hsteoh quickfur.ath.cx>  
wrote:

 On Mon, Oct 21, 2013 at 04:41:23PM +0100, Regan Heath wrote:
 On Mon, 21 Oct 2013 15:01:04 +0100, H. S. Teoh
 <hsteoh quickfur.ath.cx> wrote:

On Mon, Oct 21, 2013 at 11:53:44AM +0100, Regan Heath wrote:


 [...]
I agree you cannot logically have an existing array that is somehow
a "null array" and distinct/different from an empty array, but
that's not what I want/am asking for.  I want to use an array
'reference' to represent that the array is non existent, has not
been set, has not been defined, etc.  This is what null is for.

The thing is, D slices are value types even though the elements they
point to are pointed to by reference. If you treat slices (slices
themselves, that is, not the elements they refer to) as value types,
then the problem goes away. If you want to have a *reference* to a
slice, then you simply write T[]* and then it becomes nullable as
expected.

 True, and that's a pointer, and I am comfortable using pointers..
 however I worry this will limit the compilers ability to optimise
 somehow.. and doesn't it make the code immediately un"safe"?

 No, pointers are allowed in @safe. What is not allowed is pointer
 *arithmetic* and casting pointers into pointers of different types.

Ah, thanks.

I do agree that the current situation is confusing, though, mainly
because you can write `if (arr is null)`, which then makes you think
of it as a reference type. I think that should be prohibited, and
slices should be treated as pure value types, and all comparisons
should be checked with .length (or .empty if you import std.range).

 IMO, this would be preferable to the current situation even though I
 would rather go the other way and have a reference type.  I can see
 the argument that it would be safer and easier for most users, even
 though I do not believe I am in that category.

 [...]

 Well, either way would work, though I do prefer treating slices as value
 types. It's just cleaner conceptually, IMO. But I suppose this is one of
 those things in which reasonable people may disagree.

I agree that conceptually if you slice something, you cannot get a 'null'  
reference.  So, a null state for slices makes no sense.  However, most  
people see arrays as slices, slices as arrays - do you?  If so, for arrays  
the same conceptual argument does not apply.  If not, how do we tell we  
have a slice, or an array?  If we can't tell, then we have to check for  
null with both anyway..

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Oct 22 2013

"Max Samukha" <maxsamukha gmail.com> writes:

On Friday, 18 October 2013 at 16:55:19 UTC, Andrei Alexandrescu 
wrote:
 Fair point. I just gave one possible alternative out of many. 
 Thing is, relying on client code to distinguish subtleties 
 between empty and null strings is fraught with dangers.

 Andrei

I agree. Thinking about your variant of readln - it's ok to use 
[] as the value indicating EOF, since it is not included in the 
value set of type "line" as you define it. But generally, neither 
cast(T[])[] nor cast(T[])null should be used like that, because 
both of them are in the set of T[]'s values, i.e. a generic 
stream returning [] to signify its end would be a bad idea - that 
should be either a side effect or a value outside T[]'s set.

Hm, I've just said nothing with many words. Never mind.

Oct 18 2013

"Kagamin" <spam here.lot> writes:

On Friday, 18 October 2013 at 17:59:17 UTC, Max Samukha wrote:
 On Friday, 18 October 2013 at 16:55:19 UTC, Andrei Alexandrescu 
 wrote:
 Fair point. I just gave one possible alternative out of many. 
 Thing is, relying on client code to distinguish subtleties 
 between empty and null strings is fraught with dangers.

 Andrei

 I agree. Thinking about your variant of readln - it's ok to use 
 [] as the value indicating EOF, since it is not included in the 
 value set of type "line" as you define it.

No, if the last line is empty, it has no new line character(s) at 
the end, and is as empty, as it can get.

Oct 19 2013

"Max Samukha" <maxsamukha gmail.com> writes:

On Saturday, 19 October 2013 at 12:04:43 UTC, Kagamin wrote:
 On Friday, 18 October 2013 at 17:59:17 UTC, Max Samukha wrote:
 On Friday, 18 October 2013 at 16:55:19 UTC, Andrei 
 Alexandrescu wrote:
 Fair point. I just gave one possible alternative out of many. 
 Thing is, relying on client code to distinguish subtleties 
 between empty and null strings is fraught with dangers.

 Andrei

 I agree. Thinking about your variant of readln - it's ok to 
 use [] as the value indicating EOF, since it is not included 
 in the value set of type "line" as you define it.

 No, if the last line is empty, it has no new line character(s) 
 at the end, and is as empty, as it can get.

Right. Then readln is broken.

Oct 19 2013

"Regan Heath" <regan netmail.co.nz> writes:

On Fri, 18 Oct 2013 17:55:46 +0100, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 10/18/13 9:26 AM, Max Samukha wrote:
 *That's* bad API design. readln should be symmetrical to writeln, not
 write. And about preserving the exact representation of new lines,
 readln/writeln shouldn't preserve that, pure and simple.

 Fair point. I just gave one possible alternative out of many. Thing is,  
 relying on client code to distinguish subtleties between empty and null  
 strings is fraught with dangers.

My code does not need to distinguish between empty and null.  null is  
checked for, and empty is just a normal value for a string.  The "problem"  
you're referring to is /caused/ by conflating null and empty, by making  
empty strings "special" in the same way someone might make 0 a special  
value for an int (meaning not specified - for example).

If you stop using empty string as a special case of null, then empty does  
not need special handling - it's just a normal string value handled like  
any other - you can read it, write it, append it, etc etc etc.

null is the /only/ case which needs special handling - just like any other  
reference type.

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Oct 21 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Oct 18, 2013 at 06:26:05PM +0200, Max Samukha wrote:
 On Friday, 18 October 2013 at 15:42:56 UTC, Andrei Alexandrescu
 wrote:
On 10/18/13 3:44 AM, Regan Heath wrote:


[...]
Take this simple design:

  string readline();

This function would like to be able to:
 - return null for EOF
 - return [] for a blank line

That's bad API design, pure and simple. The function should e.g.
return the string including the line terminator, and only return
an empty (or null) string upon EOF.


Andrei

 
 *That's* bad API design. readln should be symmetrical to writeln,
 not write. And about preserving the exact representation of new
 lines, readln/writeln shouldn't preserve that, pure and simple.

I agree. A better solution is to provide an eof() method (or better,
.empty) that tells you when readln() will succeed, and readln() should
throw upon EOF. The problem is analogous to reading from an input range:
you always check whether the range is .empty before you call .front,
since when the range is empty .front has no meaningful value to return.
Relying on some kind of special sentinel value to represent the absence
of a value is a code smell.
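
A minimal sketch of that input-range discipline, assuming a made-up in-memory `LineSource` type (not part of Phobos): `empty` can be queried at any time, and `front` is only touched after `empty` has been checked, so a blank line needs no sentinel value. For brevity this sketch asserts on misuse instead of throwing.

```d
import std.stdio : writeln;

// Hypothetical line source over an in-memory array, for illustration only.
struct LineSource
{
    string[] lines;

    // True when no more lines are available; can be called at any time.
    @property bool empty() const { return lines.length == 0; }

    // Current line; only meaningful while !empty.
    @property string front()
    {
        assert(!empty, "front called on an empty LineSource");
        return lines[0];
    }

    // Advances to the next line.
    void popFront()
    {
        assert(!empty, "popFront called on an empty LineSource");
        lines = lines[1 .. $];
    }
}

void main()
{
    auto src = LineSource(["first", "", "last"]);
    size_t blank;
    foreach (line; src)   // foreach checks .empty before each .front
    {
        if (line.length == 0)
            ++blank;
        writeln("[", line, "]");
    }
    writeln(blank, " blank line(s)");
}
```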


T

-- 
Тише едешь, дальше будешь.

Oct 18 2013

"Dicebot" <public dicebot.lv> writes:

On Friday, 18 October 2013 at 15:42:56 UTC, Andrei Alexandrescu 
wrote:
 That's bad API design, pure and simple. The function should 
 e.g. return the string including the line terminator, and only 
 return an empty (or null) string upon EOF.

I'd say it should throw upon EOF, as it is a pretty high-level 
convenience function.

Oct 18 2013

"Regan Heath" <regan netmail.co.nz> writes:

On Fri, 18 Oct 2013 17:36:28 +0100, Dicebot <public dicebot.lv> wrote:

 On Friday, 18 October 2013 at 15:42:56 UTC, Andrei Alexandrescu wrote:
 That's bad API design, pure and simple. The function should e.g. return  
 the string including the line terminator, and only return an empty (or  
 null) string upon EOF.

 I'd say it should throw upon EOF as it is pretty high-level convenience  
 function.

I disagree.  Exceptions should never be used for flow control, so the rule  
is to throw on exceptional occurrences ONLY, not on something that will  
ALWAYS eventually happen.

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Oct 21 2013

"Dicebot" <public dicebot.lv> writes:

On Monday, 21 October 2013 at 09:40:13 UTC, Regan Heath wrote:
 I disagree.  Exceptions should never be used for flow control 
 so the rule is to throw on exceptional occurrences ONLY not on 
 something that will ALWAYS eventually happen.

For such a function it is an exceptional situation. For precise 
reading a different API is required anyway (== a different function).

Oct 21 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Mon, Oct 21, 2013 at 10:40:14AM +0100, Regan Heath wrote:
 On Fri, 18 Oct 2013 17:36:28 +0100, Dicebot <public dicebot.lv> wrote:
 
On Friday, 18 October 2013 at 15:42:56 UTC, Andrei Alexandrescu wrote:
That's bad API design, pure and simple. The function should e.g.
return the string including the line terminator, and only return
an empty (or null) string upon EOF.

I'd say it should throw upon EOF as it is pretty high-level
convenience function.

 
 I disagree.  Exceptions should never be used for flow control so the
 rule is to throw on exceptional occurrences ONLY not on something
 that will ALWAYS eventually happen.

[...]

	while (!file.eof) {
		auto line = file.readln(); // never throws
		...
	}


T

-- 
There are two ways to write error-free programs; only the third one works.

Oct 21 2013

"Regan Heath" <regan netmail.co.nz> writes:

On Mon, 21 Oct 2013 15:02:35 +0100, H. S. Teoh <hsteoh quickfur.ath.cx>  
wrote:

 On Mon, Oct 21, 2013 at 10:40:14AM +0100, Regan Heath wrote:
 On Fri, 18 Oct 2013 17:36:28 +0100, Dicebot <public dicebot.lv> wrote:

On Friday, 18 October 2013 at 15:42:56 UTC, Andrei Alexandrescu wrote:
That's bad API design, pure and simple. The function should e.g.
return the string including the line terminator, and only return
an empty (or null) string upon EOF.

I'd say it should throw upon EOF as it is pretty high-level
convenience function.

 I disagree.  Exceptions should never be used for flow control so the
 rule is to throw on exceptional occurrences ONLY not on something
 that will ALWAYS eventually happen.

 [...]

 	while (!file.eof) {
 		auto line = file.readln(); // never throws
 		...
 	}

For a file this is implementable (without a buffer) but not for a socket  
or similar source/stream where a read MUST be performed to detect EOF.   
So, if you're implementing a line reader over multiple sources, you would  
need to buffer.  Not the end of the world, but definitely more complicated  
than just returning a null, no?

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Oct 21 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Mon, Oct 21, 2013 at 04:47:05PM +0100, Regan Heath wrote:
 On Mon, 21 Oct 2013 15:02:35 +0100, H. S. Teoh
 <hsteoh quickfur.ath.cx> wrote:
 
On Mon, Oct 21, 2013 at 10:40:14AM +0100, Regan Heath wrote:
On Fri, 18 Oct 2013 17:36:28 +0100, Dicebot <public dicebot.lv> wrote:

On Friday, 18 October 2013 at 15:42:56 UTC, Andrei Alexandrescu wrote:
That's bad API design, pure and simple. The function should e.g.
return the string including the line terminator, and only return
an empty (or null) string upon EOF.

I'd say it should throw upon EOF as it is pretty high-level
convenience function.

I disagree.  Exceptions should never be used for flow control so the
rule is to throw on exceptional occurrences ONLY not on something
that will ALWAYS eventually happen.

[...]

	while (!file.eof) {
		auto line = file.readln(); // never throws
		...
	}

 
 For a file this is implementable (without a buffer) but not for a
 socket or similar source/stream where a read MUST be performed to
 detect EOF.  So, if you're implementing a line reader over multiple
 sources, you would need to buffer.  Not the end of the world, but
 definitely more complicated than just returning a null, no?

[...]

This is actually a very interesting issue to me, and one which I've
thought about a lot in the past. There are two incompatible (albeit with
much overlap) approaches here. One is the Unix approach where EOF is
unknown until you try to read past the end of a file (socket, etc.), and
the other is where EOF is known *before* you perform a read.

Personally, I prefer the second approach as being conceptually cleaner:
an input stream should "know" when it doesn't have any more data, so
that its EOF state can be queried at any time. Conceptually speaking one
shouldn't need to (try to) read from it before realizing there's nothing
left.

However, I understand that the Unix approach is easier to implement, in
the sense that if you have a network socket, it may be the case that
when you attempt to read from it, it is still connected, but before any
further data is received, the remote end disconnects. In this case, the
OS can't reasonably predict when there will be more incoming data, so
you do have to read the socket before finding out that the remote end
is going to disconnect without sending anything more.

In terms of API design, though, I still lean towards the approach where
EOF is always query-able, because it leads to cleaner code. This can be
implemented on Posix by having .eof read a single byte (or whatever unit
is expected) and buffering it, and the subsequent readln() takes this
buffering into account. This slight complication in implementation is
worth achieving the nicer user-facing API, IMO.
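
A minimal sketch of that read-ahead idea, under stated assumptions: `BufferedLines` and its `rawRead` delegate are hypothetical names, and the delegate stands in for a Unix-style source that only reveals end-of-input when a read is attempted. The sketch reads ahead one whole line (the "whatever unit is expected" case) and keeps it until `readln()` consumes it.

```d
import std.stdio : writeln;

// Hypothetical wrapper: `rawRead` stands in for a source that reports
// end-of-input only when a read is attempted (by returning false).
struct BufferedLines
{
    bool delegate(out string line) rawRead;
    private string pending;     // line read ahead by eof()
    private bool hasPending;    // true while `pending` is unconsumed
    private bool atEnd;         // true once rawRead reported end-of-input

    // Answers "is there more?" by reading one line ahead and keeping it.
    bool eof()
    {
        if (!hasPending && !atEnd)
        {
            if (rawRead(pending))
                hasPending = true;
            else
                atEnd = true;
        }
        return !hasPending;
    }

    // Returns the next line; throws only if called at end-of-input.
    string readln()
    {
        if (eof())
            throw new Exception("readln called at end of input");
        hasPending = false;
        return pending;
    }
}

void main()
{
    string[] data = ["alpha", "", "omega"];
    size_t next;
    // Simulated source: only a read attempt reveals that the data ran out.
    bool read(out string line)
    {
        if (next == data.length)
            return false;
        line = data[next++];
        return true;
    }

    auto input = BufferedLines(&read);
    while (!input.eof())
        writeln("[", input.readln(), "]");
}
```

The cost is visible in the three extra fields: the wrapper has to remember whether a line is already buffered and whether the source has ended.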


T

-- 
I've been around long enough to have seen an endless parade of magic new
techniques du jour, most of which purport to remove the necessity of
thought about your programming problem.  In the end they wind up
contributing one or two pieces to the collective wisdom, and fade away
in the rearview mirror. -- Walter Bright

Oct 21 2013

"Regan Heath" <regan netmail.co.nz> writes:

On Mon, 21 Oct 2013 17:49:43 +0100, H. S. Teoh <hsteoh quickfur.ath.cx>  
wrote:

 On Mon, Oct 21, 2013 at 04:47:05PM +0100, Regan Heath wrote:
 On Mon, 21 Oct 2013 15:02:35 +0100, H. S. Teoh
 <hsteoh quickfur.ath.cx> wrote:

On Mon, Oct 21, 2013 at 10:40:14AM +0100, Regan Heath wrote:
On Fri, 18 Oct 2013 17:36:28 +0100, Dicebot <public dicebot.lv> wrote:

On Friday, 18 October 2013 at 15:42:56 UTC, Andrei Alexandrescu wrote:
That's bad API design, pure and simple. The function should e.g.
return the string including the line terminator, and only return
an empty (or null) string upon EOF.

I'd say it should throw upon EOF as it is pretty high-level
convenience function.

I disagree.  Exceptions should never be used for flow control so the
rule is to throw on exceptional occurrences ONLY not on something
that will ALWAYS eventually happen.

[...]

	while (!file.eof) {
		auto line = file.readln(); // never throws
		...
	}

 For a file this is implementable (without a buffer) but not for a
 socket or similar source/stream where a read MUST be performed to
 detect EOF.  So, if you're implementing a line reader over multiple
 sources, you would need to buffer.  Not the end of the world, but
 definitely more complicated than just returning a null, no?

 [...]

 This is actually a very interesting issue to me, and one which I've
 thought about a lot in the past. There are two incompatible (albeit with
 much overlap) approaches here. One is the Unix approach where EOF is
 unknown until you try to read past the end of a file (socket, etc.), and
 the other is where EOF is known *before* you perform a read.

 Personally, I prefer the second approach as being conceptually cleaner:
 an input stream should "know" when it doesn't have any more data, so
 that its EOF state can be queried at any time. Conceptually speaking one
 shouldn't need to (try to) read from it before realizing there's nothing
 left.

 However, I understand that the Unix approach is easier to implement, in
 the sense that if you have a network socket, it may be the case that
 when you attempt to read from it, it is still connected, but before any
 further data is received, the remote end disconnects. In this case, the
 OS can't reasonably predict when there will be more incoming data, so
 you do have to read the socket before finding out that the remote end
 is going to disconnect without sending anything more.

 In terms of API design, though, I still lean towards the approach where
 EOF is always query-able, because it leads to cleaner code. This can be
 implemented on Posix by having .eof read a single byte (or whatever unit
 is expected) and buffering it, and the subsequent readln() takes this
 buffering into account. This slight complication in implementation is
 worth achieving the nicer user-facing API, IMO.

I don't agree the user-facing API is nicer.  It is more complex both in  
concept and implementation.


and check the result for null.  The check, naturally follows the attempt  
to read, which is the task you are trying to accomplish.  Simple, straight  
forward.


Your purpose is to read lines, so you call readline(), it is naturally  
easy to forget to call isEof().  Coding the example loop above requires  
you think about EOF /before/ you read a line, this is not how people  
think.  This API is therefore more complex, and less intuitive for no gain.

So, having a usable null state allows the simpler, more direct API.  Lack  
of it requires a more complicated design and a more complicated  
implementation.
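
For comparison, a minimal sketch of the read-then-check style, assuming a hypothetical in-memory `LineReader` (not a Phobos type). It returns null at end of input and hands out blank lines as empty slices of its buffer, whose pointer is non-null, so `is null` can tell the two apart. Note that this relies on the caller using `is null` and not `== null`, and an empty string produced some other way may well have a null pointer.

```d
import std.stdio : writeln;

// Hypothetical reader over an in-memory buffer, for illustration only.
struct LineReader
{
    string text;   // remaining unread input

    // Returns the next line without its '\n', or null at end of input.
    string readline()
    {
        if (text.length == 0)
            return null;
        size_t end = 0;
        while (end < text.length && text[end] != '\n')
            ++end;
        // A slice of the buffer keeps a non-null pointer even when empty.
        auto line = text[0 .. end];
        text = end < text.length ? text[end + 1 .. $] : text[$ .. $];
        return line;
    }
}

void main()
{
    auto r = LineReader("one\n\nthree\n");
    string line;
    // The check follows the read, as in the description above.
    while ((line = r.readline()) !is null)
        writeln("[", line, "]");
}
```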

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Oct 22 2013

"deadalnix" <deadalnix gmail.com> writes:

On Friday, 18 October 2013 at 15:42:56 UTC, Andrei Alexandrescu 
wrote:
 This comes up time and again.  The use of, and ability to 
 distinguish
 empty from null is very useful.

 I disagree.

That's what `if` does by default.

Oct 18 2013

"Regan Heath" <regan netmail.co.nz> writes:

On Fri, 18 Oct 2013 16:43:23 +0100, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 10/18/13 3:44 AM, Regan Heath wrote:
 On Fri, 18 Oct 2013 00:32:46 +0100, H. S. Teoh <hsteoh quickfur.ath.cx>
 wrote:

 On Fri, Oct 18, 2013 at 01:27:33AM +0200, Adam D. Ruppe wrote:
 On Thursday, 17 October 2013 at 23:12:03 UTC, ProgrammingGhost
 wrote:
is null still treats [] as null.

 blah, you're right. It will at least distinguish it from an empty
 slice though (like arr[$..$]). I don't think there's any way to tell
 [] from null except typeof(null) at all. At runtime they're both the
 same: no contents, so null pointer and zero length.

 I think it's a mistake to rely on the distinction between null and
 non-null but empty arrays in D. They should be regarded as
 implementation details that user code shouldn't depend on. If you need
 to distinguish between arrays that are empty and arrays that are null,
 consider using Nullable!(T[]) instead.

 This comes up time and again.  The use of, and ability to distinguish
 empty from null is very useful.

 I disagree.

Because.. the risk of a null pointer exception is not worth the gain?  If  
so, why not go the whole hog and prevent string from ever being null?   
Then, at least we'd gain something from the loss of the null/empty  
distinction/limitation.

D strings ought to decide whether they're reference types or value types,  
if the former then I want consistent null back, if the latter then I want  
to be rid of null for good.  This middle ground sucks.

 Yes, you run the risk of things like
 null pointer exceptions etc, but we have that risk now without the
 reward of being able to distinguish these cases.

 Take this simple design:

    string readline();

 This function would like to be able to:
   - return null for EOF
   - return [] for a blank line

 That's bad API design, pure and simple. The function should e.g. return  
 the string including the line terminator, and only return an empty (or  
 null) string upon EOF.



R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Oct 21 2013

"Kagamin" <spam here.lot> writes:

On Friday, 18 October 2013 at 10:44:11 UTC, Regan Heath wrote:
 This comes up time and again.  The use of, and ability to 
 distinguish empty from null is very useful.  Yes, you run the 
 risk of things like null pointer exceptions etc, but we have 
 that risk now without the reward of being able to distinguish 
 these cases.


Most of the time you don't need them, but still must check for them
just in order to not get 
an exception. Also business logic makes no difference between 
null and empty - both of them are just "no data", so you end up 
typing if(string.IsNullOrEmpty(mystr)) every time everywhere. 
And, yeah, only one small feature in this big mess ever needs to 
differentiate between null and empty. I found this one case 
trivially implementable, but nulls still plague all remaining 
code.

 Take this simple design:

   string readline();

 This function would like to be able to:
  - return null for EOF
  - return [] for a blank line

 but it cannot, because as soon as you write:

   foo(readline())

 the null/[] case merges.

This is a horrible design. You better throw an exception on eof 
instead of null: this null will break the caller anyway possibly 
in a contrived way. It works if you read one line per loop cycle, 
but if you read several lines and assume they're not null (some 
multiline data format), you're screwed or your code becomes 
littered with null checks, but who accounts for all alternative 
scenarios from the start?

If readline returns empty string on eof, I don't expect it to 
break any business logic. If the empty string doesn't match, ok, 
no match, continue. You can check for eof equally, but at *your* 
discretion, not when the external data wants you to do it.

 There are plenty of other such design/cases that can be 
 imagined, and while you can work around them all they add 
 complexity for zero gain.

I believe there's no problem domain, which would like to 
differentiate between null and empty string instead of treating 
them as "no data".

Oct 19 2013

"Regan Heath" <regan netmail.co.nz> writes:

On Sat, 19 Oct 2013 10:56:02 +0100, Kagamin <spam here.lot> wrote:

 On Friday, 18 October 2013 at 10:44:11 UTC, Regan Heath wrote:
 This comes up time and again.  The use of, and ability to distinguish  
 empty from null is very useful.  Yes, you run the risk of things like  
 null pointer exceptions etc, but we have that risk now without the  
 reward of being able to distinguish these cases.




strings.  The conflated empty/null cases are the real nightmare for me  
(more below).

null strings are no different to null class references, they're not a  
special case.  People seem to have this odd idea that null is somehow an  
[...] reference types), it's not.

People also seem to elevate empty strings to some sort of special status,  
that's like saying 0 has some special status for int - it doesn't it's  
just one of a number of possible values.

In fact, int having no null like state is a "problem" causing solutions  
like boxing to elevate the value type to a reference in order to allow a  
null state for int.

Yet, in D we've decided to inconsistently remove that functionality from  
string for no gain.  If string could not actually be null then we'd gain  
something from the limitation, instead we lose functionality and gain  
nothing - you still have to check your strings for null in D.

We ought to go one way or the other, this middle ground is worse than  
either of the other options.

In my code I don't have to check for or treat empty strings any  
differently to other values.  I simply have to check for null.   
Remembering to check for null on reference types is automatic for me,  
strings are not special in this regard.

 Most of the time you don't need them

Sure, and if I don't have access to null (like when using a value type  
like int), I can code around that lack, but it's never as straight forward  
a solution.

 but still must check for them just in order to not get an exception.

Sure, you must check for the possible states of a reference type.

 Also business logic makes no difference between null and empty

This is simply not true.  Example at the end.

 both of them are just "no data", so you end up typing  
 if(string.IsNullOrEmpty(mystr)) every time everywhere.

I only have to code like this when I use 3rd party code which has  
conflated empty and null.  In my code when it's null it means not  
specified, and empty is just one type of value - for which I do no special  
handling.

 And, yeah, only one small feature in this big mess ever needs to  
 differentiate between null and empty.

Untrue, null allows many alternate and IMO more direct/obvious designs.

 I found this one case trivially implementable, but nulls still plague  
 all remaining code.

Which one case?  The readline() one below?

 Take this simple design:

   string readline();

 This function would like to be able to:
  - return null for EOF
  - return [] for a blank line

 but it cannot, because as soon as you write:

   foo(readline())

 the null/[] case merges.

 This is a horrible design. You better throw an exception on eof instead  
 of null:

No, no, no.  You should only throw in exceptional circumstances or you  
risk using exceptions for flow control, and that is just plain horrid.

 this null will break the caller anyway possibly in a contrived way.

Never a contrived way, always a blatantly obvious one and only if you're  
not doing your job properly.  If you want a contrived, unpredictable and  
difficult to debug breakage look no further than heap or stack  
corruption.  Null is never a difficult bug to find and fix, and is no  
different to forgetting to handle one of the integer return values of a  
function.

I use this all the time:
http://msdn.microsoft.com/en-us/library/system.io.streamreader.readline.aspx

It has never caused me any issues.  It explicitly states that null is a  
possible output, and so I check for it - doing anything less is simply bad  
programming.

 It works if you read one line per loop cycle, but if you read several  
 lines and assume they're not null (some multiline data format),

There is your problem, never "assume" - the documentation is very clear on  
the issue.

 you're screwed or your code becomes littered with null checks, but who  
 accounts for all alternative scenarios from the start?

Me, and IMO any competent programmer.  It is misguided to think you can  
ignore valid states, null is a valid state in C, C++, [...]
should be thinking about and handling it.

You don't have to check for it on every access to the variable, but you do  
need to check for it once where the variable is assigned, or passed (in  
private functions you can skip this).  From that point onward you can  
assume non-null, valid, job done.

 There are plenty of other such design/cases that can be imagined, and  
 while you can work around them all they add complexity for zero gain.

 I believe there's no problem domain, which would like to differentiate  
 between null and empty string instead of treating them as "no data".

null means not specified, non existent, was not there.
empty means, present but set to empty/blank.

Databases have this distinction for a reason.

If you get input from a user a field called "foo" may be:
  - not specified
  - specified

and if specified, may be:
  - empty
  - not empty

If foo is not specified you may want to assign a default value for it, if  
your business logic is using empty to mean "not specified" you prevent the  
user actually setting foo to empty and that limitation is a right pain in  
many cases.

You can code around this by using a boolean or a dictionary to indicate the  
specified/not specified distinction, but this is less direct than simply  
using null.
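
A minimal sketch of the dictionary workaround mentioned above, using a D associative array. The function name `fieldOrDefault` and the field names are made up for this example. A missing key means "not specified" and gets the default; a key mapped to "" means "explicitly set to empty" and is returned as-is.

```d
import std.stdio : writeln;

// Hypothetical lookup: key absent = not specified, key present with "" =
// specified but empty.
string fieldOrDefault(string[string] fields, string name, string fallback)
{
    if (auto p = name in fields)   // `in` yields a pointer, null when absent
        return *p;
    return fallback;
}

void main()
{
    string[string] form = ["title": "", "author": "Regan"];
    writeln("[", fieldOrDefault(form, "title", "untitled"), "]");
    writeln("[", fieldOrDefault(form, "author", "anonymous"), "]");
    writeln("[", fieldOrDefault(form, "foo", "default"), "]");
}
```

The specified/not specified information lives in the container here, not in the string itself, which is the indirection being objected to.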

If we have null, let's use it, if we want to remove null then let's remove  
it, but can we get out of this horrid middle ground please.

Regan

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Oct 21 2013

"Kagamin" <spam here.lot> writes:

On Monday, 21 October 2013 at 10:33:01 UTC, Regan Heath wrote:
 null strings are no different to null class references, they're 
 not a special case.

True. That's an implementation detail which has no meaning for 
business logic. When implementation deviates from business logic, 
one ends up fixing the implementation details everywhere in order 
to implement business logic. That's why string.IsNullOrEmpty is 
used.

 People seem to have this odd idea that null is somehow an 

 reference types), it's not.

That's the very problem: null and empty are valid states and must 
be treated equally as "no data", but they can't for purely 
technical reasons.

 People also seem to elevate empty strings to some sort of 
 special status, that's like saying 0 has some special status 
 for int - it doesn't it's just one of a number of possible 
 values.

 In fact, int having no null like state is a "problem" causing 
 solutions like boxing to elevate the value type to a reference 
 in order to allow a null state for int.

You want to check ints for null everywhere too?

 Yet, in D we've decided to inconsistently remove that 
 functionality from string for no gain.  If string could not 
 actually be null then we'd gain something from the limitation, 
 instead we lose functionality and gain nothing - you still have 
 to check your strings for null in D.

Huh? Null slices work just like empty ones - that's why this 
topic was started in the first place. One doesn't have to check 
slices for nulls, only for length.
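
A small sketch of that behaviour (the function name `sum` is just for illustration): a null slice can be measured, iterated, compared, and appended to without any null check.

```d
import std.stdio : writeln;

// Sums a slice without any null check; a null slice simply has length 0.
int sum(const(int)[] values)
{
    int total = 0;
    foreach (v; values)
        total += v;
    return total;
}

void main()
{
    int[] nothing = null;
    int[] filled = [1, 2, 3];
    int[] emptySlice = filled[0 .. 0];   // empty, but points into `filled`

    assert(nothing is null);
    assert(emptySlice !is null);         // identity differs...
    assert(nothing == emptySlice);       // ...but == compares contents
    assert(nothing.length == 0);

    writeln(sum(nothing));
    nothing ~= 42;                       // appending to a null slice allocates
    writeln(sum(nothing));
}
```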

If you want clear nullable semantics, you have Nullable, it works 
for everything, including strings and ints. You would want this 
feature only in rare cases, so it doesn't make sense to make it 
default, or it will be a nuisance.
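
A minimal sketch using std.typecons.Nullable, where the helper name `resolve` is made up for this example: "not specified" and "specified but empty" are kept apart by the wrapper, not by the string's pointer.

```d
import std.stdio : writeln;
import std.typecons : Nullable;

// Hypothetical helper: substitutes a default only when no value was given.
string resolve(Nullable!string setting, string fallback)
{
    return setting.isNull ? fallback : setting.get;
}

void main()
{
    Nullable!string unset;               // starts out in the null state
    auto blank = Nullable!string("");    // present, but empty

    writeln("[", resolve(unset, "default"), "]");
    writeln("[", resolve(blank, "default"), "]");
}
```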

 both of them are just "no data", so you end up typing 
 if(string.IsNullOrEmpty(mystr)) every time everywhere.

 I only have to code like this when I use 3rd party code which 
 has conflated empty and null.  In my code when it's null it 
 means not specified, and empty is just one type of value - for 
 which I do no special handling.

Equivalence between null and empty is a business logic's 
requirement, that's why it's done.

 And, yeah, only one small feature in this big mess ever needs 
 to differentiate between null and empty.

 Untrue, null allows many alternate and IMO more direct/obvious 
 designs.

The need for those designs is rare and trivially implementable 
for all value types.

 I found this one case trivially implementable, but nulls still 
 plague all remaining code.

 Which one case?  The readline() one below?

No, it was an authentication system in third-party code for one 
special case. I also had to specify this null value in app.config 
- guess how, explicitly specify, not substitute missing parameter 
with a default.

Another possibility for readline is to return a tuple
{bool eof, string line(non-null)} - this way you have easy check 
for eof and don't have to check for null when you don't need it.
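
One way the tuple idea could look in D, sketched with std.typecons.Tuple. LineSource and LineResult are hypothetical names, and the source reads from an in-memory array instead of a real file so the example stays self-contained.

```d
import std.typecons : Tuple;

// Hypothetical result type: an explicit eof flag plus a line that is never
// used to signal end of input.
alias LineResult = Tuple!(bool, "eof", string, "line");

// Hypothetical line source backed by an in-memory array of lines.
struct LineSource
{
    string[] lines;

    LineResult readLine()
    {
        if (lines.length == 0)
            return LineResult(true, "");   // end of input
        auto next = lines[0];
        lines = lines[1 .. $];
        return LineResult(false, next);
    }
}

void main()
{
    auto src = LineSource(["header", ""]);

    auto first = src.readLine();
    assert(!first.eof && first.line == "header");

    auto second = src.readLine();
    assert(!second.eof && second.line.length == 0); // blank line, not EOF

    auto third = src.readLine();
    assert(third.eof);
}
```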

 I use this all the time:
 http://msdn.microsoft.com/en-us/library/system.io.streamreader.readline.aspx

 It has never caused me any issues.  It explicitly states that 
 null is a possible output, and so I check for it - doing 
 anything less is simply bad programming.

 It works if you read one line per loop cycle, but if you read 
 several lines and assume they're not null (some multiline data 
 format),

 There is your problem, never "assume" - the documentation is 
 very clear on the issue.

 you're screwed or your code becomes littered with null checks, 
 but who accounts for all alternative scenarios from the start?

 Me, and IMO any competent programmer.  It is misguided to think 
 you can ignore valid states, null is a valid state in C, C++, 


Here null is a valid state for readline, not for the caller: if 
the caller parses a multiline data format, unexpected end of file 
is an invalid state.

And what do you gain by littering your code with those null 
checks? Just making runtime happy and adding noise to the code? 
You could use that time to improve the code or add features or 
even relax. It's exactly nullable strings, which gain you only a 
time waste.

 You don't have to check for it on every access to the variable, 
 but you do need to check for it once where the variable is 
 assigned, or passed (in private functions you can skip this).  
 From that point onward you can assume non-null, valid, job done.

You just said "never assume". The assumption may fail, because 
the string type is still nullable, compiler doesn't save you 
here, this sucks. And in order to check for everything everywhere 
on a level near that of the compiler, you must be not just 
competent, but perfect.

 I believe there's no problem domain, which would like to 
 differentiate between null and empty string instead of 
 treating them as "no data".

 null means not specified, non existent, was not there.
 empty means, present but set to empty/blank.

 Databases have this distinction for a reason.

Oracle makes no distinction between null and empty string. For a 
reason?
A database is an implementation detail of a data storage, it 
doesn't implement business logic, it only provides features, 
which can be used with more or less success to implement business 
logic. Ever heard of advantages of OO databases over relational 
ones? That's an illustration of technical details, which don't 
precisely map to business logic.

 If you get input from a user a field called "foo" may be:
  - not specified
  - specified

 and if specified, may be:
  - empty
  - not empty

If the user doesn't fill a text box, it's both empty and not 
specified - there's just no difference. And it doesn't matter how 
you store it in the database - as null or as empty string - both 
are presented in the same way. Heck, we use these optional text 
boxes everywhere - can you tell if their content is empty or not 
specified?

And what if the value is required? Would you accept an empty 
value? And if your database treats empty string as not null, 
would you allow a user to register with an empty login name? And 
how would you express this constraint in the database? In SQL "not 
null" means "required value", but it's not equivalent to the business 
logic's notion of a required value. I wouldn't be surprised if 
Oracle did that in order to reject empty strings in not null 
fields.

Let's consider a process of specifying user's data. What text 
fields do we have?
1. Login. No difference between null and empty - both invalid - 
"no data", must enter something.
2. First name. No difference between null and empty - both are 
"no data" and are presented as empty text box.
3. Middle name. ditto.
4. Last name. ditto.
5. Country. ditto.
6. State. ditto.
7. City. ditto.
8. Address. ditto.
9. Building. ditto.
10. Flat. ditto.
11. Zip code. ditto.
12. Phone. ditto.
13. Fax. ditto.
14. E-mail. ditto.
15. Site. ditto.
16. Passport number. ditto.
17. Birth place. ditto.
18. Comment. Hell! Comment!
See? Not a single field in the list requires distinction between 
null and empty. And slices don't differentiate between them. Just 
as planned.

 If we have null, let's use it, if we want to remove null then 
 let's remove it, but can we get out of this horrid middle ground 
 please.

*sigh* people just don't buy the KISS principle...

Oct 25 2013

"Wyatt" <wyatt.epp gmail.com> writes:

On Friday, 25 October 2013 at 11:41:38 UTC, Kagamin wrote:
 That's an implementation detail which has no meaning for 
 business logic.

I've no real truck in this, but I do find it pretty bizarre to 
see _anyone_ using "business logic" as justification for anything 
here when D's own documentation is pretty explicit about not 
catering exclusively to that domain.

-Wyatt

Oct 25 2013

"Kagamin" <spam here.lot> writes:

On Friday, 25 October 2013 at 12:35:44 UTC, Wyatt wrote:
 On Friday, 25 October 2013 at 11:41:38 UTC, Kagamin wrote:
 That's an implementation detail which has no meaning for 
 business logic.

 I've no real truck in this, but I do find it pretty bizarre to 
 see _anyone_ using "business logic" as justification for 
 anything here when D's own documentation is pretty explicit 
 about not catering exclusively to that domain.

Dunno about D documentation, I use tools to get shit done. If 
they help, that's good, if they don't, that's bad. And by "shit" 
I don't mean a product, not a heap of text files.

Oct 25 2013

"Kagamin" <spam here.lot> writes:

*fix* I mean a product.

Oct 25 2013

"Max Samukha" <maxsamukha gmail.com> writes:

On Friday, 25 October 2013 at 11:41:38 UTC, Kagamin wrote:
 On Monday, 21 October 2013 at 10:33:01 UTC, Regan Heath wrote:
 null strings are no different to null class references, 
 they're not a special case.

 True. That's an implementation detail which has no meaning for 
 business logic. When implementation deviates from business 
 logic, one ends up fixing the implementation details everywhere 
 in order to implement business logic. That's why 
 string.IsNullOrEmpty is used.

That's not an implementation detail. Whether "null" is in the set 
of values of a string type and whether it is identical to "empty" 
are fundamental properties of that type. If you define the string 
type to include "null", then "null" should be either identical to 
"empty" in *all cases* or distinct from that in all cases.

D chose to fuse "null" and "empty" together in an inconsistent 
manner, which is a mistake. If we include "null" in the set, then 
either the [] literal should be non-null (and "null" and "empty" 
properly disjoint), or "null" and "empty" should always represent 
the same value. If we exclude it - *then* "null" becomes an 
implementation detail and should be dealt with only via .ptr.
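
The inconsistency can be seen in a few lines of D. This is a small sketch of the behaviour as I understand it with current compilers: the [] literal is null under "is", the "" literal is not, and yet == treats all of them as equal.

```d
void main()
{
    int[] emptyLiteral = [];
    string emptyString = "";

    // Identity: the two "empty" literals disagree about being null.
    assert(emptyLiteral is null);
    assert(emptyString !is null);
    assert(emptyString.ptr !is null);

    // Equality: both compare equal to null, because == compares contents.
    assert(emptyLiteral == null);
    assert(emptyString == null);
    assert(emptyLiteral.length == emptyString.length);
}
```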

 People seem to have this odd idea that null is somehow an 

 reference types), it's not.

 That's the very problem: null and empty are valid states and 
 must be treated equally as "no data", but they can't for purely 
 technical reasons.

Whether they are valid states is irrelevant. What matters is 
whether they represent identical values. In D, they are 
unhealthily mixed.

Oct 25 2013

"Kagamin" <spam here.lot> writes:

On Friday, 25 October 2013 at 16:31:54 UTC, Max Samukha wrote:
 D chose to fuse "null" and "empty" together in an inconsistent 
 manner, which is a mistake.

Slices are reasonably consistent and perfectly working with 
reasonable code, so I see no merit in fixing them, but you can 
try, why not.

Oct 25 2013

"Kagamin" <spam here.lot> writes:

On Friday, 25 October 2013 at 16:31:54 UTC, Max Samukha wrote:
 If you define the string type to include "null", then "null" 
 should be either identical to "empty" in *all cases* or 
 distinct from that in all cases.

AFAIK, that's how equality operator works, use it and you will 
get the desired semantics. Should be no problem.

Oct 28 2013

Shammah Chancellor <S S.com> writes:

On 2013-10-25 11:41:36 +0000, Kagamin said:

 Oracle makes no distinction between null and empty string. For a reason?
 A database is an implementation detail of a data storage, it doesn't 
 implement business logic, it only provides features, which can be used 
 with more or less success to implement business logic. Ever heard of 
 advantages of OO databases over relational ones? That's an illustration 
 of technical details, which don't precisely map to business logic.

That's poor friggin design, and it's for a bad reason.  Oracle is not 
the example you want to be following.    Sql Server does *NOT* follow 
their example for GOOD reason.   My middle name is not null, it is 
NOTHING.   There are lots of places where Oracle made bad design 
decisions and they cannot escape them due to requiring backwards 
compatibility.

Oct 25 2013

"ProgrammingGhost" <dsioafiseghvfawklncfskzdcf sdifjsdiovgfdisjcisj.com> writes:

As the OP of this thread I want to say that I think Nullable is 
the solution http://dlang.org/phobos/std_typecons.html but I 
dislike how I can't pass 5 or null to a parameter that is 
Nullable!int, Nullable!string
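
A sketch of the call-site situation, assuming a hypothetical function valueOr: D does not implicitly construct a Nullable!int from a bare 5 or from null at a call site, so the caller has to wrap the value with std.typecons.nullable or pass an empty Nullable explicitly.

```d
import std.typecons : Nullable, nullable;

// Hypothetical function taking an optional int.
int valueOr(Nullable!int n, int fallback)
{
    return n.isNull ? fallback : n.get;
}

void main()
{
    assert(valueOr(nullable(5), -1) == 5);         // wrap the value explicitly
    assert(valueOr(Nullable!int.init, -1) == -1);  // explicit "no value"

    // Neither of these compiles, which is the complaint above:
    // valueOr(5, -1);
    // valueOr(null, -1);
}
```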

Oct 25 2013

"Regan Heath" <regan netmail.co.nz> writes:

I find that I have repeated myself a lot in each section/reply below. I am  
not sure whether you'd prefer I just reply with those points once, or  
inline; I chose inline so as to make it clear I was not ignoring your  
points, and to make it clear which of my arguments apply to which point...

:)

On Fri, 25 Oct 2013 12:41:36 +0100, Kagamin <spam here.lot> wrote:
 On Monday, 21 October 2013 at 10:33:01 UTC, Regan Heath wrote:
 null strings are no different to null class references, they're not a  
 special case.

 True. That's an implementation detail which has no meaning for business  
 logic.

This argument applies both ways.  If D conflates null and empty, then this  
restricts business logic with an implementation detail.  We agree that D  
has no place in defining business logic, therefore it follows that the  
more flexible option is preferable as it is neutral in its effect on  
business logic.

However, this decision, like most, is a cost/benefit analysis and in the  
case of strings the case can be made that they should be a value type, and  
never null.  I can get behind such a decision, as it would mean D was  
taking a side, finally.  If strings cannot be null then we actually  
benefit from the current conflation of the two, by avoiding having to do  
null reference checking, and the associated exception/crash.  I would  
prefer to go the other way and allow a consistent null/empty distinction  
but either option is better than the status quo where we have to check for  
null ("cost") but gain no benefit from this, because we cannot use the  
null state consistently.

 When implementation deviates from business logic, one ends up fixing the  
 implementation details everywhere in order to implement business logic.  
 That's why string.IsNullOrEmpty is used.

I almost never need to use string.IsNullOrEmpty.  The reason why is  
simple.  An empty string is just one value a string may hold, and my code  
does not "generally" treat it as special except in certain specific cases  
where I make that additional check (your blank username example, for  
one).  Null is the only "special" state a string reference can have, so I  
check for this and this alone.

 People seem to have this odd idea that null is somehow an invalid state  


 That's the very problem: null and empty are valid states and must be  
 treated equally as "no data", but they can't for purely technical  
 reasons.

I never treat null and empty "equally as 'no data'"; that is my whole  
point.  They are not the same thing conceptually, you should never treat  
them as the same thing.  null means "no data", empty is just one possible  
state of "data".

You might make the business logic decision of disallowing empty values, of  
treating an empty value as if no value was given.  The two would still be  
conceptually separate, but your code would be making the decision to treat  
them in the same way.  You encode this decision in the function which  
accesses the input, once, and your problems are all solved.
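
A sketch of what "encode this decision once" might look like in D. The readField helper and the form contents are hypothetical; the policy shown (blank counts as absent) is just one possible business decision.

```d
// Hypothetical input-layer helper: returns null when the field is absent
// or, by this application's policy, when it is blank.
string readField(string[string] form, string name)
{
    auto found = name in form;   // pointer to the value, or null if absent
    if (found is null || (*found).length == 0)
        return null;
    return *found;
}

void main()
{
    string[string] form = ["login": "regan", "middle": ""];

    assert(readField(form, "login") == "regan");
    assert(readField(form, "middle") is null); // blank collapsed to null here
    assert(readField(form, "fax") is null);    // absent
}
```

Callers of such a helper only ever check for null; the empty case is handled in exactly one place.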

If you make the mistake of conflating null and empty in your input layer  
then you restrict your "business logic" and create the very problem you're  
complaining about here, stop conflating them and the problem simply  
vanishes.

If your input mechanism or a 3rd party library is conflating them, then  
you can add a business/conversion layer to convert empty to null and all  
your code can ignore the empty case and simply concentrate on checking for  
null, as it should already do - because this is unavoidable in any case.

This is KISS, collapse the 2 possible "error" states into 1 and check for  
that.

 People also seem to elevate empty strings to some sort of special  
 status, that's like saying 0 has some special status for int - it  
 doesn't it's just one of a number of possible values.

 In fact, int having no null like state is a "problem" causing solutions  
 like boxing to elevate the value type to a reference in order to allow  
 a null state for int.

 You want to check ints for null everywhere too?

No. (Strawman).  There are some cases where people wrap int in nullable  
however as there are some use cases where you do want to be able to  
indicate "no data" using a single variable.  This is the flexibility of a  
reference type, and the cost is the check for null.  If you do  
cost/benefit analysis for int with this in mind it is clearly not a type  
we want as a reference type - the performance penalty alone kills this.

 Yet, in D we've decided to inconsistently remove that functionality  
 from string for no gain.  If string could not actually be null then  
 we'd gain something from the limitation, instead we lose functionality  
 and gain nothing - you still have to check your strings for null in D.

 Huh? Null slices work just like empty ones - that's why this topic was  
 started in the first place. One doesn't have to check slices for nulls,  
 only for length.

Slices are not strings, as slices cannot be null.  However "if (slice is  
null)" can still be true - this is just plain wrong/inconsistent.  Let's  
pick a side and handle it consistently, above all else.  We can argue  
about which side, but can we at least agree the inconsistency is a bad  
thing?

 If you want clear nullable semantics, you have Nullable, it works for  
 everything, including strings and ints. You would want this feature only  
 in rare cases, so it doesn't make sense to make it default, or it will  
 be a nuisance.

Strings can be null, not checking for null is fatal.  You cannot easily  
tell if you have a string or a slice so you currently have to check for  
null in most/all cases already.  We're paying that "cost" already and yet  
not getting the full benefit from it.  It's simply a bad investment.  D  
should pick a side and conform to it, either we have nullable strings or  
we don't.  The current middle ground is just worse.

 both of them are just "no data", so you end up typing  
 if(string.IsNullOrEmpty(mystr)) every time everywhere.

 I only have to code like this when I use 3rd party code which has  
 conflated empty and null.  In my code when it's null it means not  
 specified, and empty is just one type of value - for which I do no  
 special handling.

 Equivalence between null and empty is a business logic's requirement,  
 that's why it's done.

Whose business logic?  This is perhaps my secondary point here.  D has no  
grounds to define business logic for all possible applications, this is  
something each application must have the flexibility to define for  
itself.  A library ought to provide the tools to do it - converting "" to  
null for you - but the language should not mandate it.

 And, yeah, only one small feature in this big mess ever needs to  
 differentiate between null and empty.

 Untrue, null allows many alternate and IMO more direct/obvious designs.

 The need for those designs is rare and trivially implementable for all  
 value types.

Rare; untrue, I use null all the time to good effect.  Trivially  
implementable, debatable - if you have to do more work you're paying a  
price, if you get no reward for that price then you're wasting resources.   
The current situation in D has you paying the price for no reward.

 I found this one case trivially implementable, but nulls still plague  
 all remaining code.

 Which one case?  The readline() one below?

 No, it was an authentication system in third-party code for one special  
 case.

No-one is trying to say you cannot code around it, even trivially in some  
cases, but the null design would likely have been simpler still.  That  
would have meant less wasted effort; worse still, the workaround gained  
you nothing.

 I also had to specify this null value in app.config - guess how,  
 explicitly specify, not substitute missing parameter with a default.

Seems to me that if you want a config to be null, you simply omit it from  
the configuration file.  Then have the code return null for its value, to  
indicate "no data".  If it's present, and set to "" then you would be able  
to differentiate these two cases, which is essential if your business  
logic requires that "" is a valid value for the config.  D should not  
place restrictions on your business logic - with an implementation detail.

 Another possibility for readline is to return a tuple
 {bool eof, string line(non-null)} - this way you have easy check for eof  
 and don't have to check for null when you don't need it.

Yet another more complex design, for no gain.  The additional boolean buys  
us nothing over the string reference, it costs more in terms of memory and  
complexity and you still have to remember to check it, as you have to  
remember to check for null in the original design.

 you're screwed or your code becomes littered with null checks, but who  
 accounts for all alternative scenarios from the start?

 Me, and IMO any competent programmer.  It is misguided to think you can  

 should be thinking about and handling it.

 Here null is a valid state for readline, not for the caller: if the  
 caller parses a multiline data format, unexpected end of file is an  
 invalid state.

If they parse a multi-line data format, and they have counted the number of  
lines prior to parsing it (to verify that they can call readline() N times  
safely) then yes, calling readline and getting EOF would be unexpected and  
worthy of an exception.

But, why would you want to pay the cost of processing the lines twice (to  
count them and ensure no EOF)?  Why not just have readline do that for  
you, by returning null on EOF.  Simpler, more direct.

 And what do you gain by littering your code with those null checks? Just  
 making runtime happy and adding noise to the code? You could use that  
 time to improve the code or add features or even relax. It's exactly  
 nullable strings, which gain you only a time waste.

In D, you already have to "litter your code with null checks" so you're  
already paying the cost, you're just not getting any benefit.

 You don't have to check for it on every access to the variable, but you  
 do need to check for it once where the variable is assigned, or passed  
 (in private functions you can skip this).  From that point onward you  
 can assume non-null, valid, job done.

 You just said "never assume". The assumption may fail, because the  
 string type is still nullable, compiler doesn't save you here, this  
 sucks. And in order to check for everything everywhere on a level near  
 that of the compiler, you must be not just competent, but perfect.

Play on words.  If you've filtered out null, you're not "assuming" you're  
"ensuring" it's non-null.  The only way to get null from that point is  
either "by design" or via memory corruption.  D does protect you from  
memory corruption by avoiding the need for raw pointers etc.  And, if  
you're setting string variables to null "by design" then you will need to  
check them again, of course.

Yes, if you want to write good code you need to develop good habits WRT  
using null, it's unavoidable.  Unless we remove null and the  
power/flexibility it affords - which is a valid option.  So, can we just  
pick an option for D and go with it, I don't really mind which way we go -  
tho my preference should be obvious :)

 I believe there's no problem domain, which would like to differentiate  
 between null and empty string instead of treating them as "no data".

 null means not specified, non existent, was not there.
 empty means, present but set to empty/blank.

 Databases have this distinction for a reason.

 Oracle makes no distinction between null and empty string. For a reason?

Looks like it was (ultimately) a mistake:
http://docs.oracle.com/cd/B19306_01/server.102/b14200/sql_elements005.htm

<quote>Note:
Oracle Database currently treats a character value with a length of zero  
as null. However, this may not continue to be true in future releases, and  
Oracle recommends that you do not treat empty strings the same as  
nulls.</quote>

To repeat the important part.. "Oracle recommends that you do not treat  
empty strings the same as nulls".

For. A. Reason.  The database has no right to define business logic - this  
restriction in oracle database has no doubt caused people to have to work  
around it, by using a specific "value" as null.

 A database is an implementation detail of a data storage, it doesn't  
 implement business logic

Agree 100%.  Conflating null and empty string is a business logic decision,  
it has no place in a database or other base level - like a language or  
standard library.

 If you get input from a user a field called "foo" may be:
  - not specified
  - specified

 and if specified, may be:
  - empty
  - not empty

 If the user doesn't fill a text box, it's both empty and not specified -  
 there's just no difference.

There is a clear and important difference.  Let's say the text box  
represents the user's middle name, let's presume they have given a value for  
it at some stage, let's assume they would like to remove it.  They load the  
page, and erase the value and click submit.  Your business logic will  
ignore the empty value, and not update the user's middle name.  My business  
logic will detect the text box was present (not null) and apply the given  
value "" to the user's middle name (in the database for example).

 And it doesn't matter how you store it in the database - as null or as  
 empty string - both are presented in the same way.

They don't have to be, that is my point.  The decision of how to display  
them is a business logic decision and having a clear distinction between  
null and empty allows you to display them differently.  Not having the  
distinction, ties your hands.

 Heck, we use these optional text boxes everywhere - can you tell if  
 their content is empty or not specified?

HTTP is one such input mechanism which conflates null and empty, there are  
numerous ways to code around it.  D is making the same mistake, with the  
same consequences, this is my central point.

 And what if the value is required? Would you accept an empty value?

This is a business logic decision, which D, and the database have no right  
to make.  Yes, if the user could input an empty value and yes if my  
business logic wanted to detect and disallow it - I would.  If not, I  
would not.  The point is that null gives you the power to express both,  
rather than restricting you and forcing an indirect solution to code  
around the lack.

 If we have null, let's use it, if we want to remove null then let's remove  
 it, but can we get out of this horrid middle ground please.

 *sigh* people just don't buy the KISS principle...

No kidding.  From my perspective null /is/ KISS and having to code around  
the lack with a more complex design is not.  :P

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Oct 28 2013

Timon Gehr <timon.gehr gmx.ch> writes:

On 10/18/2013 12:50 AM, ProgrammingGhost wrote:
 How do I find out if null was passed in? As you can guess I wasn't happy
 with the current behavior.
 ...

http://forum.dlang.org/thread/rkdzdxygpflpnaznxxnl@forum.dlang.org?page=5

Oct 18 2013

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Friday, October 18, 2013 10:38:12 H. S. Teoh wrote:
 On Fri, Oct 18, 2013 at 01:32:58PM -0400, Jonathan M Davis wrote:
 On Friday, October 18, 2013 09:55:46 Andrei Alexandrescu wrote:
 On 10/18/13 9:26 AM, Max Samukha wrote:
 *That's* bad API design. readln should be symmetrical to writeln,
 not write. And about preserving the exact representation of new
 lines, readln/writeln shouldn't preserve that, pure and simple.

 
 Fair point. I just gave one possible alternative out of many. Thing
 is, relying on client code to distinguish subtleties between empty
 and null strings is fraught with dangers.

 
 Yeah, but the primary reason that it's bad design is the fact that D
 tries to conflate null and empty instead of keeping them distinct
 (which is essentially the complaint that was made). Whether that's
 ultimately good or bad is up for debate, but the side effect is that
 relying on the difference between null and empty ends up being very
 bug-prone, whereas in other languages which don't conflate the two, it
 isn't problematic in the same way, and it's much more reasonable to
 have the API treat them differently.

 
 [...]
 
 IMO, distinguishing between null and empty arrays is bad abstraction. I
 agree with D's "conflation" of null with empty, actually. Conceptually
 speaking, an array is a sequence of values of non-negative length. An
 array with non-zero length contains at least one element, and is
 therefore non-empty, whereas an array with zero length is empty. Same
 thing goes with a slice. A slice is a view into zero or more array
 elements. A slice with zero length is empty, and a slice with non-zero
 length contains at least one element. There's nowhere in this conceptual
 scheme for such a thing as a "null array" that's distinct from an empty
 array. This distinction only crops up in implementation, and IMO leads
 to code smells because code should be operating based on the conceptual
 behaviour of arrays rather than on the implementation details.

In most languages, an array is a reference type, so there's the question of 
whether it's even _there_. There's a clear distinction between having null 
reference to an array and having a reference to an empty array. This is 
particularly clear in C++ where an array is just a pointer, but it's true in 
plenty of other languages that don't treat arrays as pointers (e.g. Java).

The problem is that D put the length on the stack alongside the pointer, 
making it so that D arrays are sort of reference types and sort of not. The 
pointer is a reference type, but the length is a value type, making the 
dynamic array half and half. If it were fully a reference type, then there 
would be no problem with distinguishing between null and empty arrays. A null 
array is simply a null reference to an array. But since D arrays aren't quite 
reference types, that doesn't work.
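
The pointer-plus-length layout described above can be poked at directly. A small sketch, assuming nothing beyond the built-in .ptr and .length properties: a default slice has a null pointer and zero length, while an empty slice taken from existing data has zero length but a non-null pointer.

```d
void main()
{
    // A dynamic array value is just a length and a pointer, side by side.
    static assert((int[]).sizeof == size_t.sizeof + (int*).sizeof);

    int[] unset;                 // default-initialised: null pointer, length 0
    assert(unset.ptr is null && unset.length == 0);
    assert(unset is null);

    int[] data = [10, 20, 30];
    int[] tail = data[3 .. $];   // empty, but points just past data's elements
    assert(tail.length == 0 && tail.ptr !is null);
    assert(tail !is null);       // empty without being null
}
```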

I see no problem in the abstraction of arrays with having null arrays, because 
a null array is simply a null reference to an array, which is exactly the same 
as having a null object or null pointer. It's the reference that's null, not 
what it points to. It's just D's implementation that's weird. It would be like 
taking some of the member variables of a class and putting them in the 
reference instead of in the object and then discussing how much a null object 
makes sense. It's just bizarre.

Now, D arrays end up working great overall in spite of their semantic 
weirdness, but it does mean that you can't really have proper null arrays in 
the same way that most languages with arrays can, forcing you to either be 
extremely careful when dealing with null and arrays or to waste space doing 
stuff to keep track of nullability separately from the array itself like 
Nullable does.

- Jonathan M Davis

Oct 18 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Oct 18, 2013 at 02:04:41PM -0400, Jonathan M Davis wrote:
 On Friday, October 18, 2013 10:38:12 H. S. Teoh wrote:

[...]
 IMO, distinguishing between null and empty arrays is bad
 abstraction. I agree with D's "conflation" of null with empty,
 actually. Conceptually speaking, an array is a sequence of values of
 non-negative length. An array with non-zero length contains at least
 one element, and is therefore non-empty, whereas an array with zero
 length is empty. Same thing goes with a slice. A slice is a view
 into zero or more array elements. A slice with zero length is empty,
 and a slice with non-zero length contains at least one element.
 There's nowhere in this conceptual scheme for such a thing as a
 "null array" that's distinct from an empty array. This distinction
 only crops up in implementation, and IMO leads to code smells
 because code should be operating based on the conceptual behaviour
 of arrays rather than on the implementation details.

 
 In most languages, an array is a reference type, so there's the
 question of whether it's even _there_. There's a clear distinction
 between having null reference to an array and having a reference to an
 empty array. This is particularly clear in C++ where an array is just
 a pointer, but it's true in plenty of other languages that don't treat
 arrays as pointers (e.g. Java).

To me, these are just implementation details. Conceptually speaking, D
arrays are actually slices, so that gives them reference semantics.
Being slices, they refer to zero or more elements, so either their
length is zero, or not. There is no concept of nullity here. That only
comes because we chose to implement slices as pointer + length, so
implementation-wise we can distinguish between a null .ptr and a
non-null .ptr. But from the conceptual POV, if we consider slices as a
whole, they are just a sequence of zero or more elements. Null has no
meaning here.

Put another way, slices themselves are value types, but they refer to
their elements by reference. It's a subtle but important difference.
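
A short D sketch of that difference, with hypothetical function names shrink and poke: reassigning the slice inside a function only changes the callee's copy of the pointer and length, while writing to an element goes through to the shared data.

```d
// Changes only the callee's own copy of the (ptr, length) pair.
void shrink(int[] s)
{
    s = s[0 .. 1];
}

// Writes through the pointer to the elements shared with the caller.
void poke(int[] s)
{
    s[0] = 42;
}

void main()
{
    int[] a = [1, 2, 3];

    shrink(a);
    assert(a.length == 3);   // caller's slice is unchanged

    poke(a);
    assert(a[0] == 42);      // caller sees the element change
}
```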


 The problem is that D put the length on the stack alongside the
 pointer, making it so that D arrays are sort of reference types and
 sort of not. The pointer is a reference type, but the length is a
 value type, making the dynamic array half and half. If it were fully a
 reference type, then there would be no problem with distinguishing
 between null and empty arrays. A null array is simply a null reference
 to an array. But since D arrays aren't quite reference types, that
 doesn't work.

[...]

I think the issue comes from the preconceived notion acquired from other
languages that arrays are some kind of object floating somewhere out
there on the heap, for which we have a handle here. Thus we have the
notion of null, being the case when we have a handle here but there's
actually nothing out there.

But if we consider the slice as being a thing right *here* and now,
referencing some sequence of elements out there, then we arrive at D's
notion of null and empty being the same thing, because while there may
be no elements out there being referenced, the handle (i.e. slice) is
always *here*. In that sense, there's no distinction between an empty
slice and a null slice: either there are elements out there that we're
referring to, or there are none. There is no third "null" case.

There's no reason why we should adopt the previous notion if this one
works just as well, if not better. I argue that the second notion is
conceptually cleaner, because it eliminates an unnecessary distinction
between an empty sequence and a non-existent sequence (which then leads
to similar issues one encounters with null pointers).
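
A minimal sketch of that view, added for illustration and not part of the original post. The helper names appendLocally and zeroFirst are hypothetical. It assumes only built-in slice behaviour: the pointer/length pair is copied by value, while the elements are shared.

```d
import std.stdio;

// Hypothetical helper: appends to its own copy of the slice header.
void appendLocally(int[] s)
{
    s ~= 99; // changes only the local ptr/length pair
}

// Hypothetical helper: writes through the slice to the shared elements.
void zeroFirst(int[] s)
{
    if (s.length) s[0] = 0;
}

void main()
{
    int[] a = [1, 2, 3];
    appendLocally(a);
    assert(a.length == 3);   // the caller's slice header is unchanged
    zeroFirst(a);
    assert(a[0] == 0);       // the elements themselves are shared

    int[] n = null;
    int[] e = [];
    assert(n.length == 0 && e.length == 0);
    assert(n == e);          // == compares contents, so both are just "empty"
    writeln("ok");
}
```

Seen this way, the only question a caller can portably ask of a slice is how many elements it refers to.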


T

-- 
Answer: Because it breaks the logical sequence of discussion. / Question: Why
is top posting bad?

Oct 18 2013

"Meta" <jared771 gmail.com> writes:

On Friday, 18 October 2013 at 19:59:26 UTC, H. S. Teoh wrote:
 ...because it eliminates an unnecessary distinction between an 
 empty sequence and a non-existent sequence (which then leads to 
 similar issues one encounters with null pointers).

That just seems silly. Surely we all recognize that there's a 
difference between the empty set and having no set at all, and 
that it's valuable to be able to distinguish between the two. The 
empty set is still a set, while nothing is... nothing.

Oct 18 2013

"Blake Anderton" <rbanderton gmail.com> writes:

I agree a null value and empty array are separate concepts, but 
from my very anecdotal/non rigorous point of view I really 
appreciate D's ability to treat them as equivalent.


follows the pattern if(arr == null || arr.Length == 0) ...

In D just doing if(arr.length) feels much nicer and less error 
prone. I'm all for correctness but would hate to throw the baby 
out with the bathwater.
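
A small sketch of the length check described above, assuming nothing beyond built-in slices. The helper name hasElements is hypothetical.

```d
import std.stdio;

// Hypothetical helper: true when the slice refers to at least one element.
bool hasElements(const int[] arr)
{
    return arr.length != 0;
}

void main()
{
    int[] none = null;           // null slice
    int[] one = [1];
    int[] emptied = one[0 .. 0]; // zero length, but .ptr is not null

    assert(!hasElements(none));
    assert(!hasElements(emptied));
    assert(hasElements(one));
    writeln("ok");
}
```

The single check covers both the null slice and the zero-length slice, which is the convenience being described.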

Oct 18 2013

Timon Gehr <timon.gehr gmx.ch> writes:

On 10/18/2013 10:09 PM, Blake Anderton wrote:
 I agree a null value and empty array are separate concepts, but from my
 very anecdotal/non rigorous point of view I really appreciate D's
 ability to treat them as equivalent.


 pattern if(arr == null || arr.Length == 0) ...

 In D just doing if(arr.length) feels much nicer and less error prone.
 I'm all for correctness but would hate to throw the baby out with the
 bathwater.

(This will work either way.)

Oct 18 2013

"Meta" <jared771 gmail.com> writes:

On Friday, 18 October 2013 at 20:15:31 UTC, Timon Gehr wrote:
 (This will work either way.)

Speaking of that, it's really annoying to have to import 
std.array just to use range primitives with slices. Would these 
be better in druntime, or is that a bad idea?

Oct 18 2013

"ProgrammingGhost" <dsioafiseghvfawklncfskzdcf sdifjsdiovgfdisjcisj.com> writes:

On Friday, 18 October 2013 at 20:09:37 UTC, Blake Anderton wrote:
 I agree a null value and empty array are separate concepts, but 
 from my very anecdotal/non rigorous point of view I really 
 appreciate D's ability to treat them as equivalent.


 follows the pattern if(arr == null || arr.Length == 0) ...

 In D just doing if(arr.length) feels much nicer and less error 
 prone. I'm all for correctness but would hate to throw the baby 
 out with the bathwater.

Really? I NEVER write that pattern. I may check if an array is 
null or don't because the function shouldnt be receiving nulls 
(maybe its bad but idc). I just write linq and never bother to 
see if something is empty

Oct 18 2013

"Blake Anderton" <rbanderton gmail.com> writes:

On Friday, 18 October 2013 at 20:32:48 UTC, ProgrammingGhost 
wrote:
 Really? I NEVER write that pattern. I may check if an array is 
 null or don't because the function shouldnt be receiving nulls 
 (maybe its bad but idc). I just write linq and never bother to 
 see if something is empty

Yeah, LINQ makes it a lot easier, but I usually take 
IEnumerable<T> instead of coding directly against arrays in that 
case. I find most of the time I use arrays directly is when using 
"params" parameters. It's very easy to not null check that and 
cause heartache down the line.

Oct 18 2013

"David Nadlinger" <code klickverbot.at> writes:

On Friday, 18 October 2013 at 20:09:37 UTC, Blake Anderton wrote:
 I agree a null value and empty array are separate concepts […]

Yes, null values are a different concept, and slices being value 
types, there isn't really one for them. I'm torn on whether 
allowing conversion of arrays to pointers for the purpose of null 
comparison was a good idea or not.

David

Oct 18 2013

"Regan Heath" <regan netmail.co.nz> writes:

On Fri, 18 Oct 2013 21:09:35 +0100, Blake Anderton <rbanderton gmail.com>  
wrote:

 I agree a null value and empty array are separate concepts, but from my  
 very anecdotal/non rigorous point of view I really appreciate D's  
 ability to treat them as equivalent.


 pattern if(arr == null || arr.Length == 0) ...


null and treat empty as any other string value.  The /only/ time I have to  
check for empty is when I have interfaced with 3rd party code which has  
decided to conflate empty and null to mean the same thing.

Regan

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Oct 21 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Oct 18, 2013 at 10:04:52PM +0200, Meta wrote:
 On Friday, 18 October 2013 at 19:59:26 UTC, H. S. Teoh wrote:
...because it eliminates an unnecessary distinction between an
empty sequence and a non-existent sequence (which then leads to
similar issues one encounters with null pointers).

 
 That just seems silly. Surely we all recognize that there's a
 difference between the empty set and having no set at all, and that
 it's valuable to be able to distinguish between the two. The empty
 set is still a set, while nothing is... nothing.

Yes, but if you declare a variable to contain a set, then by definition
there is *something*, even if it's an empty set. For there to be
nothing, there shouldn't even be a variable in the first place. The fact
that the variable exists and has an identifier means that there is
*something*. So your argument is moot.


T

-- 
Computers shouldn't beep through the keyhole.

Oct 18 2013

"Meta" <jared771 gmail.com> writes:

On Friday, 18 October 2013 at 21:15:32 UTC, H. S. Teoh wrote:
 Yes, but if you declare a variable to contain a set, then by 
 definition there is *something*, even if it's an empty set.

Exactly. There is still *something*, even though the set is 
empty. That is, the set itself.

 For there to be nothing, there shouldn't even be a variable in 
 the first place. The fact that the variable exists and has an 
 identifier means that there is *something*. So your argument is 
 moot.

Not really. Null is a special marker to indicate the absence of a 
value. There is nothing, as opposed to the previous case.

Oct 18 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Sat, Oct 19, 2013 at 12:04:47AM +0200, Meta wrote:
 On Friday, 18 October 2013 at 21:15:32 UTC, H. S. Teoh wrote:
Yes, but if you declare a variable to contain a set, then by
definition there is *something*, even if it's an empty set.

 
 Exactly. There is still *something*, even though the set is empty.
 That is, the set itself.
 
For there to be nothing, there shouldn't even be a variable in the
first place. The fact that the variable exists and has an
identifier means that there is *something*. So your argument is
moot.

 
 Not really. Null is a special marker to indicate the absence of a
 value. There is nothing, as opposed to the previous case.

That's if you consider a set to be a reference type. Then you can say
that the reference may be referring to something (which may be empty or
not), or it can refer to nothing (null).

But if the set is a value type, then there is no such thing as null,
only empty.


T

-- 
INTEL = Only half of "intelligence".

Oct 18 2013

"ProgrammingGhost" <dsioafiseghvfawklncfskzdcf sdifjsdiovgfdisjcisj.com> writes:

On Friday, 18 October 2013 at 21:15:32 UTC, H. S. Teoh wrote:
 On Fri, Oct 18, 2013 at 10:04:52PM +0200, Meta wrote:
 On Friday, 18 October 2013 at 19:59:26 UTC, H. S. Teoh wrote:
...because it eliminates an unnecessary distinction between an
empty sequence and a non-existent sequence (which then leads 
to
similar issues one encounters with null pointers).

 
 That just seems silly. Surely we all recognize that there's a
 difference between the empty set and having no set at all, and 
 that
 it's valuable to be able to distinguish between the two. The 
 empty
 set is still a set, while nothing is... nothing.

 Yes, but if you declare a variable to contain a set, then by 
 definition
 there is *something*, even if it's an empty set. For there to be
 nothing, there shouldn't even be a variable in the first place. 
 The fact
 that the variable exists and has an identifier means that there 
 is
 *something*. So your argument is moot.


 T

I was simply thinking about sdl where you pass in a rect for the 
coords to blt one surface to the other. Null/0 means copy the 
whole thing. Rect is an object but I was thinking what about 
arrays (empty VS pull a default somewhere). Thats how I came up 
with this question and the point is I WANT to NOT specify a value 
so a DYNAMIC SUITABLE default value can be used.

Oct 18 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Sat, Oct 19, 2013 at 12:45:02AM +0200, ProgrammingGhost wrote:
 On Friday, 18 October 2013 at 21:15:32 UTC, H. S. Teoh wrote:
On Fri, Oct 18, 2013 at 10:04:52PM +0200, Meta wrote:
On Friday, 18 October 2013 at 19:59:26 UTC, H. S. Teoh wrote:
...because it eliminates an unnecessary distinction between an
empty sequence and a non-existent sequence (which then leads to
similar issues one encounters with null pointers).

That just seems silly. Surely we all recognize that there's a
difference between the empty set and having no set at all, and that
it's valuable to be able to distinguish between the two. The empty
set is still a set, while nothing is... nothing.

Yes, but if you declare a variable to contain a set, then by
definition there is *something*, even if it's an empty set. For there
to be nothing, there shouldn't even be a variable in the first place.
The fact that the variable exists and has an identifier means that
there is *something*. So your argument is moot.


T

 
 I was simply thinking about sdl where you pass in a rect for the
 coords to blt one surface to the other. Null/0 means copy the whole
 thing. Rect is an object but I was thinking what about arrays (empty
 VS pull a default somewhere). Thats how I came up with this question
 and the point is I WANT to NOT specify a value so a DYNAMIC SUITABLE
 default value can be used.

You could use T[]* and pass a null pointer as default?
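
A rough sketch of what that could look like, under stated assumptions: the function name countOrDefault and the fallback value 42 are hypothetical stand-ins for whatever dynamic default the real code would compute.

```d
import std.stdio;

// Hypothetical function: a null pointer means "no argument was given".
size_t countOrDefault(int[]* items = null)
{
    if (items is null)
        return 42; // assumed stand-in for a dynamically chosen default
    return (*items).length; // an empty array is a real, distinct argument
}

void main()
{
    int[] none;          // empty (and null) slice, but its address is not null
    int[] two = [1, 2];

    assert(countOrDefault() == 42);
    assert(countOrDefault(&none) == 0);
    assert(countOrDefault(&two) == 2);
    writeln("ok");
}
```

One cost of this approach is that an array literal can no longer be passed directly, since the parameter needs the address of a variable.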


T

-- 
What is Matter, what is Mind? Never Mind, it doesn't Matter.

Oct 18 2013

"ProgrammingGhost" <dsioafiseghvfawklncfskzdcf sdifjsdiovgfdisjcisj.com> writes:

 You could use T[]* and pass a null pointer as default?

Yet this answer wasn't on the first page.

I see I can't write fn([1,2]) anymore so I'm unsure how this
solution compares to using Nullable (I can't write fn([1,2]) with
nullable either).
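
For comparison, a sketch of the Nullable variant using std.typecons. The function name countOrDefault and the fallback value 42 are hypothetical. The literal has to be wrapped explicitly, here with the nullable() helper.

```d
import std.stdio;
import std.typecons : Nullable, nullable;

// Hypothetical function: an unset Nullable means "use the default".
size_t countOrDefault(Nullable!(int[]) items = Nullable!(int[]).init)
{
    if (items.isNull)
        return 42; // assumed stand-in for a dynamically chosen default
    return items.get.length;
}

void main()
{
    int[] none;

    assert(countOrDefault() == 42);
    assert(countOrDefault(nullable(none)) == 0);   // empty, but present
    assert(countOrDefault(nullable([1, 2])) == 2); // literal needs wrapping
    writeln("ok");
}
```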

Oct 18 2013

"Jesse Phillips" <Jesse.K.Phillips+D gmail.com> writes:

On Friday, 18 October 2013 at 21:15:32 UTC, H. S. Teoh wrote:
 Yes, but if you declare a variable to contain a set, then by 
 definition
 there is *something*, even if it's an empty set. For there to be
 nothing, there shouldn't even be a variable in the first place. 
 The fact
 that the variable exists and has an identifer means that there 
 is
 *something*. So your argument is moot.


 T

We can declare a variable to contain an object, and there can 
still not be an object there.

You're trying to make arrays non-nullable. Which I suppose isn't 
so bad, it is a structure after all. Why do we even allow 
checking against null, can't do it with int or bool. (ok, I know, 
breaks code).

Oct 18 2013

"bearophile" <bearophileHUGS lycos.com> writes:

Jesse Phillips:

 Why do we even allow checking against null, can't do it
 with int or bool. (ok, I know, breaks code).

Sometimes breaking code is acceptable.

Bye,
bearophile

Oct 19 2013

Timon Gehr <timon.gehr gmx.ch> writes:

On 10/18/2013 09:58 PM, H. S. Teoh wrote:
 To me, these are just implementation details. Conceptually speaking, D
 arrays are actually slices, so that gives them reference semantics.
 Being slices, they refer to zero or more elements, so either their
 length is zero, or not. There is no concept of nullity here. That only
 comes because we chose to implement slices as pointer + length, so
 implementation-wise we can distinguish between a null .ptr and a
 non-null .ptr. But from the conceptual POV, if we consider slices as a
 whole, they are just a sequence of zero or more elements. Null has no
 meaning here.

int[] a = null; // <- :(

Oct 18 2013

"Regan Heath" <regan netmail.co.nz> writes:

On Fri, 18 Oct 2013 20:58:07 +0100, H. S. Teoh <hsteoh quickfur.ath.cx>  
wrote:

 On Fri, Oct 18, 2013 at 02:04:41PM -0400, Jonathan M Davis wrote:
 On Friday, October 18, 2013 10:38:12 H. S. Teoh wrote:

 [...]
 IMO, distinguishing between null and empty arrays is bad
 abstraction. I agree with D's "conflation" of null with empty,
 actually. Conceptually speaking, an array is a sequence of values of
 non-negative length. An array with non-zero length contains at least
 one element, and is therefore non-empty, whereas an array with zero
 length is empty. Same thing goes with a slice. A slice is a view
 into zero or more array elements. A slice with zero length is empty,
 and a slice with non-zero length contains at least one element.
 There's nowhere in this conceptual scheme for such a thing as a
 "null array" that's distinct from an empty array. This distinction
 only crops up in implementation, and IMO leads to code smells
 because code should be operating based on the conceptual behaviour
 of arrays rather than on the implementation details.

 In most languages, an array is a reference type, so there's the
 question of whether it's even _there_. There's a clear distinction
 between having null reference to an array and having a reference to an
 empty array. This is particularly clear in C++ where an array is just
 a pointer, but it's true in plenty of other languages that don't treat
 arrays as pointers (e.g. Java).

 To me, these are just implementation details. Conceptually speaking, D
 arrays are actually slices, so that gives them reference semantics.
 Being slices, they refer to zero or more elements, so either their
 length is zero, or not. There is no concept of nullity here. That only
 comes because we chose to implement slices as pointer + length, so
 implementation-wise we can distinguish between a null .ptr and a
 non-null .ptr. But from the conceptual POV, if we consider slices as a
 whole, they are just a sequence of zero or more elements. Null has no
 meaning here.

 Put another way, slices themselves are value types, but they refer to
 their elements by reference. It's a subtle but important difference.


 The problem is that D put the length on the stack alongside the
 pointer, making it so that D arrays are sort of reference types and
 sort of not. The pointer is a reference type, but the length is a
 value type, making the dynamic array half and half. If it were fully a
 reference type, then there would be no problem with distinguishing
 between null and empty arrays. A null array is simply a null reference
 to an array. But since D arrays aren't quite reference types, that
 doesn't work.

 [...]

 I think the issue comes from the preconceived notion acquired from other
 languages that arrays are some kind of object floating somewhere out
 there on the heap, for which we have a handle here. Thus we have the
 notion of null, being the case when we have a handle here but there's
 actually nothing out there.

 But if we consider the slice as being a thing right *here* and now,
 referencing some sequence of elements out there, then we arrive at D's
 notion of null and empty being the same thing, because while there may
 be no elements out there being referenced, the handle (i.e. slice) is
 always *here*. In that sense, there's no distinction between an empty
 slice and a null slice: either there are elements out there that we're
 referring to, or there are none. There is no third "null" case.

 There's no reason why we should adopt the previous notion if this one
 works just as well, if not better. I argue that the second notion is
 conceptually cleaner, because it eliminates an unnecessary distinction
 between an empty sequence and a non-existent sequence (which then leads
 to similar issues one encounters with null pointers).

If what you say is true then slices would and could never be null... If  
that were the case I would stop complaining and simply "box" them with  
Nullable when I wanted a reference type.  But, D's strings/slices are some  
kind of mutant half reference half value type, and that's the underlying  
problem here.

Regan

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Oct 21 2013

"Regan Heath" <regan netmail.co.nz> writes:

On Mon, 21 Oct 2013 11:58:07 +0100, Regan Heath <regan netmail.co.nz>  
wrote:

 On Fri, 18 Oct 2013 20:58:07 +0100, H. S. Teoh <hsteoh quickfur.ath.cx>  
 wrote:

 On Fri, Oct 18, 2013 at 02:04:41PM -0400, Jonathan M Davis wrote:
 On Friday, October 18, 2013 10:38:12 H. S. Teoh wrote:

 [...]
 IMO, distinguishing between null and empty arrays is bad
 abstraction. I agree with D's "conflation" of null with empty,
 actually. Conceptually speaking, an array is a sequence of values of
 non-negative length. An array with non-zero length contains at least
 one element, and is therefore non-empty, whereas an array with zero
 length is empty. Same thing goes with a slice. A slice is a view
 into zero or more array elements. A slice with zero length is empty,
 and a slice with non-zero length contains at least one element.
 There's nowhere in this conceptual scheme for such a thing as a
 "null array" that's distinct from an empty array. This distinction
 only crops up in implementation, and IMO leads to code smells
 because code should be operating based on the conceptual behaviour
 of arrays rather than on the implementation details.

 In most languages, an array is a reference type, so there's the
 question of whether it's even _there_. There's a clear distinction
 between having null reference to an array and having a reference to an
 empty array. This is particularly clear in C++ where an array is just
 a pointer, but it's true in plenty of other languages that don't treat
 arrays as pointers (e.g. Java).

 To me, these are just implementation details. Conceptually speaking, D
 arrays are actually slices, so that gives them reference semantics.
 Being slices, they refer to zero or more elements, so either their
 length is zero, or not. There is no concept of nullity here. That only
 comes because we chose to implement slices as pointer + length, so
 implementation-wise we can distinguish between a null .ptr and a
 non-null .ptr. But from the conceptual POV, if we consider slices as a
 whole, they are just a sequence of zero or more elements. Null has no
 meaning here.

 Put another way, slices themselves are value types, but they refer to
 their elements by reference. It's a subtle but important difference.


 The problem is that D put the length on the stack alongside the
 pointer, making it so that D arrays are sort of reference types and
 sort of not. The pointer is a reference type, but the length is a
 value type, making the dynamic array half and half. If it were fully a
 reference type, then there would be no problem with distinguishing
 between null and empty arrays. A null array is simply a null reference
 to an array. But since D arrays aren't quite reference types, that
 doesn't work.

 [...]

 I think the issue comes from the preconceived notion acquired from other
 languages that arrays are some kind of object floating somewhere out
 there on the heap, for which we have a handle here. Thus we have the
 notion of null, being the case when we have a handle here but there's
 actually nothing out there.

 But if we consider the slice as being a thing right *here* and now,
 referencing some sequence of elements out there, then we arrive at D's
 notion of null and empty being the same thing, because while there may
 be no elements out there being referenced, the handle (i.e. slice) is
 always *here*. In that sense, there's no distinction between an empty
 slice and a null slice: either there are elements out there that we're
 referring to, or there are none. There is no third "null" case.

 There's no reason why we should adopt the previous notion if this one
 works just as well, if not better. I argue that the second notion is
 conceptually cleaner, because it eliminates an unnecessary distinction
 between an empty sequence and a non-existent sequence (which then leads
 to similar issues one encounters with null pointers).

 If what you say is true then slices would and could never be null...

Aargh, my apologies I misread your post.  Ignore my first reply.

I agree that slices never being null are like a pre-null checked array,  
which is a good thing.  The issue I have had in the past is with strings  
(not slices) mutating from null to empty and/or vice-versa.

Also, it's not at all clear when you're dealing with a pre-checked, not-null  
slice and when you're dealing with a possibly null array, for example:

import std.stdio;

void foo(string arr)
{
	if (arr is null) writefln("null");
	else writefln("not null");
	if (arr.length == 0) writefln("empty");
	else writefln("not empty");
}

void main()
{
	string arr;
	foo(arr);
	foo(arr[0..$]);
	arr = "";
	foo(arr);
	foo(arr[0..$]);
}

Output:
null
empty
null
empty
not null
empty
not null
empty

Which of those are strings/arrays and which are slices?  Why are the ones  
formed by actually slicing coming up as "is null"?
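
A condensed sketch of the same observation, assuming only built-in string behaviour. The helper name classify is hypothetical. It suggests that what "is null" reports follows the .ptr of the value, not whether the value was produced by slicing.

```d
import std.stdio;

// Hypothetical helper naming the three states a string can be observed in.
string classify(string s)
{
    if (s is null) return "null";
    if (s.length == 0) return "empty, non-null";
    return "non-empty";
}

void main()
{
    string a;        // default-initialised: null
    string b = "";   // literal: zero length, non-null .ptr
    string c = "abc";

    assert(classify(a) == "null");
    assert(classify(a[0 .. $]) == "null");            // slicing null keeps .ptr null
    assert(classify(b) == "empty, non-null");
    assert(classify(c[0 .. 0]) == "empty, non-null"); // slice into existing data
    assert(classify(c) == "non-empty");
    assert(a == b);  // == ignores the difference entirely
    writeln("ok");
}
```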

(This last, not directed at you, just venting..)

I can understand arguing against null from a safety point of view.

I can understand arguing against designs that use null, for the same  
reasons.

I disagree, but then I have comfortably used null for a long time so the  
cost/benefit of using null is heavily on the benefit side for me.  I can  
understand for others this may not be the case.

But, I cannot understand someone who says they have no use for the concept  
of non-existence, or that no code will ever want to make the distinction,  
that is just plainly incorrect .. implementing a singleton pattern  
(probably a bad example :p) relies on being able to check for  
non-existence, using null as the indicator, we do it all the time.

Regan

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Oct 21 2013

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Monday, October 21, 2013 11:58:07 Regan Heath wrote:
 If what you say is true then slices would and could never be null... If
 that were the case I would stop complaining and simply "box" them with
 Nullable when I wanted a reference type.  But, D's strings/slices are some
 kind of mutant half reference half value type, and that's the underlying
 problem here.

Yeah, dynamic arrays in D are just plain weird. They're halfway between 
reference types and value types, and it definitely causes confusion, and it 
totally screws with null (which definitely sucks). But they mostly work really 
well the way that they are, and in general, the way that slices work works 
really well. So, I don't know if what we have is ultimately the right design 
or not. I definitely don't like how null works for arrays though.

Given how they work, we probably would have been better off if they couldn't be 
null. The ptr obviously could be null, but the array itself arguably shouldn't 
be able to be null. If we did that, then it would be clear that null wouldn't 
work with arrays, and no one would try. It would still kind of suck, since you 
wouldn't have null, but then at least it would be clear that null wouldn't 
work with arrays instead of having a situation where it kind of does and kind 
of doesn't.

- Jonathan M Davis

Oct 21 2013

"Regan Heath" <regan netmail.co.nz> writes:

On Mon, 21 Oct 2013 12:54:56 +0100, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 On Monday, October 21, 2013 11:58:07 Regan Heath wrote:
 If what you say is true then slices would and could never be null... If
 that were the case I would stop complaining and simply "box" them with
 Nullable when I wanted a reference type.  But, D's strings/slices are  
 some
 kind of mutant half reference half value type, and that's the underlying
 problem here.

 Yeah, dynamic arrays in D are just plain weird. They're halfway between
 reference types and value types, and it definitely causes confusion, and  
 it
 totally screws with null (which definitely sucks). But they mostly work  
 really
 well the way that they are, and in general, the way that slices work  
 works
 really well. So, I don't know if what we have is ultimately the right  
 design
 or not. I definitely don't like how null works for arrays though.

 Given how they work, we probably would have been better off if they  
 couldn't be
 null. The ptr obviously could be null, but the array itself arguably  
 shouldn't
 be able to be null. If we did that, then it would be clear that null  
 wouldn't
 work with arrays, and no one would try. It would still kind of suck,  
 since you
 wouldn't have null, but then at least it would be clear that null  
 wouldn't
 work with arrays instead of having a situation where it kind of does and  
 kind
 of doesn't.

Agreed.  This is preferable to the current situation, even if it's not my  
personal preferred solution.

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Oct 21 2013
