
digitalmars.D - Re: Why Strings as Classes?

reply bearophile <bearophileHUGS lycos.com> writes:
Robert Fraser:
 It's a good feature to have (I wouldn't consider a list class complete 
 without it), it just shouldn't be abused.

Its code:

final Ref nth(int n) {
    auto p = this;
    for (int i; i < n; ++i)
        p = p.next;
    return p;
}

If usage patterns show it is worthwhile, a static pointer/index pair that keeps the last accessed pointer/index may be added. This may speed things up if the items are accessed sequentially, or even with some forward skips, by avoiding a restart from the beginning each time :-)

Bye,
bearophile
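For concreteness, the cached pointer/index idea might look like this, sketched in C++ (the thread's code is D; the Node type and names here are invented for illustration, and note the static cache is shared by all lists and is not thread-safe, one reason the posters below worry about it):

```cpp
#include <cstddef>

struct Node {
    int value;
    Node* next;
};

// nth() with a cached cursor: remember the last (head, index, node) triple so
// sequential or forward-skipping access resumes from the cached position
// instead of restarting at the head.
inline Node* nth(Node* head, std::size_t n) {
    static Node* last_head = nullptr;
    static Node* last_node = nullptr;
    static std::size_t last_index = 0;

    Node* p = head;
    std::size_t i = 0;
    if (head == last_head && last_node != nullptr && last_index <= n) {
        p = last_node;      // resume from the cached position
        i = last_index;
    }
    for (; i < n && p != nullptr; ++i)
        p = p->next;

    last_head = head;
    last_node = p;
    last_index = n;
    return p;
}
```

Sequential calls like nth(head, 0), nth(head, 1), nth(head, 2) then each advance only one step instead of i steps; a backward access falls back to a scan from the head.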
Aug 27 2008
next sibling parent reply "Denis Koroskin" <2korden gmail.com> writes:
On Wed, 27 Aug 2008 21:23:53 +0400, bearophile <bearophileHUGS lycos.com>  
wrote:

 Robert Fraser:
 It's a good feature to have (I wouldn't consider a list class complete
 without it), it just shouldn't be abused.

 Its code:

 final Ref nth(int n) {
     auto p = this;
     for (int i; i < n; ++i)
         p = p.next;
     return p;
 }

 If usage patterns show it is worthwhile, a static pointer/index pair that
 keeps the last accessed pointer/index may be added. This may speed things up
 if the items are accessed sequentially, or even with some forward skips, by
 avoiding a restart from the beginning each time :-)

 Bye,
 bearophile

I believe this breaks encapsulation, unless an iterator is returned and passed as a start. Find is a better name for such an operation.
Aug 27 2008
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Denis Koroskin:
 I believe this breaks encapsulation,

You may be right; I'm still learning this topic. Can you explain why you think it breaks information hiding? The two static variables are inside the method, and they can't be seen from outside. Bye and thank you, bearophile
Aug 27 2008
prev sibling parent reply "Nick Sabalausky" <a a.a> writes:
"Denis Koroskin" <2korden gmail.com> wrote in message 
news:op.ugj2yeu0o7cclz proton.creatstudio.intranet...
 On Wed, 27 Aug 2008 21:23:53 +0400, bearophile <bearophileHUGS lycos.com> 
 wrote:

 Robert Fraser:
 It's a good feature to have (I wouldn't consider a list class complete
 without it), it just shouldn't be abused.

 Its code:

 final Ref nth(int n) {
     auto p = this;
     for (int i; i < n; ++i)
         p = p.next;
     return p;
 }

 If usage patterns show it is worthwhile, a static pointer/index pair that
 keeps the last accessed pointer/index may be added. This may speed things up
 if the items are accessed sequentially, or even with some forward skips, by
 avoiding a restart from the beginning each time :-)

 Bye,
 bearophile

I believe this breaks encapsulation, unless an iterator is returned and passed as a start. Find is a better name for such an operation.

The way I see it, encapsulation is all about the black box idea. The only things you can see from outside the black box are the inputs and outputs. So, if encapsulation is desired, things like "Indexing" and "Find" should be considered black boxes. And if "Indexing" and "Find" are black boxes, that means defining them purely in terms of the relationship between their input and output. This is how I consider them to be defined:

Indexing: Return the element at position N.
Find: Return the position of the element that is equal (value-equal or identity-equal, depending on the find) to X.

So if there's a function that returns the element at position N and you call it "find" just because the *internal implementation* of the function is an incrementing loop, I consider that to be both a violation of encapsulation (because the name has been chosen based on the inner workings of the function, not its input-output relationship) and a violation of the definitions.
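The two contracts can be written down directly; a minimal C++ sketch (the container and the names index_of/find_pos are chosen just for illustration):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Indexing: given a position, return the element there.
inline int index_of(const std::vector<int>& xs, std::size_t n) {
    return xs[n];
}

// Find: given an element, return its position.
inline std::size_t find_pos(const std::vector<int>& xs, int x) {
    return static_cast<std::size_t>(
        std::find(xs.begin(), xs.end(), x) - xs.begin());
}
```

The inputs and outputs are mirror images of each other; naming one after the other's internal loop is exactly the confusion described above.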
Aug 27 2008
parent reply Derek Parnell <derek psych.ward> writes:
On Wed, 27 Aug 2008 17:08:47 -0400, Nick Sabalausky wrote:

 The way I see it, encapsulation is all about the black box idea. And the 
 only things you can see from outside the black box are the inputs and 
 outputs. 

Well said.
 Indexing: Return the element at position N.
 Find: Return the position of the element that is equal (value-equal or 
 identity-equal, depending on the find) to X .

Exactly.
 So if there's a function that returns the element at position N and you call 
 it "find" just because of the *internal implementation* of the function is 
 an incrementing loop, I consider that to be both a violation of 
 encapsulation (because the name has been chosen based on the inner workings 
 of the function, not its input-output relationship) and a violation of the 
 definitions.

Not to mention just plain confusing.

-- 
Derek Parnell
Melbourne, Australia
skype: derek.j.parnell
Aug 27 2008
parent reply Dee Girl <deegirl noreply.com> writes:
Derek Parnell Wrote:

 On Wed, 27 Aug 2008 17:08:47 -0400, Nick Sabalausky wrote:
 
 The way I see it, encapsulation is all about the black box idea. And the 
 only things you can see from outside the black box are the inputs and 
 outputs. 

Well said.

I am sorry, I will say my opinion. This sounds good but is simplistic. A black box is good in principle, but it assumes you found the right interface for the black box. If you define a bad interface you have a bad black box. Also please remember that an iterator is a black box too, but it defines the right interface.
 Indexing: Return the element at position N.
 Find: Return the position of the element that is equal (value-equal or 
 identity-equal, depending on the find) to X .

Exactly.
 So if there's a function that returns the element at position N and you call 
 it "find" just because of the *internal implementation* of the function is 
 an incrementing loop, I consider that to be both a violation of 
 encapsulation (because the name has been chosen based on the inner workings 
 of the function, not its input-output relationship) and a violation of the 
 definitions.

Not to mention just plain confusing.

I am confused too. Why is making [] do find good and making find do [] wrong? How do you make the decision? To me it looks the opposite way, because find is a less demanding contract than []. In design a stronger contract can replace a weaker contract, not the inverse. Thank you, Dee Girl
Aug 27 2008
next sibling parent reply Christopher Wright <dhasenan gmail.com> writes:
Dee Girl wrote:
 I am confused to. Why is making [] do find good and making find do [] wrong?
How do you make decision? To me it looks the opposite way. Because find is less
demanding contract than []. In design a stronger contract can replace a weaker
contract. Not the inverse. Thank you, Dee Girl

Choosing the data structure used is never the job of the code you're giving the data to. It's the job of whatever creates the data structure. By giving two functions where the only difference is performance guarantees, you're allowing the called code to determine what data structures it accepts, not based on the operations the structure supports, but based on the performance of those data structures.

D's standard arrays are a bad example of this: since they're built in, everyone uses them, so it's difficult to replace them with a linked list or something else. The effect is more pronounced with associative arrays, since there isn't as much choice in lists as in dictionaries. But they're so damn useful, and the syntax is hard to match.
Aug 27 2008
parent reply Dee Girl <deegirl noreply.com> writes:
Christopher Wright Wrote:

 Dee Girl wrote:
 I am confused to. Why is making [] do find good and making find do [] wrong?
How do you make decision? To me it looks the opposite way. Because find is less
demanding contract than []. In design a stronger contract can replace a weaker
contract. Not the inverse. Thank you, Dee Girl

Choosing the data structure used is never the job of the code you're giving the data to.

Yes. But code should refuse data if the data does not respect an interface. I think binary_search should never work on a list. It does a really bad thing when linear search is the correct thing. Do you think it should?
 It's the job of whatever creates the data structure. 
 By giving two functions where the only difference is performance 
 guarantees, you're allowing the called code to determine what data 
 structures it accepts, not based on the operations the structure 
 supports, but based on the performance of those data structures.

I do not think there is anything wrong. And for me performance and complexity are very different words. Performance is like dmd -O -release against dmd -debug, or an asm implementation of a function. Complexity is a very different thing and essential to an algorithm.
 D's standard arrays are a bad example of this: since they're built in, 
 everyone uses them, so it's difficult to replace them with a linked list 
 or something else. The effect is more pronounced with associative 
 arrays, since there isn't as much choice in lists as in dictionaries. 
 But they're so damn useful, and the syntax is hard to match.

I hope Walter adds new data structures that work with std.algorithm. I am not sure, but I think std.algorithm is made so it works with other types, not only arrays.
Aug 27 2008
next sibling parent superdan <super dan.org> writes:
Dee Girl Wrote:

 Christopher Wright Wrote:
 
 Dee Girl wrote:
 I am confused to. Why is making [] do find good and making find do [] wrong?
How do you make decision? To me it looks the opposite way. Because find is less
demanding contract than []. In design a stronger contract can replace a weaker
contract. Not the inverse. Thank you, Dee Girl

Choosing the data structure used is never the job of the code you're giving the data to.

Yes. But code should refuse data if data does not respect an interface. I think binary_search never should work on list. It does really bad thing when linear search is the correct thing. Do you think it should?
 It's the job of whatever creates the data structure. 
 By giving two functions where the only difference is performance 
 guarantees, you're allowing the called code to determine what data 
 structures it accepts, not based on the operations the structure 
 supports, but based on the performance of those data structures.

I do not think there is anything wrong. And for me performance and complexity is very different words. Performance is like dmd -O -release against dmd -debug. Or asm implementation of function. Complexity is very different thing and essential for algorithm.

cool. dee there's two things u said i like. one was the contract view of complexity guarantee. after all o(n) is a weaker guarantee than o(1). and heck nobody complain if someone comes with sort that works in o(n log log n). so there's some sort of complexity hierarchy. never thot of it that way. that's as cool as the other side of the pillow. 2nd thing i liked was the performance vs. complexity contrast. linear search in ruby has less performance than linear search in d. but both have the same complexity. that's universal. spot on girl. good food for thot while i get drunk at tgiw. thanx.
Aug 27 2008
prev sibling next sibling parent reply Christopher Wright <dhasenan gmail.com> writes:
Dee Girl wrote:
 Christopher Wright Wrote:
 
 Dee Girl wrote:
 I am confused to. Why is making [] do find good and making find do [] wrong?
How do you make decision? To me it looks the opposite way. Because find is less
demanding contract than []. In design a stronger contract can replace a weaker
contract. Not the inverse. Thank you, Dee Girl

giving the data to.

Yes. But code should refuse data if data does not respect an interface.

Correct. The question is whether you should make asymptotic complexity part of the interface. If you do, that hurts when you want to optimize for a common case but still allow some inefficient operations.

You could define some empty interfaces, such as:

interface IFastIndexList : IList {}

These would allow you to do things like:

bool containsSorted (IList list, Object element)
{
    auto fast = cast(IFastIndexList) list;
    if (fast) return (binarySearch(list, element) >= 0);
    // default implementation here
}

However, your library should not have any publicly exposed operations that only take IFastIndexList, and it's silly to define different functions for indexing in an IFastIndexList versus an IList.
 I think binary_search never should work on list. It does really bad thing when
linear search is the correct thing. Do you think it should?

I think it should work, since the data structure implements the necessary operations. I think no sensible programmer should use it.
 It's the job of whatever creates the data structure. 
 By giving two functions where the only difference is performance 
 guarantees, you're allowing the called code to determine what data 
 structures it accepts, not based on the operations the structure 
 supports, but based on the performance of those data structures.

I do not think there is anything wrong. And for me performance and complexity is very different words. Performance is like dmd -O -release against dmd -debug. Or asm implementation of function. Complexity is very different thing and essential for algorithm.
 D's standard arrays are a bad example of this: since they're built in, 
 everyone uses them, so it's difficult to replace them with a linked list 
 or something else. The effect is more pronounced with associative 
 arrays, since there isn't as much choice in lists as in dictionaries. 
 But they're so damn useful, and the syntax is hard to match.

I hope Walter adds new data structure that work with std.algorithm. I am not sure. But I think std.algorithm is make so it works with other types. Not only arrays.

You can do that with templates. If you want to use classes with inheritance, or interfaces, you can no longer use templates.
Aug 27 2008
parent reply Dee Girl <deegirl noreply.com> writes:
Christopher Wright Wrote:

 Dee Girl wrote:
 Christopher Wright Wrote:
 
 Dee Girl wrote:
 I am confused to. Why is making [] do find good and making find do [] wrong?
How do you make decision? To me it looks the opposite way. Because find is less
demanding contract than []. In design a stronger contract can replace a weaker
contract. Not the inverse. Thank you, Dee Girl

giving the data to.

Yes. But code should refuse data if data does not respect an interface.

Correct. The question is whether you should make asymptotic complexity part of the interface. If you do, that hurts when you want to optimize for a common case but still allow some inefficient operations.

I think it is really helpful if you learn STL. Very opening for your mind, because STL does exactly what you said. For example there is the function distance(). It computes the distance between iterators in O(n). But if the iterators are random access it works in O(1). So the function is as fast as possible, and has a uniform interface.
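The distance() behavior described here can be shown directly in C++ (list_len and vec_len are illustrative wrappers, not STL names):

```cpp
#include <iterator>
#include <list>
#include <vector>

// std::distance has one interface but dispatches on the iterator category:
// stepping (O(n)) for std::list's bidirectional iterators, plain subtraction
// (O(1)) for std::vector's random-access iterators.
inline long list_len(const std::list<int>& l) {
    return std::distance(l.begin(), l.end());   // walks node by node: O(n)
}

inline long vec_len(const std::vector<int>& v) {
    return std::distance(v.begin(), v.end());   // pointer subtraction: O(1)
}
```

Callers see one name and one contract; the complexity difference is resolved per iterator type, at compile time.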
 If you want to define some empty interfaces, such as:
 interface IFastIndexList : IList {}
 
 These would allow you to do things like:
 bool containsSorted (IList list, Object element)
 {
 	auto fast = cast(IFastIndexList)list;
 	if (fast) return (binarySearch(list, element) >= 0);
 	// default implementation here
 }
 
 However, your library should not have any publicly exposed operations 
 that only take IFastIndexList, and it's silly to define different 
 functions for indexing in an IFastIndexList versus an IList.

Again. Why make a design on your knees (I mean in a hurry) when STL has a much better and beautiful design? I think the STL book helps very much!
 I think binary_search never should work on list. It does really bad thing when
linear search is the correct thing. Do you think it should?

I think it should work, since the data structure implements the necessary operations. I think no sensible programmer should use it.

This is a very very big misunderstanding. I see this written again; I think somebody else also explained it. In concrete code you can make a sensible choice. But in generic code you cannot choose. I think the STL book is really helpful. Also, when you maintain code you need the compiler to help. If code still compiles but does O(n*n) it is bad!
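The maintenance worry, made concrete in a C++ sketch (sum_by_index is an invented example; std::advance stands in for opIndex-style positional access on a list):

```cpp
#include <cstddef>
#include <iterator>
#include <list>

// Sum by repeated positional access. On a random-access container each access
// is O(1), so the loop is O(n). Emulating the same positional access on a
// list with std::advance makes access i cost O(i), so the whole loop silently
// becomes O(n^2); it still compiles and still gives the right answer.
inline int sum_by_index(const std::list<int>& xs) {
    int s = 0;
    for (std::size_t i = 0; i < xs.size(); ++i) {
        std::list<int>::const_iterator it = xs.begin();
        std::advance(it, i);    // O(i) walk on every iteration
        s += *it;
    }
    return s;
}
```

The STL's answer is exactly the refusal discussed above: std::list has no operator[], so the quadratic version cannot be written by accident; it must be spelled out this explicitly.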
 It's the job of whatever creates the data structure. 
 By giving two functions where the only difference is performance 
 guarantees, you're allowing the called code to determine what data 
 structures it accepts, not based on the operations the structure 
 supports, but based on the performance of those data structures.

I do not think there is anything wrong. And for me performance and complexity is very different words. Performance is like dmd -O -release against dmd -debug. Or asm implementation of function. Complexity is very different thing and essential for algorithm.
 D's standard arrays are a bad example of this: since they're built in, 
 everyone uses them, so it's difficult to replace them with a linked list 
 or something else. The effect is more pronounced with associative 
 arrays, since there isn't as much choice in lists as in dictionaries. 
 But they're so damn useful, and the syntax is hard to match.

I hope Walter adds new data structure that work with std.algorithm. I am not sure. But I think std.algorithm is make so it works with other types. Not only arrays.

You can do that with templates. If you want to use classes with inheritance, or interfaces, you can no longer use templates.

I do not understand. Thank you, Dee Girl
Aug 27 2008
next sibling parent Michiel Helvensteijn <nomail please.com> writes:
Dee Girl wrote:

 I think it is really helpful if you learn STL. Very opening for your mind.
 Because STL does exactly you said. For example there is function
 distance(). It computes distance between iterators in O(n). But if
 iterators are random it works in O(1). So function is as fast as possible.
 And has uniform interface.

Isn't that all we're saying about opIndex()? It retrieves the value from a list at a certain index in O(n). But if it is an array it works in O(1). So function is as fast as possible. And has uniform interface.

-- 
Michiel
Aug 28 2008
prev sibling parent reply Christopher Wright <dhasenan gmail.com> writes:
Dee Girl wrote:
 Christopher Wright Wrote:
 
 Dee Girl wrote:
 Christopher Wright Wrote:

 Dee Girl wrote:
 I am confused to. Why is making [] do find good and making find do [] wrong?
How do you make decision? To me it looks the opposite way. Because find is less
demanding contract than []. In design a stronger contract can replace a weaker
contract. Not the inverse. Thank you, Dee Girl

giving the data to.


part of the interface. If you do, that hurts when you want to optimize for a common case but still allow some inefficient operations.

I think it is really helpful if you learn STL. Very opening for your mind. Because STL does exactly you said. For example there is function distance(). It computes distance between iterators in O(n). But if iterators are random it works in O(1). So function is as fast as possible. And has uniform interface.

The problem with your suggestions -- I have no idea whether your suggestions are related to STL or how, since I don't know the STL -- is that it's too easy for a library developer to prevent people from using a particular data structure just because it implements an operation in a slow manner.
 If you want to define some empty interfaces, such as:
 interface IFastIndexList : IList {}

 These would allow you to do things like:
 bool containsSorted (IList list, Object element)
 {
 	auto fast = cast(IFastIndexList)list;
 	if (fast) return (binarySearch(list, element) >= 0);
 	// default implementation here
 }

 However, your library should not have any publicly exposed operations 
 that only take IFastIndexList, and it's silly to define different 
 functions for indexing in an IFastIndexList versus an IList.

Again. Why make design on your knees (I mean in a hurry) when STL has much better and beautiful design? I think STL book helps very much!

I don't know STL's design. A collection has a set of supported operations. A supported operation is one that has a correct implementation; an operation need not be efficient to be supported, such as opIndex for linked lists. A collection also has guarantees about asymptotic complexity (or maybe not). These are two different responsibilities. It is a terrible thing to conflate them.
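One existing design that keeps those two responsibilities separate is the STL's iterator category tags: the container's members say which operations exist, while the tag type advertises the complexity promise, and algorithms can inspect it (C++ sketch):

```cpp
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

// Both containers support traversal, but only vector's iterator tag promises
// constant-time random access; generic code can branch on the tag.
static_assert(std::is_same<
        std::iterator_traits<std::vector<int>::iterator>::iterator_category,
        std::random_access_iterator_tag>::value,
    "vector iterators promise O(1) random access");

static_assert(std::is_same<
        std::iterator_traits<std::list<int>::iterator>::iterator_category,
        std::bidirectional_iterator_tag>::value,
    "list iterators promise only stepping");
```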
 I think binary_search never should work on list. It does really bad thing when
linear search is the correct thing. Do you think it should?

necessary operations. I think no sensible programmer should use it.

This is very very big misunderstanding. I see this written again. I think somebody else also explained. In concrete code you can make sensible choice. But in generic code you can not choice. I think STL book is really helpful. Also when you maintain code you need compiler to help. If code still compiles but does O(n*n) it is bad!

If you are writing generic code, then your code is not choosing the data structure to use. It is the responsibility of the calling code to use an efficient data structure for the operation or deal with the extra time that the operation will take.
 It's the job of whatever creates the data structure. 
 By giving two functions where the only difference is performance 
 guarantees, you're allowing the called code to determine what data 
 structures it accepts, not based on the operations the structure 
 supports, but based on the performance of those data structures.

 D's standard arrays are a bad example of this: since they're built in, 
 everyone uses them, so it's difficult to replace them with a linked list 
 or something else. The effect is more pronounced with associative 
 arrays, since there isn't as much choice in lists as in dictionaries. 
 But they're so damn useful, and the syntax is hard to match.


inheritance, or interfaces, you can no longer use templates.

I do not understand. Thank you, Dee Girl

You have builtin arrays. Nothing can inherit from them. Assume you want your function to work with anything that has the operation:

T opIndex(int);

You can do this:

void operation(T)(T value)
{
    for (int i = 0; i < some_number; i++)
        do_stuff(value[i]);
}

Now you want to put this in a class and maybe override the behavior in a subclass. Fail; you can't; it's a template. You have to choose between an interface, forcing people to use some collection class, or a primitive array, forcing people to use arrays.
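The same trade-off can be sketched in C++ (the names sum_first and IIndexable are invented for illustration):

```cpp
#include <cstddef>
#include <vector>

// Generic version: works with built-in arrays, std::vector, or anything with
// operator[]. But a function template cannot be virtual, so its behavior
// cannot be overridden in a subclass.
template <typename T>
int sum_first(const T& value, std::size_t n) {
    int s = 0;
    for (std::size_t i = 0; i < n; ++i)
        s += value[i];
    return s;
}

// Interface version: overridable, but callers must now wrap their data in a
// class implementing the interface; plain built-in arrays are excluded.
struct IIndexable {
    virtual int opIndex(std::size_t i) const = 0;
    virtual ~IIndexable() {}
};
```

The template covers every indexable type but is closed to runtime polymorphism; the interface is open to overriding but shuts out primitives. That is exactly the forced choice described above.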
Aug 28 2008
parent Dee Girl <deegirl noreply.com> writes:
Christopher Wright Wrote:

 Dee Girl wrote:
 Christopher Wright Wrote:
 
 Dee Girl wrote:
 Christopher Wright Wrote:

 Dee Girl wrote:
 I am confused to. Why is making [] do find good and making find do [] wrong?
How do you make decision? To me it looks the opposite way. Because find is less
demanding contract than []. In design a stronger contract can replace a weaker
contract. Not the inverse. Thank you, Dee Girl

giving the data to.


part of the interface. If you do, that hurts when you want to optimize for a common case but still allow some inefficient operations.

I think it is really helpful if you learn STL. Very opening for your mind. Because STL does exactly you said. For example there is function distance(). It computes distance between iterators in O(n). But if iterators are random it works in O(1). So function is as fast as possible. And has uniform interface.

The problem with your suggestions -- I have no idea whether your suggestions are related to STL or how, since I don't know the STL -- is that it's too easy for a library developer to prevent people from using a particular data structure just because it implements an operation in a slow manner.
 If you want to define some empty interfaces, such as:
 interface IFastIndexList : IList {}

 These would allow you to do things like:
 bool containsSorted (IList list, Object element)
 {
 	auto fast = cast(IFastIndexList)list;
 	if (fast) return (binarySearch(list, element) >= 0);
 	// default implementation here
 }

 However, your library should not have any publicly exposed operations 
 that only take IFastIndexList, and it's silly to define different 
 functions for indexing in an IFastIndexList versus an IList.

Again. Why make design on your knees (I mean in a hurry) when STL has much better and beautiful design? I think STL book helps very much!

I don't know of STL's design. A collection has a set of supported operations. A supported operation is one that has a correct implementation. An operation need not be efficient to be supported, such as opIndex for linked lists. A collection has guarantees about asymptotic complexity. (Or maybe not.) These are two different responsibilities. It is a terrible thing to conflate them.
 I think binary_search never should work on list. It does really bad thing when
linear search is the correct thing. Do you think it should?

necessary operations. I think no sensible programmer should use it.

This is very very big misunderstanding. I see this written again. I think somebody else also explained. In concrete code you can make sensible choice. But in generic code you can not choice. I think STL book is really helpful. Also when you maintain code you need compiler to help. If code still compiles but does O(n*n) it is bad!

If you are writing generic code, then your code is not choosing the data structure to use. It is the responsibility of the calling code to use an efficient data structure for the operation or deal with the extra time that the operation will take.
 It's the job of whatever creates the data structure. 
 By giving two functions where the only difference is performance 
 guarantees, you're allowing the called code to determine what data 
 structures it accepts, not based on the operations the structure 
 supports, but based on the performance of those data structures.

 D's standard arrays are a bad example of this: since they're built in, 
 everyone uses them, so it's difficult to replace them with a linked list 
 or something else. The effect is more pronounced with associative 
 arrays, since there isn't as much choice in lists as in dictionaries. 
 But they're so damn useful, and the syntax is hard to match.


inheritance, or interfaces, you can no longer use templates.

I do not understand. Thank you, Dee Girl

You have builtin arrays. Nothing can inherit from them. Assume you want your function to work with anything that has the operation:

T opIndex(int);

You can do this:

void operation(T)(T value)
{
    for (int i = 0; i < some_number; i++)
        do_stuff(value[i]);
}

Now you want to put this in a class and maybe override the behavior in a subclass. Fail; you can't; it's a template. You have to choose between an interface, forcing people to use some collection class, or a primitive array, forcing people to use arrays.

Hi, Christopher! Thank you for answering my post. Thank you very much. Unfortunately I must leave this discussion; the subject has become very long. There are many mistakes in the post above, and in Nick's post, that I want to explain, but I do not know how to explain them very well, and it takes me a very long time! I cannot continue. The discussion has become about STL design, and to explain I would have to explain STL first. But that is impossible for me ^_^. I think if you want to discuss STL, it is good to know it. It is very brave that you can discuss without knowing! I cannot do it and I admire your strong opinion. But even a strong opinion can be a mistake. And even a strong opinion can change with knowledge. I think it helps very much if you learn STL; then even if your opinion does not change, you have more knowledge. Good luck! Thank you and sorry, Dee Girl
Aug 29 2008
prev sibling parent "Nick Sabalausky" <a a.a> writes:
"Dee Girl" <deegirl noreply.com> wrote in message 
news:g94oup$2mk7$1 digitalmars.com...
 Christopher Wright Wrote:

 Dee Girl wrote:
 I am confused to. Why is making [] do find good and making find do [] 
 wrong? How do you make decision? To me it looks the opposite way. 
 Because find is less demanding contract than []. In design a stronger 
 contract can replace a weaker contract. Not the inverse. Thank you, Dee 
 Girl

Choosing the data structure used is never the job of the code you're giving the data to.

Yes. But code should refuse data if data does not respect an interface. I think binary_search never should work on list. It does really bad thing when linear search is the correct thing. Do you think it should?
 It's the job of whatever creates the data structure.
 By giving two functions where the only difference is performance
 guarantees, you're allowing the called code to determine what data
 structures it accepts, not based on the operations the structure
 supports, but based on the performance of those data structures.

I do not think there is anything wrong. And for me performance and complexity is very different words. Performance is like dmd -O -release against dmd -debug. Or asm implementation of function. Complexity is very different thing and essential for algorithm.

True. But, since performance (as you described it) is not particularly relevant to these discussions, some of us have been (perhaps incorrectly) using "performance" and "complexity" interchangeably. We mean "complexity".
Aug 27 2008
prev sibling next sibling parent reply "Nick Sabalausky" <a a.a> writes:
"Dee Girl" <deegirl noreply.com> wrote in message 
news:g94k7i$2ahk$1 digitalmars.com...
 Derek Parnell Wrote:

 On Wed, 27 Aug 2008 17:08:47 -0400, Nick Sabalausky wrote:

 The way I see it, encapsulation is all about the black box idea. And 
 the
 only things you can see from outside the black box are the inputs and
 outputs.

Well said.

I am sorry I will say my opinion. This sounds good but is simplistic. Black box is good in principle. But it assume you found right interface for the black box. If you define bad interface you have a bad black box. Also please remember that iterator is black box also. But it defines right interface.
 Indexing: Return the element at position N.
 Find: Return the position of the element that is equal (value-equal or
 identity-equal, depending on the find) to X .

Exactly.
 So if there's a function that returns the element at position N and you 
 call
 it "find" just because of the *internal implementation* of the function 
 is
 an incrementing loop, I consider that to be both a violation of
 encapsulation (because the name has been chosen based on the inner 
 workings
 of the function, not its input-output relationship) and a violation of 
 the
 definitions.

Not to mention just plain confusing.

I am confused to. Why is making [] do find good and making find do [] wrong? How do you make decision? To me it looks the opposite way. Because find is less demanding contract than []. In design a stronger contract can replace a weaker contract. Not the inverse. Thank you, Dee Girl

Making [] do "find" is never good. The [n] should always mean "return the element at position n". That is never "find", it's always "indexing" (Although some people here have disagreed and claimed that "return the element at position n" on a linked list is "find", which might be why you're confused).
Aug 27 2008
parent reply lurker <lurker lurk.com> writes:
Nick Sabalausky Wrote:

 "Dee Girl" <deegirl noreply.com> wrote in message 
 news:g94k7i$2ahk$1 digitalmars.com...
 Derek Parnell Wrote:

 On Wed, 27 Aug 2008 17:08:47 -0400, Nick Sabalausky wrote:

 The way I see it, encapsulation is all about the black box idea. And 
 the
 only things you can see from outside the black box are the inputs and
 outputs.

Well said.

I am sorry I will say my opinion. This sounds good but is simplistic. Black box is good in principle. But it assume you found right interface for the black box. If you define bad interface you have a bad black box. Also please remember that iterator is black box also. But it defines right interface.
 Indexing: Return the element at position N.
 Find: Return the position of the element that is equal (value-equal or
 identity-equal, depending on the find) to X .

Exactly.
 So if there's a function that returns the element at position N and you
 call it "find" just because the *internal implementation* of the function
 is an incrementing loop, I consider that to be both a violation of
 encapsulation (because the name has been chosen based on the inner
 workings of the function, not its input-output relationship) and a
 violation of the definitions.

Not to mention just plain confusing.

I am confused too. Why is making [] do find good and making find do [] wrong? How do you make decision? To me it looks the opposite way. Because find is less demanding contract than []. In design a stronger contract can replace a weaker contract. Not the inverse. Thank you, Dee Girl

Making [] do "find" is never good. The [n] should always mean "return the element at position n". That is never "find", it's always "indexing" (Although some people here have disagreed and claimed that "return the element at position n" on a linked list is "find", which might be why you're confused).

(I hate myself for letting myself get into this......) It's you who is confused, and badly. Moreover, I think Dee said she's confused sort of ironically. Her posts are nowhere near confusion. I can only say she kicks some serious butt. She merely reveals your confusion. Whenever she says something we should be listening.

Come to your senses. Open any algorithms book. Finding the nth element in a list is linear search. It does not matter whether you are looking for a value or an index. Search is search is search. Claiming a linear search is not a linear search is just empty rhetoric and semantic masturbation.

At the beginning of this huge thread I was straight in the I-dont-care-about-theory-crap-opIndex-is-convenient camp. But the arguments put forth, especially by Dee and (I hate to say it) Dan (Dan, you ARE a pompous prick), got me seriously thinking. Dee had me at the contract metaphor.
Aug 27 2008
next sibling parent "Nick Sabalausky" <a a.a> writes:
"lurker" <lurker lurk.com> wrote in message 
news:g95e2c$1rbf$1 digitalmars.com...
 Finding the nth element in a list is linear search. It does not matter 
 whether you are looking for a value or an index.

In terms of under-the-hood implementation, yes. In terms of input/output interface, no.
Aug 28 2008
prev sibling parent reply "Nick Sabalausky" <a a.a> writes:
"lurker" <lurker lurk.com> wrote in message 
news:g95e2c$1rbf$1 digitalmars.com...
 Come to your senses. Open any algorithms book. Finding the nth element in 
 a list is linear search. It does not matter whether you are looking for a 
 value or an index. Search is search is search. Claiming a linear search 
 is not a linear search is just empty rhetoric and semantic masturbation.

Not that I normally go to such lengths for online debates, but I just happened to have my algorithms book a few feet away, and, well, I really was curious what it would say... Apparently, not much.

The Second Edition of "Introduction to Algorithms" from MIT Press doesn't appear to back either of us on this point. On page 198, it lists "Operations on dynamic sets". "Search" is defined as retrieving a pointer to an element, given the element's "key" (i.e., not index or value; the book defines "key" on the prior page, and the definition doesn't match what we would consider an index or value). But none of the other operations are anything that would correspond to an "Indexing" operation.

The section on linked lists (pages 204-209) doesn't provide any more clarification. It still treats search as retrieving a pointer to an element given an associated key and doesn't mention anything about getting the "Nth" element (perhaps not surprising, since the *implementation* of such an operation would obviously be very similar to search).
Aug 28 2008
parent Dee Girl <deegirl noreply.com> writes:
Nick Sabalausky Wrote:

 "lurker" <lurker lurk.com> wrote in message 
 news:g95e2c$1rbf$1 digitalmars.com...
 Come to your senses. Open any algorithms book. Finding the nth element in 
 a list is linear search. It does not matter whether you are looking for a 
 value or an index. Search is search is search. Claiming a linear search 
 is not a linear search is just empty rhetoric and semantic masturbation.

Not that I normally go to such lengths for online debates, but I just happened to have my algorithms book a few feet away, and, well, I really was curious what it would say... Apparently, not much.

The Second Edition of "Introduction to Algorithms" from MIT Press doesn't appear to back either of us on this point. On page 198, it lists "Operations on dynamic sets". "Search" is defined as retrieving a pointer to an element, given the element's "key" (i.e., not index or value; the book defines "key" on the prior page, and the definition doesn't match what we would consider an index or value). But none of the other operations are anything that would correspond to an "Indexing" operation.

The section on linked lists (pages 204-209) doesn't provide any more clarification. It still treats search as retrieving a pointer to an element given an associated key and doesn't mention anything about getting the "Nth" element (perhaps not surprising, since the *implementation* of such an operation would obviously be very similar to search).

The properties are important, Nick-san. Not the name! If it looks like a duck and quacks like a duck then it is linear search ^_^.
Aug 28 2008
prev sibling parent reply Fawzi Mohamed <fmohamed mac.com> writes:
On 2008-08-28 00:24:50 +0200, Dee Girl <deegirl noreply.com> said:

 Derek Parnell Wrote:
 
 On Wed, 27 Aug 2008 17:08:47 -0400, Nick Sabalausky wrote:
 
 The way I see it, encapsulation is all about the black box idea. And the
 only things you can see from outside the black box are the inputs and
 outputs.

Well said.

I am sorry I will say my opinion. This sounds good but is simplistic. Black box is good in principle. But it assume you found right interface for the black box. If you define bad interface you have a bad black box. Also please remember that iterator is black box also. But it defines right interface.

I agree with the meaning, but I disagree with the example. I think that iterators are an example of a bad interface; since others also brought up the iterator as a good example, I thought that I should say something.

An iterator should be like a generator: it should have a method next, and one at_end or something similar (and maybe prev() and at_start() if it can also go back), packaged in a single struct; furthermore it should work seamlessly with a kind of for_each(x;iterator) construct.

Instead C++ chose to have begin & end iterators, simply because with that construct it is trivial for the compiler to optimize it for arrays, and you can use pointers as iterators without a cast/constructor. This means a worse interface for 99% of the uses. Apart from arrays and vectors I think one is better off without an end iterator, and even when this is not the case, writing something like for_each(x;FromTo(a,b)), with FromTo constructing a generator, is (I think) better than for(i=t.begin();i!=t.end();++i), and the implementation of a generator itself is easier (no ==, !=, increment, decrement (pre/post), ... for performance reasons).

As I believe that the optimizations needed to make the better interface as efficient as the iterator one are perfectly doable (some work, yes, but not extremely much so), I see no good reason for C++'s design.

Fawzi
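The generator-style interface described above can be sketched in a few lines of C++ (the names next, at_end, and FromTo follow the post but are otherwise hypothetical; b is treated as exclusive here, which the post leaves unspecified):

```cpp
// A generator-style iterator: one struct carrying its own termination
// test, instead of a begin/end iterator pair.
struct IntRange {
    int cur;
    int stop;                                   // one past the last value
    bool at_end() const { return cur == stop; }
    int  next()         { return cur++; }       // yield current value, advance
};

// The FromTo(a, b) constructor function suggested in the post.
IntRange FromTo(int a, int b) { return IntRange{a, b}; }

// Usage corresponding to for_each(x; FromTo(a, b)): no second iterator,
// no ==/!= operators, just a loop over one object.
int sum_from_to(int a, int b) {
    int s = 0;
    for (IntRange r = FromTo(a, b); !r.at_end(); )
        s += r.next();
    return s;
}
```

Compare this with the C++ idiom `for (i = t.begin(); i != t.end(); ++i)`, where the caller must hold two iterators and the element type must supply comparison operators.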
Aug 28 2008
next sibling parent Fawzi Mohamed <fmohamed mac.com> writes:
On 2008-08-28 11:47:22 +0200, Fawzi Mohamed <fmohamed mac.com> said:

 On 2008-08-28 00:24:50 +0200, Dee Girl <deegirl noreply.com> said:
 
 Derek Parnell Wrote:
 
 On Wed, 27 Aug 2008 17:08:47 -0400, Nick Sabalausky wrote:
 
 The way I see it, encapsulation is all about the black box idea. And the
 only things you can see from outside the black box are the inputs and
 outputs.

Well said.

I am sorry I will say my opinion. This sounds good but is simplistic. Black box is good in principle. But it assume you found right interface for the black box. If you define bad interface you have a bad black box. Also please remember that iterator is black box also. But it defines right interface.

I agree with the meaning, but I disagree with the example. I think that iterators are an example of a bad interface; since others also brought up the iterator as a good example, I thought that I should say something.

An iterator should be like a generator: it should have a method next, and one at_end or something similar (and maybe prev() and at_start() if it can also go back), packaged in a single struct; furthermore it should work seamlessly with a kind of for_each(x;iterator) construct.

Instead C++ chose to have begin & end iterators, simply because with that construct it is trivial for the compiler to optimize it for arrays, and you can use pointers as iterators without a cast/constructor. This means a worse interface for 99% of the uses. Apart from arrays and vectors I think one is better off without an end iterator, and even when this is not the case, writing something like for_each(x;FromTo(a,b)), with FromTo constructing a generator, is (I think) better than for(i=t.begin();i!=t.end();++i), and the implementation of a generator itself is easier (no ==, !=, increment, decrement (pre/post), ... for performance reasons).

As I believe that the optimizations needed to make the better interface as efficient as the iterator one are perfectly doable (some work, yes, but not extremely much so), I see no good reason for C++'s design.

Fawzi

Please note, before the discussion gets messy: I find the hierarchy and kinds of iterators in C++ well done (input, output, ..., random), but the interface of them (especially the most used ones) flawed.
Aug 28 2008
prev sibling next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2008-08-28 05:47:22 -0400, Fawzi Mohamed <fmohamed mac.com> said:

 An iterator should be like a generator, have a method next, and one 
 at_end or something similar packaged (and maybe prev() and at_start() 
 if it can also go back) in a single struct, furthermore it should work 
 seamlessly with a kind of for_each(x;iterator) construct.

I perfectly agree with this. This is why I much prefer Objective-C's NSEnumerator over C++ iterators: they're much simpler to use. Some time ago, Walter asked if he could change D arrays to have internally an end pointer instead of a length value. He said it was to allow arrays, or slices, to be used as iterators efficiently. I hope he hasn't given up on that idea, since slices are much easier to understand and use than C++-style iterators.
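The trade-off behind the end-pointer idea can be sketched by comparing the two slice layouts side by side (both structs are illustrative sketches, not D's actual array ABI): with an end pointer, advancing the slice touches one field and emptiness is a pointer comparison, which is exactly what an iterator needs.

```cpp
#include <cstddef>

// Length-based slice: how D arrays are laid out today (pointer + length).
struct SliceLen { const int* ptr; std::size_t len; };

// End-pointer slice: the alternative layout discussed above.
struct SliceEnd { const int* ptr; const int* end; };

// Popping the front element: the length-based form must update both
// fields, the end-pointer form only moves ptr.
void pop_front(SliceLen& s) { ++s.ptr; --s.len; }
void pop_front(SliceEnd& s) { ++s.ptr; }

// Emptiness test: a comparison against zero vs. a pointer comparison.
bool empty(const SliceLen& s) { return s.len == 0; }
bool empty(const SliceEnd& s) { return s.ptr == s.end; }
```

Either way, the slice remains a single self-contained value that can be iterated without a separate end iterator.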
 Instead C++ choose to have begin & end iterators, simply because with 
 that construct it is trivial for the compiler to optimize it for 
 arrays, and you can use pointers as iterators without a 
 cast/constructor.

I would add that if you are using many iterators for the same array (vector) in a given algorithm or in function parameters, having only one "end" pointer saves some space. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Aug 28 2008
parent Fawzi Mohamed <fmohamed mac.com> writes:
On 2008-08-28 13:06:59 +0200, Michel Fortin <michel.fortin michelf.com> said:

 On 2008-08-28 05:47:22 -0400, Fawzi Mohamed <fmohamed mac.com> said:
 
 An iterator should be like a generator, have a method next, and one 
 at_end or something similar packaged (and maybe prev() and at_start() 
 if it can also go back) in a single struct, furthermore it should work 
 seamlessly with a kind of for_each(x;iterator) construct.

I perfectly agree with this. This is why I much prefer Objective-C's NSEnumerator over C++ iterators: they're much simpler to use. Some time ago, Walter asked if he could change D arrays to have internally an end pointer instead of a length value. He said it was to allow arrays, or slices, to be used as iterators efficiently. I hope he hasn't given up on that idea, since slices are much easier to understand and use than C++-style iterators.

I agree
 Instead C++ choose to have begin & end iterators, simply because with 
 that construct it is trivial for the compiler to optimize it for 
 arrays, and you can use pointers as iterators without a 
 cast/constructor.

I would add that if you are using many iterators for the same array (vector) in a given algorithm or in function parameters, having only one "end" pointer saves some space.

OK, granted, but I would say that one almost never takes advantage of this, and even then the real advantage is probably small.
Aug 28 2008
prev sibling parent Fawzi Mohamed <fmohamed mac.com> writes:
On 2008-08-28 13:31:26 +0200, "David Wilson" <dw botanicus.net> said:

 2008/8/28 Fawzi Mohamed <fmohamed mac.com>:
 On 2008-08-28 00:24:50 +0200, Dee Girl <deegirl noreply.com> said:
 
 Derek Parnell Wrote:
 
 On Wed, 27 Aug 2008 17:08:47 -0400, Nick Sabalausky wrote:
 
 The way I see it, encapsulation is all about the black box idea. And the
 only things you can see from outside the black box are the inputs and
 outputs.

Well said.

I am sorry I will say my opinion. This sounds good but is simplistic. Black box is good in principle. But it assume you found right interface for the black box. If you define bad interface you have a bad black box. Also please remember that iterator is black box also. But it defines right interface.

I agree with the meaning, but I disagree with the example. I think that iterators are an example of a bad interface; since others also brought up the iterator as a good example, I thought that I should say something.

An iterator should be like a generator: it should have a method next, and one at_end or something similar (and maybe prev() and at_start() if it can also go back), packaged in a single struct; furthermore it should work seamlessly with a kind of for_each(x;iterator) construct.

Instead C++ chose to have begin & end iterators, simply because with that construct it is trivial for the compiler to optimize it for arrays, and you can use pointers as iterators without a cast/constructor. This means a worse interface for 99% of the uses. Apart from arrays and vectors I think one is better off without an end iterator, and even when this is not the case, writing something like for_each(x;FromTo(a,b)), with FromTo constructing a generator, is (I think) better than for(i=t.begin();i!=t.end();++i), and the implementation of a generator itself is easier (no ==, !=, increment, decrement (pre/post), ... for performance reasons)

It is worse in simple "for each element" uses, but one of the cooler things about C++ iterators is exactly that they are exchangeable for any pointer-like object, and that it's trivial to iterate over a subset of a container. Emulating pointers means that in some ways, once you grok how they work, they are more trivial than e.g. Python or D since there is no further specialised syntax beyond "*", "++", and maybe "--". I love the concise D and Python syntaxes, just playing devil's advocate here. :)

I appreciate it, and in fact a general iterator is more powerful than the simple interface I gave. And the iterator concept *is* a powerful one. The thing is, a general random iterator can also be implemented as a single structure (not a pair of start and end); you just need more methods (and a comparison operator if you want to compare iterators). Indeed you have a little overhead for arrays and vectors (you also store the end), but I think that in most cases the real difference is 0, and a nicer handling of the most common case more than offsets all these considerations (and it is also nicer to implement a new iterator).
 
 As I believe that the optimizations to make the better interface be as
 efficient as the iterator one are perfectly doable (some work, yes, but not
 extremely much so), I see no good reason for C++ design.
 
 Fawzi


Aug 28 2008
prev sibling next sibling parent "David Wilson" <dw botanicus.net> writes:
2008/8/28 Fawzi Mohamed <fmohamed mac.com>:
 On 2008-08-28 00:24:50 +0200, Dee Girl <deegirl noreply.com> said:

 Derek Parnell Wrote:

 On Wed, 27 Aug 2008 17:08:47 -0400, Nick Sabalausky wrote:

 The way I see it, encapsulation is all about the black box idea. And the
 only things you can see from outside the black box are the inputs and
 outputs.

Well said.

I am sorry I will say my opinion. This sounds good but is simplistic. Black box is good in principle. But it assume you found right interface for the black box. If you define bad interface you have a bad black box. Also please remember that iterator is black box also. But it defines right interface.

I agree with the meaning, but I disagree with the example. I think that iterators are an example of a bad interface; since others also brought up the iterator as a good example, I thought that I should say something.

An iterator should be like a generator: it should have a method next, and one at_end or something similar (and maybe prev() and at_start() if it can also go back), packaged in a single struct; furthermore it should work seamlessly with a kind of for_each(x;iterator) construct.

Instead C++ chose to have begin & end iterators, simply because with that construct it is trivial for the compiler to optimize it for arrays, and you can use pointers as iterators without a cast/constructor. This means a worse interface for 99% of the uses. Apart from arrays and vectors I think one is better off without an end iterator, and even when this is not the case, writing something like for_each(x;FromTo(a,b)), with FromTo constructing a generator, is (I think) better than for(i=t.begin();i!=t.end();++i), and the implementation of a generator itself is easier (no ==, !=, increment, decrement (pre/post), ... for performance reasons)

It is worse in simple "for each element" uses, but one of the cooler things about C++ iterators is exactly that they are exchangeable for any pointer-like object, and that it's trivial to iterate over a subset of a container. Emulating pointers means that in some ways, once you grok how they work, they are more trivial than e.g. Python or D since there is no further specialised syntax beyond "*", "++", and maybe "--". I love the concise D and Python syntaxes, just playing devil's advocate here. :)
 As I believe that the optimizations to make the better interface be as
 efficient as the iterator one are perfectly doable (some work, yes, but not
 extremely much so), I see no good reason for C++ design.

 Fawzi

Aug 28 2008
prev sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Thu, 28 Aug 2008 13:47:22 +0400, Fawzi Mohamed <fmohamed mac.com> wrote:

 On 2008-08-28 00:24:50 +0200, Dee Girl <deegirl noreply.com> said:

 Derek Parnell Wrote:

 On Wed, 27 Aug 2008 17:08:47 -0400, Nick Sabalausky wrote:

 The way I see it, encapsulation is all about the black box idea. And  
 the
 only things you can see from outside the black box are the inputs and
 outputs.


Black box is good in principle. But it assume you found right interface for the black box. If you define bad interface you have a bad black box. Also please remember that iterator is black box also. But it defines right interface.

I agree with the meaning, but I disagree with the example. I think that iterators are an example of a bad interface; since others also brought up the iterator as a good example, I thought that I should say something.

An iterator should be like a generator: it should have a method next, and one at_end or something similar (and maybe prev() and at_start() if it can also go back), packaged in a single struct; furthermore it should work seamlessly with a kind of for_each(x;iterator) construct.

Instead C++ chose to have begin & end iterators, simply because with that construct it is trivial for the compiler to optimize it for arrays, and you can use pointers as iterators without a cast/constructor. This means a worse interface for 99% of the uses. Apart from arrays and vectors I think one is better off without an end iterator, and even when this is not the case, writing something like for_each(x;FromTo(a,b)), with FromTo constructing a generator, is (I think) better than for(i=t.begin();i!=t.end();++i), and the implementation of a generator itself is easier (no ==, !=, increment, decrement (pre/post), ... for performance reasons).

As I believe that the optimizations needed to make the better interface as efficient as the iterator one are perfectly doable (some work, yes, but not extremely much so), I see no good reason for C++'s design.

Fawzi

Agreed. I usually write iterators that support "T getValue()", "void moveNext()" and "bool isValid()" operations. Also, I don't like the Java-style "T getNext()" iterator idiom and think it should be split into two methods: getValue() and moveNext().
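The split interface described above can be sketched in C++ (the class and method names mirror the post but are otherwise hypothetical): getValue() reads without advancing and moveNext() advances without reading, so a value can be inspected more than once, which a Java-style getNext() cannot do.

```cpp
#include <cstddef>
#include <vector>

// Iterator with the three operations from the post: isValid, getValue,
// moveNext. Reading and advancing are deliberately separate.
class VecIter {
    const std::vector<int>* v;
    std::size_t i = 0;
public:
    explicit VecIter(const std::vector<int>& vec) : v(&vec) {}
    bool isValid()  const { return i < v->size(); }
    int  getValue() const { return (*v)[i]; }  // read, do not advance
    void moveNext()       { ++i; }             // advance, do not read
};

// Typical traversal with the split interface.
int sum(const std::vector<int>& v) {
    int s = 0;
    for (VecIter it(v); it.isValid(); it.moveNext())
        s += it.getValue();
    return s;
}
```

With getNext() collapsed into one call, peeking at the current element would require caching it on the caller's side.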
Aug 28 2008