
D - Arrays, Slices, Cases

reply Karl Bochert <kbochert ix.netcom.com> writes:
Having too much time on my hands, I submit the following
summary of my viewpoint.  At the very least it shows how I
will rationalize the method that D uses.

I would welcome any other rationale that would make D's
choices seem more natural.

(After 20 yrs of C, I still have fencepost problems! :-)

Concerning Arrays, slices, and cases.

1) Ordinal Arrays
     Arrays are ordered sets of elements accessed by their position
     in the set.
     An array with N elements has a first element and an
     N'th element.
     A slice of the entire array is arr[1..N].
     A slice excluding the ends is arr[2..N-1].

  end-inclusion:
      a slice can be thought of as the result of a procedure
      that (somehow) extracts the range, similar to:
          result = slice (&arr[start], &arr[end]);
      Obviously, end is included.

      a slice can be thought of as the result of a loop that
	  'extracts' elements of the array, similar to:
         for (i = start; i <= end; ++i)  result[1+i-start] = arr[i];
      Obviously, like the for loop,  'end' is included.
	  Cases are also end-inclusive:
      case [1]:
      case [2 to 4]:
      case [5]:

  end-exclusion:
      Why would anyone do that?
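The end-inclusive extraction loop above can be sketched in Python (hedged: Python lists are 0-based, so the hypothetical helper below shifts indices down by one to emulate the 1-based, ordinal view; it is not D code):

```python
def slice_inclusive(arr, start, end):
    # 1-based, end-inclusive: elements start..end, both ends included.
    # arr[1..N] is then the whole array, and the length is end - start + 1.
    return [arr[i - 1] for i in range(start, end + 1)]

letters = ['a', 'b', 'c', 'd', 'e']          # N = 5
whole = slice_inclusive(letters, 1, 5)       # arr[1..N]   -> all five elements
inner = slice_inclusive(letters, 2, 4)       # arr[2..N-1] -> both ends excluded
```

Note where the fencepost lives in this convention: the length of [start..end] is end - start + 1.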

-----------

2) Cardinal Arrays
    Arrays are chunks of storage accessed by the offset from
    their start.
    An array with N elements has a zero'th element and an N-1'th
    element.
    A slice of the entire array is arr[0..N-1] (end-inclusive),
    or arr[0..N] (end-exclusive).

    end-inclusion:
      A slice excluding the ends is arr[1..N-2].
      a slice can be thought of as the result of a procedure
      that (somehow) extracts the range, similar to:
          result = slice (&arr[start], &arr[end]);
      Obviously, end is included.
	  Cases are also end-inclusive:
      case [1]:
      case [2 to 4]:
      case [5]:
	  

    end-exclusion:
      A slice excluding the ends is arr[1..N-1].
      a slice can be thought of as the result of a loop that
      'extracts' elements of the array, similar to:
         for (i = start; i < end; ++i) result[i-start] = arr[i];
      Obviously, like the for loop, end is excluded.
      Either cases are also end-exclusive:
         case [1]:
         case [2 to 5]:
         case [5]:
      or cases are end-inclusive (different from slices):
         case [1]:
         case [2 to 4]:
         case [5]:
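The end-exclusive (half-open) convention, which is the one D's slices use, can be sketched with a small Python helper (the helper name is made up for illustration; this is not D semantics verbatim):

```python
def slice_exclusive(arr, start, end):
    # 0-based, end-exclusive: elements start..end-1; end itself is not taken.
    # arr[0..N] is then the whole array, and the length is simply end - start.
    return [arr[i] for i in range(start, end)]

letters = ['a', 'b', 'c', 'd', 'e']          # N = 5
whole = slice_exclusive(letters, 0, 5)       # arr[0..N]   -> all five elements
inner = slice_exclusive(letters, 1, 4)       # arr[1..N-1] -> both ends excluded
```

The usual argument for this convention is visible in the comments: the length needs no +1, and adjacent slices [a..b] and [b..c] butt together without overlap.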


D is:
  Cardinal arrays, end-exclusive, case-inclusive?

I am for:
  Ordinal arrays, end-inclusive, case-inclusive.
  simpler and more consistent.

Karl Bochert
Feb 21 2002
parent reply not here <not.known this.address.com> writes:
On Fri, 22 Feb 2002 00:43:28 GMT, Karl Bochert <kbochert ix.netcom.com> wrote:
 
 Having too much time on my hands, I submit the following
 summary of my viewpoint.  At the very least it shows how I
 will rationalize the method that D uses.
 

[big snip]
 
 D is:
   Cardinal arrays, end-exclusive, case-inclusive?
 
 I am for:
   Ordinal arrays, end-inclusive, case-inclusive.
   simpler and more consistent.

Hi Karl,

Whether one uses an index or an offset to reference an array element is often influenced by what we have been exposed to already. However, I'd like to approach the issue in a different manner.

To me an 'index' is 1-based and is a normal way that people think about enumerated elements. If you go up to somebody with a list of items and asked them to number them, the person would normally start with the number 1.

An 'offset' is 0-based and is the normal way that computers get access to memory. Addr + offset gives another address - the start of the element in question.

I believe that programming languages have a primary aim of helping people describe their algorithms. In other words, programming languages are for people and not computers - that's why we have compilers. So, I would hold that 1-based array referencing is the normal way for people to describe what they are trying to do.

Furthermore, an index has the connotation that the entire element is being referenced, whereas an offset is better thought of as referencing the start of an element. Thus a slice reference of say [2..4] seems to say to me that the slice encompasses element#2, element#3, and element#4 - that is, the whole of each of these elements. The fact that the length of this slice is 3 is obvious because all of the elements are being referenced.

If we were using offsets in slice notation, then [2..4] would be saying that the slice starts from the start of element #3 and ends at the start of element #5. This represents all of element#3 and all of element#4, but not any of element#5, and thus has a length of 2. But this is not how people normally view the world. I vote with Karl on this one. Besides, calculating the length of an index-notation slice is not beyond us, especially if we can do " myArray[x..y].length ".

Now consider the way we might remove an element from a dynamic array. Given that 'pos' references the element to be removed...

using Index Notation
   A1 = A1[1..pos-1] ~ A1[pos+1 .. A1.length]

using Offset Notation
   A1 = A1[0..pos] ~ A1[pos+1 .. A1.length-1]

Not a lot of difference really. Personally, I find that the index notation is more clearly telling the reader that I am trying to exclude the 'pos' element but include everything else.

------- cheers.
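The removal idiom being debated can be checked against Python lists, whose slices are 0-based and end-exclusive like D's (remove_at is a hypothetical helper for illustration, not a builtin of either language):

```python
def remove_at(a, pos):
    # Offset (0-based, end-exclusive) notation, as in D:
    #   A1 = A1[0 .. pos] ~ A1[pos+1 .. A1.length]
    # With half-open slices the two halves butt together with no -1/+1
    # adjustments: pos appears once as an exclusive end, once (plus one)
    # as an inclusive start.
    return a[:pos] + a[pos + 1:]

a1 = [10, 20, 30, 40]
a1 = remove_at(a1, 2)        # drop the element at position 2 (the 30)
```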
Feb 21 2002
next sibling parent reply Karl Bochert <kbochert ix.netcom.com> writes:
 Hi Karl,
 ...
 To me an 'index' is 1-based and is a normal way that people think about
 enumerated elements. If you go up to somebody with a list of items and
 asked them to number them, the person would normally start with the
 number 1.

 An 'offset' is 0-based and is the normal way that computers get access to
 memory. Addr + offset gives another address - the start of the element in
 question.
 

 
 Furthermore, an index has the connotation that the entire element is being
 referenced, whereas an  offset is better thought of referencing the
 start of an element. 
 

 ...
using Index Notation
   A1 = A1[1..pos-1] ~ A1[pos+1 .. A1.length]

using Offset Notation
   A1 = A1[0..pos] ~ A1[pos+1 .. A1.length-1]

Didn't you get that wrong?

using Offset Notation (exclusive)
   A1 = A1[0..pos] ~ A1[pos+1 .. A1.length]

using Offset Notation (inclusive)
   A1 = A1[0..pos-1] ~ A1[pos+1 .. A1.length-1]

(I think that's right -- I had to draw little boxes on a sheet of graph paper)

I have this sneaky feeling that the reason D uses offsets instead of indexes is to be backward-compatible with C, and therefore more familiar.

Karl Bochert
Feb 21 2002
parent reply not here <not.known this.address.com> writes:
On Fri, 22 Feb 2002 05:49:50 GMT, Karl Bochert <kbochert ix.netcom.com> wrote:
 Hi Karl,
 ...
 To me an 'index' is 1-based and is a normal way that people think about
 enumerated elements. If you go up to somebody with a list of items and
 asked them to number them, the person would normally start with the
 number 1.

 An 'offset' is 0-based and is the normal way that computers get access to
 memory. Addr + offset gives another address - the start of the element in
 question.
 


I really think that too many language designers forget that it's people that have to actually use them, and not computers. The "user-interface" for most programming languages is sub-optimal. Often the language encourages hard-to-comprehend syntax, thus making it easier for people to make mistakes.
 ...
using Index Notation
   A1 = A1[1..pos-1] ~ A1[pos+1 .. A1.length]

using Offset Notation
   A1 = A1[0..pos] ~ A1[pos+1 .. A1.length-1]

Didn't you get that wrong?

using Offset Notation (exclusive)
   A1 = A1[0..pos] ~ A1[pos+1 .. A1.length]

using Offset Notation (inclusive)
   A1 = A1[0..pos-1] ~ A1[pos+1 .. A1.length-1]

(I think that's right -- I had to draw little boxes on a sheet of graph paper)

Ooops. You are right. I did get the 'index' code wrong. That might be an example of its inherent non-user-friendly interface ;-)

   A1 = A1[0..pos] ~ A1[pos+1 .. A1.length]

is what I should have coded. To the average person, knowing that 'pos' refers to the element being removed, this code looks wrong as it seems to be including A1[pos]!
  I have this sneaky feeling that the reason D uses offsets instead of
 indexes is to be backward- compatible with C, and therefore
 more familiar.
 

More familiar to whom? C/C++ coders? One would have hoped that D might be used as a replacement for C/C++, and thus newbies can learn a "better" language and not have to be backward compatible. Also, reading the D Overview, we find under the things to drop from C/C++:

"C source code compatibility. Extensions to C that maintain source compatibility have already been done (C++ and ObjectiveC). Further work in this area is hampered by so much legacy code it is unlikely that significant improvements can be made."

---- cheers.
Feb 21 2002
next sibling parent reply "Sean L. Palmer" <spalmer iname.com> writes:
"not here" <not.known this.address.com> wrote in message
news:1103_1014364009 news.digitalmars.com...
 On Fri, 22 Feb 2002 05:49:50 GMT, Karl Bochert <kbochert ix.netcom.com>

 To me an 'index' is 1-based and is a normal way that people think about
 enumerated elements. If you go up to somebody with a list of items and
 asked them to number them, the person would normally start with the
 number 1.

 An 'offset' is 0-based and is the normal way that computers get access to
 memory. Addr + offset gives another address - the start of the element in
 question.

 I really think that too many language designers forget that it's people
 that have to actually use them, and not computers. The "user-interface"
 for most programming languages is sub-optimal. Often the language
 encourages hard-to-comprehend syntax thus making it easier for people to
 make mistakes.
I don't know what planet you guys are from... go use BASIC or something if you want arrays that start at position 1 instead of 0.

Computer arrays start at 0. Every programmer needs to learn this right away. It's very fundamental, and trying to "humanize" it just results in a language that requires suboptimal code generation.

I personally think they should teach people about zero earlier on in school, then we wouldn't have this problem. How would you like that? ;)

Sean
Feb 22 2002
next sibling parent "Pavel Minayev" <evilone omen.ru> writes:
"Sean L. Palmer" <spalmer iname.com> wrote in message
news:a552bk$17v$1 digitaldaemon.com...

 I personally think they should teach people about zero earlier on in
 school, then we wouldn't have this problem.  How would you like that?  ;)

Great idea! So, I have two cars, the zeroeth is red and the first is blue =)
Feb 22 2002
prev sibling parent not here <not.here this.address.com> writes:
On Fri, 22 Feb 2002 01:18:54 -0800, "Sean L. Palmer" <spalmer iname.com> wrote:
 
 I don't know what planet you guys are from... go use BASIC or something if
 you want arrays that start at position 1 instead of 0.
 
 Computer arrays start at 0.  Every programmer needs to learn this right
 away.  It's very fundamental, and trying to "humanize" it just results in a
 language that requires suboptimal code generation.
 
 I personally think they should teach people about zero earlier on in school,
 then we wouldn't have this problem.  How would you like that?  ;)

This sounds a lot like "Well thank you, Ma'am, but quite frankly, that's not how we do things around these parts".

I would have thought that with D, we have a chance to break free of the computer-centric way of doing things and instead design a language that makes life easier for coders at every possible chance. If people all around the world, in all cultures (except, it seems, veteran coders), count off things starting with one, why should we have to "retrain" them to start thinking as if they are a computer?

Yes, I know that computer arrays start at 0. Just like my high school ruler also started at zero. But that first inch is still inch #1 and not inch #0.

If one is truly concerned with suboptimal code generation, we would all still be creating hand-crafted assembler (or even machine code) programs. All we are talking about here is sometimes generating a "subtract one" opcode or similar, and today's, let alone tomorrow's, computers are very, very fast.

Isn't a compiler a tool? A tool for people to use? To make our lives easier? So let our compilers take what is normal for people and convert it for computer usage, rather than having the language make people convert what is normal for them into computer-ese.

------- cheers.
Feb 22 2002
prev sibling parent reply "Pavel Minayev" <evilone omen.ru> writes:
     A1 = A1[0..pos] ~ A1[pos+1 .. A1.length]

 is what I should have coded. To the average person, knowing the 'pos'
 refers to the element being removed, this code looks wrong as it seems
 to be including A1[pos]!

To the average C/C++/C#/Java programmer, it looks just as it should.
 More familiar to whom? C/C++ coders? One would have hoped that D might be
 used as a replacement for C/C++ and thus newbies can learn a "better"
 language and not have to be backward compatible. Also, reading the D
 Overview we find under the things to drop from C/C++:

I believe Walter said that D is not a language for beginners. BASIC, or even Pascal, would be better for that purpose. D is a practical language for practical programmers, and I don't think it's the best idea to sacrifice speed to gain such a subtle simplicity, IMO...
 "C source code compatibility. Extensions to C that maintain source
 compatibility have already been done (C++ and ObjectiveC). Further work
 in this area is hampered by so much legacy code it is unlikely that
 significant improvements can be made."

"Compatibility" means the ability to compile code from that language. This is what D is not for. But there are many programmers that know only C (or C++, or C#, or Java - the same language family) - and those people expect to find a common environment to start coding quickly, without having to learn everything from scratch.

Arrays are indexed from 0; every C programmer should remember that better than his own name - why disappoint them? 0-based indexing is a tradition too old to change - it's better to live with it, especially since it's not hard to get used to it.
Feb 22 2002
next sibling parent reply not here <not.here this.address.com> writes:
On Fri, 22 Feb 2002 16:09:36 +0300, "Pavel Minayev" <evilone omen.ru> wrote:
     A1 = A1[0..pos] ~ A1[pos+1 .. A1.length]

 is what I should have coded. To the average person, knowing the 'pos'
 refers to the element being removed, this code looks wrong as it seems
 to be including A1[pos]!

To the average C/C++/C#/Java programmer, it looks just as it should.

Should I infer then that the average D programmer is always going to be an average C/C++/C#/Java programmer too? Is this a short-sighted attitude for the future of D? Can we not expect COBOL coders to come over the fence? If not, why not?
 More familiar to whom? C/C++ coders? One would have hoped that D might be
 used as a replacement for C/C++ and thus newbies can learn a "better"
 language and not have to be backward compatible. Also, reading the D
 Overview we find under the things to drop from C/C++:

I believe Walter said that D is not a language for the beginners. BASIC, or even Pascal would be better for this purpose.

I assume that Walter is referring to people who are just learning to program. I would have thought that the fewer new things people have to learn, the sooner they can become productive. If this is so, then it would appear that a design goal for D is to assume newcomers to D will be existing C/etc. coders, so they don't have to learn too many new things. Oh well, maybe we are condemned to repeat history.
 D is a practical language
 for practical programmers, 

...meaning that Basic and Pascal are NOT practical languages, and their users are NOT practical? Ummm. Sounds a little xenophobic to me.
 and I don't think it's the best idea to
 sacrifice speed to gain such a subtle simplicity, IMO...

Why is it that we spend hours of coding time to optimise a few micro-seconds out of a program? We no longer live in the age when computer time is more expensive than people time. It seems you are willing to sacrifice coders' time rather than computer time. I don't think it's the best idea to sacrifice coding speed, IMO.
 "C source code compatibility. Extensions to C that maintain source
 compatibility have already been done (C++ and ObjectiveC). Further work
 in this area is hampered by so much legacy code it is unlikely that
 significant improvements can be made."

 "Compatibility" means the ability to compile code from that language.
 This is what D is not for. But there are many programmers that know only
 C (or C++, or C#, or Java - the same language family) - and those people
 expect to find a common environment to start coding quickly, without
 having to learn everything from scratch. Arrays are indexed from 0; every
 C programmer should remember that better than his own name - why
 disappoint them?

Heaven forbid that we should try to retrain C coders! Everyone knows that we are sacrosanct and must be protected.

Every good C programmer knows how useful the macro preprocessor is (oops, that's not in D is it?)
Every good C++ programmer knows how useful multiple inheritance can be (ooops, that's not in D is it?)
Every good C++/Java/C# programmer knows how useful namespaces can be (ooops, that's not in D is it?)
Every C programmer can type #include files in their sleep (ooops, that's not in D is it?)

Yes, I know these are a little unfair. But what I'm trying to get across is that D will already force C coders to learn/unlearn things. So why not have 1-based indexes, just like we use in the real world?
 0-based indexing is a tradition
 too old to change it - it's better to live with it, especially
 since it's not hard to get used to it.

Hey, we got tradition! You can't mess with that baby. Sure it makes things a bit harder, but you'll soon get used to that.

Is this the same as saying "We can't do that new thing because it's not what we currently do"?

I could just as equally say "1-based indexing is not hard to get used to, seeing you already do it everywhere else except when you are thinking like a computer."

----- cheers
Feb 22 2002
next sibling parent reply "Pavel Minayev" <evilone omen.ru> writes:
"not here" <not.here this.address.com> wrote in message
news:1104_1014393359 news.digitalmars.com...

 Should I infer then that the average D programmer is always going to be
 an average C/C++/C#/Java programmer too? Is this a short-sighted attitude
 for the future of D?

Definitely not. However, the most popular language nowadays is C++, so a C-centric model seems most appropriate to me here.
 ...meaning that Basic and Pascal are NOT practical languages, and their
 users are NOT practical?

Not really. It's just that C/C++ proves to be better in such cases. So should D.
 Why is it that we spend hours of coding time to optimise a few
 micro-seconds out of a program? We no longer live in the age when
 computer time is more expensive than people time. It seems you are
 willing to sacrifice coders' time rather than computer time. I don't
 think it's the best idea to sacrifice coding speed, IMO.

There is no "sacrifice" in coding speed with 0-based indexing. Nothing you can't get used to, and, in fact, many programmers all over the world already have. On the other hand, that microsecond can cost you a 20fps drop when writing a game which iterates through 100000 objects in a loop...
 Every good C programmer knows how useful the macro preprocessor is (oops,
 that's not in D is it?)

The preprocessor thing has been discussed many times, and there are _very_ serious reasons to ban it. After all, if you really need one, you can always use an external program. Otherwise, most uses of the preprocessor in C are covered by constants, inline functions, and version/debug statements in D.
 Every good C++ programmer knows how useful multiple inheritance can be
 (ooops, that's not in D is it?)

The latest discussion on the topic in this group shows that, in fact, most C++ programmers use MI rarely or don't use it at all.
 Every good C++/Java/C# programmer knows how useful namespaces can be
 (ooops, that's not in D is it?)

Namespaces ARE good. And they are present in D - implicitly, each module is a namespace. Don't forget about packages as well. And the fact that each enum has its own namespace speaks for itself.
 Every C programmer can type #include files in their sleep (ooops, that's
 not in D is it?)

import c.stdio;
 Yes I know these are a little unfair. But what I'm trying to get across
 is that D will already force C coders to learn/unlearn things. So why not
 have 1-based indexes, just like we use in the real world.

Even in the real world, indices aren't always 1-based. Then again, with 1-based arrays, you can only have 2^32-1 elements, while with 0-based you get the whole 2^32! Just think of all the benefits this gives! =)
 Hey we got tradition! You can't mess with that baby. Sure it makes things
 a bit harder but you'll soon get used to that.

"Tradition" isn't something you should get used to. It's something that _most_ people have already got used to.
 Is this the same as saying "We can't do that new thing because its not
 what we currently do"?

Almost. Just because it seems impractical.
 I could just as equally say "1-based indexing is not hard to get used to,
 seeing you already do it everywhere else except when you are thinking
 like a computer."

Well, mathematicians don't use it. Also, I understand why a user shouldn't think like a computer. But why not a programmer? Memory is indexed from 0; nothing you can do about it. So are all the assembler opcodes that your program ends up being, anyhow. You write programs for computers, not for men, after all...
Feb 22 2002
parent Karl Bochert <kbochert ix.netcom.com> writes:
On Fri, 22 Feb 2002 21:27:16 +0300, "Pavel Minayev" <evilone omen.ru> wrote:

 
 There is no "sacrifice" in coding speed with 0-based indexing.

 ...
 Nothing
 you can't get used to, and, in fact, many programmers over the world
 are already. On other hand, that microsecond can cost you 20fps drop
 when writing a game which iterates through 100000 objects in a loop...

Actually the instruction time for an add is more like 1 ns these days, especially on computers where speed is a concern. The CPU has multiple execution units, so the add would frequently be done in parallel with other instructions, costing 0 time. The +1 will often be combined with another constant, costing 0 time. Many assembly instructions used for array access use an addressing mode that adds a constant whether it is needed or not - again, 0 time.

Finally, there is the issue of how often the +1 comes up in the first place:

   arr[2] -- no +1
   arr[x] -- no +1, unless the programmer has stored the wrong thing in x

Of course you run into a problem if you pass a cardinal to something that expects an ordinal!

An interesting point: addressing arrays with cardinals implies the existence of an element at [-1]. There are no negative ordinals!
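The claim that the +1 usually costs nothing follows from the standard address computation, sketched here in Python with made-up base-address and element-size numbers:

```python
def element_addr(base, index, lower_bound, elem_size):
    # Classic formula: base + (index - lower_bound) * elem_size.
    # With lower_bound == 0 the subtraction folds away entirely; with
    # lower_bound == 1 a compiler can bias the base once (base - elem_size)
    # at compile time, so the per-access cost is identical.
    return base + (index - lower_bound) * elem_size

# The same third element of a 4-byte-element array at (hypothetical) base 1000:
addr_cardinal = element_addr(1000, 2, 0, 4)   # 0-based view: arr[2]
addr_ordinal  = element_addr(1000, 3, 1, 4)   # 1-based view: arr[3]
```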
 
 Even in the real world, indices aren't always 1-based.

indexes.
 Then again, with 1-based arrays, you can only have 2^32-1 elements,
 while with 0-based you get the whole 2^32! Just think of all the
 benefits this gives! =)

indexes. Unless you count (shudder) negative indexes. I suppose implementation issues would limit ordinal arrays though.
 Is this the same as saying "We can't do that new thing because its not

Almost. Just because it seems impractical.

written using cardinal arrays ( And that is a killer!)
 
 Well, mathematicians don't use it.
 Also, I understand why a user shouldn't think like a computer. But
 why not programmer? Memory is indexed from 0, nothing you can do
 with it. So are all that assembler opcodes that your program ends
 up being, anyhow.

The program actually ends up being binary -- I would like to make as few concessions as possible to that unfortunate fact.
 You write programs for computers, not for men,  after all...

I write programs for 2 readers: compilers, and humans (if you can call programmers that :-). The great majority of my programming time is spent trying to satisfy the humans.
Feb 22 2002
prev sibling parent reply Russell Borogove <kaleja estarcion.com> writes:
not here wrote:
 Heaven forbid that we should try to retrain C coders! 
 Everyone knows that we are sacrosanct and must be protected.
 Every good C programmer knows how useful the macro 
 preprocessor is (oops, that's not in D is it?)
 Every good C++ programmer knows how useful multiple 
 inheritance can be (ooops, that not in D is it?)
 Every good C++/Java/C# programer knows how useful namespaces 
 can be (ooops, that's not in D is it?)
 Every C programmer can type #include files in their sleep 
 (ooops, that's not in D is it?)
 
 Yes I know these are a little unfair. But what I'm trying to 
 get across is that D will already force C coders to learn/
 unlearn things. So why not have 1-based indexes, just like
 we use in the real world.

I don't see the problem as being one of tradition -- as you point out, Walter has dispensed with a number of C/C++ traditions -- but of practicality. The changes you describe above are wholesale removals of features that the compiler can slap you for very quickly if you forget about them. Trying to use the preprocessor in D clearly won't lead to subtle bugs; the compiler will scream and the typical programmer will very quickly adapt.

Changing the index base will create subtle bugs, because code that would compile in C or C++ will still frequently compile in what I'll call D-1 (D with 1-based arrays), but change its meaning. That scares me. The most common error, of course, will be accessing the zeroeth element of a 1-based array, which should throw an exception, but how about the following code:

void parse_command_line_option( char[] option )
{
    switch (option[1])   // switch on the character after the '-'
    {
    case 'x': do_option_x(); break;
    case 'y': do_option_y(); break;
    default:  printf("undefined option\n"); break;
    }
}

The comment is correct for C/C++, but in D-1 the comment is misleading and both "-x" and "-y" yield an undefined option message.

-RB
Feb 22 2002
parent reply "Walter" <walter digitalmars.com> writes:
"Russell Borogove" <kaleja estarcion.com> wrote in message
news:3C76928A.8090508 estarcion.com...
 I don't see the problem as being one of tradition -- as you point
 out, Walter has dispensed with a number of C/C++ traditions -- but
 of practicality. The changes you describe above are wholesale
 removals of features that the compiler can slap you for very
 quickly if you forget about. Trying to use the preprocessor in
 D clearly won't lead to subtle bugs; the compiler will scream and
 the typical programmer will very quickly adapt.

 Changing the index base will create subtle bugs, because code
 that would compile in C or C++ will still frequently compile in
 what I'll call D-1 (D with 1-based arrays), but change its meaning.
 That scares me. The most common error, of course, will be accessing
 the zeroeth element of a 1-based array, which should throw an
 exception, but how about the following code:

And I want to emphasize this is an important point. D tries hard to avoid having incompatibilities with C that will subtly break things. 1 based arrays would do that. Another way to really mess people up would be to change the operator precedence. Since D is meant to appeal to C and C++ programmers, I feel it would be a mistake to change those things, regardless of how meritorious those changes are when viewed outside of the context of C familiarity. I've done some conversions of a few thousand line programs from C++ to D, and it's bad enough finding and fixing all the dependencies on 0 terminated strings. That turns out to be more work than I'd anticipated.
Feb 22 2002
next sibling parent "not here" <not.here this.address.com> writes:
"Walter" <walter digitalmars.com> wrote in message
news:a56tc2$1534$1 digitaldaemon.com...
 And I want to emphasize this is an important point. D tries hard to avoid
 having incompatibilities with C that will subtly break things. 1 based
 arrays would do that. Another way to really mess people up would be to
 change the operator precedence. Since D is meant to appeal to C and C++
 programmers, I feel it would be a mistake to change those things,
 regardless of how meritorious those changes are when viewed outside of
 the context of C familiarity.

 I've done some conversions of a few thousand line programs from C++ to D,
 and it's bad enough finding and fixing all the dependencies on 0
 terminated strings. That turns out to be more work than I'd anticipated.

Thanks for this explanation, Walter. I now have a better understanding of your goals for D. I wish you and your endeavours well. I think you are off to a good start. Goodbye.
Feb 22 2002
prev sibling parent reply Karl Bochert <kbochert ix.netcom.com> writes:
 
 And I want to emphasize this is an important point. D tries hard to avoid
 having incompatibilities with C that will subtly break things. 1 based
 arrays would do that. Another way to really mess people up would be to
 change the operator precedence. Since D is meant to appeal to C and C++
 programmers, I feel it would be a mistake to change those things, regardless
 of how meritorious those changes are when viewed outside of the context of C
 familiarity.

So while ordinal arrays might be more technically correct, D would be less successful with them. However annoying that is, I'm sure you're right.
 
 I've done some conversions of a few thousand line programs from C++ to D,
 and it's bad enough finding and fixing all the dependencies on 0 terminated
 strings. That turns out to be more work than I'd anticipated.
 

0-terminated strings today; 0-terminated arrays tomorrow. (maybe in 'E'). Karl
Feb 22 2002
parent reply "Walter" <walter digitalmars.com> writes:
"Karl Bochert" <kbochert ix.netcom.com> wrote in message
news:1104_1014441056 bose...
 So while ordinal arrays might be more technically correct, D would be less
 successful with them. However annoying that is, I'm sure you're right.

Language design is inevitably going to be an uneasy alliance of contradictory goals <g>. It's like designing a house - to make a bigger closet, you have to shrink the bedroom.
Feb 22 2002
parent reply Barry Pederson <barryp yahoo.com> writes:
I've been following this debate about slices and cases with some interest.  As 
a Python programmer, I'm rooting for end-exclusiveness in slices - it's the 
way Python handles slices, and seems to work very well in practice.  If D 
sticks with that, then I think there'll be a lot of Python programmers who 
will be able to make use of it very comfortably.

Another Python-ish thing that might be worth considering would be support for 
negative array indexes.  For example:

    foo[-1]  is the last element of array foo
    foo[-2]  is the second-to-last element of array foo

then with slices

    foo[-2..] would be the last two elements of foo
    foo[1..-1] would be a copy of foo, except for the first and last elements

implementation should be pretty simple, for any index < 0, such as: foo[-n], 
the compiler would just treat it like: foo[foo.length - n]
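Python already implements exactly this rule, so the proposal can be tried there directly (plain Python below; the comments map it onto the proposed D syntax):

```python
foo = ['a', 'b', 'c', 'd', 'e']

last        = foo[-1]    # last element
second_last = foo[-2]    # second-to-last element
tail        = foo[-2:]   # last two elements       (foo[-2..] in the proposal)
trimmed     = foo[1:-1]  # all but first and last  (foo[1..-1] in the proposal)
# For any n > 0, foo[-n] behaves like foo[len(foo) - n], as described above.
```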

CASES
----------

When it comes to case-ranges though, I agree that end-exclusiveness would be a 
weird PITA.  Perhaps the thing to do is decide that case-ranges and 
array-slices will be different things, and go ahead and use different syntaxes.

To steal from Python again, perhaps use a colon for array slicing (foo[1:-1]) 
and keep '..' for case-ranges (which seems pretty natural).

This would also kind of make sense if you also supported Python-style negative 
array indexes, since negative numbers in case-ranges would presumably have a 
completely different meaning.

     Barry
Feb 23 2002
parent "Walter" <walter digitalmars.com> writes:
"Barry Pederson" <barryp yahoo.com> wrote in message
news:3C77CFDD.2050404 yahoo.com...
 I've been following this debate about slices and cases with some
 interest. As a Python programmer, I'm rooting for end-exclusiveness in
 slices - it's the way Python handles slices, and seems to work very well
 in practice. If D sticks with that, then I think there'll be a lot of
 Python programmers who will be able to make use of it very comfortably.

 Another Python-ish thing that might be worth considering would be support
 for negative array indexes. For example:

     foo[-1]  is the last element of array foo
     foo[-2]  is the second-to-last element of array foo

 then with slices

     foo[-2..] would be the last two elements of foo
     foo[1..-1] would be a copy of foo, except for the first and last
 elements

 implementation should be pretty simple, for any index < 0, such as:
 foo[-n], the compiler would just treat it like: foo[foo.length - n]

While this is a great idea, it suffers from a serious problem - the compiler will have to insert a runtime check for negative indices whenever the index expression is not a constant. This will be too much overhead.
 CASES
 ----------

 When it comes to case-ranges though, I agree that end-exclusiveness would
 be a weird PITA.  Perhaps the thing to do is decide that case-ranges and
 array-slices will be different things, and go ahead and use different
 syntaxes.
 To steal from Python again, perhaps use a colon for array slicing (foo[1:-1])
 and keep '..' for case-ranges (which seems pretty natural).

 This would also kind of make sense if you also supported Python-style
 negative array indexes, since negative numbers in case-ranges would
 presumably have a completely different meaning.

Yes, using a different syntax may be the best solution.
Feb 23 2002
prev sibling parent reply "Roberto Mariottini" <rmariottini lycosmail.com> writes:
"Pavel Minayev" <evilone omen.ru> ha scritto nel messaggio
news:a55g08$7e8$1 digitaldaemon.com...
     A1 = A1[0..pos] ~ A1[pos+1 .. A1.length]

 is what I should have coded. To the average person, knowing the 'pos'
 element is being removed, this code looks wrong as it seems to be
 including A1[pos]!

To the average C/C++/C#/Java programmer, it looks just as it should.

So I'm not an "average" C/C++/Java programmer, even though I've been using them since 1991/93/96. I know how slicing currently works in D, but I had to double-check to understand. [...]
 But there are many programmers that know
 only C (or C++, or C#, or Java - the same language family) - and
 those people expect to find a common environment to start coding
 quick, without having to learn everything from scratch. Arrays are
 indexed from 0, every C programmer should remember that better than
 his own name - why disappoint them? 0-based indexing is a tradition
 too old to change it - it's better to live with it, especially
 since it's not hard to get used to it.

I always wondered why C and derivatives don't have a way to define a start
index like Pascal does. To me it seems better to leave to the compiler the
task of subtracting the start index from the actual index. In C you write:

    int occurrencies['Z'-'A'+1];   /* +1: 'A'..'Z' inclusive is 26 slots */
    for (i = 0; i < size; ++i)
    {
        ++occurrencies[s[i]-'A'];
    }

Here the task of subtracting 'A' from every index is left to the programmer.
Maybe the compiler could live with an optional initial index to subtract
every time the array is accessed.

Ciao
Feb 22 2002
parent reply "Pavel Minayev" <evilone omen.ru> writes:
"Roberto Mariottini" <rmariottini lycosmail.com> wrote in message
news:a55s13$1c8a$1 digitaldaemon.com...

 So I'm not an "average"  C/C++/Java programmer, therefore I use them only
 since 1991/93/96.
 I know how slicing currently works in D, but I had to double check to
 understand.

End-exclusive slicing _is_ an issue, definitely. But we were talking about 0-based indexing.
 I always wondered why C and derivatives don't have a way to define a start
 index like Pascal does. To me it seems better to leave to the compiler the
 task
 to subtract the start index from the actual index. In C you write:

 int occurrencies['Z'-'A'+1];
 for (i = 0; i < size; ++i)
 {
     ++occurrencies[s[i]-'A'];
 }

 Here the task to subtract 'A' to every indexing is left to the programmer.

 Maybe the compiler could live with an optional initial index to subtract
 every time
 the array is accessed.

This is a far better idea. What I like in Pascal is the ability to use
0-based, 1-based or whatever else based arrays depending on your task and
your personal taste. Those who care about speed (me) would probably use
0-based (and I believe it should be the default, to work the same way as
in C/C++). Otherwise, you can specify it yourself:

    int[5]    foo;     // consists of foo[0] to foo[4]
    int[1..5] bar;     // consists of bar[1] to bar[5]
Feb 22 2002
next sibling parent reply "Walter" <walter digitalmars.com> writes:
"Pavel Minayev" <evilone omen.ru> wrote in message
news:a562qa$5lv$1 digitaldaemon.com...
 This is a far better idea. What I like in Pascal is the ability to
 use 0-based, 1-based or whatever else based arrays depending on
 your task and your personal taste. Those who care of speed (me) would
 probably use 0-based (and I believe it should be the default, to
 work the same way as in C/C++). Otherwise, you can specify it yourself:

     int[5]    foo;     // consists of foo[0] to foo[4]
     int[1..5] bar;     // consists of bar[1] to bar[5]

Having lower bounds specifiable will work with D (and even with C), but in my decades (!) of programming I've never found a use for it. I came to C from Basic, FORTRAN, and Pascal. I had some initial trouble getting used to 0 based rather than 1 based, but never looked back. 0 based looked more 'right' to me.
Feb 22 2002
next sibling parent "Roberto Mariottini" <rmariottini lycosmail.com> writes:
"Walter" <walter digitalmars.com> wrote in message
news:a56tc4$1534$2 digitaldaemon.com...
 "Pavel Minayev" <evilone omen.ru> wrote in message
 news:a562qa$5lv$1 digitaldaemon.com...
 This is a far better idea. What I like in Pascal is the ability to
 use 0-based, 1-based or whatever else based arrays depending on
 your task and your personal taste. Those who care of speed (me) would
 probably use 0-based (and I believe it should be the default, to
 work the same way as in C/C++). Otherwise, you can specify it yourself:

     int[5]    foo;     // consists of foo[0] to foo[4]
     int[1..5] bar;     // consists of bar[1] to bar[5]

Having lower bounds specifiable will work with D (and even with C), but in my decades (!) of programming I've never found a use for it. I came to C from Basic, FORTRAN, and Pascal.

Instead, I have some Pascal sources that will stay as they are. When I do
programming I want to use all the features the language provides. And my
old Pascal sources are full of sets, subrange variables, nested
procedures, arbitrarily indexed arrays, and so on.

Automatic code translators simply produce unmaintainable code. I once
started to translate one of them into C++, but I stopped when I realized
I'd have to work hard to produce a huge set of classes only to support
basic Pascal capabilities, with a big loss in readability.
 I had some initial trouble getting used to
 0 based rather than 1 based, but never looked back. 0 based looked more
 'right' to me.

Don't confuse normal initial trouble with a lack of language
expressiveness. I know that, in the end, an array index must be translated
to a 0-based integer. But simply doing the translation myself doesn't seem
to me the right solution in most cases.

So I agree with Pavel: arrays should be 0-based by default, leaving the
possibility to choose a different start index if needed, stating clearly
that n-based indexes are less performant.

Ciao
Feb 26 2002
prev sibling parent reply "Pavel Minayev" <evilone omen.ru> writes:
"Walter" <walter digitalmars.com> wrote in message
news:a56tc4$1534$2 digitaldaemon.com...

 Having lower bounds specifiable will work with D (and even with C), but in
 my decades (!) of programming I've never found a use for it. I came to C
 from Basic, FORTRAN, and Pascal. I had some initial trouble getting used to
 0 based rather than 1 based, but never looked back. 0 based looked more
 'right' to me.

Well, there are actually cases where you'd prefer some base other than 0.
As you've seen, many people here consider 1 to be more suitable, and I
understand them... also there are some other cases: for example, suppose
you have an array of year income for 1990-2000. In Pascal you'd probably
declare it as "array[1990 .. 2000] of integer", and then index it like
income[1995], letting the compiler do its job and insert all the necessary
decrements; the result is clean code, easy to read and maintain. In C, you
have to do it all yourself, and probably define some const base = 1990,
and clutter all your code with things like income[1995 - base].

After all, it's as simple as subtracting the base from the index, a single
SUB... ain't it worth the thing?
Feb 27 2002
parent reply "Carlos Santander B." <carlos8294 msn.com> writes:
"Pavel Minayev" <evilone omen.ru> escribi๓ en el mensaje
news:a5jben$16o8$1 digitaldaemon.com...
| "Walter" <walter digitalmars.com> wrote in message
| news:a56tc4$1534$2 digitaldaemon.com...
|
| > Having lower bounds specifiable will work with D (and even with C), but
in
| > my decades (!) of programming I've never found a use for it. I came to C
| > from Basic, FORTRAN, and Pascal. I had some initial trouble getting used
| to
| > 0 based rather than 1 based, but never looked back. 0 based looked more
| > 'right' to me.
|
| Well, there are actually cases where you'd prefer some base other than 0.
| As you've seen, many people here consider 1 to be more suitable, and
| I understand them... also there are some other cases, for example, suppose
| you have an array of year income for 1990-2000, in Pascal you'd probably
| declare it as "array[1990 .. 2000] of integer", and then index it like
| income[1995], letting the compiler do his job and insert all the necessary
| decrements; the result is clean code, easy to read and maintain. In C,
| you have to do it all yourself, and probably define some const base =
1990,
| and clutter all your code with things like income[1995 - base].
|
| After all, it's as simple as subtracting the base from the index,
| a single SUB... ain't it worth the thing?
|
|

No one ever answered this one. It seems very clever to me.

—————————————————————————
Carlos Santander


---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.463 / Virus Database: 262 - Release Date: 2003-03-17
Mar 22 2003
next sibling parent reply Farmer <itsFarmer. freenet.de> writes:
"Carlos Santander B." <carlos8294 msn.com> wrote in
news:b5k815$11ia$2 digitaldaemon.com: 

 "Pavel Minayev" <evilone omen.ru> escribi๓ en el mensaje
 news:a5jben$16o8$1 digitaldaemon.com...
| "Walter" <walter digitalmars.com> wrote in message
| news:a56tc4$1534$2 digitaldaemon.com...
|
| > Having lower bounds specifiable will work with D (and even with C),
| > but 
 in
| > my decades (!) of programming I've never found a use for it. I came
| > to C from Basic, FORTRAN, and Pascal. I had some initial trouble
| > getting used 
| to
| > 0 based rather than 1 based, but never looked back. 0 based looked
| > more 'right' to me.
|
| Well, there are actually cases where you'd prefer some base other
| than 0. As you've seen, many people here consider 1 to be more
| suitable, and I understand them... also there are some other cases,
| for example, suppose you have an array of year income for 1990-2000,
| in Pascal you'd probably declare it as "array[1990 .. 2000] of
| integer", and then index it like income[1995], letting the compiler
| do his job and insert all the necessary decrements; the result is
| clean code, easy to read and maintain. In C, you have to do it all
| yourself, and probably define some const base = 
 1990,
| and clutter all your code with things like income[1995 - base].

Having arrays with base 1 and zero-based indexes is likely to become a
maintainer's nightmare: whenever you see an array, you must look at its
declaration (or wait for a tooltip from your IDE) and check whether it is
zero-based or not. Array operations become much more bug-prone. Once your
brain has adjusted to zero-based indices, you can easily write bug-free
code for arrays, or verify that a given piece of code is bug-free. But
switching between different bases is very difficult.

Zero-based indices are favoured over 1-based indices for implementation
reasons. E.g. with a byte you can address 256 array elements instead of
only 255.

In D, there are many ways to express concepts, e.g. functions, classes,
D-structs, templates. I believe that non-zero-based arrays are not really
required to express concepts in a way that is suitable for the problems
programmers have to solve.

Farmer.
Mar 26 2003
next sibling parent "Carlos Santander B." <carlos8294 msn.com> writes:
"Farmer" <itsFarmer. freenet.de> escribi๓ en el mensaje
news:Xns934B78D5F5FDitsFarmer 63.105.9.61...
| "Carlos Santander B." <carlos8294 msn.com> wrote in
| news:b5k815$11ia$2 digitaldaemon.com:
|
| > "Pavel Minayev" <evilone omen.ru> escribi๓ en el mensaje
| > news:a5jben$16o8$1 digitaldaemon.com...
| >| "Walter" <walter digitalmars.com> wrote in message
| >| news:a56tc4$1534$2 digitaldaemon.com...
| >|
| >| > Having lower bounds specifiable will work with D (and even with C),
| >| > but
| > in
| >| > my decades (!) of programming I've never found a use for it. I came
| >| > to C from Basic, FORTRAN, and Pascal. I had some initial trouble
| >| > getting used
| >| to
| >| > 0 based rather than 1 based, but never looked back. 0 based looked
| >| > more 'right' to me.
| >|
| >| Well, there are actually cases where you'd prefer some base other
| >| than 0. As you've seen, many people here consider 1 to be more
| >| suitable, and I understand them... also there are some other cases,
| >| for example, suppose you have an array of year income for 1990-2000,
| >| in Pascal you'd probably declare it as "array[1990 .. 2000] of
| >| integer", and then index it like income[1995], letting the compiler
| >| do his job and insert all the necessary decrements; the result is
| >| clean code, easy to read and maintain. In C, you have to do it all
| >| yourself, and probably define some const base =
| > 1990,
| >| and clutter all your code with things like income[1995 - base].
|
| Having arrays with base 1 and zero-based index is likely to become a
| maintainer's nightmare: Whenever you see an array, you must look at it's
| declaration (or wait for a tooltip from you IDE) and check whether is
zero-
| based or not. Array operations become much more bug prone. Once your brain
| adjusted to zero-based indices, you can easily write bug-free code for
| arrays or verify that a given code is bug-free. But switching between
| different bases is very difficult.
|
| Zero-based indices are favoured over 1-based indices for implementations
| reasons. E.g. with a Byte you can address 256 array elements instead of
| only 255.
|
| In D, there are many ways to express concepts, e.g. functions, classes, D-
| structs, templates. I believe that non-zero based arrays are not really
| required to express concepts in away that suitable for the problems,
| programmers have to solve.
|
|
| Farmer.

I wasn't referring to 1-based arrays only, but any-base arrays. By default,
arrays would be 0-based, but what if we could have:

int [4..14] b; //starts in 4, ends in 13
int [-3..7] c; //starts in -3, ends in 6
int [10] a; //normal array, identical to int[0..10] a

Delphi supports that, and I think it could be an interesting addition.

—————————————————————————
Carlos Santander


Mar 26 2003
prev sibling next sibling parent reply "Sean L. Palmer" <seanpalmer directvinternet.com> writes:
I disagree.  Nonzero-based arrays buy you convenience.  Now you don't have
to remember to subtract the base, it'll do it automatically.

Not everybody thinks like a pro programmer.  That used to be one of the big
selling points of BASIC that it had 1-based arrays.  I always liked the
feature in Pascal.

If I want to index my array as 2,3,4 for some reason instead of 0,1,2,  why
should the compiler force me to use 0,1,2?  I'll just go and use 2,3,4
anyway but now I have to write a stupid array wrapper class that does the
subtraction for me, or remember to subtract 2 all the time.

I can't think of a good example off the top of my head, but say your indices
are an enum type instead of int.  And let's just say that for instance your
enum type only goes from 8 thru 12 because it also happens to be part of a
hardware register matching some bits you don't have control over.  And you
want to map those hardware states to some other data.  Well you have to
remember to subtract 8 all the time or you'll waste memory or get an array
bounds error.

It's such a simple thing...

D needs a range type.  Then using them for declaring arrays becomes easy;
if it wasn't there you'd wonder why not.

Sean

"Farmer" <itsFarmer. freenet.de> wrote in message
news:Xns934B78D5F5FDitsFarmer 63.105.9.61...
 "Carlos Santander B." <carlos8294 msn.com> wrote in
 news:b5k815$11ia$2 digitaldaemon.com:

 "Pavel Minayev" <evilone omen.ru> escribi๓ en el mensaje
 news:a5jben$16o8$1 digitaldaemon.com...
| "Walter" <walter digitalmars.com> wrote in message
| news:a56tc4$1534$2 digitaldaemon.com...
|
| > Having lower bounds specifiable will work with D (and even with C),
| > but
 in
| > my decades (!) of programming I've never found a use for it. I came
| > to C from Basic, FORTRAN, and Pascal. I had some initial trouble
| > getting used
| to
| > 0 based rather than 1 based, but never looked back. 0 based looked
| > more 'right' to me.
|
| Well, there are actually cases where you'd prefer some base other
| than 0. As you've seen, many people here consider 1 to be more
| suitable, and I understand them... also there are some other cases,
| for example, suppose you have an array of year income for 1990-2000,
| in Pascal you'd probably declare it as "array[1990 .. 2000] of
| integer", and then index it like income[1995], letting the compiler
| do his job and insert all the necessary decrements; the result is
| clean code, easy to read and maintain. In C, you have to do it all
| yourself, and probably define some const base =
 1990,
| and clutter all your code with things like income[1995 - base].

 Having arrays with base 1 and zero-based index is likely to become a
 maintainer's nightmare: Whenever you see an array, you must look at it's
 declaration (or wait for a tooltip from you IDE) and check whether is zero-
 based or not. Array operations become much more bug prone. Once your brain
 adjusted to zero-based indices, you can easily write bug-free code for
 arrays or verify that a given code is bug-free. But switching between
 different bases is very difficult.

 Zero-based indices are favoured over 1-based indices for implementations
 reasons. E.g. with a Byte you can address 256 array elements instead of
 only 255.

 In D, there are many ways to express concepts, e.g. functions, classes, D-
 structs, templates. I believe that non-zero based arrays are not really
 required to express concepts in away that suitable for the problems,
 programmers have to solve.


 Farmer.

Mar 27 2003
next sibling parent reply Helmut Leitner <helmut.leitner chello.at> writes:
"Sean L. Palmer" wrote:
 
 I disagree.  Nonzero-based arrays buy you convenience.  Now you don't have
 to remember to subtract the base, it'll do it automatically.

Yes, but it will cost any user of an array a little bit of performance for
the potential base correction. If D is seeking to be a successor to C and
C++ in system and game programming, one has to be very careful with this.

--
Helmut Leitner    leitner hls.via.at
Graz, Austria     www.hls-software.com
Mar 27 2003
next sibling parent reply "Sean L. Palmer" <seanpalmer directvinternet.com> writes:
As in C++, you don't pay for what you don't use.  It doesn't necessarily
cost any performance anyway.

If your array is from 1..10, and starts at address 0x80000000, and each
entry is 4 bytes long, the compiler just takes your index and does this to
compute the address:

(index*4)+0x7ffffffc

If it were zero based, it'd do this:

(index*4)+0x80000000

There are situations where it would cost you performance, but not many.

Sean

"Helmut Leitner" <helmut.leitner chello.at> wrote in message
news:3E82FBF5.75FDADF4 chello.at...
 "Sean L. Palmer" wrote:
 I disagree.  Nonzero-based arrays buy you convenience.  Now you don't have
 to remember to subtract the base, it'll do it automatically.

 Yes, but it will cost any user of an array a little bit of performance
 for the potential base correction. If D is seeking to be a successor to
 C and C++ in system and game programming, one has to be very careful
 with this.

Mar 27 2003
parent reply Helmut Leitner <leitner hls.via.at> writes:
"Sean L. Palmer" wrote:
 There are situations where it would cost you performance, but not many.
 "Helmut Leitner" <helmut.leitner chello.at> wrote in message
 news:3E82FBF5.75FDADF4 chello.at...
 "Sean L. Palmer" wrote:
 I disagree.  Nonzero-based arrays buy you convenience.  Now you don't


 to remember to subtract the base, it'll do it automatically.

 Yes, but it will cost any user of an array a little bit of performance
 for the potential base correction. If D is seeking to be a successor to
 C and C++ in system and game programming, one has to be very careful
 with this.

As in C++, you don't pay for what you don't use. It doesn't necessarily
cost any performance anyway.

If your array is from 1..10, and starts at address 0x80000000, and each
entry is 4 bytes long, the compiler just takes your index and does this
to compute the address:

    (index*4)+0x7ffffffc

If it were zero based, it'd do this:

    (index*4)+0x80000000

And when this address is not available at compile time (e.g. an element of
a dynamically allocated part of an object) or passed through function call
interfaces - how will you do it then without loss of performance? I think
that's impossible.

--
Helmut Leitner    leitner hls.via.at
Graz, Austria     www.hls-software.com
Mar 27 2003
next sibling parent "Mike Wynn" <mike.wynn l8night.co.uk> writes:
"Helmut Leitner" <leitner hls.via.at> wrote in message
news:3E833158.D568C839 hls.via.at...
 "Sean L. Palmer" wrote:
 There are situations where it would cost you performance, but not many.
 "Helmut Leitner" <helmut.leitner chello.at> wrote in message
 news:3E82FBF5.75FDADF4 chello.at...
 "Sean L. Palmer" wrote:
 I disagree.  Nonzero-based arrays buy you convenience.  Now you don't have
 to remember to subtract the base, it'll do it automatically.

 Yes, but it will cost any user of an array a little bit of performance
 for the potential base correction. If D is seeking to be a successor to
 C and C++ in system and game programming, one has to be very careful
 with this.

 As in C++, you don't pay for what you don't use. It doesn't necessarily
 cost any performance anyway. If your array is from 1..10, and starts at
 address 0x80000000, and each entry is 4 bytes long, the compiler just
 takes your index and does this to compute the address:

 (index*4)+0x7ffffffc

 If it were zero based, it'd do this:

 (index*4)+0x80000000

 And when this address is not available at compile time (e.g. an element
 of a dynamically allocated part of an object) or passed through function
 call interfaces - how will you do it then without loss of performance?
 I think that's impossible.

in C++ any array can be rebased thus:

    template<typename T>
    inline T * rebase( T * ar, int base ) { return &(ar[-base]); }

or, a C example:

    int * myarray = malloc( sizeof(int) * 80 );
    ....
    int * onebase = &myarray[-1]; // or myarray - 1;

onebase[1] is now myarray[0] :)

in a func call:

    int func( int * ar )
    {
        ar -= 1;   // ar now 1 based.
        .....
    }
Mar 27 2003
prev sibling parent reply Mark Evans <Mark_member pathlink.com> writes:
Helmut Leitner says...

And when this address in not available at compile time ...
 - how will you do it then without
loss of performance? I think that's impossible.

That tradeoff the programmer should be allowed to make. I suspect it's
wrong anyway. The base value needs computation only once - dynamically or
otherwise - and thereafter may be stored. Each (random) array access
involves at minimum one addition to find the desired element's memory
address. Using a different base adds not a whit of extra work.

We had a similar debate about negative indices. Walter was against them
for performance reasons that are easily addressed, if not completely
fictitious. Sean's motto 'pay for what you use' is apropos.

Farmer says...
In D, there are many ways to express concepts, e.g. functions, classes,
D-structs, templates. I believe that non-zero based arrays are not really 
required to express concepts in away that suitable for the problems, 
programmers have to solve.

Functional languages owe much of their fabulous productivity to array
(list) handling capabilities. An array can hold virtually anything, not
just numbers. It can have more than one dimension to associate objects on
different axes en masse. C++ folks unfamiliar with such paradigms know
little of what they're missing, so I understand these counteroffers, but
there is no substitute.

The ability to pick apart, rearrange, index, map across, thread, and
otherwise sling arrays around - and morph them into new forms - is a truly
expressive and compact way to write tons of code with performance results
comparable to C and even better, depending on your C programmer and his
available time for optimizing nested inner loops and chasing down
off-by-one errors.

Mark
Mar 27 2003
parent Farmer <itsFarmer. freenet.de> writes:
Mark Evans <Mark_member pathlink.com> wrote in
news:b60d0o$2sf7$1 digitaldaemon.com: 

 Farmer says...
In D, there are many ways to express concepts, e.g. functions,
classes, D-structs, templates. I believe that non-zero based arrays
are not really required to express concepts in away that suitable for
the problems, programmers have to solve.

 Functional languages owe much of their fabulous productivity to array
 (list) handling capabilities. An array can hold virtually anything, not
 just numbers. It can have more than one dimension to associate objects on
 different axes en masse. C++ folks unfamiliar with such paradigms know
 little of what they're missing, so I understand these counteroffers, but
 there is no substitute.

 The ability to pick apart, rearrange, index, map across, thread, and
 otherwise sling arrays around - and morph them into new forms - is a
 truly expressive and compact way to write tons of code with performance
 results comparable to C and even better, depending on your C programmer
 and his available time for optimizing nested inner loops and chasing
 down off-by-one errors.

 Mark

You are right. I don't know about functional programming, but I know that
doing complex work with arrays is a pain in C++, C#, Java or D (not to
mention C or Pascal).

D arrays really shine when used for system-level programming tasks, like
getting memory from the GC, copying a memory block or working with a
rather fixed set of objects/values. Non-zero-based arrays for such tasks
have few benefits, but pose the risk of harder-to-maintain code: some (or
many) people will use different array bases, in the same language, for the
same project, for similar concepts.

I think that D arrays would better stay a low-level,
implementation-determined feature. Putting more features into them would
further increase the confusion about them. But (separate) features for the
D language and/or Phobos that enable programmers to work with arrays
(lists) in a productive, safe and reasonably fast manner could be a
worthwhile addition to D.

Farmer.
Mar 30 2003
prev sibling parent reply Ilya Minkov <midiclub tiscali.de> writes:
Helmut Leitner wrote:
 
 "Sean L. Palmer" wrote:
 
I disagree.  Nonzero-based arrays buy you convenience.  Now you don't have
to remember to subtract the base, it'll do it automatically.

Yes, but it will cost any user of an array a little bit of performance for the potential base correction. If D is seeking to be a successor to C and C++ in system and game programming, one has to be very careful with this.

No, it won't. In Pascal, functions disliked taking arrays of unknown size.
You could only make them take an array[SomeConstant..] of SomeType; then
you could process this array as SomeConstant-based. This function would
only have a run-time specification of an array length, but not of a base,
and thus probably wouldn't be any slower than a 0-based array.

I'm not sure whether it was allowed to assume that arrays had the same
base, or whether type checking would have caught such misuse.

-i.
Mar 28 2003
parent reply Helmut Leitner <leitner hls.via.at> writes:
Ilya Minkov wrote:
 
 Helmut Leitner wrote:
 "Sean L. Palmer" wrote:

I disagree.  Nonzero-based arrays buy you convenience.  Now you don't have
to remember to subtract the base, it'll do it automatically.

Yes, but it will cost any user of an array a little bit of performance for the potential base correction. If D is seeking to be a successor to C and C++ in system and game programming, one has to be very careful with this.

No, it won't. In Pascal, functions disliked taking arrays of unknown size. You could only make them take an array[SomeConstant..] of SomeType, then you could process this array as SomeConstant-based. This function would only have a run-time specification of an array length, but not of a base, and thus wouldn't probably be any slower than a 0-based array. I'm not sure whether it was allowed to assume that arrays had the same base, or type checking had caught such misuse.

You are right about Pascal. It was a major PITA that it was impossible
even to write a generic function to e.g. calculate the average of an array
of arbitrary size, because the array size was part of the parameter type
and therefore part of the function definition.

But I don't think that you can compare this. If you want a base != 0, then
you have to pay for it. Maybe not much, but you have to pay.

If you don't offset the array pointer, you will have to add the offset
when accessing the array elements.

If you offset the array pointer, you need to reset it before you free its
dynamic memory. So you have to store either the offset or the pointer to
the allocated memory. If you want to implement range checking, you will
have to either use the offset, or store separate hi and lo bounds.

Anyway, things become more complicated, and it won't make a single C, C++
or Java programmer feel better about D.

--
Helmut Leitner    leitner hls.via.at
Graz, Austria     www.hls-software.com
Mar 28 2003
next sibling parent Ilya Minkov <midiclub 8ung.at> writes:
Helmut Leitner wrote:
 
 Ilya Minkov wrote:
No, it won't. In Pascal, functions disliked taking arrays of unknown
size. You could only make them take an array[SomeConstant..] of
SomeType, then you could process this array as SomeConstant-based. This
function would only have a run-time specification of an array length,
but not of a base, and thus wouldn't probably be any slower than a
0-based array. I'm not sure whether it was allowed to assume that arrays
had the same base, or type checking had caught such misuse.

 You are right about Pascal. It was a major PITA that it was impossible
 even to write a generic function to e.g. calculate the average of an
 array of arbitrary size because the array size was part of the parameter
 type and therefore part of the function definition.

It was possible. Only the low bound was fixed, and the length was passed implicitly to the function.
 But I don't think that you can compare this. If you want a base != 0,
 than you have to pay for it. Maybe not much, but you have to pay.

Let the base be in the calling code, and let the function accept a dynamic
array "as if" it was placed at the given offset.

Consider: this whole offset thingy is good for program readability. And
while length usually depends on a run-time condition, the base usually
depends upon readability considerations for algorithms. For example, in a
string you usually have to iterate from 0 to length-1. That's what all the
C guys do; they even have a "for" which allows to perfectly hide this
fact, or the opposite, so that a bug is not too easy to see. But isn't it
neater to reference the string as 1-based?
 If you don't offset the array pointer you will have to add the offset
 when accessing the array elements.

Pointer offset can only be done as a short-term optimisation. Actually, I
don't even think it's required, since almost every memory access gets
additive constants in algorithms anyway. Another example where it is
useful is as syntactic sugar for accessing arrays of constant size, like
in Sean's example.

You can also make the function programmer handle it: in the upper example,
he specifies the lower bound but does not specify the upper bound. It is
his responsibility to retrieve the upper bound and take it into account.
The same can be done with a lower bound. The bad thing is that it would
probably mean expanding the dynamic array specification, or would require
extended function annotation by the compiler.

However, the problem may be of a more general nature, and might be better
solved in a more generic way, as was pointed out by Mark. I'm not sure I
exactly understand what specific features/solution he means, though.

-i.
Mar 28 2003
prev sibling parent Mark Evans <Mark_member pathlink.com> writes:
The array record can include both a real base pointer and a pseudo-base pointer
(incorporating the offset). The pseudo-base pointer is computed only once,
whether dynamically or statically.  Array access cost is identical with either
pointer.

Beyond that, the array record could include an 'end' pointer making negative
indexing equally simple.  It would be adjusted every time the length property is
adjusted.

Mark


Helmut Leitner says...
If you want a base != 0,
then you have to pay for it. Maybe not much, but you have to pay.

If you don't offset the array pointer you will have to add the offset
when accessing the array elements.

If you offset the array pointer you need to reset it before you free
it's dynamical memory. So you have to store either the offset or the
pointer to the allocated memory. If you want to implement range
checking, you will have to either use the offset, or store separate
hi and lo bounds. Anyway things become more complicated and it won't
make a single C, C++ or Java programmer feel better about D.

-- 
Helmut Leitner    leitner hls.via.at
Graz, Austria   www.hls-software.com

Mar 28 2003
prev sibling parent Mark Evans <Mark_member pathlink.com> writes:
Base index is nothing next to fundamental array manipulation/creation functions.
Experience stands behind that statement.  Mathematica arrays are 1-based, C
arrays are 0-based, and I must often pass arrays between them.  The 0 vs. 1
issue has never hurt me.

The big deal is that C/C++ offers no help in array manipulation.  That is
primarily why I use Mathematica so regularly.  Mathematica is a multiparadigm
language offering functional-style array manipulations.  I would do these
manipulations in C++ if that were possible.  The more of them in D, the better.

Just to give a flavor of what I mean - here is some Mathematica code pulled at
random.  These lines showcase typical array manipulations and a bit of
functional style.  They come from an autocorrelation spectral estimator that
performs as fast as the equivalent C, but says, in mere lines, what would be
pages of C.

'Table' is equivalent to Sean's 'range' (I think).  'Flatten' drops a
multi-dimensional array down to 1 dimension.    This is the kind of stuff I wish
I could do in D.  -Mark

X = InverseFourier[piece];
X2 = Map[(# Conjugate[#])&, X];
auto = Re[Chop[Fourier[X2]]] / Q;
s = Take[auto,M] Table[w[m],{m,1,M}];
s = Flatten[{s,Table[0,{N- (2 M - 1)}]}];
lastpart = Table[auto[[N-m+1 +1]] w[N-m+1],{m,N-M+1+1,N}];
s = Flatten[{s,lastpart}];
capS = Re[Chop[InverseFourier[s]]];



Sean L. Palmer says...
I disagree.  Nonzero-based arrays buy you convenience.  Now you don't have
to remember to subtract the base, it'll do it automatically.

Mar 27 2003
prev sibling parent reply Karl Bochert <kbochert copper.net> writes:
On Wed, 26 Mar 2003 23:45:19 +0000 (UTC), Farmer <itsFarmer. freenet.de> wrote:
 "Carlos Santander B." <carlos8294 msn.com> wrote in
 news:b5k815$11ia$2 digitaldaemon.com: 
 
 "Pavel Minayev" <evilone omen.ru> wrote in the message
 news:a5jben$16o8$1 digitaldaemon.com...
| "Walter" <walter digitalmars.com> wrote in message
| news:a56tc4$1534$2 digitaldaemon.com...
|
| > Having lower bounds specifiable will work with D (and even with C), but in
| > my decades (!) of programming I've never found a use for it. I came
| > to C from Basic, FORTRAN, and Pascal. I had some initial trouble
| > getting used to
| > 0 based rather than 1 based, but never looked back. 0 based looked
| > more 'right' to me.
|
| Well, there are actually cases where you'd prefer some base other
| than 0. As you've seen, many people here consider 1 to be more
| suitable, and I understand them... also there are some other cases,
| for example, suppose you have an array of year income for 1990-2000,
| in Pascal you'd probably declare it as "array[1990 .. 2000] of
| integer", and then index it like income[1995], letting the compiler
| do his job and insert all the necessary decrements; the result is
| clean code, easy to read and maintain. In C, you have to do it all
| yourself, and probably define some const base = 1990,
| and clutter all your code with things like income[1995 - base].

Having arrays with base 1 and zero-based indices is likely to become a maintainer's nightmare: whenever you see an array, you must look at its declaration (or wait for a tooltip from your IDE) and check whether it is zero-based or not. Array operations become much more bug-prone.

Once your brain has adjusted to zero-based indices, you can easily write bug-free code for arrays, or verify that a given piece of code is bug-free. But switching between different bases is very difficult.

Zero-based indices are also favoured over 1-based indices for implementation reasons. E.g. with a byte you can address 256 array elements instead of only 255.

In D, there are many ways to express concepts, e.g. functions, classes, D-structs, templates. I believe that non-zero-based arrays are not really required to express concepts in a way that is suitable for the problems programmers have to solve.

Farmer.

I always get annoyed when people refer to '1-based' arrays. An array whose first element is arr[1] is a very special beast. Its elements are being labeled with their position in the array. Access is by ordinal rather than cardinal.

A quick test: where is the character 's' in the word 'test'? If you answered "the character at offset 2 from the start", then maybe '0-based' arrays make sense. If you said "the third character", you ought to consider having arrays accessed by ordinals.

It is highly misleading to think of '0-based' and '1-based' arrays. What they really are is arrays accessed by index, and arrays accessed by position. An array may have any arrangement of indices, but it always has a first position. I am against 0-based arrays. I am against 1-based arrays. I am for positional arrays.

Last I looked, D had come up with an absolutely horrible approach to specifying slices, brought on, I'm sure, by this 'based' concept.

Karl
Mar 29 2003
parent Mark Evans <Mark_member pathlink.com> writes:
Karl Bochert complains correctly:
If you answered "the character offset 2 from the start" then maybe '0-based'
arrays make sense.

Karl - welcome to the universe of design flaws we call the C programming language. Confounding arrays with pointers with strings produces ... C. Pull one string, and the whole building collapses.

Disentangling these concepts invites more than one answer to your question. The Icon language defines string positions between characters, while general arrays are "1-based." Icon is exceptionally adept at string processing. From the Icon Handbook:

"Icon's strings and tables make text processing much more convenient than in languages that only provide characters and character arrays. ... Unlike most languages where strings are implemented as arrays of characters, Icon provides strings as a primitive data type. They can be of any length. There are extensive facilities for searching and editing strings."

Unicode with its variable-byte-length encodings makes disentangling strings from arrays more urgent still. D should promote strings to primitive type status with dedicated constructs. Icon had it right a long time ago. D is struggling with Unicode because the confused C model is not amenable to Unicode. As a primitive type, Unicode string complexities would vanish under the hood. I'm not holding my breath, but if you ask me, that is how to do strings right. (Those in love with C strings could still declare arrays of char.)

http://www.toolsofcomputing.com/IconHandbook/
http://unicon.sourceforge.net/index.html

Mark
Mar 29 2003
prev sibling parent Burton Radons <loth users.sourceforge.net> writes:
Carlos Santander B. wrote:
 "Pavel Minayev" <evilone omen.ru> wrote in the message
 news:a5jben$16o8$1 digitaldaemon.com...
 | "Walter" <walter digitalmars.com> wrote in message
 | news:a56tc4$1534$2 digitaldaemon.com...
 |
 | > Having lower bounds specifiable will work with D (and even with C), but in
 | > my decades (!) of programming I've never found a use for it. I came to C
 | > from Basic, FORTRAN, and Pascal. I had some initial trouble getting used to
 | > 0 based rather than 1 based, but never looked back. 0 based looked more
 | > 'right' to me.
 |
 | Well, there are actually cases where you'd prefer some base other than 0.
 | As you've seen, many people here consider 1 to be more suitable, and
 | I understand them... also there are some other cases, for example, suppose
 | you have an array of year income for 1990-2000, in Pascal you'd probably
 | declare it as "array[1990 .. 2000] of integer", and then index it like
 | income[1995], letting the compiler do his job and insert all the necessary
 | decrements; the result is clean code, easy to read and maintain. In C,
 | you have to do it all yourself, and probably define some const base = 1990,
 | and clutter all your code with things like income[1995 - base].
 |
 | After all, it's as simple as subtracting the base from the index,
 | a single SUB... ain't it worth the thing?
 |
 |
 
 No one ever answered to this one. It seems very clever to me.

Putting aside the issue of implementation details, nobody has produced any practical examples for it, and I've never seen any good use of it in Pascal either; it has always relied on static assumptions about the environment the code is to be used in, as with Pavel's pseudo-example above.
Mar 28 2003
prev sibling next sibling parent "OddesE" <OddesE_XYZ hotmail.com> writes:
"Pavel Minayev" <evilone omen.ru> wrote in message
news:a562qa$5lv$1 digitaldaemon.com...
 "Roberto Mariottini" <rmariottini lycosmail.com> wrote in message
 news:a55s13$1c8a$1 digitaldaemon.com...

 This is a far better idea. What I like in Pascal is the ability to
 use 0-based, 1-based or whatever else based arrays depending on
 your task and your personal taste. Those who care of speed (me) would
 probably use 0-based (and I believe it should be the default, to
 work the same way as in C/C++). Otherwise, you can specify it yourself:

     int[5]    foo;     // consists of foo[0] to foo[4]
     int[1..5] bar;     // consists of bar[1] to bar[5]

Seconded!

--
Stijn
OddesE_XYZ hotmail.com
http://OddesE.cjb.net
__________________________________________
Remove _XYZ from my address when replying by mail
Feb 27 2002
prev sibling parent reply "OddesE" <OddesE_XYZ hotmail.com> writes:
"Pavel Minayev" <evilone omen.ru> wrote in message
news:a562qa$5lv$1 digitaldaemon.com...
 "Roberto Mariottini" <rmariottini lycosmail.com> wrote in message
 news:a55s13$1c8a$1 digitaldaemon.com...

 This is a far better idea. What I like in Pascal is the ability to
 use 0-based, 1-based or whatever else based arrays depending on
 your task and your personal taste. Those who care of speed (me) would
 probably use 0-based (and I believe it should be the default, to
 work the same way as in C/C++). Otherwise, you can specify it yourself:

     int[5]    foo;     // consists of foo[0] to foo[4]
     int[1..5] bar;     // consists of bar[1] to bar[5]

And maybe make a property, array.StartIndex, so you could always dynamically find out what the start index of the array is!

--
Stijn
OddesE_XYZ hotmail.com
http://OddesE.cjb.net
__________________________________________
Remove _XYZ from my address when replying by mail
Feb 27 2002
parent reply "Pavel Minayev" <evilone omen.ru> writes:
"OddesE" <OddesE_XYZ hotmail.com> wrote in message
news:a5jajn$168e$1 digitaldaemon.com...

 And maybe make a property, array.StartIndex so
 you could always dynamically find out what the
 start index of the array is!

Then maybe:

    array.start    // first index
    array.end      // last index
    array.length   // length (end - start + 1)

And define this for all arrays, including 0-based.
Feb 27 2002
parent reply "Sean L. Palmer" <spalmer iname.com> writes:
On a related note, Pascal had succ and pred for enums, but from what I
remember didn't have first and last?

All four would be quite handy to have for enums... unfortunately if you can
define your own values for an enumerant, succ and pred become (at runtime) a
function containing a switch statement or table lookup.

Personally I think enums *should* be sequential, and a separate flags type
could deal with bitflags.  typedef'd ints can handle any other case.

Sean

"Pavel Minayev" <evilone omen.ru> wrote in message
news:a5jb30$16ij$1 digitaldaemon.com...
 "OddesE" <OddesE_XYZ hotmail.com> wrote in message
 news:a5jajn$168e$1 digitaldaemon.com...

 And maybe make a property, array.StartIndex so
 you could always dynamically find out what the
 start index of the array is!

Then maybe:

    array.start    // first index
    array.end      // last index
    array.length   // length (end - start + 1)

And define this for all arrays, including 0-based.

Feb 28 2002
parent reply "Pavel Minayev" <evilone omen.ru> writes:
"Sean L. Palmer" <spalmer iname.com> wrote in message
news:a5l3t1$1ua6$1 digitaldaemon.com...

 On a related note, Pascal had succ and pred for enums, but from what I
 remember didn't have first and last?

Yes, right. Succ and Pred, however, were defined for all ordinal types, not just enums.
 All four would be quite handy to have for enums... unfortunately if you can
 define your own values for an enumerant, succ and pred become (at runtime) a
 function containing a switch statement or table lookup.

There's the same problem in Pascal (it supports custom values for enum members since Delphi 6, AFAIK), and it's handled in a somewhat strange manner: Succ always means +1, and Pred is -1, regardless of what the declaration says. So:

    type TEnum = (foo = 1000, bar = 2000, baz = 3000);

The thing is, Pascal defines an enum as "a subrange whose lowest and highest values correspond to the lowest and highest ordinalities of the constants in the declaration". So a variable of type TEnum can take any value in the range 1000 .. 3000, and thus Succ and Pred just increment/decrement by one...
 Personally I think enums *should* be sequential, and a separate flags type
 could deal with bitflags.  typedef'd ints can handle any other case.

It is sometimes very convenient to define an enum with members equal to those of an API:

    enum Key
    {
        LButton = 1,
        RButton = 2,
        Cancel = 3,
        MButton = 4,
        Back = 8, BackSpace = Back,
        Tab = 9,
        Clear = 12,
        Return = 13, Enter = Return,
        Shift = 16,
        Control = 17,
        ...
    }

Now every Key, when cast to int, equals the appropriate VK_* constant - no need for switch() or the like.
Feb 28 2002
parent reply "Sean L. Palmer" <spalmer iname.com> writes:
"Pavel Minayev" <evilone omen.ru> wrote in message
news:a5labi$214t$1 digitaldaemon.com...
 "Sean L. Palmer" <spalmer iname.com> wrote in message
 news:a5l3t1$1ua6$1 digitaldaemon.com...

 On a related note, Pascal had succ and pred for enums, but from what I
 remember didn't have first and last?

Yes, right. Succ and Pred, however, were defined for all ordinal types, not just enums.

Right. Used it in place of ++ and -- a lot.
 All four would be quite handy to have for enums... unfortunately if you can
 define your own values for an enumerant, succ and pred become (at runtime) a
 function containing a switch statement or table lookup.

There's the same problem in Pascal (it supports custom values for enum members since Delphi 6, AFAIK), and it's handled in a somewhat strange manner: Succ always means +1, and Pred is -1, regardless of what the declaration says. So:

    type TEnum = (foo = 1000, bar = 2000, baz = 3000);

The thing is, Pascal defines an enum as "a subrange whose lowest and highest values correspond to the lowest and highest ordinalities of the constants in the declaration". So a variable of type TEnum can take any value in the range 1000 .. 3000, and thus Succ and Pred just increment/decrement by one...

That's why I think enums should be limited to sequential values.
 Personally I think enums *should* be sequential, and a separate flags type
 could deal with bitflags.  typedef'd ints can handle any other case.

It is sometimes very convenient to define an enum with members equal to those of an API:

    enum Key
    {
        LButton = 1,
        RButton = 2,
        Cancel = 3,
        MButton = 4,
        Back = 8, BackSpace = Back,
        Tab = 9,
        Clear = 12,
        Return = 13, Enter = Return,
        Shift = 16,
        Control = 17,
        ...
    }

Now every Key, when cast to int, equals the appropriate VK_* constant - no need for switch() or the like.

So what's so inconvenient about this?

    typedef int VKCode;  // in D I believe this makes a distinct type which
                         // behaves identically to int, except that conversion
                         // can't be done implicitly from int to VKCode
                         // (though I think the opposite still happens implicitly)
    static const VKCode
        LButton = 1,
        RButton = 2,
        Cancel = 3,
        MButton = 4,
        Back = 8, BackSpace = Back,
        Tab = 9,
        Clear = 12,
        Return = 13, Enter = Return,
        Shift = 16,
        Control = 17;
    // this is assuming that implicit conversion from int to VKCode can still
    // be done in the initializer. If you think about it, that's really an
    // explicit conversion anyway, don't you think?

That's how I'd like it to be handled, anyway.

Sean
Mar 01 2002
parent reply "Pavel Minayev" <evilone omen.ru> writes:
"Sean L. Palmer" <spalmer iname.com> wrote in message
news:a5nm1p$r0$1 digitaldaemon.com...

 So what's so inconvenient about this?

 typedef int VKCode;   // in D I believe this makes a distinct type which
 behaves identically to int except for type conversion can't
                                     // be implicitly done from int to VKCode
                                     // (though I think the opposite still happens implicitly)
 static const VKCode
         LButton = 1,
         RButton = 2,
         Cancel = 3,
         MButton = 4,
         Back = 8, BackSpace = Back,
         Tab = 9,
         Clear = 12,
         Return = 13, Enter = Return,
         Shift = 16,
          Control = 17;  // this is assuming that implicit conversion from int
                         // to VKCode can still be done in the initializer.
                         // if you think about it, that's really an
                         // explicit conversion anyway, don't you think?

The difference is that enum defines its own namespace. So it'd be Key.Enter, Key.Tab, Key.A etc... I don't see any other way to do it apart from declaring a separate class specially for that - probably not the best idea...
Mar 01 2002
parent reply "OddesE" <OddesE_XYZ hotmail.com> writes:
"Pavel Minayev" <evilone omen.ru> wrote in message
news:a5nugt$4qh$1 digitaldaemon.com...
 "Sean L. Palmer" <spalmer iname.com> wrote in message
 news:a5nm1p$r0$1 digitaldaemon.com...

 So what's so inconvenient about this?

 typedef int VKCode;   // in D I believe this makes a distinct type which
 behaves identically to int except for type conversion can't
                                     // be implicitly done from int to VKCode
                                     // (though I think the opposite still happens implicitly)
 static const VKCode
         LButton = 1,
         RButton = 2,
         Cancel = 3,
         MButton = 4,
         Back = 8, BackSpace = Back,
         Tab = 9,
         Clear = 12,
         Return = 13, Enter = Return,
         Shift = 16,
          Control = 17;  // this is assuming that implicit conversion from int
                         // to VKCode can still be done in the initializer.
                         // if you think about it, that's really an
                         // explicit conversion anyway, don't you think?

The difference is that enum defines its own namespace. So it'd be Key.Enter, Key.Tab, Key.A etc... I don't see any other way to do it apart from declaring a separate class specially for that - probably not the best idea...

How about placing them into their own module named Key.d? Not the most beautiful solution, I agree though...

--
Stijn
OddesE_XYZ hotmail.com
http://OddesE.cjb.net
__________________________________________
Remove _XYZ from my address when replying by mail
Mar 01 2002
parent reply "Pavel Minayev" <evilone omen.ru> writes:
"OddesE" <OddesE_XYZ hotmail.com> wrote in message
news:a5oasr$hqb$1 digitaldaemon.com...

 How about placing them into it's own module
 named Key.d?  Not the most beatiful solution
 I agree though...

Two problems here. First is that there might be another module with such a (frequently used) name. Second is that if another module contains a function Enter(), you'll have to resolve the scope each time you use it. With an enum, you'd always use Key.Enter for the key, and Enter() to call the function.
Mar 01 2002
parent "OddesE" <OddesE_XYZ hotmail.com> writes:
"Pavel Minayev" <evilone omen.ru> wrote in message
news:a5omng$lmv$1 digitaldaemon.com...
<SNIP>
 Two problems here. First is that there might be another module with
 such (frequently used) name. Second is that if another module contains
 a function Enter(), you'll have to resolve scope each time you use
 it. With enum, you'd always use Key.Enter for key, and Enter() to
 call function.

Yes you are right, hadn't thought of that.

--
Stijn
OddesE_XYZ hotmail.com
http://OddesE.cjb.net
__________________________________________
Remove _XYZ from my address when replying by mail
Mar 02 2002
prev sibling parent reply Antti Sykari <jsykari gamma.hut.fi> writes:
Mark Evans <Mark_member pathlink.com> writes:
 Unicode with its variable-byte-length encodings makes disentangling
 strings from arrays more urgent still.  D should promote strings to
 primitive type status with dedicated constructs.  Icon had it right
 a long time ago.  D is struggling with Unicode because the confused
 C model is not amenable to Unicode.  As a primitive type, Unicode
 string complexities would vanish under the hood.  I'm not holding my
 breath, but if you ask me, that is how to do strings right.  (Those
 in love with C strings could still declare arrays of char.)

One thing D needs for working with Unicode strings is a decent foreach (or iterator) construct, which I suppose is "coming soon". Doing the C-like

    for (int i = 0; i < string.length; ++i)
        process(string[i]);

is out of the question if the internal representation is, for example, UTF-8.

Implementing String as a class would not be totally impossible either, at least if the assignment operator could be overloaded. Of course, the best would probably be to have a string concept built into the language. As of now, the array type seems to have gathered a lot of the functionality that would in normal circumstances be part of the string class (how often do you concatenate arrays other than strings, for example?)
 http://www.toolsofcomputing.com/IconHandbook/
 http://unicon.sourceforge.net/index.html

While Icon is said to be adept at string processing, it's unfortunate that it doesn't support Unicode either:

---
B3. Is there a Unicode version of Icon?

No. Icon is defined in terms of 8-bit characters, and changing this presents several design challenges that would likely break existing programs.
---

-Antti
Mar 29 2003
parent reply Mark Evans <Mark_member pathlink.com> writes:
While Icon is said to be adept at string processing, it's unfortunate
that it doesn't support Unicode either:
-Antti

Icon is recognized worldwide as the king of string processing languages. Its development ceased before Unicode came into favor. Incidentally, if you know of any language that supports native Unicode strings I am all ears.

One of the Unicon testimonials has it right: "Other languages have minimal data structures. Most of our programming is in C. Quite often, I need a list of objects. In C, it is (as you know) a royal pain to declare a structure with a pointer to itself, and malloc them, free them, and walk the chain. Why can't a language just have a 'list' datatype, and be done with it? Why can't a language provide the constructs we all need, instead of providing nearly-assembly-language constructs and letting us develop the rest ourselves?"

Mark
Mar 29 2003
parent reply "Matthew Wilson" <dmd synesis.com.au> writes:
Depends what you mean by Unicode?

Java and C# (and Verifiable Balderdash) all use Unicode UCS-16 as their
native type.

I'm not aware of any programming language - XML is not a programming
language, all you soap suds! - that works with UTF-8 (or 7). Maybe that's
what you meant?

"Mark Evans" <Mark_member pathlink.com> wrote in message
news:b65see$qsr$1 digitaldaemon.com...
While Icon is said to be adept at string processing, it's unfortunate
that it doesn't support Unicode either:
-Antti

Icon is recognized worldwide as the king of string processing languages. Its
 development ceased before Unicode came into favor. Incidentally, if you know of
 any language that supports native Unicode strings I am all ears.

 One of the Unicon testimonials has it right: "Other languages have minimal data
 structures. Most of our programming is in C. Quite often, I need a list of
 objects. In C, it is (as you know) a royal pain to declare a structure with a
 pointer to itself, and malloc them, free them, and walk the chain. Why can't a
 language just have a 'list' datatype, and be done with it? Why can't a language
 provide the constructs we all need, instead of providing nearly-assembly-
 language constructs and letting us develop the rest ourselves?"

 Mark

Mar 29 2003
parent reply Mark Evans <Mark_member pathlink.com> writes:
Matthew Wilson says...
Java and C# ... all use Unicode ... as their native type.

And some languages have seen Unicode retrofits.

http://www.reportlab.com/i18n/python_unicode_tutorial.html
http://rf.net/~james/perli18n.html#Q4

To clarify the remark, I was considering languages that offer Unicode strings as primitives (not merely characters) and are fast string processors (in the C speed range). Maybe C# fits the bill. Python and Java are not 'fast', and Java's String is not a primitive anyway.

Other languages aside, the point is that D needs a Unicode string primitive.

Mark
Mar 29 2003
parent reply "Walter" <walter digitalmars.com> writes:
"Mark Evans" <Mark_member pathlink.com> wrote in message
news:b660it$tn4$1 digitaldaemon.com...
 Other languages aside, the point is that D needs a Unicode string
 primitive.

It does already. In D, a char[] is really a utf-8 array.
Mar 31 2003
parent reply Burton Radons <loth users.sourceforge.net> writes:
Walter wrote:
 "Mark Evans" <Mark_member pathlink.com> wrote in message
 news:b660it$tn4$1 digitaldaemon.com...
 
 Other languages aside, the point is that D needs a Unicode string
 primitive.

 It does already. In D, a char[] is really a utf-8 array.

Er, no...

    void main ()
    {
        char[] foo;
        foo = "\uFF00";
    }

"cannot implicitly convert wchar[1] to char[]". Putting in an explicit cast results in a foo with length 1 and value 0.
Mar 31 2003
parent "Walter" <walter digitalmars.com> writes:
Ok, I'll fix it. -Walter

"Burton Radons" <loth users.sourceforge.net> wrote in message
news:b6a7vb$voq$1 digitaldaemon.com...
 Walter wrote:
 "Mark Evans" <Mark_member pathlink.com> wrote in message
 news:b660it$tn4$1 digitaldaemon.com...

Other languages aside, the point is that D needs a Unicode string

 primitive.

 It does already. In D, a char[] is really a utf-8 array.

Er, no...

    void main ()
    {
        char[] foo;
        foo = "\uFF00";
    }

"cannot implicitly convert wchar[1] to char[]". Putting in an explicit cast results in a foo with length 1 and value 0.

Mar 31 2003