D - Arrays, Slices, Cases
- Karl Bochert <kbochert ix.netcom.com> Feb 21 2002
- not here <not.known this.address.com> Feb 21 2002
- Karl Bochert <kbochert ix.netcom.com> Feb 21 2002
- not here <not.known this.address.com> Feb 21 2002
- "Sean L. Palmer" <spalmer iname.com> Feb 22 2002
- "Pavel Minayev" <evilone omen.ru> Feb 22 2002
- not here <not.here this.address.com> Feb 22 2002
- "Pavel Minayev" <evilone omen.ru> Feb 22 2002
- not here <not.here this.address.com> Feb 22 2002
- "Pavel Minayev" <evilone omen.ru> Feb 22 2002
- Karl Bochert <kbochert ix.netcom.com> Feb 22 2002
- Russell Borogove <kaleja estarcion.com> Feb 22 2002
- "Walter" <walter digitalmars.com> Feb 22 2002
- "not here" <not.here this.address.com> Feb 22 2002
- Karl Bochert <kbochert ix.netcom.com> Feb 22 2002
- "Walter" <walter digitalmars.com> Feb 22 2002
- Barry Pederson <barryp yahoo.com> Feb 23 2002
- "Walter" <walter digitalmars.com> Feb 23 2002
- "Roberto Mariottini" <rmariottini lycosmail.com> Feb 22 2002
- "Pavel Minayev" <evilone omen.ru> Feb 22 2002
- "Walter" <walter digitalmars.com> Feb 22 2002
- "Roberto Mariottini" <rmariottini lycosmail.com> Feb 26 2002
- "Pavel Minayev" <evilone omen.ru> Feb 27 2002
- "Carlos Santander B." <carlos8294 msn.com> Mar 22 2003
- Farmer <itsFarmer. freenet.de> Mar 26 2003
- "Carlos Santander B." <carlos8294 msn.com> Mar 26 2003
- "Sean L. Palmer" <seanpalmer directvinternet.com> Mar 27 2003
- Helmut Leitner <helmut.leitner chello.at> Mar 27 2003
- "Sean L. Palmer" <seanpalmer directvinternet.com> Mar 27 2003
- Helmut Leitner <leitner hls.via.at> Mar 27 2003
- "Mike Wynn" <mike.wynn l8night.co.uk> Mar 27 2003
- Mark Evans <Mark_member pathlink.com> Mar 27 2003
- Farmer <itsFarmer. freenet.de> Mar 30 2003
- Ilya Minkov <midiclub tiscali.de> Mar 28 2003
- Helmut Leitner <leitner hls.via.at> Mar 28 2003
- Ilya Minkov <midiclub 8ung.at> Mar 28 2003
- Mark Evans <Mark_member pathlink.com> Mar 28 2003
- Mark Evans <Mark_member pathlink.com> Mar 27 2003
- Karl Bochert <kbochert copper.net> Mar 29 2003
- Mark Evans <Mark_member pathlink.com> Mar 29 2003
- Burton Radons <loth users.sourceforge.net> Mar 28 2003
- "OddesE" <OddesE_XYZ hotmail.com> Feb 27 2002
- "OddesE" <OddesE_XYZ hotmail.com> Feb 27 2002
- "Pavel Minayev" <evilone omen.ru> Feb 27 2002
- "Sean L. Palmer" <spalmer iname.com> Feb 28 2002
- "Pavel Minayev" <evilone omen.ru> Feb 28 2002
- "Sean L. Palmer" <spalmer iname.com> Mar 01 2002
- "Pavel Minayev" <evilone omen.ru> Mar 01 2002
- "OddesE" <OddesE_XYZ hotmail.com> Mar 01 2002
- "Pavel Minayev" <evilone omen.ru> Mar 01 2002
- "OddesE" <OddesE_XYZ hotmail.com> Mar 02 2002
- Antti Sykari <jsykari gamma.hut.fi> Mar 29 2003
- Mark Evans <Mark_member pathlink.com> Mar 29 2003
- "Matthew Wilson" <dmd synesis.com.au> Mar 29 2003
- Mark Evans <Mark_member pathlink.com> Mar 29 2003
- "Walter" <walter digitalmars.com> Mar 31 2003
- Burton Radons <loth users.sourceforge.net> Mar 31 2003
- "Walter" <walter digitalmars.com> Mar 31 2003
Having too much time on my hands, I submit the following
summary of my viewpoint. At the very least it shows how I
will rationalize the method that D uses.
I would welcome any other rationale that would make D's
choices seem more natural.
(After 20 yrs of C, I still have fencepost problems! :-)
Concerning Arrays, slices, and cases.
1) Ordinal Arrays
Arrays are ordered sets of elements accessed by their position
in the set.
An array with N elements has a first element and an
N'th element.
A slice of the entire array is arr[1..N].
A slice excluding the ends is arr[2..N-1].
end-inclusion:
a slice can be thought of as the result of a procedure
that (somehow) extracts the range, similar to:
result = slice (&arr[start], &arr[end]);
Obviously, end is included.
a slice can be thought of as the result of a loop that
'extracts' elements of the array, similar to:
for (i = start; i <= end; ++i) result[1+i-start] = arr[i];
Obviously, like the for loop, 'end' is included.
Cases are also end-inclusive:
case [1]:
case [2 to 4]:
case [5]:
end-exclusion:
Why would anyone do that?
-----------
2) Cardinal Arrays
Arrays are chunks of storage accessed by the offset from
their start.
An array with N elements has a zero'th element and an N-1'th
element.
A slice of the entire array is arr[0..N-1].
end-inclusion:
A slice excluding the ends is arr[1..N-2].
a slice can be thought of as the result of a procedure
that (somehow) extracts the range, similar to:
result = slice (&arr[start], &arr[end]);
Obviously, end is included.
Cases are also end-inclusive:
case [1]:
case [2 to 4]:
case [5]:
end-exclusion:
A slice excluding the ends is arr[1..N-1].
a slice can be thought of as the result of a loop that
'extracts' elements of the array, similar to:
for (i = start; i < end; ++i) result[i-start] = arr[i];
Obviously, like the for loop, end is excluded.
Cases are also end-exclusive:
case [1]:
case [2 to 5]:
case [5]:
Or cases are end-inclusive (different from slices):
case [1]:
case [2 to 4]:
case [5]:
D is:
Cardinal arrays, end-exclusive, case-inclusive?
I am for:
Ordinal arrays, end-inclusive, case-inclusive.
simpler and more consistent.
Karl Bochert
Feb 21 2002
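The two extraction loops in the post above can be written out in D to make the fencepost difference concrete. This is only a minimal sketch: `sliceInclusive` and `sliceExclusive` are hypothetical helper names, not part of D or its library, and both use 0-based (cardinal) positions.

```d
import std.stdio;

// End-inclusive extraction: copies arr[start] through arr[end], both included.
// Hypothetical helper; D's built-in slices do not behave this way.
int[] sliceInclusive(const int[] arr, size_t start, size_t end)
{
    int[] result;
    for (size_t i = start; i <= end; ++i)
        result ~= arr[i];
    return result;
}

// End-exclusive extraction: copies arr[start] up to, but not including, arr[end].
int[] sliceExclusive(const int[] arr, size_t start, size_t end)
{
    int[] result;
    for (size_t i = start; i < end; ++i)
        result ~= arr[i];
    return result;
}

void main()
{
    int[] arr = [10, 20, 30, 40, 50];
    writeln(sliceInclusive(arr, 1, 3)); // prints [20, 30, 40]
    writeln(sliceExclusive(arr, 1, 3)); // prints [20, 30]
    // The exclusive loop matches what D's own slice expression yields.
    assert(sliceExclusive(arr, 1, 3) == arr[1 .. 3]);
}
```

With the same bounds, the inclusive form yields end - start + 1 elements and the exclusive form yields end - start.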
On Fri, 22 Feb 2002 00:43:28 GMT, Karl Bochert <kbochert ix.netcom.com> wrote:
Having too much time on my hands, I submit the following summary of my viewpoint. At the very least it shows how I will rationalize the method that D uses.
[big snip]
D is: Cardinal arrays, end-exclusive, case-inclusive? I am for: Ordinal arrays, end-inclusive, case-inclusive. simpler and more consistent.
Hi Karl,
whether one uses an index or an offset to reference an array element is often influenced by what we have been exposed to already. However, I'd like to approach the issue in a different manner.
To me an 'index' is 1-based and is a normal way that people think about enumerated elements. If you go up to somebody with a list of items and asked them to number them, the person would normally start with the number 1. An 'offset' is 0-based and is the normal way that computers get access to memory. Addr + offset gives another address - the start of the element in question.
I believe that programming languages have a primary aim of helping people describe their algorithms. In other words, programming languages are for people and not computers - that's why we have compilers. So, I would hold that 1-based array referencing is the normal way for people to describe what they are trying to do.
Furthermore, an index has the connotation that the entire element is being referenced, whereas an offset is better thought of referencing the start of an element. Thus a slice reference of say [2..4] seems to say to me that the slice encompasses element#2, element#3, and element#4. That is the whole of each of these elements. The fact that the length of this slice is 3 is obvious because all of the elements are being referenced.
If we were using offsets in slice notation, then [2..4] would be saying that the slice starts from the start of element #3 and ends at the start of element #5. This represents all of element#3 and all of element#4, but not any of element#5, thus has a length of 2. But this is not how people normally view the world.
I vote with Karl on this one. Besides, calculating the length of an index notation slice is not beyond us, especially if we can do " myArray[x..y].length "
Now consider the way we might remove an element from a dynamic array. Given that 'pos' references the element to be removed...
using Index Notation
A1 = A1[1..pos-1] ~ A1[pos+1 .. A1.length]
using Offset Notation
A1 = A1[0..pos] ~ A1[pos+1 .. A1.length-1]
Not a lot of difference really. Personally, I find that the index notation is more clearly telling the reader that I am trying to exclude the 'pos' element but include everything else.
-------
cheers.
Feb 21 2002
Hi Karl, ... To me an 'index' is 1-based and is a normal way that people think about enumerated elements. If you go up to somebody with a list of items and asked them to number them, the person would normally start with the number 1. An 'offset' is 0-based and is the normal way that computers get access to memory. Addr + offset gives another address - the start of the element in question.
Furthermore, an index has the connotation that the entire element is being referenced, whereas an offset is better thought of referencing the start of an element.
... using Index Notation A1 = A1[1..pos-1] ~ A1[pos+1 .. A1.length] using Offset Notation A1 = A1[0..pos] ~ A1[pos+1 .. A1.length-1]
Didn't you get that wrong?
using Offset Notation (exclusive)
A1 = A1[0..pos] ~ A1[pos+1 .. A1.length]
using Offset Notation (inclusive)
A1 = A1[0..pos-1] ~ A1[pos+1 .. A1.length-1]
(I think that's right -- I had to draw little boxes on a sheet of graph paper)
I have this sneaky feeling that the reason D uses offsets instead of indexes is to be backward-compatible with C, and therefore more familiar.
Karl Bochert
Feb 21 2002
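The end-exclusive offset form from the post above can be checked with a few lines of D. This is a minimal sketch: `removeAt` is a hypothetical helper name, and `pos` is assumed to be a valid 0-based position in the array.

```d
import std.stdio;

// Remove the element at 0-based position pos using end-exclusive slices.
// Hypothetical helper; assumes pos < a.length.
int[] removeAt(int[] a, size_t pos)
{
    // a[0 .. pos] holds elements 0 through pos-1 (a[pos] itself is excluded);
    // a[pos + 1 .. a.length] holds everything after a[pos].
    return a[0 .. pos] ~ a[pos + 1 .. a.length];
}

void main()
{
    int[] a1 = [10, 20, 30, 40, 50];
    a1 = removeAt(a1, 2);
    writeln(a1); // prints [10, 20, 40, 50]
    assert(a1.length == 4);
}
```

Because the upper bound is excluded, the two slices meet at `pos` without either one containing it, and the lengths add up to `a.length - 1` with no extra `-1` terms.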
On Fri, 22 Feb 2002 05:49:50 GMT, Karl Bochert <kbochert ix.netcom.com> wrote:
Hi Karl, ... To me an 'index' is 1-based and is a normal way that people think about enumerated elements. If you go up to somebody with a list of items and asked them to number them, the person would normally start with the number 1. An 'offset' is 0-based and is the normal way that computers get access to memory. Addr + offset gives another address - the start of the element in question.
I really think that too many language designers forget that it's people that have to actually use them, and not computers. The "user-interface" for most programming languages is sub-optimal. Often the language encourages hard-to-comprehend syntax thus making it easier for people to make mistakes.
... using Index Notation A1 = A1[1..pos-1] ~ A1[pos+1 .. A1.length] using Offset Notation A1 = A1[0..pos] ~ A1[pos+1 .. A1.length-1]
Didn't you get that wrong? using Offset Notation (exclusive) A1 = A1[0..pos] ~ A1[pos+1 .. A1.length] using Offset Notation (inclusive) A1 = A1[0..pos-1] ~ A1[pos+1 .. A1.length-1] (I think that's right -- I had to draw little boxes on a sheet of graph paper)
Ooops. You are right. I did get the 'index' code wrong. That might be an example of its inherent non-user-friendly interface ;-)
A1 = A1[0..pos] ~ A1[pos+1 .. A1.length]
is what I should have coded. To the average person, knowing the 'pos' refers to the element being removed, this code looks wrong as it seems to be including A1[pos]!
I have this sneaky feeling that the reason D uses offsets instead of indexes is to be backward-compatible with C, and therefore more familiar.
More familiar to whom? C/C++ coders? One would have hoped that D might be used as a replacement for C/C++ and thus newbies can learn a "better" language and not have to be backward compatible. Also, reading the D Overview we find under the things to drop from C/C++
"C source code compatibility. Extensions to C that maintain source compatibility have already been done (C++ and ObjectiveC). Further work in this area is hampered by so much legacy code it is unlikely that significant improvements can be made."
----
cheers.
Feb 21 2002
"not here" <not.known this.address.com> wrote in message news:1103_1014364009 news.digitalmars.com...
On Fri, 22 Feb 2002 05:49:50 GMT, Karl Bochert <kbochert ix.netcom.com> wrote:
To me an 'index' is 1-based and is a normal way that people think about enumerated elements. If you go up to somebody with a list of items and asked them to number them, the person would normally start with the number 1. An 'offset' is 0-based and is the normal way that computers get access to memory. Addr + offset gives another address - the start of the element in question.
I really think that too many language designers forget that it's people that have to actually use them, and not computers. The "user-interface" for most programming languages is sub-optimal. Often the language encourages hard-to-comprehend syntax thus making it easier for people to make mistakes.
I don't know what planet you guys are from... go use BASIC or something if you want arrays that start at position 1 instead of 0. Computer arrays start at 0. Every programmer needs to learn this right away. It's very fundamental, and trying to "humanize" it just results in a language that requires suboptimal code generation. I personally think they should teach people about zero earlier on in school, then we wouldn't have this problem. How would you like that? ;) Sean
Feb 22 2002
"Sean L. Palmer" <spalmer iname.com> wrote in message news:a552bk$17v$1 digitaldaemon.com...
I personally think they should teach people about zero earlier on in school, then we wouldn't have this problem. How would you like that? ;)
Great idea! So, I have two cars, the zeroeth is red and the first is blue =)
Feb 22 2002
On Fri, 22 Feb 2002 01:18:54 -0800, "Sean L. Palmer" <spalmer iname.com> wrote:
I don't know what planet you guys are from... go use BASIC or something if you want arrays that start at position 1 instead of 0. Computer arrays start at 0. Every programmer needs to learn this right away. It's very fundamental, and trying to "humanize" it just results in a language that requires suboptimal code generation. I personally think they should teach people about zero earlier on in school, then we wouldn't have this problem. How would you like that? ;)
This sounds a lot like "Well thank you, Ma'am, but quite frankly, that's not how we do things around these parts". I would have thought that with D, we have a chance to break free of the computer-centric way of doing things and instead design a language that makes life easier for coders at every possible chance. If people all around the world, in all cultures (except, it seems, veteran coders), count off things starting with one, why should we have to "retrain" them to start thinking as if they are a computer?
Yes, I know that computer arrays start at 0. Just like my high school ruler also started at zero. But that first inch is still inch #1 and not inch #0.
If one is truly concerned with suboptimal code generation, we would all be still creating hand-crafted assembler (or even machine code) programs. All we are talking about here is sometimes generating a "subtract one" opcode or similar, and today's, let alone tomorrow's, computers are very, very fast.
Isn't a compiler a tool? A tool for people to use? To make our lives easier? So let our compilers take what is normal for people and convert it for computer usage, rather than having the language make people convert what is normal for them into computer-ese.
-------
cheers.
Feb 22 2002
A1 = A1[0..pos] ~ A1[pos+1 .. A1.length] is what I should have coded. To the average person, knowing the 'pos' refers to the element being removed, this code looks wrong as it seems to be including A1[pos]!
To the average C/C++/C#/Java programmer, it looks just as it should.
More familiar to whom? C/C++ coders? One would have hoped that D might be used as a replacement for C/C++ and thus newbies can learn a "better" language and not have to be backward compatible. Also, reading the D Overview we find under the things to drop from C/C++
I believe Walter said that D is not a language for the beginners. BASIC, or even Pascal would be better for this purpose. D is a practical language for practical programmers, and I don't think it's the best idea to sacrifice speed to gain such a subtle simplicity, IMO...
"C source code compatibility. Extensions to C that maintain source compatibility have already been done (C++ and ObjectiveC). Further work in this area is hampered by so much legacy code it is unlikely that significant improvements can be made."
"compatibility" means ability to compile code from that language. This is what D is not for. But there are many programmers that know only C (or C++, or C#, or Java - the same language family) - and those people expect to find a common environment to start coding quick, without having to learn everything from scratch. Arrays are indexed from 0, every C programmer should remember that better than his own name - why disappoint them?
0-based indexing is a tradition too old to change it - it's better to live with it, especially since it's not hard to get used to it.
Feb 22 2002
On Fri, 22 Feb 2002 16:09:36 +0300, "Pavel Minayev" <evilone omen.ru> wrote:
A1 = A1[0..pos] ~ A1[pos+1 .. A1.length] is what I should have coded. To the average person, knowing the 'pos' refers to the element being removed, this code looks wrong as it seems to be including A1[pos]!
To the average C/C++/C#/Java programmer, it looks just as it should.
Should I infer then that the average D programmer is always going to be an average C/C++/C#/Java programmer too? Is this a short-sighted attitude for the future of D? Can we not expect COBOL coders to come over the fence? If not, why not?
More familiar to whom? C/C++ coders? One would have hoped that D might be used as a replacement for C/C++ and thus newbies can learn a "better" language and not have to be backward compatible. Also, reading the D Overview we find under the things to drop from C/C++
I believe Walter said that D is not a language for the beginners. BASIC, or even Pascal would be better for this purpose.
I assume that Walter is referring to people who are just learning to program. I would have thought that the fewer new things that people have to learn, the sooner they can become productive. If this is so, then it would appear that a design goal for D is to assume newcomers to D will be existing C/etc coders so they don't have to learn too many new things. Oh well, maybe we are condemned to repeat history.
D is a practical language for practical programmers,
...meaning that Basic and Pascal are NOT practical languages, and their users are NOT practical? Ummm. Sounds a little xenophobic to me.
and I don't think it's the best idea to sacrifice speed to gain such a subtle simplicity, IMO...
Why is it that we spend hours of coding time to optimise a few micro-seconds into a program? We no longer live in the age when computer time is more expensive than people time. It seems you are willing to sacrifice coders' time rather than computer time. I don't think it's the best idea to sacrifice coding speed, IMO.
"C source code compatibility. Extensions to C that maintain source compatibility have already been done (C++ and ObjectiveC). Further work in this area is hampered by so much legacy code it is unlikely that significant improvements can be made."
"compatibility" means ability to compile code from that language. This is what D is not for. But there are many programmers that know only C (or C++, or C#, or Java - the same language family) - and those people expect to find a common environment to start coding quick, without having to learn everything from scratch. Arrays are indexed from 0, every C programmer should remember that better than his own name - why disappoint them?
Heaven forbid that we should try to retrain C coders! Everyone knows that we are sacrosanct and must be protected.
Every good C programmer knows how useful the macro preprocessor is (oops, that's not in D is it?)
Every good C++ programmer knows how useful multiple inheritance can be (ooops, that's not in D is it?)
Every good C++/Java/C# programmer knows how useful namespaces can be (ooops, that's not in D is it?)
Every C programmer can type #include files in their sleep (ooops, that's not in D is it?)
Yes I know these are a little unfair. But what I'm trying to get across is that D will already force C coders to learn/unlearn things. So why not have 1-based indexes, just like we use in the real world.
0-based indexing is a tradition too old to change it - it's better to live with it, especially since it's not hard to get used to it.
Hey we got tradition! You can't mess that baby. Sure it makes things a bit harder but you'll soon get used to that.
Is this the same as saying "We can't do that new thing because it's not what we currently do"?
I could just as equally say "1-based indexing is not hard to get used to, seeing you already do it everywhere else except when you are thinking like a computer."
-----
cheers
Feb 22 2002
"not here" <not.here this.address.com> wrote in message news:1104_1014393359 news.digitalmars.com...
Should I infer then that the average D programmer is always going to be an average C/C++/C#/Java programmer too? Is this a short-sighted attitude for the future of D?
Definitely not. However, the most popular language nowadays is C++, so a C-centric model seems most appropriate to me here.
...meaning that Basic and Pascal are NOT practical languages, and their users are NOT practical?
Not really. It's just that C/C++ proves to be better in such cases. So should D.
Why is it that we spend hours of coding time to optimise a few micro-seconds into a program? We no longer live in the age when computer time is more expensive than people time. It seems you are willing to sacrifice coders' time rather than computer time. I don't think it's the best idea to sacrifice coding speed, IMO.
There is no "sacrifice" in coding speed with 0-based indexing. Nothing you can't get used to, and, in fact, many programmers over the world are already. On the other hand, that microsecond can cost you a 20fps drop when writing a game which iterates through 100000 objects in a loop...
Every good C programmer knows how useful the macro preprocessor is (oops, that's not in D is it?)
The preprocessor has been discussed many times and there are _very_ serious reasons to ban it. After all, if you really need one, you can always use an external program. Otherwise, most uses of the preprocessor in C are covered by constants, inline functions, and version/debug statements in D.
Every good C++ programmer knows how useful multiple inheritance can be (ooops, that's not in D is it?)
The latest discussion on the topic in this group shows that, in fact, most C++ programmers use MI rarely or don't use it at all.
Every good C++/Java/C# programmer knows how useful namespaces can be (ooops, that's not in D is it?)
Namespaces ARE good. And they are present in D - implicitly, each module is a namespace. Don't forget about packages as well. And the fact that each enum has its own namespace speaks for itself.
Every C programmer can type #include files in their sleep (ooops, that's not in D is it?)
import c.stdio;
Yes I know these are a little unfair. But what I'm trying to get across is that D will already force C coders to learn/unlearn things. So why not have 1-based indexes, just like we use in the real world.
Even in the real world, indices aren't always 1-based. Then again, with 1-based arrays, you can only have 2^32-1 elements, while with 0-based you get the whole 2^32! Just think of all the benefits this gives! =)
Hey we got tradition! You can't mess that baby. Sure it makes things a bit harder but you'll soon get used to that.
"Tradition" isn't something you should get used to. It's something that _most_ people got used to.
Is this the same as saying "We can't do that new thing because it's not what we currently do"?
Almost. Just because it seems impractical.
I could just as equally say "1-based indexing is not hard to get used to, seeing you already do it everywhere else except when you are thinking like a computer."
Well, mathematicians don't use it. Also, I understand why a user shouldn't think like a computer. But why not a programmer? Memory is indexed from 0, nothing you can do with it. So are all those assembler opcodes that your program ends up being, anyhow. You write programs for computers, not for men, after all...
Feb 22 2002
On Fri, 22 Feb 2002 21:27:16 +0300, "Pavel Minayev" <evilone omen.ru> wrote:
There is no "sacrifice" in coding speed with 0-based indexing. ... Nothing you can't get used to, and, in fact, many programmers over the world are already. On other hand, that microsecond can cost you 20fps drop when writing a game which iterates through 100000 objects in a loop...
Actually the instruction time for an add is more like 1 ns these days, especially on computers where speed is a concern. The CPU has multiple execution units, so the add would frequently be done in parallel with other instructions, costing 0 time. The +1 will often be combined with another constant, costing 0 time. Many assembly instructions used for array access use an addressing mode that adds a constant whether it is needed or not, again 0 time.
Finally there is the issue of how often the +1 comes up in the first place.
arr[2] -- no +1
arr[x] -- no +1 unless the programmer has stored the wrong thing in x
Of course you run into a problem if you pass a cardinal to something that expects an ordinal!
An interesting point: Addressing arrays with cardinals implies the existence of an element at [-1]. There are no negative ordinals!
Even in the real world, indices aren't always 1-based.
... indexes.
Then again, with 1-based arrays, you can only have 2^32-1 elements, while with 0-based you get the whole 2^32! Just think of all the benefits this gives! =)
... indexes. Unless you count (shudder) negative indexes. I suppose implementation issues would limit ordinal arrays though.
Is this the same as saying "We can't do that new thing because its not what we currently do"?
Almost. Just because it seems impractical.
... written using cardinal arrays ( And that is a killer!)
Well, mathematicians don't use it. Also, I understand why a user shouldn't think like a computer. But why not programmer? Memory is indexed from 0, nothing you can do with it. So are all that assembler opcodes that your program ends up being, anyhow.
The program actually ends up being binary -- I would like to make as few concessions as possible to that unfortunate fact.
You write programs for computers, not for men, after all...
I write programs for 2 readers: compilers, and humans (If you can call programmers that :-) The great majority of my programming time is spent trying to satisfy the humans.
Feb 22 2002
not here wrote:
Heaven forbid that we should try to retrain C coders! Everyone knows that we are sacrosanct and must be protected. Every good C programmer knows how useful the macro preprocessor is (oops, that's not in D is it?) Every good C++ programmer knows how useful multiple inheritance can be (ooops, that not in D is it?) Every good C++/Java/C# programer knows how useful namespaces can be (ooops, that's not in D is it?) Every C programmer can type #include files in their sleep (ooops, that's not in D is it?) Yes I know these are a little unfair. But what I'm trying to get across is that D will already force C coders to learn/unlearn things. So why not have 1-based indexes, just like we use in the real world.
I don't see the problem as being one of tradition -- as you point out, Walter has dispensed with a number of C/C++ traditions -- but of practicality. The changes you describe above are wholesale removals of features that the compiler can slap you for very quickly if you forget about. Trying to use the preprocessor in D clearly won't lead to subtle bugs; the compiler will scream and the typical programmer will very quickly adapt.
Changing the index base will create subtle bugs, because code that would compile in C or C++ will still frequently compile in what I'll call D-1 (D with 1-based arrays), but change its meaning. That scares me.
The most common error, of course, will be accessing the zeroeth element of a 1-based array, which should throw an exception, but how about the following code:
    void parse_command_line_option( char[] option )
    {
        switch (option[1]) // switch on the character after the '-'
        {
            case 'x': do_option_x(); break;
            case 'y': do_option_y(); break;
            default: printf("undefined option\n"); break;
        }
    }
The comment is correct for C/C++, but in D-1 the comment is misleading and both "-x" and "-y" yield an undefined option message.
-RB
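The silent change of meaning described above can be simulated in present-day D. This is only a sketch under stated assumptions: `at1` is a hypothetical helper that emulates the 1-based "D-1" reading by mapping index i to offset i - 1, and `classify` is a hypothetical stand-in for the switch in the post that returns a label instead of calling handlers.

```d
import std.stdio;

// Hypothetical stand-in for the switch in the post: returns a label
// instead of calling do_option_x() and friends.
string classify(char c)
{
    switch (c)
    {
        case 'x': return "option x";
        case 'y': return "option y";
        default:  return "undefined option";
    }
}

// Emulates a hypothetical 1-based language ("D-1" in the post):
// index i refers to the element at offset i - 1. Assumes i >= 1.
char at1(string s, size_t i)
{
    return s[i - 1];
}

void main()
{
    string option = "-x";
    // 0-based: option[1] is the character after the '-'.
    writeln(classify(option[1]));      // prints option x
    // 1-based reading of the same source text: element 1 is the '-' itself.
    writeln(classify(at1(option, 1))); // prints undefined option
}
```

Both calls are written with the index 1, both compile, and only the second falls through to the default case, which is the kind of bug no compiler diagnostic would catch.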
Feb 22 2002
"Russell Borogove" <kaleja estarcion.com> wrote in message news:3C76928A.8090508 estarcion.com...
I don't see the problem as being one of tradition -- as you point out, Walter has dispensed with a number of C/C++ traditions -- but of practicality. The changes you describe above are wholesale removals of features that the compiler can slap you for very quickly if you forget about. Trying to use the preprocessor in D clearly won't lead to subtle bugs; the compiler will scream and the typical programmer will very quickly adapt. Changing the index base will create subtle bugs, because code that would compile in C or C++ will still frequently compile in what I'll call D-1 (D with 1-based arrays), but change its meaning. That scares me. The most common error, of course, will be accessing the zeroeth element of a 1-based array, which should throw an exception, but how about the following code:
And I want to emphasize this is an important point. D tries hard to avoid having incompatibilities with C that will subtly break things. 1 based arrays would do that. Another way to really mess people up would be to change the operator precedence. Since D is meant to appeal to C and C++ programmers, I feel it would be a mistake to change those things, regardless of how meritorious those changes are when viewed outside of the context of C familiarity. I've done some conversions of a few thousand line programs from C++ to D, and it's bad enough finding and fixing all the dependencies on 0 terminated strings. That turns out to be more work than I'd anticipated.
Feb 22 2002
"Walter" <walter digitalmars.com> wrote in message news:a56tc2$1534$1 digitaldaemon.com...
And I want to emphasize this is an important point. D tries hard to avoid having incompatibilities with C that will subtly break things. 1 based arrays would do that. Another way to really mess people up would be to change the operator precedence. Since D is meant to appeal to C and C++ programmers, I feel it would be a mistake to change those things, regardless of how meritorious those changes are when viewed outside of the context of C familiarity. I've done some conversions of a few thousand line programs from C++ to D, and it's bad enough finding and fixing all the dependencies on 0 terminated strings. That turns out to be more work than I'd anticipated.
Thanks for this explanation, Walter. I now have a better understanding of your goals for D. I wish you and your endeavours well. I think you are off to a good start. Goodbye.
Feb 22 2002
And I want to emphasize this is an important point. D tries hard to avoid having incompatibilities with C that will subtly break things. 1 based arrays would do that. Another way to really mess people up would be to change the operator precedence. Since D is meant to appeal to C and C++ programmers, I feel it would be a mistake to change those things, regardless of how meritorious those changes are when viewed outside of the context of C familiarity.
So while ordinal arrays might be more technically correct, D would be less successful with them. However annoying that is, I'm sure you're right.
I've done some conversions of a few thousand line programs from C++ to D, and it's bad enough finding and fixing all the dependencies on 0 terminated strings. That turns out to be more work than I'd anticipated.
0-terminated strings today; 0-terminated arrays tomorrow. (maybe in 'E').
Karl
Feb 22 2002
"Karl Bochert" <kbochert ix.netcom.com> wrote in message news:1104_1014441056 bose...
So while ordinal arrays might be more technically correct, D would be less successful with them. However annoying that is, I'm sure you're right.
Language design is inevitably going to be an uneasy alliance of contradictory goals <g>. It's like designing a house - to make a bigger closet, you have to shrink the bedroom.
Feb 22 2002
I've been following this debate about slices and cases with some interest. As
a Python programmer, I'm rooting for end-exclusiveness in slices - it's the
way Python handles slices, and seems to work very well in practice. If D
sticks with that, then I think there'll be a lot of Python programmers who
will be able to make use of it very comfortably.
Another Python-ish thing that might be worth considering would be support for
negative array indexes. For example:
foo[-1] is the last element of array foo
foo[-2] is the second-to-last element of array foo
then with slices
foo[-2..] would be the last two elements of foo
foo[1..-1] would be a copy of foo, except for the first and last elements
implementation should be pretty simple, for any index < 0, such as: foo[-n],
the compiler would just treat it like: foo[foo.length - n]
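A minimal sketch of that rewrite as ordinary D library code, assuming hypothetical helpers `normalize`, `at` and `slice` (D itself has no negative indexing). Note that for foo[-n] the index value is -n, so foo.length - n is computed here as length + index.

```d
import std.stdio;

// Hypothetical helper: maps a possibly negative index onto a 0-based offset.
size_t normalize(long index, size_t length)
{
    // This branch is the runtime check that is needed whenever the index
    // is not a compile-time constant.
    if (index < 0)
        return cast(size_t)(cast(long) length + index);
    return cast(size_t) index;
}

// Hypothetical element access with Python-style negative indexes.
int at(const int[] foo, long index)
{
    return foo[normalize(index, foo.length)];
}

// Hypothetical end-exclusive slice with Python-style negative bounds.
// Unlike the "copy" described in the post, a D slice is a view of foo.
int[] slice(int[] foo, long start, long end)
{
    return foo[normalize(start, foo.length) .. normalize(end, foo.length)];
}

void main()
{
    int[] foo = [10, 20, 30, 40, 50];
    writeln(at(foo, -1));       // prints 50
    writeln(at(foo, -2));       // prints 40
    writeln(slice(foo, 1, -1)); // prints [20, 30, 40]
}
```

The open-ended form foo[-2..] is not covered by this sketch; it would need a separate convention for a missing upper bound.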
CASES
----------
When it comes to case-ranges though, I agree that end-exclusiveness would be a
weird PITA. Perhaps the thing to do is decide that case-ranges and
array-slices will be different things, and go ahead and use different syntaxes.
To steal from Python again, perhaps use a colon for array slicing (foo[1:-1])
and keep '..' for case-ranges (which seems pretty natural).
This would also kind of make sense if you also supported Python-style negative
array indexes, since negative numbers in case-ranges would presumably have a
completely different meaning.
Barry
Feb 23 2002
"Barry Pederson" <barryp yahoo.com> wrote in message news:3C77CFDD.2050404 yahoo.com...
> I've been following this debate about slices and cases with some interest. As
> a Python programmer, I'm rooting for end-exclusiveness in slices - it's the
> way Python handles slices, and seems to work very well in practice. If D
> sticks with that, then I think there'll be a lot of Python programmers who
> will be able to make use of it very comfortably.
> Another Python-ish thing that might be worth considering would be support for
> negative array indexes. For example:
> foo[-1] is the last element of array foo
> foo[-2] is the second-to-last element of array foo
> then with slices
> foo[-2..] would be the last two elements of foo
> foo[1..-1] would be a copy of foo, except for the first and last elements
> implementation should be pretty simple, for any index < 0, such as: foo[-n],
> the compiler would just treat it like: foo[foo.length - n]
While this is a great idea, it suffers from a serious problem - the compiler will have to insert a runtime check for negative indices whenever the index expression is not a constant. This will be too much overhead.
> CASES
> ----------
> When it comes to case-ranges though, I agree that end-exclusiveness would be a
> weird PITA. Perhaps the thing to do is decide that case-ranges and
> array-slices will be different things, and go ahead and use different syntaxes.
> To steal from Python again, perhaps use a colon for array slicing (foo[1:-1])
> and keep '..' for case-ranges (which seems pretty natural).
> This would also kind of make sense if you also supported Python-style negative
> array indexes, since negative numbers in case-ranges would presumably have a
> completely different meaning.
Yes, using a different syntax may be the best solution.
Feb 23 2002
"Pavel Minayev" <evilone omen.ru> wrote in message news:a55g08$7e8$1 digitaldaemon.com...
>> A1 = A1[0..pos] ~ A1[pos+1 .. A1.length] is what I should have coded. To the average person, knowing the 'pos'
>> removed, this code looks wrong as it seems to be including A1[pos]!
> To the average C/C++/C#/Java programmer, it looks just as it should.
So I'm not an "average" C/C++/Java programmer, even though I have been using them since 1991/93/96. I know how slicing currently works in D, but I had to double check to understand.
> [...]
> But there are many programmers that know only C (or C++, or C#, or Java - the same language family) - and those people expect to find a common environment to start coding quick, without having to learn everything from scratch. Arrays are indexed from 0, every C programmer should remember that better than his own name - why disappoint them? 0-based indexing is a tradition too old to change it - it's better to live with it, especially since it's not hard to get used to it.
I always wondered why C and its derivatives don't have a way to define a start index like Pascal does. To me it seems better to leave to the compiler the task of subtracting the start index from the actual index. In C you write:
    int occurrencies['Z'-'A'+1];
    for (i = 0; i < size; ++i)
    {
        ++occurrencies[s[i]-'A'];
    }
Here the task of subtracting 'A' at every indexing is left to the programmer. Maybe the compiler could live with an optional initial index to subtract every time the array is accessed.
Ciao
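As a rough illustration of what "leaving the subtraction to the compiler" would buy, here is a C sketch that at least confines the subtraction to one place. The names letter_counts, letter_at and count_letters are made up for this example, and it assumes a character set where 'A'..'Z' are contiguous (ASCII, for instance).

```c
#include <stddef.h>

#define LETTER_LO 'A'
#define LETTER_HI 'Z'

/* One slot per letter 'A'..'Z' inclusive (26 slots in ASCII). */
struct letter_counts {
    int slot[LETTER_HI - LETTER_LO + 1];
};

/* Hypothetical accessor: index by the letter itself. The base is
   subtracted here, once, instead of at every use site.
   Returns NULL for characters outside 'A'..'Z'. */
int *letter_at(struct letter_counts *c, char letter)
{
    if (letter < LETTER_LO || letter > LETTER_HI)
        return NULL;
    return &c->slot[letter - LETTER_LO];
}

/* Count the upper-case letters among the first `size` characters of s.
   The caller is expected to zero-initialise *c beforehand. */
void count_letters(struct letter_counts *c, const char *s, size_t size)
{
    for (size_t i = 0; i < size; ++i) {
        int *p = letter_at(c, s[i]);
        if (p != NULL)
            ++*p;
    }
}
```

A language-level start index would make the accessor unnecessary: the declaration would carry the 'A' and every plain indexing expression would pick it up.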
Feb 22 2002
"Roberto Mariottini" <rmariottini lycosmail.com> wrote in message news:a55s13$1c8a$1 digitaldaemon.com...
> So I'm not an "average" C/C++/Java programmer, therefore I use them only since 1991/93/96. I know how slicing currently works in D, but I had to double check to understand.
End-exclusive slicing _is_ an issue, definitely. But we were talking about 0-based indexing.
> I always wondered why C and derivatives don't have a way to define a start index like Pascal does. To me it seems better to leave to the compiler the task to subtract the start index from the actual index. In C you write:
>     int occurrencies['Z'-'A'+1];
>     for (i = 0; i < size; ++i)
>     {
>         ++occurrencies[s[i]-'A'];
>     }
> Here the task to subtract 'A' to every indexing is left to the programmer. Maybe the compiler could live with an optional initial index to subtract every time the array is accessed.
This is a far better idea. What I like in Pascal is the ability to use 0-based, 1-based or whatever else based arrays depending on your task and your personal taste. Those who care about speed (me) would probably use 0-based (and I believe it should be the default, to work the same way as in C/C++). Otherwise, you can specify it yourself:
    int[5] foo;     // consists of foo[0] to foo[4]
    int[1..5] bar;  // consists of bar[1] to bar[5]
Feb 22 2002
"Pavel Minayev" <evilone omen.ru> wrote in message news:a562qa$5lv$1 digitaldaemon.com...
> This is a far better idea. What I like in Pascal is the ability to use 0-based, 1-based or whatever else based arrays depending on your task and your personal taste. Those who care of speed (me) would probably use 0-based (and I believe it should be the default, to work the same way as in C/C++). Otherwise, you can specify it yourself:
>     int[5] foo;     // consists of foo[0] to foo[4]
>     int[1..5] bar;  // consists of bar[1] to bar[5]
Having lower bounds specifiable will work with D (and even with C), but in my decades (!) of programming I've never found a use for it. I came to C from Basic, FORTRAN, and Pascal. I had some initial trouble getting used to 0 based rather than 1 based, but never looked back. 0 based looked more 'right' to me.
Feb 22 2002
"Walter" <walter digitalmars.com> wrote in message news:a56tc4$1534$2 digitaldaemon.com...
> "Pavel Minayev" <evilone omen.ru> wrote in message news:a562qa$5lv$1 digitaldaemon.com...
>> This is a far better idea. What I like in Pascal is the ability to use 0-based, 1-based or whatever else based arrays depending on your task and your personal taste. Those who care of speed (me) would probably use 0-based (and I believe it should be the default, to work the same way as in C/C++). Otherwise, you can specify it yourself:
>>     int[5] foo;     // consists of foo[0] to foo[4]
>>     int[1..5] bar;  // consists of bar[1] to bar[5]
> Having lower bounds specifiable will work with D (and even with C), but in my decades (!) of programming I've never found a use for it. I came to C from Basic, FORTRAN, and Pascal.
I, on the other hand, have some Pascal sources that will stay as they are. When I program I want to use all the features the language provides, and my old Pascal sources are full of sets, subrange variables, nested procedures, arbitrarily indexed arrays, and so on. Automatic code translators simply produce unmaintainable code. I once started to translate one of them to C++, but I stopped when I realized I'd have to work hard to produce a huge set of classes only to support basic Pascal capabilities, with a big loss in readability.
> I had some initial trouble getting used to 0 based rather than 1 based, but never looked back. 0 based looked more 'right' to me.
Don't confuse normal initial trouble with a lack of language expressiveness. I know that, in the end, an array index must be translated to a 0-based integer. But simply doing the translation myself doesn't seem to me the right solution in most cases. So I agree with Pavel: arrays should be 0-based by default, leaving the possibility to choose a different start index if needed, and stating clearly that n-based indexes perform worse.
Ciao
Feb 26 2002
"Walter" <walter digitalmars.com> wrote in message news:a56tc4$1534$2 digitaldaemon.com...
> Having lower bounds specifiable will work with D (and even with C), but in my decades (!) of programming I've never found a use for it. I came to C from Basic, FORTRAN, and Pascal. I had some initial trouble getting used to 0 based rather than 1 based, but never looked back. 0 based looked more 'right' to me.
Well, there are actually cases where you'd prefer some base other than 0. As you've seen, many people here consider 1 to be more suitable, and I understand them... also there are some other cases, for example, suppose you have an array of year income for 1990-2000, in Pascal you'd probably declare it as "array[1990 .. 2000] of integer", and then index it like income[1995], letting the compiler do his job and insert all the necessary decrements; the result is clean code, easy to read and maintain. In C, you have to do it all yourself, and probably define some const base = 1990, and clutter all your code with things like income[1995 - base]. After all, it's as simple as subtracting the base from the index, a single SUB... ain't it worth the thing?
Feb 27 2002
"Pavel Minayev" <evilone omen.ru> wrote in message news:a5jben$16o8$1 digitaldaemon.com...
> "Walter" <walter digitalmars.com> wrote in message news:a56tc4$1534$2 digitaldaemon.com...
>> Having lower bounds specifiable will work with D (and even with C), but in my decades (!) of programming I've never found a use for it. I came to C from Basic, FORTRAN, and Pascal. I had some initial trouble getting used to 0 based rather than 1 based, but never looked back. 0 based looked more 'right' to me.
> Well, there are actually cases where you'd prefer some base other than 0. As you've seen, many people here consider 1 to be more suitable, and I understand them... also there are some other cases, for example, suppose you have an array of year income for 1990-2000, in Pascal you'd probably declare it as "array[1990 .. 2000] of integer", and then index it like income[1995], letting the compiler do his job and insert all the necessary decrements; the result is clean code, easy to read and maintain. In C, you have to do it all yourself, and probably define some const base = 1990, and clutter all your code with things like income[1995 - base].
> After all, it's as simple as subtracting the base from the index, a single SUB... ain't it worth the thing?
No one ever answered this one. It seems very clever to me.
Carlos Santander
Mar 22 2003
"Carlos Santander B." <carlos8294 msn.com> wrote in news:b5k815$11ia$2 digitaldaemon.com:
> "Pavel Minayev" <evilone omen.ru> wrote in message news:a5jben$16o8$1 digitaldaemon.com...
>> "Walter" <walter digitalmars.com> wrote in message news:a56tc4$1534$2 digitaldaemon.com...
>>> Having lower bounds specifiable will work with D (and even with C), but in my decades (!) of programming I've never found a use for it. I came to C from Basic, FORTRAN, and Pascal. I had some initial trouble getting used to 0 based rather than 1 based, but never looked back. 0 based looked more 'right' to me.
>> Well, there are actually cases where you'd prefer some base other than 0. As you've seen, many people here consider 1 to be more suitable, and I understand them... also there are some other cases, for example, suppose you have an array of year income for 1990-2000, in Pascal you'd probably declare it as "array[1990 .. 2000] of integer", and then index it like income[1995], letting the compiler do his job and insert all the necessary decrements; the result is clean code, easy to read and maintain. In C, you have to do it all yourself, and probably define some const base = 1990, and clutter all your code with things like income[1995 - base].
Having arrays with base 1 alongside zero-based indexing is likely to become a maintainer's nightmare: whenever you see an array, you must look at its declaration (or wait for a tooltip from your IDE) and check whether it is zero-based or not. Array operations become much more bug prone. Once your brain has adjusted to zero-based indices, you can easily write bug-free code for arrays or verify that given code is bug-free. But switching between different bases is very difficult.
Zero-based indices are favoured over 1-based indices for implementation reasons. E.g. with a byte you can address 256 array elements instead of only 255.
In D, there are many ways to express concepts, e.g. functions, classes, D-structs, templates. I believe that non-zero based arrays are not really required to express concepts in a way that is suitable for the problems programmers have to solve.
Farmer.
Mar 26 2003
"Farmer" <itsFarmer. freenet.de> wrote in message news:Xns934B78D5F5FDitsFarmer 63.105.9.61...
> "Carlos Santander B." <carlos8294 msn.com> wrote in news:b5k815$11ia$2 digitaldaemon.com:
>> "Pavel Minayev" <evilone omen.ru> wrote in message news:a5jben$16o8$1 digitaldaemon.com...
>>> "Walter" <walter digitalmars.com> wrote in message news:a56tc4$1534$2 digitaldaemon.com...
>>>> Having lower bounds specifiable will work with D (and even with C), but in my decades (!) of programming I've never found a use for it. I came to C from Basic, FORTRAN, and Pascal. I had some initial trouble getting used to 0 based rather than 1 based, but never looked back. 0 based looked more 'right' to me.
>>> Well, there are actually cases where you'd prefer some base other than 0. As you've seen, many people here consider 1 to be more suitable, and I understand them... also there are some other cases, for example, suppose you have an array of year income for 1990-2000, in Pascal you'd probably declare it as "array[1990 .. 2000] of integer", and then index it like income[1995], letting the compiler do his job and insert all the necessary decrements; the result is clean code, easy to read and maintain. In C, you have to do it all yourself, and probably define some const base = 1990, and clutter all your code with things like income[1995 - base].
> Having arrays with base 1 alongside zero-based indexing is likely to become a maintainer's nightmare: whenever you see an array, you must look at its declaration (or wait for a tooltip from your IDE) and check whether it is zero-based or not. Array operations become much more bug prone. Once your brain has adjusted to zero-based indices, you can easily write bug-free code for arrays or verify that given code is bug-free. But switching between different bases is very difficult.
> Zero-based indices are favoured over 1-based indices for implementation reasons. E.g. with a byte you can address 256 array elements instead of only 255.
> In D, there are many ways to express concepts, e.g. functions, classes, D-structs, templates. I believe that non-zero based arrays are not really required to express concepts in a way that is suitable for the problems programmers have to solve.
> Farmer.
I wasn't referring to 1-based arrays only, but to any-base arrays. By default, arrays would be 0-based, but what if we could have:
    int [4..14] b;  // starts at 4, ends at 13
    int [-3..7] c;  // starts at -3, ends at 6
    int [10] a;     // normal array, identical to int[0..10] a
Delphi supports that, and I think it could be an interesting addition.
Carlos Santander
Mar 26 2003
I disagree. Nonzero-based arrays buy you convenience. Now you don't have to remember to subtract the base, it'll do it automatically.
Not everybody thinks like a pro programmer. That used to be one of the big selling points of BASIC, that it had 1-based arrays. I always liked the feature in Pascal.
If I want to index my array as 2,3,4 for some reason instead of 0,1,2, why should the compiler force me to use 0,1,2? I'll just go and use 2,3,4 anyway, but now I have to write a stupid array wrapper class that does the subtraction for me, or remember to subtract 2 all the time.
I can't think of a good example off the top of my head, but say your indices are an enum type instead of int. And let's just say that for instance your enum type only goes from 8 thru 12 because it also happens to be part of a hardware register matching some bits you don't have control over. And you want to map those hardware states to some other data. Well, you have to remember to subtract 8 all the time or you'll waste memory or get an array bounds error.
It's such a simple thing... D needs a range type. Then using them for declaring arrays becomes easy; if it wasn't there you'd wonder why not.
Sean
"Farmer" <itsFarmer. freenet.de> wrote in message news:Xns934B78D5F5FDitsFarmer 63.105.9.61...
> "Carlos Santander B." <carlos8294 msn.com> wrote in news:b5k815$11ia$2 digitaldaemon.com:
>> "Pavel Minayev" <evilone omen.ru> wrote in message news:a5jben$16o8$1 digitaldaemon.com...
>>> "Walter" <walter digitalmars.com> wrote in message news:a56tc4$1534$2 digitaldaemon.com...
>>>> Having lower bounds specifiable will work with D (and even with C), but in my decades (!) of programming I've never found a use for it. I came to C from Basic, FORTRAN, and Pascal. I had some initial trouble getting used to 0 based rather than 1 based, but never looked back. 0 based looked more 'right' to me.
>>> Well, there are actually cases where you'd prefer some base other than 0. As you've seen, many people here consider 1 to be more suitable, and I understand them... also there are some other cases, for example, suppose you have an array of year income for 1990-2000, in Pascal you'd probably declare it as "array[1990 .. 2000] of integer", and then index it like income[1995], letting the compiler do his job and insert all the necessary decrements; the result is clean code, easy to read and maintain. In C, you have to do it all yourself, and probably define some const base = 1990, and clutter all your code with things like income[1995 - base].
> Having arrays with base 1 alongside zero-based indexing is likely to become a maintainer's nightmare: whenever you see an array, you must look at its declaration (or wait for a tooltip from your IDE) and check whether it is zero-based or not. Array operations become much more bug prone. Once your brain has adjusted to zero-based indices, you can easily write bug-free code for arrays or verify that given code is bug-free. But switching between different bases is very difficult.
> Zero-based indices are favoured over 1-based indices for implementation reasons. E.g. with a byte you can address 256 array elements instead of only 255.
> In D, there are many ways to express concepts, e.g. functions, classes, D-structs, templates. I believe that non-zero based arrays are not really required to express concepts in a way that is suitable for the problems programmers have to solve.
> Farmer.
Mar 27 2003
"Sean L. Palmer" wrote:
> I disagree. Nonzero-based arrays buy you convenience. Now you don't have to remember to subtract the base, it'll do it automatically.
Yes, but it will cost any user of an array a little bit of performance for the potential base correction. If D is seeking to be a successor to C and C++ in system and game programming, one has to be very careful with this.
--
Helmut Leitner leitner hls.via.at
Graz, Austria www.hls-software.com
Mar 27 2003
As in C++, you don't pay for what you don't use. It doesn't necessarily cost any performance anyway.
If your array is from 1..10, and starts at address 0x80000000, and each entry is 4 bytes long, the compiler just takes your index and does this to compute the address:
    (index*4)+0x7ffffffc
If it were zero based, it'd do this:
    (index*4)+0x80000000
There are situations where it would cost you performance, but not many.
Sean
"Helmut Leitner" <helmut.leitner chello.at> wrote in message news:3E82FBF5.75FDADF4 chello.at...
> "Sean L. Palmer" wrote:
>> I disagree. Nonzero-based arrays buy you convenience. Now you don't have to remember to subtract the base, it'll do it automatically.
> Yes, but it will cost any user of an array a little bit of performance for the potential base correction. If D is seeking to be a successor to C and C++ in system and game programming, one has to be very careful with this.
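The address arithmetic in Sean's 1..10 example can be checked with plain integer math. The sketch below is only an illustration in C; the function names are invented here, and addresses are modelled as 32-bit integers rather than real pointers.

```c
#include <stdint.h>

/* Address of element `index` of an array whose first valid index is
   `lo`, computed the obvious way: subtract the base at every access. */
uint32_t element_address(uint32_t start, uint32_t elem_size,
                         uint32_t lo, uint32_t index)
{
    return start + (index - lo) * elem_size;
}

/* The base folded into the constant term. When start, elem_size and lo
   are all known at compile time this value is itself a constant. */
uint32_t virtual_origin(uint32_t start, uint32_t elem_size, uint32_t lo)
{
    return start - lo * elem_size;
}

/* Access through the folded origin: one multiply and one add,
   exactly as in the zero-based case. */
uint32_t element_address_folded(uint32_t origin, uint32_t elem_size,
                                uint32_t index)
{
    return origin + index * elem_size;
}
```

For start 0x80000000, 4-byte elements and a lower bound of 1, virtual_origin gives 0x7ffffffc, which is the constant in the post. The folding only comes for free when those three values are constants; if the start address is known only at run time, the origin has to be computed or stored somewhere.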
Mar 27 2003
"Sean L. Palmer" wrote:
> "Helmut Leitner" <helmut.leitner chello.at> wrote in message news:3E82FBF5.75FDADF4 chello.at...
>> "Sean L. Palmer" wrote:
>>> I disagree. Nonzero-based arrays buy you convenience. Now you don't have to remember to subtract the base, it'll do it automatically.
>> Yes, but it will cost any user of an array a little bit of performance for the potential base correction. If D is seeking to be a successor to C and C++ in system and game programming, one has to be very careful with this.
> As in C++, you don't pay for what you don't use. It doesn't necessarily cost any performance anyway. If your array is from 1..10, and starts at address 0x80000000, and each entry is 4 bytes long, the compiler just takes your index and does this to compute the address:
>     (index*4)+0x7ffffffc
> If it were zero based, it'd do this:
>     (index*4)+0x80000000
> There are situations where it would cost you performance, but not many.
And when this address is not available at compile time (e.g. an element of a dynamically allocated part of an object) or is passed through function call interfaces - how will you do it then without loss of performance? I think that's impossible.
--
Helmut Leitner leitner hls.via.at
Graz, Austria www.hls-software.com
Mar 27 2003
"Helmut Leitner" <leitner hls.via.at> wrote in message news:3E833158.D568C839 hls.via.at...
> "Sean L. Palmer" wrote:
>> "Helmut Leitner" <helmut.leitner chello.at> wrote in message news:3E82FBF5.75FDADF4 chello.at...
>>> "Sean L. Palmer" wrote:
>>>> I disagree. Nonzero-based arrays buy you convenience. Now you don't have to remember to subtract the base, it'll do it automatically.
>>> Yes, but it will cost any user of an array a little bit of performance for the potential base correction. If D is seeking to be a successor to C and C++ in system and game programming, one has to be very careful with this.
>> As in C++, you don't pay for what you don't use. It doesn't necessarily cost any performance anyway. If your array is from 1..10, and starts at address 0x80000000, and each entry is 4 bytes long, the compiler just takes your index and does this to compute the address:
>>     (index*4)+0x7ffffffc
>> If it were zero based, it'd do this:
>>     (index*4)+0x80000000
>> There are situations where it would cost you performance, but not many.
> And when this address in not available at compile time (e. g. an element of an dynamically allocate part of an object) or passed through function call interfaces - how will you do it then without loss of performance? I think that's impossible.
in C++ any array can be rebased thus:
    // note: a pointer before the start of an array is formally
    // undefined behaviour in ISO C/C++, though common compilers accept it
    template <typename T>
    inline T *rebase(T *ar, int base)
    {
        return ar - base;  // same as &ar[-base]
    }
or a C example:
    int *myarray = malloc(sizeof(int) * 80);
    ....
    int *onebase = myarray - 1;  // or &myarray[-1]
    // onebase[1] is now myarray[0] :)
in a func call:
    int func(int *ar)
    {
        ar -= 1;  // ar now 1 based.
        .....
    }
Mar 27 2003
Helmut Leitner says...
> And when this address is not available at compile time ... - how will you do it then without loss of performance? I think that's impossible.
That tradeoff the programmer should be allowed to make.
I suspect it's wrong anyway. The base value needs computation only once - dynamically or otherwise - and thereafter may be stored. Each (random) array access involves at minimum one addition to find the desired element's memory address. Using a different base adds not a whit of extra work.
We had a similar debate about negative indices. Walter was against them for performance reasons that are easily addressed if not completely fictitious. Sean's motto 'pay for what you use' is apropos.
Farmer says...
> In D, there are many ways to express concepts, e.g. functions, classes, D-structs, templates. I believe that non-zero based arrays are not really required to express concepts in a way that is suitable for the problems programmers have to solve.
Functional languages owe much of their fabulous productivity to array (list) handling capabilities. An array can hold virtually anything, not just numbers. It can have more than one dimension to associate objects on different axes en masse. C++ folks unfamiliar with such paradigms know little of what they're missing, so I understand these counteroffers, but there is no substitute. The ability to pick apart, rearrange, index, map across, thread, and otherwise sling arrays around - and morph them into new forms - is a truly expressive and compact way to write tons of code with performance results comparable to C and even better, depending on your C programmer and his available time for optimizing nested inner loops and chasing down off-by-one errors. Mark
Mar 27 2003
Mark Evans <Mark_member pathlink.com> wrote in news:b60d0o$2sf7$1 digitaldaemon.com:
> Farmer says...
>> In D, there are many ways to express concepts, e.g. functions, classes, D-structs, templates. I believe that non-zero based arrays are not really required to express concepts in a way that is suitable for the problems programmers have to solve.
> Functional languages owe much of their fabulous productivity to array (list) handling capabilities. An array can hold virtually anything, not just numbers. It can have more than one dimension to associate objects on different axes en masse. C++ folks unfamiliar with such paradigms know little of what they're missing, so I understand these counteroffers, but there is no substitute. The ability to pick apart, rearrange, index, map across, thread, and otherwise sling arrays around - and morph them into new forms - is a truly expressive and compact way to write tons of code with performance results comparable to C and even better, depending on your C programmer and his available time for optimizing nested inner loops and chasing down off-by-one errors.
> Mark
You are right. I don't know about functional programming, but I know that doing complex work with arrays is a pain in C++, C#, Java or D (not to mention C or Pascal).
D arrays really shine when used for system level programming tasks, like getting memory from the GC, copying a memory block or working with a rather fixed set of objects/values. Non-zero based arrays have few benefits for such tasks, but pose the risk of harder to maintain code: some/many people will use different array bases, in the same language, for the same project, for similar concepts.
I think that D arrays had better stay a low-level, implementation-determined feature. Adding more features to them would further increase the confusion about them. But (separate) features in the D language and/or Phobos that enable programmers to work with arrays (lists) in a productive, safe and reasonably fast manner could be a worthwhile addition to D.
Farmer.
Mar 30 2003
Helmut Leitner wrote:
> "Sean L. Palmer" wrote:
>> I disagree. Nonzero-based arrays buy you convenience. Now you don't have to remember to subtract the base, it'll do it automatically.
> Yes, but it will cost any user of an array a little bit of performance for the potential base correction. If D is seeking to be a successor to C and C++ in system and game programming, one has to be very careful with this.
No, it won't. In Pascal, functions disliked taking arrays of unknown size. You could only make them take an array[SomeConstant..] of SomeType, then you could process this array as SomeConstant-based. This function would only have a run-time specification of an array length, but not of a base, and thus wouldn't probably be any slower than a 0-based array. I'm not sure whether it was allowed to assume that arrays had the same base, or type checking had caught such misuse. -i.
Mar 28 2003
Ilya Minkov wrote:
> Helmut Leitner wrote:
>> "Sean L. Palmer" wrote:
>>> I disagree. Nonzero-based arrays buy you convenience. Now you don't have to remember to subtract the base, it'll do it automatically.
>> Yes, but it will cost any user of an array a little bit of performance for the potential base correction. If D is seeking to be a successor to C and C++ in system and game programming, one has to be very careful with this.
> No, it won't. In Pascal, functions disliked taking arrays of unknown size. You could only make them take an array[SomeConstant..] of SomeType, then you could process this array as SomeConstant-based. This function would only have a run-time specification of an array length, but not of a base, and thus wouldn't probably be any slower than a 0-based array. I'm not sure whether it was allowed to assume that arrays had the same base, or type checking had caught such misuse.
You are right about Pascal. It was a major PITA that it was impossible even to write a generic function to e.g. calculate the average of an array of arbitrary size, because the array size was part of the parameter type and therefore part of the function definition.
But I don't think that you can compare this. If you want a base != 0, then you have to pay for it. Maybe not much, but you have to pay.
If you don't offset the array pointer you will have to add the offset when accessing the array elements.
If you offset the array pointer you need to reset it before you free its dynamic memory. So you have to store either the offset or the pointer to the allocated memory.
If you want to implement range checking, you will have to either use the offset, or store separate hi and lo bounds.
Anyway things become more complicated and it won't make a single C, C++ or Java programmer feel better about D.
--
Helmut Leitner leitner hls.via.at
Graz, Austria www.hls-software.com
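The bookkeeping listed above can be made concrete with a small C sketch. It takes the "do not offset the pointer" option: the record keeps the pointer that was allocated plus separate lo and hi bounds, and the offset is applied on each access. The type and function names (based_array and friends) are hypothetical, chosen only for this illustration.

```c
#include <stdlib.h>

/* Hypothetical record for an int array indexed lo..hi inclusive. */
struct based_array {
    int *data;  /* pointer returned by calloc; this is what gets freed */
    long lo;    /* lowest valid index */
    long hi;    /* highest valid index */
};

/* Allocate zero-filled storage for lo..hi. Returns 0 on success, -1 on failure. */
int based_array_init(struct based_array *a, long lo, long hi)
{
    if (hi < lo)
        return -1;
    a->data = calloc((size_t)(hi - lo) + 1, sizeof *a->data);
    if (a->data == NULL)
        return -1;
    a->lo = lo;
    a->hi = hi;
    return 0;
}

/* Range-checked access: NULL when index is outside lo..hi.
   The subtraction of lo is the per-access cost being debated. */
int *based_array_at(struct based_array *a, long index)
{
    if (index < a->lo || index > a->hi)
        return NULL;
    return &a->data[index - a->lo];
}

/* Release the storage; no pointer needs to be "reset" first because
   the original allocation pointer was never modified. */
void based_array_free(struct based_array *a)
{
    free(a->data);
    a->data = NULL;
}
```

Initialised with bounds 1990 and 2000, based_array_at(&income, 1995) refers to the sixth stored element, and 1989 or 2001 yield NULL. The record is one field larger than a pointer-plus-length pair, which is the extra storage the post refers to.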
Mar 28 2003
Helmut Leitner wrote:
> Ilya Minkov wrote:
>> No, it won't. In Pascal, functions disliked taking arrays of unknown size. You could only make them take an array[SomeConstant..] of SomeType, then you could process this array as SomeConstant-based. This function would only have a run-time specification of an array length, but not of a base, and thus wouldn't probably be any slower than a 0-based array. I'm not sure whether it was allowed to assume that arrays had the same base, or type checking had caught such misuse.
> You are right about Pascal. It was a major PITA that it was impossible even to write a generic function to e.g. calculate the average of an array of arbitrary size because the array size was part of the parameter type and therefore part of the function definition.
It was possible. Only the low bound was fixed, and the length was passed implicitly to the function.
> But I don't think that you can compare this. If you want a base != 0, then you have to pay for it. Maybe not much, but you have to pay.
Let the base be in the calling code, and let the function accept a dynamic array "as if" it was placed at a certain offset. Consider: this whole offset thingy is good for program readability. And while length usually depends on a run-time condition, the base usually depends upon readability considerations for algorithms. For example, in a string you usually have to iterate from 0 to length-1. That's what all the C guys do; they even have a "for" which allows them to perfectly hide this fact, or the opposite, so that a bug is not too easy to see. But isn't it neater to reference the string as 1-based?
> If you don't offset the array pointer you will have to add the offset when accessing the array elements.
Pointer offset can only be done as a short-term optimisation. Actually, I don't even think it's required, since almost every memory access gets additive constants in algorithms.
Another example where it is useful is as syntactic sugar for accessing arrays of constant size, like in Sean's example.
You can also make the function programmer handle it: in the example above, he specifies the lower bound, but does not specify the upper bound. It is his responsibility to retrieve the upper bound and take it into account. The same can be done with a lower bound. The bad thing is that it would probably mean expanding the dynamic array specification, or would require extended function annotation by the compiler.
However, the problem may be of a more general nature, and might be better solved in a more generic way, as was pointed out by Mark. I'm not sure I exactly understand what specific features/solution he means though.
-i.
Mar 28 2003
The array record can include both a real base pointer and a pseudo-base pointer (incorporating the offset). The pseudo-base pointer is computed only once, whether dynamically or statically. Array access cost is identical with either pointer.
Beyond that, the array record could include an 'end' pointer making negative indexing equally simple. It would be adjusted every time the length property is adjusted.
Mark
Helmut Leitner says...
> If you want a base != 0, then you have to pay for it. Maybe not much, but you have to pay.
> If you don't offset the array pointer you will have to add the offset when accessing the array elements.
> If you offset the array pointer you need to reset it before you free its dynamic memory. So you have to store either the offset or the pointer to the allocated memory.
> If you want to implement range checking, you will have to either use the offset, or store separate hi and lo bounds.
> Anyway things become more complicated and it won't make a single C, C++ or Java programmer feel better about D.
> --
> Helmut Leitner leitner hls.via.at
> Graz, Austria www.hls-software.com
Mar 28 2003
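The record Mark describes (real base pointer, pseudo-base pointer, end pointer) can be modelled roughly as follows. This is only a sketch in Python: "memory" is a flat list, pointers are plain indices into it, and the names `ArrayRecord`, `set_length`, `get` and `get_from_end` are made up for illustration, not taken from any D implementation.

```python
class ArrayRecord:
    """Toy model: 'memory' is a flat list and pointers are plain indices into it."""

    def __init__(self, memory, real_base, length, lower_bound=0):
        self.memory = memory
        self.real_base = real_base      # where the storage really starts (needed to free it)
        self.lower_bound = lower_bound
        # Pseudo-base: computed once, so element access needs no per-access subtraction.
        self.pseudo_base = real_base - lower_bound
        self.set_length(length)

    def set_length(self, length):
        # The end pointer is re-derived whenever the length changes.
        self.length = length
        self.end = self.real_base + length  # one past the last element

    def get(self, index):
        # Range check against the declared bounds, then a single add.
        if not self.lower_bound <= index < self.lower_bound + self.length:
            raise IndexError(index)
        return self.memory[self.pseudo_base + index]

    def get_from_end(self, k):
        # k = 1 is the last element, k = 2 the one before it, and so on.
        if not 1 <= k <= self.length:
            raise IndexError(k)
        return self.memory[self.end - k]
```

Note that the sketch still keeps `real_base` around, which is Helmut's point: the extra field is the price of a nonzero base.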
Base index is nothing next to fundamental array manipulation/creation functions.
Experience stands behind that statement. Mathematica arrays are 1-based, C
arrays are 0-based, and I must often pass arrays between them. The 0 vs. 1
issue has never hurt me.
The big deal is that C/C++ offers no help in array manipulation. That is
primarily why I use Mathematica so regularly. Mathematica is a multiparadigm
language offering functional-style array manipulations. I would do these
manipulations in C++ if that were possible. The more of them in D, the better.
Just to give a flavor of what I mean - here is some Mathematica code pulled at
random. These lines showcase typical array manipulations and a bit of
functional style. They come from an autocorrelation spectral estimator that
performs as fast as the equivalent C, but says, in mere lines, what would be
pages of C.
'Table' is equivalent to Sean's 'range' (I think). 'Flatten' drops a
multi-dimensional array down to 1 dimension. This is the kind of stuff I wish
I could do in D. -Mark
X = InverseFourier[piece];
X2 = Map[(# Conjugate[#])&, X];
auto = Re[Chop[Fourier[X2]]] / Q;
s = Take[auto,M] Table[w[m],{m,1,M}];
s = Flatten[{s,Table[0,{N- (2 M - 1)}]}];
lastpart = Table[auto[[N-m+1 +1]] w[N-m+1],{m,N-M+1+1,N}];
s = Flatten[{s,lastpart}];
capS = Re[Chop[InverseFourier[s]]];
Sean L. Palmer says...
I disagree. Nonzero-based arrays buy you convenience. Now you don't have
to remember to subtract the base, it'll do it automatically.
Mar 27 2003
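For readers who do not know Mathematica, the three list operations used in Mark's snippet can be approximated in a few lines of Python. The helper names `table`, `take`, `flatten` and `windowed_and_padded` are hypothetical, and only the two lines that build `s` are mirrored here; the Fourier steps are left out.

```python
def table(f, lo, hi):
    """Rough analogue of Table[f[m], {m, lo, hi}]: both bounds inclusive."""
    return [f(m) for m in range(lo, hi + 1)]


def take(seq, n):
    """Rough analogue of Take[seq, n]: the first n elements."""
    return list(seq[:n])


def flatten(nested):
    """Rough analogue of Flatten: collapse nested lists to one dimension."""
    flat = []
    for item in nested:
        if isinstance(item, (list, tuple)):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat


def windowed_and_padded(auto, w, m_count, n_total):
    # s = Take[auto, M] Table[w[m], {m, 1, M}]   (elementwise product)
    s = [a * b for a, b in zip(take(auto, m_count), table(w, 1, m_count))]
    # s = Flatten[{s, Table[0, {N - (2 M - 1)}]}]  (append a run of zeros)
    return flatten([s, [0] * (n_total - (2 * m_count - 1))])
```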
On Wed, 26 Mar 2003 23:45:19 +0000 (UTC), Farmer <itsFarmer. freenet.de> wrote:
"Carlos Santander B." <carlos8294 msn.com> wrote in news:b5k815$11ia$2 digitaldaemon.com:
"Pavel Minayev" <evilone omen.ru> wrote in message news:a5jben$16o8$1 digitaldaemon.com...
| "Walter" <walter digitalmars.com> wrote in message news:a56tc4$1534$2 digitaldaemon.com...
| > Having lower bounds specifiable will work with D (and even with C), but in my decades (!) of programming I've never found a use for it. I came to C from Basic, FORTRAN, and Pascal. I had some initial trouble getting used to 0 based rather than 1 based, but never looked back. 0 based looked more 'right' to me.
|
| Well, there are actually cases where you'd prefer some base other than 0. As you've seen, many people here consider 1 to be more suitable, and I understand them... also there are some other cases, for example, suppose you have an array of year income for 1990-2000, in Pascal you'd probably declare it as "array[1990 .. 2000] of integer", and then index it like income[1995], letting the compiler do its job and insert all the necessary decrements; the result is clean code, easy to read and maintain. In C, you have to do it all yourself, and probably define some const base = 1990, and clutter all your code with things like income[1995 - base].
Having arrays with base 1 and zero-based index is likely to become a maintainer's nightmare: whenever you see an array, you must look at its declaration (or wait for a tooltip from your IDE) and check whether it is zero-based or not. Array operations become much more bug prone. Once your brain has adjusted to zero-based indices, you can easily write bug-free code for arrays or verify that a given piece of code is bug-free. But switching between different bases is very difficult.
Zero-based indices are favoured over 1-based indices for implementation reasons. E.g. with a byte you can address 256 array elements instead of only 255.
In D, there are many ways to express concepts, e.g. functions, classes, D structs, templates. I believe that non-zero based arrays are not really required to express concepts in a way that is suitable for the problems programmers have to solve.
Farmer.
I always get annoyed when people refer to '1-based' arrays. An array whose first element is arr[1] is a very special beast. Its elements are being labeled with their position in the array. Access is by ordinal rather than cardinal.
A quick test: Where is the character 's' in the word 'test'? If you answered "the character offset 2 from the start" then maybe '0-based' arrays make sense. If you said "the third character" you ought to consider having arrays accessed by ordinals.
Karl
It is highly misleading to think of '0-based' and '1-based' arrays. What they really are is arrays accessed by index, and arrays accessed by position. An array may have any arrangement of indices, but it always has a first position. I am against 0-based arrays. I am against 1-based arrays. I am for positional arrays. Last I looked, D had come up with an absolutely horrible approach to specifying slices, brought on, I'm sure, by this 'based' concept.
Mar 29 2003
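Karl's quick test is easy to state in code. A tiny Python sketch (the function names `offset_of` and `position_of` are invented here) shows that the two answers differ only by a fixed 1, which is why the argument is about what the number means rather than about what it costs.

```python
def offset_of(text, ch):
    """Distance of the first occurrence of ch from the start (an 'index')."""
    return text.index(ch)


def position_of(text, ch):
    """Ordinal position of the first occurrence of ch (first, second, third...)."""
    return text.index(ch) + 1
```

For the word 'test', `offset_of` gives 2 for 's' and `position_of` gives 3, matching the two answers in Karl's post.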
Karl Bochert complains correctly:
> If you answered "the character offset 2 from the start" then maybe '0-based' arrays make sense.
Karl - welcome to the universe of design flaws we call the C programming language. Confounding arrays with pointers with strings produces ... C. Pull one string, and the whole building collapses.
Disentangling these concepts invites more than one answer to your question. The Icon language defines string positions between characters, while general arrays are "1-based." Icon is exceptionally adept at string processing. From the Icon Handbook, "Icon's strings and tables make text processing much more convenient than in languages that only provide characters and character arrays....Unlike most languages where strings are implemented as arrays of characters, Icon provides strings as a primitive data type. They can be of any length. There are extensive facilities for searching and editing strings."
Unicode with its variable-byte-length encodings makes disentangling strings from arrays more urgent still. D should promote strings to primitive type status with dedicated constructs. Icon had it right a long time ago. D is struggling with Unicode because the confused C model is not amenable to Unicode. As a primitive type, Unicode string complexities would vanish under the hood. I'm not holding my breath, but if you ask me, that is how to do strings right. (Those in love with C strings could still declare arrays of char.)
http://www.toolsofcomputing.com/IconHandbook/
http://unicon.sourceforge.net/index.html
Mark
Mar 29 2003
Carlos Santander B. wrote:
"Pavel Minayev" <evilone omen.ru> wrote in message news:a5jben$16o8$1 digitaldaemon.com...
| "Walter" <walter digitalmars.com> wrote in message news:a56tc4$1534$2 digitaldaemon.com...
| > Having lower bounds specifiable will work with D (and even with C), but in my decades (!) of programming I've never found a use for it. I came to C from Basic, FORTRAN, and Pascal. I had some initial trouble getting used to 0 based rather than 1 based, but never looked back. 0 based looked more 'right' to me.
|
| Well, there are actually cases where you'd prefer some base other than 0. As you've seen, many people here consider 1 to be more suitable, and I understand them... also there are some other cases, for example, suppose you have an array of year income for 1990-2000, in Pascal you'd probably declare it as "array[1990 .. 2000] of integer", and then index it like income[1995], letting the compiler do its job and insert all the necessary decrements; the result is clean code, easy to read and maintain. In C, you have to do it all yourself, and probably define some const base = 1990, and clutter all your code with things like income[1995 - base].
|
| After all, it's as simple as subtracting the base from the index, a single SUB... ain't it worth the thing?
No one ever answered this one. It seems very clever to me.
Putting aside the issue of implementation details, nobody has produced any practical examples for it, and I've never seen any good use of it in Pascal, either; it has always relied on static assumptions about the environment the code is to be used in, as with Pavel's pseudo-example above.
Mar 28 2003
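Pavel's income example contrasts two styles: subtracting the base by hand at every use, and letting the array do it. A minimal Python sketch of both, assuming a made-up wrapper class `YearIndexed` and a constant `BASE_YEAR` standing in for his `const base = 1990`:

```python
BASE_YEAR = 1990  # plays the role of 'const base = 1990' in the C-style version


def make_income_table():
    """C style: a plain zero-based table covering 1990..2000."""
    return [0] * (2000 - BASE_YEAR + 1)


class YearIndexed:
    """Pascal style: the lower bound lives in the array, not in the calling code."""

    def __init__(self, first, last, fill=0):
        self.first = first
        self.last = last
        self._items = [fill] * (last - first + 1)

    def _slot(self, year):
        if not self.first <= year <= self.last:
            raise IndexError(year)
        return year - self.first  # the single subtraction the compiler would insert

    def __getitem__(self, year):
        return self._items[self._slot(year)]

    def __setitem__(self, year, value):
        self._items[self._slot(year)] = value
```

With the plain table the caller writes `table[1995 - BASE_YEAR]`; with the wrapper the caller writes `income[1995]`. The objection raised in the reply still applies: the bounds 1990 and 2000 are fixed assumptions either way.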
"Pavel Minayev" <evilone omen.ru> wrote in message news:a562qa$5lv$1 digitaldaemon.com..."Roberto Mariottini" <rmariottini lycosmail.com> wrote in message news:a55s13$1c8a$1 digitaldaemon.com...
> This is a far better idea. What I like in Pascal is the ability to use 0-based, 1-based or whatever else based arrays depending on your task and your personal taste. Those who care about speed (me) would probably use 0-based (and I believe it should be the default, to work the same way as in C/C++). Otherwise, you can specify it yourself:
>     int[5] foo;      // consists of foo[0] to foo[4]
>     int[1..5] bar;   // consists of bar[1] to bar[5]
Seconded!
--
Stijn
OddesE_XYZ hotmail.com
http://OddesE.cjb.net
__________________________________________
Remove _XYZ from my address when replying by mail
Feb 27 2002
"Pavel Minayev" <evilone omen.ru> wrote in message news:a562qa$5lv$1 digitaldaemon.com..."Roberto Mariottini" <rmariottini lycosmail.com> wrote in message news:a55s13$1c8a$1 digitaldaemon.com...
> This is a far better idea. What I like in Pascal is the ability to use 0-based, 1-based or whatever else based arrays depending on your task and your personal taste. Those who care about speed (me) would probably use 0-based (and I believe it should be the default, to work the same way as in C/C++). Otherwise, you can specify it yourself:
>     int[5] foo;      // consists of foo[0] to foo[4]
>     int[1..5] bar;   // consists of bar[1] to bar[5]
And maybe make a property, array.StartIndex so you could always dynamically find out what the start index of the array is!
--
Stijn
OddesE_XYZ hotmail.com
http://OddesE.cjb.net
__________________________________________
Remove _XYZ from my address when replying by mail
Feb 27 2002
"OddesE" <OddesE_XYZ hotmail.com> wrote in message news:a5jajn$168e$1 digitaldaemon.com...
> And maybe make a property, array.StartIndex so you could always dynamically find out what the start index of the array is!
Then maybe:
    array.start     // first index
    array.end       // last index
    array.length    // length (end - start + 1)
And define this for all arrays, including 0-based.
Feb 27 2002
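The three proposed properties are tied together by length = end - start + 1, so only two of them need to be stored. A small Python sketch, using an invented `Bounds` type, shows the same relation holding for a 0-based and a 1-based array alike:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bounds:
    start: int    # first valid index
    length: int   # number of elements

    @property
    def end(self):
        # Last valid index; for an empty array this is start - 1.
        return self.start + self.length - 1

    def indices(self):
        """All valid indices, so a loop never has to hard-code the base."""
        return range(self.start, self.end + 1)
```

A loop written as `for i in b.indices()` then works unchanged whatever base the array was declared with.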
On a related note, Pascal had succ and pred for enums, but from what I remember didn't have first and last? All four would be quite handy to have for enums... unfortunately if you can define your own values for an enumerant, succ and pred become (at runtime) a function containing a switch statement or table lookup.
Personally I think enums *should* be sequential, and a separate flags type could deal with bitflags. typedef'd ints can handle any other case.
Sean
"Pavel Minayev" <evilone omen.ru> wrote in message news:a5jb30$16ij$1 digitaldaemon.com...
> "OddesE" <OddesE_XYZ hotmail.com> wrote in message news:a5jajn$168e$1 digitaldaemon.com...
>> And maybe make a property, array.StartIndex so you could always dynamically find out what the start index of the array is!
> Then maybe:
>     array.start     // first index
>     array.end       // last index
>     array.length    // length (end - start + 1)
> And define this for all arrays, including 0-based.
Feb 28 2002
"Sean L. Palmer" <spalmer iname.com> wrote in message news:a5l3t1$1ua6$1 digitaldaemon.com...
> On a related note, Pascal had succ and pred for enums, but from what I remember didn't have first and last?
Yes, right. Succ and Pred, however, were defined for all ordinal types, not just enums.
> All four would be quite handy to have for enums... unfortunately if you can define your own values for an enumerant, succ and pred become (at runtime) a function containing a switch statement or table lookup.
There's the same problem in Pascal (it supports custom values for enum members since Delphi 6, AFAIK), and it's handled in a somewhat strange manner: Succ always means +1, and Pred is -1, regardless of what the declaration says. So:
    type TEnum = (foo = 1000, bar = 2000, baz = 3000);
The thing is, Pascal defines enum as "a subrange whose lowest and highest values correspond to the lowest and highest ordinalities of the constants in the declaration". So a variable of type TEnum can take any value in the range 1000 .. 3000, and thus Succ and Pred just increment/decrement by one...
> Personally I think enums *should* be sequential, and a separate flags type could deal with bitflags. typedef'd ints can handle any other case.
It is sometimes very convenient to define an enum with members equal to those of an API:
    enum Key
    {
        LButton = 1, RButton = 2, Cancel = 3, MButton = 4,
        Back = 8, BackSpace = Back, Tab = 9, Clear = 12,
        Return = 13, Enter = Return, Shift = 16, Control = 17,
        ...
    }
Now every Key, being cast to int, equals the appropriate VK_* constant - no need for switch() or the like.
Feb 28 2002
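The two possible meanings of Succ discussed here can be put side by side. The Python sketch below reuses the values from Pavel's TEnum; `succ_by_value` mimics the "+1 on the ordinal" behaviour he describes for Delphi, while `succ_by_declaration` and `pred_by_declaration` are the table-lookup variant Sean worries about. All function names are invented for illustration.

```python
from enum import Enum


class TEnum(Enum):
    FOO = 1000
    BAR = 2000
    BAZ = 3000


def succ_by_value(member):
    """Succ as plain +1 on the underlying ordinal; the result need not be a declared member."""
    return member.value + 1


def succ_by_declaration(member):
    """Succ as 'next declared member': needs a lookup over the declaration order."""
    members = list(type(member))
    position = members.index(member)
    if position + 1 >= len(members):
        raise ValueError("no successor for the last member")
    return members[position + 1]


def pred_by_declaration(member):
    """Pred as 'previous declared member'."""
    members = list(type(member))
    position = members.index(member)
    if position == 0:
        raise ValueError("no predecessor for the first member")
    return members[position - 1]
```

With sequential values the two definitions coincide, which is essentially Sean's argument for restricting enums to sequential values.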
"Pavel Minayev" <evilone omen.ru> wrote in message news:a5labi$214t$1 digitaldaemon.com..."Sean L. Palmer" <spalmer iname.com> wrote in message news:a5l3t1$1ua6$1 digitaldaemon.com...On a related note, Pascal had succ and pred for enums, but from what I remember didn't have first and last?
> Yes, right. Succ and Pred, however, were defined for all ordinal types, not just enums.
Right. Used it in place of ++ and -- a lot.
>> All four would be quite handy to have for enums... unfortunately if you can define your own values for an enumerant, succ and pred become (at runtime) a function containing a switch statement or table lookup.
> There's the same problem in Pascal (it supports custom values for enum members since Delphi 6, AFAIK), and it's handled in a somewhat strange manner: Succ always means +1, and Pred is -1, regardless of what the declaration says. So:
>     type TEnum = (foo = 1000, bar = 2000, baz = 3000);
> The thing is, Pascal defines enum as "a subrange whose lowest and highest values correspond to the lowest and highest ordinalities of the constants in the declaration". So a variable of type TEnum can take any value in the range 1000 .. 3000, and thus Succ and Pred just increment/decrement by one...
That's why I think enums should be limited to sequential values.
>> Personally I think enums *should* be sequential, and a separate flags type could deal with bitflags. typedef'd ints can handle any other case.
> It is sometimes very convenient to define an enum with members equal to those of an API:
>     enum Key
>     {
>         LButton = 1, RButton = 2, Cancel = 3, MButton = 4,
>         Back = 8, BackSpace = Back, Tab = 9, Clear = 12,
>         Return = 13, Enter = Return, Shift = 16, Control = 17,
>         ...
>     }
> Now every Key, being cast to int, equals the appropriate VK_* constant - no need for switch() or the like.
So what's so inconvenient about this?
    typedef int VKCode;
    // in D I believe this makes a distinct type which behaves identically to int,
    // except that type conversion can't be implicitly done from int to VKCode
    // (though I think the opposite still happens implicitly)
    static const VKCode
        LButton = 1, RButton = 2, Cancel = 3, MButton = 4,
        Back = 8, BackSpace = Back, Tab = 9, Clear = 12,
        Return = 13, Enter = Return, Shift = 16, Control = 17;
    // this is assuming that implicit conversion from int to VKCode can still be done in the initializer.
    // if you think about it, that's really an explicit conversion anyway, don't you think?
That's how I'd like it to be handled, anyway.
Sean
Mar 01 2002
"Sean L. Palmer" <spalmer iname.com> wrote in message news:a5nm1p$r0$1 digitaldaemon.com...
> So what's so inconvenient about this?
>     typedef int VKCode;
>     // in D I believe this makes a distinct type which behaves identically to int,
>     // except that type conversion can't be implicitly done from int to VKCode
>     // (though I think the opposite still happens implicitly)
>     static const VKCode
>         LButton = 1, RButton = 2, Cancel = 3, MButton = 4,
>         Back = 8, BackSpace = Back, Tab = 9, Clear = 12,
>         Return = 13, Enter = Return, Shift = 16, Control = 17;
>     // this is assuming that implicit conversion from int to VKCode can still be done in the initializer.
>     // if you think about it, that's really an explicit conversion anyway, don't you think?
The difference is that enum defines its own namespace. So it'd be Key.Enter, Key.Tab, Key.A etc... I don't see any other way to do it apart from declaring a separate class specially for that - probably not the best idea...
Mar 01 2002
"Pavel Minayev" <evilone omen.ru> wrote in message news:a5nugt$4qh$1 digitaldaemon.com...
> "Sean L. Palmer" <spalmer iname.com> wrote in message news:a5nm1p$r0$1 digitaldaemon.com...
>> So what's so inconvenient about this?
>>     typedef int VKCode;
>>     // in D I believe this makes a distinct type which behaves identically to int,
>>     // except that type conversion can't be implicitly done from int to VKCode
>>     // (though I think the opposite still happens implicitly)
>>     static const VKCode
>>         LButton = 1, RButton = 2, Cancel = 3, MButton = 4,
>>         Back = 8, BackSpace = Back, Tab = 9, Clear = 12,
>>         Return = 13, Enter = Return, Shift = 16, Control = 17;
>>     // this is assuming that implicit conversion from int to VKCode can still be done in the initializer.
>>     // if you think about it, that's really an explicit conversion anyway, don't you think?
> The difference is that enum defines its own namespace. So it'd be Key.Enter, Key.Tab, Key.A etc... I don't see any other way to do it apart from declaring a separate class specially for that - probably not the best idea...
How about placing them into their own module named Key.d? Not the most beautiful solution, I agree, though...
--
Stijn
OddesE_XYZ hotmail.com
http://OddesE.cjb.net
__________________________________________
Remove _XYZ from my address when replying by mail
Mar 01 2002
"OddesE" <OddesE_XYZ hotmail.com> wrote in message news:a5oasr$hqb$1 digitaldaemon.com...
> How about placing them into their own module named Key.d? Not the most beautiful solution, I agree, though...
Two problems here. First is that there might be another module with such a (frequently used) name. Second is that if another module contains a function Enter(), you'll have to resolve scope each time you use it. With enum, you'd always use Key.Enter for the key, and Enter() to call the function.
Mar 01 2002
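Pavel's two points (enum members keep the API's numeric values, and the enum name acts as a namespace) can be illustrated outside D as well. The Python sketch below uses `IntEnum` as a stand-in for a D enum; the member values are the ones listed in his Key example, and the free function `Enter` is a made-up name that exists only to show the absence of a clash.

```python
from enum import IntEnum


class Key(IntEnum):
    LButton = 1
    RButton = 2
    Cancel = 3
    MButton = 4
    Back = 8
    BackSpace = 8   # alias of Back
    Tab = 9
    Clear = 12
    Return = 13
    Enter = 13      # alias of Return
    Shift = 16
    Control = 17


def Enter():
    """A free function with the same name as an enum member; no clash with Key.Enter."""
    return "Enter() called"
```

Because the members are reached as `Key.Enter`, the function `Enter()` and the key code never need scope resolution against each other, and `int(Key.Enter)` is still the raw value 13.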
"Pavel Minayev" <evilone omen.ru> wrote in message news:a5omng$lmv$1 digitaldaemon.com... <SNIP>Two problems here. First is that there might be another module with such (frequently used) name. Second is that if another module contains a function Enter(), you'll have to resolve scope each time you use it. With enum, you'd always use Key.Enter for key, and Enter() to call function.
Yes, you are right, hadn't thought of that.
--
Stijn
OddesE_XYZ hotmail.com
http://OddesE.cjb.net
__________________________________________
Remove _XYZ from my address when replying by mail
Mar 02 2002
Mark Evans <Mark_member pathlink.com> writes:
> Unicode with its variable-byte-length encodings makes disentangling strings from arrays more urgent still. D should promote strings to primitive type status with dedicated constructs. Icon had it right a long time ago. D is struggling with Unicode because the confused C model is not amenable to Unicode. As a primitive type, Unicode string complexities would vanish under the hood. I'm not holding my breath, but if you ask me, that is how to do strings right. (Those in love with C strings could still declare arrays of char.)
One thing D needs for working Unicode strings is a decent foreach (or iterator) construct, which I suppose is "coming soon". Doing C-like
    for (int i = 0; i < string.length; ++i)
        process(string[i]);
is out of the question if the internal representation is, for example, UTF-8.
Implementing String as a class would not be totally impossible either, at least if the assignment operator could be overloaded. Of course, the best would probably be to have a string concept built into the language. As of now, the array type seems to have gathered a lot of the functionality that would in normal circumstances be part of the string class (how often do you concatenate other arrays than strings, for example?)
> http://www.toolsofcomputing.com/IconHandbook/
> http://unicon.sourceforge.net/index.html
While Icon is said to be adept at string processing, it's unfortunate that it doesn't support Unicode either:
---
B3. Is there a Unicode version of Icon?
No. Icon is defined in terms of 8-bit characters, and changing this presents several design challenges that would likely break existing programs.
---
-Antti
Mar 29 2003
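Antti's objection to the index loop comes down to this: in UTF-8 the number of storage units and the number of characters differ, so stepping by one unit can land in the middle of a character. A short Python sketch of an iterator that steps by whole code points over raw UTF-8 bytes (the name `iter_code_points` is hypothetical, and error handling is kept minimal):

```python
def iter_code_points(data):
    """Yield one decoded character at a time from UTF-8 encoded bytes."""
    i = 0
    while i < len(data):
        lead = data[i]
        # The lead byte says how many bytes the character occupies.
        if lead < 0x80:
            size = 1
        elif lead >> 5 == 0b110:
            size = 2
        elif lead >> 4 == 0b1110:
            size = 3
        elif lead >> 3 == 0b11110:
            size = 4
        else:
            raise ValueError("unexpected continuation or invalid lead byte")
        yield data[i:i + size].decode("utf-8")
        i += size
```

For the five-character word "na\u00efve" the encoded form has six bytes, so an index loop over the bytes would call `process` six times, twice on fragments of the same character.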
> While Icon is said to be adept at string processing, it's unfortunate that it doesn't support Unicode either:
> -Antti
Icon is recognized worldwide as the king of string processing languages. Its development ceased before Unicode came into favor. Incidentally, if you know of any language that supports native Unicode strings I am all ears.
One of the Unicon testimonials has it right: "Other languages have minimal data structures. Most of our programming is in C. Quite often, I need a list of objects. In C, it is (as you know) a royal pain to declare a structure with a pointer to itself, and malloc them, free them, and walk the chain. Why can't a language just have a 'list' datatype, and be done with it? Why can't a language provide the constructs we all need, instead of providing nearly-assembly-language constructs and letting us develop the rest ourselves?"
Mark
Mar 29 2003
Depends what you mean by Unicode? Java and C# (and Verifiable Balderdash) all use Unicode UCS-16 as their native type. I'm not aware of any programming language - XML is not a programming language, all you soap suds! - that works with UTF-8 (or 7). Maybe that's what you meant?
"Mark Evans" <Mark_member pathlink.com> wrote in message news:b65see$qsr$1 digitaldaemon.com...
>> While Icon is said to be adept at string processing, it's unfortunate that it doesn't support Unicode either:
>> -Antti
> Icon is recognized worldwide as the king of string processing languages. Its development ceased before Unicode came into favor. Incidentally, if you know of any language that supports native Unicode strings I am all ears.
> One of the Unicon testimonials has it right: "Other languages have minimal data structures. Most of our programming is in C. Quite often, I need a list of objects. In C, it is (as you know) a royal pain to declare a structure with a pointer to itself, and malloc them, free them, and walk the chain. Why can't a language just have a 'list' datatype, and be done with it? Why can't a language provide the constructs we all need, instead of providing nearly-assembly-language constructs and letting us develop the rest ourselves?"
> Mark
Mar 29 2003
Matthew Wilson says...
> Java and C# ... all use Unicode ... as their native type.
And some languages have seen Unicode retrofits.
http://www.reportlab.com/i18n/python_unicode_tutorial.html
http://rf.net/~james/perli18n.html#Q4
To clarify the remark, I was considering languages that offer Unicode strings as primitives (not merely characters) and are fast string processors (in the C speed range). Maybe C# fits the bill. Python and Java are not 'fast' and Java's String is not a primitive anyway.
Other languages aside, the point is that D needs a Unicode string primitive.
Mark
Mar 29 2003
"Mark Evans" <Mark_member pathlink.com> wrote in message news:b660it$tn4$1 digitaldaemon.com...
> Other languages aside, the point is that D needs a Unicode string primitive.
It does already. In D, a char[] is really a utf-8 array.
Mar 31 2003
Walter wrote:
> "Mark Evans" <Mark_member pathlink.com> wrote in message news:b660it$tn4$1 digitaldaemon.com...
>> Other languages aside, the point is that D needs a Unicode string primitive.
> It does already. In D, a char[] is really a utf-8 array.
Er, no...
    void main ()
    {
        char[] foo;
        foo = "\uFF00";
    }
"cannot implicitly convert wchar[1] to char[]". Putting in an explicit cast results in a foo with length 1 and value 0.
Mar 31 2003
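The numbers in Burton's report are consistent with the cast simply dropping the high byte of the 16-bit unit, although that reading is an assumption and not something the thread confirms. A Python sketch of the arithmetic, with invented helper names, next to what a UTF-8 array would be expected to hold for the same character:

```python
def truncate_to_byte(code_unit):
    """Keep only the low 8 bits of a 16-bit code unit (assumed effect of the cast)."""
    return code_unit & 0xFF


def to_utf8(text):
    """The bytes a UTF-8 array would contain for the given text."""
    return text.encode("utf-8")
```

Here `truncate_to_byte(0xFF00)` is 0, matching the observed "length 1 and value 0", whereas `to_utf8("\uFF00")` is the three bytes EF BC 80.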
Ok, I'll fix it.
-Walter
"Burton Radons" <loth users.sourceforge.net> wrote in message news:b6a7vb$voq$1 digitaldaemon.com...
> Walter wrote:
>> "Mark Evans" <Mark_member pathlink.com> wrote in message news:b660it$tn4$1 digitaldaemon.com...
>>> Other languages aside, the point is that D needs a Unicode string primitive.
>> It does already. In D, a char[] is really a utf-8 array.
> Er, no...
>     void main ()
>     {
>         char[] foo;
>         foo = "\uFF00";
>     }
> "cannot implicitly convert wchar[1] to char[]". Putting in an explicit cast results in a foo with length 1 and value 0.
Mar 31 2003









"Pavel Minayev" <evilone omen.ru> 