digitalmars.D - Text in D article

reply Daniel Keep <daniel.keep.lists gmail.com> writes:


Here's a draft of an article which, hopefully, will explain some of the
details of how text in D works.  Any constructive criticism is welcomed,
along with edits or corrections.

Also, any suggestions on where to put this?  Ideally it could go on the
D website, but I think anywhere would be fine so long as we can point
people to it.

	-- Daniel

-- 
Unlike Knuth, I have neither proven or tried the above; it may not even
make sense.

v2sw5+8Yhw5ln4+5pr6OFPma8u6+7Lw4Tm6+7l6+7D
i28a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP  http://hackerkey.com/
Nov 18 2006
next sibling parent reply Anders F Björklund <afb algonet.se> writes:
Daniel Keep wrote:

 Also, any suggestions on where to put this?  Ideally it could go on the
 D website, but I think anywhere would be fine so long as we can point
 people to it.

If you change the license you can put it in the Wiki4D? Like http://www.prowiki.org/wiki4d/wiki.cgi?CharsAndStrs

--anders
Nov 18 2006
next sibling parent Alexander Panek <a.panek brainsware.org> writes:
Would perfectly fit into a wiki! Would be great to have such a text on 
wiki4d or dsource.org's tutorials.

Alex

Anders F Björklund wrote:
 Daniel Keep wrote:
 
 Also, any suggestions on where to put this?  Ideally it could go on the
 D website, but I think anywhere would be fine so long as we can point
 people to it.

If you change the license you can put it in the Wiki4D ? Like http://www.prowiki.org/wiki4d/wiki.cgi?CharsAndStrs --anders

Nov 18 2006
prev sibling parent Daniel Keep <daniel.keep.lists gmail.com> writes:
Anders F Björklund wrote:
 Daniel Keep wrote:
 
 Also, any suggestions on where to put this?  Ideally it could go on the
 D website, but I think anywhere would be fine so long as we can point
 people to it.

If you change the license you can put it in the Wiki4D ?

I'm happy to change the license so it can be used elsewhere... just trying to find a site that actually has the full FDL and isn't down :( I chose CC At-Sa since it should be pretty permissive; all you need to do is attribute the original author and make sure you don't change the license. I thought that's what the FDL did :P
 Like http://www.prowiki.org/wiki4d/wiki.cgi?CharsAndStrs

Some good info there; even a few things I didn't know! Might try to work some of it in.
 --anders

-- Daniel
Nov 18 2006
prev sibling next sibling parent reply Anders F Björklund <afb algonet.se> writes:
Daniel Keep wrote:

 Here's a draft of an article which, hopefully, will explain some of the
 details of how text in D works.  Any constructive criticism is welcomed,
 along with edits or corrections.

I would avoid the term "Unicode character" like the plague... If you must have something similar, then use "code point"? It's OK to have it in the casual text, like "ASCII character, BMP character, Unicode character", but better not in the lists.

The article also has an example of why

    printf("Hello, World!\n");

doesn't work. But it does work, since string *literals* are all NUL-terminated. However, when you then try to extend that to a string variable, and that variable contains a slice...

--anders
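A minimal sketch of the slice problem described above. It assumes a current D2 compiler and Phobos (`core.stdc.stdio`, `core.stdc.string`, `std.string.toStringz`), not the 2006 toolchain used in the thread; the variable names are only for illustration.

```d
import core.stdc.stdio : printf;
import core.stdc.string : strlen;
import std.string : toStringz;

void main()
{
    // A string literal is stored with a NUL byte after it, so handing
    // it directly to C's printf works.
    printf("Hello, World!\n");

    string text = "Hello, World!";
    string slice = text[0 .. 5];   // "Hello": same memory, shorter length

    // The slice knows its length is 5, but C only sees the pointer and
    // keeps reading until the NUL that follows the whole literal.
    assert(slice.length == 5);
    assert(strlen(slice.ptr) == 13);

    // toStringz yields a pointer to NUL-terminated data for the slice.
    assert(strlen(toStringz(slice)) == 5);
    printf("%s\n", toStringz(slice));
}
```

Here the slice happens to point into a literal, so C reads too much rather than crashing; a slice of heap data may have no NUL after it at all.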
Nov 18 2006
parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
Anders F Björklund wrote:
 Daniel Keep wrote:
 
 Here's a draft of an article which, hopefully, will explain some of the
 details of how text in D works.  Any constructive criticism is welcomed,
 along with edits or corrections.

I would avoid the term "Unicode character" like the plague... If you must have something similar, then use "code point" ? It's OK to have it in the casual text, like "ASCII character, BMP character, Unicode character" but better not in the lists.

Mmm. I was trying to use the correct terms where appropriate, I just didn't want it to descend into unintelligible gibberish. This is sort of aimed at the person who has no idea what a 'code point' or 'code unit' even is.
 It also has an example on why: printf("Hello, World!\n");
 doesn't work. But it does, since string *literals* are all
 NUL-terminated. However, when you then try to extend that
 to a string variable, and that variable contains a slice...
 
 --anders

Very true. I suppose I *should* say that literals are NUL-terminated, but I want to make it perfectly clear that relying on this is a bad idea; is it accepted practice to simply treat all strings as if they were possibly non-NUL-terminated?

-- Daniel
Nov 18 2006
next sibling parent "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"Daniel Keep" <daniel.keep.lists gmail.com> wrote in message 
news:ejn63u$1v79$1 digitaldaemon.com...

 Very true.  I suppose I *should* say that literals are NUL-terminated,
 but I want to make it perfectly clear that relying on this is a bad
 idea; is it accepted practice to simply treat all strings as if they
 were possibly non NUL-terminated?

Is null-termination of string literals even part of the D spec? Or is it entirely up to the implementation? If the latter, then I'd put something in there about it, saying that it can't even be relied on.
Nov 18 2006
prev sibling parent Anders F Björklund <afb algonet.se> writes:
Daniel Keep wrote:

 Very true.  I suppose I *should* say that literals are NUL-terminated,
 but I want to make it perfectly clear that relying on this is a bad
 idea; is it accepted practice to simply treat all strings as if they
 were possibly non NUL-terminated?

I'm not sure if the text primarily wants to discuss Unicode encodings, or if it wants to discuss strings and text in D in general, but...

The main problem with printf is that you see a line like printf("foo") and think that all strings are allowed. If neither would work, then it wouldn't be as tempting to try it.

But your conclusion/practice is OK: you shouldn't use printf with D strings without having a *good* reason (chances are that the C library will choke on the UTF-8 format anyway?). Even the good ole "%.*s" hack is not portable to all possible platforms (it depends on how parameters are passed; I think it breaks on Solaris...).

toStringz is the safest, even if you probably need to couple it with a call to an encoding conversion if the local platform isn't using UTF-8? But then you are on your own; the D library doesn't do such conversions.

Even simple D programs such as:

    import std.stdio;

    void main(char[][] args)
    {
        foreach (char[] arg; args)
            writefln("%s", arg);
    }

will break down if you run them on a platform without UTF-8 support, since you will get illegal strings in "args" (exceptions on writefln).

As a workaround you can cast them over to ubyte[], translate to UTF-8 from the local encoding, and cast them back into (now legal) char[]... But I would hardly characterize that as language "support" for the legacy platforms; it's better to say D *requires* Unicode support?

You might also want to touch briefly on the topics of COW and mutability, and how you might get segfaults writing to string literals. Or not... :)

--anders
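A rough sketch of the "translate to UTF-8 from the local encoding" workaround. It assumes the local encoding is ISO-8859-1 (Latin-1), where each byte value equals its Unicode code point, and a current D2 Phobos (`std.utf.encode`, `std.utf.validate`). The helper name `latin1ToUtf8` is hypothetical, and other local encodings would need a real conversion table.

```d
import std.utf : encode, validate;

/// Hypothetical helper: treats every input byte as one Latin-1 character.
string latin1ToUtf8(const(ubyte)[] raw)
{
    char[] result;
    foreach (b; raw)
    {
        char[4] buf;
        // In Latin-1 the byte value is the code point, so encode it directly.
        size_t n = encode(buf, cast(dchar) b);
        result ~= buf[0 .. n];
    }
    return result.idup;
}

void main()
{
    // "Björk" as Latin-1 bytes; a lone 0xF6 is not valid UTF-8.
    ubyte[] raw = [0x42, 0x6A, 0xF6, 0x72, 0x6B];

    string fixed = latin1ToUtf8(raw);
    validate(fixed);                 // throws UTFException if malformed
    assert(fixed == "Bj\u00F6rk");
    assert(fixed.length == 6);       // 'ö' now occupies two code units
}
```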
Nov 18 2006
prev sibling next sibling parent reply Tydr Schnubbis <fake address.dude> writes:
Daniel Keep wrote:
 Here's a draft of an article which, hopefully, will explain some of the
 details of how text in D works.  Any constructive criticism is welcomed,
 along with edits or corrections.

Any chance of an .rtf, .doc, or even .txt? :)

Nov 18 2006
parent reply Alexander Panek <a.panek brainsware.org> writes:
PDF would be great, too.

Tydr Schnubbis wrote:
 Daniel Keep wrote:
 Here's a draft of an article which, hopefully, will explain some of the
 details of how text in D works.  Any constructive criticism is welcomed,
 along with edits or corrections.


Nov 18 2006
parent reply Max Samuha <maxter i.com.ua> writes:
On Sat, 18 Nov 2006 15:59:33 +0100, Alexander Panek
<a.panek brainsware.org> wrote:

PDF would be great, too.

Tydr Schnubbis wrote:
 Daniel Keep wrote:
 Here's a draft of an article which, hopefully, will explain some of the
 details of how text in D works.  Any constructive criticism is welcomed,
 along with edits or corrections.



For those who are still on Windows :), there is a free and compact doc viewer that supports the OpenOffice format: http://www.officeviewers.com/
Nov 18 2006
next sibling parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
Max Samuha wrote:
 On Sat, 18 Nov 2006 15:59:33 +0100, Alexander Panek
 <a.panek brainsware.org> wrote:
 
 PDF would be great, too.

 Tydr Schnubbis wrote:
 Daniel Keep wrote:
 Here's a draft of an article which, hopefully, will explain some of the
 details of how text in D works.  Any constructive criticism is welcomed,
 along with edits or corrections.



 For those who are still on Windows :), there is a free and compact doc viewer that supports the OpenOffice format: http://www.officeviewers.com/

Hey, *I'm* still on Windows :P

-- Daniel
Nov 18 2006
next sibling parent reply Max Samuha <maxter i.com.ua> writes:
On Sun, 19 Nov 2006 02:43:10 +1100, Daniel Keep
<daniel.keep.lists gmail.com> wrote:

Max Samuha wrote:
 On Sat, 18 Nov 2006 15:59:33 +0100, Alexander Panek
 <a.panek brainsware.org> wrote:
 
 PDF would be great, too.

 Tydr Schnubbis wrote:
 Daniel Keep wrote:
 Here's a draft of an article which, hopefully, will explain some of the
 details of how text in D works.  Any constructive criticism is welcomed,
 along with edits or corrections.



 For those who are still on Windows :), there is a free and compact doc viewer that supports the OpenOffice format: http://www.officeviewers.com/

Hey, *I'm* still on Windows :P -- Daniel

Daniel, I didn't intend to offend you, really. Sorry if I did.

The article is great and useful. I would add a note for those coming from C# (and Java?) that D strings are mutable and doing the following is a bad idea:

    class BlackBox
    {
        private char[] _text;

        this()
        {
            _text = "object state";
        }

        char[] text()
        {
            // should be 'return _text.dup' if you don't want the user
            // of the object to change the internal _text
            return _text;
        }
    }

Or something like that.
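A small sketch of the difference `.dup` makes, assuming a current D2 compiler (where a literal must itself be duplicated before it can be stored in a mutable `char[]`). The method names `textShared` and `textCopy` are invented here for illustration.

```d
class BlackBox
{
    private char[] _text;

    // D2 literals are immutable, so take a mutable copy up front.
    this() { _text = "object state".dup; }

    // Returns the internal array itself: callers can modify it.
    char[] textShared() { return _text; }

    // Returns a copy: the internal state stays protected.
    char[] textCopy() { return _text.dup; }
}

void main()
{
    auto box = new BlackBox;

    char[] copy = box.textCopy();
    copy[0] = 'X';                                  // only the copy changes
    assert(box.textShared() == "object state");

    char[] view = box.textShared();
    view[0] = 'X';                                  // changes the object's state
    assert(box.textShared() == "Xbject state");
}
```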
Nov 18 2006
parent Daniel Keep <daniel.keep.lists gmail.com> writes:
Max Samuha wrote:
 On Sun, 19 Nov 2006 02:43:10 +1100, Daniel Keep
 <daniel.keep.lists gmail.com> wrote:
 
 Max Samuha wrote:
 On Sat, 18 Nov 2006 15:59:33 +0100, Alexander Panek
 <a.panek brainsware.org> wrote:

 PDF would be great, too.

 Tydr Schnubbis wrote:
 Daniel Keep wrote:
 Here's a draft of an article which, hopefully, will explain some of the
 details of how text in D works.  Any constructive criticism is welcomed,
 along with edits or corrections.



 For those who are still on Windows :), there is a free and compact doc viewer that supports the OpenOffice format: http://www.officeviewers.com/

-- Daniel

Daniel, I didn't intend to offend you, really. Sorry, if I did.

None taken at all. Hence the ":P" -- OpenOffice.org *does* work on Windows quite nicely :)
 The article is great and useful. I would add a note for those coming
 from C# (and Java?) that D strings are mutable and doing the following
 is a bad idea:
 
 class BlackBox
 {
 	private char[] _text;
 	
 	this()
 	{
 		_text = "object state";		
 	}
 
 	char[] text()
 	{
 		// should be 'return _text.dup' if you don't want the user
 		// of the object to change the internal _text
 		return _text;
 	}	
 }
 
 Or something like that.  

Perhaps. This was basically written to be a quick look at all the things people expect to work, but don't. To be honest, I've never had this problem, since strings are arrays and arrays are passed by reference and thus can be mutated. But then, maybe not everyone catches that first time :P

I'll definitely give it some thought.

-- Daniel
Nov 18 2006
prev sibling parent reply Chris Nicholson-Sauls <ibisbasenji gmail.com> writes:
Daniel Keep wrote:
 
 Max Samuha wrote:
 
On Sat, 18 Nov 2006 15:59:33 +0100, Alexander Panek
<a.panek brainsware.org> wrote:


PDF would be great, too.

Tydr Schnubbis wrote:

Daniel Keep wrote:

Here's a draft of an article which, hopefully, will explain some of the
details of how text in D works.  Any constructive criticism is welcomed,
along with edits or corrections.

Any chance of an .rtf, .doc, or even .txt? :)


For those who are still on Windows :), there is a free and compact doc viewer that supports the OpenOffice format: http://www.officeviewers.com/

Hey, *I'm* still on Windows :P -- Daniel

Same here -- for the most part. Luckily I'm an OOo fanboy. ;)

As for making the PDF, I have also noticed the bloat of OOo's PDF output, but you might try CutePDF and see if it gives you better results. (It's a virtual printer that outputs to a PDF, so it's usable with anything supporting printers.)

-- Chris Nicholson-Sauls
Nov 18 2006
parent Daniel Keep <daniel.keep.lists gmail.com> writes:
Chris Nicholson-Sauls wrote:
 Daniel Keep wrote:
 Max Samuha wrote:

 On Sat, 18 Nov 2006 15:59:33 +0100, Alexander Panek
 <a.panek brainsware.org> wrote:


 PDF would be great, too.

 Tydr Schnubbis wrote:

 Daniel Keep wrote:

 Here's a draft of an article which, hopefully, will explain some
 of the
 details of how text in D works.  Any constructive criticism is
 welcomed,
 along with edits or corrections.

Any chance of an .rtf, .doc, or even .txt? :)


For those who are still on Windows :), there is a free and compact doc viewer that supports the OpenOffice format: http://www.officeviewers.com/

Hey, *I'm* still on Windows :P -- Daniel

 Same here -- for the most part. Luckily I'm an OOo fanboy. ;)

 As for making the PDF, I have also noticed the bloat of OOo's PDF output, but you might try CutePDF and see if it gives you better results. (It's a virtual printer that outputs to a PDF, so it's usable with anything supporting printers.)

 -- Chris Nicholson-Sauls

I actually have... oh, what's it called? PDFCreator or somesuch. That doesn't usually do that much better than OOo. I actually had to zip the ODT and XHTML files since the newsgroup said they were too large together. I doubt I'd even be able to post the PDF at all :P

-- Daniel
Nov 18 2006
prev sibling parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
Max Samuha wrote:
 On Sat, 18 Nov 2006 15:59:33 +0100, Alexander Panek
 <a.panek brainsware.org> wrote:
 
 PDF would be great, too.

 Tydr Schnubbis wrote:
 Daniel Keep wrote:
 Here's a draft of an article which, hopefully, will explain some of the
 details of how text in D works.  Any constructive criticism is welcomed,
 along with edits or corrections.



 For those who are still on Windows :), there is a free and compact doc viewer that supports the OpenOffice format: http://www.officeviewers.com/

Thanks for the link, Max.

Daniel, I like it. Seems quite clear to me. One minor thing: in one section you recommend just using dchar[] everywhere as a solution for not slicing characters in the middle. But then in the next section you recommend using std.string as a comprehensive solution for manipulating strings. Unfortunately std.string really only deals with char[] strings.

So you might want to point out explicitly the dilemma that poses to the developer: if you go with dchar[] and have to do a lot of string munging, you're likely to find lots of toUTF8's and toUTF32's popping up in your code. If you go with char[], you've got to remember that mystring[1..$] may not mean what you think it means.

--bb
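A short sketch of what "mystring[1..$] may not mean what you think it means" looks like in practice, and of the conversions the dchar[] route involves. It assumes a current D2 Phobos (`std.utf.stride`, `validate`, `toUTF8`, `toUTF32`, and `std.exception.assertThrown`); the sample text is arbitrary.

```d
import std.exception : assertThrown;
import std.utf : stride, toUTF32, toUTF8, validate, UTFException;

void main()
{
    string s = "\u00E9t\u00E9";      // "été": 3 code points, 5 UTF-8 code units
    assert(s.length == 5);

    // Slicing at index 1 cuts the two-byte 'é' in half, leaving
    // malformed UTF-8 behind.
    assertThrown!UTFException(validate(s[1 .. $]));

    // stride reports how many code units the code point at an index
    // occupies, which gives a safe place to slice.
    size_t first = stride(s, 0);
    assert(first == 2);
    assert(s[first .. $] == "t\u00E9");

    // With dchar[] every element is a whole code point, so [1 .. $] is
    // safe, at the cost of converting back and forth.
    dstring d = toUTF32(s);
    assert(d.length == 3);
    assert(toUTF8(d[1 .. $]) == "t\u00E9");
}
```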
Nov 18 2006
parent Daniel Keep <daniel.keep.lists gmail.com> writes:
Bill Baxter wrote:
 Max Samuha wrote:
 On Sat, 18 Nov 2006 15:59:33 +0100, Alexander Panek
 <a.panek brainsware.org> wrote:

 PDF would be great, too.

 Tydr Schnubbis wrote:
 Daniel Keep wrote:
 Here's a draft of an article which, hopefully, will explain some of
 the
 details of how text in D works.  Any constructive criticism is
 welcomed,
 along with edits or corrections.



 For those who are still on Windows :), there is a free and compact doc viewer that supports the OpenOffice format: http://www.officeviewers.com/

 Thanks for the link, Max.

 Daniel, I like it. Seems quite clear to me. One minor thing: in one section you recommend just using dchar[] everywhere as a solution for not slicing characters in the middle. But then in the next section you recommend using std.string as a comprehensive solution for manipulating strings. Unfortunately std.string really only deals with char[] strings.

 So you might want to point out explicitly the dilemma that poses to the developer: if you go with dchar[] and have to do a lot of string munging, you're likely to find lots of toUTF8's and toUTF32's popping up in your code. If you go with char[], you've got to remember that mystring[1..$] may not mean what you think it means.

 --bb

You are, of course, right.

"OK; if you're doing array indexing or slicing, stick to dchar; if you're going to be using std.string, stick to char."

Doesn't really sound good. It implies that either the standard library has a hole in it or that indexing and slicing on char[] and wchar[] *should* work as expected.

I think I'll change the article so that it's correct, but here's a question for Walter: Is std.string going to support wchar[]s and dchar[]s? If not, why?

Heh, they say the best way to learn something is to teach it. Guess I'm still learning :P

-- Daniel
Nov 18 2006
prev sibling next sibling parent Lutger <lutger.blijdestijn gmail.com> writes:
Daniel Keep wrote:
 Here's a draft of an article which, hopefully, will explain some of the
 details of how text in D works.  Any constructive criticism is welcomed,
 along with edits or corrections.
 
 Also, any suggestions on where to put this?  Ideally it could go on the
 D website, but I think anywhere would be fine so long as we can point
 people to it.
 
 	-- Daniel
 

Cool information! I only recently became aware of how Unicode works because of this newsgroup. The current solution in D looks fine to me; it's just that people are not aware of it and the documentation doesn't help much in increasing Unicode awareness. I would vote for this information being incorporated right into the relevant sections of the official documentation.

Probably the best advice I read here was that if you want your text to just work, you either use dchar or do all string handling with std.string. It's very simple: don't go messing with char[] without the help of Phobos unless you know what you're doing. Perhaps you could put something like that in the beginning of your document.

D does have something similar to a string class in the form of std.string imo; the only thing is that it's procedural instead of object-based. I don't see a problem with that.
Nov 18 2006
prev sibling next sibling parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:


Alexander Panek wrote:
 PDF would be great, too.

 Tydr Schnubbis wrote:
 Daniel Keep wrote:
 Here's a draft of an article which, hopefully, will explain some of the
 details of how text in D works.  Any constructive criticism is welcomed,
 along with edits or corrections.



I used the .odt since I wanted people to be able to make modifications to it directly, if they wanted. I really don't like .rtf or .doc (long, painful history with those two), and .txt would probably destroy all formatting. I usually write stuff in reStructuredText, but just didn't on this occasion. Finally, the OOo-produced .pdf is kinda big (by an order of magnitude). So here is an .xhtml version, and I will continue to supply this with any updates. If someone needs it in something else, I'll do that as necessary. No point in continually converting it when I'm still updating it :P
 If you change the license you can put it in the Wiki4D ?

I've dual-licensed it under CC At-Sa and FDL but WOW the FDL is bad. Reading it is like trying to swim through tar. Also, I'm not entirely sure, but I think I may be violating the license by distributing it as ODT... I'm... not entirely sure. I've also got some moral objections to a few parts of the license, but I suppose it's not enough to prevent me using it. Problem is that GNU states specifically that the CC At-Sa license is not compatible with the FDL. Bloody hippies :3
 I would avoid the term "Unicode character" like the plague...
 If you must have something similar, then use "code point" ?
 It's OK to have it in the casual text, like "ASCII character,
 BMP character, Unicode character" but better not in the lists.

I've changed references to "characters" to "code points", but it now seems very cumbersome. I read the Wikipedia article, but I'm still not 100% sure where the distinction lies. So: what *precisely* is a "character", and when is it appropriate to use the word?
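One way to see the distinction is to count the same text three ways. The sketch below assumes a current D2 compiler; the helper `codePoints` is invented for illustration and simply lets `foreach` decode the string. "Code units" are the array elements, "code points" are what Unicode numbers, and what a reader perceives as a single character can span several code points.

```d
/// Hypothetical helper: counts code points by letting foreach decode UTF-8.
size_t codePoints(const(char)[] s)
{
    size_t n;
    foreach (dchar c; s)
        ++n;
    return n;
}

void main()
{
    // One code point outside the BMP: U+1D11E (musical G clef).
    string  s8  = "\U0001D11E";
    wstring s16 = "\U0001D11E"w;
    dstring s32 = "\U0001D11E"d;

    assert(s8.length  == 4);     // four UTF-8 code units
    assert(s16.length == 2);     // one UTF-16 surrogate pair
    assert(s32.length == 1);     // one UTF-32 code unit
    assert(codePoints(s8) == 1);

    // 'e' followed by a combining acute accent (U+0301): displayed as
    // one character, yet two code points and three UTF-8 code units.
    string accented = "e\u0301";
    assert(accented.length == 3);
    assert(codePoints(accented) == 2);
}
```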
 It also has an example on why: printf("Hello, World!\n");
 doesn't work. But it does, since string *literals* are all
 NUL-terminated. However, when you then try to extend that
 to a string variable, and that variable contains a slice...

I've changed it to say "statements like the above", and put in a note that yeah, ok, the example actually *does* work, but you really shouldn't count on that.

Apart from the "character" -> "code point" changes, I've tried to mark all changes by highlighting them yellow.

-- Daniel
Nov 18 2006
next sibling parent reply Walter Bright <newshound digitalmars.com> writes:
Daniel Keep wrote:
 I really don't like .rtf or .doc (long, painful history with those two),
 and .txt would probably destroy all formatting.  I usually write stuff
 in reStructuredText, but just didn't on this occasion.

I usually send articles around for review in .txt format; that way everyone can read them. After all the reviews are done, I format it into HTML (using Ddoc) and put up the web page.

The problems with sending around text files in non-text format attached to postings are:

1) the discussions always seem to focus on how to read the files, rather than their content

2) when the posting gets archived, the content of the non-text format becomes inaccessible (it isn't searched by Google, either)

That said, I think it's great you're working on a good article on strings in D. It'll be very helpful.
Nov 18 2006
parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
Walter Bright wrote:
 Daniel Keep wrote:
 I really don't like .rtf or .doc (long, painful history with those two),
 and .txt would probably destroy all formatting.  I usually write stuff
 in reStructuredText, but just didn't on this occasion.

I usually send articles around for review in .txt format, that way everyone can read them. After all the reviews are done, then I format it into html (using Ddoc) and put up the web page. The problems with sending around text files in non-text format attached to postings are: 1) the discussions always seem to focus on how to read the files, rather than their content 2) when the posting gets archived, the content of the non-text format becomes inaccessible (it isn't searched by google, either) That said, I think it's great you're working on a good article on strings in D. It'll be very helpful.

Usually I write up stuff in reStructuredText, which is basically plain text with markup that can be read without running it through a formatter. In this case I didn't because... I'm not really sure why. I think it was just because OOo has a better spell-checker than Vim :P

I might try dumping it out to a text file and see what happens...

Also, thanks for the response. Let me know if you think there's anything I should include :)

-- Daniel
Nov 18 2006
parent reply Walter Bright <newshound digitalmars.com> writes:
Daniel Keep wrote:
 Also, thanks for the response.  Let me know if you think there's
 anything I should include :)

To tell the truth, I haven't read it yet, because I am reluctant to download viewers and install them.
Nov 18 2006
parent Daniel Keep <daniel.keep.lists gmail.com> writes:
Walter Bright wrote:
 Daniel Keep wrote:
 Also, thanks for the response.  Let me know if you think there's
 anything I should include :)

To tell the truth, I haven't read it yet, because I am reluctant to download viewers and install them.

Ah, well, the latest zip contains an XHTML version which should open in just about any browser. Don't tell me you don't even browse your own website :3

-- Daniel
Nov 18 2006
prev sibling parent reply Serg Kovrov <kovrov no.spam> writes:
Hi Daniel,

You may want to give a try to Google Docs http://docs.google.com/
Seems your case is exactly what it is for.


-- 
serg.
Nov 18 2006
parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
Serg Kovrov wrote:
 Hi Daniel,
 
 You may want to give a try to Google Docs http://docs.google.com/
 Seems your case is exactly what it for.

Blech. No offense, but I hate web apps. Dialup makes these things slow as molasses to use. I've made a website with Google Pages before, and it was not a fun experience.

*click a button* *wait* ... ... ... ... *page loads*

In an ideal world, I could edit in OOo or GVim and have the files mirrored over FTP or somesuch. I really ought to try that one of these days...

-- Daniel
Nov 18 2006
parent reply Serg Kovrov <kovrov no.spam> writes:
Daniel Keep wrote:
 Blech.  No offense, but I hate web apps.  Dialup makes these things slow
 as molasses to use.  I've made a website with Google Pages before, and
 it was not a fun experience.
 
 *click a button*  *wait* ... ... ... ... *page loads*
 
 In an ideal world, I could edit in OOo or GVim and have the files
 mirrored over FTP or somesuch.  I really aught to try that one of these
 days...

I haven't used this Google service before, but other people publish papers like yours this way. And if one does not have a wiki (I hate wikis, btw) or other means to publish versioned documents, Google Docs seems the best option.

Out of curiosity, I created a new document and pasted in the contents from OpenOffice. It took me about 10 seconds (OO was open already) to have it online: http://docs.google.com/View?docid=dtqh79k_1rbxfmb

-- serg.
Nov 18 2006
parent Daniel Keep <daniel.keep.lists gmail.com> writes:
Serg Kovrov wrote:
 Daniel Keep wrote:
 Blech.  No offense, but I hate web apps.  Dialup makes these things slow
 as molasses to use.  I've made a website with Google Pages before, and
 it was not a fun experience.

 *click a button*  *wait* ... ... ... ... *page loads*

 In an ideal world, I could edit in OOo or GVim and have the files
 mirrored over FTP or somesuch.  I really aught to try that one of these
 days...

I'm haven't used this google service before, but other people publish papers like yours this way. And if one do not have a wiki (I hate wiki's, btw) or other means to publish versioned documents - google docs seems best option. Out of curiosity, I have created new document and pasted contents from open office. It takes me a about 10 seconds (OO was opened already) to have it online - http://docs.google.com/View?docid=dtqh79k_1rbxfmb

Not bad, except that there's no spacing between paragraphs. It also destroyed indenting on all the code examples :3

In any case, I dumped out the text to a plain text file, and re-marked it up in reStructuredText. Generates almost exactly the same HTML output, but now people can't complain they can't view it :P

I'll post it up as soon as I've worked out if I'm going to include this "Q&A" section.

-- Daniel
Nov 18 2006
prev sibling next sibling parent reply Pierre Rouleau <prouleau impathnetworks.com> writes:
Daniel Keep wrote:

 Here's a draft of an article which, hopefully, will explain some of the
 details of how text in D works.  Any constructive criticism is welcomed,
 along with edits or corrections.
 

As someone who has not been coding in D except for trying out some D every so often, I find:

- the discussion of Unicode and its support in D clear and useful
- the description of the use of printf and string confusing

You wrote::

    Back before D had the std.stdio.writefln method, most examples used the old C function printf. This worked fine until you tried to output a string::

        printf(“Hello, World!\n”);

    The above statement was very likely to print out garbage that left many people scratching their heads. The reason is that C uses NUL-terminated strings, whereas D uses true arrays. In other words:

    - Strings in C are a pointer to the first character. A string ends at the first NUL character.
    - Strings in D are a pointer to the first character, followed by a length. There is no terminating character.

    And that's the problem: printf is looking for a terminator that doesn't necessarily exist.

That would lead me to believe that I could not use printf to print a string literal. But then I just wrote and compiled the following D code::

    int main()
    {
        printf("Hello!\n");
        printf("Bye!\n");
        return 1;
    }

But it prints just fine. So, something must be missing in your explanation or my understanding. I'll have to read more about D to understand.

Just my 2 cents,

-- P.R.
Nov 18 2006
parent reply Pierre Rouleau <prouleau impathnetworks.com> writes:
Pierre Rouleau wrote:

 Daniel Keep wrote:
 
 Here's a draft of an article which, hopefully, will explain some of the
 details of how text in D works.  Any constructive criticism is welcomed,
 along with edits or corrections.


And BTW, the line::

    printf(“Hello, World!\n”);

does not compile because of the non-ASCII characters used for quoting.

So other questions come to mind:

- Can D source code contain Unicode characters freely?
- If so, how is it done?
- If not, how can we define a Unicode string literal?
- Does D have a Unicode string type like, say, Python, or is it better at specifying them?
- How do we handle internationalization of presentation strings in D?
- gettext support...
- Do we have to use text codecs (as in Python, for example)?

This information would fit quite nicely in an article describing text in D.
Nov 18 2006
parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
Pierre Rouleau wrote:
 Pierre Rouleau wrote:
 
 Daniel Keep wrote:

 Here's a draft of an article which, hopefully, will explain some of the
 details of how text in D works.  Any constructive criticism is welcomed,
 along with edits or corrections.



Read down a little bit further: it points out that you want to use std.string.toStringz to ensure that the NUL terminator exists.

It also admits that the example actually DOES work, simply because dmd sticks the NUL terminator on the end of all string literals. But as someone already pointed out, if what you're dealing with is NOT a string literal: a slice of another string, or something read from disk, then it won't be there and the code will choke.

I should probably reorganise the section to be clearer on this. I used that (wrong) example because an example that actually fails would be somewhat longer, and probably make people think "Ok, so why can't I use slices to C functions? Are they not really strings?"
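For what it's worth, the slice case can be sketched in a few lines. This is only an illustration, written for a current D compiler (where printf comes from core.stdc.stdio and literals are typed string rather than the char[] of 2006); firstWord is a hypothetical helper, not something from the article.

```d
import core.stdc.stdio : printf;
import std.string : toStringz;

// Hypothetical helper: returns a slice of `text` up to the first space.
// No copy is made and no NUL terminator is appended.
string firstWord(string text)
{
    foreach (i, char c; text)
        if (c == ' ')
            return text[0 .. i];
    return text;
}

void main()
{
    string greeting = "Hello World";
    string word = firstWord(greeting);   // a 5-character slice of greeting

    // Risky: word.ptr points into "Hello World", so C would keep reading
    // past the fifth character until it happens to find a NUL.
    // printf("%s\n", word.ptr);

    // Safe: toStringz hands C a NUL-terminated string.
    printf("%s\n", toStringz(word));

    // Also safe: tell printf the length explicitly with %.*s.
    printf("%.*s\n", cast(int) word.length, word.ptr);
}
```

The slice shares memory with the literal, which is why the commented-out call would run on into " World" rather than stop after five characters.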
 
 And BTW, the line::
 
   printf(“Hello, World!\n”);
 
 does not compile because of the non ASCII characters used for quoting.

Damnit... every time I go to write prose that option's off, and every time I write code examples it's ON. I swear OOo is out to get me >_<
 So other questions comes to mind:

Off the top of my head:
 - Can D source code contain Unicode characters freely?

- Yup, you betcha!
 - If so, how is it done?

- Use a text editor that supports saving files in UTF-8. I'm not sure off the top of my head if UTF-16 and UTF-32 are supported directly...
 - If not, how can we define a Unicode string literal?

- If you don't have access to a Unicode-enabled editor, you can use escape sequences with \uXXXX (or \UXXXXXXXX for higher Unicode code points.)
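A small sketch of the escape forms, assuming a current D compiler; the source stays pure ASCII and the compiler encodes the code points as UTF-8.

```d
import std.stdio : writeln;

void main()
{
    // U+00E9 fits in the four-digit \uXXXX form.
    string cafe = "caf\u00E9";

    // U+1D11E lies above U+FFFF, so it needs the eight-digit \UXXXXXXXX form.
    string clef = "\U0001D11E";

    // .length counts UTF-8 code units, not characters:
    // "caf" is 3 units plus 2 for U+00E9; U+1D11E alone takes 4.
    assert(cafe.length == 5);
    assert(clef.length == 4);

    writeln(cafe, " ", clef);
}
```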
 - Does D have a Unicode string type like, say Python, or is it better at
 specifying them?

- That's *all* D has. Remember, char, wchar and dchar correspond to UTF-8, UTF-16 and UTF-32 which are the three main ways of storing Unicode text. Internally, Python uses UTF-16.
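To make the three encodings concrete, here is a hedged sketch for a current D compiler, where string, wstring and dstring are the library aliases for immutable arrays of char, wchar and dchar (in 2006 one would have written char[], wchar[] and dchar[]).

```d
import std.utf : toUTF16, toUTF32;

void main()
{
    // The same seven code points in each encoding.
    string  s8  = "h\u00E9llo \U0001D11E";    // UTF-8
    wstring s16 = "h\u00E9llo \U0001D11E"w;   // UTF-16
    dstring s32 = "h\u00E9llo \U0001D11E"d;   // UTF-32

    // .length is the number of code units, so it differs per encoding.
    assert(s8.length  == 11);  // U+00E9 takes 2 units, U+1D11E takes 4
    assert(s16.length == 8);   // U+1D11E takes a surrogate pair
    assert(s32.length == 7);   // one unit per code point

    // Converting between them is lossless.
    assert(toUTF16(s8) == s16);
    assert(toUTF32(s8) == s32);
}
```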
 - How do we handle internationalization of presentation strings in D?
 - gettext support...

I don't know if gettext would work in D, simply because I've never seen it tried. D doesn't have any *direct* support for this, tho. (Then again, I'm yet to see *any* programming language that does.)
 - Do we have to use text codecs (as in Python for example)?

D has no built-in support for converting between code pages, as far as I know. You need to download and use a conversion library like iconv to convert between code pages.
 This information would fit quite nicely in an article describing text in D.

I may have to restructure it into two sections: a "What the... it's a borken!" section and a "Q&A" section.

Thanks for the feedback.

-- Daniel
Nov 18 2006
parent reply Pierre Rouleau <prouleau impathnetworks.com> writes:
Daniel Keep wrote:

 
 Pierre Rouleau wrote:
 
Pierre Rouleau wrote:


Daniel Keep wrote:


Here's a draft of an article which, hopefully, will explain some of the
details of how text in D works.  Any constructive criticism is welcomed,
along with edits or corrections.



Read down a little bit further: it points out that you want to use std.string.toStringz to ensure that the NUL terminator exists.

I saw that. My point was that the article should be a little clearer as to why you would want to use it. As an introduction to text processing in D, and a treatment of the different string formats (NUL-terminated or length-based), it should tell a newbie the implications of the code he writes and the effect of transformations (such as slices or whatever).
 It also admits that the example actually DOES work, simply because dmd
 sticks the NUL terminator on the end of all string literals.  But as
 someone already pointed out, if what you're dealing with is NOT a string
 literal: a slice of another string, or something read from disk, then it
 won't be there and the code will choke.
 
 I should probably reorganise the section to be clearer on this.  I used
 that (wrong) example because an example that actually fails would be
 somewhat longer, and probably make people think "Ok, so why can't I use
 slices to C functions?  Are they not really strings?"

 
 
And BTW, the line::

  printf(“Hello, World!\n”);

does not compile because of the non ASCII characters used for quoting.

Damnit... every time I go to write prose that option's off, and every time I write code examples it's ON. I swear OOo is out to get me >_<

I also like reStructuredText myself... but writing extra symbols is a little trickier...
 
So other questions comes to mind:

- Can D source code contain Unicode characters freely?

- If so, how is it done?

- Use a text editor that supports saving files in UTF-8. I'm not sure off the top of my head if UTF-16 and UTF-32 are supported directly...

Readers might be interested to know that they can use these in the source code file. They may also wonder whether non-ASCII characters are acceptable in things such as variable names.
- If not, how can we define a Unicode string literal?

- If you don't have access to a Unicode-enabled editor, you can use escape sequences with \uXXXX (or \UXXXXXXXX for higher Unicode code points.)
- Does D have a Unicode string type like, say Python, or is it better at
specifying them?

- That's *all* D has. Remember, char, wchar and dchar correspond to UTF-8, UTF-16 and UTF-32 which are the three main ways of storing Unicode text. Internally, Python uses UTF-16.
- How do we handle internationalization of presentation strings in D?
- gettext support...

I don't know if gettext would work in D, simply because I've never seen it tried. D doesn't have any *direct* support for this, tho.

I can't see why it would not. Can we have a function named '_()' in D? Since the gettext philosophy is to write all presentation strings in English, the code can be written in ASCII-only files, and since the strings are Unicode, the translated strings could contain any symbol at runtime.

One aspect is the string formatting. Does D support string formatting similar to Python's dictionary-based formatting like:

 a_dict = {'person_name': 'Daniel'}
 a_string = 'Hello %(person_name)s ! How are you?' % a_dict

Python dictionaries are very useful for that purpose. Translating presentation strings works better when the entire string context is available to the person doing the natural language translation. As far as I am concerned, this is an important feature for a programming language used to write (client-side) applications.
 
 (Then again, I'm yet to see *any* programming language that does.)
 

that the language does not preclude using gettext.
 
- Do we have to use text codecs (as in Python for example)?

D has no built-in support for converting between code pages, as far as I know. You need to download and use a conversion library like iconv to convert between code pages.

 
 Thanks for the feedback.
 

You're welcome. -- Pierre
Nov 18 2006
next sibling parent Chris Nicholson-Sauls <ibisbasenji gmail.com> writes:
Pierre Rouleau wrote:
 One aspect is the string formatting.  Does D support string formatting 
 similar to Python's dictionary-based formatting like:
 
 a_dict = {'person_name': 'Daniel'}
 a_string = 'Hello %(person_name)s ! How are you?' % a_dict
 

No, but it ought to be easy enough to make. A quick hack at it:

# import cashew.utils.array ;
#
# char[] dictsub (char[] src, char[][char[]] dict) {
#   char[]  result ;
#   char[]* plug   ;
#   size_t  open   ,
#           close  ,
#           pos    ;
#
#   // Find each "%(" and copy the text that precedes it.
#   while (NOT_FOUND != (open = src.indexOf("%(", pos))) {
#     close   = src.indexOf(")", open) ;
#     result ~= src[pos .. open]       ;
#     pos     = close + 1              ;
#
#     // Look the key up; an unknown key is an error.
#     if (null is (plug = src[open + 2 .. close] in dict)) {
#       throw new Exception("dictsub: invalid key " ~ src[open .. close + 1]);
#     }
#     result ~= *plug;
#   }
#   // Copy whatever follows the last placeholder.
#   result ~= src[pos .. $];
#   return result;
# }

Don't quote me on that working exactly right as is, since it's just off the top of my head. But usage would be fairly straightforward, while not quite as pretty as Python since we don't yet have associative literals.

# char[][char[]] a_dict;
# a_dict["person_name"] = "Daniel";
# char[] a_string = "Hello %(person_name)! How are you?".dictsub(a_dict);

-- Chris Nicholson-Sauls
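For readers without the Cashew library, the same idea can be sketched against the standard library alone. This is an assumption-laden rewrite rather than the code above: it targets a current D compiler, uses std.string.indexOf (which returns -1 when nothing is found) in place of Cashew's indexOf/NOT_FOUND, and uses string where the original uses char[].

```d
import std.string : indexOf;

// Replaces each "%(key)" in src with dict[key]; throws on an unknown key.
string dictsub(string src, string[string] dict)
{
    string result;
    size_t pos = 0;

    while (true)
    {
        // Search only the part of src that has not been consumed yet.
        auto rel = indexOf(src[pos .. $], "%(");
        if (rel < 0)
            break;
        size_t open = pos + rel;

        auto relClose = indexOf(src[open .. $], ")");
        if (relClose < 0)
            break;              // unterminated placeholder: copy the rest as-is
        size_t close = open + relClose;

        result ~= src[pos .. open];

        string key = src[open + 2 .. close];
        auto plug = key in dict;
        if (plug is null)
            throw new Exception("dictsub: invalid key " ~ src[open .. close + 1]);
        result ~= *plug;

        pos = close + 1;
    }

    result ~= src[pos .. $];
    return result;
}

void main()
{
    // Current D does have associative array literals.
    string[string] dict = ["person_name": "Daniel"];
    assert("Hello %(person_name)! How are you?".dictsub(dict)
           == "Hello Daniel! How are you?");
}
```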
Nov 19 2006
prev sibling parent =?UTF-8?B?QW5kZXJzIEYgQmrDtnJrbHVuZA==?= <afb algonet.se> writes:
Pierre Rouleau wrote:

 I don't know if gettext would work in D, simply because I've never seen
 it tried.  D doesn't have any *direct* support for this, tho.

I can't see why it would not. Can we have a function named '_()' in D?

Yes, we are using this in wxD - it also works for GNU gettext with D. It's defined as an alias that leads to a function with a longer name:

 public static string wx.wxObject.GetTranslation(string str);

 extern(C) char * gettext (char * msgid);

--anders
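The alias trick itself can be sketched without linking against gettext or wxD. In the sketch below, catalogue and getTranslation are hypothetical stand-ins (not the wxD or GNU gettext API), and the alias syntax is that of a current D compiler.

```d
import std.stdio : writeln;

// Hypothetical message catalogue; a real program would load translations
// from gettext or a similar library instead.
string[string] catalogue;

// Hypothetical lookup: returns the translation if there is one,
// otherwise the original message, which is also how gettext falls back.
string getTranslation(string msgid)
{
    if (auto hit = msgid in catalogue)
        return *hit;
    return msgid;
}

// A lone underscore is a valid identifier in D, so it can alias the lookup.
alias _ = getTranslation;

void main()
{
    catalogue["Hello"] = "Bonjour";
    writeln(_("Hello"));     // found in the catalogue
    writeln(_("Goodbye"));   // not found, printed unchanged
}
```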
Nov 19 2006
prev sibling next sibling parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:


Ok, here's the third revision.  Includes some clearer examples, a Q&A
section, and is now written in plain text, and then dumped out to HTML.
 If anyone complains about what file format it's in now, they can get
stuffed :P  (And *yes*, the HTML is generated directly from the .txt file.)

Again, all feedback and suggestions are welcome.

	-- Daniel

Nov 18 2006
parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:


Daniel Keep wrote:
 Ok, here's the third revision.  Includes some clearer examples, a Q&A
 section, and is now written in plain text, and then dumped out to HTML.
  If anyone complains about what file format it's in now, they can get
 stuffed :P  (And *yes*, the HTML is generated directly from the .txt file.)
 
 Again, all feedback and suggestions is welcome.
 
 	-- Daniel

I finally managed to find a copy of the C99 standard, and I've filled in what characters you can use... although it's still a bit tricky to understand. That said, I added an example which shows using function names written entirely in hiragana, so it obviously works :P

Secondly, I've removed the references to std.utf.stride. After going over the docs again, and actually *testing* the code, it turns out I was dead wrong on what stride does: it returns the length of the code point sequence at the given location, not the number of code points from that location. Whoopsie.

I've replaced the code showing how to use std.utf.stride with a small function that correctly computes the number of code points in a string.

-- Daniel
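The difference can be sketched as follows. This is not the function from the article (which isn't reproduced in this thread), just one way to do it on a current D compiler; countCodePoints and countWithStride are hypothetical names.

```d
import std.utf : stride;

// Counts code points by letting foreach decode the UTF-8 into dchars.
size_t countCodePoints(string s)
{
    size_t n = 0;
    foreach (dchar c; s)
        ++n;
    return n;
}

// Same count, hopping from one encoded sequence to the next with stride,
// which returns the length in code units of the sequence at index i.
size_t countWithStride(string s)
{
    size_t n = 0;
    for (size_t i = 0; i < s.length; i += stride(s, i))
        ++n;
    return n;
}

void main()
{
    // Five hiragana (ko-n-ni-chi-ha), written as escapes to keep the source ASCII.
    string s = "\u3053\u3093\u306B\u3061\u306F";

    assert(s.length == 15);           // 5 code points, 3 UTF-8 code units each
    assert(stride(s, 0) == 3);        // length of the first sequence only
    assert(countCodePoints(s) == 5);
    assert(countWithStride(s) == 5);
}
```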
Nov 18 2006
next sibling parent reply Hasan Aljudy <hasan.aljudy gmail.com> writes:
Daniel Keep wrote:
 Daniel Keep wrote:
 Ok, here's the third revision.  Includes some clearer examples, a Q&A
 section, and is now written in plain text, and then dumped out to HTML.
  If anyone complains about what file format it's in now, they can get
 stuffed :P  (And *yes*, the HTML is generated directly from the .txt file.)

 Again, all feedback and suggestions is welcome.

 	-- Daniel

I finally managed to find a copy of the C99 standard, and I've filled in what characters you can use... although it's still a bit tricky to understand. That said, I added an example which shows using function names written entirely in hiragana, so it obviously works :P

konnichiwa!!!!!!11one :D
 
 Secondly, I've removed the references to std.utf.stride.  After going
 over the docs again, and actually *testing* the code, it turns out I was
 dead wrong on what stride does: it returns the length of the code point
 sequence at the given location, not the number of code points from that
 location.  Whoopsie.
 
 I've replaced the code showing how to use std.utf.stride with a small
 function that correctly computes the number of code points in a string.
 
 	-- Daniel
 

Nice job on the article. Why don't you place it on the dsource tutorials section? It's a wiki system, so you can update it more easily.
Nov 19 2006
parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
Hasan Aljudy wrote:
 
 
 Daniel Keep wrote:
 Daniel Keep wrote:
 Ok, here's the third revision.  Includes some clearer examples, a Q&A
 section, and is now written in plain text, and then dumped out to HTML.
  If anyone complains about what file format it's in now, they can get
 stuffed :P  (And *yes*, the HTML is generated directly from the .txt
 file.)

 Again, all feedback and suggestions is welcome.

     -- Daniel

I finally managed to find a copy of the C99 standard, and I've filled in what characters you can use... although it's still a bit tricky to understand. That said, I added an example which shows using function names written entirely in hiragana, so it obviously works :P

konnichiwa!!!!!!11one :D

Actually, I'm pretty sure it's supposed to be konnichiha: people keep spelling and saying it "konnichiwa" because westerners misheard what the Japanese were saying :3 (Do correct me if I'm wrong, btw...)
 Secondly, I've removed the references to std.utf.stride.  After going
 over the docs again, and actually *testing* the code, it turns out I was
 dead wrong on what stride does: it returns the length of the code point
 sequence at the given location, not the number of code points from that
 location.  Whoopsie.

 I've replaced the code showing how to use std.utf.stride with a small
 function that correctly computes the number of code points in a string.

     -- Daniel

Nice job on the article. Why don't you place it on the dsource tutorials section? It's a wiki system, so you can update it more easily.

Honestly, I'd love to see this on the official D website; from the number of people coming to the forums saying "why doesn't this work?" and "strings are teh borken!" it's obvious we need to have something that says "this is how things work and why they work the way they do."

But if Walter doesn't want it, then I'm happy to stick it up on the Wiki... yet another format I'll have to change it over to :P

-- Daniel
Nov 19 2006
parent reply Chris Nicholson-Sauls <ibisbasenji gmail.com> writes:
Daniel Keep wrote:
 
 Hasan Aljudy wrote:
konnichiwa!!!!!!11one :D

Actually, I'm pretty sure it's supposed to be konnichiha: people keep spelling and saying it "konnichiwa" because westerners misheard what the Japanese were saying :3 (Do correct me I'm wrong, btw...)

Unless my Japanese mentor was playing a prank on me (which is /entirely/ possible) it's actually a quirk. While it is written "kon'ityi-ha" it is indeed pronounced "kon'nityi-wa", as the 'ha' kana is written for the particle 'wa' for some long-forgotten reason. (Kind of like the archaic 'wo' kana is still used for the 'o' prefix, as in "(w)o-genki desu-ka".)

-- Chris Nicholson-Sauls
Nov 19 2006
next sibling parent reply Hasan Aljudy <hasan.aljudy gmail.com> writes:
Chris Nicholson-Sauls wrote:
 Daniel Keep wrote:
 Hasan Aljudy wrote:
 konnichiwa!!!!!!11one :D

Actually, I'm pretty sure it's supposed to be konnichiha: people keep spelling and saying it "konnichiwa" because westerners misheard what the Japanese were saying :3 (Do correct me I'm wrong, btw...)


because the "ha" is actually a particle, and the "ha" particle is pronounced "wa" even though it's written as "ha". I think the phrase is basically an incomplete sentence understood to be "It's morning" or something like that ..
 
 Unless my Japanese mentor was playing a prank on me (which is /entirely/ 
 possible) its actually a quirk thing.  While it is written "kon'ityi-ha" 
 it is indeed pronouned "kon'nityi-wa", as the 'ha' kana is written for 
 the particle 'wa' for some long-forgotten reason.  (Kind of like the 
 archaic 'wo' kana is still used for the 'o' prefix, as in "(w)o-genki 
 desu-ka".)

kon'ity-ha? Wow, what kind of romanization system is that? Now /that/ is a prank .. I think what you said about the ha/wa is correct thu. From what I've gathered, the particle used to be pronounced "ha" but its pronunciation has changed over the centuries, while the spelling for it didn't.
 
 -- Chris Nicholson-Sauls

Nov 19 2006
parent Chris Nicholson-Sauls <ibisbasenji gmail.com> writes:
Hasan Aljudy wrote:
 Chris Nicholson-Sauls wrote:
 Unless my Japanese mentor was playing a prank on me (which is 
 /entirely/ possible) its actually a quirk thing.  While it is written 
 "kon'ityi-ha" it is indeed pronouned "kon'nityi-wa", as the 'ha' kana 
 is written for the particle 'wa' for some long-forgotten reason.  
 (Kind of like the archaic 'wo' kana is still used for the 'o' prefix, 
 as in "(w)o-genki desu-ka".)

kon'ity-ha? Wow, what kind of romanization system is that? Now /that/ is a prank ..

It's the Kunreisiki 「訓令式」. I prefer it, personally, because it stays a bit closer to the way it would be written in hiragana/katakana. (Like using "si" rather than "shi", because that's the only way it is pronounced, or using "tya" rather than "cha" because it would be written 「ちゃ」 in the kana.)

Weblink: http://www.halcat.com/roomazi/doc/iso3602.html

That said, though... I actually did make a mistake. *sigh* It should've just been "ti" rather than "tyi" at the end. That's what I get for responding on the way to bed, though.

And I think you're right about it meaning basically "it's morning" or "it's a day", or some such. I never really asked, but looking at the kanji it's written with, it seems to be a really awkward way of saying "good weather" or some such... ah hell. :)
 I think what you said about the ha/wa is correct thu. From what I've 
 gathered, the particle used to be pronounced "ha" but its pronunciation 
 has changed over the centuries, while the spelling for it didn't.

That could well be. Would make a little more sense than it just is, and that's that. -- Chris Nicholson-Sauls
Nov 19 2006
prev sibling next sibling parent reply Bill Baxter <wbaxter gmail.com> writes:
Chris Nicholson-Sauls wrote:
 Daniel Keep wrote:
 
 Hasan Aljudy wrote:

 konnichiwa!!!!!!11one :D

Actually, I'm pretty sure it's supposed to be konnichiha: people keep spelling and saying it "konnichiwa" because westerners misheard what the Japanese were saying :3 (Do correct me I'm wrong, btw...)

Unless my Japanese mentor was playing a prank on me (which is /entirely/ possible) its actually a quirk thing. While it is written "kon'ityi-ha" it is indeed pronouned "kon'nityi-wa", as the 'ha' kana is written for the particle 'wa' for some long-forgotten reason.

yep. (Kind of like the
 archaic 'wo' kana is still used for the 'o' prefix, as in "(w)o-genki 
 desu-ka".)

Now you're just making stuff up. :-)

'wo' is used as a particle indicating the object of a transitive verb. Like "hon wo yomu" (read a book):

 本を読む

Nothing to do with the polite 'o' prefix in o-genki desu ka:

 御元気ですか

(Though you're more likely to see it written with the hiragana 'o' instead: お元気ですか。)

--bb
Nov 19 2006
parent Chris Nicholson-Sauls <ibisbasenji gmail.com> writes:
Bill Baxter wrote:
 Chris Nicholson-Sauls wrote:
 
 Daniel Keep wrote:

 Hasan Aljudy wrote:

 konnichiwa!!!!!!11one :D

Actually, I'm pretty sure it's supposed to be konnichiha: people keep spelling and saying it "konnichiwa" because westerners misheard what the Japanese were saying :3 (Do correct me I'm wrong, btw...)

Unless my Japanese mentor was playing a prank on me (which is /entirely/ possible) its actually a quirk thing. While it is written "kon'ityi-ha" it is indeed pronouned "kon'nityi-wa", as the 'ha' kana is written for the particle 'wa' for some long-forgotten reason.

yep. (Kind of like the
 archaic 'wo' kana is still used for the 'o' prefix, as in "(w)o-genki 
 desu-ka".)

Now you're just making stuff up. :-) 'wo' is used as a particle indicating the object of a transitive verb. Like "hon wo yomu" (read a book) 本を読む Nothing to do with with the polite 'o' prefix in, o-genki desu ka: 御元気ですか (Though you're more likely to see it written with the hiragana 'o' instead: お元気ですか。) --bb

Could've sworn 'wo' was used to write 'o-' though... ah well. Either that one /was/ a prank, or it's just because I've hardly touched any Japanese in a couple of years or so. The shame. :) Guess I could've played it safe and dug out one of my dictionaries to check. But where's the fun in that?

-- Chris Nicholson-Sauls
Nov 19 2006
prev sibling parent reply Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
Chris Nicholson-Sauls wrote:
 Daniel Keep wrote:
 Hasan Aljudy wrote:
 konnichiwa!!!!!!11one :D

Actually, I'm pretty sure it's supposed to be konnichiha: people keep spelling and saying it "konnichiwa" because westerners misheard what the Japanese were saying :3 (Do correct me I'm wrong, btw...)

Unless my Japanese mentor was playing a prank on me (which is /entirely/ possible) its actually a quirk thing. While it is written "kon'ityi-ha" it is indeed pronouned "kon'nityi-wa", as the 'ha' kana is written for the particle 'wa' for some long-forgotten reason. (Kind of like the archaic 'wo' kana is still used for the 'o' prefix, as in "(w)o-genki desu-ka".) -- Chris Nicholson-Sauls

"D" wa sugoi desu ne... Whoa, do D community members have some bias towards learning Japanese? I myself am a (slow, but active) learner of Japanese (finished Pimsleur's Japanese Level 3 some time ago).

-- Bruno Medeiros - MSc in CS/E student http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Nov 20 2006
parent Bill Baxter <dnewsgroup billbaxter.com> writes:
Bruno Medeiros wrote:
 Chris Nicholson-Sauls wrote:
 Daniel Keep wrote:
 Hasan Aljudy wrote:
 konnichiwa!!!!!!11one :D

Actually, I'm pretty sure it's supposed to be konnichiha: people keep spelling and saying it "konnichiwa" because westerners misheard what the Japanese were saying :3 (Do correct me I'm wrong, btw...)

Unless my Japanese mentor was playing a prank on me (which is /entirely/ possible) its actually a quirk thing. While it is written "kon'ityi-ha" it is indeed pronouned "kon'nityi-wa", as the 'ha' kana is written for the particle 'wa' for some long-forgotten reason. (Kind of like the archaic 'wo' kana is still used for the 'o' prefix, as in "(w)o-genki desu-ka".) -- Chris Nicholson-Sauls

"D" wa sugoi desu ne... Whoa, do D community members have some bias towards japanese learning? I myself am a (slow, but active) learner of japanese (finished Pimsleur's Japanese Level 3 some time ago).

Yeh, maybe we should have the D Conference here in Tokyo, after all. ;-) --bb
Nov 20 2006
prev sibling parent Don Clugston <dac nospam.com.au> writes:
Daniel Keep wrote:
 Again, all feedback and suggestions is welcome.

Fabulous. It's another *genuine* FAQ, and it'd be great to see this on the official website.
Nov 20 2006
prev sibling parent BCS <BCS pathilink.com> writes:
Daniel Keep wrote:
 Here's a draft of an article which, hopefully, will explain some of the
 details of how text in D works.  Any constructive criticism is welcomed,
 along with edits or corrections.
 
 Also, any suggestions on where to put this?  Ideally it could go on the
 D website, but I think anywhere would be fine so long as we can point
 people to it.
 
 	-- Daniel
 

Did this paper ever get hosted somewhere? I'm looking for a URL to cite it by.
Dec 22 2006