
digitalmars.D - Automated page translation with Google

reply Walter Bright <newshound digitalmars.com> writes:
I've been looking into adding buttons to the D web pages to do automatic 
translation to different languages. The trouble is, the google 
translator also attempts to translate the code blocks, resulting in a mess.

Is there a css tag, hack, or trick to convince google translator to skip 
those sections?
Mar 22 2007
next sibling parent reply Pragma <ericanderton yahoo.removeme.com> writes:
Walter Bright wrote:
 I've been looking into adding buttons to the D web pages to do automatic 
 translation to different languages. The trouble is, the google 
 translator also attempts to translate the code blocks, resulting in a mess.
 
 Is there a css tag, hack, or trick to convince google translator to skip 
 those sections?

I googled around for a bit and even tried to see how google's own source snippets stand up to translation. Apparently, it's a known issue.

The only thing I found was a reference to a <meta> tag attribute that disables translation for the whole page:

http://www.google.com/help/faq_translation.html
(bullet #13)
<meta name="google" value="notranslate">

But there's nothing like <div google="notranslate">...</div> or somesuch available that I could find.

FWIW, I found a rather nice javascript widget here for translation support that uses a drop-down instead of buttons. The page itself is a live example of it in action (in the menu on the right):

http://forwarddevelopment.blogspot.com/2006/12/add-translation-tool-to-your-blog.html

--
- EricAnderton at yahoo
Mar 22 2007
parent reply Walter Bright <newshound digitalmars.com> writes:
Pragma wrote:
 Walter Bright wrote:
 I've been looking into adding buttons to the D web pages to do 
 automatic translation to different languages. The trouble is, the 
 google translator also attempts to translate the code blocks, 
 resulting in a mess.

 Is there a css tag, hack, or trick to convince google translator to 
 skip those sections?

I googled around for a bit and even tried to see how google's own source snippets stand up to translation. Apparently, it's a known issue.

The only thing I found was a reference to a <meta> tag attribute that disables translation for the whole page:

http://www.google.com/help/faq_translation.html
(bullet #13)
<meta name="google" value="notranslate">

Oh well - of course it's quite useless to provide a translation button and then mark the whole page as notranslate!
 But there's nothing like <div google="notranslate">...</div> or somesuch 
 available that I could find.

Sigh. I couldn't find one, either, I was hoping I just overlooked the obvious.
 FWIW, I found a rather nice javascript widget here for translation 
 support that uses a drop-down instead of buttons. The page itself is a 
 live example of it in action (in the menu on the right):
 http://forwarddevelopment.blogspot.com/2006/12/add-translation-tool-to-your-blog.html

That does work reasonably well. Thanks!
Mar 22 2007
next sibling parent "Vladimir Panteleev" <thecybershadow gmail.com> writes:
On Fri, 23 Mar 2007 00:27:15 +0200, Walter Bright <newshound digitalmars.com> wrote:

 Pragma wrote:
 The only thing I found was a reference to a <meta> tag attribute that
 disables translation for the whole page:

 http://www.google.com/help/faq_translation.html
 (bullet #13)
 <meta name="google" value="notranslate">

Oh well - of course it's quite useless to provide a translation button and then mark the whole page as notranslate!

You could put each code snippet into its own iframe :)

--
Best regards, Vladimir mailto:thecybershadow gmail.com
Mar 22 2007
prev sibling parent reply "Unknown W. Brackets" <unknown simplemachines.org> writes:
An option, although a bit of work, would be to load the code snippets in 
a separate request using JavaScript.  Unfortunately, this would require 
JavaScript (although alternative links could be provided to the same 
content.)  It's definitely nicer than using iframes, but not good enough...

If you're using JavaScript code to make the translation happen, this 
might be reasonable.  All it would have to do is reload the original 
(untranslated) HTML and replace the sections of code with the originals. 
  That wouldn't be too much work, actually.

It's also possible to provide the translation through the server side, 
using a proxy and caching thus allowing you to control it to your 
desires... but this would probably be overdoing it.
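
As a rough illustration of the server-side idea above, here is a minimal Python sketch that hides every <pre> block behind a placeholder before the page is handed to a translator, then puts the originals back afterwards. Everything named here is an assumption made for illustration: the `translate` callable stands in for whatever translation service is used, and the sketch assumes that service passes HTML comments such as `<!--CODE0-->` through unchanged, which would need to be checked against the real service.

```python
import re

# Matches a whole <pre>...</pre> block, including attributes and newlines.
PRE_BLOCK = re.compile(r"<pre\b[^>]*>.*?</pre>", re.IGNORECASE | re.DOTALL)
PLACEHOLDER = re.compile(r"<!--CODE(\d+)-->")


def protect_code_blocks(html):
    """Swap each <pre> block for a numbered placeholder.

    Returns the masked HTML and the list of original blocks.
    """
    blocks = []

    def stash(match):
        blocks.append(match.group(0))
        return "<!--CODE{}-->".format(len(blocks) - 1)

    return PRE_BLOCK.sub(stash, html), blocks


def restore_code_blocks(translated_html, blocks):
    """Put the original, untranslated blocks back in place of the placeholders."""
    return PLACEHOLDER.sub(lambda m: blocks[int(m.group(1))], translated_html)


def translate_page(html, translate):
    """Translate a page while leaving its <pre> blocks untouched.

    `translate` is a hypothetical callable: HTML string in, HTML string out.
    """
    masked, blocks = protect_code_blocks(html)
    return restore_code_blocks(translate(masked), blocks)


def fake_translate(text):
    # Stand-in for a real translation service: it just uppercases the text.
    return text.upper()


page = "<p>hello</p><pre>int x = 1;</pre>"
print(translate_page(page, fake_translate))
```

With the stand-in translator, the paragraph text is changed while the code block comes back exactly as written. A cache in front of `translate_page` would cover the "proxy and caching" part, but that is left out here.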

That said, in my previous projects I've been very impressed with 
community-driven translation efforts.  I mean, we had something like 35 
volunteer translations of about 250k of text.  That's nothing compared 
to the probably 1500k of text to be translated for D...

Even so, automatic translation just cannot compare to the real thing. 
Not without smarter routines than we have now.

-[Unknown]
Mar 23 2007
parent reply Walter Bright <newshound digitalmars.com> writes:
Unknown W. Brackets wrote:
 An option, although a bit of work, would be to load the code snippets in 
 a separate request using JavaScript.  Unfortunately, this would require 
 JavaScript (although alternative links could be provided to the same 
 content.)  It's definitely nicer than using iframes, but not good enough...
 
 If you're using JavaScript code to make the translation happen, this 
 might be reasonable.  All it would have to do is reload the original 
 (untranslated) HTML and replace the sections of code with the originals. 
  That wouldn't be too much work, actually.
 
 It's also possible to provide the translation through the server side, 
 using a proxy and caching thus allowing you to control it to your 
 desires... but this would probably be overdoing it.

It sounds like too much work!
 That said, in my previous projects I've been very impressed with 
 community-driven translation efforts.  I mean, we had something like 35 
 volunteer translations of about 250k of text.  That's nothing compared 
 to the probably 1500k of text to be translated for D...

The real problem is that the documentation changes regularly, invalidating the translation work.
 Even so, automatic translation just cannot compare to the real thing. 
 Not without smarter routines than we have now.

I know. But it's:

1) effortless
2) always in sync with the constant changes to the documentation
3) reasonable to expect google will get better at it over time, and such improvements will be automatically incorporated
Mar 23 2007
next sibling parent reply Hasan Aljudy <hasan.aljudy gmail.com> writes:
Walter Bright wrote:
 Unknown W. Brackets wrote:
 Even so, automatic translation just cannot compare to the real thing. 
 Not without smarter routines than we have now.

I know. But it's:

1) effortless
2) always in sync with the constant changes to the documentation
3) reasonable to expect google will get better at it over time, and such improvements will be automatically incorporated

I don't think it's google that wrote the translation engines .. it's probably some other company's 30+ years of work!
Mar 23 2007
parent reply Walter Bright <newshound digitalmars.com> writes:
Hasan Aljudy wrote:
 Walter Bright wrote:
 Unknown W. Brackets wrote:
 Even so, automatic translation just cannot compare to the real thing. 
 Not without smarter routines than we have now.

I know. But it's:

1) effortless
2) always in sync with the constant changes to the documentation
3) reasonable to expect google will get better at it over time, and such improvements will be automatically incorporated

I don't think it's google that wrote the translation engines .. it's probably some other company's 30+ years of work!

You're right, they bought it. But I think they'll continue to improve it, because doing it better can be worth enormous money.
Mar 23 2007
next sibling parent reply Walter Bright <newshound digitalmars.com> writes:
Jan Claeys wrote:
 The Systran software they have licensed (not bought AFAIK) hasn't
 improved in any obvious way since the first time I used it something
 like 10 years ago...

That's disappointing.
 It's often usable if you want to get an impression of what a page talks
 about, but IMHO technical documentation requires accuracy.

I've used it to translate tech stuff from other languages into English, and if one is careful to use it as a guide rather than gospel, it is very useful.

(I once spent time in Japan working on porting software to various Japanese computers. The only tech manuals available were written in Japanese. I don't know more than 10 words of Japanese, but I was amazed at how far I could get in understanding the manuals with just a hint here and there - so I tend to regard even a ludicrously lame google translation as a miracle <g>.)
Mar 25 2007
next sibling parent reply Roberto Mariottini <rmariottini mail.com> writes:
Walter Bright wrote:
 (I once spent time in Japan working on porting software to various 
 Japanese computers. The only tech manuals available were written in 
 Japanese. I don't know more than 10 words of Japanese, but I was amazed 
 at how far I could get in understanding the manuals with just a hint 
 here and there - so I tend to regard even a ludicrously lame google 
 translation as a miracle <g>.)

That's your side of the medal. Since most translation software is made by English-speaking people, translating from some other language to English works better than the reverse. In my experience the reverse doesn't work at all.

It's not a miracle when "web" is translated as "photoreceptor", or any other funny word. The main D page translated into Italian makes no sense, so having it helps no one. Still, I've had some good laughs with it.

Ciao
Mar 27 2007
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Roberto Mariottini wrote:
 That's your side of the medal. Since most translation software is made 
 by English-speaking people, translating from some other language to 
 English works better than the reverse.

I can believe that.
 In my experience the reverse doesn't work at all.
 
 It's not a miracle when "web" is translated as "photoreceptor", or any 
 other funny word. The main D page translated into Italian makes no sense,
 so having it helps no one.

I don't know Italian, but I've worked with German electronics tech stuff auto-translated to English. You quickly figure out that "river" really means "electric current", and "tension" really means "voltage". If your interest is getting your work done, the translators really are an aid. It's surprising how little of a hint one really needs in order to get the information you need out of a chunk of foreign language text. When I worked with the Japanese tech manuals, not only was there no translation software, the stuff was not even in the roman alphabet, but I was able to crack it by looking at the diagrams and things that are universal, like hex numbers, "RS-232", etc.
Mar 27 2007
parent reply Roberto Mariottini <rmariottini mail.com> writes:
Walter Bright wrote:
 I don't know Italian, but I've worked with German electronics tech stuff 
 auto-translated to English. You quickly figure out that "river" really 
 means "electric current", and "tension" really means "voltage". If your 
 interest is getting your work done, the translators really are an aid.

Again, let me disagree. When you are an Italian programmer, you know what a "bug" is. And even if you are speaking in Italian you call it "bug". And also a "debugger" is called a "debugger". Having the translator change these key words to "insect" and "adjustment/tuning program" only adds garbage to the nonsense. And I can also add "template", "thread", "link", "linker" and so on.

Having also the examples "translated" is another big problem.
 It's surprising how little of a hint one really needs in order to get 
 the information you need out of a chunk of foreign language text. When I 
 worked with the Japanese tech manuals, not only was there no translation 
 software, the stuff was not even in the roman alphabet, but I was able 
 to crack it by looking at the diagrams and things that are universal, 
 like hex numbers, "RS-232", etc.

Let me add that an average Italian programmer knows enough English to read programming manuals. Maybe you didn't notice, but none of the most successful IDEs has been translated into Italian, and so no Italian documentation has been written for them.

I suggest revising your English documentation instead: make it simpler and you'll get more non-native speakers.

Another hint: I use automatic translators to check that they can get my English right. I copy and paste my English text into the translator and see if it can output acceptable Italian. Often the problem can be resolved simply:
 - adding a comma or changing the order of the words
 - using active form instead of passive
 - adding some clarifying "of" or "to" or "that"
 - using a synonym that the translator likes more

For example changing the problematic paragraph:
"D is statically typed, and compiles direct to native code. It's multiparadigm: supporting imperative, object oriented, and template metaprogramming styles. It's a member of the C syntax family, and its look and feel is very close to C++'s. For a quick feature comparison, see this comparison of D with C, C++, C# and Java."

To the more easily translatable:
"D is a statically typed programming language, and compiles directly to machine code. It's multiparadigm, supporting many programming styles: imperative, object oriented, and metaprogramming. It's a member of the C syntax family, and its appearance is very similar to that of C++. For a quick comparison of the features, see this comparison of D with C, C++, C# and Java."

Leads to something that is more comprehensible in Italian and French (I'm not sure it's correct English, though).

What I did:
"Native code" was translated as "code [belonging to one by birth]", so I changed it to "machine code".
The second sentence had to be reordered because it was problematic: "styles" was incorrectly associated with "metaprogramming" and "supporting" with "template".
"Template" had to be removed: I found no way to get this word right.
"Look and feel" had to be substituted with "appearance" in order to not get "sight and (tactile) sensation".
"Very close to C++'s" had to be reworded as "very similar to that of C++" to not get "near [in space] to C++".

Still, "statically typed" is translated as "statically [type]written". I have no clue on this.

Ciao
Mar 30 2007
next sibling parent Daniel Keep <daniel.keep.lists gmail.com> writes:
Roberto Mariottini wrote:
 [snip]
 
 Ciao

It could be worse...

<being-silly>

We could always just write everything in RDF: that way machine translators wouldn't have a problem! And look at how *precise* everything is:

"D is a statically typed programming language, and compiles directly to machine code."

<rdf:Description rdf:about="http://www.digitalmars.com/d/"
    xmlns:pl="http://lambda-the-ultimate.org/ProgrammingLanguage#">
  <pl:type-system>static</pl:type-system>
  <pl:compiles-to>native</pl:compiles-to>
  ...

"It's multiparadigm, supporting many programming styles: imperative, object oriented, and metaprogramming."

  <pl:paradigms>
    <pl:ParadigmList>
      <pl:paradigm>imperative</pl:paradigm>
      <pl:paradigm>object-oriented</pl:paradigm>
      <pl:paradigm>metaprogramming</pl:paradigm>
    </pl:ParadigmList>
  </pl:paradigms>
  ...

</being-silly>

I'm curious as to whether this sort of ambiguity is a problem for other languages? Is it much easier to translate *to* English than *from* it?

-- Daniel

--
int getRandomNumber()
{
    return 4; // chosen by fair dice roll.
              // guaranteed to be random.
}

http://xkcd.com/

v2sw5+8Yhw5ln4+5pr6OFPma8u6+7Lw4Tm6+7l6+7D
i28a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP http://hackerkey.com/
Mar 30 2007
prev sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Roberto Mariottini wrote:
 Walter Bright wrote:
 I don't know Italian, but I've worked with German electronics tech 
 stuff auto-translated to English. You quickly figure out that "river" 
 really means "electric current", and "tension" really means "voltage". 
 If your interest is getting your work done, the translators really are 
 an aid.

Again, let me disagree. When you are an Italian programmer, you know what a "bug" is. And even if you are speaking in Italian you call it "bug". And also a "debugger" is called a "debugger". Having the translator change these key words to "insect" and "adjustment/tuning program" only adds garbage to the nonsense. And I can also add "template", "thread", "link", "linker" and so on.

It certainly would be helpful if there was a way to tag some terms as "don't translate".
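
Lacking such a tag, one workaround would be to mask the terms before translation and unmask them afterwards. The Python sketch below is only an illustration under stated assumptions: the glossary is a hypothetical list built from the words mentioned in this thread, and it assumes the translator leaves bracketed tokens such as `[[0]]` alone, which is not something any translation service is known to guarantee.

```python
import re

# Hypothetical glossary: terms that should stay in English.
KEEP_AS_IS = ["bug", "debugger", "template", "thread", "link", "linker"]

TOKEN = re.compile(r"\[\[(\d+)\]\]")


def mask_terms(text, terms=KEEP_AS_IS):
    """Replace each glossary term with a numbered token.

    Returns the masked text and the list of original spellings.
    """
    # Try longer terms first so "linker" is preferred over "link".
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in ordered) + r")\b",
        re.IGNORECASE,
    )
    found = []

    def stash(match):
        found.append(match.group(0))
        return "[[{}]]".format(len(found) - 1)

    return pattern.sub(stash, text), found


def unmask_terms(text, found):
    """Put the original terms back in place of the tokens."""
    return TOKEN.sub(lambda m: found[int(m.group(1))], text)


masked, found = mask_terms("Use the debugger to find the bug.")
print(masked)
print(unmask_terms(masked, found))
```

The masked sentence would be sent to the translator, and `unmask_terms` applied to whatever comes back. Word order may change in translation, which is why the tokens are numbered rather than restored by position.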
 Having also the examples "translated" is another big problem.

I agree. "translated" code samples are just garbage. That's why I asked earlier if there was a way to mark sections as "don't translate". Unfortunately, there doesn't seem to be a way.
 It's surprising how little of a hint one really needs in order to get 
 the information you need out of a chunk of foreign language text. When 
 I worked with the Japanese tech manuals, not only was there no 
 translation software, the stuff was not even in the roman alphabet, 
 but I was able to crack it by looking at the diagrams and things that 
 are universal, like hex numbers, "RS-232", etc.

Let me add that an average Italian programmer knows enough English to read programming manuals.

I'm sure that's true of most programmers. But still, there seems to be a demand for foreign language versions of the docs, as a couple people have made the effort to do them.
 Maybe you didn't notice, but none of the most
 successful IDEs has been translated into Italian, and so no Italian
 documentation has been written for them.

Most of the interest in translations seems to come from Spanish, Portuguese and Japanese programmers. I have no idea if this is coincidence or not.
 I suggest revising your English documentation instead: make it simpler
 and you'll get more non-native speakers.

 Another hint: I use automatic translators to check that they can get
 my English right. I copy and paste my English text into the translator
 and see if it can output acceptable Italian. Often the problem can be
 resolved simply:
  - adding a comma or changing the order of the words
  - using active form instead of passive
  - adding some clarifying "of" or "to" or "that"
  - using a synonym that the translator likes more

That's a great suggestion, but I am nowhere near proficient enough in another language to make this work.
 For example changing the problematic paragraph:
 "D is statically typed, and compiles direct to native code. It's 
 multiparadigm: supporting imperative, object oriented, and template 
 metaprogramming styles. It's a member of the C syntax family, and its 
 look and feel is very close to C++'s. For a quick feature comparison, 
 see this comparison of D with C, C++, C# and Java."
 
 To the more easily translatable:
 "D is a statically typed programming language, and compiles directly to 
 machine code. It's multiparadigm, supporting many programming styles: 
 imperative, object oriented, and metaprogramming. It's a member of the C 
 syntax family, and its appearance is very similar to that of C++. For a 
 quick comparison of the features, see this comparison of D with C, C++, 
 C# and Java."
 
 Leads to something that is more comprehensible in Italian and French 
 (I'm not sure it's correct English, though).

I'll make the changes.
Mar 31 2007
next sibling parent reply Hasan Aljudy <hasan.aljudy gmail.com> writes:
Walter Bright wrote:
 Roberto Mariottini wrote:
 For example changing the problematic paragraph:
 "D is statically typed, and compiles direct to native code. It's 
 multiparadigm: supporting imperative, object oriented, and template 
 metaprogramming styles. It's a member of the C syntax family, and its 
 look and feel is very close to C++'s. For a quick feature comparison, 
 see this comparison of D with C, C++, C# and Java."

 To the more easily translatable:
 "D is a statically typed programming language, and compiles directly 
 to machine code. It's multiparadigm, supporting many programming 
 styles: imperative, object oriented, and metaprogramming. It's a 
 member of the C syntax family, and its appearance is very similar to 
 that of C++. For a quick comparison of the features, see this 
 comparison of D with C, C++, C# and Java."

 Leads to something that is more comprehensible in Italian and French 
 (I'm not sure it's correct English, though).

I'll make the changes.

It might also be somewhat helpful if you could change all or most references to "D" to "The D Language" or something, at least at the beginning of a paragraph.
Mar 31 2007
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Hasan Aljudy wrote:
 
 
 Walter Bright wrote:
 Roberto Mariottini wrote:
 For example changing the problematic paragraph:
 "D is statically typed, and compiles direct to native code. It's 
 multiparadigm: supporting imperative, object oriented, and template 
 metaprogramming styles. It's a member of the C syntax family, and its 
 look and feel is very close to C++'s. For a quick feature comparison, 
 see this comparison of D with C, C++, C# and Java."

 To the more easily translatable:
 "D is a statically typed programming language, and compiles directly 
 to machine code. It's multiparadigm, supporting many programming 
 styles: imperative, object oriented, and metaprogramming. It's a 
 member of the C syntax family, and its appearance is very similar to 
 that of C++. For a quick comparison of the features, see this 
 comparison of D with C, C++, C# and Java."

 Leads to something that is more comprehensible in Italian and French 
 (I'm not sure it's correct English, though).

I'll make the changes.

It might also be somewhat helpful if you could change all or most references to "D" to "The D Language" or something, at least at the beginning of a paragraph.

The problem is it becomes "The D language is a statically typed programming language...", kind of redundant.
Mar 31 2007
parent Walter Bright <newshound1 digitalmars.com> writes:
Brad Roberts wrote:
 Walter Bright wrote:
 The problem is it becomes "The D language is a statically typed 
 programming language...", kind of redundant.

The D language is statically typed and compiles directly to machine code.

Why didn't I think of that? <g>
Mar 31 2007
prev sibling next sibling parent Sean Kelly <sean f4.ca> writes:
Walter Bright wrote:
 
 I agree. "translated" code samples are just garbage. That's why I asked 
 earlier if there was a way to mark sections as "don't translate". 
 Unfortunately, there doesn't seem to be a way.

It would be enough simply not to translate text enclosed in <pre> tags.

Sean
Mar 31 2007
prev sibling next sibling parent reply Georg Wrede <georg nospam.org> writes:
Walter Bright wrote:
 It certainly would be helpful if there was a way to tag some terms as 
 "don't translate".

Try putting an X in front of the undesired translations:

int house;
person friend;

vs.

int Xhouse;
person Xfriend;

or something like this.
Mar 31 2007
parent Derek Parnell <derek psych.ward> writes:
On Sun, 01 Apr 2007 03:24:46 +0300, Georg Wrede wrote:

 Walter Bright wrote:
 It certainly would be helpful if there was a way to tag some terms as 
 "don't translate".

Try putting an X in front of the undesired translations:

int house;
person friend;

vs.

int Xhouse;
person Xfriend;

or something like this.

I already do this using my in-house naming convention. Basically, the aim is to make sure that English words are never used as identifiers.

int lHouse;
int lFriend;

The 'l' stands for /l/ocal scope.

--
Derek Parnell
Melbourne, Australia
"Justice for David Hicks!"
skype: derek.j.parnell
Apr 01 2007
prev sibling parent Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
Walter Bright wrote:
 
 Maybe you didn't notice, but none of the most successful IDEs has been
 translated into Italian, and so no Italian documentation has been 
 written for them.

Most of the interest in translations seems to come from Spanish, Portuguese and Japanese programmers. I have no idea if this is coincidence or not.

Just a minor correction: Portuguese is spoken (mainly) in Portugal and Brazil, and that translation on the D site was made by a Brazilian programmer, not a Portuguese one, because it's written in Brazilian Portuguese.

Regarding the interest in translation, it varies from country to country. As we all know, Japanese people in general don't speak English well or at all. In Spain, speaking English is not that common either (although not to the same degree as in Japan). In Portugal people are more familiar with English than in Spain. It's common in Portugal that technical people, or young people, are comfortable understanding, or even speaking, English, while in Spain it's uncommon. An interesting thing to note: in Spain foreign movies are dubbed, in Portugal they are not. I think that is a factor in how receptive a population is to a foreign (English) language. In Brazil I'm not sure what it's like, although I think people are just as familiar with English as in Portugal.

--
Bruno Medeiros - MSc in CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 01 2007
prev sibling parent Brad Roberts <braddr puremagic.com> writes:
Walter Bright wrote:
 Hasan Aljudy wrote:
 Walter Bright wrote:
 Roberto Mariottini wrote:
 For example changing the problematic paragraph:
 "D is statically typed, and compiles direct to native code. It's 
 multiparadigm: supporting imperative, object oriented, and template 
 metaprogramming styles. It's a member of the C syntax family, and 
 its look and feel is very close to C++'s. For a quick feature 
 comparison, see this comparison of D with C, C++, C# and Java."

 To the more easily translatable:
 "D is a statically typed programming language, and compiles directly 
 to machine code. It's multiparadigm, supporting many programming 
 styles: imperative, object oriented, and metaprogramming. It's a 
 member of the C syntax family, and its appearance is very similar to 
 that of C++. For a quick comparison of the features, see this 
 comparison of D with C, C++, C# and Java."

 Leads to something that is more comprehensible in Italian and French 
 (I'm not sure it's correct English, though).

I'll make the changes.

It might also be somewhat helpful if you could change all or most references to "D" to "The D Language" or something, at least at the beginning of a paragraph.

The problem is it becomes "The D language is a statically typed programming language...", kind of redundant.

The D language is statically typed and compiles directly to machine code.
Mar 31 2007
prev sibling parent reply Hasan Aljudy <hasan.aljudy gmail.com> writes:
Jan Claeys wrote:
 Op Fri, 23 Mar 2007 15:46:24 -0700
 schreef Walter Bright <newshound digitalmars.com>:
 
 Hasan Aljudy wrote:

 I don't think it's google that wrote the translation engines ..
 it's probably some other company's 30+ years of work!  

 You're right, they bought it. But I think they'll continue to improve
 it, because doing it better can be worth enormous money.

The Systran software they have licensed (not bought AFAIK) hasn't improved in any obvious way since the first time I used it something like 10 years ago... It's often usable if you want to get an impression of what a page talks about, but IMHO technical documentation requires accuracy. E.g., something like "Objets de classe d'Instantiating ailleurs que le tas de CHROMATOGRAPHIE GAZEUSE" is complete nonsense if you are talking about D. ;-)

Actually I was looking up "free statistical translation" (or something like that) in Google, when I discovered a Google Blog entry stating that Google now uses a statistical model for translating Arabic and Chinese (I think all languages labeled BETA use that model now)

http://googleresearch.blogspot.com/2006/04/statistical-machine-translation-live.html

and, interestingly enough, you can now "suggest a better translation" for any piece of text that Google translates! I'm guessing it goes through some sort of filtering mechanism then gets passed to the statistical engine.

http://googleblog.blogspot.com/2007/03/suggest-better-translation.html

I've found that translating news articles from Arabic to English gives very good results .. However, translating technical articles from English to Arabic gives the crappiest results!! I guess it all depends on what they feed the statistical engine.

Try it on aljazeera.net or something .. I think you'll be amazed; I was. I never thought there'd be any hope for "reasonable" machine translation involving Arabic, and I happily admit that I've been proved wrong!
Mar 26 2007
next sibling parent reply Walter Bright <newshound digitalmars.com> writes:
Hasan Aljudy wrote:
 and, interestingly enough, you can now "suggest a better translation" 
 for any piece of text that Google translates! I'm guessing it goes 
 through some sort of filtering mechanism then gets passed to the 
 statistical engine.
 
 http://googleblog.blogspot.com/2007/03/suggest-better-translation.html

Once again, harnessing the power of crowds! I wish they'd do the new translator for the rest of the languages, too.
Mar 26 2007
parent Hasan Aljudy <hasan.aljudy gmail.com> writes:
Walter Bright wrote:
 Hasan Aljudy wrote:
 and, interestingly enough, you can now "suggest a better translation" 
 for any piece of text that Google translates! I'm guessing it goes 
 through some sort of filtering mechanism then gets passed to the 
 statistical engine.

 http://googleblog.blogspot.com/2007/03/suggest-better-translation.html

Once again, harnessing the power of crowds! I wish they'd do the new translator for the rest of the languages, too.

I'm going over some small things, suggesting translations .. Unfortunately the suggested translation doesn't show up instead of the messed up one; I'm not sure what they do with the translations that people suggest.

I think it'd be awesome if there was a wiki-like translation system with such ajax powers .. man that would make collaborative translation very fun.

heh .. some translations are so funny ..
"Example: washroom" (yes, wc was translated to washroom!)

The real problem with translating technical papers is that some terms simply have no agreed-upon translation. "Lazy Evaluation" for example .. heck .. I don't know how to translate that! If I was writing an article about it in Arabic, I'd simply leave it untranslated; that's what I've always done with terms that don't have a translation.

Oh, gotta love this one .. "variadic templates"!! I don't even know what variadic means to begin with; how am I gonna translate it? lol! it seems that machine translator doesn't know either; it just transliterated it; it also transliterated "tuples" and "mixins".

"D is statically typed, and compiles direct to native code." was translated as:
"Dal static printed, and he collects direct for patriot law"
Where "Dal" is the Arabic letter that makes the "D" sound.
Mar 26 2007
prev sibling parent janderson <askme me.com> writes:
Hasan Aljudy wrote:
 
 
 Jan Claeys wrote:
 Op Fri, 23 Mar 2007 15:46:24 -0700
 schreef Walter Bright <newshound digitalmars.com>:

 Hasan Aljudy wrote:

 I don't think it's google that wrote the translation engines ..
 it's probably some other company's 30+ years of work!  

You're right they bought it. But I think they'll continue to improve it, because doing it better can be worth enormous money.

The Systran software they have licensed (not bought AFAIK) hasn't improved in any obvious way since the first time I used it something like 10 years ago... It's often usable if you want to get an impression of what a page talks about, but IMHO technical documentation requires accuracy. E.g., something like "Objets de classe d'Instantiating ailleurs que le tas de CHROMATOGRAPHIE GAZEUSE" is complete nonsense if you are talking about D. ;-)

Actually I was looking up "free statistical translation" (or something like that) in Google, when I discovered a Google Blog entry stating that Google now uses a statistical model for translating Arabic and Chinese (I think all languages labeled BETA use that model now)

http://googleresearch.blogspot.com/2006/04/statistical-machine-translation-live.html

and, interestingly enough, you can now "suggest a better translation" for any piece of text that Google translates! I'm guessing it goes through some sort of filtering mechanism then gets passed to the statistical engine.

http://googleblog.blogspot.com/2007/03/suggest-better-translation.html

This is cool. Perhaps someone will submit better-translated D pages. Then, with small changes, hopefully the translation would stay reasonably decent.
 
 I've found that translating news articles from Arabic to English gives 
 very good results ..
 However, translating technical articles from English to Arabic gives the 
 crappiest results!! I guess it all depends on what they feed the 
 statistical engine.
 
 Try it on aljazeera.net or something .. I think you'll be amazed; I was. 
 I never thought there'd be any hope for "reasonable" machine translation 
 involving Arabic, and I happily admit that I've been proved wrong!

Mar 26 2007
prev sibling parent "Unknown W. Brackets" <unknown simplemachines.org> writes:
Actually, this was true of my project as well.  Not all of it, of 
course, and we used versioning to see what had changed... but large 
segments could change between releases and things always had to be 
retranslated.
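
The change tracking mentioned above can be sketched roughly as follows. This is only an illustration under assumptions, not the project's actual process: it assumes the documentation is split into keyed segments and that a fingerprint of each source segment was stored when it was last translated. The function names and segment keys are hypothetical.

```python
import hashlib


def fingerprint(text):
    """Return a stable fingerprint of a source segment (SHA-256 of its UTF-8 bytes)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def segments_needing_retranslation(current_source, translated_fingerprints):
    """List the segment ids whose source text is new or has changed.

    current_source: dict mapping segment id -> current source text.
    translated_fingerprints: dict mapping segment id -> fingerprint of the
        source text as it was when the segment was last translated.
    """
    return sorted(
        seg_id
        for seg_id, text in current_source.items()
        if translated_fingerprints.get(seg_id) != fingerprint(text)
    )


# Hypothetical example: one segment unchanged, one edited, one newly added.
old_source = {"intro": "D is a systems language.", "install": "Unpack the zip."}
stored = {seg_id: fingerprint(text) for seg_id, text in old_source.items()}

new_source = {
    "intro": "D is a systems language.",
    "install": "Unpack the zip and set PATH.",
    "faq": "See the newsgroup.",
}
print(segments_needing_retranslation(new_source, stored))
```

Only the listed segments go back to the translators; everything else keeps its existing translation.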

Of course, we tried to minimize this... but the volunteers were always 
willing to keep at it!

Effortless is as effortless does, in the end.

-[Unknown]


 The real problem is that the documentation changes regularly, 
 invalidating the translation work.
 
 Even so, automatic translation just cannot compare to the real thing. 
 Not without smarter routines than we have now.

I know. But it's:
1) effortless
2) always in sync with the constant changes to the documentation
3) reasonable to expect google will get better at it over time, and such improvements will be automatically incorporated

Mar 23 2007
prev sibling next sibling parent reply Hasan Aljudy <hasan.aljudy gmail.com> writes:
Walter Bright wrote:
 I've been looking into adding buttons to the D web pages to do automatic 
 translation to different languages. The trouble is, the google 
 translator also attempts to translate the code blocks, resulting in a mess.
 
 Is there a css tag, hack, or trick to convince google translator to skip 
 those sections?

Seeing as how automated translation is horrible at best, might as well just drop the whole idea ...
Mar 23 2007
next sibling parent reply Max Samukha <samukha voliacable.com> writes:
On Fri, 23 Mar 2007 02:08:13 -0600, Hasan Aljudy
<hasan.aljudy gmail.com> wrote:

Walter Bright wrote:
 I've been looking into adding buttons to the D web pages to do automatic 
 translation to different languages. The trouble is, the google 
 translator also attempts to translate the code blocks, resulting in a mess.
 
 Is there a css tag, hack, or trick to convince google translator to skip 
 those sections?

Seeing as how automated translation is horrible at best, might as well just drop the whole idea ...

Agree. Most automated translations are funny and unreadable for native speakers (much funnier than my English writing). IMO, people won't read them at all, or will have fun and go away. It's definitely preferable to use translations done by native speakers. BTW, google translator unacceptably distorts the meaning of the original Russian text (the translator is in beta, but I don't think it will get improved much).
Mar 23 2007
parent Hasan Aljudy <hasan.aljudy gmail.com> writes:
Max Samukha wrote:
 On Fri, 23 Mar 2007 02:08:13 -0600, Hasan Aljudy
 <hasan.aljudy gmail.com> wrote:
 
 Walter Bright wrote:
 I've been looking into adding buttons to the D web pages to do automatic 
 translation to different languages. The trouble is, the google 
 translator also attempts to translate the code blocks, resulting in a mess.

 Is there a css tag, hack, or trick to convince google translator to skip 
 those sections?

Seeing as how automated translation is horrible at best, might as well just drop the whole idea ...

Agree. Most automated translations are funny and unreadable for native speakers (much funnier than my English writing). IMO, people won't read them at all, or will have fun and go away. It's definitely preferable to use translations done by native speakers. BTW, google translator unacceptably distorts the meaning of the original Russian text (the translator is in beta, but I don't think it will get improved much).

European languages might have a chance (since they supposedly are close to English .. somehow). I don't know about Russian, and I'm not sure about Japanese either, since its grammar is totally unrelated to English .. but it's especially with Arabic that I know it's impossible to get a good auto-translator anytime in the near future.
Mar 23 2007
prev sibling parent reply Walter Bright <newshound digitalmars.com> writes:
Hasan Aljudy wrote:
 Seeing as how automated translation is horrible at best, might as well 
 just drop the whole idea ...

I've experimented with the German=>English translations (see http://www.generalatomic.com/teil1/index.html) and the result, while horrible, is decipherable. It's better (much better) than nothing. It's also kinda fun :-)
Mar 23 2007
parent reply janderson <askme me.com> writes:
Walter Bright wrote:
 Hasan Aljudy wrote:
 Seeing as how automated translation is horrible at best, might as well 
 just drop the whole idea ...

I've experimented with the German=>English translations (see http://www.generalatomic.com/teil1/index.html) and the result, while horrible, is decipherable. It's better (much better) than nothing. It's also kinda fun :-)

Some of the English is very hard to understand, while other parts are surprisingly well translated. For instance, this segment I thought was rather good:

We take the glow small lamp from the box and connect the connections with the connections + and -4.5 V (fig. 2). Then we switch the switch on S6. The small lamp will brightly burn. The thin wire contained in the small lamp now forms a connection between the power source connections, and it comes off a strong electron flow, which is so strong that the electrons rub against the atoms of the thread very strongly 1). Friction however produces as well known warmth; it becomes so strong in our case that the thread begins glowing and light produced. The river flows thereby in such a way, like it into fig. 3 registered is i.e., from the negative pole thick with electrons over the small lamp to the positive terminal.

-Joel
Mar 23 2007
parent Walter Bright <newshound digitalmars.com> writes:
janderson wrote:
 Walter Bright wrote:
 Hasan Aljudy wrote:
 Seeing as how automated translation is horrible at best, might as 
 well just drop the whole idea ...

I've experimented with the German=>English translations (see http://www.generalatomic.com/teil1/index.html) and the result, while horrible, is decipherable. It's better (much better) than nothing. It's also kinda fun :-)

Some of the English is very hard to understand, while other parts are surprisingly well translated. For instance, this segment I thought was rather good:

We take the glow small lamp from the box and connect the connections with the connections + and -4.5 V (fig. 2). Then we switch the switch on S6. The small lamp will brightly burn. The thin wire contained in the small lamp now forms a connection between the power source connections, and it comes off a strong electron flow, which is so strong that the electrons rub against the atoms of the thread very strongly 1). Friction however produces as well known warmth; it becomes so strong in our case that the thread begins glowing and light produced. The river flows thereby in such a way, like it into fig. 3 registered is i.e., from the negative pole thick with electrons over the small lamp to the positive terminal.

Yes, I sure wish I had these translations as a kid (when I got the Kosmos set).
Mar 23 2007
prev sibling next sibling parent reply Roberto Mariottini <rmariottini mail.com> writes:
Walter Bright wrote:
 I've been looking into adding buttons to the D web pages to do automatic 
 translation to different languages. The trouble is, the google 
 translator also attempts to translate the code blocks, resulting in a mess.

The real trouble is that these translators are not good enough for 'production'.

Here in Italy we laugh at automatically translated sites; we even point them out to friends by e-mail. One of the funniest was the now non-working it.mp3u.com, where you could find some pearls like:

"100% risk free" => "100% rischia liberamente"

which means "you risk 100% freely" (ROTFL), and should be "Libero da rischi al 100%".

Ciao

P.S.: Google Translate gives "rischio di 100% liberamente", which means "risk of 100% freely". LOL
Mar 23 2007
next sibling parent reply Max Samukha <samukha voliacable.com> writes:
On Fri, 23 Mar 2007 10:21:55 +0100, Roberto Mariottini
<rmariottini mail.com> wrote:

Walter Bright wrote:
 I've been looking into adding buttons to the D web pages to do automatic 
 translation to different languages. The trouble is, the google 
 translator also attempts to translate the code blocks, resulting in a mess.

The real trouble is that these translators are not good enough for 'production'.

Here in Italy we laugh at automatically translated sites; we even point them out to friends by e-mail. One of the funniest was the now non-working it.mp3u.com, where you could find some pearls like:

"100% risk free" => "100% rischia liberamente"

which means "you risk 100% freely" (ROTFL), and should be "Libero da rischi al 100%".

Ciao

P.S.: Google Translate gives "rischio di 100% liberamente", which means "risk of 100% freely". LOL

I suggest people in the NG have the D site's front page autotranslated into their language. I just say no to the Russian translation. Some extracts translated back into English:

"Michael the Great is the only thing I need."
"Segfault D is the language of systems." (I like this one:))
Portability turns into mobility, of course...
"metaprogramming styles" - "styles of the zodiac"...

How about other languages?
Mar 23 2007
next sibling parent Roberto Mariottini <rmariottini mail.com> writes:
Max Samukha wrote:
[...]
 
 I suggest people in the NG have the D site's front page autotranslated
 into their language. I just say no to the Russian translation. Some
 extracts translated back into English:
 
 "Michael the Great is the only thing I need."
 "Segfault D is the language of systems."  (I like this one:))
 Portability turns into mobility, of course...
 "metaprogramming styles" - "styles of the zodiac"...
 
 How about other languages?

http://www.google.com/translate?u=http%3A%2F%2Fwww.digitalmars.com%2Fd%2F&langpair=en%7Cit&hl=en&ie=UTF8

LOL:

"web" -> "photoreceptor" (???)
"just what I need" -> "Only a moment ago what I need"
"supporting imperative, object oriented, and template metaprogramming styles" -> "styles imperative metaprogramming, objectively oriented and of the supporting (thin metal plate with a cut pattern)"
"D change log" -> "D changes the log (of wood)"
"tech tips" -> "ends of technology"
"Issues and bugs" -> "Editions and insects"
"Code coverage" -> "Coding the filling"
"Last update Sun Feb 4 12:10:26 2007" -> "Last sun (the star) Feb 4 12 of the update: 10: 26 2007"
"Home" -> "Domestic"

Some errors are only a little misleading:

"download" -> "transfer from the central system to the satellites"
"Exception safety" -> "Exceptional safety"
"(programming) language" -> "(natural) language"

Some sentences are completely meaningless. Note also that the whole example is on one single line.

As a side note, I can say the translator keeps getting better, decade by decade. It got "Programming language" and "Style guide" right, while "Debugger" is translated as "adjustment/tuning program" (the right translation is "debugger"). Maybe in 30-60 years it can produce acceptable output.

Ciao
Mar 23 2007
prev sibling parent reply =?ISO-8859-1?Q?Jari-Matti_M=E4kel=E4?= <jmjmak utu.fi.invalid> writes:
Max Samukha wrote:
 On Fri, 23 Mar 2007 10:21:55 +0100, Roberto Mariottini
 <rmariottini mail.com> wrote:
 Walter Bright wrote:
 I've been looking into adding buttons to the D web pages to do automatic 
 translation to different languages. The trouble is, the google 
 translator also attempts to translate the code blocks, resulting in a mess.

The real trouble is that these translators are not good enough for 'production'.

 Here in Italy we laugh at automatically translated sites; we even point
 them out to friends by e-mail.


 I suggest people in the NG have the D site's front page autotranslated
 into their language. 

 How about other languages?

Google Translate does not support Finnish yet, but I found www.tranexp.com. Well, it does look a bit hilarious :D

Many of these online translators don't care about context at all. They simply use a stupid mapping of original words to some bad translations. Ok, here's part of the machine-translated front page, translated back to English by me:

---
[Logo] It takes to synchronize Sun (<- the one that shines in the sky) ...

D-LETTER Programming Use of language

<snip>

"excellent, reasonable what I-LETTER misses.. another D-LETTER home programming" - Segfault

<snip>

You note: that's right, D-LETTER user accepts that staying with downloading and treating D-letter, that is book knowledge D-LETTER eyeglasses, he or she post statute explicit recognizes something redeem so that intellectual real estate exactly with the help of publishing right aka patent evaluation at home something keep synchronized, that is emailed feedback fly so that Digital Mars.
--- end of bad translation

Probably the best part is the code example. I could not translate it w/o LMAO.
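
The context-free word mapping described above can be illustrated with a toy sketch. The lexicon below is hypothetical and hand-picked so that it reproduces the "100% risk free" example quoted earlier in this thread; it is not how tranexp.com, Google Translate, or any other real service is implemented.

```python
# Toy word-for-word "translator": every word is looked up in isolation,
# so the surrounding words never influence which sense is chosen.
# The lexicon is hypothetical and deliberately tiny.
TOY_LEXICON = {
    "risk": "rischia",      # a verb form is picked, although a noun is meant
    "free": "liberamente",  # "freely" is picked, although "free of" is meant
}


def translate_word_by_word(text, lexicon):
    """Replace each whitespace-separated word on its own; unknown words pass through."""
    return " ".join(lexicon.get(word.lower(), word) for word in text.split())


print(translate_word_by_word("100% risk free", TOY_LEXICON))
```

Because "risk free" is never treated as a unit, the output is "100% rischia liberamente" rather than something like "Libero da rischi al 100%".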
Mar 23 2007
parent reply Deewiant <deewiant.doesnotlike.spam gmail.com> writes:
Jari-Matti Mäkelä wrote:
 Google Translate does not support Finnish yet, but I found
 www.tranexp.com. Well, it does look a bit hilarious :D
 

InterTran's Finnish translation is a joke. I don't know what its logic is in translating the English "I" to "I-KIRJAIN" (literally "THE LETTER I") instead of "minä". That's one of the most basic words in either language, and can't be that hard to get right.

Run its Finnish->English on the Kalevala for a laugh:
http://www.sacred-texts.com/neu/kvfin/

Or the English->Finnish, if you're feeling lucky:
http://www.sacred-texts.com/neu/kveng/

I love the way "tuo sotka, sorea lintu" (approximately "that duck, graceful bird"; the English translation has "a beauteous duck") becomes "yonder bitch, pretty bird".
Mar 23 2007
parent Jari-Matti =?ISO-8859-1?Q?M=E4kel=E4?= <jmjmak utu.fi.invalid> writes:
Deewiant wrote:

 Jari-Matti Mäkelä wrote:
 Google Translate does not support Finnish yet, but I found
 www.tranexp.com. Well, it does look a bit hilarious :D
 

InterTran's Finnish translation is a joke. I don't know what its logic is in translating the English "I" to "I-KIRJAIN" (literally "THE LETTER I") instead of "minä". That's one of the most basic words in either language, and can't be that hard to get right.

I guess they must have had some joker as a summer trainee there. :) It's a bit unfortunate that some sites even use that kind of service to serve localized pages.

-- 
The spirit is willing but the flesh is weak.
The vodka is strong but the meat is rotten.
Mar 24 2007
prev sibling parent reply Jascha Wetzel <"[firstname]" mainia.de> writes:
here is an article about a system that's supposed to work - it considers
context. google's translator obviously doesn't.

http://www.wired.com/wired/archive/14.12/translate.html

Roberto Mariottini wrote:
 Walter Bright wrote:
 I've been looking into adding buttons to the D web pages to do
 automatic translation to different languages. The trouble is, the
 google translator also attempts to translate the code blocks,
 resulting in a mess.

The real trouble is that these translators are not good enough for 'production'.

Here in Italy we laugh at automatically translated sites; we even point them out to friends by e-mail. One of the funniest was the now non-working it.mp3u.com, where you could find some pearls like:

"100% risk free" => "100% rischia liberamente"

which means "you risk 100% freely" (ROTFL), and should be "Libero da rischi al 100%".

Ciao

P.S.: Google Translate gives "rischio di 100% liberamente", which means "risk of 100% freely". LOL

Mar 23 2007
next sibling parent Walter Bright <newshound digitalmars.com> writes:
Jascha Wetzel wrote:
 here is an article about a system that's supposed to work - it considers
 context. google's translator obviously doesn't.
 
 http://www.wired.com/wired/archive/14.12/translate.html

It's a very interesting technology. It could be a coup for Google to buy it and incorporate it.
Mar 23 2007
prev sibling parent reply Hasan Aljudy <hasan.aljudy gmail.com> writes:
Jascha Wetzel wrote:
 here is an article about a system that's supposed to work - it considers
 context. google's translator obviously doesn't.
 
 http://www.wired.com/wired/archive/14.12/translate.html

Google actually uses this system for a couple of languages now .. (see my other post today .. (damn, how do you get the post links from Thunderbird?))
 
 Roberto Mariottini wrote:
 Walter Bright wrote:
 I've been looking into adding buttons to the D web pages to do
 automatic translation to different languages. The trouble is, the
 google translator also attempts to translate the code blocks,
 resulting in a mess.

The real trouble is that these translators are not good enough for 'production'.

Here in Italy we laugh at automatically translated sites; we even point them out to friends by e-mail. One of the funniest was the now non-working it.mp3u.com, where you could find some pearls like:

"100% risk free" => "100% rischia liberamente"

which means "you risk 100% freely" (ROTFL), and should be "Libero da rischi al 100%".

Ciao

P.S.: Google Translate gives "rischio di 100% liberamente", which means "risk of 100% freely". LOL


Mar 26 2007
parent Hasan Aljudy <hasan.aljudy gmail.com> writes:
Hasan Aljudy wrote:
 Jascha Wetzel wrote:
 here is an article about a system that's supposed to work - it considers
 context. google's translator obviously doesn't.

 http://www.wired.com/wired/archive/14.12/translate.html

Google actually uses this system for a couple of languages now .. (see my other post today .. (damn, how do you get the post links from Thunderbird?))

I guess: news://news.digitalmars.com:119/eu9lfk$1ehf$1 digitalmars.com
Mar 26 2007
prev sibling next sibling parent Jan Claeys <usenet janc.be> writes:
Op Fri, 23 Mar 2007 15:46:24 -0700
schreef Walter Bright <newshound digitalmars.com>:

 Hasan Aljudy wrote:

 I don't think it's google that wrote the translation engines ..
 it's probably some other company's 30+ years of work!  

You're right they bought it. But I think they'll continue to improve it, because doing it better can be worth enormous money.

The Systran software they have licensed (not bought AFAIK) hasn't improved in any obvious way since the first time I used it something like 10 years ago... It's often usable if you want to get an impression of what a page talks about, but IMHO technical documentation requires accuracy. E.g., something like "Objets de classe d'Instantiating ailleurs que le tas de CHROMATOGRAPHIE GAZEUSE" is complete nonsense if you are talking about D. ;-) -- JanC
Mar 25 2007
prev sibling next sibling parent Jan Claeys <usenet janc.be> writes:
Op Tue, 27 Mar 2007 16:23:58 +0200
schreef Roberto Mariottini <rmariottini mail.com>:

 That's your side of the medal. Since most translation software is
 made by English-speaking people, translating from some other language
 to English works better than the reverse.

Actually, SYSTRAN is a French company and their software is used for machine-assisted translation in the EU (imagine, here in Europe we have a government level that uses more than 20 official languages that everything has to be translated to and from...).

The problem is, those translations are made by professional translators who use it to assist them where possible, but fix all (or at least most of) the errors made.

Maybe they have a more expensive & better version too; I don't think SYSTRAN wants the EU to start using Google or Babelfish instead of the expensive licensed software that provides SYSTRAN with a consistent income now. ;-)

-- 
JanC
Mar 27 2007
prev sibling parent Jan Claeys <usenet janc.be> writes:
Op Mon, 26 Mar 2007 17:37:26 -0600
schreef Hasan Aljudy <hasan.aljudy gmail.com>:

 Actually I was looking up "free statistical translation" (or
 something like that) in Google, when I discovered a Google Blog entry
 stating that Google now uses a statistical model for translating
 Arabic and Chinese 

Hm, there are open source libraries & tools for doing "statistical translation", e.g.: <http://www.statmt.org/moses/> -- JanC
Mar 27 2007