
digitalmars.D - Source level Java to D converter

reply stonecobra <scott stonecobra.com> writes:
Started a new thread, in case there is even more interest ;)

Brad Anderson wrote:
 Ant wrote:

 On Sun, 25 Jul 2004 18:29:41 -0700, stonecobra wrote:
 mechanical java -> D translation process,

can you share that!? I started that translation but got lost on workarounds for dmd (mine) problems. I believe the DWT project should use a mechanical java -> D.

I am looking forward to putting SWT code through it. Scott, let me know when it's ready, or if you want someone to kick the tires. And if Walter fixes the FRef bug with your example, we may get somewhere with DWT...

The rumors are true. I currently can translate most class/interface stuff, and I am currently focusing on the translations between Java's String and D's char[]. The fact that one is an object makes it a wee bit difficult :)

There will still be some hand fixup after the fact I'm afraid, because of things like not being able to tell the types that are put into a HashMap, for example.

It does use an XML intermediate language, but I believe it is still lacking certain things. To get the intermediate language, I use jikes patched with JavaML (http://www.cs.washington.edu/homes/gjb/JavaML/) which tells jikes to spit out its debug info in XML. I then have an XSLT stylesheet that runs over this xml output and starts writing D files.

I am trying to port SAX from java to D right now, and I am getting pretty close (string issues and such).

I will release it, I just worry about releasing something that doesn't even work. Are any of you willing to help me make it better?

The big help I would like is various changes to jikes to do more with the information it has (like fully package named types in output), etc.

I think I've got a lot of the stylesheet stuff under control, but if there were more/better output to work on, that would be SWEEET!

Any more thoughts?

Scott
Jul 27 2004
next sibling parent reply parabolis <parabolis softhome.net> writes:
stonecobra wrote:

 
 The rumors are true.  I currently can translate most class/interface 
 stuff, and I am currently focusing on the translations between Java's 
 String and D's char[].  The fact that one is an object makes it a wee 
 bit difficult :)

I have a feeling that D will eventually have a String class of its own as it currently has no String functionality at all. Since the char and wchar types are UTF, a function finding the Nth character in the array would have to decode the array for each call. A String class implementation will allow certain assumptions that eliminate the requirement of parsing UTF.
 
 There will still be some hand fixup after the fact I'm afraid, because 
 of things like not being able to tell the types that are put into a 
 HashMap, for example.

As it happens I have just been working with a HashTable... I assume you are writing this in Java? Hashtable has the same values() function as HashMap so... I am also assuming you are not familiar with java.lang.Object.getClass()?

================================================================
import java.nio.charset.*;
import java.util.*;

public class Test {
    public static void main( String[] args ) throws Exception {
        Hashtable table = new Hashtable(Charset.availableCharsets());
        Iterator i = table.values().iterator();
        Object tableValueObject;
        Class tableValueClass;
        while( i.hasNext() ) {
            tableValueObject = i.next();
            tableValueClass = tableValueObject.getClass();
            System.out.print( tableValueClass );
            System.out.print( " extends " );
            System.out.print( tableValueClass.getSuperclass() );
            System.out.println();
        }
    }
}
================================================================
class sun.nio.cs.ISO_8859_9 extends class java.nio.charset.Charset
class sun.nio.cs.ext.ISO_8859_8 extends class java.nio.charset.Charset
... (40+ lines removed for your reading pleasure)
class sun.nio.cs.UTF_16BE extends class java.nio.charset.Charset
================================================================
 
 I will release it, I just worry about releasing something that doesn't 
 even work.  Are any of you willing to help me make it better?
 
 The big help I would like is various changes to jikes to do more with 
 the information it has (like fully package named types in output), etc.
 
 I think I've got a lot of the stylesheet stuff under control, but if 
 there were more/better output to work on, that would be SWEEET!

I would love to help but I am currently trying to find my way around the D compiler. Good luck.
Jul 27 2004
next sibling parent reply Arcane Jill <Arcane_member pathlink.com> writes:
In article <ce6elp$2jmu$1 digitaldaemon.com>, parabolis says...

I have a feeling that D will eventually have a String class of its
own as it currently has no String functionality at all.

I believe Hauke is working on that string class. "no String functionality at all" is a bit of an exaggeration though, since std.string has certainly got a non-zero amount of functionality. What kind of functionality are you looking for?
Since the
char and wchar types are UTF a function finding the Nth character
in the array would have to decode the array for each call.

which is easy, even with char[] and wchar[]. But, better still, a dchar[] array is strictly one character per element. Java chars are D wchars, of course, since Java has no 32-bit-wide dchar, and so Java strings are /also/ UTF-16. The way I read it, java.lang.String.charAt() just counts UTF-16 fragments - same for length(), indexOf(), etc. Me again
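To make the code-unit point concrete, here is a minimal Java sketch. It assumes Java 5 or later, where String gained codePointCount(), offsetByCodePoints() and codePointAt(); the class name Utf16Demo is made up for illustration. It shows that length() and charAt() work on UTF-16 code units, while counting or indexing whole characters means walking the string.

```java
// Illustration only: UTF-16 code units versus Unicode code points.
class Utf16Demo {
    // Number of UTF-16 code units, which is what String.length() reports.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points; this has to scan the string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // The n-th code point (0-based), found by walking from the start.
    static int nthCodePoint(String s, int n) {
        int index = s.offsetByCodePoints(0, n);
        return s.codePointAt(index);
    }

    public static void main(String[] args) {
        // U+1D11E lies outside the BMP, so UTF-16 stores it as a surrogate pair.
        String s = "a" + new String(Character.toChars(0x1D11E)) + "b";
        System.out.println(codeUnits(s));
        System.out.println(codePoints(s));
        System.out.println(Character.isHighSurrogate(s.charAt(1)));
        System.out.println((char) nthCodePoint(s, 2));
    }
}
```

For the three-character string built in main, the code-unit count is one larger than the code-point count, and charAt(1) returns half of a surrogate pair rather than a complete character.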
Jul 27 2004
parent reply parabolis <parabolis softhome.net> writes:
Arcane Jill wrote:

 In article <ce6elp$2jmu$1 digitaldaemon.com>, parabolis says...
 
 
I have a feeling that D will eventually have a String class of its
own as it currently has no String functionality at all.

I believe Hauke is working on that string class.

This could be good to hear.
 
 "no String functionality at all" is a bit of an exaggeration though, since
 std.string has certainly got a non-zero amount of functionality. What kind of
 functionality are you looking for?

Some fundamentals like length and substring would be a good start. See my post in Re: OT - scanf in Java
 
 
 
Since the
char and wchar types are UTF a function finding the Nth character
in the array would have to decode the array for each call.

which is easy, even with char[] and wchar[]. But, better still, a dchar[] array is strictly one character per element.

Perhaps easy but a class or interface solution can perform length-query and substring operations in constant time. Parsing UTF and counting the characters would take N time in all cases. The substring
 
 Java chars are D wchars, of course, since Java has no 32-bit-wide dchar, and so
 Java strings are /also/ UTF-16. The way I read it, java.lang.String.charAt()
 just counts UTF-16 fragments - same for length(), indexOf(), etc.
 

True but java.lang.Character and java.lang.String are classes and will be much more easily extended than a primitive type. (Just ask sun for unsigned primitives ;))
Jul 27 2004
parent reply Ben Hinkle <bhinkle4 juno.com> writes:
parabolis wrote:

 Arcane Jill wrote:
 
 In article <ce6elp$2jmu$1 digitaldaemon.com>, parabolis says...
 
 
I have a feeling that D will eventually have a String class of its
own as it currently has no String functionality at all.

I believe Hauke is working on that string class.

This could be good to hear.

I don't know if archives of the old newsgroup are around but I recall a long thread about string classes back then (at least I think it was on the old newsgroup). It might be worth looking at those posts. Personally I think char[]/std.string is better than a class and is certainly better to choose one or the other but not both.
 
 "no String functionality at all" is a bit of an exaggeration though, since
 std.string has certainly got a non-zero amount of functionality. What kind
 of functionality are you looking for?

Some fundamentals like length and substring would be a good start. See my post in Re: OT - scanf in Java
 
 
 
Since the
char and wchar types are UTF a function finding the Nth character
in the array would have to decode the array for each call.

which is easy, even with char[] and wchar[]. But, better still, a dchar[] array is strictly one character per element.

Perhaps easy but a class or interface solution can perform length-query and substring operations in constant time. Parsing UTF and counting the characters would take N time in all cases. The substring
 
 Java chars are D wchars, of course, since Java has no 32-bit-wide dchar,
 and so Java strings are /also/ UTF-16. The way I read it,
 java.lang.String.charAt() just counts UTF-16 fragments - same for
 length(), indexOf(), etc.
 

True but java.lang.Character and java.lang.String are classes and will be much more easily extended than a primitive type. (Just ask sun for unsigned primitives ;))

Jul 27 2004
next sibling parent teqDruid <me teqdruid.com> writes:
On Tue, 27 Jul 2004 23:09:37 -0400, Ben Hinkle wrote:
 Personally I think
 char[]/std.string is better than a class

Why? The only member variable the class needs is a char[] (or perhaps dchar[] or wchar[] if needed), so it can't be a memory size or efficiency issue. A String is like the perfect, textbook object, so what's the procedural method got over it? I know it's a hack, but I wonder if making the compiler able to implicitly convert between a String class and char[]'s would be a desirable feature? In Java, the String class is a special class. Just thinking out loud. John
Jul 27 2004
prev sibling next sibling parent parabolis <parabolis softhome.net> writes:
Ben Hinkle wrote:
 parabolis wrote:
 
 
Arcane Jill wrote:


In article <ce6elp$2jmu$1 digitaldaemon.com>, parabolis says...



I have a feeling that D will eventually have a String class of its
own as it currently has no String functionality at all.

I believe Hauke is working on that string class.

This could be good to hear.

I don't know if archives of the old newsgroup are around but I recall a long thread about string classes back then (at least I think it was on the old newsgroup). It might be worth looking at those posts. Personally I think char[]/std.string is better than a class and is certainly better to choose one or the other but not both.

If you are interested please read my post towards the end of the previous thread OT - scanf in Java. Especially the end of the post where I discuss the benefits of a class and/or class+interface approach.
Jul 28 2004
prev sibling parent reply Arcane Jill <Arcane_member pathlink.com> writes:
In article <ce75di$2roa$1 digitaldaemon.com>, Ben Hinkle says...

Personally I think
char[]/std.string is better than a class and is certainly better to choose
one or the other but not both.

The string class concept came up more recently in a discussion of internationalization issues, transcoding issues, etc., and would, I think, be geared toward that end. But even without that, there are good reasons for a string class. Consider:

# char[] s = new char[2];
# s[0] = 0x80;
# s[1] = 0x81;

Whoa! - that's invalid UTF-8 - but when is it going to be spotted? When will that manifest as a bug? With a class, you can have Design by Contract - a class invariant would have spotted that straight away. Just one thought among many. Personally I don't care how it's done, but I am pleased to hear that transcoding etc. is being worked upon, and if that results in a string class, so be it. Arcane Jill
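The invariant idea can be sketched in Java as well, although Java has no invariant blocks like D does, so the check sits in the constructor instead. The class name Utf8Text and its methods are hypothetical, and this is only one way to do the check: it leans on the standard CharsetDecoder with CodingErrorAction.REPORT so that malformed input is rejected rather than silently replaced.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical class, for illustration: holds bytes known to be valid UTF-8.
final class Utf8Text {
    private final byte[] bytes;

    Utf8Text(byte[] bytes) {
        // Stand-in for a class invariant: refuse to construct an invalid value.
        if (!isValidUtf8(bytes)) {
            throw new IllegalArgumentException("not valid UTF-8");
        }
        // Defensive copy, so the caller cannot break the invariant afterwards.
        this.bytes = bytes.clone();
    }

    // True if the whole array decodes as UTF-8 without errors.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    int byteLength() {
        return bytes.length;
    }
}
```

With this in place, the two bytes 0x80 0x81 from the example are rejected at construction time, because both are continuation bytes with no lead byte in front of them.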
Jul 28 2004
parent reply Ilya Minkov <minkov cs.tum.edu> writes:
Yes, a string class would be cool. I think there is a major difference 
between char[] and String (class) which would not make them interfere.

I wonder whether there's much sense in creating an analogous class to 
StringBuffer - which would really interfere with char[] in use.

String as a class could, as opposed to char[]:
- be immutable, thus somewhat safer
- try to check for legality
- can be abstract with descendants being in UTF8/16/32/ some other 
representations
- allow some intuitive processing
- take care of some conversions
- overload comparison operators
...

Naturally, one wouldn't want to write algorithms to work with an 
immutable class, but they can work with basic data types and nonetheless 
input and output a String without a performance penalty.

Arcane Jill schrieb:

 In article <ce75di$2roa$1 digitaldaemon.com>, Ben Hinkle says...
 
 
Personally I think
char[]/std.string is better than a class and is certainly better to choose
one or the other but not both.

The string class concept came up more recently in a discussion of internationalization issues, transcoding issues, etc., and would, I think, be geared toward that end. But even without that, there are good reasons for a string class. Consider:

# char[] s = new char[2];
# s[0] = 0x80;
# s[1] = 0x81;

Whoa! - that's invalid UTF-8 - but when is it going to be spotted? When will that manifest as a bug? With a class, you can have Design by Contract - a class invariant would have spotted that straight away. Just one thought among many. Personally I don't care how it's done, but I am pleased to hear that transcoding etc. is being worked upon, and if that results in a string class, so be it. Arcane Jill

Jul 29 2004
parent parabolis <parabolis softhome.net> writes:
Ilya Minkov wrote:
 
 String as a class could, as opposed to char[]:
 - be immutable, thus somewhat safer
 - try to check for legality
 - can be abstract with descendants being in UTF8/16/32/ some other 
 representations
 - allow some intuitive processing
 - take care of some conversions
 - overload comparison operators
 ...
 
 Naturally, one wouldn't want to write algorithms to work with an 
 immutable class, but they can work with basic data types and nonetheless 
 input and output a String without a performance penalty.

Actually I agree with almost all the benefits you list. However I do not clearly understand all the issues behind why Java chose to make String immutable and so I am not really very clear on all the benefits of an immutable String class. I know it makes substring() much simpler because it is safe to return a String pointing into another String. However it sounds like they have a StringTable data structure that may offer other benefits as well. Do you know much about how they implemented StringTable?
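The substring point can be illustrated with a small sketch. This is not a description of Sun's implementation; SharedText is a made-up class that only shows why immutability makes sharing safe: since nobody can write to the backing array, a substring can be a new (offset, length) view onto the same array and costs constant time.

```java
// Hypothetical class, for illustration: an immutable view onto a shared char array.
final class SharedText {
    private final char[] data;   // never modified after construction
    private final int offset;    // start of this view inside data
    private final int length;    // number of chars in this view

    SharedText(String s) {
        this(s.toCharArray(), 0, s.length());
    }

    private SharedText(char[] data, int offset, int length) {
        this.data = data;
        this.offset = offset;
        this.length = length;
    }

    int length() {
        return length;
    }

    char charAt(int i) {
        if (i < 0 || i >= length) {
            throw new IndexOutOfBoundsException("index " + i);
        }
        return data[offset + i];
    }

    // Constant time: no characters are copied, the new view shares the array.
    SharedText substring(int begin, int end) {
        if (begin < 0 || end > length || begin > end) {
            throw new IndexOutOfBoundsException("range " + begin + ".." + end);
        }
        return new SharedText(data, offset + begin, end - begin);
    }

    @Override
    public String toString() {
        return new String(data, offset, length);
    }
}
```

The trade-off is that a small substring keeps the whole original array alive, which is the usual argument against sharing.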
Jul 29 2004
prev sibling next sibling parent teqDruid <me teqdruid.com> writes:
On Tue, 27 Jul 2004 16:41:14 -0700, parabolis wrote:

 stonecobra wrote:
 
 
 The rumors are true.  I currently can translate most class/interface 
 stuff, and I am currently focusing on the translations between Java's 
 String and D's char[].  The fact that one is an object makes it a wee 
 bit difficult :)

I have a feeling that D will eventually have a String class of its own as it currently has no String functionality at all.

And by that you mean a String class in phobos, yes? There is a String class (I believe Walter links to it somewhere off the D website) already, but since it's not in phobos, I find it to be virtually useless. But that's probably because I'm writing libraries right now. If we're going to use a String class, it's important for it to be a part of phobos (and thus standard), or at least to implement a standard interface in phobos. A lack of a standard String class is something that's actually kind of irked me about D (really phobos I guess), and I think it should be a high enough priority that it should be a pre-1.0 thing. Anyone else agree, or am I the only one not happy with char[] and free floating methods? John
Jul 27 2004
prev sibling parent stonecobra <scott stonecobra.com> writes:
parabolis wrote:

 stonecobra wrote:
 
 The rumors are true.  I currently can translate most class/interface 
 stuff, and I am currently focusing on the translations between Java's 
 String and D's char[].  The fact that one is an object makes it a wee 
 bit difficult :)

I have a feeling that D will eventually have a String class of its own as it currently has no String functionality at all. Since the char and wchar types are UTF, a function finding the Nth character in the array would have to decode the array for each call. A String class implementation will allow certain assumptions that eliminate the requirement of parsing UTF.

I have been avoiding writing my own String class for this purpose. For example the java code:

foo.indexOf("bar")

translates into:

std.string.find(foo, "bar")

It's just a helper method instead of a class method of an object. Simple, but tedious for translation ;)
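As a rough picture of that mapping step, here is a sketch in Java rather than in the XSLT the translator actually uses. StringCallMapper and translate() are invented names; the only table entry taken from this thread is indexOf to std.string.find, and any further entries would have to be added by hand.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration: rewrites a Java String method call as
// a call to a free function that takes the receiver as its first argument.
final class StringCallMapper {
    private static final Map<String, String> JAVA_TO_D = new HashMap<>();
    static {
        // The one mapping given in the post; everything else is left unmapped.
        JAVA_TO_D.put("indexOf", "std.string.find");
    }

    // Builds the text of the translated call from its already-translated parts.
    static String translate(String receiver, String method, List<String> args) {
        String target = JAVA_TO_D.get(method);
        if (target == null) {
            throw new IllegalArgumentException("no mapping for String." + method);
        }
        List<String> all = new ArrayList<>();
        all.add(receiver);   // object.method(param) becomes helper(object, param)
        all.addAll(args);
        return target + "(" + String.join(", ", all) + ")";
    }
}
```

Calling translate("foo", "indexOf", ...) with the single argument "bar" (quotes included) yields the std.string.find call shown above; an unmapped method fails loudly, which flags the spots that still need hand fixup.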
 
 There will still be some hand fixup after the fact I'm afraid, because 
 of things like not being able to tell the types that are put into a 
 HashMap, for example.

As it happens I have just been working with a HashTable... I assume you are writing this in Java? Hashtable has the same values() function as HashMap so... I am also assuming you are not familiar with java.lang.Object.getClass()?

================================================================
import java.nio.charset.*;
import java.util.*;

public class Test {
    public static void main( String[] args ) throws Exception {
        Hashtable table = new Hashtable(Charset.availableCharsets());
        Iterator i = table.values().iterator();
        Object tableValueObject;
        Class tableValueClass;
        while( i.hasNext() ) {
            tableValueObject = i.next();
            tableValueClass = tableValueObject.getClass();
            System.out.print( tableValueClass );
            System.out.print( " extends " );
            System.out.print( tableValueClass.getSuperclass() );
            System.out.println();
        }
    }
}
================================================================
class sun.nio.cs.ISO_8859_9 extends class java.nio.charset.Charset
class sun.nio.cs.ext.ISO_8859_8 extends class java.nio.charset.Charset
... (40+ lines removed for your reading pleasure)
class sun.nio.cs.UTF_16BE extends class java.nio.charset.Charset
================================================================

I am not writing this in Java, actually. It is an XSLT stylesheet transforming XML data that represents java source code. I am familiar with java.lang.Object.getClass(), but then my translator would be a runtime introspection translator, instead of the source level translator that I described it as. It is a flaw in my scheme, I understand that, but I am still going forward with it.
 I will release it, I just worry about releasing something that doesn't 
 even work.  Are any of you willing to help me make it better?

 The big help I would like is various changes to jikes to do more with 
 the information it has (like fully package named types in output), etc.

 I think I've got a lot of the stylesheet stuff under control, but if 
 there were more/better output to work on, that would be SWEEET!

I would love to help but I am currently trying to find my way around the D compiler. Good luck.

Funny that, I am too :) Scott
Jul 27 2004
prev sibling next sibling parent reply Ben Hinkle <bhinkle4 juno.com> writes:
 The rumors are true.  I currently can translate most class/interface
 stuff, and I am currently focusing on the translations between Java's
 String and D's char[].  The fact that one is an object makes it a wee
 bit difficult :)

what are the difficulties you are referring to?
Jul 27 2004
parent reply stonecobra <scott stonecobra.com> writes:
Ben Hinkle wrote:

The rumors are true.  I currently can translate most class/interface
stuff, and I am currently focusing on the translations between Java's
String and D's char[].  The fact that one is an object makes it a wee
bit difficult :)

what are the difficulties you are referring to?

Maybe difficulties is too strong a word, merely translating an object.method(param) call to helpermethod.method(object, param) call. Tedious hand coding of mappings between string methods and std.string methods. See also my response to parabolis. Scott
Jul 27 2004
parent Andy Friesen <andy ikagames.com> writes:
stonecobra wrote:

 Ben Hinkle wrote:
 
 The rumors are true.  I currently can translate most class/interface
 stuff, and I am currently focusing on the translations between Java's
 String and D's char[].  The fact that one is an object makes it a wee
 bit difficult :)

what are the difficulties you are referring to?

Maybe difficulties is too strong a word, merely translating an object.method(param) call to helpermethod.method(object, param) call. Tedious hand coding of mappings between string methods and std.string methods. See also my response to parabolis.

There is a scary, undocumented wrinkle in the language where, if you have a function of the form

T foo(Array[] an_array, ... arguments) { ... }

The expression

my_array.foo(...);

is equivalent to:

foo(my_array, ...);

This only works if the first argument is an array type. (any type of array will work, though) I don't know if it's supposed to be there, why it's there to begin with, or if it's going to stick around, though. :)

-- andy
Jul 27 2004
prev sibling parent stonecobra <scott stonecobra.com> writes:
stonecobra wrote:

 I will release it, I just worry about releasing something that doesn't 
 even work.  Are any of you willing to help me make it better?

So, no one is interested in it? I won't release it then. Thanks Scott Sanders
Jul 29 2004