digitalmars.D.learn - read files ... continued
Whoops ... sent before I was done writing ... sorry.

Hi,

I'm writing an application that reads all kinds of text files. I'm not really familiar with the file types. For the moment I read them with a BufferedFile, reading the lines with readLine():

    Stream br = new BufferedFile(fileName);
    char[] line = br.readLine();

But that causes a lot of trouble. I managed to figure out how to read a file's BOM, so I presume it will also be possible to convert the files to the one encoding I always use (UTF-8, for example). (I'll check that later.) But for a lot of files, when I check the BOM I get the result -1 (meaning the type is not known).

http://www.digitalmars.com/d/phobos/std_stream.html

The only known BOM types are listed there (UTF-8, and UTF-16 or UTF-32 in LE or BE).

For a lot of text files on my system (Windows) the type is ANSI, and there's no problem reading them with BufferedFile as long as there are no special characters in them. But if there's an accent or something (for example 'é'), then it's an invalid UTF sequence. I cannot convert the text because the BOM for these files is also unknown.

Does anyone have an idea how to catch this sort of file (and convert it)? Or is there a stream that takes the file type into account by itself? That would be very handy ...

It's an application I wrote in Java that I'm now trying in D. In Java I used a BufferedReader on a FileReader, and there all goes well. Sometimes files are not read well, but there are no faults like this invalid UTF sequence in D.

If someone understands my problem out of all this confusing talk (that's because I'm rather confused myself) ... I'd be glad :p

Thanks!

May 13 2007
Jan Hanselaer wrote:

Yes, thanks a lot. At least now I understand it better. But it's a pity there isn't a stream that does all the work. It's going to be very difficult to read the different files properly.

May 13 2007
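For what it's worth, the fallback asked about in the original post can be sketched roughly as follows. This is only a minimal sketch, not something taken from the thread: it assumes a current D compiler and Phobos (std.encoding, std.utf, std.file) rather than the 2007 std.stream API, and it assumes that any file which is not valid UTF-8 is Windows-1252 ("ANSI" on a Western European Windows system). The helper names toUtf8 and readTextFile are made up for illustration, and UTF-16/UTF-32 BOMs are not handled here.

```d
import std.encoding : transcode, Windows1252String;
import std.file : read;
import std.utf : validate, UTFException;

/// Hypothetical helper: turns the raw bytes of a text file into UTF-8.
string toUtf8(immutable(ubyte)[] bytes)
{
    // Strip a UTF-8 BOM (EF BB BF) if the file starts with one.
    if (bytes.length >= 3
        && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF)
    {
        bytes = bytes[3 .. $];
    }

    auto text = cast(string) bytes;
    try
    {
        // validate() throws UTFException on a malformed UTF-8 sequence.
        validate(text);
        return text;
    }
    catch (UTFException)
    {
        // ASSUMPTION: anything that is not valid UTF-8 is Windows-1252.
        // Other ANSI code pages would need a different source type.
        string result;
        transcode(cast(Windows1252String) bytes, result);
        return result;
    }
}

/// Hypothetical helper: reads a whole file and returns it as UTF-8.
string readTextFile(string fileName)
{
    return toUtf8(cast(immutable(ubyte)[]) read(fileName));
}
```

The guess is not foolproof: a Windows-1252 file whose bytes happen to form valid UTF-8 would be taken as UTF-8, and a file in some other code page would come out with the wrong accented characters. That is probably also why the Java version never complained: FileReader decodes with the platform default charset and substitutes a replacement character for bytes it cannot decode instead of raising an error.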