digitalmars.D.learn - latin-1 encoding

"Simen Haugen" <simen norstat.no> writes:

I'm just starting to look at D, but I can't seem to find any encodings for 
latin-1 in the standard library...

Jan 11 2007

Johan Granberg <lijat.meREM OVE.gmail.com> writes:

Simen Haugen wrote:

 I'm just starting to look at D, but I can't seem to find any encodings for
 latin-1 in the standard library...

What are you trying to do? It would be helpful to know whether you want to
read files in latin-1 or want your whole program to use it internally.

Jan 11 2007

"Simen Haugen" <simen norstat.no> writes:

"Johan Granberg" wrote:
 What are you trying to do? It would be helpful to know if you want to read
 files in latin-1 or if you want your whole program to use it internally.

Reading and writing files.

Jan 11 2007

Johan Granberg <lijat.meREM OVE.gmail.com> writes:

Simen Haugen wrote:

 "Johan Granberg" wrote:
 What are you trying to do? It would be helpful to know if you want to read
 files in latin-1 or if you want your whole program to use it internally.

 
 Reading and writing files.

There are no string manipulation functions in the standard library that
will help you there, but you can read the files as usual and store the
contents in a ubyte[] instead of a char[]. If you want to use string
manipulation functions, the easiest approach is to convert to UTF-8; there
was some discussion of how to do that a couple of weeks ago.
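
A minimal sketch of the "read them as ubyte[]" part, assuming a current D2 compiler and Phobos' std.file. The helper names readRawBytes and writeRawBytes are made up for illustration and are not library functions:

```d
import std.file : read, write;

/// Reads a whole file as raw bytes, making no assumption about its encoding.
ubyte[] readRawBytes(string path)
{
    // std.file.read returns void[]; the cast only reinterprets the same bytes.
    return cast(ubyte[]) read(path);
}

/// Writes the bytes back out exactly as given.
void writeRawBytes(string path, const(ubyte)[] data)
{
    write(path, data);
}
```

Keeping the data in a ubyte[] rather than a char[] avoids handing Latin-1 bytes to code that expects valid UTF-8.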

Jan 12 2007

Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:

Simen Haugen wrote:
 "Johan Granberg" wrote:
 What are you trying to do? It would be helpful to know if you want to read
 files in latin-1 or if you want your whole program to use it internally.

 
 Reading and writing files. 

Now I'm no expert in character encodings, but isn't Latin-1 just the 
first 256 codepoints (or whatever they're called) of Unicode, packed 
into a single byte per character?

If so, it should be pretty trivial to convert latin-1 characters to 
Unicode, either to wchar[]/dchar[] by direct one-to-one assignment (no 
multibyte sequences possible) or to char[] by using std.utf.encode, like 
this:

-----
// warning: incomplete, untested code
import std.utf;

ubyte[] data_lat1;

// ... fill data_lat1 array

char[] data_utf8;    // perhaps preallocate this to a reasonable length

foreach(c; data_lat1) {
     // each Latin-1 byte is the Unicode code point with the same value
     std.utf.encode(data_utf8, cast(dchar) c);
}
-----


And UTF to Latin-1 should be pretty easy too:
-----
// again: incomplete, untested code
import std.utf;

char[] data_utf;    // wchar[] and dchar[] should work as well

ubyte[] data_lat1;  // again, preallocate a reasonable array if you want

size_t i = 0;
while(i < data_utf.length) {
     dchar c = std.utf.decode(data_utf, i);    // advances i
     assert(c < 0x100);      // make sure it fits (asserts are compiled out with -release)
     data_lat1 ~= cast(ubyte) c;    // explicit cast: a dchar does not implicitly narrow to ubyte
}
-----

I should note that by 'preallocate' I mean '"new" an array and set the 
length to 0'.
Setting the length to 0 is important since otherwise your output will 
get appended to the end of a default-initialized array, which isn't what 
you want ;)
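
For anyone reading this with a current D2 compiler, here is a hedged sketch of both directions wrapped into functions. The names latin1ToUtf8 and utf8ToLatin1 are hypothetical helpers, not Phobos functions. The sketch assumes that reserve is an acceptable way to preallocate, and it swaps the assert for std.exception.enforce so the range check also runs in release builds:

```d
import std.exception : enforce;
import std.utf : decode, encode;

/// Latin-1 bytes to UTF-8: each byte is the code point with the same value.
string latin1ToUtf8(const(ubyte)[] latin1)
{
    char[] utf8;
    utf8.reserve(latin1.length * 2); // worst case: two UTF-8 bytes per input byte
    foreach (b; latin1)
        encode(utf8, cast(dchar) b);
    return utf8.idup;
}

/// UTF-8 to Latin-1: throws if a code point does not fit in one byte.
ubyte[] utf8ToLatin1(string utf8)
{
    ubyte[] latin1;
    latin1.reserve(utf8.length); // output is never longer than the input
    size_t i = 0;
    while (i < utf8.length)
    {
        dchar c = decode(utf8, i); // advances i past the decoded sequence
        enforce(c <= 0xFF, "code point not representable in Latin-1");
        latin1 ~= cast(ubyte) c;
    }
    return latin1;
}
```

As a usage example, the six Latin-1 bytes 62 6C E5 62 E6 72 ("blåbær") come out as an 8-byte UTF-8 string, and passing that string to utf8ToLatin1 returns the original six bytes.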

Jan 12 2007

"Frank Benoit (keinfarbton)" <benoit tionex.removethispart.de> writes:

Simen Haugen wrote:
 I'm just starting to look at D, but I can't seem to find any encodings for 
 latin-1 in the standard library... 
 
 

You can try the mango project. It has a package called ICU that converts
between various encodings and Unicode.

Jan 12 2007
