digitalmars.D.learn - byte to char safe?

reply Harry <harry potter.com> writes:
Again hello, 

char[6] t = r"again" ~ cast(char)7 ~ r"hello";

use only own write functions
is ok?

thank you!
Jul 30 2009
next sibling parent reply BCS <ao pathlink.com> writes:
Reply to Harry,

 Again hello,
 
 char[6] t = r"again" ~ cast(char)7 ~ r"hello";
 
 use only own write functions
 is ok?
 thank you!
 
I think this will also work, and you can be sure it's safe: r"again" ~ '\x07' ~ r"hello"
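A minimal sketch of why the two spellings are interchangeable, assuming a current D2 compiler (where string literals are immutable, so the result is held in a `string` rather than the fixed-size `char[6]` from the question). The function names `withCast` and `withEscape` are made up for illustration.

```d
import std.stdio : writeln;

// Both functions put the control character with code 7 (BEL)
// between the two words; only the spelling of that character differs.
string withCast()   { return r"again" ~ cast(char) 7 ~ r"hello"; }
string withEscape() { return r"again" ~ '\x07' ~ r"hello"; }

void main()
{
    // The cast and the escape produce the same single code unit.
    assert(withCast() == withEscape());
    assert(withCast()[5] == 7);
    writeln(withCast().length);
}
```

Code 7 lies in the ASCII range, so either form yields valid UTF-8.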
Jul 30 2009
parent reply Harry <harry potter.com> writes:
BCS Wrote:

 I think this will also work and you can be sure it's safe. r"again" ~ '\x07' ~ r"hello"
again thank you ! D writef not print utf8 control? \x00 .. \x1f and \x7f .. \x9f safe for data? where \n \t? sorry so many questions
Jul 30 2009
next sibling parent Daniel Keep <daniel.keep.lists gmail.com> writes:
Harry wrote:
 D writef not print utf8 control?
I have no idea what you're asking. To take a blind stab: writef expects any char[] you give it to be in utf8 format. Actually, EVERYTHING assumes that char[] is in utf8.
 \x00 .. \x1f and \x7f .. \x9f safe for data?
Again, I'm not sure what you're asking. char[] is utf8, which means that ANY code unit in the range \x80..\xff is invalid by itself. If you're storing binary data, use ubyte[].
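As a rough illustration of that point, assuming a current D2 compiler and Phobos: `std.utf.validate` throws a `UTFException` on malformed input, so it can be wrapped to check whether a `char[]` still holds valid UTF-8. The wrapper name `isValidUtf8` is hypothetical.

```d
import std.utf : validate, UTFException;

// Hypothetical wrapper: true if `s` is well-formed UTF-8, false otherwise.
bool isValidUtf8(const(char)[] s)
{
    try
    {
        validate(s);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    char[] text = "again".dup;
    text ~= cast(char) 7;        // 0x07 is ASCII, so this is still valid UTF-8
    assert(isValidUtf8(text));

    text ~= cast(char) 0x9F;     // a lone byte in 0x80..0xFF is not valid by itself
    assert(!isValidUtf8(text));

    // Arbitrary binary data fits in ubyte[], which makes no encoding promise.
    ubyte[] data = [0x07, 0x9F, 0xFF];
    assert(data.length == 3);
}
```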
 where \n \t?
const NEWLINE = "\n"; What's the issue?
Jul 30 2009
prev sibling parent Sergey Gromov <snake.scaly gmail.com> writes:
Thu, 30 Jul 2009 18:29:09 -0400, Harry wrote:

 again thank you ! D writef not print utf8 control? \x00 .. \x1f and \x7f .. \x9f safe for data? where \n \t? sorry so many questions
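No, writef does not escape non-printing characters, if that's what you mean. It prints them as is. You can escape them using std.uri.encode, for instance: writefln("%s", std.uri.encode(str)); (the explicit "%s" keeps the percent signs in the encoded text from being read as format specifiers).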
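A small sketch of making control characters visible before printing, assuming a current D2 compiler. The helper `escapeNonPrintable` is a made-up name, not a Phobos function; the `std.uri.encode` call is shown next to it for comparison, imported under the local alias `uriEncode`.

```d
import std.ascii : isPrintable;
import std.format : format;
import std.stdio : writefln;
import std.uri : uriEncode = encode;

// Hypothetical helper: replaces non-printable ASCII with a visible \xNN escape.
// Code units >= 0x80 (parts of multi-byte UTF-8 sequences) pass through untouched.
string escapeNonPrintable(const(char)[] s)
{
    string result;
    foreach (char c; s)
    {
        if (c < 0x80 && !isPrintable(c))
            result ~= format("\\x%02x", cast(uint) c);
        else
            result ~= c;
    }
    return result;
}

void main()
{
    string str = "again\x07hello";
    // The backtick literal is raw: a backslash followed by the text x07.
    assert(escapeNonPrintable(str) == `again\x07hello`);
    writefln("%s", escapeNonPrintable(str));
    // Percent-encoding variant; "%s" stops the result being parsed as a format string.
    writefln("%s", uriEncode(str));
}
```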
Jul 31 2009
prev sibling next sibling parent reply Ary Borenszweig <ary esperanto.org.ar> writes:
Harry wrote:
 Again hello, 
 
 char[6] t = r"again" ~ cast(char)7 ~ r"hello";
If you want the result to be "again7hello", then no. You must do:

    char[6] t = r"again" ~ '7' ~ r"hello";

or:

    char[6] t = r"again" ~ (cast(char)('0' + 7)) ~ r"hello";
Jul 30 2009
parent reply Harry <harry potter.com> writes:
Ary Borenszweig Wrote:

 If you want the result to be "again7hello", then no. You must do: char[6] t = r"again" ~ '7' ~ r"hello"; or: char[6] t = r"again" ~ (cast(char)('0' + 7)) ~ r"hello";
Hello Ary, 7 is data not string. It makes own write function need style data in char[] Not sure if safe ? thank you
Jul 30 2009
parent reply Sergey Gromov <snake.scaly gmail.com> writes:
Thu, 30 Jul 2009 19:14:56 -0400, Harry wrote:

 Hello Ary, 7 is data not string. It makes own write function need style data in char[] Not sure if safe ?
If you use only your own write function, then you can put just anything into char[]. But if you pass that char[] to any standard function, or even foreach, and there are non-UTF-8 sequences in there, the standard function will fail.

Also note that values from 0 to 0x7F are valid UTF-8 codes and can be safely inserted into char[]. If you want to safely put a larger constant into char[], you can use Unicode escape sequences: '\uXXXX' or '\UXXXXXXXX', where XXXX and XXXXXXXX are 4 or 8 hexadecimal digits respectively:

    char[] foo = "hello " ~ "\u017e" ~ "\U00105614";
    foreach (dchar ch; foo)
        writefln("%x", cast(uint) ch);

Finally, if you want to encode a variable into char[], you can use the std.utf.encode function:

    char[] foo;
    uint value = 0x00100534;
    std.utf.encode(foo, value);

Unfortunately, all std.utf functions accept only valid UTF characters. Currently they're everything from 0 to 0xD7FF and from 0xE000 to 0x10FFFF. Any other character values will throw a run-time exception if passed to standard functions.
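The following sketch shows one way to apply this with a current D2 compiler, which is an assumption: the snippets in the post were written for the compiler of the day, whereas today a string literal needs `.dup` to become a mutable `char[]` and a `uint` needs an explicit cast to `dchar`. The helper `appendCodePoint` is hypothetical; it checks the value with `std.utf.isValidDchar` first instead of relying on the exception.

```d
import std.stdio : writefln;
import std.utf : encode, isValidDchar;

// Hypothetical helper: appends the UTF-8 encoding of `value` to `buf`.
// Returns false (leaving `buf` unchanged) when `value` is not a valid code point.
bool appendCodePoint(ref char[] buf, uint value)
{
    dchar c = cast(dchar) value;
    if (!isValidDchar(c))
        return false;
    encode(buf, c);
    return true;
}

void main()
{
    char[] foo = "hello ".dup;
    assert(appendCodePoint(foo, 0x017E));      // encoded as two code units
    assert(appendCodePoint(foo, 0x00100534));  // encoded as four code units
    assert(!appendCodePoint(foo, 0xD800));     // surrogate range: rejected
    assert(foo.length == 6 + 2 + 4);

    // foreach with dchar decodes the UTF-8 back into whole code points.
    foreach (dchar ch; foo)
        writefln("%x", cast(uint) ch);
}
```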
Jul 31 2009
parent reply Harry <harry potter.com> writes:
thank you! non-print utf8 is print with writef start of text \x02 is smile end of text \x03 is heart newline \x0a is newline! is difference? utf.encode(foo,value) foo~"\U00100534"
Aug 01 2009
parent reply Sergey Gromov <snake.scaly gmail.com> writes:
Sat, 01 Aug 2009 19:58:20 -0400, Harry wrote:

 thank you! non-print utf8 is print with writef start of text \x02 is smile end of text \x03 is heart newline \x0a is newline!
Well, sure, standard writef simply outputs those characters to the console. Then console prints them according to its own rules. Therefore special characters will have different representation on different consoles. If you want consistent output you should those special characters to some printable form.
 is difference? utf.encode(foo,value)  foo~"\U00100534"
A little. The code:

    uint value = 0x00100534;
    std.utf.encode(foo, value);

is the same as:

    foo ~= "\U00100534";
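A short check of that equivalence, assuming a current D2 compiler (hence the `.dup` and the `dchar` variable, which the original snippets did not need). The function names are made up for illustration.

```d
import std.utf : encode;

// Appends U+100534 by encoding a run-time value.
char[] viaEncode(char[] foo)
{
    dchar value = 0x00100534;
    encode(foo, value);
    return foo;
}

// Appends the same code point through a string literal escape.
char[] viaLiteral(char[] foo)
{
    foo ~= "\U00100534";
    return foo;
}

void main()
{
    // Both routes append the same four UTF-8 code units.
    assert(viaEncode("x".dup) == viaLiteral("x".dup));
    assert(viaEncode("x".dup).length == 1 + 4);
}
```

The practical difference is that `encode` works on a value known only at run time, while the escape sequence has to be written out in the source.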
Aug 02 2009
parent Sergey Gromov <snake.scaly gmail.com> writes:
Mon, 3 Aug 2009 04:11:31 +0400, Sergey Gromov wrote:

 If you want consistent output you should those
 special characters to some printable form.
Sorry, this sentence has a typo: If you want consistent output you should *convert* those special characters to some printable form.
Aug 02 2009
prev sibling parent Daniel Keep <daniel.keep.lists gmail.com> writes:
Harry wrote:
 Again hello, 
 
 char[6] t = r"again" ~ cast(char)7 ~ r"hello";
 
 use only own write functions
 is ok?
 
 thank you!
I think a more significant problem is that "again\x07hello" can't possibly fit in six characters, unless you're using some crazy numbering system I'm not familiar with.
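To make the size problem concrete, here is a minimal sketch assuming a current D2 compiler; `buildText` is a hypothetical name. The concatenation is 5 + 1 + 5 = 11 code units long, so a fixed-size buffer needs 11 elements, or a dynamic array can be used instead.

```d
// Builds the text from the original question.
string buildText()
{
    return r"again" ~ '\x07' ~ r"hello";
}

void main()
{
    string s = buildText();
    assert(s.length == 11);        // five letters, one control character, five letters

    char[11] fixedBuf;
    fixedBuf[] = s[];              // element-wise copy; the lengths must match

    char[] dynamicBuf = s.dup;     // a dynamic array takes whatever length is needed
    assert(fixedBuf[] == dynamicBuf);
}
```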
Jul 30 2009