
digitalmars.D.learn - more OO way to do hex string to bytes conversion

reply Ralph Doncaster <nerdralph github.com> writes:
I've been reading std.conv and std.range, trying to figure out a 
high-level way of converting a hex string to bytes.  The only way 
I've been able to do it is through pointer access:

import std.stdio;
import std.string;
import std.conv;

void main()
{
     immutable char* hex = "deadbeef".toStringz;
     for (auto i=0; hex[i]; i += 2)
         writeln(to!byte(hex[i]));
}


While it works, I'm wondering if there is a more object-oriented 
way of doing it in D.
Feb 06
next sibling parent ag0aep6g <anonymous example.com> writes:
On 02/06/2018 07:33 PM, Ralph Doncaster wrote:
 I've been reading std.conv and std.range, trying to figure out a 
 high-level way of converting a hex string to bytes.  The only way I've 
 been able to do it is through pointer access:
 
 import std.stdio;
 import std.string;
 import std.conv;
 
 void main()
 {
      immutable char* hex = "deadbeef".toStringz;
      for (auto i=0; hex[i]; i += 2)
          writeln(to!byte(hex[i]));
 }
 
 
 While it works, I'm wondering if there is a more object-oriented way of 
 doing it in D.
I don't think that works as you intend. Your code is taking the numeric
value of every other character. But you want 0xDE, 0xAD, 0xBE, 0xEF, no?

Here's one way to do that, but I don't think it qualifies as object
oriented (but I'm also not sure how an object oriented solution is
supposed to look):

----
void main()
{
    import std.algorithm: map;
    import std.conv: to;
    import std.range: chunks;
    import std.stdio;
    foreach (b; "deadbeef".chunks(2).map!(chars => chars.to!ubyte(16)))
    {
        writefln("%2X", b);
    }
}
----
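To make the difference concrete, here is a minimal sketch (not part of the original reply) that puts the two behaviours side by side using plain slices instead of a C pointer. The variable names are only for illustration.

```d
import std.conv : to;
import std.stdio : writeln;

void main()
{
    string hex = "deadbeef";

    // What the original loop computes: the character code of every
    // other character ('d', 'a', 'b', 'e'), not a parsed hex value.
    for (size_t i = 0; i < hex.length; i += 2)
        writeln(to!byte(hex[i]));               // 100, 97, 98, 101

    // What was presumably intended: parse each two-character slice
    // as a base-16 number.
    for (size_t i = 0; i + 2 <= hex.length; i += 2)
        writeln(hex[i .. i + 2].to!ubyte(16));  // 222, 173, 190, 239
}
```

The second loop is the hand-written equivalent of the chunks/map version above.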
Feb 06
prev sibling next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Feb 06, 2018 at 06:33:02PM +0000, Ralph Doncaster via
Digitalmars-d-learn wrote:
 I've been reading std.conv and std.range, trying to figure out a
 high-level way of converting a hex string to bytes.  The only way I've
 been able to do it is through pointer access:
 
 import std.stdio;
 import std.string;
 import std.conv;
 
 void main()
 {
     immutable char* hex = "deadbeef".toStringz;
     for (auto i=0; hex[i]; i += 2)
         writeln(to!byte(hex[i]));
 }
 
 
 While it works, I'm wondering if there is a more object-oriented way
 of doing it in D.
OO is outdated.  D uses the range-based idiom with UFCS for chaining
operations in a way that doesn't require you to write loops yourself.
For example:

	import std.array;
	import std.algorithm;
	import std.conv;
	import std.range;
	import std.stdio;	// for writefln

	// No need to use .toStringz unless you're interfacing with C
	auto hex = "deadbeef";	// let compiler infer the type for you

	auto bytes = hex.chunks(2)	// lazily iterate over `hex` by digit pairs
	   .map!(s => s.to!ubyte(16))	// convert each pair to a ubyte
	   .array;			// make an array out of it

	// Do whatever you wish with the ubyte[] array.
	writefln("%(%02X %)", bytes);

If you want a reusable way to convert a hex string to bytes, you could
do something like this:

	import std.array;
	import std.algorithm;
	import std.conv;
	import std.range;

	ubyte[] hexToBytes(string hex)
	{
		return hex.chunks(2)
		          .map!(s => s.to!ubyte(16))
		          .array;
	}

Of course, this eagerly constructs an array to store the result, which
allocates, and also requires the hex string to be fully constructed
first.  You can make this code lazy by turning it into a range
algorithm, then you can actually generate the hex digits lazily from
somewhere else, and process the output bytes as they are generated, no
allocation necessary:

	/* Run this example by putting this in a file called 'test.d'
	 * and invoking `dmd -unittest -main -run test.d`
	 */
	import std.array;
	import std.algorithm;
	import std.conv;
	import std.format;
	import std.range;
	import std.stdio;

	auto hexToBytes(R)(R hex)
		if (isInputRange!R && is(ElementType!R : dchar))
	{
		return hex.chunks(2)
		          .map!(s => s.to!ubyte(16));
	}

	unittest
	{
		// Infinite stream of hex digits
		auto digits = "0123456789abcdef".cycle;

		digits.take(100)	// take the first 100 digits
		      .hexToBytes	// turn them into bytes
		      .map!(b => format("%02X", b)) // print in uppercase
		      .joiner(" ")	// nicely delimit bytes with spaces
		      .chain("\n")	// end with a nice newline
		      .copy(stdout.lockingTextWriter);
		      			// write output directly to stdout
	}


T

-- 
Designer clothes: how to cover less by paying more.
Feb 06
next sibling parent Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/6/18 1:46 PM, H. S. Teoh wrote:
 Of course, this eagerly constructs an array to store the result, which
 allocates, and also requires the hex string to be fully constructed
 first.  You can make this code lazy by turning it into a range
 algorithm, then you can actually generate the hex digits lazily from
 somewhere else, and process the output bytes as they are generated, no
 allocation necessary:
 
 	/* Run this example by putting this in a file called 'test.d'
 	 * and invoking `dmd -unittest -main -run test.d`
 	 */
 	import std.array;
 	import std.algorithm;
 	import std.conv;
 	import std.format;
 	import std.range;
 	import std.stdio;
 
 	auto hexToBytes(R)(R hex)
 		if (isInputRange!R && is(ElementType!R : dchar))
 	{
 		return hex.chunks(2)
 		          .map!(s => s.to!ubyte(16));
 	}
 
 	unittest
 	{
 		// Infinite stream of hex digits
 		auto digits = "0123456789abcdef".cycle;
 
 		digits.take(100)	// take the first 100 digits
 		      .hexToBytes	// turn them into bytes
 		      .map!(b => format("%02X", b)) // print in uppercase
 		      .joiner(" ")	// nicely delimit bytes with spaces
 		      .chain("\n")	// end with a nice newline
 		      .copy(stdout.lockingTextWriter);
 		      			// write output directly to stdout
Hm... format in a loop? That returns strings, and allocates. Yuck! ;)

writefln("%(%02X %)", digits.take(100).hexToBytes);

-Steve
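For readers who have not met the `%( ... %)` syntax: it is std.format's compound specifier for ranges. A small sketch of how it behaves, with the sample data being my own:

```d
import std.algorithm : map;
import std.conv : to;
import std.format : format;
import std.range : chunks;
import std.stdio : writefln;

void main()
{
    // A lazy range of ubyte values; no array is built here.
    auto bytes = "deadbeef".chunks(2).map!(s => s.to!ubyte(16));

    // Inside %( ... %), "%02X" formats each element, and the text that
    // follows it (a single space) is used as the separator between
    // elements. It is not emitted after the last element.
    writefln("%(%02X %)", bytes);       // prints: DE AD BE EF

    // %| marks explicitly where the separator starts.
    writefln("%(%02X%|, %)", bytes);    // prints: DE, AD, BE, EF
}
```

Because writefln consumes the range element by element, the per-byte strings that `format` would create are never allocated.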
Feb 06
prev sibling parent reply Craig Dillabaugh <craig.dillabaugh gmail.com> writes:
On Tuesday, 6 February 2018 at 18:46:54 UTC, H. S. Teoh wrote:
 On Tue, Feb 06, 2018 at 06:33:02PM +0000, Ralph Doncaster via 
 Digitalmars-d-learn wrote:
clip
 OO is outdated.  D uses the range-based idiom with UFCS for 
 chaining operations in a way that doesn't require you to write 
 loops yourself. For example:

 	import std.array;
 	import std.algorithm;
 	import std.conv;
 	import std.range;

 	// No need to use .toStringz unless you're interfacing with C
 	auto hex = "deadbeef";	// let compiler infer the type for you

 	auto bytes = hex.chunks(2)	// lazily iterate over `hex` by 
 digit pairs
 	   .map!(s => s.to!ubyte(16))	// convert each pair to a ubyte
 	   .array;			// make an array out of it

 	// Do whatever you wish with the ubyte[] array.
 	writefln("%(%02X %)", bytes);
clip
 T
Wouldn't it be more accurate to say OO is not the correct tool for
every job rather than it is "outdated"? How would one write a GUI
library with chains and CTFE?

Second, while 'auto' is nice, for learning examples I think putting the
type there is actually more helpful to someone trying to understand
what is happening. If you know the type, why not just write it ... it's
not like using auto saves you any work in most cases. I understand that
it's nice in templates and for ranges and the like, but for basic types
I don't see any advantage to using it.
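A rough sketch of where the line between the two styles falls for the code in this thread. The function names `eagerBytes` and `lazyBytes` are made up for illustration.

```d
import std.algorithm : map;
import std.array : array;
import std.conv : to;
import std.range : chunks, ElementType;

// After .array the result has a simple, nameable type, so it can be
// written out explicitly.
ubyte[] eagerBytes(string hex)
{
    return hex.chunks(2).map!(s => s.to!ubyte(16)).array;
}

// The lazy pipeline's type is a nested template instantiation that
// involves the lambda, so `auto` is the practical choice here.
auto lazyBytes(string hex)
{
    return hex.chunks(2).map!(s => s.to!ubyte(16));
}

// Ask the compiler what `auto` stands for; the name is printed at
// compile time.
pragma(msg, typeof(lazyBytes("")));

// Whatever that type is called, its elements are still plain ubyte.
static assert(is(ElementType!(typeof(lazyBytes(""))) == ubyte));
```

So the explicit type works fine for the array, while the range version more or less forces `auto`.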
Feb 06
parent reply rikki cattermole <rikki cattermole.co.nz> writes:
On 06/02/2018 8:46 PM, Craig Dillabaugh wrote:
 On Tuesday, 6 February 2018 at 18:46:54 UTC, H. S. Teoh wrote:
 On Tue, Feb 06, 2018 at 06:33:02PM +0000, Ralph Doncaster via 
 Digitalmars-d-learn wrote:
clip
 OO is outdated.  D uses the range-based idiom with UFCS for chaining 
 operations in a way that doesn't require you to write loops yourself. 
 For example:

     import std.array;
     import std.algorithm;
     import std.conv;
     import std.range;

     // No need to use .toStringz unless you're interfacing with C
     auto hex = "deadbeef";    // let compiler infer the type for you

     auto bytes = hex.chunks(2)    // lazily iterate over `hex` by 
 digit pairs
        .map!(s => s.to!ubyte(16))    // convert each pair to a ubyte
        .array;            // make an array out of it

     // Do whatever you wish with the ubyte[] array.
     writefln("%(%02X %)", bytes);
clip
 T
Wouldn't it be more accurate to say OO is not the correct tool for every job rather than it is "outdated".  How would one write a GUI library with chains and CTFE?
But you could with signatures and structs instead ;)
Feb 06
parent reply Craig Dillabaugh <craig.dillabaugh gmail.com> writes:
On Wednesday, 7 February 2018 at 03:25:05 UTC, rikki cattermole 
wrote:
 On 06/02/2018 8:46 PM, Craig Dillabaugh wrote:
 On Tuesday, 6 February 2018 at 18:46:54 UTC, H. S. Teoh wrote:
 [...]
clip
[...]
clip
 [...]
Wouldn't it be more accurate to say OO is not the correct tool for every job rather than it is "outdated".  How would one write a GUI library with chains and CTFE?
But you could with signatures and structs instead ;)
I am not sure how this would work ... would this actually be a good idea, or are you just saying that technically it would be possible?
Feb 06
parent rikki cattermole <rikki cattermole.co.nz> writes:
On 07/02/2018 4:06 AM, Craig Dillabaugh wrote:
 On Wednesday, 7 February 2018 at 03:25:05 UTC, rikki cattermole wrote:
 On 06/02/2018 8:46 PM, Craig Dillabaugh wrote:
 On Tuesday, 6 February 2018 at 18:46:54 UTC, H. S. Teoh wrote:
 [...]
clip
 [...]
clip
 [...]
Wouldn't it be more accurate to say OO is not the correct tool for every job rather than it is "outdated".  How would one write a GUI library with chains and CTFE?
But you could with signatures and structs instead ;)
I am not sure how this would work ... would this actually be a good idea, or are you just saying that technically it would be possible?
A very good idea :)

WIP:
https://github.com/rikkimax/DIPs/blob/master/DIPs/DIP1xxx-RC.md
https://github.com/rikkimax/stdc-signatures/tree/master/stdc
Feb 06
prev sibling next sibling parent Machin <machgyl zbor.ue> writes:
On Tuesday, 6 February 2018 at 18:33:02 UTC, Ralph Doncaster 
wrote:
 I've been reading std.conv and std.range, trying to figure out 
 a high-level way of converting a hex string to bytes.  The only 
 way I've been able to do it is through pointer access:

 import std.stdio;
 import std.string;
 import std.conv;

 void main()
 {
     immutable char* hex = "deadbeef".toStringz;
     for (auto i=0; hex[i]; i += 2)
         writeln(to!byte(hex[i]));
 }


 While it works, I'm wondering if there is a more 
 object-oriented way of doing it in D.
Converting data has nothing to do with OOP. In D we write it like this:

```
import std.range : chunks;             // consumes lazily two by two
import std.algorithm.iteration : map;  // apply a func to the chunks
import std.conv : to;                  // the func: convert with a custom base
import std.array : array;              // render the whole stuff

ubyte[] a = "deadbeef".chunks(2).map!(s => s.to!ubyte(16)).array;
```
Feb 06
prev sibling next sibling parent reply Ralph Doncaster <nerdralph github.com> writes:
On Tuesday, 6 February 2018 at 18:33:02 UTC, Ralph Doncaster 
wrote:
 I've been reading std.conv and std.range, trying to figure out 
 a high-level way of converting a hex string to bytes.  The only 
 way I've been able to do it is through pointer access:

 import std.stdio;
 import std.string;
 import std.conv;

 void main()
 {
     immutable char* hex = "deadbeef".toStringz;
     for (auto i=0; hex[i]; i += 2)
         writeln(to!byte(hex[i]));
 }
Thanks for all the feedback. I'll have to do some more reading about
maps. My initial thought is they don't seem as readable as loops.
The chunks() is useful, so for now what I'm going with is:

    ubyte[] arr;
    foreach (b; "deadbeef".chunks(2))
    {
        arr ~= b.to!ubyte(16);
    }
Feb 06
parent Ali Çehreli <acehreli yahoo.com> writes:
On 02/06/2018 11:55 AM, Ralph Doncaster wrote:

 I'll have to do some more reading about
 maps.  My initial thought is they don't seem as readable as loops.
Surprisingly, they may be very easy to read in some situations.
 The chunks() is useful, so for now what I'm going with is:
      ubyte[] arr;
      foreach (b; "deadbeef".chunks(2))
      {
          arr ~= b.to!ubyte(16);
      }
That is great but it has two issues that may be important in some
programs:

1) It makes multiple allocations as the array grows

2) You need another loop to do the actual work later on

If you needed to go through the elements just once, the memory
allocations would be wasted. Instead, you can solve both issues with a
chained expression like the ones that have been shown by others.

The cool thing is, you can always hide the ugly bits in a function.
Starting with ag0aep6g's code, I went overboard to pick a better
destination type (other than ubyte) to support different chunk sizes:

void main() {
    import std.stdio;
    foreach (b; "deadbeef".hexValues) {
        writefln("%2X", b);
    }
    // Works for different sized chunks as well:
    writeln("12345678".hexValues!4);
}

auto hexValues(size_t digits = 2, ToType = DefaultTypeForSize!digits)(string s) {
    import std.algorithm: map;
    import std.conv: to;
    import std.range: chunks;
    return s.chunks(digits).map!(chars => chars.to!ToType(16));
}

// Note: `s` is the number of hex digits per chunk, not a byte count.
// Each type below is wide enough to hold that many digits.
template DefaultTypeForSize(size_t s) {
    static if (s == 1) {
        alias DefaultTypeForSize = ubyte;
    } else static if (s == 2) {
        alias DefaultTypeForSize = ushort;
    } else static if (s == 4) {
        alias DefaultTypeForSize = uint;
    } else static if (s == 8) {
        alias DefaultTypeForSize = ulong;
    } else {
        import std.string : format;
        static assert(false, format("There is no default type for %s-digit chunks", s));
    }
}

Ali
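If the explicit loop is still preferred, the first issue (repeated reallocation as `arr ~=` grows the array) can be reduced on its own. The sketch below swaps in std.array's Appender with an up-front reserve; this is not something proposed in the thread, and the function name `hexToBytesReserved` is hypothetical.

```d
import std.array : appender;
import std.conv : to;
import std.range : chunks;

ubyte[] hexToBytesReserved(string hex)
{
    auto buf = appender!(ubyte[])();

    // Two hex digits make one byte, so reserve that much capacity
    // once instead of letting the array grow step by step.
    buf.reserve(hex.length / 2);

    foreach (pair; hex.chunks(2))
        buf.put(pair.to!ubyte(16));

    return buf.data;
}
```

This keeps the loop but still builds the whole array, so the second issue (a separate pass to consume the result) remains; only the lazy chained version avoids both.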
Feb 06
prev sibling parent reply Ralph Doncaster <nerdralph github.com> writes:
On Tuesday, 6 February 2018 at 18:33:02 UTC, Ralph Doncaster 
wrote:
 I've been reading std.conv and std.range, trying to figure out 
 a high-level way of converting a hex string to bytes.  The only 
 way I've been able to do it is through pointer access:

 import std.stdio;
 import std.string;
 import std.conv;

 void main()
 {
     immutable char* hex = "deadbeef".toStringz;
     for (auto i=0; hex[i]; i += 2)
         writeln(to!byte(hex[i]));
 }


 While it works, I'm wondering if there is a more 
 object-oriented way of doing it in D.
After a bunch of searching, I came across hex string literals. They are
mentioned but not documented as a literal.
https://dlang.org/spec/lex.html#string_literals

Combined with the toHexString function in std.digest, it is easy to
convert between hex strings and byte arrays.

import std.stdio;
import std.digest;

void main()
{
    auto data = cast(ubyte[]) x"deadbeef";
    writeln("data: 0x", toHexString(data));
}

p.s. the cast should probably be to immutable ubyte[]. I'm guessing
without it, there is an automatic copy of the data being made.
Feb 07
parent Adam D. Ruppe <destructionator gmail.com> writes:
On Wednesday, 7 February 2018 at 14:47:04 UTC, Ralph Doncaster 
wrote:
 p.s. the cast should probably be to immutable ubyte[].  I'm 
 guessing without it, there is an automatic copy of the data 
 being made.
No copy - you just get undefined behavior if you actually try to modify it!
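A short sketch of the two safe options that follow from this: keep the literal's bytes immutable when they are only read, or take an explicit copy with `.dup` before modifying anything. The variable names are illustrative.

```d
import std.digest : toHexString;
import std.stdio : writeln;

void main()
{
    // Read-only view of the literal's data: no copy is made, and the
    // immutable type stops accidental writes at compile time.
    auto view = cast(immutable(ubyte)[]) x"deadbeef";

    // An explicit mutable copy, safe to modify.
    ubyte[] copy = view.dup;
    copy[0] = 0x00;

    writeln(toHexString(view));  // DEADBEEF
    writeln(toHexString(copy));  // 00ADBEEF
}
```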
Feb 07