
digitalmars.D - std.digest can't CTFE?

Manu <turkeyman gmail.com> writes:
"CTFE
Digests do not work in CTFE"


That's an unfortunate limitation... why is, those things? :(
May 31 2018
Johannes Pfau <nospam example.com> writes:
On Thu, 31 May 2018 18:12:35 -0700, Manu wrote:

 Hashing's not low-level. It would be great if these did CTFE; generating
 compile-time hashes is a thing that would be really useful!
 Right here, I have a string class that carries a hash around with it for
 comparison reasons. Such string literals would prefer to have CT hashes.
 
As I was the one who wrote that doc comment: basically every hash implementation casts from an integer type to its raw byte representation somewhere. Because the binary representation needs to be portable, you have to be aware of the endianness of the system your code runs on. AFAIR CTFE does (did?) not provide any way to do endianness-dependent conversions at all, and there's also no way to know the CTFE endianness, so this is a fundamental limitation. (E.g. if you have a cross-compiler targeting a system with a different endianness, version(BigEndian) will give you the target endianness. But what will actually be used in CTFE?)

I don't know if anything has changed in this regard since std.digest was written some time ago. But if you get the std.bitmanip nativeTo*Endian and *EndianToNative functions to work in CTFE, std.digest should work as well.

There may be some workaround, as IIRC druntime's core.internal.hash works in CTFE? It's either this, or it's buggy in that cross-compilation scenario ;-)

-- Johannes
Jun 01 2018
Kagamin <spam here.lot> writes:
On Friday, 1 June 2018 at 08:37:33 UTC, Johannes Pfau wrote:
 I don't know if anything changed in this regard since 
 std.digest was written some time ago. But if you get the 
 std.bitmanip  nativeTo*Endian and *EndianToNative functions to 
 work in CTFE, std.digest should work as well.
Standard cryptographic algorithms are by design not dependent on endianness; rather, they settle on a specific endianness.
Jun 01 2018
Johannes Pfau <nospam example.com> writes:
On Fri, 01 Jun 2018 08:50:19 +0000, Kagamin wrote:

 Standard cryptographic algorithms are by design not dependent on endianness; rather, they settle on a specific endianness.
However you want to call it, the algorithms interpret data as numbers, which means that the binary representation differs based on endianness. If you want portable results, you can't ignore that fact in the implementation. So even though the algorithms are not dependent on the endianness, the representation of the result is. Therefore standards do usually propose an internal byte order.

-- Johannes
Jun 01 2018
Kagamin <spam here.lot> writes:
On Friday, 1 June 2018 at 10:04:52 UTC, Johannes Pfau wrote:
 However you want to call it, the algorithms interpret data as
 numbers, which means that the binary representation differs
 based on endianness. If you want portable results, you can't
 ignore that fact in the implementation. So even though the
 algorithms are not dependent on the endianness, the
 representation of the result is. Therefore standards do usually
 propose an internal byte order.
Huh? The algorithm packs bytes into integers and does it independently of platform. Once integers are formed, the arithmetic operations are independent of endianness. It works this way even in pure JavaScript, which is not sensitive to endianness.
Jun 01 2018
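For illustration, the packing Kagamin describes can be sketched in D as follows. This is a minimal sketch, not code from the thread or from Phobos: the helper names readLE32 and readBE32 and the sample bytes are assumptions made for the example. Because only indexing, shifts and ORs are used, the same source gives the same result on any host, and the compiler can evaluate it in a static assert.

```d
// Hypothetical helpers (names are assumptions, not Phobos API).
// A 32-bit value is assembled from bytes with shifts and ORs only,
// so no memory is reinterpreted and host byte order never matters.
uint readLE32(const(ubyte)[] b) pure nothrow @nogc @safe
{
    return  cast(uint) b[0]
         | (cast(uint) b[1] << 8)
         | (cast(uint) b[2] << 16)
         | (cast(uint) b[3] << 24);
}

uint readBE32(const(ubyte)[] b) pure nothrow @nogc @safe
{
    return (cast(uint) b[0] << 24)
         | (cast(uint) b[1] << 16)
         | (cast(uint) b[2] << 8)
         |  cast(uint) b[3];
}

// Sample input chosen for the example.
immutable ubyte[] sample = [0x12, 0x34, 0x56, 0x78];

// Both checks are evaluated by the compiler (CTFE).
static assert(readLE32(sample) == 0x78563412);
static assert(readBE32(sample) == 0x12345678);
```

The two functions differ only in which index gets which shift, which is the whole of the byte-order decision.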
Atila Neves <atila.neves gmail.com> writes:
On Friday, 1 June 2018 at 20:12:23 UTC, Kagamin wrote:
 Huh? The algorithm packs bytes into integers and does it independently of platform. Once integers are formed, the arithmetic operations are independent of endianness. It works this way even in pure javascript, which is not sensitive to endianness.
It's a common programming misconception that endianness matters much. It's one of those that just won't go away, like "GC languages are slow" or "C is magically fast". I recommend reading this:

https://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html

In short, unless you're a compiler writer or implementing a binary protocol, endianness only matters if you cast between pointers and integers. So... Don't.

Atila
Jun 01 2018
Johannes Pfau <nospam example.com> writes:
On Sat, 02 Jun 2018 06:31:37 +0000, Atila Neves wrote:

 It's a common programming misconception that endianness matters much. It's one of those that just won't go away, like "GC languages are slow" or "C is magically fast". I recommend reading this: https://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html In short, unless you're a compiler writer or implementing a binary protocol, endianness only matters if you cast between pointers and integers. So... Don't. Atila
That's an interesting point. When I said the algorithm depends on the system endianness, I was indeed always thinking in terms of machine code (i.e. if system endianness = data endianness you hopefully do nothing at all, otherwise you need some conversion). But it is indeed true that describing the conversion as mathematical shift operations + indexing leaves handling these differences to the compiler. So you can probably say the algorithm doesn't depend on system endianness, although a low-level representation of an implementation will. I guess this is what Kagamin wanted to explain; please excuse me for not getting the point.

So in our case, we can obviously use that higher-abstraction-level interpretation, and the idiom used in the article indeed works fine in CTFE. So somebody (Manu?) just has to fix the std.bitmanip *EndianToNative and nativeTo*Endian functions to use this (probably benchmarking the performance impact). Then std.digest should simply start working, or should at least be easy to fix for CTFE support.

-- Johannes
Jun 08 2018
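As a rough sketch of what Johannes proposes, an integer-to-bytes conversion can be written without ever asking what the native byte order is. The functions below are hypothetical stand-ins limited to uint; they are not the std.bitmanip implementation, and the names toBigEndianBytes and fromBigEndianBytes are assumptions made for the example.

```d
// Hypothetical stand-ins for the nativeTo*Endian / *EndianToNative idea,
// restricted to uint. Not the Phobos implementation.
ubyte[4] toBigEndianBytes(uint v) pure nothrow @nogc @safe
{
    ubyte[4] r;
    r[0] = cast(ubyte)(v >> 24);   // most significant byte first
    r[1] = cast(ubyte)(v >> 16);
    r[2] = cast(ubyte)(v >> 8);
    r[3] = cast(ubyte) v;          // least significant byte last
    return r;
}

uint fromBigEndianBytes(ubyte[4] b) pure nothrow @nogc @safe
{
    return (cast(uint) b[0] << 24) | (cast(uint) b[1] << 16)
         | (cast(uint) b[2] << 8)  |  cast(uint) b[3];
}

// Initialising an enum forces compile-time evaluation.
enum ubyte[4] wire = toBigEndianBytes(0xDEADBEEF);
static assert(wire[0] == 0xDE && wire[1] == 0xAD
           && wire[2] == 0xBE && wire[3] == 0xEF);
static assert(fromBigEndianBytes(wire) == 0xDEADBEEF);
```

Whether an optimiser turns these shifts into a single load or byte swap is the benchmarking question raised in the post; the sketch makes no claim about that.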
Manu <turkeyman gmail.com> writes:
On Fri, 8 Jun 2018 at 11:35, Johannes Pfau via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 That's an interesting point. When I said the algorithm depends on the system endianness, I was indeed always thinking in terms of machine code (i.e. if system endianness = data endianness you hopefully do nothing at all, otherwise you need some conversion). But it is indeed true that describing the conversion as mathematical shift operations + indexing leaves handling these differences to the compiler. So you can probably say the algorithm doesn't depend on system endianness, although a low-level representation of an implementation will. I guess this is what Kagamin wanted to explain; please excuse me for not getting the point. So in our case, we can obviously use that higher-abstraction-level interpretation, and the idiom used in the article indeed works fine in CTFE. So somebody (Manu?) just has to fix the std.bitmanip *EndianToNative and nativeTo*Endian functions to use this (probably benchmarking the performance impact). Then std.digest should simply start working, or should at least be easy to fix for CTFE support.
I'm already burning about 3x my reasonably allocate-able free time to DMD PR's... I'd really love if someone else would look at that :)

I'm not quite sure what you mean though; endian conversion functions are still endian conversion functions, and they shouldn't be affected here. The problem is in the std.digest code where it *calls* endian functions (or makes endian assumptions). There need be no reference to endian in std.digest... if code is pulling bytes from an int (i.e. cast(byte*)) or something, just use ubyte[4] and index it instead of uint, etc. I'm surprised that digest code would use anything other than byte buffers. It may be that there are some optimised version()-ed fast-paths that are endian-conscious, but the default path has no reason to not work.
Jun 08 2018
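One way to keep an endian-conscious fast path while leaving a default path that works in CTFE is to branch on D's __ctfe variable. The following is only a sketch under assumptions: loadLE32 and shiftLE32 are hypothetical names, and nothing here is taken from std.digest. memcpy is used for the runtime path so that alignment of the input does not matter.

```d
import core.stdc.string : memcpy;

// Portable path: shifts and ORs only (hypothetical helper).
private uint shiftLE32(const(ubyte)[] b) pure nothrow @nogc @safe
{
    return  cast(uint) b[0]        | (cast(uint) b[1] << 8)
         | (cast(uint) b[2] << 16) | (cast(uint) b[3] << 24);
}

// Hypothetical loader: portable during CTFE, fast path at run time.
uint loadLE32(const(ubyte)[] b) nothrow @nogc @trusted
{
    assert(b.length >= 4);
    if (__ctfe)
    {
        // The interpreter never sees a reinterpreting copy or cast.
        return shiftLE32(b);
    }
    else
    {
        version (LittleEndian)
        {
            // Target order matches the data: copy the four bytes as they are.
            uint r;
            memcpy(&r, b.ptr, r.sizeof);
            return r;
        }
        else
        {
            return shiftLE32(b);
        }
    }
}

immutable ubyte[] le = [0x78, 0x56, 0x34, 0x12];
static assert(loadLE32(le) == 0x12345678);   // takes the __ctfe branch

unittest
{
    assert(loadLE32(le) == 0x12345678);      // takes the run-time branch
}
```

The version block is resolved for the target, while the __ctfe branch avoids the question of what the interpreter would do, which is the cross-compilation concern raised earlier in the thread.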
Johannes Pfau <nospam example.com> writes:
On Fri, 08 Jun 2018 11:46:41 -0700, Manu wrote:
 
 I'm already burning about 3x my reasonably allocate-able free time to
 DMD PR's...
 I'd really love if someone else would look at that :)
I'll see if I can allocate some time for that. Should be a mostly trivial change.
 I'm not quite sure what you mean though; endian conversion functions are
 still endian conversion functions, and they shouldn't be affected here.
Yes, but the point made in that article is that you can implement *Endian<=>native conversions without knowing the native endianness. This would immediately make these functions CTFE-able.
 The problem is in the std.digest code where it *calls* endian functions
 (or makes endian assumptions). There need be no reference to endian in
 std.digest... if code is pulling bytes from an int (ie, cast(byte*)) or
 something, just use ubyte[4] and index it instead of uint, etc. I'm
 surprised that digest code would use anything other than byte buffers.
 It may be that there are some optimised version()-ed fast-paths that might be
 endian conscious, but the default path has no reason to not work.
That's not how hash algorithms are usually specified. These algorithms perform bit rotate operations, additions, and multiplications on these values*. You could probably implement these on byte[4] values instead, but you'd waste time porting the algorithm and benchmarking possible performance impacts, and it would be more difficult to compare the implementation to the reference implementation (think of audits). So it's not realistic to change this.

* An interesting question here is whether you could actually always ignore system endianness and do simple casts if you cleverly adjust all constants in the algorithm to fit?

-- Johannes
Jun 10 2018
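To make the two positions concrete, here is a toy sketch in which the arithmetic stays on 32-bit words, as hash specifications describe it, while byte order is handled only where bytes become words. It is NOT a real hash function: the mixing step is made up, and the names loadBE32, rotl and mix are assumptions for the example.

```d
// Toy sketch, not a real digest. All names are hypothetical.
uint loadBE32(const(ubyte)[] b) pure nothrow @nogc @safe
{
    return (cast(uint) b[0] << 24) | (cast(uint) b[1] << 16)
         | (cast(uint) b[2] << 8)  |  cast(uint) b[3];
}

uint rotl(uint x, uint n) pure nothrow @nogc @safe
{
    // Valid for 0 < n < 32, which is all this sketch uses.
    return (x << n) | (x >> (32 - n));
}

// One made-up mixing step: rotate the state, then add the next word.
uint mix(uint state, const(ubyte)[] block) pure nothrow @nogc @safe
{
    const w = loadBE32(block[0 .. 4]);  // byte order is decided only here
    return rotl(state, 5) + w;          // plain word arithmetic from here on
}

immutable ubyte[] sampleBlock = [0x80, 0x00, 0x00, 0x01];
static assert(loadBE32(sampleBlock) == 0x8000_0001);
static assert(rotl(0x8000_0001, 1) == 0x0000_0003);
static assert(mix(1, sampleBlock) == 0x8000_0021);
```

The rotate and add operate on uint values exactly as a reference implementation would, so the structure stays comparable to a specification; only the load differs from a pointer cast.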
Stefan Koch <uplink.coder googlemail.com> writes:
On Thursday, 31 May 2018 at 21:29:13 UTC, Manu wrote:
 "CTFE
 Digests do not work in CTFE"


 That's an unfortunate limitation... why is, those things? :(
Because CTFE cannot do things which are technically ABI dependent. You can work around it with code like this:

// Not in the original post, which uses Endianess without defining it.
enum Endianess { LittleEndian, BigEndian }

T fromBytes(T, Endianess endianess = Endianess.LittleEndian)
    (const ubyte[] _data) pure
{
    static assert(is(T : long)); // poor man's isIntegral
    T result;
    static if (endianess == Endianess.LittleEndian)
    {
        static if (T.sizeof == 4)
        {
            result = (
                _data[0] |
                (_data[1] << 8) |
                (_data[2] << 16) |
                (_data[3] << 24)
            );
        }
        else static if (T.sizeof == 8)
        {
            // Byte 3 is widened before shifting: as an int, a value with
            // bit 31 set would be sign-extended into the upper 32 bits.
            result = (
                _data[0] |
                (_data[1] << 8) |
                (_data[2] << 16) |
                (cast(ulong)_data[3] << 24) |
                (cast(ulong)_data[4] << 32UL) |
                (cast(ulong)_data[5] << 40UL) |
                (cast(ulong)_data[6] << 48UL) |
                (cast(ulong)_data[7] << 56UL)
            );
        }
        else
        {
            static assert(0, "only int and long are supported");
        }
    }
    else
    {
        static assert(0, "Big Endian currently not supported");
    }
    return result;
}
Jun 01 2018
Sisor <smietzner yahoo.de> writes:
On Friday, 1 June 2018 at 14:56:32 UTC, Stefan Koch wrote:
 Because CTFE cannot do things which are technically ABI dependent.
Is there a technical reason for this? Can CTFE determine the endianness of the (target-) system?
Jun 01 2018
Stefan Koch <uplink.coder googlemail.com> writes:
On Friday, 1 June 2018 at 18:30:34 UTC, Sisor wrote:
 Is there a technical reason for this? Can CTFE determine the endianness of the (target-) system?
It's more than just endianness; it's also alignment. I'd rather not make guarantees about that. newCTFE, for example, is built in a reasonably platform-agnostic way; therefore it cannot know anything target- or even host-specific without the programmer being explicit about it.
Jun 01 2018
Kagamin <spam here.lot> writes:
On Thursday, 31 May 2018 at 21:29:13 UTC, Manu wrote:
 "CTFE
 Digests do not work in CTFE"


 That's an unfortunate limitation... why is, those things? :(
just for fun: https://run.dlang.io/gist/861f14f0d776f75e0195c8910990e14c - chacha20 running at compile time :)
Jun 01 2018
Kagamin <spam here.lot> writes:
https://run.dlang.io/gist/946773fd185902c7f65ba32cb3368719 - and 
this is poly1305, the combined source is too big for 
run.dlang.org.
Jun 01 2018