www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - std.md5 and my.md5

reply Regan Heath <regan netwin.co.nz> writes:
------------5dT1VNmihRsA5JqKm7L2mi
Content-Type: text/plain; format=flowed; charset=iso-8859-15
Content-Transfer-Encoding: 8bit

Hi,

After writing my own version of the md5 algo, then discovering the std.md5 
implementation, I thought I would ask peoples thoughts on which version 
they prefer and why. Mainly to discover what I have done right/wrong when 
I wrote it.

So if anyone is interested I am attaching my source.

Comments of all shapes and sizes are desired and most welcome, 
particularly...

- The fact that I have used a class and std.md5 uses a struct.
- The naming of the functions.

std.md5 does have a 'sum' function which I do not, one could be added 
trivially.


I think I made some mistakes, for example:

- My functions take char[] whereas std.md5 takes void[]
- my digest is a char[] std.md5 uses a ubyte[]

..etc..

Anything you say, can and will be used.. to enable me to write better D 
code!

Regan

-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
------------5dT1VNmihRsA5JqKm7L2mi
Content-Disposition: attachment; filename=md5.d
Content-Type: text/plain; name=md5.d
Content-Transfer-Encoding: 8bit

/* md5.d - RSA Data Security, Inc., MD5 message-digest algorithm
 * Derived from the RSA Data Security, Inc. MD5 Message-Digest Algorithm.
 */


/* Copyright (C) 1991-2, RSA Data Security, Inc. Created 1991. All
rights reserved.

License to copy and use this software is granted provided that it
is identified as the "RSA Data Security, Inc. MD5 Message-Digest
Algorithm" in all material mentioning or referencing this software
or this function.

License is also granted to make and use derivative works provided
that such works are identified as "derived from the RSA Data
Security, Inc. MD5 Message-Digest Algorithm" in all material
mentioning or referencing the derived work.

RSA Data Security, Inc. makes no representations concerning either
the merchantability of this software or the suitability of this
software for any particular purpose. It is provided "as is"
without express or implied warranty of any kind.
These notices must be retained in any copies of any part of this
documentation and/or software.
 */

module regans.md5;

static uint state[4] = [ 0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476 ];
static char[64] padding = [ 0x80 ];

static uint S11 = 7;
static uint S12 = 12;
static uint S13 = 17;
static uint S14 = 22;
static uint S21 = 5;
static uint S22 = 9;
static uint S23 = 14;
static uint S24 = 20;
static uint S31 = 4;
static uint S32 = 11;
static uint S33 = 16;
static uint S34 = 23;
static uint S41 = 6;
static uint S42 = 10;
static uint S43 = 15;
static uint S44 = 21;

class MD5 {

	unittest {
		auto MD5 m = new MD5();

		m.reset();
		m.process("");
		assert(m.toString() == "d41d8cd98f00b204e9800998ecf8427e");
		
		m.reset();
		m.process("a");
		assert(m.toString() == "0cc175b9c0f1b6a831c399e269772661");

		m.reset();
		m.process("abc");
		assert(m.toString() == "900150983cd24fb0d6963f7d28e17f72");

		m.reset();
		m.process("message digest");
		assert(m.toString() == "f96b697d7cb7938d525a2f31aaf161d0");

		m.reset();
		m.process("abcdefghijklmnopqrstuvwxyz");
		assert(m.toString() == "c3fcd3d76192e4007dfb496cca67e13b");

		m.reset();
		m.process("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789");
		assert(m.toString() == "d174ab98d277d9f5a5611c2c9f419d9f");

		m.reset();
		m.process("12345678901234567890123456789012345678901234567890123456789012345678901234567890");
		assert(m.toString() == "57edf4a22be3c955ac49da2e2107b67a");
	}
		
	this() {
		reset();
	}
	
	void reset() {
		state[] = .state;
		count[] = 0;
		md5.length = 0;		
		buffer.length = 0;		
	}
	
	void process(char[] input) {
		uint i;

		addBits(input.length);
		buffer ~= input[];
		
		for(i = 0; i+63 < buffer.length; i += 64) {			
			transform(buffer[i..i+64]);
		}
			
		buffer = buffer[i..buffer.length];
	}
	
	char[] get() {
		if (md5.length == 0) {
			char[] bits;
			uint i,b;
		
			bits = encode(count);
		
			b = getBytes();
			i = (b < 56) ? (56 - b) : (120 - b);
			
			process(padding[0..i]);
			process(bits);
			
			md5 = encode(state);
		}
		return md5;
	}

	char[] toString() {
		char[] result;
		int i;
		
		result.length = 32;
		get();
		
		for(i = 0; i < 16; i ++) {
			result[i*2] = "0123456789abcdef"[md5[i]/16];
			result[i*2+1] = "0123456789abcdef"[md5[i]%16];
		}
			
		return result;			
	}
	
private:
	char[] buffer;
	char[] md5;
	uint state[4];
	uint count[2];	

	void addBits(uint bytes) {
		count[0] += bytes << 3;
		if (count[0] < bytes << 3)
			count[1]++;
		count[1] += bytes >> 29;
	}
	
	uint getBytes() {
		return (count[0] >> 3) & 0x3f;
	}
	
	void transform(char[] block) {
		uint a,b,c,d;
		uint[] x;
		
		x = decode(block);

		a = state[0];
		b = state[1];
		c = state[2];
		d = state[3];
		
		/* Round 1 */
		ff(a, b, c, d, x[ 0], S11, 0xd76aa478); /* 1 */
		ff(d, a, b, c, x[ 1], S12, 0xe8c7b756); /* 2 */
		ff(c, d, a, b, x[ 2], S13, 0x242070db); /* 3 */
		ff(b, c, d, a, x[ 3], S14, 0xc1bdceee); /* 4 */
		ff(a, b, c, d, x[ 4], S11, 0xf57c0faf); /* 5 */
		ff(d, a, b, c, x[ 5], S12, 0x4787c62a); /* 6 */
		ff(c, d, a, b, x[ 6], S13, 0xa8304613); /* 7 */
		ff(b, c, d, a, x[ 7], S14, 0xfd469501); /* 8 */
		ff(a, b, c, d, x[ 8], S11, 0x698098d8); /* 9 */
		ff(d, a, b, c, x[ 9], S12, 0x8b44f7af); /* 10 */
		ff(c, d, a, b, x[10], S13, 0xffff5bb1); /* 11 */
		ff(b, c, d, a, x[11], S14, 0x895cd7be); /* 12 */
		ff(a, b, c, d, x[12], S11, 0x6b901122); /* 13 */
		ff(d, a, b, c, x[13], S12, 0xfd987193); /* 14 */
		ff(c, d, a, b, x[14], S13, 0xa679438e); /* 15 */
		ff(b, c, d, a, x[15], S14, 0x49b40821); /* 16 */
		
		/* Round 2 */
		gg(a, b, c, d, x[ 1], S21, 0xf61e2562); /* 17 */
		gg(d, a, b, c, x[ 6], S22, 0xc040b340); /* 18 */
		gg(c, d, a, b, x[11], S23, 0x265e5a51); /* 19 */
		gg(b, c, d, a, x[ 0], S24, 0xe9b6c7aa); /* 20 */
		gg(a, b, c, d, x[ 5], S21, 0xd62f105d); /* 21 */
		gg(d, a, b, c, x[10], S22,  0x2441453); /* 22 */
		gg(c, d, a, b, x[15], S23, 0xd8a1e681); /* 23 */
		gg(b, c, d, a, x[ 4], S24, 0xe7d3fbc8); /* 24 */
		gg(a, b, c, d, x[ 9], S21, 0x21e1cde6); /* 25 */
		gg(d, a, b, c, x[14], S22, 0xc33707d6); /* 26 */
		gg (c, d, a, b, x[ 3], S23, 0xf4d50d87); /* 27 */
		
		gg(b, c, d, a, x[ 8], S24, 0x455a14ed); /* 28 */
		gg(a, b, c, d, x[13], S21, 0xa9e3e905); /* 29 */
		gg(d, a, b, c, x[ 2], S22, 0xfcefa3f8); /* 30 */
		gg(c, d, a, b, x[ 7], S23, 0x676f02d9); /* 31 */
		gg(b, c, d, a, x[12], S24, 0x8d2a4c8a); /* 32 */
		
		/* Round 3 */
		hh(a, b, c, d, x[ 5], S31, 0xfffa3942); /* 33 */
		hh(d, a, b, c, x[ 8], S32, 0x8771f681); /* 34 */
		hh(c, d, a, b, x[11], S33, 0x6d9d6122); /* 35 */
		hh(b, c, d, a, x[14], S34, 0xfde5380c); /* 36 */
		hh(a, b, c, d, x[ 1], S31, 0xa4beea44); /* 37 */
		hh(d, a, b, c, x[ 4], S32, 0x4bdecfa9); /* 38 */
		hh(c, d, a, b, x[ 7], S33, 0xf6bb4b60); /* 39 */
		hh(b, c, d, a, x[10], S34, 0xbebfbc70); /* 40 */
		hh(a, b, c, d, x[13], S31, 0x289b7ec6); /* 41 */
		hh(d, a, b, c, x[ 0], S32, 0xeaa127fa); /* 42 */
		hh(c, d, a, b, x[ 3], S33, 0xd4ef3085); /* 43 */
		hh(b, c, d, a, x[ 6], S34,  0x4881d05); /* 44 */
		hh(a, b, c, d, x[ 9], S31, 0xd9d4d039); /* 45 */
		hh(d, a, b, c, x[12], S32, 0xe6db99e5); /* 46 */
		hh(c, d, a, b, x[15], S33, 0x1fa27cf8); /* 47 */
		hh(b, c, d, a, x[ 2], S34, 0xc4ac5665); /* 48 */
		
		/* Round 4 */
		ii(a, b, c, d, x[ 0], S41, 0xf4292244); /* 49 */
		ii(d, a, b, c, x[ 7], S42, 0x432aff97); /* 50 */
		ii(c, d, a, b, x[14], S43, 0xab9423a7); /* 51 */
		ii(b, c, d, a, x[ 5], S44, 0xfc93a039); /* 52 */
		ii(a, b, c, d, x[12], S41, 0x655b59c3); /* 53 */
		ii(d, a, b, c, x[ 3], S42, 0x8f0ccc92); /* 54 */
		ii(c, d, a, b, x[10], S43, 0xffeff47d); /* 55 */
		ii(b, c, d, a, x[ 1], S44, 0x85845dd1); /* 56 */
		ii(a, b, c, d, x[ 8], S41, 0x6fa87e4f); /* 57 */
		ii(d, a, b, c, x[15], S42, 0xfe2ce6e0); /* 58 */
		ii(c, d, a, b, x[ 6], S43, 0xa3014314); /* 59 */
		ii(b, c, d, a, x[13], S44, 0x4e0811a1); /* 60 */
		ii(a, b, c, d, x[ 4], S41, 0xf7537e82); /* 61 */
		ii(d, a, b, c, x[11], S42, 0xbd3af235); /* 62 */
		ii(c, d, a, b, x[ 2], S43, 0x2ad7d2bb); /* 63 */
		ii(b, c, d, a, x[ 9], S44, 0xeb86d391); /* 64 */
		
		state[0] += a;
		state[1] += b;
		state[2] += c;
		state[3] += d;
		
		x[] = 0;
	}
}

char[] encode(uint[] value)
in {
}
out (result) {
	//assert(decode(result) == value);
}
body{
	char[] result;
	uint i;
	
	result.length = value.length*4;
	for(i = 0; i < result.length; i += 4) {
		result[i] = value[i/4] & 0xff;
		result[i+1] = (value[i/4] >> 8) & 0xff;
		result[i+2] = (value[i/4] >> 16) & 0xff;
		result[i+3] = (value[i/4] >> 24) & 0xff;
	}

	return result;
}
uint[] decode(char[] block)
in {
}
out (result) {
	//assert(encode(result) == block);
}		
body {
	uint[] result;
	uint i;
	
	result.length = block.length/4;
	for (i = 0; i < block.length; i += 4) {
		result[i/4] = block[i] | (block[i+1] << 8) | (block[i+2] << 16) | (block[i+3]
<< 24);
	}
		
	return result;
}

uint f(uint x, uint y, uint z) {
	return (x & y) | (~x & z);
}
uint g(uint x, uint y, uint z) {
	return (x & z) | (y & ~z);
}
uint h(uint x, uint y, uint z) {
	return x ^ y ^ z;
}
uint i(uint x, uint y, uint z) {
	return y ^ (x | ~z);
}

uint rotateLeft(uint x, uint n) {
	return (x << n) | (x >> (32-n));
}

void ff(inout uint a, uint b, uint c, uint d, uint x, uint s, uint ac) {
	a += f(b, c, d) + x + ac;
	a = rotateLeft(a, s);
    a += b;
}
void gg(inout uint a, uint b, uint c, uint d, uint x, uint s, uint ac) {
	a += g(b, c, d) + x + ac;
	a = rotateLeft(a, s); 
	a += b;
}
void hh(inout uint a, uint b, uint c, uint d, uint x, uint s, uint ac) {
	a += h(b, c, d) + x + ac;
	a = rotateLeft(a, s);
	a += b;
}
void ii(inout uint a, uint b, uint c, uint d, uint x, uint s, uint ac) {
	a += i(b, c, d) + x + ac;
	a = rotateLeft(a, s);
	a += b;
}

------------5dT1VNmihRsA5JqKm7L2mi--
Jun 07 2004
parent reply Arcane Jill <Arcane_member pathlink.com> writes:
In article <opr872elh35a2sq9 digitalmars.com>, Regan Heath says...
- My functions take char[] whereas std.md5 takes void[]
- my digest is a char[] std.md5 uses a ubyte[]

char is most definitely the wrong type. A char stores a fragment of a UTF-8 encoded character stream. You need a ubyte. But it's nice to find someone else interested in crypto things. If you craft any other hash functions (the rest of the MD- family, the SHA- family, the new Tiger algorithm, and so on) there will definitely be a place for them in the forthcoming etc.crypto heirarchy. (We might have to haggle over the API a little). :-) Arcane Jill
Jun 07 2004
next sibling parent reply "Walter" <newshound digitalmars.com> writes:
"Arcane Jill" <Arcane_member pathlink.com> wrote in message
news:ca1mnp$2p58$1 digitaldaemon.com...
 But it's nice to find someone else interested in crypto things. If you

 other hash functions (the rest of the MD- family, the SHA- family, the new

 algorithm, and so on) there will definitely be a place for them in the
 forthcoming etc.crypto heirarchy. (We might have to haggle over the API a
 little). :-)

I haven't added the SHA (Secure Hash Algorithm) because the description of the algorithm says implementations need an export license. I don't know if this is obsolete or not, as there are SHA implementations all over the web, but I prefer that phobos not be encumbered with such problems.
Jun 07 2004
parent reply Arcane Jill <Arcane_member pathlink.com> writes:
In article <ca2brg$oaj$2 digitaldaemon.com>, Walter says...
I haven't added the SHA (Secure Hash Algorithm) because the description of
the algorithm says implementations need an export license.

Sensible, but as it happens your information is out of date. US export restrictions on cryptography were relaxed in September 1998, and dropped altogether in January 2000. Put it like this. You know when you connect to an https:// web site, and you get all that malarky with security certificates, and maybe a little padlock icon in the corner of your browser if you're lucky? Well that's SSL, and the SSL protocol includes an implementation of SHA-1. If SHA is illegal in your country, then there must be an AWFUL lot of lawbreakers around, including Microsoft, Netscape, .... Anyway, I don't live in the US, so those dumb rules never did apply to me. It always made me laugh that the US was not allowed to export to me that which I already had. (I could export it to them!)
I don't know if
this is obsolete or not, as there are SHA implementations all over the web,
but I prefer that phobos not be encumbered with such problems.

Sounds like a job for Deimos then. The crypto community at large WANT easy-to-use implementations of TLS (that's the successor to SSL) out there, because apparently OpenSSL is just too damn hard to use. This is my goal, and I shall achieve it in D, and I will have to write an AWFUL lot of stuff to get there (big integers were just the start), but I will succeed. Of more concern is the fact that Hans Dobbertin has demonstrated a weakness in MD5 which now makes it unsuitable for serious crypto. There is every possibility that it might be broken in the next few years. SHA-256 is the hashing algorithm of choice these days. To be honest, it hadn't occured to me that you might have put SHA in Phobos but didn't. I largely figured I would be doing all the work myself anyway. But then, I guess I'm still hoping that (the forthcoming) etc.crypto may eventually become std.crypto... :-) Jill
Jun 07 2004
parent reply "Walter" <newshound digitalmars.com> writes:
"Arcane Jill" <Arcane_member pathlink.com> wrote in message
news:ca2e0r$rmg$1 digitaldaemon.com...
 In article <ca2brg$oaj$2 digitaldaemon.com>, Walter says...
I haven't added the SHA (Secure Hash Algorithm) because the description


the algorithm says implementations need an export license.

Sensible, but as it happens your information is out of date. US export restrictions on cryptography were relaxed in September 1998, and dropped altogether in January 2000.

That's good news.
 Put it like this. You know when you connect to an https:// web site, and

 all that malarky with security certificates, and maybe a little padlock

 the corner of your browser if you're lucky? Well that's SSL, and the SSL
 protocol includes an implementation of SHA-1. If SHA is illegal in your

 then there must be an AWFUL lot of lawbreakers around, including

 Netscape, ....

It was legal to use it, just not export it. That's why many software boxes got marked 'not for export'. Of course, it was absurd to think that this actually prevented anyone outside the country from getting it.
 Anyway, I don't live in the US, so those dumb rules never did apply to me.

 always made me laugh that the US was not allowed to export to me that

 already had. (I could export it to them!)

It had the effect of forcing US software companies to have a separate export product that was cryptographically crippled, putting them at a serious disadvantage to foreign competitors who of course had strong encryption.
I don't know if
this is obsolete or not, as there are SHA implementations all over the


but I prefer that phobos not be encumbered with such problems.

Sounds like a job for Deimos then. The crypto community at large WANT easy-to-use implementations of TLS (that's the successor to SSL) out

 because apparently OpenSSL is just too damn hard to use. This is my goal,

 shall achieve it in D, and I will have to write an AWFUL lot of stuff to

 there (big integers were just the start), but I will succeed.

 Of more concern is the fact that Hans Dobbertin has demonstrated a

 MD5 which now makes it unsuitable for serious crypto. There is every

 that it might be broken in the next few years. SHA-256 is the hashing

 of choice these days.

 To be honest, it hadn't occured to me that you might have put SHA in

 didn't. I largely figured I would be doing all the work myself anyway. But

 I guess I'm still hoping that (the forthcoming) etc.crypto may eventually

 std.crypto...

I hope so, too. Crypto is another of my interests, but generally not explored because I spend all my time with compilers <g>.
Jun 07 2004
parent reply Stephan Wienczny <wienczny web.de> writes:
Maybe we will see a new break-through in cryptography if Walter has some 
  time for it ;-)
Something like the "Walter Hash"...

Walter wrote:
 
 
 I hope so, too. Crypto is another of my interests, but generally not
 explored because I spend all my time with compilers <g>.
 
 

Jun 07 2004
parent "Walter" <newshound digitalmars.com> writes:
Sadly, I don't have a good enough math background to pretend I can advance
that field.

"Stephan Wienczny" <wienczny web.de> wrote in message
news:ca2i7u$126p$1 digitaldaemon.com...
 Maybe we will see a new break-through in cryptography if Walter has some
   time for it ;-)
 Something like the "Walter Hash"...

 Walter wrote:
 I hope so, too. Crypto is another of my interests, but generally not
 explored because I spend all my time with compilers <g>.


Jun 07 2004
prev sibling parent Regan Heath <regan netwin.co.nz> writes:
On Mon, 7 Jun 2004 12:23:21 +0000 (UTC), Arcane Jill 
<Arcane_member pathlink.com> wrote:
 In article <opr872elh35a2sq9 digitalmars.com>, Regan Heath says...
 - My functions take char[] whereas std.md5 takes void[]
 - my digest is a char[] std.md5 uses a ubyte[]

char is most definitely the wrong type. A char stores a fragment of a UTF-8 encoded character stream. You need a ubyte.

Yep... I realise that now :)
 But it's nice to find someone else interested in crypto things. If you 
 craft any
 other hash functions (the rest of the MD- family, the SHA- family, the 
 new Tiger
 algorithm, and so on) there will definitely be a place for them in the
 forthcoming etc.crypto heirarchy. (We might have to haggle over the API a
 little). :-)

I am such a newbie at crypto that I doubt I'll be much use, that won't stop me from having a go of course! (when I find the time) I'll let you know what I'm gonna try next, when I decide what that is. Regan. -- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Jun 07 2004