digitalmars.D.learn - range result in Tuple! and how to convert into assocArray by sort?

reply MichaelBi <shunjie.bi gmail.com> writes:
s is the string, and the following line prints the result shown below it:

s.array.sort!("a<b").group.assocArray.byPair.array.sort!("a[0]<b[0]").each!writeln;

Tuple!(dchar, "key", uint, "value")('A', 231)
Tuple!(dchar, "key", uint, "value")('C', 247)
Tuple!(dchar, "key", uint, "value")('G', 240)
Tuple!(dchar, "key", uint, "value")('T', 209)

How can I convert this into
[['A',231],['C',247],['G',240],['T',209]]? I tried map!, but could
only extract either the key or the value. I tried array(), but then
the result is no longer sorted. Thanks in advance.
May 09 2022
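A minimal sketch of one range-based way to get the requested output. It relies on the fact that `group` over sorted input already yields (element, count) tuples in key order, so the round trip through `assocArray` and a second sort is not needed. Because D arrays are homogeneous, the `[['A',231],...]` form is produced here as text; the helper name `countsAsText` and the sample input are assumptions for illustration.

```d
import std.algorithm : group, map, sort;
import std.array : array, join;
import std.format : format;
import std.stdio : writeln;

// Hypothetical helper: renders per-letter counts as "[['A',3],...]".
string countsAsText(string s)
{
    // s.array makes a mutable dchar[] copy that sort() can reorder.
    auto sorted = s.array.sort();

    // group() collapses each run of equal elements into a
    // (element, run length) tuple, in the order of the sorted input.
    return "[" ~ sorted.group
        .map!(t => format("['%s',%s]", t[0], t[1]))
        .join(",") ~ "]";
}

void main()
{
    writeln(countsAsText("GATTACA")); // [['A',3],['C',1],['G',1],['T',2]]
}
```

If typed data is wanted rather than text, ending the chain with `.group.array` keeps the tuples themselves in sorted order.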
next sibling parent reply rikki cattermole <rikki cattermole.co.nz> writes:
If I am understanding the problem correctly, this is a super expensive 
method for doing something pretty simple. Even if it is a bit more code, 
this won't require memory allocation which in this case wouldn't be 
cheap (given how big DNA tends to be).

string s = "ACGTACGT";

uint[4] counts;

foreach(char c; s) {
	switch(c) {
		case 'A':
		case 'a':
			counts[0]++;
			break;
		case 'C':
		case 'c':
			counts[1]++;
			break;
		case 'G':
		case 'g':
			counts[2]++;
			break;
		case 'T':
		case 't':
			counts[3]++;
			break;
		default:
			assert(0, "Unknown compound");
	}
}

writeln(counts);
May 09 2022
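The fixed-size counting approach above prints only the four numbers. A sketch of how they might be paired with their letters again, assuming a hypothetical helper `countBases` that wraps the same switch and uses `zip` from std.range:

```d
import std.array : array;
import std.range : zip;
import std.stdio : writeln;

// Hypothetical helper: counts A, C, G, T (either case) into a fixed array.
uint[4] countBases(const(char)[] s)
{
    uint[4] counts;
    foreach (char c; s)
    {
        switch (c)
        {
            case 'A', 'a': counts[0]++; break;
            case 'C', 'c': counts[1]++; break;
            case 'G', 'g': counts[2]++; break;
            case 'T', 't': counts[3]++; break;
            default: assert(0, "Unknown compound");
        }
    }
    return counts;
}

void main()
{
    auto counts = countBases("ACGTACGT");

    // zip pairs each letter with its count; the order is fixed by "ACGT",
    // so no sorting step is involved.
    auto pairs = zip("ACGT", counts[]).array;
    writeln(pairs);
}
```

The counting loop itself allocates nothing; only the final `.array` call does, and it holds just four tuples.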
next sibling parent reply Ali Çehreli <acehreli yahoo.com> writes:
On 5/9/22 20:38, rikki cattermole wrote:

 this is a super expensive
 method for doing something pretty simple.
Yes! :) Assuming the data is indeed validated in some way, the following should be even faster. It validates the data after the fact:

import std.stdio;
import std.range;
import std.exception;
import std.algorithm;
import std.format;

const ulong[] alphabet = [ 'A', 'C', 'G', 'T' ];

void main() {
    string s = "ACGTACGT";

    auto counts = new ulong[char.max];

    foreach(char c; s) {
        counts[c]++;
    }

    validateCounts(counts);
    writeln(counts.indexed(alphabet));
}

void validateCounts(ulong[] counts) {
    // The other elements should all be zero.
    enforce(counts
            .enumerate
            .filter!(t => !alphabet.canFind(t.index))
            .map!(t => t.value)
            .sum == 0,
            format!"There were illegal letters in the data: %s"(counts));
}

Ali
May 09 2022
parent reply MichaelBi <shunjie.bi gmail.com> writes:
On Tuesday, 10 May 2022 at 04:21:04 UTC, Ali Çehreli wrote:
 On 5/9/22 20:38, rikki cattermole wrote:

 [...]
Yes! :) Assuming the data is indeed validated in some way, the following should be even faster. It validates the data after the fact: [...]
this is cool! thanks for your time and i really like your book Programming in D :)
May 09 2022
parent Ali Çehreli <acehreli yahoo.com> writes:
On 5/9/22 22:12, MichaelBi wrote:
 On Tuesday, 10 May 2022 at 04:21:04 UTC, Ali Çehreli wrote:
 On 5/9/22 20:38, rikki cattermole wrote:

 [...]
Yes! :) Assuming the data is indeed validated in some way, the following should be even faster. It validates the data after the fact: [...]
this is cool!
I've been meaning to write about a bug in my code, which would likely cause zero issues, and which you've probably already fixed. ;)

BAD:

    auto counts = new ulong[char.max];

GOOD:

    auto counts = new ulong[char.max - char.min + 1];

FINE:

    auto counts = new ulong[256];
 thanks for your time and i really like your book
 Programming in D :)
Yay! :) Ali
May 11 2022
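The difference between the BAD and GOOD lines above is an off-by-one: `char.max` is 255, so an array of that length has valid indices 0 through 254 and cannot be indexed by the code unit 0xFF. A small sketch that checks the arithmetic (the variable names are illustrative only):

```d
import std.stdio : writeln;

void main()
{
    // char covers 256 distinct values, 0 through 255.
    static assert(char.min == 0);
    static assert(char.max == 255);
    static assert(char.max - char.min + 1 == 256);

    auto bad  = new ulong[char.max];                // valid indices 0 .. 254
    auto good = new ulong[char.max - char.min + 1]; // valid indices 0 .. 255

    writeln(bad.length, " ", good.length); // 255 256
}
```

For the letters A, C, G, and T the shorter array happens to be enough, which is why the bug would likely cause zero issues in practice.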
prev sibling parent MichaelBi <shunjie.bi gmail.com> writes:
On Tuesday, 10 May 2022 at 03:38:08 UTC, rikki cattermole wrote:
 [...]
Yes, thanks, I understand this. My problem now is that after learning D, I always think in terms of ranges and function composition, and forget the basic algorithm :)
May 09 2022
prev sibling parent forkit <forkit gmail.com> writes:
On Tuesday, 10 May 2022 at 03:22:04 UTC, MichaelBi wrote:
 [...]
Adding tuples to an AA is easy.

Sorting the output of an AA is the tricky part.

// -----
module test;
@safe:

import std;

void main()
{
    uint[dchar] myAA;
    Tuple!(dchar, uint) myTuple;

    myTuple[0] = 'C'; myTuple[1] = 247;
    myAA[ myTuple[0] ] = myTuple[1];

    myTuple[0] = 'G'; myTuple[1] = 240;
    myAA[ myTuple[0] ] = myTuple[1];

    myTuple[0] = 'A'; myTuple[1] = 231;
    myAA[ myTuple[0] ] = myTuple[1];

    myTuple[0] = 'T'; myTuple[1] = 209;
    myAA[ myTuple[0] ] = myTuple[1];

    // NOTE: associative arrays do not preserve the order of the keys inserted into the array.
    // See: https://dlang.org/spec/hash-map.html

    // if we want the output of an AA to be sorted (by key)..
    string[] orderedKeyPairSet;

    foreach(ref key, ref value; myAA.byPair)
        orderedKeyPairSet ~= key.to!string ~ ":" ~ value.to!string;

    orderedKeyPairSet.sort;

    foreach(ref str; orderedKeyPairSet)
        writeln(str);

    /+
    A:231
    C:247
    G:240
    T:209
    +/
}
// --------
May 11 2022
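An alternative sketch for the sorting step, offered as one possible variation rather than a correction: sort the keys themselves and look each value up afterwards, instead of sorting the formatted strings. The helper name `sortedLines` is an assumption for illustration.

```d
import std.algorithm : sort;
import std.format : format;
import std.stdio : writeln;

// Hypothetical helper: returns "key:value" lines ordered by key.
string[] sortedLines(uint[dchar] aa)
{
    string[] lines;

    // aa.keys allocates a fresh array of the keys, which sort() can
    // reorder without touching the associative array itself.
    foreach (key; aa.keys.sort())
        lines ~= format("%s:%s", key, aa[key]);

    return lines;
}

void main()
{
    uint[dchar] counts;
    counts['C'] = 247;
    counts['G'] = 240;
    counts['A'] = 231;
    counts['T'] = 209;

    foreach (line; sortedLines(counts))
        writeln(line);
}
```

Sorting the keys compares the key values directly, so the order does not depend on how keys and values happen to look once converted to text.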