digitalmars.D.learn - uniq and array of enum members (That are all strings)
- bauss (18/18) Jan 16 2019 Is there a way to achieve the following:
- H. S. Teoh (15/33) Jan 16 2019 .uniq only works on adjacent identical elements. You should sort your
- bauss (6/39) Jan 16 2019 Sorting will not work in my case though because it's an enum of
- H. S. Teoh (17/37) Jan 16 2019 [...]
- bauss (6/35) Jan 16 2019 That's not necessarily true.
- H. S. Teoh (18/28) Jan 16 2019 Sorting is the simplest approach because then the problem is reduced to
- H. S. Teoh (19/19) Jan 16 2019 De-duplicating a range that's not necessarily sorted seems to be a
- bauss (7/8) Jan 16 2019 I'm aware of how to do it manually as I already stated I went
- H. S. Teoh (25/32) Jan 16 2019 Feel free to submit a PR to Phobos for this. The templatized function I
- Alex (22/66) Jan 16 2019 yeah... searching by hand is somewhat inefficient.
Is there a way to achieve the following: import std.stdio; import std.algorithm : uniq; import std.array : array; enum Foo : string { a = "aa", b = "bb", c = "cc" } void main() { auto a = [Foo.a, Foo.b, Foo.a, Foo.b, Foo.c]; auto b = a.uniq; writeln(b); // Expected output: [a, b, c] // Outputs: [a, b, a, b, c] }
Jan 16 2019
On Wed, Jan 16, 2019 at 03:57:49PM +0000, bauss via Digitalmars-d-learn wrote:Is there a way to achieve the following:[...]enum Foo : string { a = "aa", b = "bb", c = "cc" } void main() { auto a = [Foo.a, Foo.b, Foo.a, Foo.b, Foo.c]; auto b = a.uniq; writeln(b); // Expected output: [a, b, c] // Outputs: [a, b, a, b, c] }.uniq only works on adjacent identical elements. You should sort your array first. If you need to preserve the original order but eliminate duplicates, then you could use an AA to keep track of what has been seen. E.g.: bool[string] seen; auto b = a.filter!((e) { if (e in seen) return false; seen[e] = true; return true; }); T -- People walk. Computers run.
Jan 16 2019
On Wednesday, 16 January 2019 at 16:12:28 UTC, H. S. Teoh wrote:On Wed, Jan 16, 2019 at 03:57:49PM +0000, bauss via Digitalmars-d-learn wrote:Sorting will not work in my case though because it's an enum of strings that are not sorted alphabetically. Right now I'm doing it manually by a foreach in similar way you're using filter. I just feel like that's an overkill for something so trivial.Is there a way to achieve the following:[...]enum Foo : string { a = "aa", b = "bb", c = "cc" } void main() { auto a = [Foo.a, Foo.b, Foo.a, Foo.b, Foo.c]; auto b = a.uniq; writeln(b); // Expected output: [a, b, c] // Outputs: [a, b, a, b, c] }.uniq only works on adjacent identical elements. You should sort your array first. If you need to preserve the original order but eliminate duplicates, then you could use an AA to keep track of what has been seen. E.g.: bool[string] seen; auto b = a.filter!((e) { if (e in seen) return false; seen[e] = true; return true; }); T
Jan 16 2019
On Wed, Jan 16, 2019 at 04:21:12PM +0000, bauss via Digitalmars-d-learn wrote:On Wednesday, 16 January 2019 at 16:12:28 UTC, H. S. Teoh wrote:[...][...].uniq only works on adjacent identical elements. You should sort your array first. If you need to preserve the original order but eliminate duplicates, then you could use an AA to keep track of what has been seen. E.g.: bool[string] seen; auto b = a.filter!((e) { if (e in seen) return false; seen[e] = true; return true; });Sorting will not work in my case though because it's an enum of strings that are not sorted alphabetically. Right now I'm doing it manually by a foreach in similar way you're using filter. I just feel like that's an overkill for something so trivial.It's not trivial. In order for the computer to know whether or not the i'th element should be excluded, it needs to know what has come before it. In the case of .uniq, this is easy because the assumption is that identical elements are adjacent, so we only need to know what the previous element was. But in your case, we need to know elements that were seen an arbitrary distance in the past. So there must be some way of answering the question "has the i'th element been seen x iterations ago?". So either you search backward until you find the element (very inefficient: it will turn your algorithm into O(n^2)), or you need some kind of lookup structure like an AA to remember what has been seen before. T -- It is the quality rather than the quantity that matters. -- Lucius Annaeus Seneca
Jan 16 2019
On Wednesday, 16 January 2019 at 16:35:04 UTC, H. S. Teoh wrote:On Wed, Jan 16, 2019 at 04:21:12PM +0000, bauss via Digitalmars-d-learn wrote:That's not necessarily true. It just has to know whether the element matches an earlier element. Your filter example demonstrates exactly why sorting isn't necessary.On Wednesday, 16 January 2019 at 16:12:28 UTC, H. S. Teoh wrote:[...][...].uniq only works on adjacent identical elements. You should sort your array first. If you need to preserve the original order but eliminate duplicates, then you could use an AA to keep track of what has been seen. E.g.: bool[string] seen; auto b = a.filter!((e) { if (e in seen) return false; seen[e] = true; return true; });Sorting will not work in my case though because it's an enum of strings that are not sorted alphabetically. Right now I'm doing it manually by a foreach in similar way you're using filter. I just feel like that's an overkill for something so trivial.It's not trivial. In order for the computer to know whether or not the i'th element should be excluded, it needs to know what has come before it.
Jan 16 2019
On Wed, Jan 16, 2019 at 04:37:21PM +0000, bauss via Digitalmars-d-learn wrote:On Wednesday, 16 January 2019 at 16:35:04 UTC, H. S. Teoh wrote:[...]Sorting is the simplest approach because then the problem is reduced to using .uniq. It's not necessarily the most *efficient* approach. But regardless, the point is that in order to know whether some given element e matches an earlier element, the computer must somehow keep track of previously-seen elements. Either you do this implicitly by sorting it so that the relevant previously-seen element occurs immediately before the current one, or you have to store it explicitly somewhere, like in an AA, or you have to compute it on the fly. There is no way to get around this; you cannot filter an element based on information you don't have. So either you recompute that information (search backward until you find a match), or you store it in a structure like an AA where you can do the lookup easily. The information has to come from *somewhere*. T -- Nearly all men can stand adversity, but if you want to test a man's character, give him power. -- Abraham LincolnIt's not trivial. In order for the computer to know whether or not the i'th element should be excluded, it needs to know what has come before it.That's not necessarily true. It just has to know whether the element matches an earlier element. Your filter example demonstrates exactly why sorting isn't necessary.
Jan 16 2019
De-duplicating a range that's not necessarily sorted seems to be a pretty common task, so here's a generic function for whoever else might want to do this: import std.range.primitives; auto deduplicate(R)(R range) if (isInputRange!R) { import std.algorithm : filter; alias E = ElementType!R; bool[E] seen; return range.filter!((e) { if (e in seen) return false; seen[e] = true; return true; }); } T -- Why have vacation when you can work?? -- EC
Jan 16 2019
On Wednesday, 16 January 2019 at 18:20:57 UTC, H. S. Teoh wrote:TI'm aware of how to do it manually as I already stated I went with a similar approach. There should just be something standard for it and uniq should have an overload or something that allows for another behavior that doesn't rely on sorting. Even if it's "slower".
Jan 16 2019
On Wed, Jan 16, 2019 at 06:25:48PM +0000, bauss via Digitalmars-d-learn wrote: [...]I'm aware of how to do it manually as I already stated I went with a similar approach. There should just be something standard for it and uniq should have an overload or something that allows for another behavior that doesn't rely on sorting.Feel free to submit a PR to Phobos for this. The templatized function I posted is probably already pretty close to Phobos standards, you just need to add a few unittests and a nice ddoc header. I wouldn't call it `uniq`, though. It doesn't quite jive with the "examine nearby elements only" philosophy of .uniq (which inherits from the Unix `uniq` that does the same thing), and it's not nogc unlike .uniq, etc., and IME, trying to shoehorn something like this into an existing symbol with already preconceived semantics will make the PR review needlessly controversial. I'd just submit it under a different name.Even if it's "slower".Actually, filtering with an AA is not slower, since the AA amortizes the overall cost to O(n), which is even better than the O(n log n) of the pre-sorting approach. Of course, this comes at the cost of not being nogc, which might be a turnoff for some folks. It's probably possible to write a nogc version of this function, but it will be more complicated (much more complicated if you're working with a range of unknown length). This is one of the places where IMO the hassle of manually managing the memory of a lookup table far outweighs the "cost" (perceived or real) of just embracing the GC. T -- Elegant or ugly code as well as fine or rude sentences have something in common: they don't depend on the language. -- Luca De Vitis
Jan 16 2019
On Wednesday, 16 January 2019 at 16:21:12 UTC, bauss wrote:On Wednesday, 16 January 2019 at 16:12:28 UTC, H. S. Teoh wrote:yeah... searching by hand is somewhat inefficient. but this would work also with an enum, wouldn't it? ´´´ import std.stdio; import std.algorithm : uniq; import std.array : array; import std.algorithm.sorting : sort; enum Foo : string { a = "aa", b = "bb", c = "cc" } void main() { enum a = [Foo.a, Foo.b, Foo.a, Foo.b, Foo.c]; auto b = a.sort.uniq; writeln(b); } ´´´ And if you have something like immutable, dup would help, maybe?On Wed, Jan 16, 2019 at 03:57:49PM +0000, bauss via Digitalmars-d-learn wrote:Sorting will not work in my case though because it's an enum of strings that are not sorted alphabetically. Right now I'm doing it manually by a foreach in similar way you're using filter. I just feel like that's an overkill for something so trivial.Is there a way to achieve the following:[...]enum Foo : string { a = "aa", b = "bb", c = "cc" } void main() { auto a = [Foo.a, Foo.b, Foo.a, Foo.b, Foo.c]; auto b = a.uniq; writeln(b); // Expected output: [a, b, c] // Outputs: [a, b, a, b, c] }.uniq only works on adjacent identical elements. You should sort your array first. If you need to preserve the original order but eliminate duplicates, then you could use an AA to keep track of what has been seen. E.g.: bool[string] seen; auto b = a.filter!((e) { if (e in seen) return false; seen[e] = true; return true; }); T
Jan 16 2019
On Wednesday, 16 January 2019 at 16:40:34 UTC, Alex wrote:On Wednesday, 16 January 2019 at 16:21:12 UTC, bauss wrote:The problem with sorting is that the following: [3,5,6,6,2,1,2,5,3] will then become [1,2,3,5,6] or [6,5,3,2,1] and not: [3,5,6,2,1] which would be what you'd wanna use in some situations. The important thing to know here is that the numbers are not a sequence, they're a set of numbers. It's important the set doesn't have a change of order. The filter example works and that's what I already did. But something that is a bit better would be appreciated. Sorting really makes no sense to make something unique. It makes sense for the most "trivial" implementation, but not the most trivial usages.On Wednesday, 16 January 2019 at 16:12:28 UTC, H. S. Teoh wrote:yeah... searching by hand is somewhat inefficient. but this would work also with an enum, wouldn't it? ´´´ import std.stdio; import std.algorithm : uniq; import std.array : array; import std.algorithm.sorting : sort; enum Foo : string { a = "aa", b = "bb", c = "cc" } void main() { enum a = [Foo.a, Foo.b, Foo.a, Foo.b, Foo.c]; auto b = a.sort.uniq; writeln(b); } ´´´ And if you have something like immutable, dup would help, maybe?On Wed, Jan 16, 2019 at 03:57:49PM +0000, bauss via Digitalmars-d-learn wrote:Sorting will not work in my case though because it's an enum of strings that are not sorted alphabetically. Right now I'm doing it manually by a foreach in similar way you're using filter. I just feel like that's an overkill for something so trivial.Is there a way to achieve the following:[...]enum Foo : string { a = "aa", b = "bb", c = "cc" } void main() { auto a = [Foo.a, Foo.b, Foo.a, Foo.b, Foo.c]; auto b = a.uniq; writeln(b); // Expected output: [a, b, c] // Outputs: [a, b, a, b, c] }.uniq only works on adjacent identical elements. You should sort your array first. If you need to preserve the original order but eliminate duplicates, then you could use an AA to keep track of what has been seen. E.g.: bool[string] seen; auto b = a.filter!((e) { if (e in seen) return false; seen[e] = true; return true; }); T
Jan 16 2019
On Wednesday, 16 January 2019 at 16:52:50 UTC, bauss wrote:The problem with sorting is that the following: [3,5,6,6,2,1,2,5,3] will then become [1,2,3,5,6] or [6,5,3,2,1] and not: [3,5,6,2,1] which would be what you'd wanna use in some situations. The important thing to know here is that the numbers are not a sequence, they're a set of numbers. It's important the set doesn't have a change of order. The filter example works and that's what I already did. But something that is a bit better would be appreciated. Sorting really makes no sense to make something unique. It makes sense for the most "trivial" implementation, but not the most trivial usages.Ah... I see. You want something like a special fold by filtering... hmm ;)
Jan 16 2019
On Wednesday, 16 January 2019 at 17:28:14 UTC, Alex wrote:On Wednesday, 16 January 2019 at 16:52:50 UTC, bauss wrote:Yep. I just want to make sure there are no duplicates in the array while maintaining the order of the items.The problem with sorting is that the following: [3,5,6,6,2,1,2,5,3] will then become [1,2,3,5,6] or [6,5,3,2,1] and not: [3,5,6,2,1] which would be what you'd wanna use in some situations. The important thing to know here is that the numbers are not a sequence, they're a set of numbers. It's important the set doesn't have a change of order. The filter example works and that's what I already did. But something that is a bit better would be appreciated. Sorting really makes no sense to make something unique. It makes sense for the most "trivial" implementation, but not the most trivial usages.Ah... I see. You want something like a special fold by filtering... hmm ;)
Jan 16 2019