
digitalmars.D - Good examples of value types

reply "Luís Marques" <luis luismarques.eu> writes:
Hi,

For a comparison with the Java language, I'm trying to come up 
with some good examples of custom types that should be value 
types (but that must be ref types in Java). I think the most 
obvious ones are numeric types. So BigNum, MyNum, etc. are good 
examples because programmers are used to numeric types being 
value types, and having them suddenly become a ref type just 
because it's MyNum instead of long is really annoying. Still, 
could you come up with some type that would really benefit from 
being a value type but that isn't numeric (or otherwise similar)?

Thanks for your help!

Luís
May 05 2015
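To make the annoyance concrete, here is a minimal Java sketch. `MyNum` takes its name from the post, but the class body and the `AliasingDemo` helper are assumptions made up for illustration, not code from any library. Assigning a `long` copies the value, while assigning a `MyNum` variable only copies the reference, so a later mutation shows up through both names.

```java
// Hypothetical mutable numeric wrapper; only the name MyNum comes from the
// post, the implementation is an assumption made for illustration.
final class MyNum {
    private long value;

    MyNum(long value) { this.value = value; }

    // Mutates this object in place.
    void add(long delta) { value += delta; }

    long get() { return value; }
}

class AliasingDemo {
    // With a primitive, assignment copies the value.
    static long primitiveCopy() {
        long a = 1;
        long b = a;   // b is an independent copy
        b += 1;       // a is unaffected
        return a;
    }

    // With a class, assignment copies the reference, so both names
    // refer to the same object.
    static long referenceCopy() {
        MyNum a = new MyNum(1);
        MyNum b = a;  // b aliases a
        b.add(1);     // the change is visible through a as well
        return a.get();
    }

    public static void main(String[] args) {
        System.out.println(primitiveCopy());  // 1
        System.out.println(referenceCopy());  // 2
    }
}
```

An immutable `MyNum` would avoid the aliasing surprise, at the price of allocating a new object for every arithmetic result.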
next sibling parent reply Justin Whear <justin economicmodeling.com> writes:
On Tue, 05 May 2015 20:40:58 +0000, Luís Marques wrote:

 could you come up with some type that would really benefit from being a
 value type but that isn't numeric (or otherwise similar)?
Dates, times, durations, regular expressions, tokens, lazy generators, digests, vectors, really any type that provides an interpretation/behavior wrapper around one or more trivially copyable types. Frankly, the real divide in my mind is polymorphic/non-polymorphic, not by-reference vs by-value, though I will occasionally use `final class` if I want a convenience reference type around a resource.
May 05 2015
parent reply "Luís Marques" <luis luismarques.eu> writes:
On Tuesday, 5 May 2015 at 21:04:22 UTC, Justin Whear wrote:
 Dates, times, durations, regular expressions, tokens, lazy 
 generators,
 digests, vectors, really any type that provides a 
 interpretation/behavior
 wrapper around one or more trivially copyable types.
That's a very nice list, thank you, it *was* helpful. Still, I have some issues with it; maybe we can improve it.

* Dates, times, durations, vectors -> still somewhat numeric. I can add two weeks to a date, 3 seconds to a duration, etc. (For types which are not at all numeric, it would probably make no difference which values are assigned which bit patterns, while for more numeric types the bit patterns must have some regularity so that operations like addition can be implemented efficiently.)
* Regular expressions -> I have no idea what you have in mind for this one, even after looking at std.regex...
* Tokens -> On the one hand, I think this could be an excellent example, since it's a case where the bit pattern is arbitrary (because it generally has no numeric properties). On the other hand, I could see people arguing that just using an int is perfectly fine, so it doesn't benefit from a custom type. Do notice that, even in D, an enum converts without a cast to an int, so the fact that an enum might be used to list the possible abstract values (the tokens) doesn't quite make it a completely independent type, IMHO.
* lazy generators -> explain, please?
* digests -> do you mean the digest output, or the digest function state?
 Frankly, the real divide in my mind is 
 polymorphic/non-polymorphic, not
 by-reference vs by-value, though I will occasionally use `final 
 class` if
 I want a convenience reference type around a resource.
Aren't you arguing against yourself? In those cases you wanted your type to have reference semantics, even though it wasn't a polymorphic type. Doesn't that counterexample prove that polymorphic/non-polymorphic is just an heuristic, while the value/ref distinction is more fundamental?
May 06 2015
parent Justin Whear <justin economicmodeling.com> writes:
On Wed, 06 May 2015 16:54:46 +0000, Luís Marques wrote:

 * Regular expressions -> I have no idea what you have in mind for this
 one; even after looking at std.regex...
I meant both patterns and matches/captures.
 * Tokens -> On the one hand, I think this could be an excellent example,
 since it's a case where the bit pattern is arbitrary (because it
 generally has no numeric properties). On the other hand, I could see
 people arguing that just using an int is perfectly fine, so it doesn't
 benefit from a custom type. Do notice that, even in D, an enum converts
 without a cast to an int, so the fact that an enum might be used to list
 the possible abstract values (the tokens) doesn't quite make it a
 completely independent type, IMHO.
By tokens I mean the output of a lexer, usually looking like: struct Token { Type type; uint line; uint col; string captured; }
 * lazy generators -> explain, please?
Phobos is full of these. A simple example is the result of std.range.iota. The motivation is that a generator is actually an algorithm which happens to be wrapped up as a value.
 * digests -> do you mean the digest output, or the digest function
 state?
Certainly the output, haven't given much thought to the intermediate state.
 Frankly, the real divide in my mind is polymorphic/non-polymorphic, not
 by-reference vs by-value, though I will occasionally use `final class`
 if I want a convenience reference type around a resource.
Aren't you arguing against yourself? In those cases you wanted your type to have reference semantics, even though it wasn't a polymorphic type. Doesn't that counterexample prove that polymorphic/non-polymorphic is just an heuristic, while the value/ref distinction is more fundamental?
Ideally they're orthogonal, though you need reference types to do true dynamic polymorphism. Using `final class` is a pragmatic move given that D conflates reference types with polymorphic types, i.e. if there were a `ref struct` I'd use it.
May 06 2015
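For the Java side of the comparison, a lexer token like the struct described above has to be a class. The following is only a rough sketch: the `TokenType` enum and its members are hypothetical, the field names mirror the D struct, and `int` stands in for D's `uint`. Value-like equality has to be written by hand, and every token is still a separate heap object reached through a reference.

```java
import java.util.Objects;

// Hypothetical token kinds, for illustration only.
enum TokenType { IDENTIFIER, NUMBER, OPERATOR }

// Immutable final class that imitates a value type: no subclasses,
// no setters, equality defined by field contents rather than identity.
final class Token {
    final TokenType type;
    final int line;
    final int col;
    final String captured;

    Token(TokenType type, int line, int col, String captured) {
        this.type = type;
        this.line = line;
        this.col = col;
        this.captured = captured;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Token)) return false;
        Token t = (Token) o;
        return type == t.type && line == t.line && col == t.col
                && Objects.equals(captured, t.captured);
    }

    @Override
    public int hashCode() {
        return Objects.hash(type, line, col, captured);
    }
}

class TokenDemo {
    public static void main(String[] args) {
        Token a = new Token(TokenType.NUMBER, 3, 7, "42");
        Token b = new Token(TokenType.NUMBER, 3, 7, "42");
        System.out.println(a == b);       // false: two distinct heap objects
        System.out.println(a.equals(b));  // true: same field contents
    }
}
```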
prev sibling next sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 5 May 2015 at 20:40:59 UTC, Luís Marques wrote:
 Hi,

 For a comparison with the Java language, I'm trying to come up 
 with some good examples of custom types that should be value 
 types (but that must be ref types in Java). I think the most 
 obvious ones are numeric types. So BigNum, MyNum, etc. are good 
 examples because programmers are used to numeric types being 
 value types, and having them suddenly become a ref type just 
 because it's MyNum instead of long is really annoying. Still, 
 could you come up with some type that would really benefit from 
 being a value type but that isn't numeric (or otherwise 
 similar)?

 Thanks for your help!

 Luís
Let me tell you an actual war story of mine.

We have this program that is computationally intensive, written in Java. Somewhere in the core of the program, we have an LRU cache, with some entries sticking in there and most entries getting evicted soon enough (typical Pareto kind of thing).

Problem is, all these entries need to be value types (we are in Java) and, by the time things get evicted from the LRU cache, they have moved to the old generation.

The whole damn thing generates a ton of garbage.

The obvious solution is to use value types in the cache, but that's not possible. I won't go into the details, but that was a really hard problem to solve, one that kept us busy for longer than it should have because of language limitations.

Long story short: value types are useful.
May 05 2015
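The post does not show the cache itself, so the following is only a sketch of the general shape of an LRU cache in Java, not the author's design. It relies on `java.util.LinkedHashMap` in access-order mode with `removeEldestEntry`; the `LruCache`, `CachedRow` and `LruDemo` names are hypothetical. The point to notice is that every cached record (and every boxed `Long` key outside the small-value cache) is its own heap object, which is what the garbage collector ends up tracking.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal LRU cache built on LinkedHashMap's access-order mode.
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true);  // true = order entries by most recent access
        this.capacity = capacity;
    }

    // Called by LinkedHashMap after each insertion; returning true
    // drops the least recently used entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}

// Hypothetical cached record: in Java this is necessarily a heap object.
final class CachedRow {
    final long id;
    final long payload;

    CachedRow(long id, long payload) {
        this.id = id;
        this.payload = payload;
    }
}

class LruDemo {
    public static void main(String[] args) {
        LruCache<Long, CachedRow> cache = new LruCache<>(2);
        cache.put(1L, new CachedRow(1, 10));
        cache.put(2L, new CachedRow(2, 20));
        cache.get(1L);                        // touch key 1, key 2 becomes eldest
        cache.put(3L, new CachedRow(3, 30));  // evicts key 2
        System.out.println(cache.keySet());   // [1, 3]
    }
}
```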
next sibling parent reply "Dominikus Dittes Scherkl" writes:
On Wednesday, 6 May 2015 at 02:07:40 UTC, deadalnix wrote:
 Problem is, all these entries needs to be value types (we are 
 in java)
Maybe you meant reference types here?
May 06 2015
parent "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 6 May 2015 at 08:50:19 UTC, Dominikus Dittes 
Scherkl wrote:
 On Wednesday, 6 May 2015 at 02:07:40 UTC, deadalnix wrote:
 Problem is, all these entries needs to be value types (we are 
 in java)
Maybe you meant reference types here?
Yes, sorry for the confusion.
May 06 2015
prev sibling next sibling parent reply Russel Winder via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Wed, 2015-05-06 at 02:07 +0000, deadalnix via Digitalmars-d wrote:
[…]
 Let me tell you an actual war story of mine.
I think you need to date the story, and say which GC was being used.
 We have this program that is computationally intensive written in
 java. Somewhere in the core of the program, we have a LRU cache,
 with some entries sticking in there, and most entry getting
 evicted soon enough (typical pareto kind of thing).

 Problem is, all these entries needs to be value types (we are in
 java) and, by the time things gets evicted from the LRU cache,
 they have moved to the old generation.

 The whole damn thing generate a ton of garbage.
I suspect this was not using G1 as tons of garbage isn't as much of a problem compared to CMS, etc.
 The obvious solution is to use value types in the cache, but that
 not possible. I won't go in the details, but that was a really
 hard problem to solve, that kept us busy for for longer then it
 should have because of language limitations.

 Long story short: value types are useful.
This is true anyway no matter which language or GC is being used.

-- 
Russel.
Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder ekiga.net
41 Buckmaster Road    m: +44 7770 465 077   xmpp: russel winder.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder
May 06 2015
parent "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 6 May 2015 at 09:20:50 UTC, Russel Winder wrote:
 On Wed, 2015-05-06 at 02:07 +0000, deadalnix via Digitalmars-d 
 wrote:
 
[…]
 Let me tell you an actual war story of mine.
I think you need to date the story, and say which GC was being used.
It is relatively recent (a little more than 2 years ago). Any generational GC would have thrashed the same way.
 We have this program that is computationally intensive written 
 in java. Somewhere in the core of the program, we have a LRU 
 cache, with some entries sticking in there, and most entry 
 getting evicted soon enough (typical pareto kind of thing).
 
 Problem is, all these entries needs to be value types (we are 
 in java) and, by the time things gets evicted from the LRU 
 cache, they have moved to the old generation.
 
 The whole damn thing generate a ton of garbage.
I suspect this was not using G1 as tons of garbage isn't as much of a problem compared to CMS, etc.
All of them would have thrashed the same way. Generational GCs bet on the fact that most objects die young. In this case that is true, but objects are kept alive long enough by the LRU that the GC promotes them, and they only get collected when a full collection kicks in.
May 06 2015
prev sibling next sibling parent reply "Luís Marques" <luis luismarques.eu> writes:
On Wednesday, 6 May 2015 at 02:07:40 UTC, deadalnix wrote:
 Let me tell you an actual war story of mine.

 We have this program that is computationally intensive written 
 in java. Somewhere in the core of the program, we have a LRU 
 cache, with some entries sticking in there, and most entry 
 getting evicted soon enough (typical pareto kind of thing).

 Problem is, all these entries needs to be value types (we are 
 in java) and, by the time things gets evicted from the LRU 
 cache, they have moved to the old generation.

 The whole damn thing generate a ton of garbage.

 The obvious solution is to use value types in the cache, but 
 that not possible. I won't go in the details, but that was a 
 really hard problem to solve, that kept us busy for for longer 
 then it should have because of language limitations.

 Long story short: value types are useful.
That sounds like a great story! What kind of "values" were you caching?
May 06 2015
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 6 May 2015 at 17:01:05 UTC, Luís Marques wrote:
 That sounds like a great story! What kind of "values" were 
 you caching?
Without going into the details, we had entries in a DB, counting in the millions in the working set. Some of them were hammered heavily, most of them were only needed a handful of times. Problem is, there was no way to know upfront which ones, so we put these entries in a local cache, so we could find the heavily used ones in the cache and fall back on the DB for the less commonly used ones.
May 06 2015
parent reply "Luís Marques" <luis luismarques.eu> writes:
On Wednesday, 6 May 2015 at 18:05:57 UTC, deadalnix wrote:
 Without going into the details, we had entries in a DB, 
 counting in the millions in the working set. Some of them where 
 hammered heavily, most of them were only needed an handful of 
 time. Problem is, there was no way upfront to know which ones, 
 so we add there entries in a local cache, so we can find the 
 heavily used ones in the cache, and fallback on the DB for the 
 less commonly used ones.
OK, so here the advantages of value vs ref types had to do "only" with allocation issues, and not (also) any advantages in how the values/objects were used per se. So, one could argue that this problem could also be solved through richer memory allocation primitives, without having to have value types? BTW, given the GC behaviour you described, weak references wouldn't help, would they?
May 06 2015
parent "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 6 May 2015 at 18:28:28 UTC, Luís Marques wrote:
 On Wednesday, 6 May 2015 at 18:05:57 UTC, deadalnix wrote:
 Without going into the details, we had entries in a DB, 
 counting in the millions in the working set. Some of them 
 where hammered heavily, most of them were only needed an 
 handful of time. Problem is, there was no way upfront to know 
 which ones, so we add there entries in a local cache, so we 
 can find the heavily used ones in the cache, and fallback on 
 the DB for the less commonly used ones.
OK, so here the advantages of value vs ref types had to do "only" with allocation issues, and not (also) any advantages in how the values/objects were used per se. So, one could argue that this problem could also be solved through richer memory allocation primitives, without having to have value types? BTW, given the GC behaviour you described, weak references wouldn't help, would they?
It is not only about allocation issues, but that is certainly a benefit :)

Weak references would have caused most of the cache to be scrapped at each GC cycle, so that would defeat the purpose of the cache to begin with.

In the end, we used a quite complicated solution involving recycling objects in the cache.
May 06 2015
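The actual solution is not described beyond "recycling objects in the cache", so what follows is a deliberately simplified guess at the idea rather than a reconstruction of it. Evicted records go onto a free list and are refilled for the next insertion, so that after warm-up no new record objects are created. `Row`, `RecyclingCache` and the `allocations` counter are hypothetical names introduced for this sketch.

```java
import java.util.ArrayDeque;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical mutable record that can be refilled in place.
final class Row {
    long id;
    long payload;
}

// LRU cache that keeps evicted Row objects on a free list and reuses them.
final class RecyclingCache {
    private final ArrayDeque<Row> free = new ArrayDeque<>();
    private final LinkedHashMap<Long, Row> map;
    int allocations = 0;  // counts how many Row objects were ever created

    RecyclingCache(final int capacity) {
        map = new LinkedHashMap<Long, Row>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, Row> eldest) {
                if (size() > capacity) {
                    free.push(eldest.getValue());  // keep the object for reuse
                    return true;
                }
                return false;
            }
        };
    }

    // Returns the cached record, or null on a miss.
    Row get(long id) {
        return map.get(id);
    }

    // Stores a record, reusing a previously evicted Row when one is available.
    void put(long id, long payload) {
        Row row = map.get(id);
        if (row == null) {
            row = free.poll();
            if (row == null) {
                row = new Row();
                allocations++;
            }
            row.id = id;
            map.put(id, row);
        }
        row.payload = payload;
    }
}
```

With a capacity of 2, inserting 100 distinct ids creates only 3 `Row` objects in this sketch. Two caveats apply: the `Long` keys are still boxed, and a caller that keeps a `Row` after it has been evicted will see it silently refilled with another record, which is part of why this kind of workaround gets complicated.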
prev sibling parent "amiga" <vadim.goryunov gmail.com> writes:
On Wednesday, 6 May 2015 at 02:07:40 UTC, deadalnix wrote:
 On Tuesday, 5 May 2015 at 20:40:59 UTC, Luís Marques wrote:
 Hi,

 For a comparison with the Java language, I'm trying to come up 
 with some good examples of custom types that should be value 
 types (but that must be ref types in Java). I think the most 
 obvious ones are numeric types. So BigNum, MyNum, etc. are 
 good examples because programmers are used to numeric types 
 being value types, and having them suddenly become a ref type 
 just because it's MyNum instead of long is really annoying. 
 Still, could you come up with some type that would really 
 benefit from being a value type but that isn't numeric (or 
 otherwise similar)?

 Thanks for your help!

 Luís
Let me tell you an actual war story of mine. We have this program that is computationally intensive written in java. Somewhere in the core of the program, we have a LRU cache, with some entries sticking in there, and most entry getting evicted soon enough (typical pareto kind of thing). Problem is, all these entries needs to be value types (we are in java) and, by the time things gets evicted from the LRU cache, they have moved to the old generation. The whole damn thing generate a ton of garbage. The obvious solution is to use value types in the cache, but that not possible. I won't go in the details, but that was a really hard problem to solve, that kept us busy for for longer then it should have because of language limitations. Long story short: value types are useful.
People tend to use arrays of primitives in such cases. So you store the objects in arrays like

long[] longs = new long[MAX * NUM_LONGS_PER_RECORD];
int[] ints = new int[MAX * NUM_INTS_PER_RECORD];
byte[] bytes = new byte[MAX * NUM_BYTES_PER_RECORD];

and then for each record you have an offset. There are even open source collections that are completely GC free that use this principle.

This looks like C style programming in Java. But in the end you still use the JVM, and you use this for the critical part of the project only.

So value types are useful. The good thing about D is that you can allocate such a crazy amount of objects using malloc and avoid a GC scan for them (if my understanding of D is correct).
May 06 2015
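The offset scheme described above can be sketched roughly as follows. The record layout (two longs and one int per record), the field meanings and the `FlatRecords` name are assumptions chosen for illustration. The whole store consists of two array objects regardless of how many records it holds.

```java
// Fixed-capacity store that flattens records into primitive arrays.
// Hypothetical layout: 2 longs (id, timestamp) and 1 int (hits) per record.
final class FlatRecords {
    static final int LONGS_PER_RECORD = 2;
    static final int INTS_PER_RECORD = 1;

    private final long[] longs;
    private final int[] ints;

    FlatRecords(int maxRecords) {
        longs = new long[maxRecords * LONGS_PER_RECORD];
        ints = new int[maxRecords * INTS_PER_RECORD];
    }

    // Writes all fields of the record stored in slot `index`.
    void set(int index, long id, long timestamp, int hits) {
        int base = index * LONGS_PER_RECORD;
        longs[base] = id;
        longs[base + 1] = timestamp;
        ints[index * INTS_PER_RECORD] = hits;
    }

    // Field accessors compute the offset from the slot index.
    long id(int index) { return longs[index * LONGS_PER_RECORD]; }

    long timestamp(int index) { return longs[index * LONGS_PER_RECORD + 1]; }

    int hits(int index) { return ints[index * INTS_PER_RECORD]; }
}
```

A caller would write `records.set(5, 42L, 1000L, 3)` and later read `records.id(5)`. Overwriting a slot replaces a record in place without producing any garbage, at the cost of losing the object-oriented interface.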
prev sibling next sibling parent reply "Freddy" <Hexagonalstar64 gmail.com> writes:
On Tuesday, 5 May 2015 at 20:40:59 UTC, Luís Marques wrote:
 Hi,

 For a comparison with the Java language, I'm trying to come up 
 with some good examples of custom types that should be value 
 types (but that must be ref types in Java). I think the most 
 obvious ones are numeric types. So BigNum, MyNum, etc. are good 
 examples because programmers are used to numeric types being 
 value types, and having them suddenly become a ref type just 
 because it's MyNum instead of long is really annoying. Still, 
 could you come up with some type that would really benefit from 
 being a value type but that isn't numeric (or otherwise 
 similar)?

 Thanks for your help!

 Luís
Immutable types? It doesn't matter whether they are by value or by ref, but by value usually performs better (especially considering you can pass a value type by ref in D if you need to).
May 06 2015
parent "Ola Fosheim Grøstad" writes:
On Thursday, 7 May 2015 at 02:09:34 UTC, Freddy wrote:
 It doesn't matter whether they are by value or by ref but by 
 value usually has performance increases(especially considering 
 you can pass a value type by ref in D if you need to).
Yep, this is very much true, but on x86 the register size is increasing, so a good compiler could hold 512 bits in a register. If only compilers and backends weren't so far behind hardware developments...
May 07 2015
prev sibling next sibling parent reply "Dejan Lekic" <dejan.lekic gmail.com> writes:
On Tuesday, 5 May 2015 at 20:40:59 UTC, Luís Marques wrote:
 Hi,

 For a comparison with the Java language, I'm trying to come up 
 with some good examples of custom types that should be value 
 types (but that must be ref types in Java). I think the most 
 obvious ones are numeric types. So BigNum, MyNum, etc. are good 
 examples because programmers are used to numeric types being 
 value types, and having them suddenly become a ref type just 
 because it's MyNum instead of long is really annoying. Still, 
 could you come up with some type that would really benefit from 
 being a value type but that isn't numeric (or otherwise 
 similar)?

 Thanks for your help!

 Luís
To add to what others have said - whenever you think you will benefit from stack-allocation. Read this article: http://www.ibm.com/developerworks/library/j-jtp09275/ Java is good at escape analysis. But I find it really useful to be able to specify a type that will always be allocated on the stack (unless you really want it on the heap).
May 07 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/7/15 4:15 AM, Dejan Lekic wrote:
 On Tuesday, 5 May 2015 at 20:40:59 UTC, Luís Marques wrote:
 Hi,

 For a comparison with the Java language, I'm trying to come up with
 some good examples of custom types that should be value types (but
 that must be ref types in Java). I think the most obvious ones are
 numeric types. So BigNum, MyNum, etc. are good examples because
 programmers are used to numeric types being value types, and having
 them suddenly become a ref type just because it's MyNum instead of
 long is really annoying. Still, could you come up with some type that
 would really benefit from being a value type but that isn't numeric
 (or otherwise similar)?

 Thanks for your help!

 Luís
To add to what others have said - whenever you think you will benefit from stack-allocation. Read this article: http://www.ibm.com/developerworks/library/j-jtp09275/ Java is good at escape analysis. But I find it really useful to be able to specify a type that will always be allocated on the stack (unless you really want it on the heap).
(Not sure I'm not being confused about the topic.) A coworker mentioned they have big problems in Java whenever they need to return more than one thing from one function. Reference types don't compose nicely that way. In a language with value types, the juxtaposition of two items (value or reference) is a value. In Java, it's a reference (so you need to allocate a new object etc.)

Andrei
May 07 2015
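A small illustration of the composition problem, as a sketch only: `DivMod` and `MultiReturnDemo` are hypothetical names, and the example is not taken from the coworker's code. Returning a quotient and a remainder together means wrapping the two ints in an object, so the result of the call is a reference to a fresh allocation.

```java
// Hypothetical carrier type: Java has no built-in tuple, so returning two
// ints together means wrapping them in an object.
final class DivMod {
    final int quotient;
    final int remainder;

    DivMod(int quotient, int remainder) {
        this.quotient = quotient;
        this.remainder = remainder;
    }
}

class MultiReturnDemo {
    // Each call creates a new DivMod on the heap, unless the JIT's escape
    // analysis manages to remove the allocation.
    static DivMod divMod(int a, int b) {
        return new DivMod(a / b, a % b);
    }

    public static void main(String[] args) {
        DivMod r = divMod(17, 5);
        System.out.println(r.quotient + " " + r.remainder);  // 3 2
    }
}
```

Whether that allocation matters in practice depends on the VM and the workload, which is the point debated in the replies.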
parent reply Russel Winder via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Thu, 2015-05-07 at 07:36 -0700, Andrei Alexandrescu via Digitalmars-d
wrote:
[…]

 Reference types don't compose nicely that way. In a language with value
 types, the juxtaposition of two items (value or reference) is a value.
 In Java, it's a reference (so you need to allocate a new object etc.)
But that object is likely very short lived, and (especially with the G1 GC), handled very efficiently. But as ever, without benchmarks and actual data there are no facts, only opinions.

-- 
Russel.
May 07 2015
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/7/15 12:29 PM, Russel Winder via Digitalmars-d wrote:
 On Thu, 2015-05-07 at 07:36 -0700, Andrei Alexandrescu via Digitalmars-d
 wrote:
 […]
 Reference types don't compose nicely that way. In a language with value
 types, the juxtaposition of two items (value or reference) is a value.
 In Java, it's a reference (so you need to allocate a new object etc.)
But that object is likely very short lived, and (especially with the G1 GC), handled very efficiently. But as ever, without benchmarks and actual data there are no facts, only opinions.
Yah, my coworker had data (this is Facebook after all) and it was pretty damning. But that was on Dalvik which is AFAIK far behind state-of-the-art desktop VMs. -- Andrei
May 07 2015
prev sibling parent "Nikolay" <sibnick gmail.com> writes:
java.lang.String

It is a big problem in Java. You have a pointer to a String object with 
fields: hashCode and a chars array (UTF-16). But every array is an 
object itself, so it is a pointer to an object again. There is a pointer 
compression option in the x64 JVM (it uses 32 bits per pointer), but 
either way there are too many indirections and additional overhead for 
object headers.

FYI
It is possible that Java 9 will have value types (google: IBM 
Java PackedObjects)
May 07 2015