digitalmars.D - [theory] What is a type?

• Justin Johansson (48/48) Oct 10 2010 Specifically I have a problem in trying to implement
• Manfred_Nowak (3/4) Oct 10 2010 These are three dimensions already.
• Jonathan Marler (53/101) Jan 15 2018 I'm getting "nostalgia" from all the math terminology :) Here
• Mark (31/79) Jan 15 2018 Well, in D-speak you could imagine taking (perhaps arbitrarily)
• Ali (5/13) Jan 15 2018 If you want to learns the ins and outs of types, this books comes
Justin Johansson <no spam.com> writes:
```Specifically I have a problem in trying to implement
a functional language translator in D.  My target language
has a rather non-conventional type system, in which,
superficially at least, types can be described as being
Cartesian in nature. That is,
types in this system have two orthogonal dimensions:
(1) classical data-type (e.g. boolean, number, string,
object) and
(2) Kleene cardinality (occurrences) with respect to (1).

The axial origin of this Cartesian type-system correlates well
with the concept of the "top" type (AKA "Unit" in Scala)
and as the rather adhoc "void" type in many Algol-derived
languages such as C/C++/Java/D.

So along axis 1 we might broadly describe the classical
data types as item, boolean, number, string, object where
item is effectively either a superclass or variant of the
other mentioned types.

Along axis 2 we describe Kleene occurrences of 1 as may be
passed contractually to a receiving function.  These
occurrences may be enumerated six-fold as:
zeroOrMore
zeroOrOne
oneOrMore
exactlyZero
exactlyOne
none

Readers may see that, for example, zeroOrOne is a special
case (perhaps a subclass?) of zeroOrMore.  exactlyOne is
a special case of both zeroOrOne and oneOrMore (sounds like
multiple inheritance?).  OTOH, exactlyZero is a special
case of zeroOrOne, which, in turn, is a special case of
zeroOrmore.

none is a special case of all of the above and reflects
the cardinality facet of the return type of a function
that never returns (say as by throwing an exception).

To make this type system even more enigmatic lets add a
third dimension, taking the 2-D Cartesian type system
model to a 3-D Spatial model, by imagining a further
degree of freedom with respect to laziness of evaluation
(AKA closure of arguments).

Now having hopefully described that a type is something
that might well have multiple orthogonal aspects to its
identity, how would one go about implementing a dynamic
language with such a complex type system in D?

I realize that this is a complex topic and that it might
require better articulation than so far I have given.

Nevertheless, thanks for all replies,
Justin Johansson
```
Oct 10 2010
"Manfred_Nowak" <svv1999 hotmail.com> writes:
```Justin Johansson wrote:
These occurrences may be enumerated six-fold as:

-manfred
```
Oct 10 2010
Justin Johansson <no spam.com> writes:
```On 11/10/2010 4:10 AM, Manfred_Nowak wrote:
Justin Johansson wrote:
These occurrences may be enumerated six-fold as:

-manfred

Sorry, perhaps a miscommunication, the six-fold occurrence types
are enumerated along one axis (the cardinality axis).

Why do you mention there are three dimensions already wherein
my model there are just three dimensions anyway?

- JJ
```
Oct 12 2010
"Manfred_Nowak" <svv1999 hotmail.com> writes:
```Justin Johansson wrote:

the six-fold occurrence types

6= 8-2= 2^3-2

You modeled three axis into one enumeration.
The three boolean axis are `zero', `one', `more'.
Because `more' cannot exist without `one' two entries in your enumeration

-manfred
```
Oct 12 2010
Jonathan Marler <johnnymarler gmail.com> writes:
```On Sunday, 10 October 2010 at 12:28:32 UTC, Justin Johansson
wrote:
Specifically I have a problem in trying to implement
a functional language translator in D.  My target language
has a rather non-conventional type system, in which,
superficially at least, types can be described as being
Cartesian in nature. That is,
types in this system have two orthogonal dimensions:
(1) classical data-type (e.g. boolean, number, string,
object) and
(2) Kleene cardinality (occurrences) with respect to (1).

The axial origin of this Cartesian type-system correlates well
with the concept of the "top" type (AKA "Unit" in Scala)
and as the rather adhoc "void" type in many Algol-derived
languages such as C/C++/Java/D.

So along axis 1 we might broadly describe the classical
data types as item, boolean, number, string, object where
item is effectively either a superclass or variant of the
other mentioned types.

Along axis 2 we describe Kleene occurrences of 1 as may be
passed contractually to a receiving function.  These
occurrences may be enumerated six-fold as:
zeroOrMore
zeroOrOne
oneOrMore
exactlyZero
exactlyOne
none

Readers may see that, for example, zeroOrOne is a special
case (perhaps a subclass?) of zeroOrMore.  exactlyOne is
a special case of both zeroOrOne and oneOrMore (sounds like
multiple inheritance?).  OTOH, exactlyZero is a special
case of zeroOrOne, which, in turn, is a special case of
zeroOrmore.

none is a special case of all of the above and reflects
the cardinality facet of the return type of a function
that never returns (say as by throwing an exception).

To make this type system even more enigmatic lets add a
third dimension, taking the 2-D Cartesian type system
model to a 3-D Spatial model, by imagining a further
degree of freedom with respect to laziness of evaluation
(AKA closure of arguments).

Now having hopefully described that a type is something
that might well have multiple orthogonal aspects to its
identity, how would one go about implementing a dynamic
language with such a complex type system in D?

I realize that this is a complex topic and that it might
require better articulation than so far I have given.

Nevertheless, thanks for all replies,
Justin Johansson

I'm getting "nostalgia" from all the math terminology :)  Here
are some of my thoughts:

You've taken one property, cardinality, and put that on one axis
and then declared all other properties in your system should be
orthogonal to cardinality.  But then you said that "string" was
on the other axis, which itself is of type zeroOrMore(char),

You also need to take into account that cardinality is
multi-dimensional. i.e. you could have zeroOrMore(char) or
zeroOrMore(zeroOrMore(char)).

That being said, I don't think this model really get much closer
to answering the original question, "what is a type?".

A while back I watched a cppcon talk from Dan Saks who defined
what a data type was in C++:

Note that the whole video also contains alot of good information.
It explores the social aspects being a programmer and is
definitely worth watching.

In his video he declares a data type as a "bundle of compile-time
properties for an object", namely

1. size/alignment
2. set of values
3. set of permitted operations

This model is a very good breakdown of what a type is in C++.
D's a bit more complex.  When you say "type" in D, there's alot
more things you could be talking about so you either need to
expand your definition of what a type is, or restrict the
entities you consider to be types.  But these 3 properties are a
good start :)

A bit off topic.  I find it frustrating that types always need to
have these 3 properties.  Most of the time a programmer doesn't
care about size/alignment, they just want the set of values and
operations.  D can allow a programmer to do this using templates,
but that "throws the baby out with the bath water".  You almost
always require subset of properties.  Because template don't
define any of the properties, you end up with cryptic template
error messages.  D's tried to make the problem better by using
conditional templates and C++ is trying to solve this with
"Concepts". I've explored these ideas for a long time and am
still currently developing my own methods for solving these
problems. My latest idea is that you can create different "type
interfaces".  At a minimum, I've defined a type to be a "pure
function" that takes any value and maps it to a boolean
indicating whether or not that value is a member of that type.
That's the "minimum interface", but you could also have a type
that declares other properties, such as, a set of "values", or
size/alignment.  However, it's important to allow a developer to
work with types that don't necessarily define all these
properties for every time.  Anyway, I'm mostly just sharing some
of my thoughts.  I have alot to say on this subject but I don't
want to go too far off into the weeds.
```
Jan 15 2018
=?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:
```On 01/15/2018 10:58 AM, Jonathan Marler wrote:
On Sunday, 10 October 2010 at 12:28:32 UTC, Justin Johansson wrote:

I'm getting "nostalgia" from all the math terminology :)

Just the date of the post you're responding to is sufficient for
"nostalgia". ;)

Ali
```
Jan 15 2018
Jonathan Marler <johnnymarler gmail.com> writes:
```On Monday, 15 January 2018 at 19:34:23 UTC, Ali Ã‡ehreli wrote:
On 01/15/2018 10:58 AM, Jonathan Marler wrote:
On Sunday, 10 October 2010 at 12:28:32 UTC, Justin Johansson

wrote:

I'm getting "nostalgia" from all the math terminology :)

Just the date of the post you're responding to is sufficient
for "nostalgia". ;)

Ali

Oh wow!  I don't know why the form post was on the first page of
my form thread. You think he still checks his thread every day to
see if someone has responded?  :)
```
Jan 15 2018
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
```On Mon, Jan 15, 2018 at 06:58:37PM +0000, Jonathan Marler via Digitalmars-d
wrote:
[...]
That being said, I don't think this model really get much closer to
answering the original question, "what is a type?".

[...]

At the most abstract level, a type is just a set. The members of the set
represent what values that type can have. Operations on types are
therefore simply operations on sets, which lets us bring in the entire
mechanism of mathematical sets to elaborate upon the theory of types.

A boolean, for example, is simply a 2-membered set B = {true, false}. A
struct of two booleans is just the Cartesian product B x B. An array of
booleans is simply the Kleene closure of B, i.e., the union of all
Cartesian powers of B, usually denoted as B*.

This abstract view disregards how a type is actually implemented, and
merely considers its set of possible values.  Under this model, you
begin by describing the sets of possible values your program will work
with. Then when you're ready to implement it, you decide on some kind of
mapping to map the abstract sets to the physical types provided by the
hardware.

On a more practical level, the way I see a type is an *interpretation*
imposed upon a series of zero or more bytes (or, if you want to drill
down even deeper, bits).  The same binary values can represent different
types, depending on how you interpret them.  Under this model, you
proceed in the opposite direction: starting with an arbitrary sequence
of bytes, you recursively build up interpretations of it until you
arrive at the data structures that your program will operate on.

The abstract view is useful when you're working within your problem
domain: it allows you to phrase your algorithms and transformations
directly in terms of the objects and operations you wish to work on. But
it's not always so easy to map this abstract model onto the quirks of
the hardware you have to work with. It's easy to think of trivial
mappings, of course, but generally they will have pretty poor
performance because they don't take advantage of the properties of the
underlying hardware.  Finding an *efficient* mapping that doesn't also
break correctness can be quite challenging.

The practical view is useful when you're building things from ground up,
e.g., when you're building a compiler. It lets you build up types that
can take advantage of the underlying hardware's properties, and also
avoid or work around hardware quirks or weaknesses.  It also allows you
to reinterpret types creatively, that may allow you to achieve
interesting effects (e.g., the data bytes output by the compiler can be
reinterpreted as executable code by the CPU). However, types built up
this way may not map nicely into your problem domain, and that forces
your code to become convoluted in order to compensate for the impedance
mismatch between the two. Correctness may also be compromised if the
objects in your problem domain don't have a 1-to-1 fit with the hardware
types (e.g., your calculator program might operate on integers in the
mathematical sense of numbers that can be arbitrarily large, but the int
type provided by the hardware is modulo some power of 2; so when values
get too large, what your program thinks of as an integer no longer
matches what the hardware implements, resulting in overflow/underflow
bugs).

Finding the sweet spot that balances the two is non-trivial, and is the
subject of entire fields of research. :-D

T

--
The trouble with TCP jokes is that it's like hearing the same joke over and
over.
```
Jan 15 2018
Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:
```On Monday, 15 January 2018 at 19:25:16 UTC, H. S. Teoh wrote:
At the most abstract level, a type is just a set. The members
of the set represent what values that type can have.

Hm, yes, like representing an ordered pair (a,b) as {{a},{a,b}}.

But I think typing is more commonly viewed as a filter. So if you
represent all values as sets, then the type would be a filter
preventing certain combinations.

It is a matter of perspective, constructive or not constructive.
Kinda like synthesis, additive (combine sines) or subtractive
(filter noise).
```
Jan 15 2018
Mark <smarksc gmail.com> writes:
```On Sunday, 10 October 2010 at 12:28:32 UTC, Justin Johansson
wrote:
Specifically I have a problem in trying to implement
a functional language translator in D.  My target language
has a rather non-conventional type system, in which,
superficially at least, types can be described as being
Cartesian in nature. That is,
types in this system have two orthogonal dimensions:
(1) classical data-type (e.g. boolean, number, string,
object) and
(2) Kleene cardinality (occurrences) with respect to (1).

The axial origin of this Cartesian type-system correlates well
with the concept of the "top" type (AKA "Unit" in Scala)
and as the rather adhoc "void" type in many Algol-derived
languages such as C/C++/Java/D.

So along axis 1 we might broadly describe the classical
data types as item, boolean, number, string, object where
item is effectively either a superclass or variant of the
other mentioned types.

Along axis 2 we describe Kleene occurrences of 1 as may be
passed contractually to a receiving function.  These
occurrences may be enumerated six-fold as:
zeroOrMore
zeroOrOne
oneOrMore
exactlyZero
exactlyOne
none

Readers may see that, for example, zeroOrOne is a special
case (perhaps a subclass?) of zeroOrMore.  exactlyOne is
a special case of both zeroOrOne and oneOrMore (sounds like
multiple inheritance?).  OTOH, exactlyZero is a special
case of zeroOrOne, which, in turn, is a special case of
zeroOrmore.

none is a special case of all of the above and reflects
the cardinality facet of the return type of a function
that never returns (say as by throwing an exception).

To make this type system even more enigmatic lets add a
third dimension, taking the 2-D Cartesian type system
model to a 3-D Spatial model, by imagining a further
degree of freedom with respect to laziness of evaluation
(AKA closure of arguments).

Now having hopefully described that a type is something
that might well have multiple orthogonal aspects to its
identity, how would one go about implementing a dynamic
language with such a complex type system in D?

I realize that this is a complex topic and that it might
require better articulation than so far I have given.

Nevertheless, thanks for all replies,
Justin Johansson

Well, in D-speak you could imagine taking (perhaps arbitrarily)
the first axis (basic data type) as the "skeleton" and attach
various properties to it. For example:

Int!(exactlyOne, eager) x = 42;
Int!(zeroOrMore, lazy) y;
auto foo = x+y; // What does this mean? What type is foo?

You seem to have in mind a syntax for your type system but I'm
not sure how you intend for the semantics to work. D's templates
are really nice in that they allow you to express many properties
of a term that are not "hard coded" into its type. So you can do
stuff like:

enum MinimizationTarget { programThroughput, GCThroughput,
warmupTime }
auto GarbageCollector(MinimizationTarget[] priorities = [])()
{
static if (priorities == []) return
StopTheWorldMarkAndSweepGC();
else static if (priorities[0] == heapFragmentation) return
copyingGC();
else static if (/* some boolean algebra condition on
\$priorities */) return GenerationalMarkAndSweepGC();
else /* etc. */
}
programThroughput];
auto myGraph = Graph!(myGC, directed);

which is pretty cool. :)
```
Jan 15 2018
Ali <fakeemail example.com> writes:
```On Sunday, 10 October 2010 at 12:28:32 UTC, Justin Johansson
wrote:
Now having hopefully described that a type is something
that might well have multiple orthogonal aspects to its
identity, how would one go about implementing a dynamic
language with such a complex type system in D?

I realize that this is a complex topic and that it might
require better articulation than so far I have given.

Nevertheless, thanks for all replies,
Justin Johansson

If you want to learns the ins and outs of types, this books comes
highly recommended

https://www.cis.upenn.edu/~bcpierce/tapl/
```
Jan 15 2018
thedeemon <dlang thedeemon.com> writes:
```On Monday, 15 January 2018 at 20:46:19 UTC, Ali wrote:

If you want to learns the ins and outs of types, this books
comes highly recommended

https://www.cis.upenn.edu/~bcpierce/tapl/

+1, TAPL is a must read for anyone in CS, I believe.
Also recommended: "Type Theory & Functional Programming" by Simon
Thompson,
"Practical Foundations for Programming Languages" by Robert Harper
and his and his colleagues lectures here:
https://www.cs.uoregon.edu/research/summerschool/archives.html

Many programmers remain unaware that there is a discipline called
Type Theory, as part of math in general, it predates all
electronic computers and it's still very relevant today. Anyone
dabbling into compilers and programming language theory should
learn the basics of type theory, proof theory and some category
theory, these three are very much connected and talk about
basically the same constructions from different angles (see
Curry-Howard correspondence and "computational trinitarianism").
It's ridiculous how many programmers only learn about types from
books on C++ or MSDN, get very vague ideas about them and never
learn any actual PLT. Of course type is not a set of values, or
any other set, not at all.
```
Jan 15 2018
Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:
```On Tuesday, 16 January 2018 at 06:26:34 UTC, thedeemon wrote:
"Practical Foundations for Programming Languages" by Robert
Harper

Well, I have that one, and the title is rather misleading. Not at
all practical.

electronic computers and it's still very relevant today. Anyone
dabbling into compilers and programming language theory should
learn the basics of type theory, proof theory and some category
theory, these three are very much connected and talk about
basically the same constructions from different angles (see
Curry-Howard correspondence and "computational
trinitarianism"). It's ridiculous how many programmers only
learn about types from books on C++ or MSDN, get very vague
ideas about them and never learn any actual PLT. Of course type
is not a set of values, or any other set, not at all.

I don't really agree with this, empirically speaking. As many
people have designed good languages without such a focus, and
many have designed not very good languages with this focus...

Anyway, let's not make this too complicated, no need for a pile
of tomes:

https://en.wikipedia.org/wiki/Type_theory
```
Jan 16 2018