digitalmars.D - More on Rust language

reply bearophile <bearophileHUGS lycos.com> writes:
Through Reddit I've found two introductions to the system language Rust being
developed by Mozilla. This is one of them:

http://marijnhaverbeke.nl/rust_tutorial/

This is an alpha-state tutorial, so some parts are unfinished and some parts
will probably change, in the language too.

Unfortunately this first tutorial doesn't discuss typestates and syntax macros
(yet), two of the most significant features of Rust. The second tutorial
discusses typestates a bit, too.

Currently the Rust compiler is written in Rust and is based on the LLVM
back-end. This lets it eat its own dog food (there are few descriptions of
typestate usage in the compiler itself), and the back-end is efficient enough.
Compared to DMD, the Rust compiler is at an earlier stage of development: it
works and can compile itself, but I think it's not yet usable for
practical purposes.

On the GitHub page the Rust project has 547 "Watch" and 52 "Fork", while DMD
has 159 and 49 of them, even though Rust is a much younger compiler than
D/DMD. So it seems enough people are interested in Rust.

Most of the text below is quotations from the tutorials.

---------------------------

http://marijnhaverbeke.nl/rust_tutorial/control.html

Pattern matching

Rust's alt construct is a generalized, cleaned-up version of C's switch
construct. You provide it with a value and a number of arms, each labelled with
a pattern, and it will execute the arm that matches the value.

alt my_number {
  0       { std::io::println("zero"); }
  1 | 2   { std::io::println("one or two"); }
  3 to 10 { std::io::println("three to ten"); }
  _       { std::io::println("something else"); }
}

There is no 'falling through' between arms, as in C—only one arm is executed,
and it doesn't have to explicitly break out of the construct when it is
finished.

The part to the left of each arm is called the pattern. Literals are valid
patterns, and will match only their own value. The pipe operator (|) can be
used to assign multiple patterns to a single arm. Ranges of numeric literal
patterns can be expressed with to. The underscore (_) is a wildcard pattern
that matches everything.

If the arm with the wildcard pattern was left off in the above example, running
it on a number greater than ten (or negative) would cause a run-time failure.
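
For comparison, here is a minimal sketch of the same construct in present-day Rust. This is an assumption layered on top of the tutorial, not part of it: the keyword is now `match`, arms use `=>`, and inclusive ranges are written `..=`. The function name `describe` is made up for illustration. In current Rust, leaving out the wildcard arm is rejected at compile time rather than failing at run time.

```rust
/// Classify a number the way the tutorial's `alt` example does.
/// (`describe` is a hypothetical name, not from the tutorial.)
fn describe(n: i32) -> &'static str {
    match n {
        0 => "zero",
        1 | 2 => "one or two",      // several patterns on one arm
        3..=10 => "three to ten",   // inclusive range pattern
        _ => "something else",      // wildcard: matches everything else
    }
}

fn main() {
    for n in [0, 2, 7, 42, -5] {
        println!("{n}: {}", describe(n));
    }
}
```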
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
I have found a slides pack, Rust All Hands Winter 2011, with some notes on
typestates too:
http://www.slideshare.net/pcwalton/rust-all-hands-winter-2011

And here there are some tests about macros too, search the word "macro":
https://github.com/graydon/rust/tree/master/src/test/run-pass

Bye,
bearophile
Nov 03 2011
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/3/2011 8:14 PM, bearophile wrote:
 Mark-compact (aka moving) collectors, where live objects are moved together
 to make allocated memory more compact. Note that doing this involves
 updating pointers’ values on the fly. This category includes semispace
 collectors as well as the more efficient modern ones like the .NET CLR’s
 that don’t use up half your memory or address space. C++ cannot support
 this without at least a new pointer type, because C/C++ pointer values are
 required to be stable (not change their values), so that you can cast them
 to an int and back, or write them to a file and back; this is why we
 created the ^ pointer type for [...] compacting GC heaps. See section 3.3
 of my paper
 (http://www.gotw.ca/publications/C++CLIRationale.pdf ) A Design Rationale
 for C++/CLI for more rationale about ^ and gcnew.<
Tell me if I am wrong still.
You're wrong still :-)
 How do you implement a moving GC in D if D has
 raw pointers?
It can be done if the D compiler emits full runtime type info. It's a solved problem with GCs.
 D semantics doesn't allow the GC to automatically modify those
 pointers when the GC moves the data.
Yes, it does. I've implemented a moving collector before designing D, and I carefully defined the semantics so that it could be done for D. Besides, having two pointer types in D would be disastrously complex. C++/CLI does, and C++/CLI is a failure in the marketplace. (I've dealt with multiple pointer types from the DOS daze, and believe me it is a BAD BAD BAD idea.)
Nov 03 2011
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Walter Bright:

 You're wrong still :-)
In this newsgroup I am used to being wrong several times every day :-)
 It can be done if the D compiler emits full runtime type info. It's a solved 
 problem with GCs.
I see, I will have to read more on this solution.
 Besides, having two pointer types in D would be disastrously complex.
Rust has three pointer types! :-) In Ada too I think there are three types of pointers.
 (I've dealt with multiple 
 pointer types from the DOS daze, and believe me it is a BAD BAD BAD idea.)
I am not sure, but I think the situation is very different here. Here it's only the type system that tells those pointers apart, and restricts the kinds of operations you are allowed to do with them or changes what they do. In Rust it's not the kind of memory they point to that determines what they are (as I presume was the case in DOS); you are allowed to use any of the three kinds of pointers, as you like, for each kind of data. The difference is all in their semantics. I think this is very different from the DOS pointer situation.

From the examples of Rust code I've read, I have not seen any disaster regarding the design of its pointers. They have implemented a non-trivial compiler in the language, so I think the pointer situation is not awful.

Regarding pointer types, in D there are function pointers and function delegates; they are kind of two different kinds of pointers already. They increase the complexity of the language and of its usage, and require some conversion code, but they are not a disaster to use.

Thank you for your answers,
bye,
bearophile
Nov 03 2011
parent Walter Bright <newshound2 digitalmars.com> writes:
On 11/3/2011 9:14 PM, bearophile wrote:
 Regarding pointer types, in D there are function pointers and function
 delegates, they are kind of two different kinds of pointers already.
And their only saving grace is they are not used that often, so the complexity is tolerable. This is not so for pointers.
Nov 03 2011
prev sibling parent reply Caligo via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Thu, Nov 3, 2011 at 10:43 PM, Walter Bright
<newshound2 digitalmars.com> wrote:

  How do you implement a moving GC in D if D has
 raw pointers?
It can be done if the D compiler emits full runtime type info. It's a solved problem with GCs.
 D semantics doesn't allow the GC to automatically modify those
 pointers when the GC moves the data.
Yes, it does. I've implemented a moving collector before designing D, and I carefully defined the semantics so that it could be done for D. Besides, having two pointer types in D would be disastrously complex. C++/CLI does, and C++/CLI is a failure in the marketplace. (I've dealt with multiple pointer types from the DOS daze, and believe me it is a BAD BAD BAD idea.)
Given the recent discussion on radical changes to GC and dtors, could someone please explain why having multiple pointer types is a bad idea?
May 08 2014
parent reply "Paulo Pinto" <pjmlp progtools.org> writes:
On Friday, 9 May 2014 at 04:55:28 UTC, Caligo via Digitalmars-d 
wrote:
Given the recent discussion on radical changes to GC and dtors, could someone please explain why having multiple pointer types is a bad idea?
It increases the complexity to reason about code. If the compiler does not give a helping hand, bugs are too easy to create.
-- Paulo
May 08 2014
parent reply "Araq" <rumpf_a web.de> writes:
 It increases the complexity to reason about code.
No, that's wrong.
 If the compiler does not give an helping hand, bugs are too 
 easy to create.
Usually a type system is used to increase safety...
May 09 2014
parent reply Paulo Pinto <pjmlp progtools.org> writes:
On 09.05.2014 21:53, Araq wrote:
 It increases the complexity to reason about code.
No, that's wrong.
Why is it wrong? Have you ever seen a programmer reason about unique pointers, shared pointers, weak pointers, naked pointers, references and cyclic data structures without mistakes? In any language that provides them?
 If the compiler does not give an helping hand, bugs are too easy to
 create.
Usually a type system is used to increase safety...
That is why Rust provides a type system that knows about pointer types, lifetimes and usage dataflow. Because in languages that don't go that far, the desired outcome is not always the best.
-- Paulo
May 09 2014
parent "Araq" <rumpf_a web.de> writes:
 It increases the complexity to reason about code.
No, that's wrong.
Why it is wrong?
Because it is much harder to reason about the same things without type system support.
May 10 2014
prev sibling next sibling parent "marwy" <mariusz.wyrozumski gmail.com> writes:
On Friday, 4 November 2011 at 03:14:29 UTC, bearophile wrote:
 Through Reddit I've found two introductions to the system 
 language Rust being developed by Mozilla. This is one of them:

 http://marijnhaverbeke.nl/rust_tutorial/

 This is an alpha-state tutorial, so some parts are unfinished 
 and some parts will probably change, in the language too.

 Unfortunately this first tutorial doesn't discuss typestates 
 and syntax macros (yet), two of the most significant features 
 of Rust. The second tutorial discussed a bit typestates too.

 Currently the Rust compiler is written in Rust and it's based 
 on the LLVM back-end. This allows it to eat its own dog food 
 (there are few descriptions of typestate usage in the compiler 
 itself) and the backend is efficient enough. Compared to DMD 
 the Rust compiler is in a earlier stage of development, it 
 works and it's able to compile itself but I think it's not 
 usable yet for practical purposes.

 On the GitHub page the Rust project has 547 "Watch" and 52 
 "Fork", while DMD has 159 and 49 of them, despite Rust is a 
 quite younger compiler/software compared to D/DMD. So it seems 
 enough people are interested in Rust.

 Most of the text below is quotations from the tutorials.

 ---------------------------

 http://marijnhaverbeke.nl/rust_tutorial/control.html

 Pattern matching

 Rust's alt construct is a generalized, cleaned-up version of 
 C's switch construct. You provide it with a value and a number 
 of arms, each labelled with a pattern, and it will execute the 
 arm that matches the value.

 alt my_number {
   0       { std::io::println("zero"); }
   1 | 2   { std::io::println("one or two"); }
   3 to 10 { std::io::println("three to ten"); }
   _       { std::io::println("something else"); }
 }

 There is no 'falling through' between arms, as in C—only one 
 arm is executed, and it doesn't have to explicitly break out of 
 the construct when it is finished.

 The part to the left of each arm is called the pattern. 
 Literals are valid patterns, and will match only their own 
 value. The pipe operator (|) can be used to assign multiple 
 patterns to a single arm. Ranges of numeric literal patterns 
 can be expressed with to. The underscore (_) is a wildcard 
 pattern that matches everything.

 If the arm with the wildcard pattern was left off in the above 
 example, running it on a number greater than ten (or negative) 
 would cause a run-time failure. When no arm matches, alt 
 constructs do not silently fall through—they blow up instead.

 A powerful application of pattern matching is destructuring, 
 where you use the matching to get at the contents of data 
 types. Remember that (float, float) is a tuple of two floats:

 fn angle(vec: (float, float)) -> float {
     alt vec {
       (0f, y) when y < 0f { 1.5 * std::math::pi }
       (0f, y) { 0.5 * std::math::pi }
       (x, y) { std::math::atan(y / x) }
     }
 }

 A variable name in a pattern matches everything, and binds that 
 name to the value of the matched thing inside of the arm block. 
 Thus, (0f, y) matches any tuple whose first element is zero, 
 and binds y to the second element. (x, y) matches any tuple, 
 and binds both elements to a variable.

 Any alt arm can have a guard clause (written when EXPR), which 
 is an expression of type bool that determines, after the 
 pattern is found to match, whether the arm is taken or not. The 
 variables bound by the pattern are available in this guard 
 expression.
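
As a hedged sketch of how the destructuring-plus-guard example above might look in present-day Rust (an assumption, since the quoted tutorial predates this syntax): guards are now written `if EXPR` rather than `when EXPR`, and `float` is spelled `f64`. The comparison with zero is moved into the guards here so the sketch does not depend on floating-point literal patterns.

```rust
use std::f64::consts::PI;

/// Angle of a 2-D vector, following the structure of the tutorial's `angle`.
fn angle(vec: (f64, f64)) -> f64 {
    match vec {
        // Tuple pattern binds both elements; the guard decides whether the arm is taken.
        (x, y) if x == 0.0 && y < 0.0 => 1.5 * PI,
        // `_` ignores the second element; the guard checks the first.
        (x, _) if x == 0.0 => 0.5 * PI,
        // Falls through to here for every other tuple.
        (x, y) => (y / x).atan(),
    }
}

fn main() {
    println!("{}", angle((0.0, -1.0)));
    println!("{}", angle((0.0, 2.0)));
    println!("{}", angle((1.0, 1.0)));
}
```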


 Record patterns

 Records can be destructured in alt patterns. The basic 
 syntax is {fieldname: pattern, ...}, but the pattern for a 
 field can be omitted as a shorthand for simply binding the 
 variable with the same name as the field.

 alt mypoint {
     {x: 0f, y: y_name} { /* Provide sub-patterns for fields */ }
     {x, y}             { /* Simply bind the fields */ }
 }

 The field names of a record do not have to appear in a pattern 
 in the same order they appear in the type. When you are not 
 interested in all the fields of a record, a record pattern may 
 end with , _ (as in {field1, _}) to indicate that you're 
 ignoring all other fields.


 Tags

 Tags [FIXME terminology] are datatypes that have several 
 different representations. For example, the type shown earlier:

 tag shape {
     circle(point, float);
     rectangle(point, point);
 }

 A value of this type is either a circle, in which case it 
 contains a point record and a float, or a rectangle, in which 
 case it contains two point records. The run-time representation 
 of such a value includes an identifier of the actual form that 
 it holds, much like the 'tagged union' pattern in C, but with 
 better ergonomics.


 Tag patterns

 For tag types with multiple variants, destructuring is the only 
 way to get at their contents. All variant constructors can be 
 used as patterns, as in this definition of area:

 fn area(sh: shape) -> float {
     alt sh {
         circle(_, size) { std::math::pi * size * size }
         rectangle({x, y}, {x: x2, y: y2}) { (x2 - x) * (y2 - y) }
     }
 }
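
For readers coming to this later, a rough present-day equivalent of the `shape` tag and `area` function, offered as a sketch under assumptions: tags are now called enums, records have become named structs, and variant names are conventionally capitalised. The `Point` struct and its field types are guessed from the quoted text, which does not show the record definition.

```rust
/// Assumed shape of the tutorial's `point` record (not shown in the quote).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: f64,
    y: f64,
}

/// Counterpart of `tag shape`: a value is exactly one of the variants.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Shape {
    Circle(Point, f64),
    Rectangle(Point, Point),
}

fn area(sh: Shape) -> f64 {
    match sh {
        // Variant pattern: ignore the centre, bind the radius.
        Shape::Circle(_, size) => std::f64::consts::PI * size * size,
        // Nested struct patterns: shorthand binding and renaming of fields.
        Shape::Rectangle(Point { x, y }, Point { x: x2, y: y2 }) => (x2 - x) * (y2 - y),
    }
}

fn main() {
    let r = Shape::Rectangle(Point { x: 0.0, y: 0.0 }, Point { x: 2.0, y: 3.0 });
    println!("{}", area(r));
}
```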

 ------------------------------

 // The type of this vector will be inferred based on its use.
 let x = [];

 // Explicitly say this is a vector of integers.
 let y: [int] = [];

 ---------------------------

 Tuples

 Tuples in Rust behave exactly like records, except that their 
 fields do not have names (and can thus not be accessed with dot 
 notation). Tuples can have any arity except for 0 or 1 (though 
 you may see nil, (), as the empty tuple if you like).

 let mytup: (int, int, float) = (10, 20, 30.0);
 alt mytup {
   (a, b, c) { log a + b + (c as int); }
 }

 ---------------------------

 Pointers

 Rust supports several types of pointers. The simplest is the 
 unsafe pointer, written *TYPE, which is a completely unchecked 
 pointer type only used in unsafe code (and thus, in typical 
 Rust code, very rarely). The safe pointer types are @TYPE for 
 shared, reference-counted boxes, and ~TYPE, for uniquely-owned 
 pointers.

 All pointer types can be dereferenced with the * unary operator.
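
A small sketch of the nearest present-day counterparts, with the mapping itself being an assumption rather than something the tutorial states: `~TYPE` corresponds roughly to `Box<T>`, the `@TYPE` sigil no longer exists and the library type `Rc<T>` is the closest reference-counted analogue, and `*TYPE` is written `*const T` or `*mut T`. The function name `pointer_demo` is made up for illustration.

```rust
use std::rc::Rc;

/// Returns (sum of the pointees, value read through the raw pointer,
/// number of owners of the shared box).
fn pointer_demo() -> (i32, i32, usize) {
    let unique: Box<i32> = Box::new(1);      // uniquely owned heap value
    let shared: Rc<i32> = Rc::new(2);        // reference-counted box
    let alias = Rc::clone(&shared);          // second owner of the same box
    let raw: *const i32 = &*unique;          // unchecked pointer to the boxed value

    // The safe pointer kinds dereference with the unary `*` operator.
    let sum = *unique + *shared + *alias;

    // Dereferencing the raw pointer needs `unsafe`; `unique` is still alive here.
    let via_raw = unsafe { *raw };

    (sum, via_raw, Rc::strong_count(&shared))
}

fn main() {
    println!("{:?}", pointer_demo());
}
```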

 ---------------------------

 When inserting an implicit copy for something big, the compiler 
 will warn, so that you know that the code is not as efficient 
 as it looks.

 ---------------------------

 Argument passing styles

 ...

 Another style is by-move, which will cause the argument to 
 become de-initialized on the caller side, and give ownership of 
 it to the called function. This is written -.

 Finally, the default passing styles (by-value for 
 non-structural types, by-reference for structural ones) are 
 written + for by-value and && for by(-immutable)-reference. It 
 is sometimes necessary to override the defaults. We'll talk 
 more about this when discussing generics.
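
The by-move and by-reference styles described above can be sketched in present-day Rust as follows. This is an assumption about how the idea maps onto current syntax: the `-`, `+` and `&&` markers are gone, passing a non-`Copy` value moves it by default, and `&` borrows it immutably. The function names are hypothetical.

```rust
/// By-reference: the caller keeps ownership of the data.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

/// By-move: the function takes ownership of the vector.
fn into_sorted(mut values: Vec<i32>) -> Vec<i32> {
    values.sort();
    values
}

fn main() {
    let data = vec![3, 1, 2];
    let sum = total(&data);          // borrowed, `data` is still usable
    let sorted = into_sorted(data);  // moved, `data` is de-initialized here
    // println!("{:?}", data);       // would not compile: use of moved value
    println!("{sum} {sorted:?}");
}
```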

 ==============================================

 The second introduction I have found:
 https://github.com/graydon/rust/wiki/

 ---------------------------

 https://github.com/graydon/rust/wiki/Unit-testing

 Rust has built in support for simple unit testing. Functions 
 can be marked as unit tests using the 'test' attribute.


 #[test]
 fn return_none_if_empty() {
    ... test code ...
 }

 A test function's signature must have no arguments and no 
 return value. To run the tests in a crate, it must be compiled 
 with the '--test' flag: rustc myprogram.rs --test -o 
 myprogram-tests. Running the resulting executable will run all 
 the tests in the crate. A test is considered successful if its 
 function returns; if the task running the test fails, through a 
 call to fail, a failed check or assert, or some other means, 
 then the test fails.

 When compiling a crate with the '--test' flag '--cfg test' is 
 also implied, so that tests can be conditionally compiled.


 #[cfg(test)]
 mod tests {
   #[test]
   fn return_none_if_empty() {
     ... test code ...
   }
 }

 Note that attaching the 'test' attribute to a function does not 
 imply the 'cfg(test)' attribute. Test items must still be 
 explicitly marked for conditional compilation (though this 
 could change in the future).

 Tests that should not be run can be annotated with the 'ignore' 
 attribute. The existence of these tests will be noted in the 
 test runner output, but the test will not be run.

 A test runner built with the '--test' flag supports a limited 
 set of arguments to control which tests are run: the first free 
 argument passed to a test runner specifies a filter used to 
 narrow down the set of tests being run; the '--ignored' flag 
 tells the test runner to run only tests with the 'ignore' 
 attribute.
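
A minimal sketch of the same three attributes in present-day Rust, where they still exist. The function under test, `first_or_none`, and the second test are hypothetical and only there to give the attributes something to act on.

```rust
/// Hypothetical function under test (not from the wiki page).
fn first_or_none(items: &[i32]) -> Option<i32> {
    items.first().copied()
}

// Compiled only when building the test runner.
#[cfg(test)]
mod tests {
    use super::*;

    // Passes if the function returns without panicking.
    #[test]
    fn return_none_if_empty() {
        assert_eq!(first_or_none(&[]), None);
    }

    // Reported by the runner but skipped unless ignored tests are requested.
    #[test]
    #[ignore]
    fn slow_check() {
        assert_eq!(first_or_none(&[7, 8]), Some(7));
    }
}

fn main() {
    println!("{:?}", first_or_none(&[7, 8]));
}
```

With a current toolchain this would typically be run with `cargo test`, or `cargo test -- --ignored` for the ignored tests only; `rustc --test` on a single file also still works.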

 Parallelism

 By default, tests are run in parallel, which can make 
 interpreting failure output difficult. In these cases you can 
 set the RUST_THREADS environment variable to 1 to make the 
 tests run sequentially.

 Examples

 Typical test run

 mytests
 running 30 tests
 running driver::tests::mytest1 ... ok
 running driver::tests::mytest2 ... ignored
 ... snip ...
 running driver::tests::mytest30 ... ok
 result: ok. 28 passed; 0 failed; 2 ignored

 Test run with failures

 mytests
 running 30 tests
 running driver::tests::mytest1 ... ok
 running driver::tests::mytest2 ... ignored
 ... snip ...
 running driver::tests::mytest30 ... FAILED
 result: FAILED. 27 passed; 1 failed; 2 ignored

 Running ignored tests

 mytests --ignored
 running 2 tests
 running driver::tests::mytest2 ... failed
 running driver::tests::mytest10 ... ok
 result: FAILED. 1 passed; 1 failed; 0 ignored

 Running a subset of tests

 mytests mytest1
 running 11 tests
 running driver::tests::mytest1 ... ok
 running driver::tests::mytest10 ... ignored
 ... snip ...
 running driver::tests::mytest19 ... ok
 result: ok. 11 passed; 0 failed; 1 ignored

 ---------------------------

 https://github.com/graydon/rust/wiki/Error-reporting

 Incorrect use of numeric literals.

 auto i = 0u;
 i += 3; // suggest "3u"

 Use of for where for each was meant.

 for (v in foo.iter()) // suggest "for each"

 This is something I'd like in D too:
 http://d.puremagic.com/issues/show_bug.cgi?id=6638

 ---------------------------

 https://github.com/graydon/rust/wiki/Attribute-notes

 Crate Linkage Attributes

 A crate's version is determined by the link attribute, which is 
 a list meta item containing metadata about the crate. This 
 metadata can, in turn, be used in providing partial matching 
 parameters to syntax extension loading and crate importing 
 directives, denoted by the syntax and use keywords respectively.

 All meta items within a link attribute contribute to the 
 versioning of a crate, and two meta items, name and vers, have 
 special meaning and must be present in all crates compiled as 
 shared libraries.

 An example of a typical crate link attribute:

 vers = "0.1",
 uuid = "122bed0b-c19b-4b82-b0b7-7ae8aead7297",
 url = "http://rust-lang.org/src/std")];

 ==============================================

 Regarding different kinds of pointers in D, I have recently 
 found this:
 http://herbsutter.com/2011/10/25/garbage-collection-synopsis-and-c/

 From what I understand in this comment by Herb Sutter, I was 
 right when about three years ago I was asking for a second 
 pointer type in D:
Mark-compact (aka moving) collectors, where live objects are 
moved together to make allocated memory more compact. Note that 
doing this involves updating pointers’ values on the fly. This 
category includes semispace collectors as well as the more 
efficient modern ones like the .NET CLR’s that don’t use up 
half your memory or address space. C++ cannot support this 
without at least a new pointer type, because C/C++ pointer 
values are required to be stable (not change their values), so 
that you can cast them to an int and back, or write them to a 
file and back; this is why we created the ^ pointer type for 
[...] compacting GC heaps. See section 3.3 of my paper 
(http://www.gotw.ca/publications/C++CLIRationale.pdf ) A Design 
Rationale for C++/CLI for more rationale about ^ and gcnew.<
 Tell me if I am wrong still. How do you implement a moving GC 
 in D if D has raw pointers? D semantics doesn't allow the GC to 
 automatically modify those pointers when the GC moves the data.

 --------------------------

 As you see, this post of mine discusses neither typestates nor 
 syntax macros. I have not found enough info about them in the 
 Rust docs. Even if Rust does not become widespread, it will 
 introduce typestates in the cauldron of features known by 
 future language designers (and maybe future programmers too), 
 or it will show why typestates are not a good idea. In all 
 three cases Rust will be useful.

 Some comments regarding D:

 - I'd like the better error messages I have discussed in bug 
   6638.
 - Tuple de-structuring syntax will be good to have in D too. 
   There is a patch on this. If the ideas of the patch are not 
   developed enough, then I suggest presenting the design 
   problems and discussing and solving them.
 - I'd like a bit more flexible switch in D, discussion:
   http://d.puremagic.com/issues/show_bug.cgi?id=596
   This is just an additive change, I think it causes no 
   breaking changes.
 - Tag patterns used inside the switch-like "alt": syntax-wise 
   this looks less easy to implement in D.
 - I think unit testing in D needs more improvements. Rust is in 
   a less developed state compared to D, yet its unit testing 
   features seem better designed already. I think this is not 
   complex stuff to design and implement.

 Bye,
 bearophile
For those of you wondering what's the current state of typestate in Rust: it's dead. More information here: https://pcwalton.github.io/blog/2012/12/26/typestate-is-dead/
May 09 2014
prev sibling parent reply "Douglas Peterson" <Doug nowhere.com> writes:
Rust is quite seductive (own point of view of course) in its 
"traits" system. They've found the right median line between OOP 
and TMP. I mean it's a really nice concept.
May 10 2014
parent reply Xavier Bigand <flamaros.xavier gmail.com> writes:
On 10/05/2014 11:35, Douglas Peterson wrote:
 Rust is quite seductive (own point of view of course) in its "traits"
 system. They've found the right median line between OOP and TMP. I mean
 it's a really nice concept.
Do you have a direct link about traits? Because I am almost unable to see their tutorials (infinite loading time).
May 10 2014
parent "Dicebot" <public dicebot.lv> writes:
On Saturday, 10 May 2014 at 11:43:29 UTC, Xavier Bigand wrote:
 On 10/05/2014 11:35, Douglas Peterson wrote:
 Rust is quite seductive (own point of view of course) in its 
 "traits"
 system. They've found the right median line between OOP and 
 TMP. I mean
 it's a really nice concept.
Do you have a direct link about traits? Because I am almost unable to see their tutorials (infinite loading time).
http://static.rust-lang.org/doc/master/rust.html#traits
May 10 2014