digitalmars.D - Component programming

Chris (24/24) Jul 31 2013 This is only loosely related to D, but I don't fully understand

H. S. Teoh (45/55) Jul 31 2013 [...]
Justin Whear (16/43) Jul 31 2013 in-d/240008321)

bearophile (41/44) Jul 31 2013 What D calls "component programming" is very nice and good, but

Walter Bright (14/16) Jul 31 2013 Ironically, the component program from the article I wrote:

Ali Çehreli (3/4) Jul 31 2013 What do you mean exactly? :p
bearophile (8/11) Jul 31 2013 Benchmarking code written in two different languages is tricky,

Walter Bright (11/19) Jul 31 2013 You are right to be skeptical of cross language benchmarks.

bearophile (18/25) Jul 31 2013 I agree, I program often in Python, and it can be very useful,

Joseph Rushton Wakeling (7/10) Aug 01 2013 Yea, this was a frustration. :-( It was really nice to be able to write...
Andrei Alexandrescu (3/6) Aug 01 2013 ERROR 404 - PAGE NOT FOUND

anonymous observer (5/12) Aug 01 2013 This one, perhaps?

bearophile (6/11) Aug 01 2013 Yes, it's the same, thank you.

Andrei Alexandrescu (17/26) Jul 31 2013 I measured the timings. It was in a discussion between Walter, myself,

Justin Whear (9/15) Jul 31 2013 I disagree with your "toy" assessment. I've been using this chaining,

H. S. Teoh (64/77) Jul 31 2013 [...]

Walter Bright (3/4) Jul 31 2013 Thank you for an excellent and concise summary of what component program...
Chris (20/35) Aug 01 2013 I agree, and to be honest, loops have given me more than one
John Colvin (2/132) Aug 01 2013 Add in some code examples and that could make a nice article.

Walter Bright (2/4) Aug 01 2013 Yes, please!

H. S. Teoh (78/83) Aug 01 2013 Alright, so I decided to prove my point about component programming by

Walter Bright (3/10) Aug 01 2013 I think this is awesome, and this + your previous post are sufficient to...

H. S. Teoh (9/20) Aug 02 2013 OK, here's a draft of the article:

Walter Bright (2/7) Aug 02 2013 Get 'em up on bugzilla! (At least any that fail with HEAD.)
Timon Gehr (16/34) Aug 02 2013 Also, you may want to replace some of the manually implemented ranges

Andrei Alexandrescu (4/27) Aug 02 2013 Would be nice to have a couple of these both explicit and also

H. S. Teoh (13/37) Aug 03 2013 Thanks! I replaced the code with the first version above. I decided that

Jonathan M Davis (10/20) Aug 03 2013 You could also use std.datetime.Interval and do something like

bearophile (122/127) Aug 03 2013 Most of the code below is not tested. So my suggestions may

Andre Artus (2/10) Aug 03 2013 It should probably be picked up from the OS, to support

Jonathan M Davis (6/21) Aug 03 2013 std.datetime has something like that internally for some of what it does...

Jacob Carlborg (5/9) Aug 12 2013 The solution for that is to make it possible to plug in support for new

H. S. Teoh (56/212) Aug 03 2013 Yeah I'll look into that sometime. It'll definitely be a useful thing to

bearophile (6/10) Aug 06 2013 If not already present, I suggest you to put a reduced version of

H. S. Teoh (13/22) Aug 06 2013 [...]

H. S. Teoh (10/17) Aug 03 2013 [...]

Justin Whear (7/18) Aug 02 2013 I recently wrote a range component for my current project that is simila...

H. S. Teoh (6/26) Aug 02 2013 It would be nice to collect these custom ranges and see if there's some

bearophile (4/6) Aug 02 2013 chunkBy seems OK for Phobos.

Timon Gehr (9/14) Aug 02 2013 Which version of the compiler are you using?

H. S. Teoh (9/26) Aug 02 2013 Can you send me the error messages? I'll see if I can reorder the code
H. S. Teoh (16/31) Aug 02 2013 I just checked DMD 2.063, and it appears that the error is caused by a

Timon Gehr (2/5) Aug 02 2013 I think it pulled in the wrong version of druntime.

H. S. Teoh (7/14) Aug 03 2013 OK, I've written a simple replacement for 2.063 std.range.chunks inside

Dejan Lekic (2/145) Aug 05 2013 Good work! I've read the article yesterday. Very educational!

H. S. Teoh (9/10) Aug 05 2013 Thanks!

Dejan Lekic (3/133) Aug 01 2013 This post deserves to become an article somewhere. D Wiki, some

Dejan Lekic (16/61) Aug 01 2013 I was honestly thinking whether I should reply to this rant or
Brad Anderson (7/13) Aug 01 2013 Resident compiler guys,

Walter Bright (5/9) Aug 01 2013 I don't know.

bearophile (9/13) Aug 01 2013 I agree.

Andre Artus (24/35) Aug 03 2013 Who's giving the course and where will it be held?

David Nadlinger (4/17) Aug 03 2013 In this example, no, as all involved ranges are evaluated lazily.

Andre Artus (8/26) Aug 03 2013 I probably could have worded it better: I did not intend to imply
Walter Bright (4/6) Aug 03 2013 The rules for ranges do not specify if they are done eagerly, lazily, or...

H. S. Teoh (46/64) Jul 31 2013 Keep in mind, though, that you pay for the avoidance of OO indirections

Meta (5/5) Aug 01 2013 The one thing that confused me at first when I read Walter's

Brad Anderson (7/12) Aug 01 2013 "Component programing" is kind of a crowded term in programming

H. S. Teoh (6/21) Aug 01 2013 What about "Ultra Range Processing"? :)

John Colvin (6/26) Aug 01 2013 Range-Flow Processing.

John Colvin (2/33) Aug 01 2013 Alternatively, substitute Procesing with Programming.

qznc (14/40) Aug 02 2013 A few days ago, there was a discussion about APL on HN [0]. What
Jason den Dulk (34/47) Aug 12 2013 What the wikipedia entry is saying, in a roundabout way is:

Chris (5/53) Aug 19 2013 Thanks for the explanation, Jason.

H. S. Teoh (8/14) Aug 21 2013 What version of ldmd2 are you using? Looks like my code is incompatible

"Chris" <wendlec tcd.ie> writes:

This is only loosely related to D, but I don't fully understand 
the separation of component programming and OOP (cf. 
https://en.wikipedia.org/wiki/Component-based_software_engineering#Differences_from_object-oriented_programming). 
In an OO framework, the objects are basically components. See also

"Brad Cox of Stepstone largely defined the modern concept of a 
software component.[4] He called them Software ICs and set out to 
create an infrastructure and market for these components by 
inventing the Objective-C programming language." (see link above)

Walter's example 
(http://www.drdobbs.com/architecture-and-design/component-programming-in-d/240008321)

import std.algorithm, std.array, std.stdio;

void main() {
    stdin.byLine(KeepTerminator.yes).   // 1
    map!(a => a.idup).                  // 2
    array.                              // 3
    sort.                               // 4
    copy(                               // 5
        stdout.lockingTextWriter());    // 6
}

This is more or less what mature OO programs look like. Ideally 
each class (component) does one thing (however small the class 
might be) and can be used or called to perform this task. All 
other classes or components can live independently. From my 
experience this is exactly what Objective-C does. Rather than 
subclassing, it uses other classes to get a job done.

Jul 31 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Wed, Jul 31, 2013 at 12:20:56PM +0200, Chris wrote:
 This is only loosely related to D, but I don't fully understand the
 separation of component programming and OOP (cf.
https://en.wikipedia.org/wiki/Component-based_software_engineering#Differences_from_object-oriented_programming).
 In an OO framework, the objects are basically components. See also
 
 "Brad Cox of Stepstone largely defined the modern concept of a
 software component.[4] He called them Software ICs and set out to
 create an infrastructure and market for these components by
 inventing the Objective-C programming language." (see link above)
 
 Walter's example (http://www.drdobbs.com/architecture-and-design/component-programming-in-d/240008321)

[...]

Thanks for the link to Walter's article, it was a very insightful read.

I can't say I'm that clear about the difference between component
programming and OO, so I'll decline comment.

One question that the article did raise, though, was what to do when
your algorithms require non-linear interconnectivity between components.
For example, say I have an expression evaluator that takes an object
that represents a math expression, and another object representing a set
of identifier-to-value mappings, and returns the value of the expression
given those mappings:

	Value evalExpr(Expr,Ident,Value)(Expr e, Value[Ident] mappings)
	{
		...
	}

In the spirit of component programming, one would conceivably have an
expression parsing component that takes, say, an input stream of
characters and returns an expression object:

	Expr parseExpr(InputRange)(InputRange input)
		if (is(ElementType!InputRange : dchar))
	{
		...
	}

And conceivably, one would also have a variable assignment parser that
parses an input stream of characters containing user-typed value
assignments, and returns a Value[Ident] hash (obviously, this last bit
can be generalized to any AA-like interface, but let's keep it simple
for now):

	Value[Ident] parseBindings(InputRange)(InputRange input) { ... }

So now, my main code would look like this:

	void main(string[] args) {
		assert(args.length == 3);
		parseExpr(args[1])
			.evalExpr(parseBindings(args[2]))
			.copy(stdout);
	}

Which is not as pretty, because of the non-linear dependence on args[1]
and args[2]. I've a hard time turning this into its own reusable
component, because it requires multiple disparate inputs. It's also easy
to come up with examples with multiple outputs, and, in the general
case, components with n-to-m input/output connectivity. How do we still
maximize reusability in those cases?


T

-- 
There are two ways to write error-free programs; only the third one works.

Jul 31 2013

Justin Whear <justin economicmodeling.com> writes:

On Wed, 31 Jul 2013 12:20:56 +0200, Chris wrote:

 This is only loosely related to D, but I don't fully understand the
 separation of component programming and OOP (cf.
 https://en.wikipedia.org/wiki/Component-based_software_engineering#Differences_from_object-oriented_programming).
 In an OO framework, the objects are basically components. See also
 
 "Brad Cox of Stepstone largely defined the modern concept of a software
 component.[4] He called them Software ICs and set out to create an
 infrastructure and market for these components by inventing the
 Objective-C programming language." (see link above)
 
 Walter's example
 (http://www.drdobbs.com/architecture-and-design/component-programming-in-d/240008321)
 
 void main() {
     stdin.byLine(KeepTerminator.yes).   // 1
     map!(a => a.idup).                  // 2
     array.                              // 3
     sort.                               // 4
     copy(                               // 5
         stdout.lockingTextWriter());    // 6
 }
 
 This is more or less how mature OO programs look like. Ideally each
 class (component) does one thing (however small the class might be) and
 can be used or called to perform this task. All other classes or
 components can live independently. From my experience this is exactly
 what Objective-C does. Rather than subclassing, it uses other classes to
 get a job done.

A few things:
1) The functions used in Walter's example are not methods, they are 
generic free functions.  The "interfaces" they require are not actual OOP 
interfaces, but rather a description of what features the supplied type 
must supply.
2) The avoidance of actual objects, interfaces, and methods means that 
the costly indirections of OOP are also avoided.  The compiler is free to 
inline as much of the pipeline as it wishes.
3) Component programming simplifies usage requirements, OOP frameworks 
complicate usage requirements (e.g. you must inherit from this class).

If anything, component programming is just functional programming + 
templates and some nice syntactic sugar.  And a healthy dose of pure 
awesome.
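
Points 1 and 3 above can be illustrated with a minimal sketch. The names sumAll and Countdown are hypothetical and not taken from Phobos or from Walter's article; the template constraint plays the role that an interface would play in an OOP framework, but it is only a compile-time description of what the supplied type must offer.

```d
import std.algorithm : map;
import std.range : ElementType, isInputRange, iota;
import std.stdio : writeln;

// Hypothetical component: sums any input range whose elements convert to int.
// The constraint describes required features; no base class is involved.
int sumAll(R)(R r)
    if (isInputRange!R && is(ElementType!R : int))
{
    int total = 0;
    foreach (x; r)
        total += x;
    return total;
}

// A hand-written range: a plain struct that inherits from nothing.
struct Countdown
{
    int n;
    @property bool empty() const { return n <= 0; }
    @property int front() const { return n; }
    void popFront() { --n; }
}

void main()
{
    writeln([1, 2, 3].sumAll);                    // built-in array
    writeln(iota(1, 5).map!(x => x * x).sumAll);  // lazy pipeline
    writeln(Countdown(4).sumAll);                 // user-defined struct
}
```

The same free function accepts an array, a lazy Phobos pipeline, and a user-defined struct, because each one satisfies the constraint structurally.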

Jul 31 2013

"bearophile" <bearophileHUGS lycos.com> writes:

Justin Whear:

 If anything, component programming is just functional 
 programming + templates and some nice syntactic sugar.
 And a healthy dose of pure awesome.

What D calls "component programming" is very nice and good, but 
in D it's almost a joke.

Currently this code inlines nothing (the allocations, the 
difference and the product):


import std.numeric: dotProduct;
int main() {
     enum N = 50;
     auto a = new int[N];
     auto b = new int[N];
     auto c = new int[N];
     c[] = a[] - b[];
     int result = dotProduct(c, c);
     return result;
}


If you write it in component-style (using doubles here):


import std.math;
import std.algorithm, std.range;

int main() {
     enum N = 50;
     alias T = double;
     auto a = new T[N];
     auto b = new T[N];

     return cast(int)zip(a, b)
            .map!(p => (p[0] - p[1]) ^^ 2)
            .reduce!q{a + b};
}


The situation gets much worse: you see many functions in the 
binary that even LDC2 is often unable to inline. The GHC Haskell 
compiler turns similar "components" code into efficient SIMD asm 
(that uses packed doubles, like double2); it inlines everything, 
merges the loops, produces a small amount of asm output, and 
there is no "c" intermediate array. In GHC "component 
programming" is mature (and Intel is developing a Haskell 
compiler that optimizes even more), while in D/dmd/Phobos 
this work has only just started. GHC has a twenty+ year head start 
on this and it shows.

The situation should be improved for D/dmd/Phobos, otherwise such 
D component programming remains partially a dream, or a toy.

Bye,
bearophile

Jul 31 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 7/31/2013 3:23 PM, bearophile wrote:
 The situation should be improved for D/dmd/Phobos, otherwise such D component
 programming remains partially a dream, or a toy.

Ironically, the component program from the article I wrote:

     import std.algorithm, std.array, std.stdio;

     void main() {
         stdin.byLine(KeepTerminator.yes).   // 1
         map!(a => a.idup).                  // 2
         array.                              // 3
         sort.                               // 4
         copy(                               // 5
             stdout.lockingTextWriter());    // 6
     }

is 2x faster than the Haskell version:

     import Data.List
     import qualified Data.ByteString.Lazy.Char8 as L
     main = L.interact $ L.unlines . sort . L.lines

Jul 31 2013

Ali Çehreli <acehreli yahoo.com> writes:

On 07/31/2013 03:46 PM, Walter Bright wrote:

 is 2x faster

What do you mean exactly? :p

Ali

Jul 31 2013

"bearophile" <bearophileHUGS lycos.com> writes:

Walter Bright:

 Ironically, the component program from the article I wrote:
 ...
 is 2x faster than the Haskell version:

Benchmarking code written in two different languages is tricky; 
there are so many sources of mistakes, even if you know both 
languages well. But I accept your timing. And I say that it's good :-) 
We should aim to be better than the Intel Labs Haskell Research 
Compiler (HRC) :-)

Bye,
bearophile

Jul 31 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 7/31/2013 4:17 PM, bearophile wrote:
 Walter Bright:

 Ironically, the component program from the article I wrote:
 ...
 is 2x faster than the Haskell version:

 Benchmarking code written in two different languages is tricky, there are so
 many sources of mistakes, even if you know well both languages. But I accept
 your timing. And I say that it's good :-) We should aim to be better than the
 Intel Labs Haskell Research Compiler (HRC) :-)

You are right to be skeptical of cross language benchmarks.

Some data points you might find interesting:

1. I made no attempt to optimize the D version (other than throwing the 
appropriate compiler switches). It's meant to be the straightforward "naive" 
implementation.

2. I did not write the Haskell version - Bartosz Milewski did. He admits to not 
being an expert on Haskell, and there may be faster ways to do it.


I'll also agree with you that the component programming style is new in D, and 
probably could benefit a great deal from 20 years of concerted effort :-)

I disagree with you that it is a toy, however. Speed is only one measure of
utility.

Jul 31 2013

"bearophile" <bearophileHUGS lycos.com> writes:

Walter Bright:

 Speed is only one measure of utility.

I agree, I program often in Python, and it can be very useful, 
despite being sometimes not fast at all.

But as Haskell folks sometimes say, a modern language should try 
to allow a high level style of coding while still keeping a "good 
enough" efficiency.

------------------------

Justin Whear:

 I hadn't realized how bug-prone non-trivial loops tend to be 
 until I started writing this way and avoided them entirely.

I agree.


 Thus far, I don't think I've rewritten anything out of the 
 component programming style, so while probably not optimal, 
 it's been more than good enough.

Take a look at this thread in D.learn:

http://forum.dlang.org/thread/mailman.304.1375190212.22075.digitalmars-d-learn puremagic.com

------------------------

Andrei Alexandrescu:

 Is that a lot better than ghc?

According to this article it seems better, but I have no direct 
experience of it:

http://www.leafpetersen.com/leaf/publications/hs2013/haskell-gap.pdf

Bye,
bearophile

Jul 31 2013

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 08/01/2013 03:40 AM, bearophile wrote:
 Take a look at this thread in D.learn:
 
 http://forum.dlang.org/thread/mailman.304.1375190212.22075.digitalmars-d-learn puremagic.com

Yea, this was a frustration. :-(  It was really nice to be able to write simple,
clean, elegant code using D -- it was sad to discover that while this was great
for a prototype, the performance gap was far too large to make it a viable
long-term solution.

Most of the issues seem to centre around GC, so there might be some low-hanging
fruit there for performance improvements.

Aug 01 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 7/31/13 6:40 PM, bearophile wrote:
 According to this article it seems better, but I have no direct
 experience of it:

 http://www.leafpetersen.com/leaf/publications/hs2013/haskell-gap.pdf

ERROR 404 - PAGE NOT FOUND

Andrei

Aug 01 2013

"anonymous observer" <ao dlang.org> writes:

On Thursday, 1 August 2013 at 16:13:55 UTC, Andrei Alexandrescu
wrote:
 On 7/31/13 6:40 PM, bearophile wrote:
 According to this article it seems better, but I have no direct
 experience of it:

 http://www.leafpetersen.com/leaf/publications/hs2013/haskell-gap.pdf

 ERROR 404 - PAGE NOT FOUND

 Andrei

This one, perhaps?

http://www.leafpetersen.com/leaf/publications/ifl2013/haskell-gap.pdf

Bye

Aug 01 2013

"bearophile" <bearophileHUGS lycos.com> writes:

anonymous observer:

 ERROR 404 - PAGE NOT FOUND

 Andrei

 This one, perhaps?

 http://www.leafpetersen.com/leaf/publications/ifl2013/haskell-gap.pdf

Yes, it's the same, thank you.

Another comparison (I have not yet read this):
http://www.leafpetersen.com/leaf/publications/hs2013/hrc-paper.pdf

Bye,
bearophile

Aug 01 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 7/31/13 4:17 PM, bearophile wrote:
 Walter Bright:

 Ironically, the component program from the article I wrote:
 ...
 is 2x faster than the Haskell version:

 Benchmarking code written in two different languages is tricky, there
 are so many sources of mistakes, even if you know well both languages.

I measured the timings. It was in a discussion between Walter, myself, 
and a common friend. That friend said Haskell will do a lot better on 
the same task. I measured by piping 1,000,000 lines of real log data 
through the tested programs.

His first version was:

import Data.List
main = interact $ unlines . sort . lines

This took 51 seconds. The friend got a bit miffed complaining I only 
measured to ridicule his code (but this is hardly the first time hard 
numbers offended someone) and went to ask on Haskell fora on how to make 
the code faster. His second version was:

import Data.List
import qualified Data.ByteString.Lazy.Char8 as L
main = L.interact $ L.unlines . sort . L.lines

This version took 7 seconds. A debug version of the D code took 3 seconds.

 But I accept your timing.

This is most gracious considering the crass statement with which you opened.

 And I say that it's good :-) We should aim to
 be better than the Intel Labs Haskell Research Compiler (HRC) :-)

Is that a lot better than ghc?


Andrei

Jul 31 2013

Justin Whear <justin economicmodeling.com> writes:

On Thu, 01 Aug 2013 00:23:52 +0200, bearophile wrote:
 
 The situation should be improved for D/dmd/Phobos, otherwise such D
 component programming remains partially a dream, or a toy.
 
 Bye,
 bearophile

I disagree with your "toy" assessment.  I've been using this chaining, 
component style for a while now and have really enjoyed the clarity it's 
brought to my code.  I hadn't realized how bug-prone non-trivial loops 
tend to be until I started writing this way and avoided them entirely.  
My policy is to aim for clarity and legibility first and to rewrite for 
performance only if necessary.  Thus far, I don't think I've rewritten 
anything out of the component programming style, so while probably not 
optimal, it's been more than good enough.

Jul 31 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Wed, Jul 31, 2013 at 11:52:35PM +0000, Justin Whear wrote:
 On Thu, 01 Aug 2013 00:23:52 +0200, bearophile wrote:
 
 The situation should be improved for D/dmd/Phobos, otherwise such D
 component programming remains partially a dream, or a toy.
 
 Bye,
 bearophile

 
 I disagree with your "toy" assessment.  I've been using this chaining,
 component style for a while now and have really enjoyed the clarity
 it's brought to my code.  I hadn't realized how bug-prone non-trivial
 loops tend to be until I started writing this way and avoided them
 entirely.

[...]

One of the more influential courses I took in college was on Jackson
Structured Programming. It identified two sources of programming
complexity (i.e., where bugs are most likely to occur): (1) mismatches
between the structure of the program and the structure of the data
(e.g., you're reading an input file that has a preamble, body, and
epilogue, but your code has a single loop over lines in the file); (2)
writing loop invariants (or equivalently, loop conditions).

Most non-trivial loops in imperative code have both, which makes them
doubly prone to bugs. In the example I gave above, the mismatch between
the code structure (a single loop) and the file structure (three
sequential sections) often prompts people to add boolean flags, state
variables, and the like, in order to resolve the conflict between the
two structures. Such ad hoc structure resolutions are a breeding ground
for bugs, and often lead to complicated loop conditions, which invite
even more bugs.

In contrast, if you structure your code according to the structure of
the input (i.e., one loop for processing the preamble, one loop for
processing the body, one loop for processing the epilogue), it becomes
considerably less complex, easier to read (and write!), and far less bug
prone. Your loop conditions become simpler, and thus easier to reason
about and leave less room for bugs to hide.

But to be able to process the input in this way requires that you
encapsulate your input so that it can be processed by 3 different loops.
Once you go down that road, you start to arrive at the concept of input
ranges... then you abstract away the three loops into three components,
and behold, component style programming!
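
A minimal sketch of this idea, under an assumed file format that is not from the post: lines starting with '#' form the preamble, the body runs up to a line reading "END", and everything after that is the epilogue. The names Report and parseReport are hypothetical. Each section gets its own loop with a trivial condition, and all three loops consume the same input range.

```d
import std.algorithm : startsWith;
import std.range;
import std.stdio : writeln;

// Hypothetical result type holding the three sections of the input.
struct Report
{
    string[] headers;
    string[] records;
    string[] trailer;
}

// One loop per section; no flags or state variables are needed.
Report parseReport(R)(R lines)
    if (isInputRange!R && is(ElementType!R : string))
{
    Report rep;

    // Loop 1: preamble lines start with '#'.
    for (; !lines.empty && lines.front.startsWith("#"); lines.popFront())
        rep.headers ~= lines.front;

    // Loop 2: body lines, up to the "END" marker.
    for (; !lines.empty && lines.front != "END"; lines.popFront())
        rep.records ~= lines.front;

    // Skip the "END" marker itself, if present.
    if (!lines.empty)
        lines.popFront();

    // Loop 3: whatever remains is the epilogue.
    for (; !lines.empty; lines.popFront())
        rep.trailer ~= lines.front;

    return rep;
}

void main()
{
    auto input = ["# title", "# author", "row 1", "row 2", "END", "checksum"];
    auto rep = parseReport(input);
    writeln(rep.headers);
    writeln(rep.records);
    writeln(rep.trailer);
}
```

A single-loop version of the same parser would need a state variable to remember which section it is in, which is the kind of ad hoc structure resolution described above.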

In fact, with component style programming, you can also address another
aspect of (1): when you need to simultaneously process two data
structures whose structures don't match. For example, if you want to lay
out a yearly calendar using writeln, the month/day cells must be output
in a radically different order than the logical foreach(m;1..13) {
foreach(day;1..32) } structure. Writing this code in the traditional
imperative style produces a mass of spaghetti code: either you have
bizarre loops with convoluted loop conditions for generating the dates
in the order you want to print them, or you have to fill out some kind
of grid structure in a complicated order so that you can generate the
dates in order.

Using ranges, though, this becomes considerably more tractable: you can
have an input range of dates in chronological order, two output ranges
corresponding to chunking by week / month, which feed into a third
output range that buffers the generated cells and prints them once
enough has been generated to fill a row of output. By separating out
these non-corresponding structures into separate components, you greatly
simplify the code within each component and thus reduce the number of
bugs (e.g. it's far easier to ensure you never put more than 7 days in a
week, since the weekly output range is all in one place, as opposed to
sprinkled everywhere across multiple nested loops in the imperative
style calendar code). The code that glues these components together is
also separated out and becomes easier to understand and debug: you
simply read from the input range of dates, write to the two output
ranges, and check if they are full (this isn't part of the range API, but
is added for this particular example); if the weekly range is full,
start a new week; if the monthly range is full, start a new month. Then
the final output range takes care of when to actually produce output --
you just write stuff to it and don't worry about it in the glue code.

OK, this isn't really a good example of the linear pipeline style code
we're talking about, but it does show how using ranges as components can
untangle very complicated code into simple, tractable parts that are
readable and easy to debug.
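
As a rough, simplified sketch of the separation described above (one month only, and not the design with two output ranges and a buffering third one): the function names datesInMonth, byWeek and formatWeek are hypothetical, and byWeek is eager purely to keep the example short. Each component owns one concern, so the "at most 7 days per row" rule lives in a single place.

```d
import core.time : days;
import std.algorithm : map;
import std.datetime : Date, DayOfWeek;
import std.range : ElementType, iota, isInputRange;
import std.stdio : writeln;
import std.string : format;

// Component 1: all dates of one month, in chronological order.
auto datesInMonth(int year, int month)
{
    auto first = Date(year, month, 1);
    return iota(0, cast(int) first.daysInMonth)
        .map!(i => first + i.days);
}

// Component 2: split a date range into rows, each starting on a Sunday.
// Eager for brevity; a lazy range would apply the same rule.
Date[][] byWeek(R)(R dates)
    if (isInputRange!R && is(ElementType!R : Date))
{
    Date[][] weeks;
    foreach (d; dates)
    {
        if (weeks.length == 0 || d.dayOfWeek == DayOfWeek.sun)
            weeks ~= [d];
        else
            weeks[$ - 1] ~= d;
    }
    return weeks;
}

// Component 3: render one row, three characters per day cell.
string formatWeek(const(Date)[] week)
{
    string line;
    // Pad so the first date lands under its weekday column.
    foreach (i; 0 .. cast(int) week[0].dayOfWeek)
        line ~= "   ";
    foreach (d; week)
        line ~= format("%3d", cast(int) d.day);
    return line;
}

void main()
{
    // Glue code: only wiring, no layout logic.
    writeln(" Su Mo Tu We Th Fr Sa");
    foreach (line; datesInMonth(2013, 8).byWeek.map!formatWeek)
        writeln(line);
}
```

Laying out a whole year side by side would need further components (chunking by month, pasting month blocks together), but the glue code would keep the same shape.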


T

-- 
If you compete with slaves, you become a slave. -- Norbert Wiener

Jul 31 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 7/31/2013 5:46 PM, H. S. Teoh wrote:
 [...]

Thank you for an excellent and concise summary of what component programming is 
all about!

Jul 31 2013

"Chris" <wendlec tcd.ie> writes:

On Thursday, 1 August 2013 at 00:47:43 UTC, H. S. Teoh wrote:

 Most non-trivial loops in imperative code have both, which makes them
 doubly prone to bugs. In the example I gave above, the mismatch between
 the code structure (a single loop) and the file structure (three
 sequential sections) often prompts people to add boolean flags, state
 variables, and the like, in order to resolve the conflict between the
 two structures. Such ad hoc structure resolutions are a breeding ground
 for bugs, and often lead to complicated loop conditions, which invite
 even more bugs.


 T

I agree, and to be honest, loops have given me more than one 
headache. It's so easy to lose track of what is going on where 
and why. And if you have ever had the pleasure of adding to or 
debugging code that handles three or more different issues in one 
loop, then you will know how mind boggling loops can be.

Your example is very good (you should write an article about it) 
and similar examples occur in web development all the time 
(creating tables, lists etc). I once wrote an event calendar for 
a homepage and _partly_ disentangled the loop for simplicity's 
sake. I say "partly" because it is still a bit "loopy". And I 
guess this is what component programming is all about, 
disentangling code.

The only difficulty I have is the opposition to OOP. I don't 
really see how the two concepts are mutually exclusive. OOP can 
benefit from component programming and vice versa.

Component programming is a good choice for web programming where 
loops abound. I'm tired of the infinite loops (pardon the pun 
again) in JavaScript and the like. Surely there's gotta be a better 
way.

Aug 01 2013

"John Colvin" <john.loughran.colvin gmail.com> writes:

On Thursday, 1 August 2013 at 00:47:43 UTC, H. S. Teoh wrote:
 On Wed, Jul 31, 2013 at 11:52:35PM +0000, Justin Whear wrote:
 On Thu, 01 Aug 2013 00:23:52 +0200, bearophile wrote:
 
 The situation should be improved for D/dmd/Phobos, otherwise 
 such D
 component programming remains partially a dream, or a toy.
 
 Bye,
 bearophile

 
 I disagree with your "toy" assessment.  I've been using this 
 chaining,
 component style for a while now and have really enjoyed the 
 clarity
 it's brought to my code.  I hadn't realized how bug-prone 
 non-trivial
 loops tend to be until I started writing this way and avoided 
 them
 entirely.

 [...]

 One of the more influential courses I took in college was on 
 Jackson
 Structured Programming. It identified two sources of programming
 complexity (i.e., where bugs are most likely to occur): (1) 
 mismatches
 between the structure of the program and the structure of the 
 data
 (e.g., you're reading an input file that has a preamble, body, 
 and
 epilogue, but your code has a single loop over lines in the 
 file); (2)
 writing loop invariants (or equivalently, loop conditions).

 Most non-trivial loops in imperative code have both, which 
 makes them
 doubly prone to bugs. In the example I gave above, the mismatch 
 between
 the code structure (a single loop) and the file structure (three
 sequential sections) often prompts people to add boolean flags, 
 state
 variables, and the like, in order to resolve the conflict 
 between the
 two structures. Such ad hoc structure resolutions are a 
 breeding ground
 for bugs, and often lead to complicated loop conditions, which 
 invite
 even more bugs.

 In contrast, if you structure your code according to the 
 structure of
 the input (i.e., one loop for processing the preamble, one loop 
 for
 processing the body, one loop for processing the epilogue), it 
 becomes
 considerably less complex, easier to read (and write!), and far 
 less bug
 prone. Your loop conditions become simpler, and thus easier to 
 reason
 about and leave less room for bugs to hide.

 But to be able to process the input in this way requires that 
 you
 encapsulate your input so that it can be processed by 3 
 different loops.
 Once you go down that road, you start to arrive at the concept 
 of input
 ranges... then you abstract away the three loops into three 
 components,
 and behold, component style programming!

 In fact, with component style programming, you can also address 
 another
 aspect of (1): when you need to simultaneously process two data
 structures whose structures don't match. For example, if you 
 want to lay
 out a yearly calendar using writeln, the month/day cells must 
 be output
 in a radically different order than the logical 
 foreach(m;1..12) {
 foreach(day;1..31) } structure). Writing this code in the 
 traditional
 imperative style produces a mass of spaghettii code: either you 
 have
 bizarre loops with convoluted loop conditions for generating 
 the dates
 in the order you want to print them, or you have to fill out 
 some kind
 of grid structure in a complicated order so that you can 
 generate the
 dates in order.

 Using ranges, though, this becomes considerably more tractable: you can
 have an input range of dates in chronological order, two output ranges
 corresponding to chunking by week / month, which feed into a third
 output range that buffers the generated cells and prints them once
 enough has been generated to fill a row of output. By separating out
 these non-corresponding structures into separate components, you greatly
 simplify the code within each component and thus reduce the number of
 bugs (e.g. it's far easier to ensure you never put more than 7 days in a
 week, since the weekly output range is all in one place, as opposed to
 sprinkled everywhere across multiple nested loops in the imperative
 style calendar code). The code that glues these components together is
 also separated out and becomes easier to understand and debug: you
 simply read from the input range of dates, write to the two output
 ranges, and check if they are full (this isn't part of the range API but
 added just for this particular example); if the weekly range is full,
 start a new week; if the monthly range is full, start a new month. Then
 the final output range takes care of when to actually produce output --
 you just write stuff to it and don't worry about it in the glue code.

 OK, this isn't really a good example of the linear pipeline style code
 we're talking about, but it does show how using ranges as components can
 untangle very complicated code into simple, tractable parts that are
 readable and easy to debug.


 T

Add in some code examples and that could make a nice article.

Aug 01 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 8/1/2013 2:23 AM, John Colvin wrote:
 On Thursday, 1 August 2013 at 00:47:43 UTC, H. S. Teoh wrote:
 Add in some code examples and that could make a nice article.

Yes, please!

Aug 01 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Thu, Aug 01, 2013 at 10:34:24AM -0700, Walter Bright wrote:
 On 8/1/2013 2:23 AM, John Colvin wrote:
On Thursday, 1 August 2013 at 00:47:43 UTC, H. S. Teoh wrote:
Add in some code examples and that could make a nice article.

 
 Yes, please!

Alright, so I decided to prove my point about component programming by
actually writing a fully-functional version of the calendar layout
program, so that I have a solid piece of evidence that component
programming lives up to its promise. :) In addition, I decided that for
maximum reusability, I want the output lines available in an input
range, with absolutely no binding to writeln whatsoever (except in
main() where the range is handed to writeln for output). In retrospect,
that was perhaps a bit too ambitious... I ran into a few roadblocks to
actually get the code working, so it took me a lot longer than I
anticipated to finish the code.

However, I *will* say that I'm very proud of the code: already there are
a few pieces that, if properly cleaned up and refined, probably deserve
inclusion in Phobos. Reusability FTW!! Now, just tell me if you've ever
seen a calendar layout program made of straightforward, reusable pieces.
I for sure haven't. I tried looking at the C code for the Unix cal
program once... It looked frighteningly similar to an IOCCC entry. :-/

My D version, however, built using ranges through and through, has many
pieces that are easily reusable. For example, if you wanted to output
only a single month instead, you could just call join("\n") on the range
of formatted month lines that the full year layout algorithm uses to
splice lines from multiple months together -- it's *that* reusable.

Anyway. Enough hand-waving in the air. Let the actual code speak for
itself:

	https://github.com/quickfur/dcal/blob/master/dcal.d

Now, w.r.t. the roadblocks I alluded to.

When I first started working on the code, my goal was to maximize usage
of existing Phobos facilities in order to show how many batteries D
already comes with. As it turned out, I could only use basic Phobos
components; some of the more complex pieces like frontTransversal, which
would've been perfect for the bit that splices formatted month lines
together, couldn't be used because it wasn't flexible enough to handle
the insertion of fillers when some subranges are empty. In the end, I
had to code that range by hand, and I can't say I'm that happy with it
yet. But at least, it's nothing compared to the hairy complexity of the
C version of cal.

Another place where I wanted to use existing Phobos components was
chunkBy. There's probably a way to do it if you think hard enough about
it, but in the end I felt it was simpler to just write the code myself.
Might be a failure on my part to recognize how to put existing Phobos
ranges in a clever enough way to achieve what I wanted. I did try to do
something similar to byWeek(), but somehow it didn't do what I wanted
and I decided to just code it by hand instead of investigating further.

By far the biggest roadblock I ran into was that after I wrote
everything up to (and including) pasteBlocks, my unittests refused to
work. Somehow, pasteBlocks kept repeating the first line of the output
(the month names, if you look at the unittest) and refused to advance
farther.  Eventually I traced the problem to Lines.popFront(), which
pops each subrange off the range of ranges. The problem is that this
only works on some ranges, but not others; if you pass the output of
formatMonths() straight to pasteBlocks(), it will NOT work. Why? Because
formatMonths() returns a std.algorithm.Map object, which recreates the
subrange each time, so Lines.popFront() is only popping a temporary copy
of the subrange, not the real thing. I was about to give up and try
another approach, when out of the blue I decided to try and see if I
could stuff the range returned by formatMonths() into an array, and then
pass *that* to pasteBlocks() -- and behold, it worked!!

This was a totally unexpected fix, that a newbie probably would never
have thought of, so this is a potential trap for newcomers to D who
expect components to just be pluggable. In retrospect, it makes sense --
you need to somehow buffer the ranges of formatted month lines
*somewhere* in order to be able to splice them together out of their
natural depth-first outer/inner range order. But this is not obvious at
all from first glance; perhaps it's a sign of a leaky abstraction
somewhere. We should probably look into why this is happening and how to
fix it. And there should be a way to test for this in pasteBlocks'
signature constraint so that future code won't fall into the same trap,
but I can't think of one right now.
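
The trap described above can be reproduced in a few lines. The following is only a sketch, not the dcal code: lazyRows is a made-up stand-in for formatMonths(), and iota stands in for the per-month line ranges. It shows that front on a map result hands back a fresh subrange on every call, while buffering with array gives the subranges somewhere to live:

```d
import std.algorithm : map;
import std.array : array;
import std.range : iota;

// Hypothetical stand-in for a lazy range of ranges: every call to
// .front re-runs the lambda and returns a brand-new iota.
auto lazyRows()
{
    return [3, 3].map!(n => iota(n));
}

unittest
{
    auto rows = lazyRows();

    // rows.front returns by value, so this pops only the local copy.
    auto tmp = rows.front;
    tmp.popFront();
    assert(rows.front.front == 0); // the next .front starts over

    // Once buffered in an array, each subrange is a stored element
    // and can be advanced in place.
    auto buffered = rows.array;
    buffered[0].popFront();
    assert(buffered[0].front == 1);
}
```

Compile with dmd -unittest -main to run the checks. The array call costs one allocation for the outer range only; the inner ranges stay lazy.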

Once this last bit worked, though, everything fell into place quickly.
After all unittests were passing, no more bugs were found!! The program
can print beautifully laid out calendars with no problems whatsoever.
I'm so in love with D right now... If I'd done this exercise in C or
C++, I'd be spending the next 2 days debugging before I could present
the code for the world to see. D ranges and unittest blocks are t3h
k00l.


T

-- 
It always amuses me that Windows has a Safe Mode during bootup. Does that mean
that Windows is normally unsafe?

Aug 01 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 8/1/2013 10:24 PM, H. S. Teoh wrote:
 Once this last bit worked, though, everything fell into place quickly.
 After all unittests were passing, no more bugs were found!! The program
 can print beautifully laid out calendars with no problems whatsoever.
 I'm so in love with D right now... If I'd done this exercise in C or
 C++, I'd be spending the next 2 days debugging before I could present
 the code for the world to see. D ranges and unittest blocks are t3h
 k00l.

I think this is awesome, and this + your previous post are sufficient to create 
a great article!

Aug 01 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Thu, Aug 01, 2013 at 10:49:00PM -0700, Walter Bright wrote:
 On 8/1/2013 10:24 PM, H. S. Teoh wrote:
Once this last bit worked, though, everything fell into place quickly.
After all unittests were passing, no more bugs were found!! The program
can print beautifully laid out calendars with no problems whatsoever.
I'm so in love with D right now... If I'd done this exercise in C or
C++, I'd be spending the next 2 days debugging before I could present
the code for the world to see. D ranges and unittest blocks are t3h
k00l.

 
 I think this is awesome, and this + your previous post are
 sufficient to create a great article!

OK, here's a draft of the article:

	http://wiki.dlang.org/User:Quickfur/Component_programming_with_ranges

It looks like I may have to sort out some issues with compiler bugs
before officially posting this article, though, since the code
apparently fails to compile with many versions of DMD. :-(


T

-- 
Живёшь только однажды.

Aug 02 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 8/2/2013 3:02 PM, H. S. Teoh wrote:
 OK, here's a draft of the article:

 	http://wiki.dlang.org/User:Quickfur/Component_programming_with_ranges

 It looks like I may have to sort out some issues with compiler bugs
 before officially posting this article, though, since the code
 apparently fails to compile with many versions of DMD. :-(

Get 'em up on bugzilla! (At least any that fail with HEAD.)

Aug 02 2013

Timon Gehr <timon.gehr gmx.ch> writes:

On 08/03/2013 12:02 AM, H. S. Teoh wrote:
 On Thu, Aug 01, 2013 at 10:49:00PM -0700, Walter Bright wrote:
 On 8/1/2013 10:24 PM, H. S. Teoh wrote:
 Once this last bit worked, though, everything fell into place quickly.
 After all unittests were passing, no more bugs were found!! The program
 can print beautifully laid out calendars with no problems whatsoever.
 I'm so in love with D right now... If I'd done this exercise in C or
 C++, I'd be spending the next 2 days debugging before I could present
 the code for the world to see. D ranges and unittest blocks are t3h
 k00l.

 I think this is awesome, and this + your previous post are
 sufficient to create a great article!

 OK, here's a draft of the article:

 	http://wiki.dlang.org/User:Quickfur/Component_programming_with_ranges

 It looks like I may have to sort out some issues with compiler bugs
 before officially posting this article, though, since the code
 apparently fails to compile with many versions of DMD. :-(


 T

Also, you may want to replace some of the manually implemented ranges 
where this makes sense.

Eg, datesInYear can be expressed more to the point as:


auto datesInYear(int year){
     return Date(year,1,1).recurrence!((a,n)=>a[n-1]+1.dur!"days")
         .until!(a=>a.year>year);
}



(This closes over year though. The following version uses only closed 
lambdas by embedding year in the returned range object:


auto datesInYear(int year){
     return Date(year,1,1)
         .recurrence!((a,n)=>a[n-1]+1.dur!"days")
         .zip(year.repeat)
         .until!(a=>a[0].year>a[1]).map!(a=>a[0]);
})
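
For anyone who wants to try the first version as-is, here is a self-contained sketch with the imports it appears to need. The unittest is my addition and only checks the day counts for one ordinary and one leap year:

```d
import std.algorithm : until;
import std.datetime : Date, dur;
import std.range : recurrence, walkLength;

// Lazily generates every date in the given year. The predicate passed
// to until closes over the year parameter.
auto datesInYear(int year)
{
    return Date(year, 1, 1)
        .recurrence!((a, n) => a[n - 1] + dur!"days"(1))
        .until!(a => a.year > year);
}

unittest
{
    assert(datesInYear(2013).front == Date(2013, 1, 1));
    assert(datesInYear(2013).walkLength == 365);
    assert(datesInYear(2012).walkLength == 366); // leap year
}
```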

Aug 02 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2013-08-02 23:27:20 +0000, Timon Gehr said:
 Also, you may want to replace some of the manually implemented ranges 
 where this makes sense.
 
 Eg, datesInYear can be expressed more to the point as:
 
 
 auto datesInYear(int year){
      return Date(year,1,1).recurrence!((a,n)=>a[n-1]+1.dur!"days")
          .until!(a=>a.year>year);
 }
 
 
 
 (This closes over year though. The following version uses only closed 
 lambdas by embedding year in the returned range object:
 
 
 auto datesInYear(int year){
      return Date(year,1,1)
          .recurrence!((a,n)=>a[n-1]+1.dur!"days")
          .zip(year.repeat)
          .until!(a=>a[0].year>a[1]).map!(a=>a[0]);
 })

Would be nice to have a couple of these both explicit and also 
implemented with the stdlib.

Andrei

Aug 02 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Aug 02, 2013 at 06:07:02PM -0700, Andrei Alexandrescu wrote:
 On 2013-08-02 23:27:20 +0000, Timon Gehr said:
Also, you may want to replace some of the manually implemented
ranges where this makes sense.

Eg, datesInYear can be expressed more to the point as:


auto datesInYear(int year){
     return Date(year,1,1).recurrence!((a,n)=>a[n-1]+1.dur!"days")
         .until!(a=>a.year>year);
}

(This closes over year though. The following version uses only
closed lambdas by embedding year in the returned range object:


auto datesInYear(int year){
     return Date(year,1,1)
         .recurrence!((a,n)=>a[n-1]+1.dur!"days")
         .zip(year.repeat)
         .until!(a=>a[0].year>a[1]).map!(a=>a[0]);
})


Thanks! I replaced the code with the first version above. I decided that
it's OK to close over year; it's a good example of the convenience of D
closures. And I also don't feel like explaining the functional
gymnastics of using zip and map just to avoid a closure. :)


 Would be nice to have a couple of these both explicit and also
 implemented with the stdlib.

[...]

I felt the article was approaching the long side, so I decided to just
use Timon's simplified code instead of the original explicit
implementation.

Or do you think it's better to have both, for comparison?


T

-- 
Ignorance is bliss... but only until you suffer the consequences!

Aug 03 2013

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday, August 03, 2013 01:27:20 Timon Gehr wrote:
 Also, you may want to replace some of the manually implemented ranges
 where this makes sense.
 
 Eg, datesInYear can be expressed more to the point as:
 
 
 auto datesInYear(int year){
      return Date(year,1,1).recurrence!((a,n)=>a[n-1]+1.dur!"days")
          .until!(a=>a.year>year);
 }

You could also use std.datetime.Interval and do something like

auto datesInYear(int year)
{
    auto interval = Interval!Date(Date(year, 1, 1), Date(year + 1, 1, 1));
    return interval.fwdRange((a){return a + dur!"days"(1);});
}

I do think that I need to revisit how ranges work with intervals in 
std.datetime, as they're a bit clunky.

- Jonathan M Davis

Aug 03 2013

"bearophile" <bearophileHUGS lycos.com> writes:

H. S. Teoh:

 OK, here's a draft of the article:

 	http://wiki.dlang.org/User:Quickfur/Component_programming_with_ranges


Most of the code below is not tested. So my suggestions may 
contain bugs or mistakes.


A bit improved chunkBy could go in Phobos.

---------------

For our purposes, though, we can't just do this in a loop, 
because it has to interface with the other components, which do 
not have a matching structure to a loop over dates.<


"yield" for coroutines is a very nice kind of glue.

---------------

auto datesInYear(int year) {
     return Date(year, 1, 1)
         .recurrence!((a,n) => a[n-1] + dur!"days"(1))
         .until!(a => a.year > year);
}


===>


auto datesInYear(in uint year) pure /*nothrow*/
in {
     assert(year > 1900);
} body {
     return Date(year, 1, 1)
            .recurrence!((a, n) => a[n - 1] + dur!"days"(1))
            .until!(d => d.year > year);
}


I suggest aligning the dots vertically like that. And generally 
_all_ variables/arguments that don't need to mutate should be 
const or immutable (or enum), unless holes in Phobos or in the 
type system or other factors prevent you from doing it.

---------------

return chunkBy!"a.month()"(dates);

===>

return dates.chunkBy!q{ a.month };
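
To make the idea concrete, here is a small sketch of grouping dates by month. Note the assumption: it uses the chunkBy that ships in std.algorithm in recent Phobos releases, not the hand-written chunkBy from the article, and the two may differ in detail. With a unary key function, the Phobos version yields (key, subrange) tuples:

```d
import std.algorithm : chunkBy, equal, map;
import std.datetime : Date;
import std.range : walkLength;

// Counts how many consecutive dates fall into each month.
auto monthSizes(Date[] dates)
{
    return dates
        .chunkBy!(d => d.month)               // (month, dates in month)
        .map!(chunk => chunk[1].walkLength);  // size of each group
}

unittest
{
    auto dates = [
        Date(2013, 1, 30), Date(2013, 1, 31),
        Date(2013, 2, 1), Date(2013, 2, 2), Date(2013, 2, 3),
    ];
    assert(monthSizes(dates).equal([2, 3]));
}
```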

---------------

byWeek() is not so simple. Most of its code is boilerplate code. 
Perhaps using a "yield" it becomes simpler.

---------------

string spaces(size_t n) {
     return repeat(' ').take(n).array.to!string;
}

===>

string spaces(in size_t n) pure nothrow {
     return std.array.replicate(" ", n);
}


Currently in programs that import both std.range and std.array 
you have to qualify the module for replicate.
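
A minimal, self-contained sketch of the replicate version follows. A selective import sidesteps the name clash; I have left off the pure nothrow attributes here only because I have not checked them against every compiler release:

```d
import std.array : replicate;

// Returns a string made of n space characters.
string spaces(size_t n)
{
    return replicate(" ", n);
}

unittest
{
    assert(spaces(0) == "");
    assert(spaces(3) == "   ");
}
```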

In Python this is just:

' ' * n

---------------

auto buf = appender!string();

Perhaps this suffices:

appender!string buf;

---------------

string[] days = map!((Date d) => " %2d".format(d.day))(r.front)
                 .array;

(not tested) ==>

const days = r.front.map!(d => " %2d".format(d.day)).array;

Or:

const string[] days = r
                       .front
                       .map!(d => " %2d".format(d.day))
                       .array;

---------------

If you put the days inside buf, do you really need to turn days 
into an array with array()?

string[] days = map!((Date d) => " %2d".format(d.day))(r.front)
                 .array;
assert(days.length <= 7 - startDay);
days.copy(buf);


Isn't this enough?

auto days = r.front.map!(d => " %2d".format(d.day));

---------------

If not already present this array should go in std.datetime or 
core.time:

     static immutable string[] monthNames = [
         "January", "February", "March", "April", "May", "June",
         "July", "August", "September", "October", "November",
         "December"
     ];

---------------

return to!string(spaces(before) ~ name ~ spaces(after));

==> (untested)

return text(before.spaces, name, after.spaces);

Or maybe even just (untested):

return before.spaces ~ name ~ after.spaces;

---------------

auto formatMonth(Range)(Range monthDays)
     if (isInputRange!Range && is(ElementType!Range == Date))
{
     assert(!monthDays.empty);
     assert(monthDays.front.day == 1);

     return chain(
         [ monthTitle(monthDays.front.month) ],
         monthDays.byWeek().formatWeek());
}

===> (untested)

auto formatMonth(R)(R monthDays)
     if (isInputRange!R && is(ElementType!R == Date))
in {
     assert(!monthDays.empty);
     assert(monthDays.front.day == 1);
} body {
     return [monthDays.front.month.monthTitle]
            .chain(monthDays.byWeek.formatWeek);
}


Generally I suggest using pre- and postconditions.
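
As a small illustration of the contract style, here is a sketch with a made-up function, firstOfMonth, which is not part of the calendar code. It assumes a compiler recent enough to accept do for the function body; older releases spell it body, as in the snippets above:

```d
import std.datetime : Date;
import std.range;

// Hypothetical helper: returns the first date of a month's range.
// The in-contract documents and checks what callers must guarantee.
auto firstOfMonth(R)(R monthDays)
    if (isInputRange!R && is(ElementType!R == Date))
in
{
    assert(!monthDays.empty);
    assert(monthDays.front.day == 1);
}
do
{
    return monthDays.front;
}

unittest
{
    auto august = [Date(2013, 8, 1), Date(2013, 8, 2)];
    assert(firstOfMonth(august) == Date(2013, 8, 1));
}
```

The template constraint rejects wrong types at compile time, while the contract catches wrong values at run time in non-release builds.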

---------------

return months.map!((month) => month.formatMonth());

===> (untested)

return months.map!formatMonth;

---------------

.map!((r) =>

===>

.map!(r =>

---------------

int year = to!int(args[1]);

===>

int year = args[1].to!int;

---------------

On Rosettacode there is a shorter calendar:
http://rosettacode.org/wiki/Calendar#D

If you want we can put, as second D entry, your calendar code 
(without unittests) in that page too.

Bye,
bearophile

Aug 03 2013

"Andre Artus" <andre.artus gmail.com> writes:

 Bearophile:
 If not already present this array should go in std.datetime or 
 core.time:

     static immutable string[] monthNames = [
         "January", "February", "March", "April", "May", "June",
         "July", "August", "September", "October", "November",
         "December"
     ];

It should probably be picked up from the OS, to support 
localization.

Aug 03 2013

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday, August 04, 2013 06:20:57 Andre Artus wrote:
 Bearophile:
 If not already present this array should go in std.datetime or
 core.time:

     static immutable string[] monthNames = [
         "January", "February", "March", "April", "May", "June",
         "July", "August", "September", "October", "November",
         "December"
     ];

 
 It should probably be picked up from the OS, to support
 localization.

std.datetime has something like that internally for some of what it does (in 
particular, toSimpleString, which I wouldn't even put in there now if I could 
go back), but we explicitly didn't make anything like that public, because 
it's English-specific.

- Jonathan M Davis

Aug 03 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-04 06:53, Jonathan M Davis wrote:

 std.datetime has something like that internally for some of what it does (in
 particular, toSimpleString, which I wouldn't even put in there now if I could
 go back), but we explicitly didn't make anything like that public, because
 it's English-specific.

The solution for that is to make it possible to plug in support for new 
languages and make the default English.

-- 
/Jacob Carlborg

Aug 12 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Sun, Aug 04, 2013 at 05:02:05AM +0200, bearophile wrote:
[...]
 A bit improved chunkBy could go in Phobos.

Yeah I'll look into that sometime. It'll definitely be a useful thing to
have, I think.


 ---------------
 
For our purposes, though, we can't just do this in a loop, because
it has to interface with the other components, which do not have a
matching structure to a loop over dates.<

 

 "yield" for coroutines is a very nice kind of glue.

Don't fibres already have yield()?


 ---------------
 
 auto datesInYear(int year) {
     return Date(year, 1, 1)
         .recurrence!((a,n) => a[n-1] + dur!"days"(1))
         .until!(a => a.year > year);
 }
 
 
 ===>
 
 
 auto datesInYear(in uint year) pure /*nothrow*/

I think "in uint" is redundant. Uints don't need to be const.

The Date ctor isn't nothrow, unfortunately.

Also, std.datetime docs explicitly state that years can be negative (in
which case they represent B.C.).


 in {
     assert(year > 1900);

Good idea, I should be using contracts. :)

But in this case, there's no good reason to restrict it to >1900,
because std.datetime can handle years past 0 AD.


 } body {
     return Date(year, 1, 1)
            .recurrence!((a, n) => a[n - 1] + dur!"days"(1))
            .until!(d => d.year > year);
 }
 
 
 I suggest aligning the dots vertically like that. And generally
 _all_ variables/arguments that don't need to mutate should be const
 or immutable (or enum), unless holes in Phobos or in the type system
 or other factors prevent you from doing it.

Hmm. OK, but I still think "in uint" is redundant.


 ---------------
 
 return chunkBy!"a.month()"(dates);
 
 ===>
 
 return dates.chunkBy!q{ a.month };

I don't really like using q{} here. But the UFCS thing probably looks
better, considering the rest of the code.


 ---------------
 
 byWeek() is not so simple. Most of its code is boilerplate code.
 Perhaps using a "yield" it becomes simpler.

I think Phobos needs better primitives for constructing chunked ranges.
:)  I was thinking if there's a common core that can be factored out of
chunkBy and byWeek, to make a generic range chunking function that can
handle both.


 ---------------
 
 string spaces(size_t n) {
     return repeat(' ').take(n).array.to!string;
 }
 
 ===>
 
 string spaces(in size_t n) pure nothrow {
     return std.array.replicate(" ", n);
 }

Thanks! I didn't know about std.array.replicate. :)


 Currently in programs that import both std.range and std.array you
 have to qualify the module for replicate.
 
 In Python this is just:
 
 ' ' * n

The closest I ever got to that in D was to define:

	string x(string s, size_t n) { ... }

so you could write " ".x(10) as a visual approximation to " "*10.  But
this is not a good idea for other reasons ("x" is too confusing a name
and likely to clash with other symbols).


 ---------------
 
 auto buf = appender!string();
 
 Perhaps this suffices:
 
 appender!string buf;

Actually, that doesn't compile.


 ---------------
 
 string[] days = map!((Date d) => " %2d".format(d.day))(r.front)
                 .array;
 
 (not tested) ==>
 
 const days = r.front.map!(d => " %2d".format(d.day)).array;

I wanted to make the return type explicit for the reader's benefit.


[...]
 If you put the days inside buf, do you really need to turn days into
 an array with array()?

The reason is actually so that I can assert on its length. Otherwise
you're right, I could just put it directly into buf.


[...]
 ---------------
 
 If not already present this array should go in std.datetime or
 core.time:
 
     static immutable string[] monthNames = [
         "January", "February", "March", "April", "May", "June",
         "July", "August", "September", "October", "November",
         "December"
     ];

Well, once we have a proper i18n module...


 ---------------
 
 return to!string(spaces(before) ~ name ~ spaces(after));
 
 ==> (untested)
 
 return text(before.spaces, name, after.spaces);
 
 Or maybe even just (untested):
 
 return before.spaces ~ name ~ after.spaces;

You're right, there's no need to call to!string. I think that was left
over from older code when some of the pieces were char[].


 ---------------
 
 auto formatMonth(Range)(Range monthDays)
     if (isInputRange!Range && is(ElementType!Range == Date))
 {
     assert(!monthDays.empty);
     assert(monthDays.front.day == 1);
 
     return chain(
         [ monthTitle(monthDays.front.month) ],
         monthDays.byWeek().formatWeek());
 }
 
 ===> (untested)
 
 auto formatMonth(R)(R monthDays)
     if (isInputRange!R && is(ElementType!R == Date))
 in {
     assert(!monthDays.empty);
     assert(monthDays.front.day == 1);
 } body {
     return [monthDays.front.month.monthTitle]
            .chain(monthDays.byWeek.formatWeek);

I don't like overusing UFCS when it makes the code harder to read.


 }
 
 
 Generally I suggest using pre- and postconditions.

Good idea. We should use contracts more in D... in part so that there's
enough code out there to pressure people into fixing DbC-related issues.
;-)


 ---------------
 
 return months.map!((month) => month.formatMonth());
 
 ===> (untested)
 
 return months.map!formatMonth;

Good idea!


 ---------------
 
 .map!((r) =>
 
 ===>
 
 .map!(r =>

Heh, I didn't know you could omit the parens there!


 ---------------
 
 int year = to!int(args[1]);
 
 ===>
 
 int year = args[1].to!int;

I'm on the fence about this one. I still think to!int(args[1]) reads
better. But your version is not bad, either.


 ---------------
 
 On Rosettacode there is a shorter calendar:
 http://rosettacode.org/wiki/Calendar#D
 
 If you want we can put, as second D entry, your calendar code
 (without unittests) in that page too.

[...]

Well, I didn't write my version to be short. :)  The goal was to use it
as an example of the kind of component-style programming Walter was
talking about.

As for putting it up on Rosettacode, sure, go ahead. :) Just make sure
you test the result first, though. Some of your suggestions actually
don't compile.

I'm going to incorporate some of your suggestions in the code and update
the article.  Thanks for the feedback!


T

-- 
It is widely believed that reinventing the wheel is a waste of time; but
I disagree: without wheel reinventers, we would still be stuck with
wooden horse-cart wheels.

Aug 03 2013

"bearophile" <bearophileHUGS lycos.com> writes:

H. S. Teoh:

 It looks like I may have to sort out some issues with compiler 
 bugs
 before officially posting this article, though, since the code
 apparently fails to compile with many versions of DMD. :-(

If not already present, I suggest you put a reduced version of 
the problems you have found (with map or something else) in 
Bugzilla. (Otherwise I'll try to do it later).

Bye,
bearophile

Aug 06 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Tue, Aug 06, 2013 at 06:20:23PM +0200, bearophile wrote:
 H. S. Teoh:
 
It looks like I may have to sort out some issues with compiler bugs
before officially posting this article, though, since the code
apparently fails to compile with many versions of DMD. :-(

 
 If not already present, I suggest you put a reduced version of
 the problems you have found (with map or something else) in
 Bugzilla. (Otherwise I'll try to do it later).

[...]

Actually, the problem has been fixed in git HEAD, it's just that 2.063
doesn't have the fix. The basic cause is that std.range.chunks in 2.063
and earlier requires slicing, but I need to use it with a forward range
that doesn't have slicing. This limitation has been removed in git HEAD
(upcoming 2.064).

I've already solved the problem in the code by using "static if
(__VERSION__ < 2064L)" and providing a bare-bones replacement of
std.range.chunks that does support forward ranges.
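
For the curious, the workaround looks roughly like the sketch below. This is not the code from dcal.d: forwardChunks and chunksCompat are names invented for this example, and the replacement is deliberately minimal. The point is only the version gate plus a chunker that needs nothing beyond save, take and popFrontN:

```d
import std.algorithm : map;
import std.array : array;
import std.range;

// Hypothetical bare-bones chunker: works on any forward range and
// does not require slicing or length.
auto forwardChunks(R)(R source, size_t size)
    if (isForwardRange!R)
{
    static struct Result
    {
        R source;
        size_t size;

        @property bool empty() { return source.empty; }
        @property auto front() { return source.save.take(size); }
        void popFront() { source.popFrontN(size); }
    }
    return Result(source, size);
}

// Pick the hand-rolled version only on compilers older than 2.064.
static if (__VERSION__ < 2064L)
    alias chunksCompat = forwardChunks;
else
    alias chunksCompat = std.range.chunks;

unittest
{
    auto expected = [[0, 1, 2], [3, 4, 5], [6]];
    assert(iota(7).forwardChunks(3).map!(c => c.array).array == expected);
    assert(iota(7).chunksCompat(3).map!(c => c.array).array == expected);
}
```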


T

-- 
Customer support: the art of getting your clients to pay for your own
incompetence.

Aug 06 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Aug 02, 2013 at 03:02:24PM -0700, H. S. Teoh wrote:
 On Thu, Aug 01, 2013 at 10:49:00PM -0700, Walter Bright wrote:

[...]
 I think this is awesome, and this + your previous post are
 sufficient to create a great article!

 
 OK, here's a draft of the article:
 
 	http://wiki.dlang.org/User:Quickfur/Component_programming_with_ranges

[...]

Alright, the article is done:

	http://wiki.dlang.org/Component_programming_with_ranges

I haven't linked it to the articles page yet, though. It'd be better if
somebody else vetted it before it is included there, I think. :)


T

-- 
"Holy war is an oxymoron." -- Lazarus Long

Aug 03 2013

Justin Whear <justin economicmodeling.com> writes:

On Thu, 01 Aug 2013 22:24:32 -0700, H. S. Teoh wrote:
 Now, w.r.t. the roadblocks I alluded to.
 
 When I first started working on the code, my goal was to maximize usage
 of existing Phobos facilities in order to show how many batteries D
 already comes with. As it turned out, I could only use basic Phobos
 components; some of the more complex pieces like frontTransversal, which
 would've been perfect for the bit that splices formatted month lines
 together, couldn't be used because it wasn't flexible enough to handle
 the insertion of fillers when some subranges are empty. In the end, I
 had to code that range by hand, and I can't say I'm that happy with it
 yet.

I recently wrote a range component for my current project that is similar 
but with a twist.  It takes a bunch of ranges, each of which is assumed 
to be sorted with some predicate, then it walks through them, returning a 
range of the fronts of each range. The twist is that it has to call a 
user-supplied `produce` function whenever it encounters a mismatch (e.g. 
a range's front is greater than the others or a range is empty).

Aug 02 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Aug 02, 2013 at 04:06:46PM +0000, Justin Whear wrote:
 On Thu, 01 Aug 2013 22:24:32 -0700, H. S. Teoh wrote:
 Now, w.r.t. the roadblocks I alluded to.
 
 When I first started working on the code, my goal was to maximize
 usage of existing Phobos facilities in order to show how many
 batteries D already comes with. As it turned out, I could only use
 basic Phobos components; some of the more complex pieces like
 frontTransversal, which would've been perfect for the bit that
 splices formatted month lines together, couldn't be used because it
 wasn't flexible enough to handle the insertion of fillers when some
 subranges are empty. In the end, I had to code that range by hand,
 and I can't say I'm that happy with it yet.

 
 I recently wrote a range component for my current project that is
 similar but with a twist.  It takes a bunch of ranges, each of which
 is assumed to be sorted with some predicate, then it walks through
 them, returning a range of the fronts of each range. The twist is that
 it has to call a user-supplied `produce` function whenever it
 encounters a mismatch (e.g.  a range's front is greater than the
 others or a range is empty).

It would be nice to collect these custom ranges and see if there's some
common functionality that can be added to Phobos.


T

-- 
The trouble with TCP jokes is that it's like hearing the same joke over and
over.

Aug 02 2013

"bearophile" <bearophileHUGS lycos.com> writes:

H. S. Teoh:

 It would be nice to collect these custom ranges and see if 
 there's some common functionality that can be added to Phobos.

chunkBy seems OK for Phobos.

Bye,
bearophile

Aug 02 2013

Timon Gehr <timon.gehr gmx.ch> writes:

On 08/02/2013 07:24 AM, H. S. Teoh wrote:
 ...
 Anyway. Enough hand-waving in the air. Let the actual code speak for
 itself:

 	https://github.com/quickfur/dcal/blob/master/dcal.d
 ...

Which version of the compiler are you using?

I get the dreaded forward reference errors with at least DMD 2.060, DMD 
2.063 and DMD 2.063.2 and the 2.x build on dpaste.

Git head gives me:

Error: undefined identifier '_xopCmp'
dmd: clone.c:690: FuncDeclaration* 
StructDeclaration::buildXopCmp(Scope*): Assertion `s' failed.
Aborted (core dumped)

Aug 02 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Aug 02, 2013 at 08:49:30PM +0200, Timon Gehr wrote:
 On 08/02/2013 07:24 AM, H. S. Teoh wrote:
...
Anyway. Enough hand-waving in the air. Let the actual code speak for
itself:

	https://github.com/quickfur/dcal/blob/master/dcal.d
...

 
 Which version of the compiler are you using?

I'm using git HEAD.


 I get the dreaded forward reference errors with at least DMD 2.060,
 DMD 2.063 and DMD 2.063.2 and the 2.x build on dpaste.

Can you send me the error messages? I'll see if I can reorder the code
to fix them.


 Git head gives me:
 
 Error: undefined identifier '_xopCmp'
 dmd: clone.c:690: FuncDeclaration*
 StructDeclaration::buildXopCmp(Scope*): Assertion `s' failed.
 Aborted (core dumped)

That's new. It was working as of yesterday; must've been a new
regression in the commits since then?


T

-- 
The richest man is not he who has the most, but he who needs the least.

Aug 02 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Aug 02, 2013 at 03:00:01PM -0700, H. S. Teoh wrote:
 On Fri, Aug 02, 2013 at 08:49:30PM +0200, Timon Gehr wrote:

[...]
 I get the dreaded forward reference errors with at least DMD 2.060,
 DMD 2.063 and DMD 2.063.2 and the 2.x build on dpaste.

 
 Can you send me the error messages? I'll see if I can reorder the code
 to fix them.

I just checked DMD 2.063, and it appears that the error is caused by a
limitation in std.range.chunks in 2.063, where it requires slicing and
length, but formatYear can only give it an input range. This is kinda
sad, since that means I'll have to implement chunks myself on 2.063. :-/

I've no idea why it seems to be somehow conflated with an error from
invoking std.conv.to!() to convert from string to int; apparently some
kind of compiler bug that obscures the real problem with
std.range.chunks.


 Git head gives me:
 
 Error: undefined identifier '_xopCmp'
 dmd: clone.c:690: FuncDeclaration*
 StructDeclaration::buildXopCmp(Scope*): Assertion `s' failed.
 Aborted (core dumped)

 
 That's new. It was working as of yesterday; must've been a new
 regression in the commits since then?

[...]

Actually, I just pulled git HEAD again, and it's still working fine.
Maybe you just need to update your repo?


T

-- 
"Real programmers can write assembly code in any language. :-)" -- Larry Wall

Aug 02 2013

Timon Gehr <timon.gehr gmx.ch> writes:

On 08/03/2013 01:05 AM, H. S. Teoh wrote:
 Actually, I just pulled git HEAD again, and it's still working fine.
 Maybe you just need to update your repo?
 ...

I think it pulled in the wrong version of druntime.

Aug 02 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Sat, Aug 03, 2013 at 04:12:58AM +0200, Timon Gehr wrote:
 On 08/03/2013 01:05 AM, H. S. Teoh wrote:
Actually, I just pulled git HEAD again, and it's still working fine.
Maybe you just need to update your repo?
...

 
 I think it pulled in the wrong version of druntime.

OK, I've written a simple replacement for 2.063 std.range.chunks inside
a static if (__VERSION__ <= 2063) block, so you should be able to
compile it now. The code has been pushed to github.
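
For readers stuck on an older compiler, the general shape of such a fallback can be sketched as follows. This is only an illustration, not the code that was pushed to the repository: `SimpleChunks` and `simpleChunks` are hypothetical names, and each chunk is eagerly copied into an array for simplicity. In a real program the definition would sit inside the `static if (__VERSION__ <= 2063)` block mentioned above; it is left unguarded here so the sketch compiles on any compiler.

```d
import std.range : ElementType, isInputRange;

// Hypothetical fallback: split any input range into arrays of at most
// `size` elements. It needs neither slicing nor length, only the basic
// input range primitives (empty, front, popFront).
struct SimpleChunks(R) if (isInputRange!R)
{
    private R source;
    private size_t size;
    private ElementType!R[] current;
    private bool done;

    this(R source, size_t size)
    {
        assert(size > 0, "chunk size must be positive");
        this.source = source;
        this.size = size;
        popFront(); // prime the first chunk
    }

    @property bool empty() { return done; }
    @property ElementType!R[] front() { return current; }

    void popFront()
    {
        current = null;
        while (current.length < size && !source.empty)
        {
            current ~= source.front;
            source.popFront();
        }
        done = current.length == 0;
    }
}

auto simpleChunks(R)(R source, size_t size) if (isInputRange!R)
{
    return SimpleChunks!R(source, size);
}

void main()
{
    import std.algorithm : equal, filter;
    import std.range : iota;

    // filter yields a range that has neither length nor slicing.
    auto odds = iota(1, 12).filter!(n => n % 2 == 1); // 1 3 5 7 9 11
    assert(equal(simpleChunks(odds, 4), [[1, 3, 5, 7], [9, 11]]));
}
```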


T

-- 
It is of the new things that men tire --- of fashions and proposals and
improvements and change. It is the old things that startle and intoxicate. It
is the old things that are young. -- G.K. Chesterton

Aug 03 2013

"Dejan Lekic" <dejan.lekic gmail.com> writes:

On Friday, 2 August 2013 at 05:26:05 UTC, H. S. Teoh wrote:
 On Thu, Aug 01, 2013 at 10:34:24AM -0700, Walter Bright wrote:
 On 8/1/2013 2:23 AM, John Colvin wrote:
On Thursday, 1 August 2013 at 00:47:43 UTC, H. S. Teoh wrote:
Add in some code examples and that could make a nice article.

 
 Yes, please!

 Alright, so I decided to prove my point about component 
 programming by
 actually writing a fully-functional version of the calendar 
 layout
 program, so that I have a solid piece of evidence that component
 programming lives up to its promise. :) In addition, I decided 
 that for
 maximum reusability, I want the output lines available in an 
 input
 range, with absolutely no binding to writeln whatsoever (except 
 in
 main() where the range is handed to writeln for output). In 
 retrospect,
 that was perhaps a bit too ambitious... I ran into a few 
 roadblocks to
 actually get the code working, so it took me a lot longer than I
 anticipated to finish the code.

 However, I *will* say that I'm very proud of the code: already 
 there are
 a few pieces that, if properly cleaned up and refined, probably 
 deserve
 inclusion in Phobos. Reusability FTW!! Now, just tell me if 
 you've ever
 seen a calendar layout program made of straightforward, 
 reusable pieces.
 I for sure haven't. I tried looking at the C code for the Unix 
 cal
 program once... It looked frighteningly similar to an IOCCC 
 entry. :-/

 My D version, however, built using ranges through and through, 
 has many
 pieces that are easily reusable. For example, if you wanted to 
 output
 only a single month instead, you could just call join("\n") on 
 the range
 of formatted month lines that the full year layout algorithm 
 uses to
 splice lines from multiple months together -- it's *that* 
 reusable.

 Anyway. Enough hand-waving in the air. Let the actual code 
 speak for
 itself:

 	https://github.com/quickfur/dcal/blob/master/dcal.d

 Now, w.r.t. the roadblocks I alluded to.

 When I first started working on the code, my goal was to 
 maximize usage
 of existing Phobos facilities in order to show how many 
 batteries D
 already comes with. As it turned out, I could only use basic 
 Phobos
 components; some of the more complex pieces like 
 frontTransversal, which
 would've been perfect for the bit that splices formatted month 
 lines
 together, couldn't be used because it wasn't flexible enough to 
 handle
 the insertion of fillers when some subranges are empty. In the 
 end, I
 had to code that range by hand, and I can't say I'm that happy 
 with it
 yet. But at least, it's nothing compared to the hairy 
 complexity of the
 C version of cal.

 Another place where I wanted to use existing Phobos components 
 was
 chunkBy. There's probably a way to do it if you think hard 
 enough about
 it, but in the end I felt it was simpler to just write the code 
 myself.
 Might be a failure on my part to recognize how to put existing 
 Phobos
 ranges in a clever enough way to achieve what I wanted. I did 
 try to do
 something similar to byWeek(), but somehow it didn't do what I 
 wanted
 and I decided to just code it by hand instead of investigating 
 further.

 By far the biggest roadblock I ran into was that after I wrote
 everything up to (and including) pasteBlocks, my unittests 
 refused to
 work. Somehow, pasteBlocks kept repeating the first line of the 
 output
 (the month names, if you look at the unittest) and refused to 
 advance
 farther.  Eventually I traced the problem to Lines.popFront(), 
 which
 pops each subrange off the range of ranges. The problem is that 
 this
 only works on some ranges, but not others; if you pass the 
 output of
 formatMonths() straight to pasteBlocks(), it will NOT work. 
 Why? Because
 formatMonths returns a std.algorithm.Map object, which recreates 
 the
 subrange each time, so Lines.popFront() is only popping a 
 temporary copy
 of the subrange, not the real thing. I was about to give up and 
 try
 another approach, when out of the blue I decided to try and see 
 if I
 could stuff the range returned by formatMonths() into an array, 
 and then
 pass *that* to pasteBlocks() -- and behold, it worked!!

 This was a totally unexpected fix, that a newbie probably would 
 never
 have thought of, so this is a potential trap for newcomers to D 
 who
 expect components to just be pluggable. In retrospect, it makes 
 sense --
 you need to somehow buffer the ranges of formatted month lines
 *somewhere* in order to be able to splice them together out of 
 their
 natural depth-first outer/inner range order. But this is not 
 obvious at
 all from first glance; perhaps it's a sign of a leaky 
 abstraction
 somewhere. We should probably look into why this is happening 
 and how to
 fix it. And there should be a way to test for this in 
 pasteBlocks'
 signature constraint so that future code won't fall into the 
 same trap,
 but I can't think of one right now.

 Once this last bit worked, though, everything fell into place 
 quickly.
 After all unittests were passing, no more bugs were found!! The 
 program
 can print beautifully laid out calendars with no problems 
 whatsoever.
 I'm so in love with D right now... If I'd done this exercise in 
 C or
 C++, I'd be spending the next 2 days debugging before I could 
 present
 the code for the world to see. D ranges and unittest blocks are 
 t3h
 k00l.


 T

Good work! I read the article yesterday. Very educational!
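
The Map pitfall described in the quoted post is easy to reproduce in isolation. The sketch below does not use the dcal code; the function names and the toy data are made up for illustration. It only shows the mechanism: a range of ranges produced by `map` hands out a fresh inner range on every call to `front`, so popping that inner range has no lasting effect until the outer range is buffered in an array.

```d
import std.algorithm : map;
import std.array : array;
import std.range : iota;

// Each access to front() of a map result calls the lambda again and
// returns a brand-new inner range.
auto makeRows()
{
    return [10, 20].map!(start => iota(start, start + 3));
}

int frontAfterPopLazy()
{
    auto lazyRows = makeRows();
    lazyRows.front.popFront();   // pops a temporary copy only
    return lazyRows.front.front; // the "real" row never advanced
}

int frontAfterPopBuffered()
{
    // Buffering gives every inner range a stable home in memory.
    auto rows = makeRows().array;
    rows[0].popFront();          // modifies the stored element in place
    return rows[0].front;
}

void main()
{
    assert(frontAfterPopLazy() == 10);
    assert(frontAfterPopBuffered() == 11);
}
```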

Aug 05 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Mon, Aug 05, 2013 at 02:11:32PM +0200, Dejan Lekic wrote:
[...]
 Good work! I read the article yesterday. Very educational!

Thanks!

I did actually make a major revision to the article last night based on
some feedback I got; I rewrote much of the first part of it, so if
you're interested you might want to re-read it.


T

-- 
Caffeine underflow. Brain dumped.

Aug 05 2013

"Dejan Lekic" <dejan.lekic gmail.com> writes:

On Thursday, 1 August 2013 at 00:47:43 UTC, H. S. Teoh wrote:
 On Wed, Jul 31, 2013 at 11:52:35PM +0000, Justin Whear wrote:
 On Thu, 01 Aug 2013 00:23:52 +0200, bearophile wrote:
 
 The situation should be improved for D/dmd/Phobos, otherwise 
 such D
 component programming remains partially a dream, or a toy.
 
 Bye,
 bearophile

 
 I disagree with your "toy" assessment.  I've been using this 
 chaining,
 component style for a while now and have really enjoyed the 
 clarity
 it's brought to my code.  I hadn't realized how bug-prone 
 non-trivial
 loops tend to be until I started writing this way and avoided 
 them
 entirely.

 [...]

 One of the more influential courses I took in college was on 
 Jackson
 Structured Programming. It identified two sources of programming
 complexity (i.e., where bugs are most likely to occur): (1) 
 mismatches
 between the structure of the program and the structure of the 
 data
 (e.g., you're reading an input file that has a preamble, body, 
 and
 epilogue, but your code has a single loop over lines in the 
 file); (2)
 writing loop invariants (or equivalently, loop conditions).

 Most non-trivial loops in imperative code have both, which 
 makes them
 doubly prone to bugs. In the example I gave above, the mismatch 
 between
 the code structure (a single loop) and the file structure (three
 sequential sections) often prompts people to add boolean flags, 
 state
 variables, and the like, in order to resolve the conflict 
 between the
 two structures. Such ad hoc structure resolutions are a 
 breeding ground
 for bugs, and often lead to complicated loop conditions, which 
 invite
 even more bugs.

 In contrast, if you structure your code according to the 
 structure of
 the input (i.e., one loop for processing the preamble, one loop 
 for
 processing the body, one loop for processing the epilogue), it 
 becomes
 considerably less complex, easier to read (and write!), and far 
 less bug
 prone. Your loop conditions become simpler, and thus easier to 
 reason
 about and leave less room for bugs to hide.

 But to be able to process the input in this way requires that 
 you
 encapsulate your input so that it can be processed by 3 
 different loops.
 Once you go down that road, you start to arrive at the concept 
 of input
 ranges... then you abstract away the three loops into three 
 components,
 and behold, component style programming!

 In fact, with component style programming, you can also address 
 another
 aspect of (1): when you need to simultaneously process two data
 structures whose structures don't match. For example, if you 
 want to lay
 out a yearly calendar using writeln, the month/day cells must 
 be output
 in a radically different order than the logical 
 foreach(m;1..13) { foreach(day;1..32) } structure. Writing this code in the 
 traditional
 imperative style produces a mass of spaghetti code: either you 
 have
 bizarre loops with convoluted loop conditions for generating 
 the dates
 in the order you want to print them, or you have to fill out 
 some kind
 of grid structure in a complicated order so that you can 
 generate the
 dates in order.

 Using ranges, though, this becomes considerably more tractable: 
 you can
 have an input range of dates in chronological order, two output 
 ranges
 corresponding to chunking by week / month, which feed into a 
 third
 output range that buffers the generated cells and prints them 
 once
 enough has been generated to fill a row of output. By 
 separating out
 these non-corresponding structures into separate components, 
 you greatly
 simplify the code within each component and thus reduce the 
 number of
 bugs (e.g. it's far easier to ensure you never put more than 7 
 days in a
 week, since the weekly output range is all in one place, as 
 opposed to
 sprinkled everywhere across multiple nested loops in the 
 imperative
 style calendar code). The code that glues these components 
 together is
 also separated out and becomes easier to understand and debug: 
 you
 simply read from the input range of dates, write to the two 
 output
 ranges, and check if they are full (this isn't part of the 
 range API but
 added for this particular example); if the weekly range is 
 full,
 start a new week; if the monthly range is full, start a new 
 month. Then
 the final output range takes care of when to actually produce 
 output --
 you just write stuff to it and don't worry about it in the glue 
 code.

 OK, this isn't really a good example of the linear pipeline 
 style code
 we're talking about, but it does show how using ranges as 
 components can
 untangle very complicated code into simple, tractable parts 
 that are
 readable and easy to debug.


 T

This post deserves to become an article somewhere. D Wiki, some 
blog, whatever. All to the point. Respect.
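
The one-loop-per-section idea from the quoted post can be sketched in a few lines. Everything concrete here is an assumption made up for illustration: the input format (preamble lines start with '#', the body ends at a line reading "END") and the names `Parsed` and `parseSections` do not come from the post. The point is only that each loop has a trivial condition because it mirrors one section of the input.

```d
import std.algorithm : startsWith;
import std.array : empty, front, popFront;

struct Parsed
{
    string[] preamble;
    string[] body_;
    string[] epilogue;
}

// Hypothetical parser: one loop per section of the input, all three
// loops consuming the same range in sequence.
Parsed parseSections(string[] lines)
{
    Parsed result;

    // Loop 1: the preamble is every leading line that starts with '#'.
    for (; !lines.empty && lines.front.startsWith("#"); lines.popFront())
        result.preamble ~= lines.front;

    // Loop 2: the body runs up to the "END" marker.
    for (; !lines.empty && lines.front != "END"; lines.popFront())
        result.body_ ~= lines.front;

    if (!lines.empty)
        lines.popFront(); // skip the marker itself

    // Loop 3: whatever remains is the epilogue.
    for (; !lines.empty; lines.popFront())
        result.epilogue ~= lines.front;

    return result;
}

void main()
{
    auto parsed = parseSections(["# title", "# author", "a", "b", "END", "bye"]);
    assert(parsed.preamble == ["# title", "# author"]);
    assert(parsed.body_ == ["a", "b"]);
    assert(parsed.epilogue == ["bye"]);
}
```

No boolean flags or state variables are needed, which is the contrast with the single-loop version the post describes.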

Aug 01 2013

"Dejan Lekic" <dejan.lekic gmail.com> writes:

On Wednesday, 31 July 2013 at 22:23:54 UTC, bearophile wrote:
 Justin Whear:

 If anything, component programming is just functional 
 programming + templates and some nice syntactic sugar.
 And a healthy dose of pure awesome.

 What D calls "component programming" is very nice and good, but 
 in D it's almost a joke.

 Currently this code inlines nothing (the allocations, the 
 difference and the product):


 import std.numeric: dotProduct;
 int main() {
     enum N = 50;
     auto a = new int[N];
     auto b = new int[N];
     auto c = new int[N];
     c[] = a[] - b[];
     int result = dotProduct(c, c);
     return result;
 }


 If you write it in component-style (using doubles here):


 import std.math;
 import std.algorithm, std.range;

 int main() {
     enum N = 50;
     alias T = double;
     auto a = new T[N];
     auto b = new T[N];

     return cast(int)zip(a, b)
            .map!(p => (p[0] - p[1]) ^^ 2)
            .reduce!q{a + b};
 }


 The situation gets much worse: you see many functions in the 
 binary that even LDC2 is often not able to inline. The GHC 
 Haskell compiler turns similar "components" code into efficient 
 SIMD asm (that uses packed doubles, like double2), it inlines 
 everything, merges the loops, produces a small amount of asm 
 output, and there is no "c" intermediate array. In GHC 
 "component programming" is mature (and Intel is developing a 
 Haskell compiler that is even more optimizing), while in 
 D/dmd/Phobos this stuff has just started. GHC has twenty+ years 
 of head start on this and it shows.

 The situation should be improved for D/dmd/Phobos, otherwise 
 such D component programming remains partially a dream, or a 
 toy.

 Bye,
 bearophile

I was honestly wondering whether I should reply to this rant or 
not... Obviously I chose to reply. Component programming, as 
you probably know already, is not about making 
super-fast, super-optimized applications, but about making it 
easy both to write the code and to understand the code, as well 
as making it easy to combine components (algorithms mostly) and 
get the result quickly, where by "quickly" I mean the time I 
need to write the code.

If you really want a super-optimized solution you will in most 
cases write the piece in question in C. Well, that is at least 
what my experience tells me. Luckily, I do business applications 
most of the time, so performance is rarely an issue. CONVENIENCE 
is! In other words, I shamelessly admit, I only care about the 
time I have to spend coding in order to implement something that 
is of value to the business.

Aug 01 2013

"Brad Anderson" <eco gnuk.net> writes:

On Wednesday, 31 July 2013 at 22:23:54 UTC, bearophile wrote:
 <snip>
 Currently this code inlines nothing (the allocations, the 
 difference and the product):

 <snip>

 If you write it in component-style (using doubles here):

 <snip>

Resident compiler guys,

How difficult would it be to make sure stuff like this gets 
inlined and optimized more thoroughly?  I'm very ignorant of 
compiler internals but it's kind of disheartening that LDC can't 
inline them well despite being a fairly good optimizing compiler. 
  Is this a frontend issue or a backend issue?

Aug 01 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 8/1/2013 2:35 PM, Brad Anderson wrote:
 How difficult would it be to make sure stuff like this gets inlined and
 optimized more thoroughly?  I'm very ignorant of compiler internals but it's
 kind of disheartening that LDC can't inline them well despite being a fairly
 good optimizing compiler.  Is this a frontend issue or a backend issue?

I don't know.

But consider that optimizers are built to optimize typical code patterns. 
Component programming is fairly non-existent in C and C++, and is new in D. 
Hence, optimizers are not set up to deal with those patterns (yet).

Aug 01 2013

"bearophile" <bearophileHUGS lycos.com> writes:

Walter Bright:

 But consider that optimizers are built to optimize typical code 
 patterns. Component programming is fairly non-existent in C and 
 C++, and is new in D. Hence, optimizers are not set up to deal 
 with those patterns (yet).

I agree.

GHC also works with an LLVM back-end, so those optimizations are 
done in some kind of middle-end.

Probably a silly idea: perhaps we can collect some money, like 
1000-2000 dollars, to pay for a 3 day long course for Walter 
(total about 15 hours) about such matters.

Bye,
bearophile

Aug 01 2013

"Andre Artus" <andre.artus gmail.com> writes:

On Thursday, 1 August 2013 at 22:45:10 UTC, bearophile wrote:
 Walter Bright:

 But consider that optimizers are built to optimize typical 
 code patterns. Component programming is fairly non-existent in 
 C and C++, and is new in D. Hence, optimizers are not set up 
 to deal with those patterns (yet).

 I agree.

 GHC also works with a LLVM back-end, so those optimizations are 
 done in some kind of middle-end.

 Probably a silly idea: perhaps we can collect some money, like 
 1000-2000 dollars, to pay for a 3 day long course for Walter 
 (total about 15 hours) about such matters.

Who's giving the course and where will it be held?

'Modern Compiler Design' by D. Grune, et al. and 'Compiler Design - 
Analysis and Transformation' by H. Seidl, et al. discuss some 
basic optimizations for functional programs. But I'm pretty sure 
Walter is already familiar with these.

Taking an example from Ali's book "Programming in D":

import std.stdio;
import std.algorithm;
void main()
{
   auto values = [ 1, 2, 3, 4, 5 ];
   writeln(values
     .map!(a => a * 10)
     .map!(a => a / 3)
     .filter!(a => !(a % 2)));
}

As stated this implies 3 separate traversals of the list (or 
array to be specific) which is what a naïve implementation would 
do. But all three operations can run on the same traversal. Given 
that the two map operations are element-wise (1-1 mappings that 
preserve order and cardinality) and the filter decides on each 
element independently, they are also inherently parallelizable 
(i.e. amenable to auto vectorization or loop unrolling).

Aug 03 2013

"David Nadlinger" <code klickverbot.at> writes:

On Saturday, 3 August 2013 at 13:35:56 UTC, Andre Artus wrote:
 import std.stdio;
 import std.algorithm;
 void main()
 {
   auto values = [ 1, 2, 3, 4, 5 ];
   writeln(values
     .map!(a => a * 10)
     .map!(a => a / 3)
     .filter!(a => !(a % 2)));
 }

 As stated this implies 3 separate traversals of the list (or 
 array to be specific) which is what a naïve implementation 
 would do.

In this example, no, as all involved ranges are evaluated lazily. 
(I see your general point, though.)

David

Aug 03 2013

"Andre Artus" <andre.artus gmail.com> writes:

On Saturday, 3 August 2013 at 13:46:38 UTC, David Nadlinger wrote:
 On Saturday, 3 August 2013 at 13:35:56 UTC, Andre Artus wrote:
 import std.stdio;
 import std.algorithm;
 void main()
 {
  auto values = [ 1, 2, 3, 4, 5 ];
  writeln(values
    .map!(a => a * 10)
    .map!(a => a / 3)
    .filter!(a => !(a % 2)));
 }

 As stated this implies 3 separate traversals of the list (or 
 array to be specific) which is what a naïve implementation 
 would do.

 In this example, no, as all involved ranges are evaluated 
 lazily. (I see your general point, though.)

 David

I probably could have worded it better: I did not intend to imply 
that D follows the naïve implementation suggested. What I meant 
is that, on the face of it, given typical textbook 
implementations of map and filter you would be iterating 3 
separate times. To be clear I don't know of any serious 
implementations of these algorithms that do not address this in 
some way.
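
Laziness in Phobos is one way of addressing it, and it can be checked with a small experiment. The sketch below wraps the same three stages in a helper (the name `pipeline` is made up here) and feeds it a source far too long to traverse three times in full. Because each element flows through all stages before the next one is touched, asking for two results only looks at the first few inputs.

```d
import std.algorithm : equal, filter, map;
import std.range : iota, take;

// The three stages from the example above, applicable to any input range.
auto pipeline(R)(R values)
{
    return values
        .map!(a => a * 10)
        .map!(a => a / 3)
        .filter!(a => a % 2 == 0);
}

void main()
{
    // Same data as the example: 10/3=3, 20/3=6, 30/3=10, 40/3=13, 50/3=16.
    assert(equal(pipeline([1, 2, 3, 4, 5]), [6, 10, 16]));

    // A huge source: only the leading elements are ever visited,
    // because nothing is computed until a result is requested.
    assert(equal(pipeline(iota(1, int.max)).take(2), [6, 10]));
}
```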

Aug 03 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 8/3/2013 6:46 AM, David Nadlinger wrote:
 In this example, no, as all involved ranges are evaluated lazily. (I see your
 general point, though.)

The rules for ranges do not specify if they are done eagerly, lazily, or in 
parallel. Meaning, of course, that a library writer could provide all three 
forms and the user could select which one he wanted.
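
As a rough illustration of that choice, Phobos already lets the caller pick among the three for a simple element-wise operation. The helper `tenfold` is made up for this sketch; `map`, `array` and `taskPool.amap` are the existing library pieces.

```d
import std.algorithm : equal, map;
import std.array : array;
import std.parallelism : taskPool;

int tenfold(int a) { return a * 10; }

void main()
{
    auto values = [1, 2, 3, 4, 5];

    // Lazy: nothing is computed until the range is iterated.
    auto lazily = values.map!tenfold;

    // Eager: computed immediately and stored in a new array.
    auto eagerly = values.map!tenfold.array;

    // Parallel: computed immediately, spread across worker threads.
    auto inParallel = taskPool.amap!tenfold(values);

    assert(equal(lazily, eagerly));
    assert(eagerly == inParallel);
}
```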

Aug 03 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Wed, Jul 31, 2013 at 09:16:20PM +0000, Justin Whear wrote:
 On Wed, 31 Jul 2013 12:20:56 +0200, Chris wrote:
 
 This is only loosely related to D, but I don't fully understand the
 separation of component programming and OOP


[...]
 A few things:
 1) The functions used in Walter's example are not methods, they are
 generic free functions.  The "interfaces" they require are not actual
 OOP interfaces, but rather a description of what features the supplied
 type must supply.
 2) The avoidance of actual objects, interfaces, and methods means that
 the costly indirections of OOP are also avoided.  The compiler is free
 to inline as much of the pipeline as it wishes.
 3) Component programming simplifies usage requirements, OOP frameworks
 complicate usage requirements (e.g. you must inherit from this class).
 
 If anything, component programming is just functional programming +
 templates and some nice syntactic sugar.  And a healthy dose of pure
 awesome.

Keep in mind, though, that you pay for the avoidance of OO indirections
by template bloat. Every combination of types passed to your components
will create a new instantiation of that component. In simple cases, this
is generally only a handful of copies, so it's not a big deal; but for
certain frequently-used components, this can explode to a huge amount of
duplicated code. They will all avoid "costly" OO indirections, sure, but
you pay for that with larger code size, which means higher rate of CPU
cache misses, larger memory footprint of the code, etc.

This makes me wonder if we can somehow have a "happy marriage" of the
two approaches. Is it possible to have automatic template instantiation
factoring, such that for highly-used templates that generate a lot of
copies, the compiler is smart enough to figure out that automatically
adding indirections to the code to reduce the number of instantiations
might be better?

One case I've come across before is containers. For the sake of
genericity, we usually use templates to implement containers like, say,
a red-black tree. However, most of the code that deals with RB trees
doesn't really care about what type the data is at all; it implements
algorithms that operate on the structure of the RB tree, not on the
data. Only a small subset of RB tree methods actually need to know what
type the data should be (the methods that create a new node and
initialize it with data, return the data from a node, etc.). Yet, in a
template implementation of RB trees, every single method must be
repeatedly instantiated over and over, for every type you put into the
container.

Most of these method instantiations may in fact be essentially identical
to each other, except perhaps for one or two places where it may use a
different node size, or call some comparison function on the data in the
nodes.  Ideally, the compiler should be able to know when a method of a
templated struct/class is transitively independent of the template
parameter, and only emit code for that method once. All other
instantiations of that method will simply become aliases of that one
instantiation.

This doesn't cover the case where the call chain may pass through
methods that don't care about data types but eventually ends at a method
that *does* have to care about data types; but this is solvable by
factoring the code so that the type-independent code is separated from
the type-dependent code, except for one or two runtime parameters (e.g.
size of the data type, or a function pointer to the type-dependent code
that must be called at the end of, say, a tree traversal).  The compiler
may even be able to do this automatically in certain simple cases.
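
Done by hand, that factoring might look like the sketch below. It uses a singly linked list instead of an RB tree to stay short, and all names (`NodeBase`, `countNodes`, `reverseList`, `List`) are invented for illustration. The structural algorithms are ordinary functions compiled once; only the thin typed wrapper is instantiated per element type.

```d
// Type-independent core: compiled once, knows nothing about the payload.
struct NodeBase
{
    NodeBase* next;
}

size_t countNodes(const(NodeBase)* head)
{
    size_t n;
    for (; head !is null; head = head.next)
        ++n;
    return n;
}

NodeBase* reverseList(NodeBase* head)
{
    NodeBase* prev = null;
    while (head !is null)
    {
        auto next = head.next;
        head.next = prev;
        prev = head;
        head = next;
    }
    return prev;
}

// Thin typed wrapper: only payload-touching code is instantiated per T.
struct List(T)
{
    private static struct Node
    {
        NodeBase base; // must stay the first field so the cast below is valid
        T value;
    }

    private NodeBase* head;

    void pushFront(T value)
    {
        auto node = new Node(NodeBase(head), value);
        head = &node.base;
    }

    @property size_t length() const { return countNodes(head); }

    void reverse() { head = reverseList(head); }

    T[] toArray()
    {
        T[] result;
        for (auto p = head; p !is null; p = p.next)
            result ~= (cast(Node*) p).value;
        return result;
    }
}

void main()
{
    List!int numbers;
    foreach (n; [3, 2, 1])
        numbers.pushFront(n); // list is now 1, 2, 3
    numbers.reverse();
    assert(numbers.length == 3);
    assert(numbers.toArray() == [3, 2, 1]);

    List!string words; // second instantiation reuses the same core
    words.pushFront("b");
    words.pushFront("a");
    assert(words.toArray() == ["a", "b"]);
}
```

The price is the pointer cast and the extra indirection, which is exactly the trade-off between template bloat and OO-style indirection discussed above.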


T

-- 
If it breaks, you get to keep both pieces. -- Software disclaimer notice

Jul 31 2013

"Meta" <jared771 gmail.com> writes:

The one thing that confused me at first when I read Walter's 
article was that I thought he was talking about the *other* 
component programming, a method commonly used by game developers 
to avoid deep class hierarchies.

http://gameprogrammingpatterns.com/component.html

Aug 01 2013

"Brad Anderson" <eco gnuk.net> writes:

On Thursday, 1 August 2013 at 07:23:42 UTC, Meta wrote:
 The one thing that confused me at first when I read Walter's 
 article was that I thought he was talking about the *other* 
 component programming, a method commonly used by game 
 developers to avoid deep class hierarchies.

 http://gameprogrammingpatterns.com/component.html

"Component programming" is kind of a crowded term in programming 
which means a lot of different things to different people.  
Digital Mars should trademark a new term for it like Ultra Stream 
Processing™.

Steven Schveighoffer may object to the use of the word "stream" 
without a read/write interface though :P.

Aug 01 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Thu, Aug 01, 2013 at 11:40:21PM +0200, Brad Anderson wrote:
 On Thursday, 1 August 2013 at 07:23:42 UTC, Meta wrote:
The one thing that confused me at first when I read Walter's
article was that I thought he was talking about the *other*
component programming, a method commonly used by game developers
to avoid deep class hierarchies.

http://gameprogrammingpatterns.com/component.html

 
 "Component programming" is kind of a crowded term in programming
 which means a lot of different things to different people.  Digital
 Mars should trademark a new term for it like Ultra Stream
 Processing™.
 
 Steven Schveighoffer may object to the use of the word "stream"
 without a read/write interface though :P.

What about "Ultra Range Processing"? :)


T

-- 
He who does not appreciate the beauty of language is not worthy to
bemoan its flaws.

Aug 01 2013

"John Colvin" <john.loughran.colvin gmail.com> writes:

On Thursday, 1 August 2013 at 21:55:56 UTC, H. S. Teoh wrote:
 On Thu, Aug 01, 2013 at 11:40:21PM +0200, Brad Anderson wrote:
 On Thursday, 1 August 2013 at 07:23:42 UTC, Meta wrote:
The one thing that confused me at first when I read Walter's
article was that I thought he was talking about the *other*
component programming, a method commonly used by game 
developers
to avoid deep class hierarchies.

http://gameprogrammingpatterns.com/component.html

 
 "Component programming" is kind of a crowded term in programming
 which means a lot of different things to different people.  
 Digital
 Mars should trademark a new term for it like Ultra Stream
 Processing™.
 
 Steven Schveighoffer may object to the use of the word "stream"
 without a read/write interface though :P.

 What about "Ultra Range Processing"? :)


 T

Range-Flow Processing.

Flow referring to the L->R data flow of ranges + std.algorithm + 
UFCS

Even just Data Flow Processing would be ok, then you could say 
Range-based Data Flow Processing and sound uber-cool :p

Aug 01 2013

"John Colvin" <john.loughran.colvin gmail.com> writes:

On Thursday, 1 August 2013 at 22:01:08 UTC, John Colvin wrote:
 On Thursday, 1 August 2013 at 21:55:56 UTC, H. S. Teoh wrote:
 On Thu, Aug 01, 2013 at 11:40:21PM +0200, Brad Anderson wrote:
 On Thursday, 1 August 2013 at 07:23:42 UTC, Meta wrote:
The one thing that confused me at first when I read Walter's
article was that I thought he was talking about the *other*
component programming, a method commonly used by game 
developers
to avoid deep class hierarchies.

http://gameprogrammingpatterns.com/component.html

 
 "Component programming" is kind of a crowded term in 
 programming
 which means a lot of different things to different people.  
 Digital
 Mars should trademark a new term for it like Ultra Stream
 Processing™.
 
 Steven Schveighoffer may object to the use of the word 
 "stream"
 without a read/write interface though :P.

 What about "Ultra Range Processing"? :)


 T

 Range-Flow Processing.

 Flow referring to the L->R data flow of ranges + std.algorithm 
 + UFCS

 Even just Data Flow Processing would be ok, then you could say 
 Range-based Data Flow Processing and sound uber-cool :p

Alternatively, substitute Processing with Programming.

Aug 01 2013

"qznc" <qznc web.de> writes:

On Wednesday, 31 July 2013 at 10:20:57 UTC, Chris wrote:
 This is only loosely related to D, but I don't fully understand 
 the separation of component programming and OOP (cf. 
 https://en.wikipedia.org/wiki/Component-based_software_engineering#Differences_from_object-oriented_programming). 
 In an OO framework, the objects are basically components. See 
 also

 "Brad Cox of Stepstone largely defined the modern concept of a 
 software component.[4] He called them Software ICs and set out 
 to create an infrastructure and market for these components by 
 inventing the Objective-C programming language." (see link 
 above)

 Walter's example 
 (http://www.drdobbs.com/architecture-and-design/component-programming-in-d/240008321)

 import std.algorithm, std.array, std.stdio;

 void main() {
         stdin.byLine(KeepTerminator.yes).   // 1
         map!(a => a.idup).                  // 2
         array.                              // 3
         sort.                               // 4
         copy(                               // 5
             stdout.lockingTextWriter());    // 6
 }

 This is more or less what mature OO programs look like. Ideally 
 each class (component) does one thing (however small the class 
 might be) and can be used or called to perform this task. All 
 other classes or components can live independently. From my 
 experience this is exactly what Objective-C does. Rather than 
 subclassing, it uses other classes to get a job done.

A few days ago, there was a discussion about APL on HN [0]. What 
we call Component Programming here, looks somewhat like the APL 
style to me. Sure, APLers have a single weird symbol for stuff 
like "sort.", but this chaining of powerful modular operations is 
what APL seems to be all about.

The APL paradigm is not integrated into modern languages so far. 
I am excited that it might make an introduction now. Compare for 
example Functional Programming, which is integrated into most 
mainstream languages by now. Or Logic Programming, which seems 
not worthy enough to get its own syntax, but is available in the 
business rules world with libraries and DSLs and its minor 
brother Datalog is also still alive.

[0] https://news.ycombinator.com/item?id=6115727

Aug 02 2013

"Jason den Dulk" <public2 jasondendulk.com> writes:

On Wednesday, 31 July 2013 at 10:20:57 UTC, Chris wrote:
 This is only loosely related to D, but I don't fully understand 
 the separation of component programming and OOP

What the Wikipedia entry is saying, in a roundabout way, is:

All objects are components, but not all components are objects.

whereas in pure OOP:

All components are objects.

 In an OO framework, the objects are basically components.

It's the other way around. In OOP frameworks, components are 
objects, a small but important distinction. If you relax the 
insistence on all components being objects, then OOP becomes a 
subset of CP (component programming).
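
As a rough illustration of that distinction (the names total, plain and boxed are invented for this sketch), the D snippet below uses one component that is a plain function template and feeds it once with a struct value and once with a class object. Both inputs serve as components, but only one of them is an object in the OOP sense.

```d
import std.range : iota;
import std.range.interfaces : inputRangeObject;

// A component that is a plain function template, not an object.
// It accepts anything that offers empty, front and popFront.
int total(R)(R values)
{
    int acc = 0;
    for (; !values.empty; values.popFront())
        acc += values.front;
    return acc;
}

void main()
{
    import std.stdio : writeln;

    // A component that is a struct value (no class, no inheritance).
    auto plain = iota(1, 4);

    // The same data wrapped in a class instance.
    auto boxed = inputRangeObject(iota(1, 4));

    writeln(total(plain));   // 1 + 2 + 3
    writeln(total(boxed));   // same result through the object
}
```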

 Walter's example 
 (http://www.drdobbs.com/architecture-and-design/component-programming-in-d/240008321)

 import std.algorithm, std.array, std.stdio;

 void main() {
     stdin.byLine(KeepTerminator.yes).   // 1
     map!(a => a.idup).                  // 2
     array.                              // 3
     sort.                               // 4
     copy(                               // 5
         stdout.lockingTextWriter());    // 6
 }

This is a design pattern called "pipes and filters", or simply, 
the pipeline. There appears to be a bit of confusion about this. 
Pipelines are a part of CP, but are not the whole of CP.

Pipelines make use of a type of component called a "service". At 
its simplest, a service is a function, but it could be a larger 
construct or even a whole program. Basically a service takes 
input, processes it (with a possible side effect) and gives a 
response (output).

Often CP is defined as being exclusively about services, while 
other definitions include objects and OOP. Functional programming 
is exclusively service oriented.

Purists would insist on using either objects or services 
exclusively (OOP vs FP), but there is nothing wrong with working 
with both.

Back to pipelines. In a pipeline, you have a chain of services in 
which the output of one service is given as the input for the 
next. In mathematics it is called "function composition". The 
pipeline itself is a service in its own right. You can put 
together a pipeline of any length as long as the output -> input 
interfaces are compatible. In Walter's article, he goes further 
to make all interfaces the same to make the components 
interchangeable, but this is not necessary in general.
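
A minimal sketch of this idea in D, assuming three made-up services (trimAll, dropEmpty and sorted are illustrative names, not from Walter's article). std.functional.pipe chains them so that the output of each becomes the input of the next, and the resulting pipeline can be called like any other service.

```d
import std.algorithm : filter, map, sort;
import std.array : array;
import std.functional : pipe;
import std.string : strip;

// Three small services: each takes input and returns output.
string[] trimAll(string[] lines)
{
    return lines.map!(l => l.strip).array;
}

string[] dropEmpty(string[] lines)
{
    return lines.filter!(l => l.length > 0).array;
}

string[] sorted(string[] lines)
{
    auto result = lines.dup;   // leave the caller's array untouched
    result.sort();
    return result;
}

// The pipeline is itself a service:
// pipe!(f, g, h)(x) is the same as h(g(f(x))).
alias cleanAndSort = pipe!(trimAll, dropEmpty, sorted);

void main()
{
    import std.stdio : writeln;
    writeln(cleanAndSort(["  pear ", "", "apple", " fig"]));
}
```

Here all three services happen to share the same string[] interface, so they could be reordered freely, but pipe only requires that each output type is accepted by the next stage.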

Hope this helps to explain a few things.
Regards
Jason

Aug 12 2013

"Chris" <wendlec tcd.ie> writes:

On Monday, 12 August 2013 at 12:28:36 UTC, Jason den Dulk wrote:
 On Wednesday, 31 July 2013 at 10:20:57 UTC, Chris wrote:
 This is only loosely related to D, but I don't fully understand 
 the separation of component programming and OOP

 What the Wikipedia entry is saying, in a roundabout way, is:

 All objects are components, but not all components are objects.

 whereas in pure OOP:

 All components are objects.

 In an OO framework, the objects are basically components.

 It's the other way around. In OOP frameworks, components are 
 objects, a small but important distinction. If you relax the 
 insistence on all components being objects, then OOP becomes a 
 subset of CP (component programming).

 Walter's example 
 (http://www.drdobbs.com/architecture-and-design/component-programming-in-d/240008321)

 import std.algorithm, std.array, std.stdio;

 void main() {
     stdin.byLine(KeepTerminator.yes).   // 1
     map!(a => a.idup).                  // 2
     array.                              // 3
     sort.                               // 4
     copy(                               // 5
         stdout.lockingTextWriter());    // 6
 }

 This is a design pattern called "pipes and filters", or simply, 
 the pipeline. There appears to be a bit of confusion about 
 this. Pipelines are a part of CP, but are not the whole of CP.

 Pipelines make use of a type of component called a "service". 
 At its simplest, a service is a function, but it could be a 
 larger construct or even a whole program. Basically a service 
 takes input, processes it (with a possible side effect) and 
 gives a response (output).

 Often CP is defined as being exclusively about services, while 
 other definitions include objects and OOP. Functional 
 programming is exclusively service oriented.

 Purists would insist on using either objects or services 
 exclusively (OOP vs FP), but there is nothing wrong with 
 working with both.

 Back to pipelines. In a pipeline, you have a chain of services 
 in which the output of one service is given as the input for 
 the next. In mathematics it is called "function composition". 
 The pipeline itself is a service in its own right. You can put 
 together a pipeline of any length as long as the output -> 
 input interfaces are compatible. In Walter's article, he goes 
 further to make all interfaces the same to make the components 
 interchangeable, but this is not necessary in general.

 Hope this helps to explain a few things.
 Regards
 Jason


Thanks for the explanation, Jason.

Btw, I got an error message compiling dcal.d with ldmd2

dcal.d(34): Error: no property 'recurrence' for type 'Date'

It compiles with dmd and works.

Aug 19 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Mon, Aug 19, 2013 at 01:26:42PM +0200, Chris wrote:
 On Monday, 12 August 2013 at 12:28:36 UTC, Jason den Dulk wrote:

[...]
 Btw, I got an error message compiling dcal.d with ldmd2
 
 dcal.d(34): Error: no property 'recurrence' for type 'Date'
 
 It compiles with dmd and works.

What version of ldmd2 are you using? Looks like my code is incompatible
with earlier versions of the compiler. :-( I'd like to fix that, if
possible.


T

-- 
Meat: euphemism for dead animal. -- Flora

Aug 21 2013
