digitalmars.D - std.compress

Walter Bright (8/8) Jun 03 2013 https://github.com/WalterBright/phobos/blob/std_compress/std/compress.d

Diggory (2/10) Jun 03 2013 Nice! What happens if R is not a ubyte range?

Walter Bright (3/4) Jun 03 2013 It'll work with char and ubyte, too. Anything else you'll need to cast o...

Timothee Cour (18/27) Jun 03 2013 A)

Walter Bright (4/6) Jun 04 2013 I have mixed feelings about that. If you'll notice, std.compress doesn't...

Dicebot (4/12) Jun 04 2013 If that is an issue, it is an issue in DMD, not in module.
Walter Bright (2/8) Jun 04 2013 Note also I didn't document it, so it is private and can be moved.

Peter Alexander (4/17) Jun 04 2013 Then it should be private. You should also mangle the name so

Walter Bright (7/22) Jun 04 2013 If it proves useful, it will be moved into some more proper and public p...

Peter Alexander (10/21) Jun 04 2013 import std.compress;

monarch_dodra (4/24) Jun 04 2013 Is this according to the specs though, or a bug? It was my

Peter Alexander (5/32) Jun 04 2013 Well, the fix is currently in an unapproved DIP. I have no idea

Martin Nowak (3/7) Jun 04 2013 I should really submit some ideas from my implementation to the DIP.

Dmitry Olshansky (6/18) Jun 04 2013 They are visible and clash with other symbols just like public do. Maybe...
Jonathan M Davis (10/14) Jun 04 2013 Not visible? When was that fixed? Last time I checked, access level had ...

Jakob Ovrum (3/7) Jun 04 2013 I think this is a workaround, not a proper solution.

Walter Bright (7/13) Jun 04 2013 Yup.

Adam D. Ruppe (16/16) Jun 04 2013 I actually wish we could have multiple modules in a single file.

Walter Bright (2/3) Jun 04 2013 I don't see much point to that in modern file systems.

Marco Leise (7/11) Jun 04 2013 Probably seek time if the files are scattered and not in cache.

Walter Bright (3/6) Jun 04 2013 Actually, I've often thought of making dmd able to read everything it ne...

Marco Leise (7/15) Jun 04 2013 That would have been difficult for editors and IDEs that can

eles (3/8) Jun 05 2013 True, but Java also had the same issue with its .jar files and

eles (18/21) Jun 05 2013 I support that. It would make distributing source code cleaner.
Jacob Carlborg (4/6) Jun 05 2013 I think it's better to have a proper package manager.

Regan Heath (5/9) Jun 05 2013 I think it's better to have both :)

Timothee Cour (8/16) Jun 04 2013 There's no point in having modules reinvent the wheel everytime. Circula...
Andrei Alexandrescu (5/11) Jun 04 2013 The downside of that is reinventing everything. I haven't looked at the

Walter Bright (3/6) Jun 04 2013 cycle only reads from a circular buffer. CircularBuffer can be filled as...

Jacob Carlborg (5/22) Jun 04 2013 I agree with all of these.
bearophile (6/9) Jun 04 2013 If you are interested in adding a CircularBuffer to Phobos, then

Peter Alexander (9/13) Jun 04 2013 Nitpick;

bearophile (4/9) Jun 04 2013 I see. Thank you. I will improve it later.

Walter Bright (7/15) Jun 04 2013 BTW, I also wrote this because it is a tricky component to write. There ...
David (11/23) Jun 04 2013 Why do we need that? I would much rather have a deflate which doesn't

Jonathan M Davis (3/6) Jun 04 2013 If you're sure that it's fixed, then close it.

Jacob Carlborg (9/16) Jun 04 2013 I'm wondering if (un)compress can take the compressing algorithm as a

Walter Bright (3/8) Jun 04 2013 I don't see the point. Furthermore, it requires that the compress templa...

John Colvin (5/19) Jun 04 2013 Not necessarily. If the compression algorithms were free

Walter Bright (2/19) Jun 04 2013 What value does a function which just passes an alias to another one add...

John Colvin (9/36) Jun 04 2013 A unified interface called "compress" that takes a compression

Walter Bright (6/15) Jun 04 2013 What is the improvement of typing:

Timothee Cour (5/28) Jun 04 2013 writing generic code.

Walter Bright (4/7) Jun 04 2013 The situations aren't comparable. The to!double case is parameterizing w...

Jonathan M Davis (9/17) Jun 04 2013 Well, I'd expect it to be compress!lzw(), but in any case, what it buys ...

Walter Bright (7/14) Jun 04 2013 There is zero utility in this:

Jonathan M Davis (7/24) Jun 04 2013 If that's all it's doing, then no, it wouldn't be useful to pass it as a...
Byron Heads (9/26) Jun 04 2013 but a compress interface would be nice:

Dmitry Olshansky (5/31) Jun 04 2013 It's a range already thus composable. Ranged I/O though is something to
Walter Bright (3/11) Jun 04 2013 That isn't how ranges work. Ranges already define an input and an output...

John Colvin (5/13) Jun 04 2013 Currently. However, compress could become more feature-rich in

Peter Alexander (6/20) Jun 04 2013 I think this is over-engineering. It's unlikely that an

Walter Bright (8/12) Jun 04 2013 Yup. My experience with abstractions that have no use cases is all the w...

Paulo Pinto (2/18) Jun 05 2013 Yep, it brings back some memories.

Andrei Alexandrescu (3/10) Jun 04 2013 Not absolutely nothing. Almost nothing. The distinction is important.
Max Samukha (3/11) Jun 04 2013 That "absolutely" based on limited personal experience is the

Andrei Alexandrescu (4/15) Jun 04 2013 It's a point, but "biggest" is also kind of too much and based on

Max Samukha (3/23) Jun 04 2013 Yeah, I noticed that.
Zach the Mystic (4/10) Jun 05 2013 Hey, if you ever need someone who can reliably answer with

Walter Bright (12/22) Jun 04 2013 I've seen an awful lot of abstractions over the years that provided zero...

Max Samukha (12/42) Jun 04 2013 I understand. But I've also seen a lot of abstractions over the
Timothee Cour (11/45) Jun 04 2013 What I suggested in my original post didn't involve any

Andrei Alexandrescu (3/10) Jun 05 2013 I think that's nice.

David Nadlinger (6/18) Jun 05 2013 +1. D has many powerful features for handling module namespacing
Daniel Murphy (11/22) Jun 09 2013 This has the problem that you now can't import more than one compression...

Jonathan M Davis (4/28) Jun 09 2013 That can be fixed by using a local alias, but it's true that it's an ext...
Timothee Cour (13/43) Jun 09 2013 which is why I have suggested supporting UFCS with fully qualified funct...

Daniel Murphy (12/25) Jun 09 2013 I'm not a huge fan of this syntax. If we were adding syntax, I would pr...

Jonathan M Davis (17/20) Jun 09 2013 We've actually made the opposite choice when discussing this in the past...

deadalnix (2/37) Jun 09 2013 You are wise and speak the truth :P
Daniel Murphy (8/38) Jun 11 2013 The difference here is these are range functions and you lose ufcs. It

Jakob Ovrum (37/38) Jun 11 2013 We have module-level functions called "copy" (multiple), "read",

Timothee Cour (8/45) Jun 11 2013 I have found a better way to do that: see

Jakob Ovrum (7/15) Jun 11 2013 It's clearly an option, but I think it's too syntactically heavy,

Jonathan M Davis (3/21) Jun 11 2013 Agreed.

Daniel Murphy (14/29) Jun 11 2013 It is.

Jakob Ovrum (4/12) Jun 12 2013 The way I see it, you're asking that all code should pay for the

Daniel Murphy (7/18) Jun 13 2013 Ok, how exactly is the data compressed in the following snippet? No

Michal Minich (5/9) Jun 13 2013 You can have that argument for any single overload and virtual
Peter Alexander (4/8) Jun 13 2013 If it's not obvious from the context, just be explicit.

Andrej Mitrovic (24/27) Jun 13 2013 What happens when we get std.compression.lz78 and you end up

Peter Alexander (7/13) Jun 13 2013 The exact same typo could happen with your structs. You haven't
David Nadlinger (4/8) Jun 13 2013 I think this argument is invalid: A typo in an import statement

Andrej Mitrovic (2/3) Jun 13 2013 *global functions*

David Nadlinger (6/9) Jun 13 2013 I don't need to scroll to the top of the module, just a few lines

Daniel Murphy (8/15) Jun 13 2013 I don't think 4 characters is a high price to pay for the added clarity....

David Nadlinger (3/10) Jun 13 2013 import std.compression : lz77Compress = lz78Compress;

Daniel Murphy (3/11) Jun 13 2013 :(
Jacob Carlborg (5/7) Jun 14 2013 If you do that you only have yourself to blame. What if someone uses

Peter Alexander (15/25) Jun 14 2013 I recommend you just use local imports if it bother you that

Timothee Cour (4/32) Jun 09 2013 ok I found what I think is the best solution to this problem :-)

Daniel Murphy (4/7) Jun 11 2013 That's pretty awesome, but still much much much uglier than not having t...

Jonathan M Davis (6/17) Jun 04 2013 So, you want to create whole modules for each compression algorithm? Tha...

Walter Bright (4/8) Jun 05 2013 When two modules have nothing to do with each other, they should be in s...

Jonathan M Davis (12/14) Jun 05 2013 Except that they're all compression algorithms, so they _are_ related. H...

Walter Bright (18/33) Jun 05 2013 No, they are not related. They don't share code, and it is unlikely more...

Jacob Carlborg (5/9) Jun 05 2013 The current modules in Phobos already contains too much. We shouldn't

Jonathan M Davis (5/13) Jun 05 2013 Maybe some do, but many don't, and 1000 lines is _far_ from too much. If...

Walter Bright (15/18) Jun 05 2013 1. It isn't any harder to find things in multiple files than in one file...

John Colvin (7/15) Jun 05 2013 Although I think you're right about having smaller modules, I

Diggory (4/20) Jun 05 2013 Surely you would know which compression algorithm you wanted to

John Colvin (2/25) Jun 05 2013 I eas speaking more generally, about phobos as a whole.

David Nadlinger (11/17) Jun 05 2013 Use an editor with a file tree sidebar? Quite on the contrary, I

John Colvin (5/24) Jun 05 2013 Agreed.

Jacob Carlborg (4/7) Jun 09 2013 Gedit has a file tree sidebar, at least as a plugin.

Jacob Carlborg (5/8) Jun 05 2013 I completely agree with Walter and he mad my point a lot better than I
Jakob Ovrum (39/45) Jun 05 2013 We have a standard library in disagreement with the language's

Jonathan M Davis (12/18) Jun 05 2013 I honestly don't see how Phobos is in disagreement with the module syste...

Diggory (13/44) Jun 05 2013 I agree with one or two functions it's far too small, but I'm in

H. S. Teoh (25/45) Jun 05 2013 [...]

Peter Alexander (21/91) Jun 06 2013 Massive +1

SomeDude (2/22) Jun 06 2013 Wise words !

David Nadlinger (5/12) Jun 05 2013 Modules are the unit of encapsulation in D (private), so they
SomeDude (4/12) Jun 05 2013 Well, as the author of a 15,000 lines datetime module, I think

Xiaoxi (3/17) Jun 06 2013 are cross module / file, inling working on all d compilers? if

David Nadlinger (9/11) Jun 06 2013 This is not at all relevant if either

Andrei Alexandrescu (7/15) Jun 04 2013 I think the application here is a bit more tenuous. It's natural to

David (2/18) Jun 04 2013 No the compression type only has to provide a certain api.

Walter Bright (2/20) Jun 04 2013 Again, I'm not seeing the added value with this.

Marco Leise (31/45) Jun 04 2013 LZW is a nice and fast general purpose algorithm and I
Tiago Martinez (5/13) Jun 05 2013 I may have misunderstood something, but the code does not

Dmitry Olshansky (7/24) Jun 05 2013 +1

Walter Bright (2/11) Jun 05 2013 Thanks, you're both right.

H. S. Teoh (45/78) Jun 05 2013 On the contrary, I find extremely large files (like std.algorithm) very

Walter Bright <newshound2 digitalmars.com> writes:

https://github.com/WalterBright/phobos/blob/std_compress/std/compress.d

I wrote this to add components to compress and expand ranges.

Highlights:

1. doesn't do any memory allocation
2. can handle arbitrarily large sets of data
3. it's lazy
4. takes an InputRange, and outputs an InputRange

Comments welcome.

Jun 03 2013

"Diggory" <diggsey googlemail.com> writes:

On Tuesday, 4 June 2013 at 03:44:05 UTC, Walter Bright wrote:
 https://github.com/WalterBright/phobos/blob/std_compress/std/compress.d

 I wrote this to add components to compress and expand ranges.

 Highlights:

 1. doesn't do any memory allocation
 2. can handle arbitrarily large sets of data
 3. it's lazy
 4. takes an InputRange, and outputs an InputRange

 Comments welcome.

Nice! What happens if R is not a ubyte range?

Jun 03 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 6/3/2013 10:41 PM, Diggory wrote:
 Nice! What happens if R is not a ubyte range?

It'll work with char and ubyte, too. Anything else you'll need to cast or use
an adapter.
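
A minimal sketch of what "cast or use an adapter" could look like in user code. The helper names narrowToBytes and rawBytes are hypothetical and are not part of std.compress; only std.algorithm's map and equal are real library calls here.

```d
import std.algorithm : map, equal;

// Hypothetical adapter: lazily narrows each element to ubyte.
// Values outside 0 .. 255 are truncated by the cast.
auto narrowToBytes(R)(R r)
{
    return r.map!(a => cast(ubyte) a);
}

// Hypothetical cast helper: reinterprets the memory of an array as bytes.
// No copy is made; the result has data.length * T.sizeof elements.
const(ubyte)[] rawBytes(T)(const(T)[] data)
{
    return cast(const(ubyte)[]) data;
}

unittest
{
    int[] values = [72, 105, 33];
    assert(narrowToBytes(values).equal([72, 105, 33]));
    assert(rawBytes(values).length == values.length * int.sizeof);
}
```

The adapter keeps everything lazy, while the cast exposes the in-memory layout, so its result depends on the element size and the platform's byte order.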

Jun 03 2013

Timothee Cour <thelastmammoth gmail.com> writes:

A)
there already is std.zlib; why not have:
std.compress.zlib: public import std.zlib
std.compress.lzw: put this new module there instead of in std.compress
std.compress.image.png
std.compress.image.jpg

B)
rename:
std.compress.lzwCompress => std.compress.lzw.compress
std.compress.lzwExpand => std.compress.lzw.uncompress

which is more consistent with compress/uncompress from std.zlib

C)
maybe add a link to
https://en.wikipedia.org/wiki/Lempel%E2%80%93Ziv%E2%80%93Welch or other
source

D)
CircularBuffer belongs somewhere else; maybe std.range or std.container
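
For reference, the compress/uncompress naming from std.zlib mentioned in B) looks roughly like this in use. This is a sketch of a round trip with the existing std.zlib functions only; the helper name zlibRoundTrip is hypothetical, and nothing here shows the proposed std.compress.lzw API.

```d
import std.zlib : compress, uncompress;

// Hypothetical helper: compresses data with std.zlib and checks that
// uncompressing gives the original bytes back.
bool zlibRoundTrip(const(ubyte)[] data)
{
    auto packed = compress(data);
    // The second argument is a size hint for the destination buffer.
    auto restored = cast(const(ubyte)[]) uncompress(packed, data.length);
    return restored == data;
}

unittest
{
    assert(zlibRoundTrip(cast(const(ubyte)[]) "aaaaaaaaaabbbbbbbbbb aaaaaaaaaabbbbbbbbbb"));
}
```

Note that these functions work on whole arrays and allocate their result, unlike the lazy range-based design described in the announcement.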





On Mon, Jun 3, 2013 at 8:44 PM, Walter Bright <newshound2 digitalmars.com> wrote:

 https://github.com/WalterBright/phobos/blob/std_compress/std/compress.d

 I wrote this to add components to compress and expand ranges.

 Highlights:

 1. doesn't do any memory allocation
 2. can handle arbitrarily large sets of data
 3. it's lazy
 4. takes an InputRange, and outputs an InputRange

 Comments welcome.

Jun 03 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 6/3/2013 11:40 PM, Timothee Cour wrote:
 D)
 CircularBuffer belongs somewhere else; maybe std.range or std.container

I have mixed feelings about that. If you'll notice, std.compress doesn't have 
any imports! I wanted to make at least one module that doesn't pull in 100% of 
everything in Phobos (one of my pet peeves).

Jun 04 2013

"Dicebot" <m.strashun gmail.com> writes:

On Tuesday, 4 June 2013 at 08:00:03 UTC, Walter Bright wrote:
 On 6/3/2013 11:40 PM, Timothee Cour wrote:
 D)
 CircularBuffer belongs somewhere else; maybe std.range or 
 std.container

 I have mixed feelings about that. If you'll notice, 
 std.compress doesn't have any imports! I wanted to make at 
 least one module that doesn't pull in 100% of everything in 
 Phobos (one of my pet peeves).

If that is an issue, it is an issue in DMD, not in module. 
Modules are supposed to use each other extensively, that is the 
very reason to have them!

Jun 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 6/4/2013 1:00 AM, Walter Bright wrote:
 On 6/3/2013 11:40 PM, Timothee Cour wrote:
 D)
 CircularBuffer belongs somewhere else; maybe std.range or std.container

 I have mixed feelings about that. If you'll notice, std.compress doesn't have
 any imports! I wanted to make at least one module that doesn't pull in 100% of
 everything in Phobos (one of my pet peeves).

Note also I didn't document it, so it is private and can be moved.

Jun 04 2013

"Peter Alexander" <peter.alexander.au gmail.com> writes:

On Tuesday, 4 June 2013 at 08:03:15 UTC, Walter Bright wrote:
 On 6/4/2013 1:00 AM, Walter Bright wrote:
 On 6/3/2013 11:40 PM, Timothee Cour wrote:
 D)
 CircularBuffer belongs somewhere else; maybe std.range or 
 std.container

 I have mixed feelings about that. If you'll notice, 
 std.compress doesn't have
 any imports! I wanted to make at least one module that doesn't 
 pull in 100% of
 everything in Phobos (one of my pet peeves).

 Note also I didn't document it, so it is private and can be 
 moved.

Then it should be private. You should also mangle the name so 
that it doesn't pollute the unqualified symbol namespace (either 
that or fix visibility of private symbols).

Jun 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 6/4/2013 1:15 AM, Peter Alexander wrote:
 On Tuesday, 4 June 2013 at 08:03:15 UTC, Walter Bright wrote:
 On 6/4/2013 1:00 AM, Walter Bright wrote:
 On 6/3/2013 11:40 PM, Timothee Cour wrote:
 D)
 CircularBuffer belongs somewhere else; maybe std.range or std.container

 I have mixed feelings about that. If you'll notice, std.compress doesn't have
 any imports! I wanted to make at least one module that doesn't pull in 100% of
 everything in Phobos (one of my pet peeves).

 Note also I didn't document it, so it is private and can be moved.

 Then it should be private.

I agree.

 You should also mangle the name so that it doesn't
 pollute the unqualified symbol namespace (either that or fix visibility of
 private symbols).

If it proves useful, it will be moved into some more proper and public place.

I think it's a bad idea to 'mangle' the name. First off, if it is private, it is
not visible. And even being public, the anti-hijacking language features make it
a non-problem. The whole point is to avoid the wretched C problems with a global
name space, by not having a global name space.

Jun 04 2013

"Peter Alexander" <peter.alexander.au gmail.com> writes:

On Tuesday, 4 June 2013 at 08:23:52 UTC, Walter Bright wrote:
 You should also mangle the name so that it doesn't
 pollute the unqualified symbol namespace (either that or fix 
 visibility of
 private symbols).

 If it proves useful, it will be moved into some more proper and 
 public place.

 I think it's a bad idea to 'mangle' the name. First off, if it 
 is private, it is not visible. And even being public, the 
 anti-hijacking language features make it a non-problem. The 
 whole point is to avoid the wretched C problems with a global 
 name space, by not having a global name space.

import std.compress;
import mylib.circularbuffer;

CircularBuffer!(ubyte[1024]) buf;

ERROR: conflicting names, even though std.compress.CircularBuffer 
is private! I have to fully qualify CircularBuffer, or use alias 
to get around the problem.

D may not have a global namespace, but it does have unqualified 
name lookup, and private symbols still pollute that 
pseudo-namespace.
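
The workarounds mentioned above (full qualification or an alias) can be sketched with existing Phobos modules, since mylib.circularbuffer is only a made-up name. The sketch assumes nothing about std.compress; it uses std.algorithm's copy and std.file's exists, which are real, and a hypothetical helper copyInts.

```d
import std.algorithm : algoCopy = copy;  // renamed selective import
static import std.file;                   // names reachable only as std.file.xxx

// Hypothetical helper: copies src into dst through the renamed import,
// so the unqualified name "copy" is never looked up.
void copyInts(const(int)[] src, int[] dst)
{
    algoCopy(src, dst);
}

unittest
{
    int[3] dst;
    copyInts([1, 2, 3], dst[]);
    assert(dst == [1, 2, 3]);
    // Fully qualified call; std.file.copy cannot clash with algoCopy.
    assert(std.file.exists("."));
}
```

Both forms keep the clashing name out of unqualified lookup, at the cost of extra typing at the import or call site.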

Jun 04 2013

"monarch_dodra" <monarchdodra gmail.com> writes:

On Tuesday, 4 June 2013 at 08:33:29 UTC, Peter Alexander wrote:
 On Tuesday, 4 June 2013 at 08:23:52 UTC, Walter Bright wrote:
 You should also mangle the name so that it doesn't
 pollute the unqualified symbol namespace (either that or fix 
 visibility of
 private symbols).

 If it proves useful, it will be moved into some more proper 
 and public place.

 I think it's a bad idea to 'mangle' the name. First off, if it 
 is private, it is not visible. And even being public, the 
 anti-hijacking language features make it a non-problem. The 
 whole point is to avoid the wretched C problems with a global 
 name space, by not having a global name space.

 import std.compress;
 import mylib.circularbuffer;

 CircularBuffer!(ubyte[1024]) buf;

 ERROR: conflicting names, even though 
 std.compress.CircularBuffer is private! I have to fully qualify 
 CircularBuffer, or use alias to get around the problem.

Is this according to the specs though, or a bug? It was my
understanding that another module's private symbols should not
even be "seen" ?

Jun 04 2013

"Peter Alexander" <peter.alexander.au gmail.com> writes:

On Tuesday, 4 June 2013 at 09:11:49 UTC, monarch_dodra wrote:
 On Tuesday, 4 June 2013 at 08:33:29 UTC, Peter Alexander wrote:
 On Tuesday, 4 June 2013 at 08:23:52 UTC, Walter Bright wrote:
 You should also mangle the name so that it doesn't
 pollute the unqualified symbol namespace (either that or fix 
 visibility of
 private symbols).

 If it proves useful, it will be moved into some more proper 
 and public place.

 I think it's a bad idea to 'mangle' the name. First off, if 
 it is private, it is not visible. And even being public, the 
 anti-hijacking language features make it a non-problem. The 
 whole point is to avoid the wretched C problems with a global 
 name space, by not having a global name space.

 import std.compress;
 import mylib.circularbuffer;

 CircularBuffer!(ubyte[1024]) buf;

 ERROR: conflicting names, even though 
 std.compress.CircularBuffer is private! I have to fully 
 qualify CircularBuffer, or use alias to get around the problem.

 Is this according to the specs though, or a bug? It was my
 understanding that another module's private symbols should not
 even be "seen" ?

Well, the fix is currently in an unapproved DIP. I have no idea 
whether Walter intends to accept it or reject it. The discussion 
thread just seems to have died off.

http://wiki.dlang.org/DIP22

Jun 04 2013

Martin Nowak <code dawg.eu> writes:

On 06/04/2013 11:52 AM, Peter Alexander wrote:
 Well, the fix is currently in an unapproved DIP. I have no idea whether
 Walter intends to accept it or reject it. The discussion thread just
 seems to have died off.

 http://wiki.dlang.org/DIP22

I should really submit some ideas from my implementation to the DIP.
https://github.com/D-Programming-Language/dmd/pull/739

Jun 04 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

04-Jun-2013 12:23, Walter Bright wrote:
 On 6/4/2013 1:15 AM, Peter Alexander wrote:

 I agree.
 You should also mangle the name so that it doesn't
 pollute the unqualified symbol namespace (either that or fix
 visibility of
 private symbols).

 If it proves useful, it will be moved into some more proper and public
 place.

 I think it's a bad idea to 'mangle' the name. First off, if it is
 private, it is not visible. And even being public, the anti-hijacking
 language features make it a non-problem. The whole point is to avoid the
 wretched C problems with a global name space, by not having a global
 name space.

They are visible and clash with other symbols just like public ones do. Maybe 
now is the time to fix this bug?

-- 
Dmitry Olshansky

Jun 04 2013

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Tuesday, June 04, 2013 01:23:52 Walter Bright wrote:
 I think it's a bad idea to 'mangle' the name. First off, if it is private,
 it is not visible. And even being public, the anti-hijacking language
 features make it a non-problem. The whole point is to avoid the wretched C
 problems with a global name space, by not having a global name space.

Not visible? When was that fixed? Last time I checked, access level had zero 
effect on visibility, just your ability to actually call it. Access level is 
taken into account after overload resolution. So, if there's another, public 
symbol with the same name which would be as good a match as this one aside 
from access level, then you're going to get a compilation error - which is 
exactly why most of us argue that inaccessible symbols should not be visible. 
But that requires a language change (which should definitely happen IMHO, but 
AFAIK, it still hasn't).

- Jonathan M Davis

Jun 04 2013

"Jakob Ovrum" <jakobovrum gmail.com> writes:

On Tuesday, 4 June 2013 at 08:00:03 UTC, Walter Bright wrote:
 I have mixed feelings about that. If you'll notice, 
 std.compress doesn't have any imports! I wanted to make at 
 least one module that doesn't pull in 100% of everything in 
 Phobos (one of my pet peeves).

I think this is a workaround, not a proper solution.

It probably means Phobos' granularity is horribly wrong.

Jun 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 6/4/2013 1:08 AM, Jakob Ovrum wrote:
 On Tuesday, 4 June 2013 at 08:00:03 UTC, Walter Bright wrote:
 I have mixed feelings about that. If you'll notice, std.compress doesn't have
 any imports! I wanted to make at least one module that doesn't pull in 100% of
 everything in Phobos (one of my pet peeves).

 I think this is a workaround, not a proper solution.

Yes, it is.

 It probably means Phobos' granularity is horribly wrong.

Yup.

Phobos is hard to work on because of the complexity of everything importing and 
depending on everything else all in mutually referential cycles. I deliberately 
set out to create compress as a non-trivial module that did not do that.

I hope that splitting things up into packages will improve things.

Jun 04 2013

"Adam D. Ruppe" <destructionator gmail.com> writes:

I actually wish we could have multiple modules in a single file. 
Correct me if I'm wrong, but if you imported something and only used 
one type there, the linker should strip out the others, right?

But this doesn't happen because ModuleInfo references all kinds 
of things, and moduleinfo is referenced for constructors and 
such. This is useful and removing it is probably a bad idea.

Breaking up into packages is one idea but you can't always do it. 
What if you're doing some big string mixins? A single file is 
also a little easier to distribute.

But mixins are the case that is hard to work around since they by 
definition go into one file. If we could do something like 
mixin("module foo.mixin"~name~" { code }"); you could work around 
it.

Then you could isolate sections of generated code in their own 
logical modules, letting the linker kill those sections if they 
aren't actually used.

Jun 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 6/4/2013 5:17 AM, Adam D. Ruppe wrote:
 I actually wish we could have multiple modules in a single file.

I don't see much point to that in modern file systems.

Jun 04 2013

Marco Leise <Marco.Leise gmx.de> writes:

On Tue, 04 Jun 2013 09:06:22 -0700,
Walter Bright <newshound2 digitalmars.com> wrote:

 On 6/4/2013 5:17 AM, Adam D. Ruppe wrote:
 I actually wish we could have multiple modules in a single file.

 
 I don't see much point to that in modern file systems.

Probably seek time if the files are scattered and not in cache.
That's hardly a show stopper unless you have 17.156 files like
the Java Runtime. But they 'solved' it by zipping them up.

-- 
Marco

Jun 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 6/4/2013 5:13 PM, Marco Leise wrote:
 Probably seek time if the files are scattered and not in cache.
 That's hardly a show stopper unless you have 17.156 files like
 the Java Runtime. But they 'solved' it by zipping them up.


Actually, I've often thought of making dmd able to read everything it needs out 
of a zip file.

Jun 04 2013

Marco Leise <Marco.Leise gmx.de> writes:

On Tue, 04 Jun 2013 17:58:01 -0700,
Walter Bright <newshound2 digitalmars.com> wrote:

 On 6/4/2013 5:13 PM, Marco Leise wrote:
 Probably seek time if the files are scattered and not in cache.
 That's hardly a show stopper unless you have 17.156 files like
 the Java Runtime. But they 'solved' it by zipping them up.

 
 
 Actually, I've often thought of making dmd able to read everything it needs
 out of a zip file.

That would have been difficult for editors and IDEs that can
look up file names from include paths only when they are not
zipped up. It is good the way it is.

-- 
Marco

Jun 04 2013

"eles" <eles eles.com> writes:

On Wednesday, 5 June 2013 at 02:23:54 UTC, Marco Leise wrote:
 On Tue, 04 Jun 2013 17:58:01 -0700,
 Walter Bright <newshound2 digitalmars.com> wrote:
 That would have been difficult for editors and IDEs that can
 look up file names from include paths only when they are not
 zipped up. It is good the way it is.

True, but Java also had the same issue with its .jar files and 
the editors adapted.

Jun 05 2013

"eles" <eles eles.com> writes:

On Wednesday, 5 June 2013 at 00:58:02 UTC, Walter Bright wrote:
 On 6/4/2013 5:13 PM, Marco Leise wrote:
 Actually, I've often thought of making dmd able to read 
 everything it needs out of a zip file.

I support that. It would make distributing source code cleaner. 
Most of the time you don't need to look at the code, just compile 
it, while still knowing that you have it available in an archive 
if you need it.

Maybe that kind of support could improve the distribution of 
closed-source libraries, too: the generated .di files and the 
binaries could be packaged together in a zip file.

Moreover, the zip file could be really easily tested for 
self-containment. It sometimes happens that a folder of code 
compiles, then when you package the whole thing and ship it, you 
discover that you forgot to include some file or dependency in the 
package, and the customers complain about it.

With a zip file, you just do a compile-check on the final package 
and, if ok, then it is ready for shipment.

Btw, I cannot resist adding my favorite quote in 
software development:

"It compiles. Let's ship it!" :)

Jun 05 2013

Jacob Carlborg <doob me.com> writes:

On 2013-06-05 02:58, Walter Bright wrote:

 Actually, I've often thought of making dmd able to read everything it
 needs out of a zip file.

I think it's better to have a proper package manager.

-- 
/Jacob Carlborg

Jun 05 2013

"Regan Heath" <regan netmail.co.nz> writes:

On Wed, 05 Jun 2013 11:17:06 +0100, Jacob Carlborg <doob me.com> wrote:

 On 2013-06-05 02:58, Walter Bright wrote:

 Actually, I've often thought of making dmd able to read everything it
 needs out of a zip file.

 I think it's better to have a proper package manager.

I think it's better to have both :)

R



Jun 05 2013

Timothee Cour <thelastmammoth gmail.com> writes:

There's no point in having modules reinvent the wheel every time. CircularBuffer
is clearly usable in other contexts.

Reusing such code makes sure bug fixes and efficiency gains are done once
and for all and work across the board.

 I wanted to make at least one module that doesn't pull in 100% of
 everything in Phobos

That seems like a very artificial exercise leading to unnecessary
contortions.

On Tue, Jun 4, 2013 at 1:00 AM, Walter Bright <newshound2 digitalmars.com> wrote:

 On 6/3/2013 11:40 PM, Timothee Cour wrote:

 D)
 CircularBuffer belongs somewhere else; maybe std.range or std.container

 I have mixed feelings about that. If you'll notice, std.compress doesn't
 have any imports! I wanted to make at least one module that doesn't pull in
 100% of everything in Phobos (one of my pet peeves).

Jun 04 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 6/4/13 4:00 AM, Walter Bright wrote:
 On 6/3/2013 11:40 PM, Timothee Cour wrote:
 D)
 CircularBuffer belongs somewhere else; maybe std.range or std.container

 I have mixed feelings about that. If you'll notice, std.compress doesn't
 have any imports! I wanted to make at least one module that doesn't pull
 in 100% of everything in Phobos (one of my pet peeves).

The downside of that is reinventing everything. I haven't looked at the 
code yet, but std.range has http://dlang.org/phobos/std_range.html#cycle 
which implements a circular buffer.

Andrei

Jun 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 6/4/2013 5:26 AM, Andrei Alexandrescu wrote:
 The downside of that is reinventing everything. I haven't looked at the code
 yet, but std.range has http://dlang.org/phobos/std_range.html#cycle which
 implements a circular buffer.

cycle only reads from a circular buffer. CircularBuffer can be filled as well
as emptied at the same time (it has a put() method).
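
To illustrate the difference, here is a minimal sketch of a fixed-capacity buffer that can be filled with put() and drained with front/popFront. The type RingBuffer is hypothetical and is not the CircularBuffer from std.compress; it only shows the general idea under the assumption of a compile-time capacity.

```d
// Hypothetical fixed-capacity ring buffer; NOT the std.compress implementation.
struct RingBuffer(T, size_t N)
{
    private T[N] data;     // storage lives inside the struct, no heap allocation
    private size_t head;   // index of the oldest element
    private size_t count;  // number of elements currently stored

    @property bool empty() const { return count == 0; }
    @property bool full() const { return count == N; }
    @property size_t length() const { return count; }

    // Filling side: append one element after the newest one.
    void put(T value)
    {
        assert(!full, "RingBuffer overflow");
        data[(head + count) % N] = value;
        ++count;
    }

    // Draining side: the oldest element.
    @property T front()
    {
        assert(!empty, "RingBuffer is empty");
        return data[head];
    }

    // Draining side: drop the oldest element.
    void popFront()
    {
        assert(!empty, "RingBuffer is empty");
        head = (head + 1) % N;
        --count;
    }
}

unittest
{
    RingBuffer!(ubyte, 4) buf;
    buf.put(1);
    buf.put(2);
    buf.put(3);
    assert(buf.front == 1);
    buf.popFront();
    buf.put(4);
    buf.put(5);            // this write wraps around to index 0
    assert(buf.full);
    assert(buf.front == 2);
}
```

std.range's cycle, by contrast, only presents an existing range as endlessly repeating and has no way to add elements.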

Jun 04 2013

Jacob Carlborg <doob me.com> writes:

On 2013-06-04 08:40, Timothee Cour wrote:
 A)
 there already is std.zlib; why not have:
 std.compress.zlib: public import std.zlib
 std.compress.lzw: put this new module there instead of in std.compress
 std.compress.image.png
 std.compress.image.jpg

 B)
 rename:
 std.compress.lzwCompress => std.compress.lzw.compress
 std.compress.lzwExpand => std.compress.lzw.uncompress

 which is more consistent with compress/uncompress from std.zlib

 C)
 maybe add a link to
 https://en.wikipedia.org/wiki/Lempel%E2%80%93Ziv%E2%80%93Welch or other
 source

 D)
 CircularBuffer belongs somewhere else; maybe std.range or std.container

I agree with all of these.

Perhaps it should be put in the review queue as well.

-- 
/Jacob Carlborg

Jun 04 2013

"bearophile" <bearophileHUGS lycos.com> writes:

Timothee Cour:

 D)
 CircularBuffer belongs somewhere else; maybe std.range or 
 std.container

If you are interested in adding a CircularBuffer to Phobos, then 
I'd like both that fixed sized one and a growing one like this:
http://rosettacode.org/wiki/Queue/Usage#Faster_Version

Bye,
bearophile

Jun 04 2013

"Peter Alexander" <peter.alexander.au gmail.com> writes:

On Tuesday, 4 June 2013 at 11:18:45 UTC, bearophile wrote:
 If you are interested in adding a CircularBuffer to Phobos, 
 then I'd like both that fixed sized one and a growing one like 
 this:
 http://rosettacode.org/wiki/Queue/Usage#Faster_Version

Nitpick;

head = (head + 1) & ((cast(size_t)1 << power2) - 1);

can be

head = (head + 1) & (A.length - 1);

No? power2 seems superfluous. Also, left/right shifts by a variable 
amount are very slow on some processors.
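
A quick check of the equivalence, as a sketch: it assumes, as in the Rosetta Code version, that A.length is always a power of two. The concrete size (power2 = 3) is chosen only for the example.

```d
void main()
{
    enum size_t power2 = 3;
    ubyte[1 << power2] A;   // length 8; the trick needs a power-of-two length

    foreach (size_t head; 0 .. 3 * A.length)
    {
        // Mask rebuilt from the exponent on every step (the original form).
        immutable viaShift = (head + 1) & ((cast(size_t) 1 << power2) - 1);
        // Mask taken directly from the array length (the suggested form).
        immutable viaLength = (head + 1) & (A.length - 1);
        // Plain modulo, which is what both masks compute here.
        immutable viaModulo = (head + 1) % A.length;

        assert(viaShift == viaLength);
        assert(viaLength == viaModulo);
    }
}
```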

Anyway, we'll really need allocators before we can add more 
allocating containers. Andrei? :-)

Jun 04 2013

"bearophile" <bearophileHUGS lycos.com> writes:

Peter Alexander:

 Nitpick;

 head = (head + 1) & ((cast(size_t)1 << power2) - 1);

 can be

 head = (head + 1) & (A.length - 1);

 No? power2 seems superfluous.

I see. Thank you. I will improve it later.

Bye,
bearophile

Jun 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 6/3/2013 8:44 PM, Walter Bright wrote:
 https://github.com/WalterBright/phobos/blob/std_compress/std/compress.d

 I wrote this to add components to compress and expand ranges.

 Highlights:

 1. doesn't do any memory allocation
 2. can handle arbitrarily large sets of data
 3. it's lazy
 4. takes an InputRange, and outputs an InputRange

 Comments welcome.

BTW, I also wrote this because it is a tricky component to write. There is not
a 1:1 correspondence between input and output - the relationship is not 
predictable. Worse, there are "look backs" on input and "back patches" on 
output. Hence, sliding buffers have to be used on both input and output.

I like to think of it as an example of how to do such. It took me a bit of time 
to figure out a way to do it that wasn't too numbingly complex.

Jun 04 2013

David <d dav1d.de> writes:

On 04.06.2013 05:44, Walter Bright wrote:
 https://github.com/WalterBright/phobos/blob/std_compress/std/compress.d
 
 I wrote this to add components to compress and expand ranges.
 
 Highlights:
 
 1. doesn't do any memory allocation
 2. can handle arbitrarily large sets of data
 3. it's lazy
 4. takes an InputRange, and outputs an InputRange
 
 Comments welcome.

Why do we need that? I would much rather have a deflate which doesn't
depend on a C zlib (a proper std.zlib written in 100% D), followed by
a less buggy, less pita, less limited std.zip (btw. I think I fixed one
of the bugs a while ago but it is still open and listed as a bug on
dlang.org).

I personally have never used LZW compression, and from what I know it is only
used in GIF and TIFF (I might be wrong here), in comparison to deflate,
which is used in a variety of formats. So making std.compress contain only
a rarely used compression algorithm feels wrong; having it in
std.compress.* would be OK.

Jun 04 2013

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Tuesday, June 04, 2013 14:48:34 David wrote:
 (btw. I think I fixed one
 of the bugs a while ago but it is still open and listed as a bug on
 dlang.org).

If you're sure that it's fixed, then close it.

- Jonathan M Davis

Jun 04 2013

Jacob Carlborg <doob me.com> writes:

On 2013-06-04 05:44, Walter Bright wrote:
 https://github.com/WalterBright/phobos/blob/std_compress/std/compress.d

 I wrote this to add components to compress and expand ranges.

 Highlights:

 1. doesn't do any memory allocation
 2. can handle arbitrarily large sets of data
 3. it's lazy
 4. takes an InputRange, and outputs an InputRange

I'm wondering if (un)compress can take the compressing algorithm as a 
template parameter. Does that make sense?

Something like:

auto result = data.compress!(LZW);

Then we could pass different compressing algorithms to the compress 
function.

-- 
/Jacob Carlborg

Jun 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 6/4/2013 6:34 AM, Jacob Carlborg wrote:
 I'm wondering if (un)compress can take the compressing algorithm as a template
 parameter. Does that make sense?

 Something like:

 auto result = data.compress!(LZW);

 Then we could pass different compressing algorithms to the compress function.

I don't see the point. Furthermore, it requires that the compress template know 
about all the compression algorithms available, which limits future expansion.

Jun 04 2013

"John Colvin" <john.loughran.colvin gmail.com> writes:

On Tuesday, 4 June 2013 at 16:09:09 UTC, Walter Bright wrote:
 On 6/4/2013 6:34 AM, Jacob Carlborg wrote:
 I'm wondering if (un)compress can take the compressing 
 algorithm as a template
 parameter. Does that make sense?

 Something like:

 auto result = data.compress!(LZW);

 Then we could pass different compressing algorithms to the 
 compress function.

 I don't see the point. Furthermore, it requires that the 
 compress template know about all the compression algorithms 
 available, which limits future expansion.

Not necessarily. If the compression algorithms were free 
functions in the module you could just be passing an alias to 
one, which compress would then call. (which would also allow 
people to specify their own algorithms)
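
A sketch of what this suggestion could look like. Everything here is hypothetical: addOne and reversed are stand-ins that do not compress anything, and compress and store are illustrative names, not part of the proposed module.

```d
import std.algorithm : equal, map;
import std.range : retro;

// Hypothetical stand-ins for real algorithms; neither one compresses anything.
auto addOne(R)(R r) { return r.map!(x => x + 1); }
auto reversed(R)(R r) { return r.retro; }

// The wrapper under discussion: it only forwards to the chosen algorithm.
auto compress(alias algorithm = addOne, R)(R r)
{
    return algorithm(r);
}

// Generic code can receive the algorithm as a parameter without naming it.
auto store(alias algorithm, R)(R r)
{
    return r.compress!algorithm;
}

void main()
{
    int[] data = [1, 2, 3];
    assert(equal(data.compress(), [2, 3, 4]));          // default algorithm
    assert(equal(data.compress!reversed, [3, 2, 1]));   // caller-supplied one
    assert(equal(store!reversed(data), [3, 2, 1]));
}
```

The wrapper needs no list of known algorithms, but as written it also does nothing beyond forwarding the call.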

Jun 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 6/4/2013 9:33 AM, John Colvin wrote:
 On Tuesday, 4 June 2013 at 16:09:09 UTC, Walter Bright wrote:
 On 6/4/2013 6:34 AM, Jacob Carlborg wrote:
 I'm wondering if (un)compress can take the compressing algorithm as a template
 parameter. Does that make sense?

 Something like:

 auto result = data.compress!(LZW);

 Then we could pass different compressing algorithms to the compress function.

 I don't see the point. Furthermore, it requires that the compress template
 know about all the compression algorithms available, which limits future
 expansion.

 Not necessarily. If the compression algorithms were free functions in the
 module you could just be passing an alias to one, which compress would then
 call. (which would also allow people to specify their own algorithms)

What value does a function which just passes an alias to another one add?

Jun 04 2013

"John Colvin" <john.loughran.colvin gmail.com> writes:

On Tuesday, 4 June 2013 at 17:50:47 UTC, Walter Bright wrote:
 On 6/4/2013 9:33 AM, John Colvin wrote:
 On Tuesday, 4 June 2013 at 16:09:09 UTC, Walter Bright wrote:
 On 6/4/2013 6:34 AM, Jacob Carlborg wrote:
 I'm wondering if (un)compress can take the compressing 
 algorithm as a template
 parameter. Does that make sense?

 Something like:

 auto result = data.compress!(LZW);

 Then we could pass different compressing algorithms to the 
 compress function.

 I don't see the point. Furthermore, it requires that the 
 compress template
 know about all the compression algorithms available, which 
 limits future
 expansion.

 Not necessarily. If the compression algorithms were free 
 functions in the module
 you could just be passing an alias to one, which compress 
 would then call.
 (which would also allow people to specify their own algorithms)

 What value does a function which just passes an alias to 
 another one add?

A unified interface called "compress" that takes a compression 
function as an alias (with e.g. lzwCompress as a default) seems 
like a nicer way of working, seeing as people don't necessarily 
care/know about which algorithm they're using, they just want to 
compress something a bit.

Also, it would be cool if a range could remember which algorithm 
it was compressed with (as its type? E.g. LzwRange), so a 
generic function "expand" could call the appropriate ***Expand.

Jun 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 6/4/2013 11:04 AM, John Colvin wrote:
 On Tuesday, 4 June 2013 at 17:50:47 UTC, Walter Bright wrote:
 What value does a function which just passes an alias to another one add?

 A unified interface called "compress" that takes a compression function as an
 alias (with e.g. lzwCompress as a default) seems like a nicer way of working,
 seeing as people don't necessarily care/know about which algorithm they're
 using, they just want to compress something a bit.

 Also, it would be cool if a range could remember which algorithm it was
 compressed with (as its type? E.g. LzwRange), so a generic function "expand"
 could call the appropriate ***Expand.

What is the improvement of typing:

    compress(lzw)

over:

    lzwCompress()

?

Jun 04 2013

Timothee Cour <thelastmammoth gmail.com> writes:

On Tue, Jun 4, 2013 at 11:37 AM, Walter Bright
<newshound2 digitalmars.com> wrote:

 On 6/4/2013 11:04 AM, John Colvin wrote:

 On Tuesday, 4 June 2013 at 17:50:47 UTC, Walter Bright wrote:

 What value does a function which just passes an alias to another one add?

 A unified interface called "compress" that takes a compression function
 as an alias (with e.g. lzwCompress as a default) seems like a nicer way of
 working, seeing as people don't necessarily care/know about which algorithm
 they're using, they just want to compress something a bit.

 Also, it would be cool if a range could remember which algorithm it was
 compressed with (as its type? E.g. LzwRange), so a generic function
 "expand" could call the appropriate ***Expand.

 What is the improvement of typing:

    compress(lzw)

 over:

    lzwCompress()

 ?

Writing generic code.
It's the same reason we prefer:
auto y=to!double(x) over auto y=to_double(x);

Jun 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 6/4/2013 11:43 AM, Timothee Cour wrote:
 writing generic code.
 same reason as why we prefer:
 auto y=to!double(x) over auto y=to_double(x);

The situations aren't comparable. The to!double case is parameterizing with a 
type, the compress one is not. Secondly, compress(lzw) does ABSOLUTELY NOTHING 
but turn around and call lzw. It adds nothing.

Jun 04 2013

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Tuesday, June 04, 2013 11:46:48 Walter Bright wrote:
 On 6/4/2013 11:43 AM, Timothee Cour wrote:
 writing generic code.
 same reason as why we prefer:
 auto y=to!double(x) over auto y=to_double(x);

 
 The situations aren't comparable. The to!double case is parameterizing with
 a type, the compress one is not. Secondly, compress(lzw) does ABSOLUTELY
 NOTHING but turn around and call lzw. It adds nothing.

Well, I'd expect it to be compress!lzw(), but in any case, what it buys you is 
that you can pass the algorithm around without caring what it is so that while 
code higher up on the stack may have to know that it's lzw, code deeper down 
doesn't have to care what type of algorithm it's using. Now, whether that 
flexibility is all that useful in this particular case, I don't know, but it 
_does_ help with generic code. It's like how a lot of std.algorithm takes its 
predicate as an alias.

- Jonathan M Davis

Jun 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 6/4/2013 11:55 AM, Jonathan M Davis wrote:
 Well, I'd expect it to be compress!lzw(), but in any case, what it buys you is
 that you can pass the algorithm around without caring what it is so that while
 code higher up on the stack may have to know that it's lzw, code deeper down
 doesn't have to care what type of algorithm it's using. Now, whether that
 flexibility is all that useful in this particular case, I don't know, but it
 _does_ help with generic code. It's like how a lot of std.algorithm takes its
 predicate as an alias.

There is zero utility in this:

auto compress(alias dg)()
{
     return dg();
}

Not even for generic code.

Jun 04 2013

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Tuesday, June 04, 2013 13:15:07 Walter Bright wrote:
 On 6/4/2013 11:55 AM, Jonathan M Davis wrote:
 Well, I'd expect it to be compress!lzw(), but in any case, what it buys
 you is that you can pass the algorithm around without caring what it is
 so that while code higher up on the stack may have to know that it's lzw,
 code deeper down doesn't have to care what type of algorithm it's using.
 Now, whether that flexibility is all that useful in this particular case,
 I don't know, but it _does_ help with generic code. It's like how a lot
 of std.algorithm takes its predicate as an alias.

 
 There is zero utility in this:
 
 auto compress(alias dg)()
 {
      return dg();
 }
 
 Not even for generic code.

If that's all it's doing, then no, it wouldn't be useful to pass it as an 
argument. I was just pointing out that there are plenty of cases where passing 
functions to generic algorithms is an improvement. I haven't looked at what 
you've done yet, so I can't really comment on the details of this particular 
case.

- Jonathan M Davis

Jun 04 2013

Byron Heads <byron.heads gmail.com> writes:

On Tue, 04 Jun 2013 13:15:07 -0700, Walter Bright wrote:

 On 6/4/2013 11:55 AM, Jonathan M Davis wrote:
 Well, I'd expect it to be compress!lzw(), but in any case, what it buys
 you is that you can pass the algorithm around without caring what it is
 so that while code higher up on the stack may have to know that it's
 lzw, code deeper down doesn't have to care what type of algorithm it's
 using. Now, whether that flexibility is all that useful in this
 particular case, I don't know, but it _does_ help with generic code.
 It's like how a lot of std.algorithm takes its predicate as an alias.

 
 There is zero utility in this:
 
 auto compress(alias dg)()
 {
      return dg();
 }
 
 Not even for generic code.

but a compress interface would be nice:

interface Compress
{
    ubyte[] compress(ubyte[]);
    ubyte[] uncompress(ubyte[]);
}

that way you can use any compress algorithm
bool send(Compress)(Socket sock);

Jun 04 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

05-Jun-2013 00:30, Byron Heads wrote:
 On Tue, 04 Jun 2013 13:15:07 -0700, Walter Bright wrote:

 On 6/4/2013 11:55 AM, Jonathan M Davis wrote:
 Well, I'd expect it to be compress!lzw(), but in any case, what it buys
 you is that you can pass the algorithm around without caring what it is
 so that while code higher up on the stack may have to know that it's
 lzw, code deeper down doesn't have to care what type of algorithm it's
 using. Now, whether that flexibility is all that useful in this
 particular case, I don't know, but it _does_ help with generic code.
 It's like how a lot of std.algorithm takes its predicate as an alias.

 There is zero utility in this:

 auto compress(alias dg)()
 {
       return dg();
 }

 Not even for generic code.

 but a compress interface would be nice:

 interface Compress
 {
      ubyte[] compress(ubyte[]);
      ubyte[] uncompress(ubyte[]);
 }

 that way you can use any compress algorithm
 bool send(Compress)(Socket sock);

It's already a range and thus composable. Range-based I/O, though, is 
something to come some time in the near future (Steve?).

-- 
Dmitry Olshansky

Jun 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 6/4/2013 1:30 PM, Byron Heads wrote:
 but a compress interface would be nice:

 interface Compress
 {
      ubyte[] compress(ubyte[]);
      ubyte[] uncompress(ubyte[]);
 }

 that way you can use any compress algorithm
 bool send(Compress)(Socket sock);

That isn't how ranges work. Ranges already define an input and an output 
interface. We don't need to invent another scheme.

Jun 04 2013

"John Colvin" <john.loughran.colvin gmail.com> writes:

On Tuesday, 4 June 2013 at 18:46:49 UTC, Walter Bright wrote:
 On 6/4/2013 11:43 AM, Timothee Cour wrote:
 writing generic code.
 same reason as why we prefer:
 auto y=to!double(x) over auto y=to_double(x);

 The situations aren't comparable. The to!double case is 
 parameterizing with a type, the compress one is not. Secondly, 
 compress(lzw) does ABSOLUTELY NOTHING but turn around and call 
 lzw. It adds nothing.


Currently. However, compress could become more feature-rich in 
the future. Perhaps there's some scope for automatic 
algorithm/parameter selection based on the type and length (if 
available) of what gets passed.

Jun 04 2013

"Peter Alexander" <peter.alexander.au gmail.com> writes:

On Tuesday, 4 June 2013 at 19:00:35 UTC, John Colvin wrote:
 On Tuesday, 4 June 2013 at 18:46:49 UTC, Walter Bright wrote:
 On 6/4/2013 11:43 AM, Timothee Cour wrote:
 writing generic code.
 same reason as why we prefer:
 auto y=to!double(x) over auto y=to_double(x);

 The situations aren't comparable. The to!double case is 
 parameterizing with a type, the compress one is not. Secondly, 
 compress(lzw) does ABSOLUTELY NOTHING but turn around and call 
 lzw. It adds nothing.


 Currently. However, compress could become more feature-rich in 
 the future. Perhaps there's some scope for automatic 
 algorithm/parameter selection based on the type and length(if 
 available) of what gets passed.

I think this is over-engineering. It's unlikely that an 
application will need to support multiple compression algorithms 
in the same piece of code, and even if it did, it would be 
trivial to implement this on top of the simple interface that 
Walter is using.

Jun 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 6/4/2013 12:41 PM, Peter Alexander wrote:
 I think this is over-engineering. It's unlikely that an application will need
 to support multiple compression algorithms in the same piece of code, and even
 if it did, it would be trivial to implement this on top of the simple
 interface that Walter is using.

Yup. My experience with abstractions that have no use cases is all the wrong 
things get abstracted. And by my experience, I include every one I've seen 
other people write as well as my own.

My favorite is windows.h. It was originally written for 16 bit Windows, and had 
all kinds of abstractions to make it portable for a future 32 bit Windows. 
Unfortunately, apparently nobody working on windows.h had any experience with 
32 bit code, and the abstractions turned out to be all wrong.

Jun 04 2013

Paulo Pinto <pjmlp progtools.org> writes:

On 04.06.2013 22:20, Walter Bright wrote:
 On 6/4/2013 12:41 PM, Peter Alexander wrote:
 I think this is over-engineering. It's unlikely that an application
 will need to
 support multiple compression algorithms in the same piece of code, and
 even if
 it did, it would be trivial to implement this on top of the simple
 interface
 that Walter is using.

 Yup. My experience with abstractions that have no use cases is all the
 wrong things get abstracted. And by my experience, I include every one
 I've seen other people write as well as my own.

 My favorite is windows.h. It was originally written for 16 bit Windows,
 and had all kinds of abstractions to make it portable for a future 32
 bit Windows. Unfortunately, apparently nobody working on windows.h had
 any experience with 32 bit code, and the abstractions turned out to be
 all wrong.

Yep, it brings back some memories.

Jun 05 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 6/4/13 2:46 PM, Walter Bright wrote:
 On 6/4/2013 11:43 AM, Timothee Cour wrote:
 writing generic code.
 same reason as why we prefer:
 auto y=to!double(x) over auto y=to_double(x);

 The situations aren't comparable. The to!double case is parameterizing
 with a type, the compress one is not. Secondly, compress(lzw) does
 ABSOLUTELY NOTHING but turn around and call lzw. It adds nothing.

Not absolutely nothing. Almost nothing. The distinction is important.

Andrei

Jun 04 2013

"Max Samukha" <maxsamukha gmail.com> writes:

On Tuesday, 4 June 2013 at 18:46:49 UTC, Walter Bright wrote:
 On 6/4/2013 11:43 AM, Timothee Cour wrote:
 writing generic code.
 same reason as why we prefer:
 auto y=to!double(x) over auto y=to_double(x);

 The situations aren't comparable. The to!double case is 
 parameterizing with a type, the compress one is not. Secondly, 
 compress(lzw) does ABSOLUTELY NOTHING but turn around and call 
 lzw. It adds nothing.

That "absolutely" based on limited personal experience is D's 
biggest problem.

Jun 04 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 6/5/13 12:44 AM, Max Samukha wrote:
 On Tuesday, 4 June 2013 at 18:46:49 UTC, Walter Bright wrote:
 On 6/4/2013 11:43 AM, Timothee Cour wrote:
 writing generic code.
 same reason as why we prefer:
 auto y=to!double(x) over auto y=to_double(x);

 The situations aren't comparable. The to!double case is parameterizing
 with a type, the compress one is not. Secondly, compress(lzw) does
 ABSOLUTELY NOTHING but turn around and call lzw. It adds nothing.

 That "absolutely" based on limited personal experience is the biggest
 D's problem.

It's a point, but "biggest" is also kind of too much and based on 
limited personal experience :o).

Andrei

Jun 04 2013

"Max Samukha" <maxsamukha gmail.com> writes:

On Wednesday, 5 June 2013 at 04:54:46 UTC, Andrei Alexandrescu 
wrote:
 On 6/5/13 12:44 AM, Max Samukha wrote:
 On Tuesday, 4 June 2013 at 18:46:49 UTC, Walter Bright wrote:
 On 6/4/2013 11:43 AM, Timothee Cour wrote:
 writing generic code.
 same reason as why we prefer:
 auto y=to!double(x) over auto y=to_double(x);

 The situations aren't comparable. The to!double case is 
 parameterizing
 with a type, the compress one is not. Secondly, compress(lzw) 
 does
 ABSOLUTELY NOTHING but turn around and call lzw. It adds 
 nothing.

 That "absolutely" based on limited personal experience is the 
 biggest
 D's problem.

 It's a point, but "biggest" is also kind of too much and based 
 on limited personal experience :o).

 Andrei

Yeah, I noticed that.

Jun 04 2013

"Zach the Mystic" <reachzach gggggmail.com> writes:

On Wednesday, 5 June 2013 at 04:54:46 UTC, Andrei Alexandrescu 
wrote:
 That "absolutely" based on limited personal experience is the 
 biggest
 D's problem.

 It's a point, but "biggest" is also kind of too much and based 
 on limited personal experience :o).

 Andrei

Hey, if you ever need someone who can reliably answer with 
limited personal experience, I'm available. :-)

Jun 05 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 6/4/2013 9:44 PM, Max Samukha wrote:
 On Tuesday, 4 June 2013 at 18:46:49 UTC, Walter Bright wrote:
 On 6/4/2013 11:43 AM, Timothee Cour wrote:
 writing generic code.
 same reason as why we prefer:
 auto y=to!double(x) over auto y=to_double(x);

 The situations aren't comparable. The to!double case is parameterizing with a
 type, the compress one is not. Secondly, compress(lzw) does ABSOLUTELY NOTHING
 but turn around and call lzw. It adds nothing.

 That "absolutely" based on limited personal experience is the biggest D's
 problem.

I've seen an awful lot of abstractions over the years that provided zero value.

You need to provide a compelling use case to justify another layer of 
complexity. "generic code" is not a compelling use case. It's already generic.

Note how these components are to be used:

     src.lzwCompress.copy(dst);

Your proposal is:

     src.compress(lzw).copy(dst);

I.e. zero value, as so far all compress() does is call lzw().

The whole point of range-based pipeline programming is you can just plug in 
different components. There is no demonstrated use case for adding another 
layer.

I am actually wrong in saying it has zero value. It has negative value :-)
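
The pipeline style described in this post can be sketched with stand-in components. lzwCompress is not used here, since it exists only in the proposed branch; invertBits and dropZeros are hypothetical lazy components built on std.algorithm, and copy is the real std.algorithm function.

```d
import std.algorithm : copy, filter, map;

// Hypothetical stand-ins for lazy components such as lzwCompress.
auto invertBits(R)(R src) { return src.map!(b => cast(ubyte)(b ^ 0xFF)); }
auto dropZeros(R)(R src) { return src.filter!(b => b != 0); }

void main()
{
    ubyte[] src = [1, 0, 2, 0, 3];
    auto dst = new ubyte[](3);

    // Components are chained with UFCS; nothing runs until copy pulls data.
    auto rest = src.dropZeros.invertBits.copy(dst);

    assert(rest.length == 0);          // dst was filled completely
    assert(dst == [254, 253, 252]);
}
```

Swapping a component means changing one name in the chain, which is the property the post relies on.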

Jun 04 2013

"Max Samukha" <maxsamukha gmail.com> writes:

On Wednesday, 5 June 2013 at 06:18:54 UTC, Walter Bright wrote:
 On 6/4/2013 9:44 PM, Max Samukha wrote:
 On Tuesday, 4 June 2013 at 18:46:49 UTC, Walter Bright wrote:
 On 6/4/2013 11:43 AM, Timothee Cour wrote:
 writing generic code.
 same reason as why we prefer:
 auto y=to!double(x) over auto y=to_double(x);

 The situations aren't comparable. The to!double case is 
 parameterizing with a
 type, the compress one is not. Secondly, compress(lzw) does 
 ABSOLUTELY NOTHING
 but turn around and call lzw. It adds nothing.

 That "absolutely" based on limited personal experience is the 
 biggest D's problem.

 I've seen an awful lot of abstractions over the years that 
 provided zero value.

I understand. But I've also seen a lot of abstractions over the 
years that seemed useless initially but were discovered to be 
extremely useful later (Bayes' theorem is an example - it took 300 
years to find a concrete use for it). So "a compelling use case" 
is not a sufficient criterion for evaluating the usefulness of 
abstractions.

 You need to provide a compelling use case to justify another 
 layer of complexity. "generic code" is not a compelling use 
 case. It's already generic.

 Note how these components are to be used:

     src.lzwCompress.copy(dst);

 Your proposal is:

     src.compress(lzw).copy(dst);

 I.e. zero value, as so far all compress() does is call lzw().

That's not my proposal. Honestly I didn't even take a close look 
at it. I just felt like it was time to attack you - you did give 
explicit permission for casual trolling.

 The whole point of range-based pipeline programming is you can 
 just plug in different components. There is no demonstrated use 
 case for adding another layer.

Ok.

 I am actually wrong in saying it has zero value. It has 
 negative value :-)

In this particular case, maybe.

Jun 04 2013

Timothee Cour <thelastmammoth gmail.com> writes:

What I suggested in my original post didn't involve any
indirection/abstraction; simply a renaming to be consistent with existing
zlib (see my points A+B in my 1st post on this thread):

std.compress.zlib.compress
std.compress.zlib.uncompress
std.compress.lzw.compress
std.compress.lzw.uncompress

same reason we have: std.file.write, std.stdio.write, etc, and not
std.fileWrite, std.stdioWrite.

On Tue, Jun 4, 2013 at 11:18 PM, Walter Bright
<newshound2 digitalmars.com> wrote:

 On 6/4/2013 9:44 PM, Max Samukha wrote:

 On Tuesday, 4 June 2013 at 18:46:49 UTC, Walter Bright wrote:

 On 6/4/2013 11:43 AM, Timothee Cour wrote:

 writing generic code.
 same reason as why we prefer:
 auto y=to!double(x) over auto y=to_double(x);

 The situations aren't comparable. The to!double case is parameterizing
 with a
 type, the compress one is not. Secondly, compress(lzw) does ABSOLUTELY
 NOTHING
 but turn around and call lzw. It adds nothing.

 That "absolutely" based on limited personal experience is the biggest D's
 problem.

 I've seen an awful lot of abstractions over the years that provided zero
 value.

 You need to provide a compelling use case to justify another layer of
 complexity. "generic code" is not a compelling use case. It's already
 generic.

 Note how these components are to be used:

     src.lzwCompress.copy(dst);

 Your proposal is:

     src.compress(lzw).copy(dst);

 I.e. zero value, as so far all compress() does is call lzw().

 The whole point of range-based pipeline programming is you can just plug
 in different components. There is no demonstrated use case for adding
 another layer.

 I am actually wrong in saying it has zero value. It has negative value :-)

Jun 04 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 6/5/13 2:55 AM, Timothee Cour wrote:
 What I suggested in my original post didn't involve any
 indirection/abstraction; simply a renaming to be consistent with
 existing zlib (see my points A+B in my 1st post on this thread):

 std.compress.zlib.compress
 std.compress.zlib.uncompress
 std.compress.lzw.compress
 std.compress.lzw.uncompress

I think that's nice.

Andrei

Jun 05 2013

"David Nadlinger" <code klickverbot.at> writes:

On Wednesday, 5 June 2013 at 12:55:50 UTC, Andrei Alexandrescu
wrote:
 On 6/5/13 2:55 AM, Timothee Cour wrote:
 What I suggested in my original post didn't involve any
 indirection/abstraction; simply a renaming to be consistent 
 with
 existing zlib (see my points A+B in my 1st post on this 
 thread):

 std.compress.zlib.compress
 std.compress.zlib.uncompress
 std.compress.lzw.compress
 std.compress.lzw.uncompress

 I think that's nice.

+1. D has many powerful features for handling module namespacing 
(e.g. "import lzw = std.compress.lzw"), let's enable people to 
make use of them.
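
As a small sketch of the renamed-import feature mentioned here, using two existing Phobos modules that both define isAlpha (std.compress.lzw does not exist, so it is not used). The local names ascii and uni are arbitrary.

```d
import ascii = std.ascii;   // names are reachable only as ascii.xxx
import uni = std.uni;       // names are reachable only as uni.xxx

void main()
{
    // Both modules define isAlpha; the prefix selects one.
    assert(ascii.isAlpha('a'));
    assert(uni.isAlpha('a'));

    dchar eAcute = '\u00E9';
    assert(!ascii.isAlpha(eAcute));   // not an ASCII letter
    assert(uni.isAlpha(eAcute));      // but a Unicode letter
}
```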

David

Jun 05 2013

"Daniel Murphy" <yebblies nospamgmail.com> writes:

"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message 
news:koncgm$9f5$1 digitalmars.com...
 On 6/5/13 2:55 AM, Timothee Cour wrote:
 What I suggested in my original post didn't involve any
 indirection/abstraction; simply a renaming to be consistent with
 existing zlib (see my points A+B in my 1st post on this thread):

 std.compress.zlib.compress
 std.compress.zlib.uncompress
 std.compress.lzw.compress
 std.compress.lzw.uncompress

 I think that's nice.

 Andrei

This has the problem that you now can't import more than one compression 
module and still use UFCS.  The annoying one I keep hitting in Phobos is 
std.file.write vs std.stdio.write.  For range-based APIs it is a huge pita 
to have to switch away from UFCS.  I think xyzCompress is still pretty 
sweet, consistent, and completely fixes the problem.  It has the added 
benefit that you can tell which compression algorithm is being used without 
having to know what is imported.

I would not have a problem with each module providing both 'compress' and 
'xyzCompress', but that is against Phobos policy.

Jun 09 2013

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday, June 09, 2013 17:12:16 Daniel Murphy wrote:
 "Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message
 news:koncgm$9f5$1 digitalmars.com...
 
 On 6/5/13 2:55 AM, Timothee Cour wrote:
 What I suggested in my original post didn't involve any
 indirection/abstraction; simply a renaming to be consistent with
 existing zlib (see my points A+B in my 1st post on this thread):
 
 std.compress.zlib.compress
 std.compress.zlib.uncompress
 std.compress.lzw.compress
 std.compress.lzw.uncompress

 
 I think that's nice.
 
 Andrei

 
 This has the problem that you now can't import more than one compression
 module and still use ufcs.  The annoying one I keep hitting in phobos is
 std.file.write vs std.stdio.write.  For range-based APIs it is a huge pita
 to have to switch away from ufcs.  I think xyzCompress is still pretty
 sweet, consistent, and completely fixes the problem.  It has the added
 benefit that you can tell which compression algorithm is being used without
 having to know what is imported.

That can be fixed by using a local alias, but it's true that it's an extra 
annoyance.
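
A sketch of the alias workaround, using std.ascii and std.uni as the conflicting pair because both exist and both define isAlpha. It assumes the alias is declared at module scope; the choice of std.ascii as the winner is arbitrary.

```d
import std.ascii;
import std.uni;

// With both modules imported, a bare isAlpha call would be ambiguous.
// A module-scope alias picks one and takes precedence over the imports.
alias isAlpha = std.ascii.isAlpha;

void main()
{
    dchar eAcute = '\u00E9';
    assert('a'.isAlpha);               // UFCS call resolves to std.ascii.isAlpha
    assert(!eAcute.isAlpha);           // so a non-ASCII letter is rejected
    assert(std.uni.isAlpha(eAcute));   // the other one stays reachable by full name
}
```

The cost is one extra line per conflicting name in every module that hits the clash.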

- Jonathan M Davis

Jun 09 2013

Timothee Cour <thelastmammoth gmail.com> writes:

On Sun, Jun 9, 2013 at 12:53 AM, Jonathan M Davis <jmdavisProg gmx.com> wrote:

 On Sunday, June 09, 2013 17:12:16 Daniel Murphy wrote:
 "Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message
 news:koncgm$9f5$1 digitalmars.com...

 On 6/5/13 2:55 AM, Timothee Cour wrote:
 What I suggested in my original post didn't involve any
 indirection/abstraction; simply a renaming to be consistent with
 existing zlib (see my points A+B in my 1st post on this thread):

 std.compress.zlib.compress
 std.compress.zlib.uncompress
 std.compress.lzw.compress
 std.compress.lzw.uncompress

 I think that's nice.

 Andrei

 This has the problem that you now can't import more than one compression
 module and still use ufcs.  The annoying one I keep hitting in phobos is
 std.file.write vs std.stdio.write.  For range-based APIs it is a huge pita
 to have to switch away from ufcs.  I think xyzCompress is still pretty
 sweet, consistent, and completely fixes the problem.  It has the added
 benefit that you can tell which compression algorithm is being used without
 having to know what is imported.

 That can be fixed by using a local alias, but it's true that it's an extra
 annoyance.

 - Jonathan M Davis

which is why I have suggested supporting UFCS with fully qualified function
names:

auto a="".(std.path.join)("\n");
myfile.(std.file.write)(text);
text.(std.stdio.write);

see post: support UFCS with fully qualified function names (was in
"digitalmars.D.learn")
http://forum.dlang.org/post/mailman.1453.1369099708.4724.digitalmars-d puremagic.com

it also helps searchability: if one uses local aliases such as import
std.stdio:write2=write, naive searching via grep 'write(' will miss such
cases. The increase in complexity is minimal, and the feature makes sense
with the rest of the language.

Jun 09 2013

"Daniel Murphy" <yebblies nospamgmail.com> writes:

"Timothee Cour" <thelastmammoth gmail.com> wrote in message 
news:mailman.999.1370827257.13711.digitalmars-d puremagic.com...
 which is why I have suggested supporting UFCS with fully qualified 
 function
 names:

 auto a="".(std.path.join)("\n");
 myfile.(std.file.write)(text);
 text.(std.stdio.write);

 see post: support UFCS with fully qualified function names (was in
 "digitalmars.D.learn")
 http://forum.dlang.org/post/mailman.1453.1369099708.4724.digitalmars-d puremagic.com

I'm not a huge fan of this syntax.  If we were adding syntax, I would prefer 
a new operator with lower precedence than '.'

eg
auto a = "" -> std.path.join("\n");

But I'm not sure the problem is big enough to warrant new syntax.

 it also helps searchability: if one uses local aliases such as import
 std.stdio:write2=write, naive searching via grep 'write(' will miss such
 cases. The increase in complexity is minimal, and the feature makes sense
 with the rest of the language.

I agree, renamed imports make code harder to understand, and harder to 
refactor.

In this case we can prevent the problem simply by not giving functions generic 
names like 'compress'.  Ideally you should be able to import the entire 
standard library with no name conflicts.

Jun 09 2013

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Monday, June 10, 2013 11:44:56 Daniel Murphy wrote:
 In this case we can prevent the problem simply by not giving functions generic
 names like 'compress'.  Ideally you should be able to import the entire
 standard library with no name conflicts.

We've actually made the opposite choice when discussing this in the past. 
We've specifically gone for making functions which do the same thing in 
different modules having the same name (e.g. std.ascii and std.uni), which 
makes swapping one for the other easy and avoids having to come up with 
distinct names, though it does obviously create more naming conflicts when you 
try and mix and match such modules. I'd also point out that it's been argued 
that it's a failure of the module system if we're specifically trying to avoid 
having different modules have functions with the same name. It's the module 
system's job to differentiate such functions, and specifically avoiding naming 
stuff the same to avoid naming conflicts means that you're pretty much ignoring 
the module system.

So, the general approach has been to name functions differently when they do 
different things and name them the same when they do the same thing and then 
let the module system take care of differentiating between the two when you 
need to.

- Jonathan M Davis
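
To make the convention concrete, here is a minimal sketch using the std.ascii / std.uni pair mentioned above. Both modules define isAlpha, so with both imported an unqualified call would be ambiguous; the fully qualified name is one way to let the module system pick. The sample characters are illustrative choices, not taken from the thread.

```d
import std.ascii;
import std.uni;

void main()
{
    // With both modules imported, an unqualified isAlpha('a') would be
    // ambiguous, so the module name selects the implementation.
    assert(std.ascii.isAlpha('a'));
    assert(!std.ascii.isAlpha('\u00E9')); // U+00E9 (e-acute) is outside ASCII
    assert(std.uni.isAlpha('\u00E9'));    // but it is a Unicode letter
}
```

Swapping one module for the other is then a matter of changing the qualifier (or the import), since the function name is the same in both.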

Jun 09 2013

"deadalnix" <deadalnix gmail.com> writes:

On Monday, 10 June 2013 at 01:59:29 UTC, Jonathan M Davis wrote:
 On Monday, June 10, 2013 11:44:56 Daniel Murphy wrote:
 In this case we can prevent the problem simply by not giving 
 functions generic
 names like 'compress'.  Ideally you should be able to import 
 the entire
 standard library with no name conflicts.

 We've actually made the opposite choice when discussing this in 
 the past.
 We've specifically gone for making functions which do the same 
 thing in
 different modules having the same name (e.g. std.ascii and 
 std.uni), which
 makes swapping one for the other easy and avoids having to come 
 up with
 distinct names, though it does obviously create more naming 
 conflicts when you
 try and mix and match such modules. I'd also point out that 
 it's been argued
 that it's a failure of the module system if we're specifically 
 trying to avoid
 having different modules have functions with the same name. 
 It's the module
 system's job to differentiate such functions, and specifically 
 avoiding naming
 stuff the same to avoid naming conflicts means that you're 
 pretty much ignoring
 the module system.

 So, the general approach has been to name functions differently 
 when they do
 different things and name them the same when they do the same 
 thing and then
 let the module system take care of differentiating between the 
 two when you
 need to.

 - Jonathan M Davis

You are wise and speak the truth :P

Jun 09 2013

"Daniel Murphy" <yebblies nospamgmail.com> writes:

"Jonathan M Davis" <jmdavisProg gmx.com> wrote in message 
news:mailman.1001.1370829569.13711.digitalmars-d puremagic.com...
 On Monday, June 10, 2013 11:44:56 Daniel Murphy wrote:
 In this case we can prevent the problem simply by not giving functions 
 generic
 names like 'compress'.  Ideally you should be able to import the entire
 standard library with no name conflicts.

 We've actually made the opposite choice when discussing this in the past.
 We've specifically gone for making functions which do the same thing in
 different modules having the same name (e.g. std.ascii and std.uni), which
 makes swapping one for the other easy and avoids having to come up with
 distinct names, though it does obviously create more naming conflicts when 
 you
 try and mix and match such modules. I'd also point out that it's been 
 argued
 that it's a failure of the module system if we're specifically trying to 
 avoid
 having different modules have functions with the same name. It's the 
 module
 system's job to differentiate such functions, and specifically avoiding 
 naming
 stuff the same to avoid naming conflicts means that you're pretty much 
 ignoring
 the module system.

 So, the general approach has been to name functions differently when they 
 do
 different things and name them the same when they do the same thing and 
 then
 let the module system take care of differentiating between the two when 
 you
 need to.

 - Jonathan M Davis

The difference here is these are range functions and you lose ufcs.  It 
doesn't make much difference unless you are trying to chain them.

Ranges, and call chaining of range-based functions using ufcs, are among the 
most attractive features of phobos.  Let's define a new general approach, 
and keep them conflict-free when possible.

Also, compress is a ridiculously general name for a function.

Jun 11 2013

"Jakob Ovrum" <jakobovrum gmail.com> writes:

On Tuesday, 11 June 2013 at 13:13:56 UTC, Daniel Murphy wrote:
 Also, compress is a ridiculously general name for a function.

We have module-level functions called "copy" (multiple), "read", 
"write", "map", etc. already, and it's not a bad thing!

It's OK because the full name is not "compress", but 
"std.compression.lz77.compress". This way, how specific the code 
wants to be depends on the user and the particular use-case, 
instead of one-size-fits-all alternatives like "lz77Compress". 
There's no redundancy in the name yet we still have the option to 
be pin-point specific (e.g. static import), and yes, we still get 
to use UFCS!

To eliminate the UFCS problem - which doesn't happen very often 
(how often do you want to use two different compression 
algorithms in the same unit?), we can (must?) use renamed symbols 
when importing.

Since any example using multiple "compress" functions would be 
contrived, I'll use an existing conflict - the case of "copy".

The following program backs up the specified files and writes a 
nicely formatted message to stdout (OK, so a tiny bit contrived):
----
void main(string[] args)
{
	import std.algorithm : chain, copy, joiner;
	import std.array : empty;
	import std.file : fileCopy = copy; // `fileCopy` is std.file.copy
	import std.stdio : stdout;

	auto fileNames = args[1 .. $];

	foreach(fileName; fileNames)
		fileName.fileCopy(fileName ~ ".bak");

	if(!fileNames.empty)
		"Backed up the following files: "
			.chain(fileNames.joiner(", "))
			.copy(stdout.lockingTextWriter());
}
----

By eliminating redundancies from symbol names, we empower the 
user, and the module system offers all the tools necessary to 
solve conflicts in a variety of ways.

Jun 11 2013

Timothee Cour <thelastmammoth gmail.com> writes:

On Tue, Jun 11, 2013 at 11:22 AM, Jakob Ovrum <jakobovrum gmail.com> wrote:

 On Tuesday, 11 June 2013 at 13:13:56 UTC, Daniel Murphy wrote:

 Also, compress is a ridiculously general name for a function.

 We have module-level functions called "copy" (multiple), "read", "write",
 "map", etc. already, and it's not a bad thing!

 It's OK because the full name is not "compress", but
 "std.compression.lz77.compress". This way, how specific the code wants to
 be depends on the user and the particular use-case, instead of
 one-size-fits-all alternatives like "lz77Compress". There's no redundancy
 in the name yet we still have the option to be pin-point specific (e.g.
 static import), and yes, we still get to use UFCS!

 To eliminate the UFCS problem - which doesn't happen very often (how often
 do you want to use two different compression algorithms in the same unit?),
 we can (must?) use renamed symbols when importing.

I have found a better way to do that: see
http://forum.dlang.org/post/mailman.1002.1370829729.13711.digitalmars-d-learn puremagic.com
subject: 'best way to handle UFCS with ambiguous names: using
std.typetuple.Alias!'
syntax: 'arg1.Alias!(std.file.write).arg2'*
see related discussion for reasoning. I'd like to push this as standard way
to deal with ambiguities.


 Since any example using multiple "compress" functions would be contrived,
 I'll use an existing conflict - the case of "copy".

 The following program backs up the specified files and writes a nicely
 formatted message to stdout (OK, so a tiny bit contrived):
 ----
 void main(string[] args)
 {
         import std.algorithm : chain, copy, joiner;
         import std.array : empty;
         import std.file : fileCopy = copy; // `fileCopy` is std.file.copy
         import std.stdio : stdout;

         auto fileNames = args[1 .. $];

         foreach(fileName; fileNames)
                 fileName.fileCopy(fileName ~ ".bak");

         if(!fileNames.empty)
                 "Backed up the following files: "
                         .chain(fileNames.joiner(", "))
                         .copy(stdout.lockingTextWriter());
 }
 ----

 By eliminating redundancies from symbol names, we empower the user, and
 the module system offers all the tools necessary to solve conflicts in a
 variety of ways.

Jun 11 2013

"Jakob Ovrum" <jakobovrum gmail.com> writes:

On Tuesday, 11 June 2013 at 18:43:45 UTC, Timothee Cour wrote:
 I have found a better way to do that: see
 http://forum.dlang.org/post/mailman.1002.1370829729.13711.digitalmars-d-learn puremagic.com
 subject: 'best way to handle UFCS with ambiguous names: using
 std.typetuple.Alias!'
 syntax: 'arg1.Alias!(std.file.write).arg2'*
 see related discussion for reasoning. I'd like to push this as 
 standard way
 to deal with ambiguities.

It's clearly an option, but I think it's too syntactically heavy, 
causing more harm than good (the idea of UFCS is, of course, 
readability!).

Since these conflicting symbols are in the minority for the vast 
majority of code units, I think renamed symbols are much, much 
better.

Jun 11 2013

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Tuesday, June 11, 2013 20:50:17 Jakob Ovrum wrote:
 On Tuesday, 11 June 2013 at 18:43:45 UTC, Timothee Cour wrote:
 I have found a better way to do that: see
 http://forum.dlang.org/post/mailman.1002.1370829729.13711.digitalmars-d-learn puremagic.com
 subject: 'best way to handle UFCS with ambiguous names: using
 std.typetuple.Alias!'
 syntax: 'arg1.Alias!(std.file.write).arg2'*
 see related discussion for reasoning. I'd like to push this as
 standard way
 to deal with ambiguities.

 
 It's clearly an option, but I think it's too syntactically heavy,
 causing more harm than good (the idea of UFCS is, of course,
 readability!).
 
 Since these conflicting symbols are in the minority for the vast
 majority of code units, I think renamed symbols are much, much
 better.

Agreed.

- Jonathan M Davis

Jun 11 2013

"Daniel Murphy" <yebblies nospamgmail.com> writes:

"Jakob Ovrum" <jakobovrum gmail.com> wrote in message 
news:fjmuuahorgbwkcvygnqq forum.dlang.org...
 On Tuesday, 11 June 2013 at 13:13:56 UTC, Daniel Murphy wrote:
 Also, compress is a ridiculously general name for a function.

 We have module-level functions called "copy" (multiple), "read", "write", 
 "map", etc. already, and it's not a bad thing!

It is.

 It's OK because the full name is not "compress", but 
 "std.compression.lz77.compress". This way, how specific the code wants to 
 be depends on the user and the particular use-case, instead of 
 one-size-fits-all alternatives like "lz77Compress". There's no redundancy 
 in the name yet we still have the option to be pin-point specific (e.g. 
 static import), and yes, we still get to use UFCS!

There is a reason we don't call every function in phobos 'process' and let 
the module name tell us what it actually does - when you see the name in 
your source code, it is easy to recognize what is being done.

 To eliminate the UFCS problem - which doesn't happen very often (how often 
 do you want to use two different compression algorithms in the same 
 unit?), we can (must?) use renamed symbols when importing.

My workplace has a fire extinguisher, but this doesn't mean lighting fires 
is a good idea.

I know we have the tools to disambiguate, but they come at a syntax and/or 
clarity cost.  Why create a problem when we don't have to?

 Since any example using multiple "compress" functions would be contrived, 
 I'll use an existing conflict - the case of "copy".

Eg. Code which implements http compression with support for multiple 
algorithms.

tl;dr We have great tools to disambiguate when we have to.  Let's not have 
to.

Jun 11 2013

"Jakob Ovrum" <jakobovrum gmail.com> writes:

On Tuesday, 11 June 2013 at 22:34:55 UTC, Daniel Murphy wrote:
 There is a reason we don't call every function in phobos 
 'process' and let
 the module name tell us what it actually does - when you see 
 the name in
 your source code, it is easy to recognize what is being done.

"copy", "write" and "compress" are perfectly recognizable names.

 tl;dr We have great tools to disambiguate when we have to.  
 Let's not have
 to.

The way I see it, you're asking that all code should pay for the 
benefit of a minority of cases. I'd choose the inverse.

Jun 12 2013

"Daniel Murphy" <yebblies nospamgmail.com> writes:

"Jakob Ovrum" <jakobovrum gmail.com> wrote in message 
news:sdgqfozqnysbnumynkvp forum.dlang.org...
 On Tuesday, 11 June 2013 at 22:34:55 UTC, Daniel Murphy wrote:
 There is a reason we don't call every function in phobos 'process' and 
 let
 the module name tell us what it actually does - when you see the name in
 your source code, it is easy to recognize what is being done.

 "copy", "write" and "compress" are perfectly recognizable names.

Ok, how exactly is the data compressed in the following snippet?  No 
scrolling up to the top of the module to see what's imported!

newdata = data.compress();

 tl;dr We have great tools to disambiguate when we have to.  Let's not 
 have
 to.

 The way I see it, you're asking that all code should pay for the benefit 
 of a minority of cases. I'd choose the inverse.

This is not a function that will be used every few lines.  Making the name a 
little longer for an increase in clarity is usually seen as a good idea.

Jun 13 2013

"Michal Minich" <michal.minich gmail.com> writes:

On Thursday, 13 June 2013 at 11:36:16 UTC, Daniel Murphy wrote:

 Ok, how exactly is the data compressed in the following 
 snippet?  No
 scrolling up to the top of the module to see what's imported!

 newdata = data.compress();

You can have that argument for any single overload and virtual 
call. At least you know it statically; with virtual you don't 
know until runtime... In many languages you would have interface 
ICompressor { Stream compress (Stream s) }...

Jun 13 2013

"Peter Alexander" <peter.alexander.au gmail.com> writes:

On Thursday, 13 June 2013 at 11:36:16 UTC, Daniel Murphy wrote:
 Ok, how exactly is the data compressed in the following 
 snippet?  No
 scrolling up to the top of the module to see what's imported!

 newdata = data.compress();

If it's not obvious from the context, just be explicit.

newdata = std.compression.lz77.compress(data);

Don't force verbosity on everyone just in case someone wants it.

Jun 13 2013

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

On 6/13/13, Peter Alexander <peter.alexander.au gmail.com> wrote:
 If it's not obvious from the context, just be explicit.

 newdata = std.compression.lz77.compress(data);

 Don't force verbosity on everyone just in case someone wants it.

What happens when we get std.compression.lz78 and you end up
accidentally calling compress with lz77 and expand with lz78?
Pseudocoding:

module deserialize;
import std.compression.lz77;
auto readfile(string filename) { return readFile(filename).expand; }

module serialize;
import std.compression.lz78;  // oops!
void writeFile(T)(T[] data, string filename) { writeFile(filename, data.compress); }

Imports are incredibly easy to screw up. But if we used types instead
of global modules then we could not only make our calling code clearer
(and less buggy), but it would also allow us to use package imports so
we can use any compression algorithm:

module deserialize;
import std.compression;  // package import, e.g. imports lz77, lz78, etc modules
auto readfile(string filename) { return filename.readFile.lz77.expand; }

module serialize;
import std.compression;  // package import
void writeFile(T)(T[] data, string filename) { data.lz77.compress.writeFile(filename); }

"lz77" would be an auto function which takes the buffer and returns a
Lz77 struct that has expand/compress methods.
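
A minimal sketch of that shape, under stated assumptions: std.compression.lz77 does not exist, so std.zlib stands in for the algorithm, and the names Zlib, zlib and expand are hypothetical, chosen only to mirror the proposal.

```d
import std.string : representation;
import std.zlib : zCompress = compress, zUncompress = uncompress;

/// Hypothetical wrapper type (not part of Phobos): it carries the buffer
/// and exposes the algorithm's two operations as methods.
struct Zlib
{
    const(ubyte)[] data;

    ubyte[] compress() const { return zCompress(data); }
    ubyte[] expand() const { return cast(ubyte[]) zUncompress(data); }
}

/// Hypothetical helper: names the algorithm at the call site.
Zlib zlib(const(ubyte)[] data) { return Zlib(data); }

void main()
{
    auto original = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa".representation;
    auto packed = original.zlib.compress;   // algorithm visible in the chain
    auto restored = packed.zlib.expand;
    assert(restored == original);
}
```

The call chain stays UFCS-friendly and spells out the algorithm, at the cost of one wrapper type and one helper function per algorithm.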

Jun 13 2013

"Peter Alexander" <peter.alexander.au gmail.com> writes:

On Thursday, 13 June 2013 at 13:15:03 UTC, Andrej Mitrovic wrote:
 What happens when we get std.compression.lz78 and you end up
 accidentally calling compress with lz77 and expand with lz78?

The exact same typo could happen with your structs. You haven't 
solved anything:


 module serialize;
 import std.compression;  // package import
 void writeFile(T)(T[] data, string filename) { data.lz77.compress.writeFile(filename); }

void writeFile(T)(T[] data, string filename) {
     data.lz78.compress.writeFile(filename);
}

oops!

Jun 13 2013

"David Nadlinger" <code klickverbot.at> writes:

On Thursday, 13 June 2013 at 13:15:03 UTC, Andrej Mitrovic wrote:
 What happens when we get std.compression.lz78 and you end up
 accidentally calling compress with lz77 and expand with lz78?
 […]
 Imports are incredibly easy to screw up.

I think this argument is invalid: A typo in an import statement 
is just as likely as in a function call.

David

Jun 13 2013

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

On 6/13/13, Andrej Mitrovic <andrej.mitrovich gmail.com> wrote:
 But if we used types instead of global modules

*global functions*

Jun 13 2013

"David Nadlinger" <code klickverbot.at> writes:

On Thursday, 13 June 2013 at 11:36:16 UTC, Daniel Murphy wrote:
 Ok, how exactly is the data compressed in the following 
 snippet?  No
 scrolling up to the top of the module to see what's imported!

I don't need to scroll to the top of the module, just a few lines 
up because I'm using function-local imports anyway. :P

If you want extra verbosity (which can be good *sometimes*), just 
write "import lz77 = std.compression.lz77" and you are good to go.

David
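
As a rough illustration of the renamed module import, here is a sketch that uses std.zlib as a stand-in, since std.compression.lz77 is only a proposed name in this thread. The sample data is made up for the example.

```d
import std.string : representation;
import zlib = std.zlib;   // the module's symbols are reachable only as zlib.xxx

void main()
{
    auto data = "abcabcabcabcabcabcabcabc".representation;

    // The algorithm is named at every call site.
    auto packed = zlib.compress(data);
    auto restored = cast(ubyte[]) zlib.uncompress(packed);

    assert(restored == data);
}
```

Note that the qualified form is a plain function call rather than a UFCS chain, which is the trade-off the rest of the thread is arguing about.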

Jun 13 2013

"Daniel Murphy" <yebblies nospamgmail.com> writes:

"David Nadlinger" <code klickverbot.at> wrote in message 
news:ahqzxzjbhmfiacwjgfkj forum.dlang.org...
 On Thursday, 13 June 2013 at 11:36:16 UTC, Daniel Murphy wrote:
 Ok, how exactly is the data compressed in the following snippet?  No
 scrolling up to the top of the module to see what's imported!

 I don't need to scroll to the top of the module, just a few lines up 
 because I'm using function-local imports anyway. :P

 If you want extra verbosity (which can be good *sometimes*), just write 
 "import lz77 = std.compression.lz77" and you are good to go.

I don't think 4 characters is a high price to pay for the added clarity. 
Then there is no ambiguity, no need to rename imports, no problems using 
ufcs.  Every time I see lz77Compress in anybody's code I know exactly what 
it does!

I understand the motivation for shortening function names that will be used 
frequently... but this is not in that category.

Jun 13 2013

"David Nadlinger" <code klickverbot.at> writes:

On Thursday, 13 June 2013 at 23:45:12 UTC, Daniel Murphy wrote:
 I don't think 4 characters is a high price to pay for the added 
 clarity.
 Then there is no ambiguity, no need to rename imports, no 
 problems using
 ufcs.  Every time I see lz77Compress in anybody's code I know 
 exactly what
 it does!

import std.compression : lz77Compress = lz78Compress;

;)

Jun 13 2013

"Daniel Murphy" <yebblies nospamgmail.com> writes:

"David Nadlinger" <code klickverbot.at> wrote in message 
news:gzniyhyeuhjturqffgan forum.dlang.org...
 On Thursday, 13 June 2013 at 23:45:12 UTC, Daniel Murphy wrote:
 I don't think 4 characters is a high price to pay for the added clarity.
 Then there is no ambiguity, no need to rename imports, no problems using
 ufcs.  Every time I see lz77Compress in anybody's code I know exactly 
 what
 it does!

 import std.compression : lz77Compress = lz78Compress;

 ;)

:(

Jun 13 2013

Jacob Carlborg <doob me.com> writes:

On 2013-06-14 01:53, David Nadlinger wrote:

 import std.compression : lz77Compress = lz78Compress;

 ;)

If you do that you only have yourself to blame. What if someone uses 
monkey patching and replaces all your functions at runtime?

-- 
/Jacob Carlborg

Jun 14 2013

"Peter Alexander" <peter.alexander.au gmail.com> writes:

On Thursday, 13 June 2013 at 23:45:12 UTC, Daniel Murphy wrote:
 I don't think 4 characters is a high price to pay for the added 
 clarity.
 Then there is no ambiguity, no need to rename imports, no 
 problems using
 ufcs.  Every time I see lz77Compress in anybody's code I know 
 exactly what
 it does!

I recommend you just use local imports if it bothers you that 
much, then it's obvious:

import std.compression.lz77;
auto newdata = compress(data);

Really, it should be obvious from the context which compression 
algorithm you are using.


 I understand the motivation for shortening function names that 
 will be used
 frequently... but this is not in that category.

This is not the motivation. The problem with lz77compress is that 
it is redundant:

std.compression.lz77.lz77compress

It's bad style to repeat the module name in module identifiers. 
It completely defeats the purpose of using modules as namespaces.

If all the compression algorithms were inside std.compression 
instead of having their own modules then yes, lz77compress would 
be a fantastic name, but they're not, so it's not.

Jun 14 2013

Timothee Cour <thelastmammoth gmail.com> writes:

ok I found what I think is the best solution to this problem :-)
see:
http://forum.dlang.org/post/mailman.1002.1370829729.13711.digitalmars-d-learn puremagic.com



On Sun, Jun 9, 2013 at 6:59 PM, Jonathan M Davis <jmdavisProg gmx.com> wrote:

 On Monday, June 10, 2013 11:44:56 Daniel Murphy wrote:
 In this case we can prevent the problem simply by not giving functions

 generic
 names like 'compress'.  Ideally you should be able to import the entire
 standard library with no name conflicts.

 We've actually made the opposite choice when discussing this in the past.
 We've specifically gone for making functions which do the same thing in
 different modules having the same name (e.g. std.ascii and std.uni), which
 makes swapping one for the other easy and avoids having to come up with
 distinct names, though it does obviously create more naming conflicts when
 you
 try and mix and match such modules. I'd also point out that it's been
 argued
 that it's a failure of the module system if we're specifically trying to
 avoid
 having different modules have functions with the same name. It's the module
 system's job to differentiate such functions, and specifically avoiding
 naming
 stuff the same to avoid naming conflicts means that you're pretty much
 ignoring
 the module system.

 So, the general approach has been to name functions differently when they
 do
 different things and name them the same when they do the same thing and
 then
 let the module system take care of differentiating between the two when you
 need to.

 - Jonathan M Davis

Jun 09 2013

"Daniel Murphy" <yebblies nospamgmail.com> writes:

"Timothee Cour" <thelastmammoth gmail.com> wrote in message 
news:mailman.1003.1370829991.13711.digitalmars-d puremagic.com...
 ok I found what I think is the best solution to this problem :-)
 see:
 http://forum.dlang.org/post/mailman.1002.1370829729.13711.digitalmars-d-learn puremagic.com

That's pretty awesome, but still much much much uglier than not having to 
disambiguate in the first place.

Jun 11 2013

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Tuesday, June 04, 2013 23:55:05 Timothee Cour wrote:
 What I suggested in my original post didn't involve any
 indirection/abstraction; simply a renaming to be consistent with existing
 zlib (see my points A+B in my 1st post on this thread):
 
 std.compress.zlib.compress
 std.compress.zlib.uncompress
 std.compress.lzw.compress
 std.compress.lzw.uncompress
 
 same reason we have: std.file.write, std.stdio.write, etc, and not
 std.fileWrite, std.stdioWrite.

So, you want to create whole modules for each compression algorithm? That 
seems like overkill to me. What Walter currently has isn't even 1000 lines 
long (and that's including the CircularBuffer helper struct). Splitting it up 
like that seems like over-modularization to me.

- Jonathan M Davis

Jun 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 6/4/2013 11:59 PM, Jonathan M Davis wrote:
 So, you want to create whole modules for each compression algorithm?

Yes.

 That seems like overkill to me. What Walter currently has isn't even 1000 lines
 long (and that's including the CircularBuffer helper struct). Splitting it up
 like that seems like over-modularization to me.

When two modules have nothing to do with each other, they should be in separate 
modules.

Jun 05 2013

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Wednesday, June 05, 2013 00:14:59 Walter Bright wrote:
 When two modules have nothing to do with each other, they should be in
 separate modules.

Except that they're all compression algorithms, so they _are_ related. Having 
modules that are only a few hundred lines long is very counterproductive IMHO. 
It's highly annoying how Java insists on splitting everything up into different 
files. You end up with a lot of small files to wade through. Fortunately, D 
doesn't force that, and I don't think that we should go that route by choice. 
There's no more reason to split all of these up than there is to put each 
algorithm in std.algorithm in its own module. And yes, I know that you like 
that idea, but it seems ridiculous to me to try and have only one or two 
functions per module. We don't want them to be huge, but having them be very 
small is just as harmful IMHO.

- Jonathan M Davis

Jun 05 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 6/5/2013 12:29 AM, Jonathan M Davis wrote:
 On Wednesday, June 05, 2013 00:14:59 Walter Bright wrote:
 When two modules have nothing to do with each other, they should be in
 separate modules.

 Except that they're all compression algorithms, so they _are_ related.

No, they are not related. They don't share code, and it is unlikely more than 
one would be called in any particular use case.

Remember, module contents have private access to other parts of the module. This 
violates encapsulation when the parts are unrelated.
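
A small sketch of what module-level private access means in practice. The Counter type and the reset function are hypothetical names invented for this example; the point is only that `private` in D restricts access to the module, not to the struct.

```d
struct Counter
{
    private int count;          // "private" here means private to this module
    void bump() { ++count; }
}

// An unrelated free function in the same module can still reach `count`.
void reset(ref Counter c) { c.count = 0; }

void main()
{
    Counter c;
    c.bump();
    c.bump();
    assert(c.count == 2);   // main is in the same module, so this compiles too
    reset(c);
    assert(c.count == 0);
}
```

Code in a different module that imported this one could call bump and reset but could not touch count directly, which is why unrelated code sharing a module weakens encapsulation.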

 Having
 modules that are only a few hundred lines long is very counterproductive IMHO.

Why?

On the other hand, when you are trying to understand a module, having thousands 
of lines of things that have no connection to each other makes it difficult. It 
also makes debugging them harder than necessary.

 It's highly annoying how Java insists on splitting everything up into different
 files. You end up with a lot of small files to wade through.

Wade through for what? If you're having a problem with the lzw compressor, why 
would you find it more productive to wade through the huffman compressor to get 
to it?

 Fortunately, D
 doesn't force that, and I don't think that we should go that route by choice.
 There's no more reason to split all of these up than there is to put each
 algorithm in std.algorithm in its own module. And yes, I know that you like
 that idea, but it seems ridiculous to me to try and have only one or two
 functions per module. We don't want them to be huge, but having them be very
 small is just as harmful IMHO.

You need a better case as to why it is harmful.

I've spent many miserable hours trying to find a bug in a phobos module that is 
a zillion lines of code, trying to strip out what is not necessary to repro the 
problem.

I don't see what problem kitchen sink modules solve - my experience is that 
smaller, better contained abstractions are more productive than kitchen sinks.

Jun 05 2013

Jacob Carlborg <doob me.com> writes:

On 2013-06-05 08:59, Jonathan M Davis wrote:

 So, you want to create whole modules for each compression algorithm? That
 seems like overkill to me. What Walter currently has isn't even 1000 lines
 long (and that's including the CircularBuffer helper struct). Splitting it up
 like that seems like over-modularization to me.

The current modules in Phobos already contain too much. We shouldn't 
make the same mistake again.

-- 
/Jacob Carlborg

Jun 05 2013

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Wednesday, June 05, 2013 09:31:01 Jacob Carlborg wrote:
 On 2013-06-05 08:59, Jonathan M Davis wrote:
 So, you want to create whole modules for each compression algorithm? That
 seems like overkill to me. What Walter currently has isn't even 1000 lines
 long (and that's including the CircularBuffer helper struct). Splitting it
 up like that seems like over-modularization to me.

 
 The current modules in Phobos already contain too much. We shouldn't
 make the same mistake again.

Maybe some do, but many don't, and 1000 lines is _far_ from too much. If we 
start making modules that small, we're going to end up with tons of them to 
wade through to find anything.

- Jonathan M Davis

Jun 05 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 6/5/2013 12:38 AM, Jonathan M Davis wrote:
 Maybe some do, but many don't, and 1000 lines is _far_ from too much. If we
 start making modules that small, we're going to end up with tons of them to
 wade through to find anything.

1. It isn't any harder to find things in multiple files than in one file.

2. If there's a ton in one file, you have to wade through the ton to find what 
you're looking for.


Your argument has merit if you are using a floppy disk drive for storage, as 
floppies are agonizingly slow to read files off of. But that problem disappeared 
30 years ago.

(At an SD conference back in the 80's, I was on a compiler panel with the 
compiler guys from Microsoft, Borland, etc. We were each asked how our 
respective compilers worked on floppy systems. The guys would say "well, you set 
it up this way, configure it that way, juggle what goes on which floppy, and you 
can do it!" I was the third of five guys, and my response was:

"We charge $200 extra for the floppy disk development system, and ship you a 
hard disk with it."

That was the end of that discussion, and I never heard that question again.)

Jun 05 2013

"John Colvin" <john.loughran.colvin gmail.com> writes:

On Wednesday, 5 June 2013 at 08:11:14 UTC, Walter Bright wrote:
 On 6/5/2013 12:38 AM, Jonathan M Davis wrote:
 Maybe some do, but many don't, and 1000 lines is _far_ from 
 too much. If we
 start making modules that small, we're going to end up with 
 tons of them to
 wade through to find anything.

 1. It isn't any harder to find things in multiple files than in 
 one file.

Although I think you're right about having smaller modules, I 
generally find it easier to browse through a larger file than 
many smaller files.

Multiple files is ok if you know what you're looking for (grep) 
but when you're just trying to scan across a system to get a feel 
for how it's working, juggling many files is a real pita.

Jun 05 2013

"Diggory" <diggsey googlemail.com> writes:

On Wednesday, 5 June 2013 at 11:30:10 UTC, John Colvin wrote:
 On Wednesday, 5 June 2013 at 08:11:14 UTC, Walter Bright wrote:
 On 6/5/2013 12:38 AM, Jonathan M Davis wrote:
 Maybe some do, but many don't, and 1000 lines is _far_ from 
 too much. If we
 start making modules that small, we're going to end up with 
 tons of them to
 wade through to find anything.

 1. It isn't any harder to find things in multiple files than 
 in one file.

 Although I think you're right about having smaller modules, I 
 generally find it easier to browse through a larger file than 
 many smaller files.

 Multiple files is ok if you know what you're looking for (grep) 
 but when you're just trying to scan across a system to get a 
 feel for how it's working, juggling many files is a real pita.

Surely you would know which compression algorithm you wanted to 
change? If it's a general renaming or something not specific to a 
particular use then a file search is necessary anyway.

Jun 05 2013

"John Colvin" <john.loughran.colvin gmail.com> writes:

On Wednesday, 5 June 2013 at 11:57:19 UTC, Diggory wrote:
 On Wednesday, 5 June 2013 at 11:30:10 UTC, John Colvin wrote:
 On Wednesday, 5 June 2013 at 08:11:14 UTC, Walter Bright wrote:
 On 6/5/2013 12:38 AM, Jonathan M Davis wrote:
 Maybe some do, but many don't, and 1000 lines is _far_ from 
 too much. If we
 start making modules that small, we're going to end up with 
 tons of them to
 wade through to find anything.

 1. It isn't any harder to find things in multiple files than 
 in one file.

 Although I think you're right about having smaller modules, I 
 generally find it easier to browse through a larger file than 
 many smaller files.

 Multiple files is ok if you know what you're looking for 
 (grep) but when you're just trying to scan across a system to 
 get a feel for how it's working, juggling many files is a real 
 pita.

 Surely you would know which compression algorithm you wanted to 
 change? If it's a general renaming or something not specific to 
 a particular use then a file search is necessary anyway.

I was speaking more generally, about Phobos as a whole.

Jun 05 2013

"David Nadlinger" <code klickverbot.at> writes:

On Wednesday, 5 June 2013 at 11:30:10 UTC, John Colvin wrote:
 Although I think you're right about having smaller modules, I 
 generally find it easier to browse through a larger file than 
 many smaller files.

 Multiple files is ok if you know what you're looking for (grep) 
 but when you're just trying to scan across a system to get a 
 feel for how it's working, juggling many files is a real pita.

Use an editor with a file tree sidebar? Quite on the contrary, I 
find many files to be much preferable, because you automatically 
have "bookmarks" in the source to come back to, and having the 
functionality already grouped in manageable logical units saves 
you from inferring that structure again, as it is the case when 
scrolling through a huge file.

On a lighter note, if it's really a problem for you that module 
files are too small, what about just concatenating all the files 
in a given directory using a little shell magic? ;)

David

Jun 05 2013

"John Colvin" <john.loughran.colvin gmail.com> writes:

On Wednesday, 5 June 2013 at 14:17:43 UTC, David Nadlinger wrote:
 On Wednesday, 5 June 2013 at 11:30:10 UTC, John Colvin wrote:
 Although I think you're right about having smaller modules, I 
 generally find it easier to browse through a larger file than 
 many smaller files.

 Multiple files is ok if you know what you're looking for 
 (grep) but when you're just trying to scan across a system to 
 get a feel for how it's working, juggling many files is a real 
 pita.

 Use an editor with a file tree sidebar? Quite on the contrary, 
 I find many files to be much preferable, because you 
 automatically have "bookmarks" in the source to come back to, 
 and having the functionality already grouped in manageable 
 logical units saves you from inferring that structure again, as 
 it is the case when scrolling through a huge file.

 On a lighter note, if it's really a problem for you that module 
 files are too small, what about just concatenating all the 
 files in a given directory using a little shell magic? ;)

 David

Agreed.

To be honest, it's a trivial matter easily solved by a variety of 
tools, but I'm often just lazy and end up reading code with gedit 
or similar.

Jun 05 2013

Jacob Carlborg <doob me.com> writes:

On 2013-06-05 16:31, John Colvin wrote:

 Agreed.

 To be honest, it's a trivial matter easily solved by a variety of tools,
 but I'm often just lazy and end up reading code with gedit or similar.

Gedit has a file tree sidebar, at least as a plugin.

-- 
/Jacob Carlborg

Jun 09 2013

Jacob Carlborg <doob me.com> writes:

On 2013-06-05 09:38, Jonathan M Davis wrote:

 Maybe some do, but many don't, and 1000 lines is _far_ from too much. If we
 start making modules that small, we're going to end up with tons of them to
 wade through to find anything.

I completely agree with Walter, and he made my point a lot better than I 
could.

-- 
/Jacob Carlborg

Jun 05 2013

"Jakob Ovrum" <jakobovrum gmail.com> writes:

On Wednesday, 5 June 2013 at 07:39:12 UTC, Jonathan M Davis wrote:
 Maybe some do, but many don't, and 1000 lines is _far_ from too 
 much. If we
 start making modules that small, we're going to end up with 
 tons of them to
 wade through to find anything.

 - Jonathan M Davis

We have a standard library in disagreement with the language's 
encapsulation mechanics. The module/package system in D is almost 
ignored in Phobos (and that's probably why the package system 
still has all these little things needing ironing out). It seems 
to owe influence to typical C and C++ library structure, which is 
simply suboptimal in D's module system.

Third-party libraries tend to do a much better job at this. For 
example, Tango goes all out and embraces the package and module 
system, and the result is an extremely organized tree of modules 
with appropriate granularity. Code isn't hard to find because 
everything isn't just dumped into (bloated) blobs in a flat 
structure like in Phobos; it's organized into a tree. It seems 
like a no-brainer with the D language, and Phobos is the only D 
library I know that doesn't embrace this style of organization. 
The result is awful coupling throughout; with Phobos, we can't 
even write Hello World without pulling in half of the standard 
library.

It's not just about the actual dependencies a module has, but the 
perceived dependencies; important from a readability perspective. 
I know a lot of D programmers embrace selective imports when 
working with Phobos, because just seeing a plain import statement 
such as "import std.datetime;" tells you very little about what 
the importing module actually does, and it's harder to figure out 
exactly where unqualified symbols come from when reading the 
module's code.
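
For readers who have not used them, here is a minimal sketch of the selective imports mentioned above. The helper `clampTo` is a hypothetical name made up for this illustration; `min` and `max` are the real std.algorithm functions.

```d
// Selective import: only `min` and `max` enter this module's scope, so a
// reader can see at a glance where each unqualified symbol comes from.
import std.algorithm : min, max;

// Hypothetical helper, written only for this illustration.
int clampTo(int lo, int x, int hi)
{
    return max(lo, min(x, hi));
}

void main()
{
    assert(clampTo(0, 15, 10) == 10);
    assert(clampTo(0, -3, 10) == 0);
    assert(clampTo(0, 5, 10) == 5);
}
```

A plain `import std.algorithm;` would compile just as well here; the selective form only changes what the import line tells the reader.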

I think the programmer should have a choice of convenience versus 
readability/fine dependency management when importing. The 
current module system does a decent job at enabling this already, 
and it's bound to get better with improvements like DIP37. 
Scripts and certain application code may want to prioritize 
productivity over finely managed dependencies, while library code 
- especially the *standard* library! - should definitely aim for 
lean coupling that makes sense.

To that end, I think a lot of improvements can be made without 
breaking user code, but I'd be very much willing to see all kinds 
of breakage if it means we can get rid of the present standard 
library of substandard quality. The language may have been 
declared stable, but Phobos is in no laudable state.

Jun 05 2013

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Wednesday, June 05, 2013 14:02:37 Jakob Ovrum wrote:
 We have a standard library in disagreement with the language's
 encapsulation mechanics. The module/package system in D is almost
 ignored in Phobos (and that's probably why the package system
 still has all these little things needing ironing out). It seems
 to owe influence to typical C and C++ library structure, which is
 simply suboptimal in D's module system.

I honestly don't see how Phobos is in disagreement with the module system. No, 
it doesn't use hierarchy as much as it should, and there are a few modules 
that are overly large (like std.algorithm or std.datetime), but for the most 
part, I don't see any problem with its level of encapsulation. It's mainly 
just its organization which could have been better. My primary objection here 
is that it seems ridiculous to me to create lots of tiny modules. I hate how Java 
does that sort of thing, but there you're _forced_ to in many cases, whereas 
we have the opportunity to actually group things together in a single module 
where appropriate. And having whole modules with only one or two functions is 
way too small IMHO, and that seems to be what we're proposing here.

- Jonathan M Davis

Jun 05 2013

"Diggory" <diggsey googlemail.com> writes:

On Wednesday, 5 June 2013 at 17:21:01 UTC, Jonathan M Davis wrote:
 On Wednesday, June 05, 2013 14:02:37 Jakob Ovrum wrote:
 We have a standard library in disagreement with the language's
 encapsulation mechanics. The module/package system in D is 
 almost
 ignored in Phobos (and that's probably why the package system
 still has all these little things needing ironing out). It 
 seems
 to owe influence to typical C and C++ library structure, which 
 is
 simply suboptimal in D's module system.

 I honestly don't see how Phobos is in disagreement with the 
 module system. No,
 it doesn't use hierarchy as much as it should, and there are a 
 few modules
 that are overly large (like std.algorithm or std.datetime), but 
 for the most
 part, I don't see any problem with its level of encapsulation. 
 It's mainly
 just its organization which could have been better. My primary 
 objection here
 is that it seems ridiculous to me to create lots of tiny modules. 
 I hate how Java
 does that sort of thing, but there you're _forced_ to in many 
 cases, whereas
 we have the opportunity to actually group things together in a 
 single module
 where appropriate. And having whole modules with only one or 
 two functions is
 way too small IMHO, and that seems to be what we're proposing 
 here.

 - Jonathan M Davis

I agree that one or two functions per module is far too small, but I'm in 
favour of having only one or two top-level classes/structs per 
module (there will be exceptional cases, but in general).

For example:
std.regex - I think it would be better if each implementation had 
its own module, plus a separate module for the parts common to 
all of them. Importing std.regex would publicly import the lot 
using the new package system.

std.range - one module for the tests (i.e. isXXX and hasXXX), one module for 
the algorithms (i.e. retro, take, etc.) and one module for the class wrappers

std.datetime - split each class/struct into its own module; SysTime 
alone is ~8000 lines

Jun 05 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Wed, Jun 05, 2013 at 01:20:48PM -0400, Jonathan M Davis wrote:
 On Wednesday, June 05, 2013 14:02:37 Jakob Ovrum wrote:
 We have a standard library in disagreement with the language's
 encapsulation mechanics. The module/package system in D is almost
 ignored in Phobos (and that's probably why the package system
 still has all these little things needing ironing out). It seems
 to owe influence to typical C and C++ library structure, which is
 simply suboptimal in D's module system.

 
 I honestly don't see how Phobos is in disagreement with the module
 system. No, it doesn't use hierarchy as much as it should, and there
 are a few modules that are overly large (like std.algorithm or
 std.datetime), but for the most part, I don't see any problem with its
 level of encapsulation. It's mainly just its organization which could
 have been better. My primary objection here is that it seems
 ridiculous to me to create lots of tiny modules. I hate how Java does
 that sort of thing, but there you're _forced_ to in many cases,
 whereas we have the opportunity to actually group things together in a
 single module where appropriate. And having whole modules with only
 one or two functions is way too small IMHO, and that seems to be what
 we're proposing here.

[...]

As Andrei pointed out, I think we need to look at this not from a size
perspective (number of lines, number of functions, etc.), but from an
API perspective: do these functions/structs belong together, or are they
only marginally related? More precisely, if some user code uses function
X, is that code equally likely to also use Y? Are there common use cases
in which only Y is used, not X?

If the use of function X almost always implies the use of function Y
(and vice versa), then they belong in the same module. Otherwise, I'd
say they are candidates for splitting up.

If function X uses function Z, and function Y also uses function Z, but
the use of X does not necessarily imply the use of Y (and vice versa),
then I'd argue that X, Y, and Z should be in separate modules to
maximize reuse and reduce the amount of code you have to pull in (you
shouldn't be forced to pull in Y just because you use X which calls Z,
which Y happens to also call).

This may be a bit heavy-handed for user code, but for Phobos, the
standard library, I think the bar should be set higher. After all, one
of the stated goals of Phobos is that you shouldn't need to pull in a
whole ton of code just because you call a single function. Right now I
think we're a bit short of that goal.


T

-- 
All men are mortal. Socrates is mortal. Therefore all men are Socrates.

Jun 05 2013

"Peter Alexander" <peter.alexander.au gmail.com> writes:

On Wednesday, 5 June 2013 at 18:21:04 UTC, H. S. Teoh wrote:
 On Wed, Jun 05, 2013 at 01:20:48PM -0400, Jonathan M Davis 
 wrote:
 On Wednesday, June 05, 2013 14:02:37 Jakob Ovrum wrote:
 We have a standard library in disagreement with the 
 language's
 encapsulation mechanics. The module/package system in D is 
 almost
 ignored in Phobos (and that's probably why the package system
 still has all these little things needing ironing out). It 
 seems
 to owe influence to typical C and C++ library structure, 
 which is
 simply suboptimal in D's module system.

 
 I honestly don't see how Phobos is in disagreement with the 
 module
 system. No, it doesn't use hierarchy as much as it should, and 
 there
 are a few modules that are overly large (like std.algorithm or
 std.datetime), but for the most part, I don't see any problem 
 with its
 level of encapsulation. It's mainly just its organization 
 which could
 have been better. My primary objection here is that it seems
 ridiculous to me to create lots of tiny modules. I hate how Java 
 does
 that sort of thing, but there you're _forced_ to in many cases,
 whereas we have the opportunity to actually group things 
 together in a
 single module where appropriate. And having whole modules with 
 only
 one or two functions is way too small IMHO, and that seems to 
 be what
 we're proposing here.

 [...]

 As Andrei pointed out, I think we need to look at this not from 
 a size
 perspective (number of lines, number of functions, etc.), but 
 from an
 API perspective: do these functions/structs belong together, or 
 are they
 only marginally related? More precisely, if some user code uses 
 function
 X, is that code equally likely to also use Y? Are there common 
 use cases
 in which only Y is used, not X?

 If the use of function X almost always implies the use of 
 function Y
 (and vice versa), then they belong in the same module. 
 Otherwise, I'd
 say they are candidates for splitting up.

 If function X uses function Z, and function Y also uses 
 function Z, but
 the use of X does not necessarily imply the use of Y (and vice 
 versa),
 then I'd argue that X, Y, and Z should be in separate modules to
 maximize reuse and reduce the amount of code you have to pull 
 in (you
 shouldn't be forced to pull in Y just because you use X which 
 calls Z,
 which Y happens to also call).

 This may be a bit heavy-handed for user code, but for Phobos, 
 the
 standard library, I think the bar should be set higher. After 
 all, one
 of the stated goals of Phobos is that you shouldn't need to 
 pull in a
 whole ton of code just because you call a single function. 
 Right now I
 think we're a bit short of that goal.

Massive +1

Modules are for grouping functions/types that are commonly used 
together or have interdependencies, not for grouping things that 
are in a similar category (although these things can be related).

I don't care if levenshteinDistance is a "classic algorithm", I 
don't want to have to compile it every time I want to take the 
minimum of two numbers. Barely anyone is ever going to use it, so 
it should be off in a module on its own.

There's absolutely nothing wrong with having lots of small 
modules provided that you don't end up importing the same sets of 
modules over and over. There are numerous advantages:

1. Makes it easier to manage dependencies.
1a. reduces compile times.
1b. reduces binary size.
1c. benefits incremental and distributed/parallel compilation.
2. Makes version control easier as more files means merge 
conflicts are less likely.
3. Makes it easier to navigate files.

The only downside is that you may occasionally have to import 
more modules.

Jun 06 2013

"SomeDude" <lovelydear mailmetrash.com> writes:

On Thursday, 6 June 2013 at 14:26:51 UTC, Peter Alexander wrote:
 Modules are for grouping functions/types that are commonly used 
 together or have interdependencies, not for grouping things 
 that are in a similar category (although these things can be 
 related).

 I don't care if levenshteinDistance is a "classic algorithm", I 
 don't want to have to compile it every time I want to take the 
 minimum of two numbers. Barely anyone is ever going to use it, 
 so it should be off in a module on its own.

 There's absolutely nothing wrong with having lots of small 
 modules provided that you don't end up importing the same sets 
 of modules over and over. There are numerous advantages:

 1. Makes it easier to manage dependencies.
 1a. reduces compile times.
 1b. reduces binary size.
 1c. benefits incremental and distributed/parallel compilation.
 2. Makes version control easier as more files means merge 
 conflicts are less likely.
 3. Makes it easier to navigate files.

 The only downside is that you may occasionally have to import 
 more modules.

Wise words !

Jun 06 2013

"David Nadlinger" <code klickverbot.at> writes:

On Wednesday, 5 June 2013 at 07:00:14 UTC, Jonathan M Davis wrote:
 So, you want to create whole modules for each compression 
 algorithm? That
 seems like overkill to me. What Walter currently has isn't even 
 1000 lines
 long (and that's including the CircularBuffer helper struct). 
 Splitting it up
 like that seems like over-modularization to me.

Modules are the unit of encapsulation in D (private), so they 
should always be as small as possible.
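
A minimal sketch of the point about `private`, for anyone coming from C++ or Java: in D, `private` restricts access to the enclosing module, not to the enclosing type. `Counter` and `peek` are hypothetical names used only for this illustration.

```d
struct Counter
{
    private int count; // private to the whole module, not just this struct

    void increment() { ++count; }
}

// A free function in the same module may read `count` directly. From any
// other module the same access would be rejected by the compiler.
int peek(ref const(Counter) c)
{
    return c.count;
}

void main()
{
    Counter c;
    c.increment();
    c.increment();
    assert(peek(c) == 2);
}
```

So everything that shares a file shares access to each other's private state, which is the argument for keeping the file small.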

As Andrei would say: Destroyed?

David

Jun 05 2013

"SomeDude" <lovelydear mailmetrash.com> writes:

On Wednesday, 5 June 2013 at 07:00:14 UTC, Jonathan M Davis wrote:
 So, you want to create whole modules for each compression 
 algorithm? That
 seems like overkill to me. What Walter currently has isn't even 
 1000 lines
 long (and that's including the CircularBuffer helper struct). 
 Splitting it up
 like that seems like over-modularization to me.

 - Jonathan M Davis

Well, as the author of a 15,000-line datetime module, I think 
your opinion is a little biased.

*I* think 1,000 lines is a perfect size for a module.

Jun 05 2013

"Xiaoxi" <xiaoxi 163.com> writes:

On Wednesday, 5 June 2013 at 19:01:28 UTC, SomeDude wrote:
 On Wednesday, 5 June 2013 at 07:00:14 UTC, Jonathan M Davis 
 wrote:
 So, you want to create whole modules for each compression 
 algorithm? That
 seems like overkill to me. What Walter currently has isn't 
 even 1000 lines
 long (and that's including the CircularBuffer helper struct). 
 Splitting it up
 like that seems like over-modularization to me.

 - Jonathan M Davis

 Well, as the author of a 15,000 lines datetime module, I think 
 your opinion is a little biased.

 *I* think 1,000 lines is a perfect size for a module.

Is cross-module/cross-file inlining working on all D compilers? If 
not, bigger modules are better.

Jun 06 2013

"David Nadlinger" <code klickverbot.at> writes:

On Thursday, 6 June 2013 at 13:34:42 UTC, Xiaoxi wrote:
 Is cross-module/cross-file inlining working on all D compilers? If 
 not, bigger modules are better.

This is not at all relevant if either
  a) the functions in question are templates, as it is the case 
here
or
  b) the functions in a bigger module don't call each other 
anyway, such as in many kitchen-sink modules that just group 
vaguely related functionality together.

David

Jun 06 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 6/4/13 2:43 PM, Timothee Cour wrote:
     What is the improvement of typing:

         compress(lzw)

     over:

         lzwCompress()

     ?


 writing generic code.
 same reason as why we prefer:
 auto y=to!double(x) over auto y=to_double(x);

I think the application here is a bit more tenuous. It's natural to 
think of a type-parameterized algorithm that needs to!T. But it's more 
of a long shot to think of an algorithm statically parameterized on the 
compression method. That could definitely intervene, but it's not likely 
to be frequent; and if it's not, a mixin can always take care of it.
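
One way the mixin route could look, as a rough sketch only. The functions `copyCompress` and `xorCompress` are hypothetical placeholders for prefix-named functions such as `lzwCompress()`, and neither performs real compression; `compressWith` is likewise a made-up name.

```d
// Hypothetical prefix-named functions standing in for lzwCompress() and
// friends. Neither does real compression.
ubyte[] copyCompress(const(ubyte)[] data)
{
    return data.dup;
}

ubyte[] xorCompress(const(ubyte)[] data)
{
    auto result = data.dup;
    foreach (ref b; result)
        b ^= 0xFF;
    return result;
}

// Statically parameterized on the method name. The string mixin expands to
// a call such as xorCompress(data) at compile time.
ubyte[] compressWith(string method)(const(ubyte)[] data)
{
    return mixin(method ~ "Compress(data)");
}

void main()
{
    ubyte[] data = [0x0F, 0x00];
    ubyte[] flipped = [0xF0, 0xFF];
    assert(compressWith!"copy"(data) == data);
    assert(compressWith!"xor"(data) == flipped);
}
```

The generic caller pays for the string mixin, while everyone else keeps the plain function names.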

Andrei

Jun 04 2013

David <d dav1d.de> writes:

On 04.06.2013 18:09, Walter Bright wrote:
 On 6/4/2013 6:34 AM, Jacob Carlborg wrote:
 I'm wondering if (un)compress can take the compressing algorithm as a
 template
 parameter. Does that make sense?

 Something like:

 auto result = data.compress!(LZW);

 Then we could pass different compressing algorithms to the compress
 function.

 
 I don't see the point. Furthermore, it requires that the compress
 template know about all the compression algorithms available, which
 limits future expansion.
 

No, the compression type only has to provide a certain API.
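
A sketch of what "a certain API" could mean in practice, under the assumption that an algorithm is a type with a static `compress` function from bytes to bytes. `RunLength`, `isCompressor` and this `compress` front end are hypothetical and are not taken from the proposed std.compress.

```d
// Toy run-length encoder; a hypothetical stand-in, not code from std.compress.
struct RunLength
{
    // Encodes the input as (count, value) byte pairs.
    static ubyte[] compress(const(ubyte)[] data)
    {
        ubyte[] result;
        size_t i = 0;
        while (i < data.length)
        {
            ubyte count = 1;
            while (count < ubyte.max && i + count < data.length
                   && data[i + count] == data[i])
                ++count;
            result ~= count;
            result ~= data[i];
            i += count;
        }
        return result;
    }
}

// The only API requirement: a static compress() from bytes to bytes.
enum isCompressor(A) =
    is(typeof(A.compress((const(ubyte)[]).init)) : const(ubyte)[]);

// Generic front end; it never needs a list of the algorithms that exist.
ubyte[] compress(Algo)(const(ubyte)[] data)
if (isCompressor!Algo)
{
    return Algo.compress(data);
}

void main()
{
    ubyte[] data = [1, 1, 1, 2];
    ubyte[] expected = [3, 1, 1, 2];
    assert(data.compress!RunLength() == expected);
}
```

A new algorithm is added by writing another type that satisfies the constraint; the front end itself does not change.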

Jun 04 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 6/4/2013 9:34 AM, David wrote:
 On 04.06.2013 18:09, Walter Bright wrote:
 On 6/4/2013 6:34 AM, Jacob Carlborg wrote:
 I'm wondering if (un)compress can take the compressing algorithm as a
 template
 parameter. Does that make sense?

 Something like:

 auto result = data.compress!(LZW);

 Then we could pass different compressing algorithms to the compress
 function.

 I don't see the point. Furthermore, it requires that the compress
 template know about all the compression algorithms available, which
 limits future expansion.

 No, the compression type only has to provide a certain API.

Again, I'm not seeing the added value with this.

Jun 04 2013

Marco Leise <Marco.Leise gmx.de> writes:

On Mon, 03 Jun 2013 20:44:04 -0700,
Walter Bright <newshound2 digitalmars.com> wrote:

 Comments welcome.

LZW is a nice and fast general purpose algorithm and I
welcome its addition to Phobos to build file format readers
from it (MS-DOS compress, GIF, TIFF) or even just to compress
data on the fly in RAM. Most people seem to have moved on to
zlib though for pretty much anything else.

Actually I just happened to attempt something similar.
Influenced by your talk about modularity and bioinfornatic's
micro benchmarking with reading FASTA files I try to wrap up
the concepts of bit streams and algorithms processing them.
But some of my design goals are different:

a) Not-Invented-Here must take precedence. :D
b) There is no other measure than bytes/second.
c) Every algorithm must run in its own thread for maximal
   parallelism. (like Unix process piping)

So it is not about parallel algorithms, but building
processing pipelines that work like Unix where only circular
buffers need to be shared from one algorithm to the next.

On Mon, 3 Jun 2013 23:40:06 -0700,
Timothee Cour <thelastmammoth gmail.com> wrote:

 A)
 there already is std.zlib; why not have:
 std.compress.zlib: public import std.zlib
 std.compress.lzw: put this new module there instead of in std.compress
 std.compress.image.png
 std.compress.image.jpg

Yes and no. Compression algorithms should be in std.compress and share
the same API, but image file formats in std.image.* or std.fileformat.*.
You don't look into std.compress when you want to open *.bmps and *.jpgs.

On Tue, 04 Jun 2013 01:00:03 -0700,
Walter Bright <newshound2 digitalmars.com> wrote:

 On 6/3/2013 11:40 PM, Timothee Cour wrote:
 D)
 CircularBuffer belongs somewhere else; maybe std.range or std.container

 
 I have mixed feelings about that. If you'll notice, std.compress doesn't have 
 any imports! I wanted to make at least one module that doesn't pull in 100% of 
 everything in Phobos (one of my pet peeves).

I have nothing to add to the discussion on THAT matter, but
a compromise should be found between few massive imports (D)
and hundreds of tiny imports (Java). :)

-- 
Marco

Jun 04 2013

"Tiago Martinez" <tiago.martinez gmail.com> writes:

On Tuesday, 4 June 2013 at 03:44:05 UTC, Walter Bright wrote:
 https://github.com/WalterBright/phobos/blob/std_compress/std/compress.d

 I wrote this to add components to compress and expand ranges.

 Highlights:

 1. doesn't do any memory allocation
 2. can handle arbitrarily large sets of data
 3. it's lazy
 4. takes an InputRange, and outputs an InputRange

 Comments welcome.
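
To make highlights 1, 3 and 4 concrete, here is a rough sketch of a lazy, non-allocating input range. It is not the code from the linked branch: `RunLengthExpander` is a hypothetical toy decoder for (count, value) byte pairs, and for brevity it reads from a slice where the real module accepts any InputRange.

```d
import std.algorithm : equal;
import std.range : isInputRange;

// Hypothetical lazy decoder for (count, value) byte pairs. It only
// re-slices its input, so it performs no allocation of its own.
struct RunLengthExpander
{
    private const(ubyte)[] pairs; // remaining (count, value) pairs
    private size_t remaining;     // repeats left in the current pair

    this(const(ubyte)[] pairs)
    {
        this.pairs = pairs;
        loadRun();
    }

    @property bool empty() const { return pairs.length < 2; }

    @property ubyte front() const { return pairs[1]; }

    void popFront()
    {
        if (--remaining == 0)
        {
            pairs = pairs[2 .. $];
            loadRun();
        }
    }

    // Skips zero-length runs and loads the next repeat count.
    private void loadRun()
    {
        while (pairs.length >= 2 && pairs[0] == 0)
            pairs = pairs[2 .. $];
        remaining = pairs.length >= 2 ? pairs[0] : 0;
    }
}

static assert(isInputRange!RunLengthExpander);

void main()
{
    ubyte[] packed = [3, 7, 0, 9, 2, 5];
    assert(equal(RunLengthExpander(packed), [7, 7, 7, 5, 5]));
}
```

Nothing is decoded until a consumer such as `equal` asks for the next element, which is what lets this style handle arbitrarily large inputs.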

I may have misunderstood something, but the code does not
implement LZW (a variant of LZ78), but a variant of LZ77 (i.e.
deflate/ZIP).

See https://en.wikipedia.org/wiki/LZ77_and_LZ78

Jun 05 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

05-Jun-2013 16:16, Tiago Martinez wrote:
 On Tuesday, 4 June 2013 at 03:44:05 UTC, Walter Bright wrote:
 https://github.com/WalterBright/phobos/blob/std_compress/std/compress.d

 I wrote this to add components to compress and expand ranges.

 Highlights:

 1. doesn't do any memory allocation
 2. can handle arbitrarily large sets of data
 3. it's lazy
 4. takes an InputRange, and outputs an InputRange

 Comments welcome.

 I may have misunderstood something, but the code does not
 implement LZW (a variant of LZ78), but a variant of LZ77 (i.e.
 deflate/ZIP).

+1
I thought to chime in with this too, keywords are:
sliding window ===> LZ77
dictionary ===> LZW
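
To illustrate the "sliding window" keyword, here is a rough sketch of decoding LZ77-style tokens. It is not the code under review: `Token` and `decode` are hypothetical, and the window is simply the whole output so far. An LZW decoder would instead look codes up in an explicit dictionary that it builds as it goes.

```d
// One LZ77-style token: copy `length` bytes starting `offset` bytes back
// in the output produced so far, then append `literal`.
struct Token
{
    size_t offset;
    size_t length;
    ubyte literal;
}

ubyte[] decode(const(Token)[] tokens)
{
    ubyte[] output;
    foreach (t; tokens)
    {
        assert(t.offset <= output.length && (t.length == 0 || t.offset > 0),
               "malformed token");
        const start = output.length - t.offset;
        // Copy one byte at a time so a match may overlap the bytes it is
        // itself producing (length > offset).
        foreach (i; 0 .. t.length)
        {
            const b = output[start + i];
            output ~= b;
        }
        output ~= t.literal;
    }
    return output;
}

void main()
{
    auto tokens = [Token(0, 0, 1), Token(0, 0, 2), Token(2, 4, 3)];
    ubyte[] expected = [1, 2, 1, 2, 1, 2, 3];
    assert(decode(tokens) == expected);
}
```

The third token refers two bytes back and copies four, which only works because the reference points into recently produced output rather than into a code table.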

 See https://en.wikipedia.org/wiki/LZ77_and_LZ78


-- 
Dmitry Olshansky

Jun 05 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 6/5/2013 10:46 AM, Dmitry Olshansky wrote:
 05-Jun-2013 16:16, Tiago Martinez wrote:
 I may have misunderstood something, but the code does not
 implement LZW (a variant of LZ78), but a variant of LZ77 (i.e.
 deflate/ZIP).

 +1
 I thought to chime in with this too, keywords are:
 sliding window ===> LZ77
 dictionary ===> LZW

 See https://en.wikipedia.org/wiki/LZ77_and_LZ78


Thanks, you're both right.

Jun 05 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Wed, Jun 05, 2013 at 04:17:42PM +0200, David Nadlinger wrote:
 On Wednesday, 5 June 2013 at 11:30:10 UTC, John Colvin wrote:
Although I think you're right about having smaller modules, I
generally find it easier to browse through a larger file than many
smaller files.


On the contrary, I find extremely large files (like std.algorithm) very
hard to navigate, because it's a hodgepodge of only loosely-related
code, most of which is completely independent of the others. Which means
there's no logical ordering to the code, they're just in arbitrary
random order (and often not the same order they appear in the ddoc
index). The only way to find stuff in code like this is to use the
search function -- which is no different from looking up a different
file in a well-organized module directory hierarchy.


Multiple files is ok if you know what you're looking for (grep) but
when you're just trying to scan across a system to get a feel for how
it's working, juggling many files is a real pita.


Try scanning through std.algorithm and tell me whether you "get a feel
for how it's working". I tried doing that before, and got so lost 12%
into the file that I've even less clue about how it all fits together
than before I looked at the code. After the first 5 seconds or so, I'm
just randomly paging up/down without any idea of where I am code-wise.


 Use an editor with a file tree sidebar? Quite on the contrary, I find
 many files to be much preferable, because you automatically have
 "bookmarks" in the source to come back to, and having the
 functionality already grouped in manageable logical units saves you
 from inferring that structure again, as it is the case when scrolling
 through a huge file.

+1.


 On a lighter note, if it's really a problem for you that module
 files are too small, what about just concatenating all the files in
 a given directory using a little shell magic? ;)

cat std/compress/*.d > /tmp/src.d; vim /tmp/src.d

:)


On Wed, Jun 05, 2013 at 04:20:49PM +0200, David Nadlinger wrote:
 On Wednesday, 5 June 2013 at 12:55:50 UTC, Andrei Alexandrescu
 wrote:
On 6/5/13 2:55 AM, Timothee Cour wrote:
What I suggested in my original post didn't involve any
indirection/abstraction; simply a renaming to be consistent with
existing zlib (see my points A+B in my 1st post on this thread):

std.compress.zlib.compress
std.compress.zlib.uncompress
std.compress.lzw.compress
std.compress.lzw.uncompress

I think that's nice.

 
 +1. D has many powerful features for handling module namespacing
 (e.g. "import lzw = std.compress.lzw"), let's enable people to make
 use of them.
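
A small sketch of the renamed-import feature mentioned above. Since `std.compress.lzw` is only a proposal in this thread, the example uses the existing std.conv module instead.

```d
// Renamed import: every symbol must be qualified with the alias `conv`,
// so nothing is added to the local namespace.
import conv = std.conv;

void main()
{
    assert(conv.to!int("42") == 42);
    assert(conv.to!string(7) == "7");

    // The unqualified name is not in scope with a renamed import.
    static assert(!__traits(compiles, to!int("42")));
}
```

With the proposed layout, the same pattern would read `lzw.compress(...)` and `zlib.compress(...)` at the call site.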

[...]

+1. Being D's standard library, Phobos really should be the standard
example of how module namespacing should work. Right now it's just
promulgating the bad practice of throwing a bunch of unrelated (or only
loosely related) code in to giant monolithic files. C'mon, guys, this
isn't 1975. We *have* tools for managing hierarchies of smallish files.
There's no compelling reason why we have to stick to monolithic module
design (or lack thereof) anymore.

The biggest advantage of small modules is that code that doesn't depend
on each other will not be lumped together in the same file. Why should
they be? If you only use function X, why should the compiler do extra
unnecessary work in parsing and compiling function Y, just because we
arbitrarily lumped X and Y together for aesthetic (or whatever) reasons?
Perhaps Phobos will be more palatable to the naysayers if using a single
function doesn't, e.g., pull in a 5000-line std.algorithm.

(Actually, std.algorithm currently sits at 11636 lines. I call BS on
whoever claims to be able to "skim over" std.algorithm and "get a feel
for how it works". Chances are your finger will get so tired of hitting
PgDn about 2000 lines into the file that you won't even look at the
rest.  And most of the code is only superficially related to each other
-- about 20 functions into the file you'd have lost track of all sense
of how things fit together -- 'cos they *don't* really fit together!
It's the epitome of why we *should* move to smaller modules, rather than
the current giant monolithic ones.)


T

-- 
The right half of the brain controls the left half of the body. This means that
only left-handed people are in their right mind. -- Manoj Srivastava

Jun 05 2013
