digitalmars.D - dmdz

reply Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
So I'm toying with a prototype, which is proving nice enough, but there 
be a few things that I'm not quite sure which way to go with.

Currently I have the general pattern

dmdz [global flags] foo1.zip [foo1 local flags] foo2.zip [foo2 local 
flags] ...

although when given multiple zips it just compiles them independently.

My thought was when fooi.zip compiles a lib file, the result should be 
made available to all subsequent zip files, so you could do something like

dmdz lib1.zip lib2.zip main.zip

where lib2 can depend on lib1 and main can depend on either lib. But 
then most if not all of lib1's flags need to be forwarded to lib2 and main.

The other alternative I thought of is all the zip files get extracted 
and then all compiled at once.

Or is multiple zip files even a good idea?



For the more specific case

dmdz [global flags] foo.zip [local flags]

it expects all the relevant content in foo.zip to be located inside 
directory foo, and doesn't extract anything else unless you explicitly 
tell it to.

Also, there can be a file 'cmd' (name?) inside foo.zip which contains 
additional flags for the compile, with local flags overriding global 
flags overriding flags found in cmd. At least for dmdz flags.
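
The precedence described above (local over global over the in-zip file) could be sketched roughly as follows. This is only an illustration: `resolve_flag` and the sample values are hypothetical and are not taken from dmdz itself.

```shell
# Hypothetical helper: pick the effective value of one dmdz option.
# Arguments: value from the in-zip flags file, global value, local value.
# An empty string means "not set at that level".
resolve_flag() {
    cmd_value=$1
    global_value=$2
    local_value=$3
    if [ -n "$local_value" ]; then
        printf '%s\n' "$local_value"      # local flags win
    elif [ -n "$global_value" ]; then
        printf '%s\n' "$global_value"     # then global flags
    else
        printf '%s\n' "$cmd_value"        # the in-zip file is the fallback
    fi
}

# The in-zip file says "debug", the global flags say "release", and
# nothing is given locally, so this prints "release".
resolve_flag debug release ""
```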

dmd flags get filtered out and forwarded to dmd.

The current strategy for compiling just involves giving every compilable 
thing extracted to dmd. There's also an option to compile each source 
file separately (which I put in after hitting an odd Out of Memory Error).

Comments?


Also, are there any plans for std.zip, e.g. with regard to ranges, 
input/output streams, etc? The current api seems a smidge spartan.
Mar 11 2010
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 03/11/2010 12:11 PM, Ellery Newcomer wrote:
 So I'm toying with a prototype, which is proving nice enough, but there
 be a few things that I'm not quite sure which way to go with.

I was eagerly waiting for you to get back regarding this project. Thank you!
 Currently I have the general pattern

 dmdz [global flags] foo1.zip [foo1 local flags] foo2.zip [foo2 local
 flags] ...

 although when given multiple zips it just compiles them independently.

 My thought was when fooi.zip compiles a lib file, the result should be
 made available to all subsequent zip files, so you could do something like

 dmdz lib1.zip lib2.zip main.zip

 where lib2 can depend on lib1 and main can depend on either lib. But
 then most if not all of lib1's flags need to be forwarded to lib2 and main.

 The other alternative I thought of is all the zip files get extracted
 and then all compiled at once.

 Or is multiple zip files even a good idea?

To me this looks like a definite V2 thing honed by experience. For now the focus is distributing entire programs as one zip file.
 For the more specific case

 dmdz [global flags] foo.zip [local flags]

 it expects all the relevant content in foo.zip to be located inside
 directory foo, and doesn't extract anything else unless you explicitly
 tell it to.

I don't understand this. Does the program foo.zip have to contain an actual directory called "foo"? That's a bit restrictive. My initial plan revolved around expanding foo.zip somewhere in a unique subdir of the temp directory and considering that a full-blown project resides inside that subdir.
 Also, there can be a file 'cmd' (name?) inside foo.zip which contains
 additional flags for the compile, with local flags overriding global
 flags overriding flags found in cmd. At least for dmdz flags.

How about dmd.conf?
 dmd flags get filtered out and forwarded to dmd.

 The current strategy for compiling just involves giving every compilable
 thing extracted to dmd. There's also an option to compile each source
 file separately (which I put in after hitting an odd Out of Memory Error).

 Comments?

That sounds about right. One thing I want is to stay reasonably KISS (e.g. like rdmd is), i.e. not invent a lot of arcana. rdmd has many heuristics and limitations but has the virtue that it gets a specific job done without requiring its user to learn most anything. I hope dmdz turns out similarly simple.
 Also, are there any plans for std.zip, e.g. with regard to ranges,
 input/output streams, etc? The current api seems a smidge spartan.

I've hoped to rewrite std.zip forever, but found no time to do so. Andrei
Mar 11 2010
next sibling parent reply Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 03/11/2010 12:29 PM, Andrei Alexandrescu wrote:
 For the more specific case

 dmdz [global flags] foo.zip [local flags]

 it expects all the relevant content in foo.zip to be located inside
 directory foo, and doesn't extract anything else unless you explicitly
 tell it to.

I don't understand this. Does the program foo.zip have to contain an actual directory called "foo"? That's a bit restrictive. My initial plan revolved around expanding foo.zip somewhere in a unique subdir of the temp directory and considering that a full-blown project resides inside that subdir.

It is. I suppose the name isn't so important, but I really hate zip files whose contents aren't contained inside a single directory. Also, there would be a bit of a dichotomy if dmdz foo.zip resulted in a directory 'foo' wherever, but unzip foo.zip resulted in what would be the contents of 'foo' above.

Another thing: do you envision this just being a build-this-completed-project, or do you see this as an actual development tool? Because I've been approaching it more from the latter perspective. Zip file is a roadmap: look, all the files you need to compile are here, here, here, and here. So use them. Compile.

But if the zip file is a complete project, then you would expect to see source code, test code, test data, licenses, documentation, etc, which would likely require filtering anyways and possibly multiple compiles for different pieces. And you'd expect the result of the compile to end up somewhere in the directory you just created.

Alright, I think I'm seeing less and less value in foo.zip/foo as a req.
 Also, there can be a file 'cmd' (name?) inside foo.zip which contains
 additional flags for the compile, with local flags overriding global
 flags overriding flags found in cmd. At least for dmdz flags.

How about dmd.conf?

Sounds good.
 dmd flags get filtered out and forwarded to dmd.

 The current strategy for compiling just involves giving every compilable
 thing extracted to dmd. There's also an option to compile each source
 file separately (which I put in after hitting an odd Out of Memory
 Error).

 Comments?

That sounds about right. One thing I want is to stay reasonably KISS (e.g. like rdmd is), i.e. not invent a lot of arcana. rdmd has many heuristics and limitations but has the virtue that it gets a specific job done without requiring its user to learn most anything. I hope dmdz turns out similarly simple.
 Also, are there any plans for std.zip, e.g. with regard to ranges,
 input/output streams, etc? The current api seems a smidge spartan.

I've hoped to rewrite std.zip forever, but found no time to do so.

Well, heck. Maybe I'll see what I can do with it. Do you want it to conform to any interface in particular?

Also: test whether a file [path?] is contained within a specific directory [path?]. Does such functionality exist somewhere in Phobos?
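
For reference, a purely lexical version of that containment test might look like the sketch below. `is_inside` is a hypothetical name; the check does not resolve `..` components or symlinks, and it treats a directory as being inside itself.

```shell
# Hypothetical helper: succeed if path $1 lies lexically inside directory $2.
# No filesystem access is performed; ".." and symlinks are not resolved.
is_inside() {
    path=${1%/}     # drop one trailing slash, if any
    dir=${2%/}
    case "$path/" in
        "$dir"/*) return 0 ;;   # "$dir" is quoted, so it is matched literally
        *) return 1 ;;
    esac
}

if is_inside foo/bar/bizz.d foo; then
    echo "inside"
fi
```

A tool that extracts archives would probably want to normalize both paths first, since an entry named `foo/../../etc/x` passes this lexical test.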
 Andrei

Mar 11 2010
next sibling parent Walter Bright <newshound1 digitalmars.com> writes:
Ellery Newcomer wrote:
 I've hoped to rewrite std.zip forever, but found no time to do so.

Well, heck. Maybe I'll see what I can do with it. Do you want it to conform to any interface in particular?

What I'd like to see is the creation of a library file interface, say:

std.archive

and then have implementations of it:

std.archive.zip
std.archive.tar
std.archive.lha
std.archive.7zip

etc. Pass a file name to a factory method of std.archive, and it figures out what kind of archive it is, instantiates the appropriate implementation, etc.
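
The "figure out what kind of archive it is" step could be approximated by sniffing the file's leading bytes, as in this rough sketch. `archive_kind` is a hypothetical name, only two signatures are checked, and formats whose markers sit at other offsets (tar, lha) are deliberately left out.

```shell
# Hypothetical sketch: guess an archive's format from its first four bytes.
archive_kind() {
    # Hex-dump the first 4 bytes and strip the whitespace od adds.
    magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    case "$magic" in
        504b0304) echo zip ;;      # "PK\003\004", a zip local file header
        377abcaf) echo 7zip ;;     # start of the 7z signature
        *) echo unknown ;;
    esac
}
```

A factory in the style suggested above would then dispatch on the returned name instead of trusting the file extension.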
Mar 11 2010
prev sibling parent reply "Nick Sabalausky" <a a.a> writes:
"Ellery Newcomer" <ellery-newcomer utulsa.edu> wrote in message 
news:hnc4o3$2lms$1 digitalmars.com...
 I suppose the name isn't so important, but I really hate zip files whose 
 contents aren't contained inside a single directory.

This is a bit of a "vim vs emacs" or "static vs dynamic" sort of issue. Most of the archive programs I've used, including the one I currently use, put an "Extract to new directory" option into my file manager's right-click menu. I *always* use that, and consider it downright silly not to. But every once in a while I'll get an archive that follows the "nothing but one dir" convention, so I get a useless extra subfolder that I have to either delete or allow it to clutter up my filesystem, and that just irritates the hell out of me.

Personally, I'm convinced that any archive program that doesn't allow you to automatically create a subfolder by default is a bad archive program. And I'm convinced that a convention that places restrictions on the top-level of a zip is, well, ridiculous. But obviously there are people that disagree with me on that. So, I guess it's a "vim vs emacs" kind of thing.

What I really want is an archive program that automatically makes a subfolder by default *but* detects if the top level inside the archive contains nothing more than a single folder and intelligently *not* create a new folder in that case. But I've yet to see one that does that, and I haven't had time to make one.
Mar 11 2010
next sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
Nick Sabalausky wrote:
 "Ellery Newcomer" <ellery-newcomer utulsa.edu> wrote in message 
 news:hnc4o3$2lms$1 digitalmars.com...
 I suppose the name isn't so important, but I really hate zip files whose 
 contents aren't contained inside a single directory.

This is a bit of a "vim vs emacs" or "static vs dynamic" sort of issue. Most of the archive programs I've used, including the one I currently use, put an "Extract to new directory" option into my file manager's right-click menu. I *always* use that, and consider it downright silly not to. But every once in a while I'll get an archive that follows the "nothing but one dir" convention, so I get a useless extra subfolder that I have to either delete or allow it to clutter up my filesystem, and that just irritates the hell out of me.

Personally, I'm convinced that any archive program that doesn't allow you to automatically create a subfolder by default is a bad archive program. And I'm convinced that a convention that places restrictions on the top-level of a zip is, well, ridiculous. But obviously there are people that disagree with me on that. So, I guess it's a "vim vs emacs" kind of thing.

I don't really disagree, but it's not always that simple. Take tar, for instance, which has been around since forever, and which has a legacy you can't drop just like that. (I wonder if it's even part of the POSIX standard?) There are literally thousands of applications that depend on tar working in exactly the same way as it has always done, on all systems. And that way is to automatically extract all files into the current directory unless otherwise specified. As long as tar is the most common archive format on *NIX (and it is, by far), one must expect people to be true gentlemen and -women who put their files in a subdirectory inside the archive -- i.e. make tarballs and not tarbombs. :)
 What I really want is an archive program that automatically makes a 
 subfolder by default *but* detects if the top level inside the archive 
 contains nothing more than a single folder and intelligently *not* create a 
 new folder in that case. But I've yet to see one that does that, and I 
 haven't had time to make one. 

If you do, let me know. I'd like that too. :) -Lars
Mar 12 2010
prev sibling next sibling parent reply Bernard Helyer <b.helyer gmail.com> writes:
On 12/03/10 18:09, Nick Sabalausky wrote:
 "Ellery Newcomer"<ellery-newcomer utulsa.edu>  wrote in message
 news:hnc4o3$2lms$1 digitalmars.com...

 What I really want is an archive program that automatically makes a
 subfolder by default *but* detects if the top level inside the archive
 contains nothing more than a single folder and intelligently *not* create a
 new folder in that case. But I've yet to see one that does that, and I
 haven't had time to make one.

The right click 'extract here' under GNOME does *exactly* this.
Mar 12 2010
parent reply Lutger <lutger.blijdestijn gmail.com> writes:
Bernard Helyer wrote:

 On 12/03/10 18:09, Nick Sabalausky wrote:
 "Ellery Newcomer"<ellery-newcomer utulsa.edu>  wrote in message
 news:hnc4o3$2lms$1 digitalmars.com...

 What I really want is an archive program that automatically makes a
 subfolder by default *but* detects if the top level inside the archive
 contains nothing more than a single folder and intelligently *not* create
 a new folder in that case. But I've yet to see one that does that, and I
 haven't had time to make one.

The right click 'extract here' under GNOME does *exactly* this.

Same under KDE: Dolphin right click 'extract here, autodetect subfolder'

Perhaps Dolphin will also function under XP, last time I checked KDE was still a bit buggy under windows though.
Mar 12 2010
parent Chad J <chadjoan __spam.is.bad__gmail.com> writes:
Lutger wrote:
 Bernard Helyer wrote:
 
 On 12/03/10 18:09, Nick Sabalausky wrote:
 "Ellery Newcomer"<ellery-newcomer utulsa.edu>  wrote in message
 news:hnc4o3$2lms$1 digitalmars.com...

 What I really want is an archive program that automatically makes a
 subfolder by default *but* detects if the top level inside the archive
 contains nothing more than a single folder and intelligently *not* create
 a new folder in that case. But I've yet to see one that does that, and I
 haven't had time to make one.


Same under KDE: Dolphin right click 'extract here, autodetect subfolder'

Yes, I love this feature.
 
 Perhaps Dolphin will also function under XP, last time I checked KDE was 
 still a bit buggy under windows though. 

Mar 13 2010
prev sibling parent Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 03/11/2010 11:09 PM, Nick Sabalausky wrote:
 "Ellery Newcomer"<ellery-newcomer utulsa.edu>  wrote in message
 news:hnc4o3$2lms$1 digitalmars.com...
 I suppose the name isn't so important, but I really hate zip files whose
 contents aren't contained inside a single directory.

This is a bit of a "vim vs emacs" or "static vs dynamic" sort of issue. Most of the archive programs I've used, including the one I currently use, put an "Extract to new directory" option into my file manager's right-click menu. I *always* use that, and consider it downright silly not to. But every once in a while I'll get an archive that follows the "nothing but one dir" convention, so I get a useless extra subfolder that I have to either delete or allow it to clutter up my filesystem, and that just irritates the hell out of me.

I rarely come across a zip file that doesn't follow that convention, and I never extract to new directory, but I do always check the contents of the zip file manually.
 Personally, I'm convinced that any archive program that doesn't allow you to
 automatically create a subfolder by default is a bad archive program. And
 I'm convinced that a convention that places restrictions on the top-level of
 a zip is, well, ridiculous. But obviously there are people that disagree
 with me on that. So, I guess it's a "vim vs emacs" kind of thing.

 What I really want is an archive program that automatically makes a
 subfolder by default *but* detects if the top level inside the archive
 contains nothing more than a single folder and intelligently *not* create a
 new folder in that case. But I've yet to see one that does that, and I
 haven't had time to make one.

Yeah, I'm thinking I'm going to do that with dmdz
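
One way the "single top-level folder" detection could work is to scan the archive's entry names, as in this sketch. `single_top_dir` is a hypothetical name, and it assumes the entry names arrive on stdin one per line from whatever listing command is available.

```shell
# Hypothetical sketch: read archive entry names on stdin, one per line.
# Print the shared top-level directory if every entry lives under exactly
# one; print nothing if there are several or if files sit at the root.
single_top_dir() {
    awk -F/ '
        NF == 0 { next }              # skip blank lines
        NF < 2  { loose = 1 }         # an entry directly at the archive root
        { tops[$1] = 1 }              # remember each first path component
        END {
            n = 0
            for (t in tops) { n++; name = t }
            if (n == 1 && !loose) print name
        }
    '
}

# Prints "foo": everything is under foo/, so no extra folder is needed.
printf 'foo/\nfoo/main.d\nfoo/bar/bizz.d\n' | single_top_dir
```

If the function prints nothing, the extractor would create a new directory named after the archive; otherwise it can extract in place.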
Mar 12 2010
prev sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Thu, Mar 11, 2010 at 9:09 PM, Nick Sabalausky <a a.a> wrote:

 What I really want is an archive program that automatically makes a
 subfolder by default *but* detects if the top level inside the archive
 contains nothing more than a single folder and intelligently *not* create a
 new folder in that case. But I've yet to see one that does that, and I
 haven't had time to make one.

WinRAR has an option for that if the zip file and the single folder inside are named the same thing. So if Foo.zip contains just a top level folder called Foo, then it just extracts Foo. Otherwise it makes a "Foo" folder and puts the contents of Foo.zip into that. --bb
Mar 12 2010
prev sibling next sibling parent Walter Bright <newshound1 digitalmars.com> writes:
Ellery Newcomer wrote:
 So I'm toying with a prototype, which is proving nice enough, but there 
 be a few things that I'm not quite sure which way to go with.

How about:

dmdz ...stuff... foo.zip ...morestuff...

being semantically identical to:

dmdz ...stuff... (expanded contents of foo.zip) ...morestuff...

In other words, it works just like wildcard expansion:

dmd ...stuff... *.d ...morestuff...

Just think of foo.zip as a macro that expands to a list of the files that are the contents of foo.zip (while ignoring files that are not usable as input to dmd).

The neato thing is that, for a user, there's nothing to learn about using dmdz.
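
The "ignore files that are not usable as input to dmd" part of that expansion could be sketched as a simple filter over the extracted file names. `dmd_inputs` is a hypothetical name, and the extension list (d, di, a, o) is an assumption borrowed from the baseline script later in this thread.

```shell
# Hypothetical sketch: read file names on stdin and keep only those that
# look like dmd inputs. The extension list is an assumption.
dmd_inputs() {
    while IFS= read -r name; do
        case "$name" in
            *.d|*.di|*.a|*.o) printf '%s\n' "$name" ;;
        esac
    done
}

# README and the data file are dropped; the sources and object file remain.
printf 'foo/main.d\nfoo/README\nfoo/util.di\nfoo/data.txt\nfoo/x.o\n' | dmd_inputs
```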
Mar 11 2010
prev sibling next sibling parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
Ellery Newcomer wrote:
 So I'm toying with a prototype, which is proving nice enough, but there 
 be a few things that I'm not quite sure which way to go with.

Cool! Looking forward to using it. :) But can we please call it zdmd, so there is some consistency with rdmd? -Lars
Mar 12 2010
parent Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 03/12/2010 06:15 AM, Lars T. Kyllingstad wrote:
 Ellery Newcomer wrote:
 So I'm toying with a prototype, which is proving nice enough, but
 there be a few things that I'm not quite sure which way to go with.

Cool! Looking forward to using it. :) But can we please call it zdmd, so there is some consistency with rdmd? -Lars

I have no idea why it's called dmdz and not zdmd. My guess is so you can have rdmdz.
Mar 12 2010
prev sibling next sibling parent reply Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
Hello.

I've run into a problem.

dmd foo/bar/bizz.d

bizz.d:
  module bar.bizz;
  ...

dmd thinks it's looking at module foo.bar.bizz and generally gets 
confused unless supplied with -Ifoo. As a user, I'm not manually 
specifying that -Ifoo. So I need some bare-bones lexing capabilities.

I have an ANTLR lexer grammar, which will do fine, unless the module 
name contains unicode characters.

Any other suggestions?
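
For plain ASCII module names, a full lexer may not be needed to work out the -I directory. The sketch below is naive and uses the hypothetical name `import_root`; it ignores comments, declarations spanning several lines, and Unicode identifiers. It reads the module declaration with sed and strips the matching suffix from the file path.

```shell
# Naive, hypothetical sketch: print the import root of a D source file by
# comparing its path with its "module a.b.c;" declaration.
import_root() {
    file=$1
    module=$(sed -n 's/^[[:space:]]*module[[:space:]][[:space:]]*\([A-Za-z0-9_.]*\)[[:space:]]*;.*/\1/p' "$file" | head -n 1)
    [ -n "$module" ] || return 1
    suffix=$(printf '%s' "$module" | tr . /).d    # bar.bizz -> bar/bizz.d
    case "$file" in
        "$suffix") echo . ;;                               # already at the root
        */"$suffix") printf '%s\n' "${file%/"$suffix"}" ;; # strip the module path
        *) return 1 ;;                                     # path and module disagree
    esac
}
```

With the layout from the post, `import_root foo/bar/bizz.d` would yield `foo`, which could then be passed along as `-Ifoo`.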
Mar 15 2010
parent reply "Nick Sabalausky" <a a.a> writes:
"Ellery Newcomer" <ellery-newcomer utulsa.edu> wrote in message 
news:hnmbkl$2rsj$1 digitalmars.com...
 Hello.

 I've run into a problem.

 dmd foo/bar/bizz.d

 bizz.d:
  module bar.bizz;
  ...

 dmd thinks it's looking at module foo.bar.bizz and generally gets confused 
 unless supplied with -Ifoo. As a user, I'm not manually specifying 
 that -Ifoo. So I need some bare-bones lexing capabilities.

 I have an ANTLR lexer grammar, which will do fine, unless the module name 
 contains unicode characters.

 Any other suggestions?

I'd just require a setting in dmd.conf for that.
Mar 15 2010
parent Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 03/15/2010 10:04 PM, Nick Sabalausky wrote:
 "Ellery Newcomer"<ellery-newcomer utulsa.edu>  wrote in message
 news:hnmbkl$2rsj$1 digitalmars.com...
 Hello.

 I've run into a problem.

 dmd foo/bar/bizz.d

 bizz.d:
   module bar.bizz;
   ...

 dmd thinks it's looking at module foo.bar.bizz and generally gets confused
 unless supplied with -Ifoo. As a user, I'm not manually specifying
 that -Ifoo. So I need some bare-bones lexing capabilities.

 I have an ANTLR lexer grammar, which will do fine, unless the module name
 contains unicode characters.

 Any other suggestions?

I'd just require a setting in dmd.conf for that.

Of course it turns out to be a screwy zip file. Never mind.

Is dmd.conf really a good name for that file? I'm of the opinion now that it isn't, since it isn't the same thing and it does confuse dmd when executed in the directory containing it. dmdz.conf?
Mar 15 2010
prev sibling parent reply Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
Anyone want to play with dmdz, here it is:

http://personal.utulsa.edu/~ellery-newcomer/dmdz.zip


Haven't tested it much, especially on windows. Don't know what it will 
do with multiple zip files. piecemeal flag doesn't know how to stop when 
you tell it to. dmd's run flag isn't handled correctly (I don't know how 
it's supposed to work).

Does anyone know of a way to tell whether a command in bash or whatever 
segfaults?

And I modified std.path.dirname and std.path.basename, so I just 
included them in dmdz.d.

Otherwise, it should work okay. It can compile itself under 2.040.
Mar 16 2010
next sibling parent Lutger <lutger.blijdestijn gmail.com> writes:
Ellery Newcomer wrote:

 Anyone want to play with dmdz, here it is:
 
 http://personal.utulsa.edu/~ellery-newcomer/dmdz.zip
 
 
 Haven't tested it much, especially on windows. Don't know what it will
 do with multiple zip files. piecemeal flag doesn't know how to stop when
 you tell it to. dmd's run flag isn't handled correctly (I don't know how
 it's supposed to work).
 
 Does anyone know of a way to tell whether a command in bash or whatever
 segfaults?

You might like TRAP: http://www.davidpashley.com/articles/writing-robust-shell-scripts.html
Mar 16 2010
prev sibling next sibling parent Robert Clipsham <robert octarineparrot.com> writes:
On 16/03/10 22:55, Ellery Newcomer wrote:
 Anyone want to play with dmdz, here it is:

 http://personal.utulsa.edu/~ellery-newcomer/dmdz.zip


 Haven't tested it much, especially on windows. Don't know what it will
 do with multiple zip files. piecemeal flag doesn't know how to stop when
 you tell it to. dmd's run flag isn't handled correctly (I don't know how
 it's supposed to work).

 Does anyone know of a way to tell whether a command in bash or whatever
 segfaults?

$SOMECMD
if [ $? -eq 139 ]; then
    echo "Segfault: $SOMECMD"
fi

# if you want to check for errors in general:
$SOMECMD
if [ $? -ge 1 ]; then
    echo Error
fi
 And I modified std.path.dirname and std.path.basename, so I just
 included them in dmdz.d.

 Otherwise, it should work okay. It can compile itself under 2.040.

Mar 16 2010
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 03/16/2010 05:55 PM, Ellery Newcomer wrote:
 Anyone want to play with dmdz, here it is:

 http://personal.utulsa.edu/~ellery-newcomer/dmdz.zip


 Haven't tested it much, especially on windows. Don't know what it will
 do with multiple zip files. piecemeal flag doesn't know how to stop when
 you tell it to. dmd's run flag isn't handled correctly (I don't know how
 it's supposed to work).

 Does anyone know of a way to tell whether a command in bash or whatever
 segfaults?

 And I modified std.path.dirname and std.path.basename, so I just
 included them in dmdz.d.

 Otherwise, it should work okay. It can compile itself under 2.040.

This is solid work, but I absolutely refuse to believe the solution must be as complicated as this. Recall that the baseline is a 30-lines script. I can't bring myself to believe that a four-modules, over thousand lines solution justifies the added complexity.

Besides, what happened to std.getopt? You don't need to recognize dmd's options any more than rdmd does. rdmd dedicates only a few lines to argument parsing, dmdz makes it a science.

Don't take this the wrong way, the work is absolutely a tour de force. I'm just saying that things could be dramatically simpler with just a little loss of features. I'm looking over the code and am puzzled about the kind of gunpower that seems to be necessary for achieving the task.

Recall what's needed: someone who is able and willing would like to distribute a multi-module solution as a zip file. dmdz must provide a means to do so. Simple as that. The "able and willing" part is important - you don't need to cope with arbitrarily-formatted archives, you can impose people how the zip must be formatted. If you ask for them to provide a file called "main.d" in the root of the zip, then so be it if it reduces the size of dmdz by a factor of ten.

Andrei
Mar 16 2010
parent reply Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 03/16/2010 08:13 PM, Andrei Alexandrescu wrote:
 This is solid work, but I absolutely refuse to believe the solution must
 be as complicated as this. Recall that the baseline is a 30-lines
 script. I can't bring myself to believe that a four-modules, over
 thousand lines solution justifies the added complexity.

I count 2 modules and about 800 loc. 2 to 300 of which implements functionality which doesn't exist in std.path but should. The ANTLR crap could be replaced by a hundred lines of handwritten code, but the grammar already existed and took less time.
 Besides, what happened to std.getopt? You don't need to recognize dmd's
 options any more than rdmd does. rdmd dedicates only a few lines to
 argument parsing, dmdz makes it a science.

It started when I said, "huh. when is this thing building an executable, and when is it building a library?", and parsing dmd's options seemed like the most generally useful way of finding that out. I rather like the way it's turned out. eg during development:

$ dmdz dxl.zip -unittest
 ...
$ ./dxl/bin/dxl
 ...
"alright, unittests pass"
$ dmdz dxl.zip
 ...
"now for the release executable"

fwiw, I've never used rdmd due to bug 3860.
 Don't take this the wrong way, the work is absolutely a tour de force.
 I'm just saying that things could be dramatically simpler with just a
 little loss of features. I'm looking over the code and am puzzled about
 the kind of gunpower that seems to be necessary for achieving the task.

Huh. When all you have is a harquebus ..
 Recall what's needed: someone who is able and willing would like to
 distribute a multi-module solution as a zip file. dmdz must provide a
 means to do so. Simple as that. The "able and willing" part is important
 - you don't need to cope with arbitrarily-formatted archives, you can
 impose people how the zip must be formatted. If you ask for them to
 provide a file called "main.d" in the root of the zip, then so be it if
 it reduces the size of dmdz by a factor of ten.


 Andrei

By restricting the format of the zip file a bit and moving the directory dmd gets run in, I might save 100 loc. Maybe. Does adding main.d to root help with the run flag? It doesn't do anything for dmdz that I can see.

By introducing path2list et al into std.path or wherever (really, it is quite handy) and fixing basename and dirname, I could save 2 - 300 loc.

By removing piecemeal and getting rid of dmd flags, I could quit 2 - 300 loc plus the ANTLR modules. Except I find both of those features occasionally useful. Given the choice, I'd keep them.
Mar 17 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 03/17/2010 03:01 PM, Ellery Newcomer wrote:
 On 03/16/2010 08:13 PM, Andrei Alexandrescu wrote:
 This is solid work, but I absolutely refuse to believe the solution must
 be as complicated as this. Recall that the baseline is a 30-lines
 script. I can't bring myself to believe that a four-modules, over
 thousand lines solution justifies the added complexity.

I count 2 modules and about 800 loc. 2 to 300 of which implements functionality which doesn't exist in std.path but should. The ANTLR crap could be replaced by a hundred lines of handwritten code, but the grammar already existed and took less time.

Thanks for replying to this. I'd been afraid that I was coming off too critical. (I counted the ANTLR files as modules, and I think that's fair.)

To give you an idea on where I come from, distributing dmdz with dmd is also a message to users on how things are getting done in D. For the problem "Compile a D file and all its dependents, link, and run" the solution rdmd has 469 lines. It seems quite much to me, but I couldn't find ways to make it much smaller.

For the problem "Given a zip file containing a D program, build it" the dmdz solution is quite large. If we count everything:

$ wc --lines dmdz.d import/antlrrt/*.d lexd.g opts.d sed.sh
   782 dmdz.d
   891 import/antlrrt/collections.d
   551 import/antlrrt/exceptions.d
  1253 import/antlrrt/lexing.d
  2085 import/antlrrt/parsing.d
    10 import/antlrrt/runtime.d
   600 import/antlrrt/utils.d
   436 lexd.g
    88 opts.d
    13 sed.sh
  6709 total

Arguably we can discount the import stuff, although I'd already raise some objections:

$ wc --lines dmdz.d lexd.g opts.d sed.sh
   782 dmdz.d
   436 lexd.g
    88 opts.d
    13 sed.sh
  1319 total

That would suggest that it's about three times as difficult to build stuff present in a zip file than to deduce dependencies and build stuff not in a zip file. I find that difficult to swallow because to me building stuff in a zip file should be in some ways easier because there are no dependencies to deduce - they can be assumed to be in the zip file.

I looked more through the program and it looks like it uses the zip library (honestly I would have used system("unzip...")), which does add some aggravation for arguably a good reason. (But I also see there's no caching, which is an important requirement.) In my mind it was all about check cache, unzip, and build. True there are details such as lib vs. executable that can be messy but I don't think anything could blow complexity up too hard.
 Besides, what happened to std.getopt? You don't need to recognize dmd's
 options any more than rdmd does. rdmd dedicates only a few lines to
 argument parsing, dmdz makes it a science.

It started when I said, "huh. when is this thing building an executable, and when is it building a library?", and parsing dmd's options seemed like the most generally useful way of finding that out. I rather like the way it's turned out. eg during development:

$ dmdz dxl.zip -unittest
> ...
$ ./dxl/bin/dxl
> ...
"alright, unittests pass"
$ dmdz dxl.zip
> ...
"now for the release executable"

Nice, but I don't know why you need to understand dmd's flags instead of simply forwarding them to dmd. You could define dmdz-specific flags which you parse and understand, and then dump everything else to dmd, which will figure its own checking and error messages and all that.
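
The forwarding idea can be sketched as a small classifier: recognize a short list of dmdz-only flags and pass everything else through untouched. In the sketch below, `split_args` and the flag spellings `--piecemeal` and `--keep-dir` are hypothetical and chosen only for illustration.

```shell
# Hypothetical sketch: label each argument as a dmdz flag, a zip file, or
# something to forward to dmd unchanged. The flag names are made up.
split_args() {
    for arg in "$@"; do
        case "$arg" in
            --piecemeal|--keep-dir) printf 'dmdz %s\n' "$arg" ;;
            *.zip)                  printf 'zip %s\n' "$arg" ;;
            *)                      printf 'dmd %s\n' "$arg" ;;  # dmd validates these itself
        esac
    done
}

split_args --piecemeal dxl.zip -unittest -O
```

With this split, dmdz never needs to know what `-unittest` or `-O` mean; any error message for a bad option comes from dmd.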
 fwiw, I've never used rdmd due to bug 3860.

I didn't mean you to use it as much as look through it for examples of patterns that may be useful to dmdz (such as the one above).
 Don't take this the wrong way, the work is absolutely a tour de force.
 I'm just saying that things could be dramatically simpler with just a
 little loss of features. I'm looking over the code and am puzzled about
 the kind of gunpower that seems to be necessary for achieving the task.

Huh. When all you have is a harquebus ..

Hehe :o). Well definitely you need to submit your stdlib additions to e.g. bugzilla.
 Recall what's needed: someone who is able and willing would like to
 distribute a multi-module solution as a zip file. dmdz must provide a
 means to do so. Simple as that. The "able and willing" part is important
 - you don't need to cope with arbitrarily-formatted archives, you can
 impose people how the zip must be formatted. If you ask for them to
 provide a file called "main.d" in the root of the zip, then so be it if
 it reduces the size of dmdz by a factor of ten.


 Andrei

By restricting the format of the zip file a bit and moving the directory dmd gets run in, I might save 100 loc. Maybe. Does adding main.d to root help with the run flag? It doesn't do anything for dmdz that I can see. By introducing path2list et al into std.path or wherever (really, it is quite handy) and fixing basename and dirname, I could save 2 - 300 loc. By removing piecemeal and getting rid of dmd flags, I could quit 2 - 300 loc plus the ANTLR modules. Except I find both of those features occasionally useful. Given the choice, I'd keep them.

I think it would be great to remove all stuff that's not necessary. I paste at the end of this message my two baselines: a shell script and a D program. They compare poorly with your program, but are extremely simple. I think it may be useful to see how much each feature that these programs lack adds to the size of your solution.

Andrei

    #!/usr/bin/zsh

    # Accepted extensions
    EXTENSIONS=(d di a o)
    # The one and only parameter is the zip file
    ZIP=$1
    # Target directory
    TGT=/tmp/$ZIP
    # Binary result is the name of the zip without the .zip
    BIN=${ZIP/.zip/}

    # Is the zip file in there?
    if [[ ! -f $ZIP ]]; then
        echo "Zip file missing: \`$ZIP'" >&2
        echo "Usage: dmdz file.zip" >&2
        exit 1
    fi

    # Was the zip file already extracted? If not, extract it
    if [[ ! -d $TGT ]] || [[ $ZIP -nt $TGT ]]; then
        mkdir --parents $TGT
        unzip $ZIP -d $TGT >/dev/null
    fi

    # Compile all files with accepted extensions
    FIND="find . -type f -false "
    for EXT in $EXTENSIONS; do
        FIND="$FIND -or -iname '*.$EXT'"
    done
    (cd $TGT && dmd -of$BIN `eval $FIND`)


    #!/usr/bin/env rdmd
    // Phobos modules for d_time, exists/lastModified, system, stderr and replace
    import std.date, std.file, std.process, std.stdio, std.string;

    // Accepted extensions
    auto extensions = [ "d", "di", "a", "o" ];

    int main(string[] args) {
        // The one and only parameter is the zip file
        auto zip = args[1];
        if (!exists(zip)) {
            stderr.writeln("Zip file missing: `", zip, "'");
            stderr.writeln("Usage: dmdz file.zip");
            return 1;
        }
        // Target directory
        auto tgt = "/tmp/" ~ zip;
        // Binary result is the name of the zip without the .zip
        auto bin = replace(zip, ".zip", "");
        // Was the zip file already extracted? If not, extract it
        if (lastModified(zip) >= lastModified(tgt, d_time.min)) {
            system("mkdir --parents " ~ tgt);
            system("unzip " ~ zip ~ " -d " ~ tgt ~ " >/dev/null");
        }
        // Compile all files with accepted extensions
        auto find = "find . -type f -false ";
        foreach (ext; extensions) {
            find ~= " -or -iname '*." ~ ext ~ "'";
        }
        return system("cd " ~ tgt ~ " && dmd -of" ~ bin ~ " `eval " ~ find ~ "`");
    }
Mar 17 2010
parent reply Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 03/17/2010 03:53 PM, Andrei Alexandrescu wrote:
 Thanks for replying to this. I'd been afraid that I was coming off too
 critical. (I counted the ANTLR files as modules, and I think that's
 fair.) To give you an idea on where I come from, distributing dmdz with
 dmd is also a message to users on how things are getting done in D.

dang right you are. If you're going to count the antlr runtime, then maybe you should also be counting druntime and the sections of phobos that I used?
 For the problem "Compile a D file and all its dependents, link, and run"
 the solution rdmd has 469 lines. It seems quite much to me, but I
 couldn't find ways to make it much smaller.

user wouldn't know that from any dmd distribution I've ever seen.
 For the problem "Given a zip file containing a D program, build it" the
 dmdz solution is quite large. If we count everything:

 $ wc --lines dmdz.d import/antlrrt/*.d lexd.g opts.d sed.sh
 782 dmdz.d
 891 import/antlrrt/collections.d
 551 import/antlrrt/exceptions.d
 1253 import/antlrrt/lexing.d
 2085 import/antlrrt/parsing.d
 10 import/antlrrt/runtime.d
 600 import/antlrrt/utils.d
 436 lexd.g
 88 opts.d
 13 sed.sh
 6709 total

forgot generated/*.d should bump it up to 11 or 12 k.
 Arguably we can discount the import stuff, although I'd already raise
 some objections:

 $ wc --lines dmdz.d lexd.g opts.d sed.sh
 782 dmdz.d
 436 lexd.g
 88 opts.d
 13 sed.sh
 1319 total

lexd.g and sed.sh are only there for reference. I hate it when machine generated source code is in a project, but the source grammar isn't.
 That would suggest that it's about three times as difficult to build
 stuff present in a zip file than to deduce dependencies and build stuff
 not in a zip file. I find that difficult to swallow because to me
 building stuff in a zip file should be in some ways easier because there
 are no dependencies to deduce - they can be assumed to be in the zip file.

 I looked more through the program and it looks like it uses the zip
 library (honestly I would have used system("unzip...")), which does add
 some aggravation for arguably a good reason. (But I also see there's no
 caching, which is an important requirement.)

eh?
 Nice, but I don't know why you need to understand dmd's flags instead of
 simply forwarding them to dmd. You could define dmdz-specific flags
 which you parse and understand, and then dump everything else to dmd,
 which will figure its own checking and error messages and all that.

filtering out flags that screw things up for the build in question; knowing where the resultant executable is supposed to be;
 I think it would be great to remove all stuff that's not necessary. I
 paste at the end of this message my two baselines: a shell script and a
 D program. They compare poorly with your program, but are extremely
 simple. I think it may be useful to see how much impact each feature
 that these programs lack is adding size to your solution.


 Andrei

You come at this problem like "It should be an eloquent showcase of what D has to offer." I come at it like "I want this to be generally useful. To me."

In my opinion, how well it works trumps how many lines of code it took to write. But for the aforementioned bug, I never would have looked at rdmd's source, and even then I didn't notice how many lines of code it was. The way dmdz was written is based on the needs that presented themselves to me at the time. So far I've run it against three different projects and I'm happy with it the way it's turned out.

1. dmdz

toy example. not much here.

2. dexcelapi

port of jexcelapi, 90k loc (that thing must have shrunk when I wasn't looking, I was sure it was 200k), ~ 400 source files. Big. Dumping everything to dmd is easy enough to implement one way or another, but when I hit an Out of Memory Error I need what -piecemeal has to offer. I found the offending file (still don't know what's up with it), commented it out, and I can dump everything to dmd again. Without it, I probably would have given up on D for another year and a half.

3. dcrypt

Today, I wanted to play with it, so I checked it out, popped dmdz.conf and a main.d in the directory and zipped the whole thing up.

    dmdz dcrypt.zip

It worked. Without me doing anything to dmdz or dcrypt (except adding a string alias, &&^%^ tango).

I was kind of hoping others would try it and give their opinions, but apparently nobody else cares. Or they're on vacation, like I should be. Or they're giving the infamous 'silent approval'. Who knows.
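For readers unfamiliar with the idea behind compiling file by file, here is a rough shell sketch. It is not how dmdz implements -piecemeal; it only prints the commands such a mode might issue, and it assumes that `dmd -c foo.d` writes `foo.o` into the current directory (Posix naming, no `-op`), so two modules with the same base name would collide.

```shell
#!/bin/sh
# Sketch: compile each source on its own, so a single troublesome file cannot
# exhaust memory for the whole build, then link the resulting objects.
# The commands are printed rather than executed.
piecemeal_commands() {
    out=$1; shift        # name of the final executable
    objs=""
    for src in "$@"; do
        echo "dmd -c $src"
        # Assumed object name: base name of the source with .d replaced by .o
        obj=$(basename "$src" .d).o
        objs="$objs $obj"
    done
    echo "dmd -of$out$objs"
}

piecemeal_commands app a.d pkg/b.d
```

A failure then points at one specific file, which is how the offending source in a 400-file project can be tracked down.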
Mar 17 2010
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 03/17/2010 08:17 PM, Ellery Newcomer wrote:
 On 03/17/2010 03:53 PM, Andrei Alexandrescu wrote:
 Thanks for replying to this. I'd been afraid that I was coming off too
 critical. (I counted the ANTLR files as modules, and I think that's
 fair.) To give you an idea on where I come from, distributing dmdz with
 dmd is also a message to users on how things are getting done in D.

dang right you are. If you're going to count the antlr runtime, then maybe you should also be counting druntime and the sections of phobos that I used?

I meant the antlr grammar for the task. I gave two counts, one excluding the antlr runtime, and based the rest of my discussion on that. I sadly note the irony. There is no need to get defensive, really.
 For the problem "Compile a D file and all its dependents, link, and run"
 the solution rdmd has 469 lines. It seems quite much to me, but I
 couldn't find ways to make it much smaller.

user wouldn't know that from any dmd distribution I've ever seen.
 For the problem "Given a zip file containing a D program, build it" the
 dmdz solution is quite large. If we count everything:

 $ wc --lines dmdz.d import/antlrrt/*.d lexd.g opts.d sed.sh
 782 dmdz.d
 891 import/antlrrt/collections.d
 551 import/antlrrt/exceptions.d
 1253 import/antlrrt/lexing.d
 2085 import/antlrrt/parsing.d
 10 import/antlrrt/runtime.d
 600 import/antlrrt/utils.d
 436 lexd.g
 88 opts.d
 13 sed.sh
 6709 total

forgot generated/*.d

Well that's generated. I counted what's needed to get things going. Unless you meant that ironically...
 should bump it up to 11 or 12 k.

 Arguably we can discount the import stuff, although I'd already raise
 some objections:

 $ wc --lines dmdz.d lexd.g opts.d sed.sh
 782 dmdz.d
 436 lexd.g
 88 opts.d
 13 sed.sh
 1319 total

lexd.g and sed.sh are only there for reference. I hate it when machine generated source code is in a project, but the source grammar isn't.

My understanding is that lexd.g is your code so it should be included in the size of the solution, whereas the generated code should not.
 That would suggest that it's about three times as difficult to build
 stuff present in a zip file than to deduce dependencies and build stuff
 not in a zip file. I find that difficult to swallow because to me
 building stuff in a zip file should be in some ways easier because there
 are no dependencies to deduce - they can be assumed to be in the zip
 file.

 I looked more through the program and it looks like it uses the zip
 library (honestly I would have used system("unzip...")), which does add
 some aggravation for arguably a good reason. (But I also see there's no
 caching, which is an important requirement.)

eh?

The idea is to not extract the files every time you build. If they are in place already, the tool should recognize that.
 Nice, but I don't know why you need to understand dmd's flags instead of
 simply forwarding them to dmd. You could define dmdz-specific flags
 which you parse and understand, and then dump everything else to dmd,
 which will figure its own checking and error messages and all that.

filtering out flags that screw things up for the build in question; knowing where the resultant executable is supposed to be;
 I think it would be great to remove all stuff that's not necessary. I
 paste at the end of this message my two baselines: a shell script and a
 D program. They compare poorly with your program, but are extremely
 simple. I think it may be useful to see how much impact each feature
 that these programs lack is adding size to your solution.


 Andrei

You come at this problem like "It should be an eloquent showcase of what D has to offer." I come at it like "I want this to be generally useful. To me."

The tool shouldn't be a showcase. Obviously the primary purpose is for the tool to be useful. The shell script and the D script are useful. I am sure your tool is useful, but I think it doesn't hit the right balance. I simply don't think it takes that much code to achieve what the tool needs to achieve.
 In my opinion, how well it works trumps how many lines of code it took
 to write. But for the aforementioned bug, I never would have looked at
 rdmd's source, and even then I didn't notice how many lines of code it
 was. The way dmdz was written is based on the needs that presented
 themselves to me at the time. So far I've run it against three different
 projects and I'm happy with it the way it's turned out.

 1. dmdz

 toy example. not much here.

 2. dexcelapi

 port of jexcelapi, 90k loc (that thing must have shrunk when I wasn't
 looking, I was sure it was 200k), ~ 400 source files. Big. Dumping
 everything to dmd is easy enough to implement one way or another, but
 when I hit an Out of Memory Error I need what -piecemeal has to offer. I
 found the offending file (still don't know what's up with it), commented
 it out, and I can dump everything to dmd again. Without it, I probably
 would have given up on D for another year and a half.

 3. dcrypt

 Today, I wanted to play with it, so I checked it out, popped dmdz.conf
 and a main.d in the directory and zipped the whole thing up.

 dmdz dcrypt.zip

 It worked. Without me doing anything to dmdz or dcrypt (except adding a
 string alias, &&^%^ tango).

I'm not contending the tool is not useful. I'm just saying it is too big for what it does, and that that does matter with regard to distributing it with dmd.
 I was kind of hoping others would try it and give their opinions, but
 apparently nobody else cares. Or they're on vacation, like I should be.
 Or they're giving the infamous 'silent approval'. Who knows.

It looks like we're getting into a little diatribe, which is very sad because you've clearly done a good amount of work and I didn't intend to make it look any other way. All I can say is that the tool is very far removed from what I think it should look like; for my money, the moment it gets larger than one simple module it would mean I took a few wrong turns along the way.

BTW Walter made a very nice suggestion: make a .zip file in the command line be equivalent to listing all files in that zip in the command line. I think it's this kind of idea that greatly simplifies things.

Andrei
Mar 17 2010
next sibling parent reply BCS <none anon.com> writes:
Hello Andrei,

 The idea is to not extract the files every time you build. If they are
 in place already, the tool should recognize that.

The difference in speed between disk IO and CPU /might/ be high enough that (unless the uncompressed file is cached or you round trip it back to the disk) reading from the zip may be faster. I know that on linux there is a way to pass a stream as a file name (I forget what happens under the hood, but bash uses the ">(cmd)" syntax to do it) so you could work with that. -- ... <IXOYE><
Mar 17 2010
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 03/17/2010 10:30 PM, BCS wrote:
 Hello Andrei,

 The idea is to not extract the files every time you build. If they are
 in place already, the tool should recognize that.

The difference in speed between disk IO and CPU /might/ be high enough that (unless the uncompressed file is cached or you round trip it back to the disk) reading from the zip may be faster. I know that on linux there is a way to pass a stream as a file name (I forget what happens under the hood, but bash uses the ">(cmd)" syntax to do it) so you could work with that.

That works on zsh, I'm not sure whether it works with other shells. Also, dmd refuses to compile such streams because they don't end in .d. The file must be written to the file system, so caching would always help. Andrei
Mar 18 2010
prev sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
BCS wrote:
 Hello Andrei,
 
 The idea is to not extract the files every time you build. If they are
 in place already, the tool should recognize that.

The difference in speed between disk IO and CPU /might/ be high enough that (unless the uncompressed file is cached or you round trip it back to the disk) reading from the zip may be faster. I know that on linux there is a way to pass a stream as a file name (I forget what happens under the hood, but bash uses the ">(cmd)" syntax to do it) so you could work with that.

I'd argue that for this case, caching the extracted files is not worth the effort, complexity, or speed. If you're in an edit/compile/debug loop, I can't see working off of a zip file of the sources.
Mar 18 2010
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 03/18/2010 02:36 PM, Walter Bright wrote:
 BCS wrote:
 Hello Andrei,

 The idea is to not extract the files every time you build. If they are
 in place already, the tool should recognize that.

The difference in speed between disk IO and CPU /might/ be high enough that (unless the uncompressed file is cached or you round trip it back to the disk) reading from the zip may be faster. I know that on linux there is a way to pass a stream as a file name (I forget what happens under the hood, but bash uses the ">(cmd)" syntax to do it) so you could work with that.

I'd argue that for this case, caching the extracted files is not worth the effort, complexity, or speed. If you're in an edit/compile/debug loop, I can't see working off of a zip file of the sources.

Of course not, but the typical scenario is to just run a program off its .zip file every so often. In that case, extraction makes for an unpleasant latency. FWIW, for rdmd caching makes a big, big difference. Andrei
Mar 18 2010
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
 FWIW, for rdmd caching makes a big, big difference.

Caching the executable, sure, but I'm not sure that translates into a case for caching the intermediate files (i.e. the extracted source).
Mar 18 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 03/18/2010 04:14 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 FWIW, for rdmd caching makes a big, big difference.

Caching the executable, sure, but I'm not sure that translates into a case for caching the intermediate files (i.e. the extracted source).

I see. It should be fine to cache the exe and regenerate only if the archive is newer. Andrei
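The rule agreed on here (keep the executable, rebuild only when the archive is newer) fits in one small shell function. This is a sketch with assumed paths, relying on the `-nt` file test that bash, zsh, ksh and dash all provide.

```shell
#!/bin/sh
# Sketch: decide whether the cached executable is stale.
# Succeeds (exit status 0) when a rebuild is needed.
needs_rebuild() {
    zip=$1     # the source archive
    bin=$2     # the cached executable built from it
    # Rebuild if there is no executable yet, or the archive was modified after it.
    [ ! -e "$bin" ] || [ "$zip" -nt "$bin" ]
}

# Usage (paths are placeholders):
#   if needs_rebuild foo.zip /tmp/foo/foo; then
#       ... extract and compile ...
#   fi
#   /tmp/foo/foo "$@"
```

Only the timestamps of two files are compared, so the check costs almost nothing compared with extraction.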
Mar 18 2010
prev sibling parent reply BCS <none anon.com> writes:
Hello Walter,

 BCS wrote:
 
 Hello Andrei,
 
 The idea is to not extract the files every time you build. If they
 are in place already, the tool should recognize that.
 

 The difference in speed between disk IO and CPU /might/ be high enough that (unless the uncompressed file is cached or you round trip it back to the disk) reading from the zip may be faster. I know that on linux there is a way to pass a stream as a file name (I forget what happens under the hood, but bash uses the ">(cmd)" syntax to do it) so you could work with that.

 I'd argue that for this case, caching the extracted files is not worth the effort, complexity, or speed. If you're in an edit/compile/debug loop, I can't see working off of a zip file of the sources.

The only case I can think of where putting a zip file in the middle of that loop is even remotely reasonable would be for a remote build farm. The other use cases for build-from-zip are building someone else's code where you aren't editing the parts in the zip file. -- ... <IXOYE><
Mar 18 2010
parent reply Walter Bright <newshound1 digitalmars.com> writes:
BCS wrote:
 The only case I can think of where putting a zip file in the middle of 
 that loop is even remotely reasonable would be for a remote build farm. 
 The other use cases for build-from-zip are building someone else's code 
 where you aren't editing the parts in the zip file.

It might even be practical to have dmdz compile from a zip file specified by a URL! That would be cool.
Mar 18 2010
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 03/18/2010 04:15 PM, Walter Bright wrote:
 BCS wrote:
 The only case I can think of where putting a zip file in the middle of
 that loop is even remotely reasonable would be for a remote build
 farm. The other use cases for build-from-zip are building someone
 else's code where you aren't editing the parts in the zip file.

It might even be practical to have dmdz compile from a zip file specified by a URL! That would be cool.

In that case I do think caching would be helpful :o). Andrei
Mar 18 2010
prev sibling parent reply Lutger <lutger.blijdestijn gmail.com> writes:
Walter Bright wrote:

 BCS wrote:
 The only case I can think of where putting a zip file in the middle of
 that loop is even remotely reasonable would be for a remote build farm.
 The other use cases for build-from-zip are building someone else's code
 where you aren't editing the parts in the zip file.

It might even be practical to have dmdz compile from a zip file specified by a URL! That would be cool.

Just like dsss did...(and still does for D1 I guess) I like dmdz and rdmd, but it's a pity dsss isn't revived yet. I still really miss it, always thought it would become the ruby gems / CPAN of D.
Mar 18 2010
parent Walter Bright <newshound1 digitalmars.com> writes:
Lutger wrote:
 Just like dsss did...(and still does for D1 I guess)
 
 I like dmdz and rdmd, but it's a pity dsss isn't revived yet. I still really 
 miss it, always thought it would become the ruby gems / CPAN of D. 

Anyone can revive it if they're motivated to!
Mar 18 2010
prev sibling parent reply Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 03/17/2010 08:49 PM, Andrei Alexandrescu wrote:
 Well that's generated. I counted what's needed to get things going.
 Unless you meant that ironically...

Yes I was speaking in jest up to this point.
 should bump it up to 11 or 12 k.

 Arguably we can discount the import stuff, although I'd already raise
 some objections:

 $ wc --lines dmdz.d lexd.g opts.d sed.sh
 782 dmdz.d
 436 lexd.g
 88 opts.d
 13 sed.sh
 1319 total

lexd.g and sed.sh are only there for reference. I hate it when machine generated source code is in a project, but the source grammar isn't.

My understanding is that lexd.g is your code so it should be included in the size of the solution, whereas the generated code should not.

Yeah, you're right there.
 That would suggest that it's about three times as difficult to build
 stuff present in a zip file than to deduce dependencies and build stuff
 not in a zip file. I find that difficult to swallow because to me
 building stuff in a zip file should be in some ways easier because there
 are no dependencies to deduce - they can be assumed to be in the zip
 file.

 I looked more through the program and it looks like it uses the zip
 library (honestly I would have used system("unzip...")), which does add
 some aggravation for arguably a good reason. (But I also see there's no
 caching, which is an important requirement.)

eh?

The idea is to not extract the files every time you build. If they are in place already, the tool should recognize that.

It does that, but on a per-file basis.
 Nice, but I don't know why you need to understand dmd's flags instead of
 simply forwarding them to dmd. You could define dmdz-specific flags
 which you parse and understand, and then dump everything else to dmd,
 which will figure its own checking and error messages and all that.

filtering out flags that screw things up for the build in question; knowing where the resultant executable is supposed to be;
 I think it would be great to remove all stuff that's not necessary. I
 paste at the end of this message my two baselines: a shell script and a
 D program. They compare poorly with your program, but are extremely
 simple. I think it may be useful to see how much impact each feature
 that these programs lack is adding size to your solution.


 Andrei

You come at this problem like "It should be an eloquent showcase of what D has to offer." I come at it like "I want this to be generally useful. To me."

The tool shouldn't be a showcase. Obviously the primary purpose is for the tool to be useful. The shell script and the D script are useful. I am sure your tool is useful, but I think it doesn't hit the right balance. I simply don't think it takes that much code to achieve what the tool needs to achieve.

All right. I'll try cutting things out and see where I end up.
 I'm not contending the tool is not useful. I'm just saying it is too big
 for what it does, and that that does matter with regard to distributing
 it with dmd.

I still don't see why (other than lexd.g adds ~ 10k loc just to get the line 'module foo.bar;' out of a source file)
 BTW Walter made a very nice suggestion: make a .zip file in the command
 line be equivalent to listing all files in that zip in the command line.
 I think it's this kind of idea that greatly simplifies things.


 Andrei

Fair enough.
Mar 18 2010
next sibling parent reply Robert Clipsham <robert octarineparrot.com> writes:
On 18/03/10 16:28, Ellery Newcomer wrote:
 I still don't see why (other than lexd.g adds ~ 10k loc just to get the
 line 'module foo.bar;' out of a source file)

That seems like a tad too much for it... Surely it would only take a few (here meaning far less than 10k) lines to parse away comments/whitespace at the start of the file then read the module declaration if there is one?
Mar 18 2010
parent reply Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 03/18/2010 11:36 AM, Robert Clipsham wrote:
 On 18/03/10 16:28, Ellery Newcomer wrote:
 I still don't see why (other than lexd.g adds ~ 10k loc just to get the
 line 'module foo.bar;' out of a source file)

That seems like a tad too much for it... Surely it would only take a few (here meaning far less than 10k) lines to parse away comments/whitespace at the start of the file then read the module declaration if there is one?

Sure. I could write it in 100 loc. My concern is they would be a buggy 100 loc that would take a good deal of effort to get right. lexd.g already existed and has been pretty heavily tested.
Mar 18 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 03/18/2010 11:48 AM, Ellery Newcomer wrote:
 On 03/18/2010 11:36 AM, Robert Clipsham wrote:
 On 18/03/10 16:28, Ellery Newcomer wrote:
 I still don't see why (other than lexd.g adds ~ 10k loc just to get the
 line 'module foo.bar;' out of a source file)

That seems like a tad too much for it... Surely it would only take a few (here meaning far less than 10k) lines to parse away comments/whitespace at the start of the file then read the module declaration if there is one?

Sure. I could write it in 100 loc. My concern is they would be a buggy 100 loc that would take a good deal of effort to get right. lexd.g already existed and has been pretty heavily tested.

You could write it in 5 loc. Andrei
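To give a feel for how small a restricted version can be, here is a sketch in shell and awk rather than D. It accepts deliberate limitations: the declaration must sit on a line of its own, only a first-line shebang, blank lines, `//` comments and non-nested `/* */` comments may precede it, and `/+ +/` comments are not handled. The function name is made up for the example.

```shell
#!/bin/sh
# Sketch: print the module name declared at the top of a D source file,
# or nothing if the first real line of code is not a module declaration.
module_name() {
    awk '
        in_block { if (index($0, "*/")) in_block = 0; next }   # inside a block comment
        NR == 1 && /^#!/ { next }                              # shebang line
        /^[ \t]*$/ { next }                                    # blank line
        /^[ \t]*\/\// { next }                                 # line comment
        /^[ \t]*\/\*/ { if (!index($0, "*/")) in_block = 1; next }  # block comment starts
        /^[ \t]*module[ \t]+[A-Za-z_][A-Za-z0-9_.]*[ \t]*;/ {
            sub(/^[ \t]*module[ \t]+/, "")    # drop the keyword
            sub(/[ \t]*;.*/, "")              # drop the semicolon and anything after it
            print
            exit
        }
        { exit }                                               # some other code came first
    ' "$1"
}
```

For a file that starts with a shebang and a comment followed by `module foo.bar;`, this prints `foo.bar`. A full lexer handles far more of the language, but nothing beyond the file prologue is needed for this particular job.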
Mar 18 2010
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 03/18/2010 11:28 AM, Ellery Newcomer wrote:
 On 03/17/2010 08:49 PM, Andrei Alexandrescu wrote:
 The idea is to not extract the files every time you build. If they are
 in place already, the tool should recognize that.

It does that, but on a per-file basis.

My bad for not being able to see that in the code. I read through and also searched for "cache", "date", "time"... couldn't find it. I now find it by looking for "last".
 I'm not contending the tool is not useful. I'm just saying it is too big
 for what it does, and that that does matter with regard to distributing
 it with dmd.

I still don't see why (other than lexd.g adds ~ 10k loc just to get the line 'module foo.bar;' out of a source file)

If a casual user downloads the dmd distro and says, hey, let me see how this rdmd tool is implemented, I wouldn't be afraid. If they take a look at dmdz, they may be daunted.

The example you gave is perfect. Right now rdmd runs dmd -v to figure out dependencies, but before it was parsing the file for lines that begin with "import". That was problematic, so I'm glad I now use the compiler. Your task is much simpler - nothing is allowed before the module line aside from the shebang line and comments, and you should feel free to restrict modules to e.g. not include recursive comments or anything that aggravates your job.

So, I'm very glad you mentioned it: 10K of code to detect "module" is absolute overkill. I now confess that I couldn't figure out why you needed the lexer for dmdz and didn't have the time to sift through the code and figure that out. I thought there must be some solid reason, and so I was ashamed to even ask. I did know you want to find "module", but in my naivete, I wasn't thinking that just that would ever inspire you to include a lexer.

To be frank, I even think you shouldn't worry at all about "module". Just extract the blessed thing with caching and call it a day. I was also thinking of simplifying options etc. by requiring a file "dmdflags.txt" in the archive and then do this when you run dmd:

    dmd `cat dmdflags.txt` stuff morestuff andsomemorestuff

i.e. simply expand the file in the command line. No need for any extravaganza. But even dmdflags.txt I'd think would be a bit much. And speaking of cmdline stuff, assume find, zip, etc. are present on the host system if you need them.
 BTW Walter made a very nice suggestion: make a .zip file in the command
 line be equivalent to listing all files in that zip in the command line.
 I think it's this kind of idea that greatly simplifies things.


 Andrei

Fair enough.

Thank you for considering changing your program. Andrei
Mar 18 2010
next sibling parent reply Clemens <eriatarka84 gmail.com> writes:
Andrei Alexandrescu Wrote:

 To be frank, I even think you shouldn't worry at all about "module". 
 Just extract the blessed thing with caching and call it a day. I was 
 also thinking of simplifying options etc. by requiring a file 
 "dmdflags.txt" in the archive and then do this when you run dmd:
 
 dmd `cat dmdflags.txt` stuff morestuff andsomemorestuff
 
 i.e. simply expand the file in the command line.

I think it would be a good idea to stay well away from gratuitous portability barriers like this or that system("unzip") suggestion if the portable alternative isn't too much more work. I don't see why you wouldn't want this thing to work on Windows too.
Mar 18 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 03/18/2010 12:28 PM, Clemens wrote:
 Andrei Alexandrescu Wrote:

 To be frank, I even think you shouldn't worry at all about
 "module". Just extract the blessed thing with caching and call it a
 day. I was also thinking of simplifying options etc. by requiring a
 file "dmdflags.txt" in the archive and then do this when you run
 dmd:

 dmd `cat dmdflags.txt` stuff morestuff andsomemorestuff

 i.e. simply expand the file in the command line.

I think it would be a good idea to stay well away from gratuitous portability barriers like this or that system("unzip") suggestion if the portable alternative isn't too much more work. I don't see why you wouldn't want this thing to work on Windows too.

Yah, I agree. Well `` don't need to be used in the command line, a std.file.readText("dmdflags") should suffice. Andrei
Mar 18 2010
prev sibling next sibling parent Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
 To be frank, I even think you shouldn't worry at all about "module". 
 Just extract the blessed thing with caching and call it a day. I was 
 also thinking of simplifying options etc. by requiring a file 
 "dmdflags.txt" in the archive and then do this when you run dmd:
 
 dmd `cat dmdflags.txt` stuff morestuff andsomemorestuff

dmd will already read switches out of a file:

    dmd @cmdfile ...

So there's no need to parse the command file or do any shell expansion on it. Just pass it, and precede it with an @.
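dmd accepts a response file written as `@file` on its command line, so a wrapper only has to pass the name along. A minimal sketch, assuming the archive was extracted into a directory that may contain the "dmdflags.txt" file floated earlier in the thread; the function prints the command instead of running it, so it can be tried without dmd installed.

```shell
#!/bin/sh
# Sketch: build the dmd command line, letting dmd itself read any extra
# switches from a response file rather than parsing them in the wrapper.
dmd_command() {
    dir=$1; shift      # directory the archive was extracted into (assumed layout)
    if [ -f "$dir/dmdflags.txt" ]; then
        # dmd expands @file into the switches and arguments stored in that file.
        echo dmd "@$dir/dmdflags.txt" "$@"
    else
        echo dmd "$@"
    fi
}
```

Because no shell expansion is involved, the same approach works where backticks and `cat` are unavailable.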
Mar 18 2010
prev sibling parent reply Lionello Lunesu <lio lunesu.remove.com> writes:
On 19-3-2010 1:18, Andrei Alexandrescu wrote:
 i.e. simply expand the file in the command line. No need for any
 extravaganza. But even dmdflags.txt I'd think would be a bit much. And
 speaking of cmdline stuff, assume find, zip, etc. are present on the
 host system if you need them.

and I'm out.. I'm using Windows and don't have any of those (well, I have MS's FIND.EXE but that has nothing in common with posix') Anyway, Ellery is right: general stuff that dmdz needs could probably be moved into Phobos at some point. As for "module", couldn't dmd include an option to output these, similar to the way it outputs deps? L.
Mar 18 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 03/18/2010 06:43 PM, Lionello Lunesu wrote:
 On 19-3-2010 1:18, Andrei Alexandrescu wrote:
 i.e. simply expand the file in the command line. No need for any
 extravaganza. But even dmdflags.txt I'd think would be a bit much. And
 speaking of cmdline stuff, assume find, zip, etc. are present on the
 host system if you need them.

and I'm out.. I'm using Windows and don't have any of those (well, I have MS's FIND.EXE but that has nothing in common with posix')

You're right.
 Anyway, Ellery is right: general stuff that dmdz needs could probably be
 moved into Phobos at some point.

I looked around. basename and dirname suggest that the ones in phobos have issues (what are those?), and some other functions rely on path2list which I'd hope to replace with a range so as to not allocate memory without necessity.
 As for "module", couldn't dmd include
 an option to output these, similar to the way it outputs deps?

I think that would be a natural thing to ask for. Until then I don't think there's a real need for supporting module declarations in dmdz. Andrei
Mar 18 2010
prev sibling parent reply Robert Clipsham <robert octarineparrot.com> writes:
On 18/03/10 01:17, Ellery Newcomer wrote:
 I was kind of hoping others would try it and give their opinions, but
 apparently nobody else cares. Or they're on vacation, like I should be.
 Or they're giving the infamous 'silent approval'. Who knows.

I'm usually one of those, but seeing as you asked... It looks good :) I haven't had a chance to try it yet, but a simple tool like this could be really useful. I don't have the same reservations as Andrei about the amount of code/how it's done... If it does its job it's good enough for me :)

One thing I would like to know: are there plans for file formats other than .zip? You can generally get files less than half the size with faster compression/decompression times using other formats... would adding support for them (.tar.xz, .tar.gz) be too much extra hassle?
Mar 18 2010
parent reply Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 03/18/2010 11:32 AM, Robert Clipsham wrote:
 On 18/03/10 01:17, Ellery Newcomer wrote:
 I was kind of hoping others would try it and give their opinions, but
 apparently nobody else cares. Or they're on vacation, like I should be.
 Or they're giving the infamous 'silent approval'. Who knows.

I'm usually one of those, but seen as you asked... It looks good :) I haven't had chance to try it yet, but a simple tool like this could be really useful. I don't have the same reservations as Andrei about the amount of code/how it's done... If it does its job it's good enough for me :) One thing I would like to know, are there plans for file formats other than .zip? You can generally get files less than half the size with faster compression/decompression times using other formats... would adding support for them (.tar.xz, .tar.gz) be too much extra hassle?

It would only involve building support for those formats into phobos :) I actually had the same thought after I saw Walter's suggestion for a std.archive. If I have time, I'd like to make it happen. Wait, what's tar.xz?
Mar 18 2010
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 03/18/2010 11:39 AM, Ellery Newcomer wrote:
 On 03/18/2010 11:32 AM, Robert Clipsham wrote:
 On 18/03/10 01:17, Ellery Newcomer wrote:
 I was kind of hoping others would try it and give their opinions, but
 apparently nobody else cares. Or they're on vacation, like I should be.
 Or they're giving the infamous 'silent approval'. Who knows.

I'm usually one of those, but seen as you asked... It looks good :) I haven't had chance to try it yet, but a simple tool like this could be really useful. I don't have the same reservations as Andrei about the amount of code/how it's done... If it does its job it's good enough for me :) One thing I would like to know, are there plans for file formats other than .zip? You can generally get files less than half the size with faster compression/decompression times using other formats... would adding support for them (.tar.xz, .tar.gz) be too much extra hassle?

It would only involve building support for those formats into phobos :) I actually had the same thought after I saw Walter's suggestion for a std.archive. If I have time, I'd like to make it happen.

Heh, incidentally I just needed a tar reader a few days ago, so I wrote an embryo of a base class etc. I'll add it soon.

The basic interface is:

(a) open the archive

(b) get an input range for it. The range iterates over archive entries.

(c) You can look at archive info, and if you want to extract you can get a .byChunk() range to extract it. That's also an input range.

For now I'm only concerned with reading... writing needs to be added.

Andrei
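The three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code Andrei mentions: `ArchiveEntry`, `ArchiveRange` and `openArchive` are hypothetical names, and the entries live in memory so that the example stays self-contained. A real tar reader would pull the data from a stream instead.

```d
import std.range : chunks;
import std.stdio : writeln;
import std.string : representation;

// Hypothetical entry type: a name plus the member's bytes.
// The payload is held in memory here only to keep the sketch short.
struct ArchiveEntry
{
    string name;
    const(ubyte)[] data;

    // (c) An input range over the payload, at most chunkSize bytes at a time.
    auto byChunk(size_t chunkSize)
    {
        return chunks(data, chunkSize);
    }
}

// (b) An input range that iterates over the archive's entries.
struct ArchiveRange
{
    private ArchiveEntry[] entries;

    @property bool empty() const { return entries.length == 0; }
    @property ArchiveEntry front() { return entries[0]; }
    void popFront() { entries = entries[1 .. $]; }
}

// (a) "Open" the archive. Hypothetical stand-in: a real version would take
// a file or stream rather than a ready-made list of entries.
ArchiveRange openArchive(ArchiveEntry[] entries)
{
    return ArchiveRange(entries);
}

void main()
{
    auto archive = openArchive([
        ArchiveEntry("a.d", "module a;".representation),
        ArchiveEntry("b.d", "module b;".representation),
    ]);

    foreach (entry; archive)
    {
        size_t total;
        foreach (chunk; entry.byChunk(4))
            total += chunk.length;
        writeln(entry.name, ": ", total, " bytes");
    }
}
```

Because both levels are plain input ranges, the caller never needs random access to the archive, which is what makes the same loop usable on a pipe.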
Mar 18 2010
next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
 Heh, incidentally I just needed a tar reader a few days ago, so I wrote 
 an embryo of a base class etc. I'll add it soon.
 
 The basic interface is:
 
 (a) open the archive
 
 (b) get an input range for it. The range iterates over archive entries.
 
 (c) You can look at archive info, and if you want to extract you can get 
 a .byChunk() range to extract it. That's also an input range.
 
 For now I'm only concerned with reading... writing needs to be added.

That's great, but I only suggest that this not be added to Phobos until a generic archive interface is also added. That way, we can constantly add support for new archive formats without requiring users to change their code.

Some suggestions for that:

1. The archive type should be represented by a string literal, not an enum. This way, users can add other archive types without having to touch the Phobos source code.

2. The reader should auto-detect the archive type based on the file contents, not the file name, and then call the appropriate factory method.
Mar 18 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 03/18/2010 02:49 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 Heh, incidentally I just needed a tar reader a few days ago, so I
 wrote an embryo of a base class etc. I'll add it soon.

 The basic interface is:

 (a) open the archive

 (b) get an input range for it. The range iterates over archive entries.

 (c) You can look at archive info, and if you want to extract you can
 get a .byChunk() range to extract it. That's also an input range.

 For now I'm only concerned with reading... writing needs to be added.

That's great, but I only suggest that this not be added to Phobos until a generic archive interface is also added. That way, we can constantly add support for new archive formats without requiring users to change their code.

Yah.
 Some suggestions for that:

 1. The archive type should be represented by a string literal, not an
 enum. This way, users can add other archive types without having to
 touch the Phobos source code.
 2. The reader should auto-detect the archive type based on the file
 contents, not the file name, and then call the appropriate factory method.

The archive type should be a D class inheriting ArchiveReader, so no enum and no string need be involved. The rest is a matter of registry - a new archiver registers itself into a database of archivers that maps file header data to (pointers to) factory methods. Typical file extensions should help, too, because they'd ease matching.

Reading the file header (e.g. first 512 bytes) and then matching against archive signatures is, I think, a very nice touch. (I was only thinking of matching by file name.) There is a mild complication - you can't close and reopen the archive, so you need to pass those 512 bytes to the archiver along with the rest of the stream. This is because the stream may not be rewindable, as is the case with pipes.

Sounds great!

Andrei
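A minimal sketch of such a registry, assuming names that do not appear in the thread: `registerArchiver`, `detect`, `Factory`, `ZipReader` and `GzipReader` are all hypothetical, and only `ArchiveReader` is taken from the post. The signatures are the well-known leading bytes of a zip local file header ("PK" 03 04) and of a gzip stream (1F 8B). The factory receives the header bytes that were already consumed, since a pipe cannot be rewound.

```d
import std.algorithm.searching : startsWith;
import std.stdio : writeln;

// Base class named in the post; its members here are assumptions.
abstract class ArchiveReader
{
    abstract string formatName();
}

class ZipReader : ArchiveReader
{
    override string formatName() { return "zip"; }
}

class GzipReader : ArchiveReader
{
    override string formatName() { return "gzip"; }
}

// A factory gets the header bytes that were read for detection, so the
// reader can continue from there without reopening the stream.
alias Factory = ArchiveReader function(const(ubyte)[] header);

struct Registration
{
    immutable(ubyte)[] signature;
    Factory create;
}

Registration[] registry;

// New archivers add themselves here; no central enum has to change.
void registerArchiver(immutable(ubyte)[] signature, Factory create)
{
    registry ~= Registration(signature, create);
}

// Match the header against every registered signature.
// Returns null when no archiver recognizes the data.
ArchiveReader detect(const(ubyte)[] header)
{
    foreach (r; registry)
        if (header.startsWith(r.signature))
            return r.create(header);
    return null;
}

static immutable ubyte[] zipSignature = [0x50, 0x4B, 0x03, 0x04];
static immutable ubyte[] gzipSignature = [0x1F, 0x8B];

ArchiveReader makeZip(const(ubyte)[] header) { return new ZipReader; }
ArchiveReader makeGzip(const(ubyte)[] header) { return new GzipReader; }

void main()
{
    registerArchiver(zipSignature, &makeZip);
    registerArchiver(gzipSignature, &makeGzip);

    const(ubyte)[] header = [0x1F, 0x8B, 0x08, 0x00];
    auto reader = detect(header);
    writeln(reader is null ? "unknown" : reader.formatName());
}
```

Matching by file extension could be layered on as a tie-breaker, but the sketch leaves it out to keep the content-based path visible.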
Mar 18 2010
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
 Reading the file header (e.g. first 512 bytes) and then matching against 
 archive signatures is, I think, a very nice touch. (I was only thinking 
 of matching by file name.) There is a mild complication - you can't 
 close and reopen the archive, so you need to pass those 512 bytes to the 
 archiver along with the rest of the stream. This is because the stream 
 may not be rewindable, as is the case with pipes.

The reasons for reading the file to determine the archive type are:

1. Files sometimes lose their extensions when being transferred around. I sometimes have this problem when downloading files from the internet - Windows will store it without an extension.

2. Sometimes I have to remove the extension when sending a file via email, as stupid email readers block certain email messages based on file attachment extensions.

3. People don't always put the right extension onto the file.

4. Passing an archive of one type to a reader for another type causes the reader to crash (yes, I know, readers should be more robust that way, but reality is reality).

Is it really necessary to support streaming archives? The reason I ask is we can nicely separate building/reading archives from file I/O. The archives can be entirely done in memory. Perhaps if an archive is being streamed, the program can simply accumulate it all in memory, then call the archive library functions.
Mar 18 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 03/18/2010 03:11 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 Reading the file header (e.g. first 512 bytes) and then matching
 against archive signatures is, I think, a very nice touch. (I was only
 thinking of matching by file name.) There is a mild complication - you
 can't close and reopen the archive, so you need to pass those 512
 bytes to the archiver along with the rest of the stream. This is
 because the stream may not be rewindable, as is the case with pipes.

The reasons for reading the file to determine the archive type are: 1. Files sometimes lose their extensions when being transferred around. I sometimes have this problem when downloading files from the internet - Windows will store it without an extension. 2. Sometimes I have to remove the extension when sending a file via email, as stupid email readers block certain email messages based on file attachment extensions. 3. People don't always put the right extension onto the file. 4. Passing an archive of one type to a reader for another type causes the reader to crash (yes, I know, readers should be more robust that way, but reality is reality).

Makes sense.
 Is it really necessary to support streaming archives?

It is not necessary, only vital.
 The reason I ask
 is we can nicely separate building/reading archives from file I/O. The
 archives can be entirely done in memory. Perhaps if an archive is being
 streamed, the program can simply accumulate it all in memory, then call
 the archive library functions.

This is completely nonscalable! 90% of all my archive manipulation involves streaming, and I wouldn't dream of thinking of loading most of those files in RAM. They are huge! I paste from a script I'm working on right now:

    if [[ ! -f $D/sentences.num.gz ]]; then
        echo '# Text to numeric...'
        ./txt2num.d $D/voc.txt \
            < <(pv $D/sentences.txt.gz | gunzip) \
            > >(gzip >$D/sentences.num.tmp.gz)
        mv $D/sentences.num.tmp.gz $D/sentences.num.gz
    fi

That takes a good amount of time to run because the .gz involved is 2,180,367,456 bytes _after_ compression. Note how zipping is done both ways - on reading and writing.

It would be great if we all went to the utmost possible lengths to distance ourselves from such nonscalable thinking. It's the root reason for which the wc sample program on digitalmars.com is _inappropriate_ and _damaging_ to the reputation of the language, and also the reason for which hash tables' implementation performs so poorly on large data - i.e., exactly when it matters. It's the kind of thinking stemming from "But I don't have _one_ file larger than 1GB anywhere on my hard drive!" which you repeatedly claimed as if it were a solid argument. Well if you don't have one you better get some.

Nobody's going to give us a cookie if we process 50KB files 10 times faster than Perl or Python. Where it does matter is large data, and I'd be in a much better mood if I didn't feel my beard growing while I'm waiting next to a program that uses hashes to build a large index file.

Andrei
Mar 18 2010
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
 Is it really necessary to support streaming archives?


I understand your point. But I still would like a way to build and read archives entirely in memory. One reason is that's how dmd is able to generate libraries so quickly.
Mar 18 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 03/18/2010 04:22 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 Is it really necessary to support streaming archives?


I understand your point. But I still would like a way to build and read archives entirely in memory. One reason is that's how dmd is able to generate libraries so quickly.

Makes sense. (On the read side, reading in memory is not a problem if reading from a stream is defined - just use the streaming interface to load stuff in memory. For the writing part we need the mythical streaming abstraction that replaces current streams...) Andrei
Mar 18 2010
parent Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
 On 03/18/2010 04:22 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 Is it really necessary to support streaming archives?


I understand your point. But I still would like a way to build and read archives entirely in memory. One reason is that's how dmd is able to generate libraries so quickly.

Makes sense. (On the read side, reading in memory is not a problem if reading from a stream is defined - just use the streaming interface to load stuff in memory. For the writing part we need the mythical streaming abstraction that replaces current streams...) Andrei

Maybe a better way to do it is to just pass a delegate that encapsulates a reader, and a delegate for the writing. That way, both streams and in-memory buffers will work with the same interface, and the archiver need know nothing about streams or memory. Some default delegates can be provided that interface to streams, files, and memory buffers. Or maybe just pass a range!
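A rough sketch of the delegate idea for the reading side, under assumptions of my own: the `Source` signature (fill a buffer, return the number of bytes produced, 0 at end of data) and the names `memorySource` and `countBytes` are hypothetical. The consumer sees only the delegate, so it cannot tell whether the bytes come from memory, a file, or a pipe.

```d
import std.algorithm.comparison : min;
import std.stdio : writeln;
import std.string : representation;

// Hypothetical reader delegate: fills buf and returns how many bytes it
// wrote; a return value of 0 signals the end of the data.
alias Source = size_t delegate(ubyte[] buf);

// A default delegate backed by an in-memory buffer.
Source memorySource(const(ubyte)[] data)
{
    return delegate size_t(ubyte[] buf)
    {
        size_t n = min(buf.length, data.length);
        buf[0 .. n] = data[0 .. n];
        data = data[n .. $];   // advance past the bytes just handed out
        return n;
    };
}

// Stand-in for an archiver: it knows nothing about streams or memory,
// it just keeps calling the delegate until the data runs out.
size_t countBytes(Source read)
{
    ubyte[4] buf;
    size_t total;
    for (;;)
    {
        immutable got = read(buf[]);
        if (got == 0)
            break;
        total += got;
    }
    return total;
}

void main()
{
    auto source = memorySource("hello, archive".representation);
    writeln(countBytes(source), " bytes read");
}
```

A writer delegate would mirror this with a `void delegate(const(ubyte)[])` or similar, and a range-based variant would replace the delegate with an input range of chunks.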
Mar 18 2010
prev sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
 The basic interface is:

Another thing needed for the interface is an associative array that maps a string to a member of the archive. Object code libraries do this (the string is the unresolved symbol's name, the member is of course the corresponding object file).
Mar 18 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 03/18/2010 05:11 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 The basic interface is:

Another thing needed for the interface is an associative array that maps a string to a member of the archive. Object code libraries do this (the string is the unresolved symbol's name, the member is of course the corresponding object file).

Emphatically NO. Archives work with streams. You can build indexing on top of them. Andrei
Mar 18 2010
next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-03-18 18:17:26 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 On 03/18/2010 05:11 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 The basic interface is:

Another thing needed for the interface is an associative array that maps a string to a member of the archive. Object code libraries do this (the string is the unresolved symbol's name, the member is of course the corresponding object file).

Emphatically NO. Archives work with streams. You can build indexing on top of them.

Andrei, have you taken a look at the Zip file format? It's not streamable.

To be exact, zip is not streamable because you need to read the central directory at the end of the archive to get the actual file list. This has its benefits: it makes it easy to peek at the content without loading everything, and it makes it possible to completely change the archive's logical content just by appending to the file. It's like a mini-database in a way.

<http://en.wikipedia.org/wiki/ZIP_(file_format)#Technical_information>

I agree it is essential to have streaming support for archive formats that work with streaming. But offering only that is not a solution for archives in general.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Mar 18 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 03/18/2010 05:32 PM, Michel Fortin wrote:
 On 2010-03-18 18:17:26 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 On 03/18/2010 05:11 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 The basic interface is:

Another thing needed for the interface is an associative array that maps a string to a member of the archive. Object code libraries do this (the string is the unresolved symbol's name, the member is of course the corresponding object file).

Emphatically NO. Archives work with streams. You can build indexing on top of them.

Andrei, have you taken a look at the Zip file format? It's not streamable. To be exact, zip is not streamable because you need to read the central directory at the end of the archive to get the actual file list. This has its benefits: it makes it easy to peek at the content without loading everything, and it makes it possible to completely change the archive's logical content just by appending to the file. It's like a mini-database in a way. <http://en.wikipedia.org/wiki/ZIP_(file_format)#Technical_information> I agree it is essential to have streaming support for archive formats that work with streaming. But offering only that is not a solution for archives in general.

Interesting, thank you. I still think generally a random-access interface is not the charter of the Archive interface. A zip archive should open the archive, seek to the end of it once, build an index, and then rewind the file for sequential access. But we shouldn't ask for such miracles from all archives. Andrei
Mar 18 2010
prev sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
 On 03/18/2010 05:11 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 The basic interface is:

Another thing needed for the interface is an associative array that maps a string to a member of the archive. Object code libraries do this (the string is the unresolved symbol's name, the member is of course the corresponding object file).

Emphatically NO. Archives work with streams. You can build indexing on top of them.

Such an interface won't work with .lib or .a archives. Both have an embedded table of contents that is such an associative array - it's not a list of file names, either, that's separate.
Mar 18 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 03/18/2010 06:00 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 On 03/18/2010 05:11 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 The basic interface is:

Another thing needed for the interface is an associative array that maps a string to a member of the archive. Object code libraries do this (the string is the unresolved symbol's name, the member is of course the corresponding object file).

Emphatically NO. Archives work with streams. You can build indexing on top of them.

Such an interface won't work with .lib or .a archives. Both have an embedded table of contents that is such an associative array - it's not a list of file names, either, that's separate.

Now I understand why linkers thrash the disk.

Anyway, my point is: indexing the archive should not be part of the basic interface. Such capabilities should be in an enhanced interface that builds upon the basic one.

Andrei
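One way to picture that layering, as a sketch only: the basic interface is any input range of entries, and the index is an ordinary function built on top of it. `Entry` and `buildIndex` are hypothetical names, and the `offset` field is an assumption standing in for whatever a real archive would need to locate a member.

```d
import std.range.primitives : ElementType, isInputRange;
import std.stdio : writeln;

// Hypothetical entry description produced by the basic, sequential interface.
struct Entry
{
    string name;     // member name (or symbol name, for object libraries)
    size_t offset;   // assumed: where the member starts in the archive
}

// The "enhanced" layer: walk the entries once and build an associative
// array from name to offset. It relies only on the input range primitives.
size_t[string] buildIndex(R)(R entries)
    if (isInputRange!R && is(ElementType!R : Entry))
{
    size_t[string] index;
    foreach (e; entries)
        index[e.name] = e.offset;
    return index;
}

void main()
{
    auto entries = [Entry("a.o", 0), Entry("b.o", 512)];
    auto index = buildIndex(entries);

    if (auto p = "b.o" in index)
        writeln("b.o starts at offset ", *p);
}
```

Formats that already carry a table of contents, such as .lib or .a, could fill the same associative array from their embedded table instead of scanning every entry.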
Mar 18 2010
parent Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
 On 03/18/2010 06:00 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 On 03/18/2010 05:11 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 The basic interface is:

Another thing needed for the interface is an associative array that maps a string to a member of the archive. Object code libraries do this (the string is the unresolved symbol's name, the member is of course the corresponding object file).

Emphatically NO. Archives work with streams. You can build indexing on top of them.

Such an interface won't work with .lib or .a archives. Both have an embedded table of contents that is such an associative array - it's not a list of file names, either, that's separate.

Now I understand why linkers thrash the disk.

I think this is incorrect. The table of contents in the .lib files was designed to work with a floppy disk system, and to minimize the number of disk reads. The design of .a libraries is equivalent. The thrashing of linkers came about on limited memory systems as the linker's in-memory data set often exceeded physical RAM. A typical linker run also simply needs to read a lot of files.
 Anyway, my point is: indexing the archive should not be part of the 
 basic interface. Such capabilities should be in an enhanced interface 
 that builds upon the basic one.

That would be fine.
Mar 18 2010
prev sibling parent Robert Clipsham <robert octarineparrot.com> writes:
On 18/03/10 16:39, Ellery Newcomer wrote:
 Wait, what's tar.xz?

http://en.wikipedia.org/wiki/Xz - A lot of Linux distros seem to be moving to it from .tar.gz for packaging. I'm on Arch Linux, and the updates are a fraction of the size they used to be :)
Mar 18 2010