
digitalmars.D - Tool to strip function bodies out of D code?

reply "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
Since my project nonagon isn't open-source (though that may change in the 
next few months), I distribute it as a lib with "headers" which just have 
declarations for all the classes and functions and whatnot.  In order to 
generate these "headers," I basically just have to remove all function 
bodies and replace them with semicolons.

The problem is that I did this manually at first, and it took me maybe 5-10 
minutes to do it.  Now that nonagon has grown so much, it would probably 
take me close to an hour to replace all the function bodies with semicolons.

I wrote a "dumb" tool which basically counts the number of left and right 
braces, and based on the nesting level, it either outputs the text it reads 
or it doesn't.  The problem with this is that once I start adding things in 
like version blocks and nested classes, it'll start stripping things out 
that it's not supposed to.  It also doesn't strip out global-level function 
bodies.
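The brace-counting idea described above can be sketched in a few lines (shown here in Python purely to illustrate the algorithm; it is not the poster's actual tool), including the flaw mentioned: it cannot tell a function body from a class, struct, or version block, so it strips those too.

```python
# Illustrative sketch (not the poster's tool): strip everything nested
# inside braces, keeping only top-level text plus a ';' per stripped block.
# This reproduces the "dumb" behavior described above, including its flaw:
# it cannot distinguish a function body from a class/struct/version block.
def dumb_strip(source: str) -> str:
    out = []
    depth = 0
    for ch in source:
        if ch == '{':
            if depth == 0:
                out.append(';')  # replace the whole block with a semicolon
            depth += 1
        elif ch == '}':
            depth -= 1
        elif depth == 0:
            out.append(ch)
    return ''.join(out)

print(dumb_strip("int add(int a, int b) { return a + b; }"))
# -> int add(int a, int b) ;
print(dumb_strip("class C { int x; }"))
# -> class C ;   (the flaw: the class body is wrongly stripped too)
```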

Is there any kind of lexer/parser tool for D that will allow me to quickly 
strip out the function bodies?  Surely someone has run across this problem 
before.  Normally, I'd look into using the frontend, but, well, it's 
completely undocumented and written in C++.  And I really don't feel like 
spending a week and a half figuring out how to use the thing and then 
writing a tool.

Wasn't there some kind of D lexer tool written in D several months ago?  I 
can't find the NG thread.. 
Aug 09 2005
next sibling parent reply Niko Korhonen <niktheblak hotmail.com> writes:
Jarrett Billingsley wrote:
 The problem is that I did this manually at first, and it took me maybe 5-10 
 minutes to do it.  Now that nonagon has grown so much, It would probably 
 take me close to an hour to replace all the function bodies with semicolons.

True. The approach available with current tools (i.e. stripping the function bodies manually) is not really an option. Because the current approach is so difficult, there are people who think it's *technically impossible* to create closed-source libraries with D (and thus abandon the language), and I don't blame them.
 Is there any kind of lexer/parser tool for D that will allow me to quickly 
 strip out the function bodies?  Surely someone has run across this problem 
 before. 

Surely we have :) This has been discussed in the NG from time to time, and sometimes rather philosophical elements crept into the discussion. The following is a summary of the most common opinions, or at least my perception of them.

The open source maniac: We don't need that kind of abomination. Let some company create a commercial tool for people who need it. If you're not going to share your source, you don't deserve to use free tools. Developers who are working for free shouldn't waste their time programming a tool to help create nonfree/closed-source software.

The pragmatic: This kind of tool would be very useful, but open source developers have no time to create one, and commercial developers aren't interested in making tools for a language with a tiny user (=customer) base.

The D evangelist: Surely D is suitable for closed-source development too; it's only a matter of time before someone creates this kind of tool. It's even possible now! You just have to go through living hell to process your thousands, if not hundreds of thousands, of lines of code, and cheerfully descend into the maintenance hell that surely awaits anyone who keeps two sets of source code in manual synchronization.

-- 
Niko Korhonen
SW Developer
Aug 09 2005
next sibling parent "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"Niko Korhonen" <niktheblak hotmail.com> wrote in message 
news:ddaing$2ops$1 digitaldaemon.com...
 The open source maniac: We don't need that kind of abomination. Let some
 company create a commercial tool for people who need it. If you're not
 going to share your source, you don't deserve to use free tools.
 Developers who are working for free shouldn't waste their time
 programming a tool to help creating nonfree/closed-source software.

Haha, those crazy Linuxheads ;)

What I'm thinking is that it would probably not be too difficult a feature to implement in the _compiler_. Think about it: creating a symbol table that just holds names and signatures is exactly what the compiler does when it imports a module. Outputting the symbol table in a readable format wouldn't be too much of a stretch from there.
Aug 09 2005
prev sibling next sibling parent reply Dejan Lekic <leka entropy.tmok.com> writes:
I share this opinion even though I am not an open-source maniac. - Every
normal human being would state "If you're not going to share your source,
you don't deserve to use free tools.".
It is IMHO fair enough: if someone shares an apple with you today, for
instance, then if you are a good person, you will share an apple with that
person when you have one, or with some other person who gave an apple to
the person you got your apple from. :)

 The open source maniac: We don't need that kind of abomination. Let some
 company create a commercial tool for people who need it. If you're not
 going to share your source, you don't deserve to use free tools.
 Developers who are working for free shouldn't waste their time
 programming a tool to help creating nonfree/closed-source software.

-- ........... Dejan Lekic http://dejan.lekic.org
Aug 09 2005
parent bobef <bobef_member pathlink.com> writes:
In article <ddaqoc$3d1$1 digitaldaemon.com>, Dejan Lekic says...
I share this opinion even though I am not an open-source maniac. - Every
normal human being would state "If you're not going to share your source,
you don't deserve to use free tools.".
It is imho fair enough if someone shares an apple with you today, for an
instance, than if You are good person, You will share an apple with that
person when You have it, or with some other person who gave an apple to
that person you got an apple from. :)

I totally disagree. Sharing something because you expect something to be shared with you is no different than selling it.
Aug 10 2005
prev sibling parent reply bobef <bobef_member pathlink.com> writes:
The open source maniac: We don't need that kind of abomination. Let some
company create a commercial tool for people who need it. If you're not
going to share your source, you don't deserve to use free tools.
Developers who are working for free shouldn't waste their time
programming a tool to help creating nonfree/closed-source software.

The pragmatic: This kind of tool would be very useful but open source
developers have no time to create one and commercial developers aren't
interested in making tools for a language with tiny user (=customer) base.

Just an example: my app (lessequal.com/akide) is open source. Compiling and linking it takes 3 seconds on my PC when I pass all the files at once to dmd, but compiling each file separately without linking takes 40+ seconds! THIS IS SICK!

Now I believe that if we had headers it would take a few seconds more than my 3... But now my source is so big that I'd rather add intellisense to the IDE (akide is developed in akide) and let it decide which files need recompiling, instead of creating all the headers. If I had to "rebuild all" more often, my development would be SIGNIFICANTLY slower. So I don't know if we need a tool to create headers, or if dmd should be smarter...
Aug 10 2005
parent Derek Parnell <derek psych.ward> writes:
On Wed, 10 Aug 2005 20:41:08 +0000 (UTC), bobef wrote:

The open source maniac: We don't need that kind of abomination. Let some
company create a commercial tool for people who need it. If you're not
going to share your source, you don't deserve to use free tools.
Developers who are working for free shouldn't waste their time
programming a tool to help creating nonfree/closed-source software.

The pragmatic: This kind of tool would be very useful but open source
developers have no time to create one and commercial developers aren't
interested in making tools for a language with tiny user (=customer) base.

Just an example: my app (lessequal.com/akide) is open source. Compiling and linking it takes 3 seconds on my pc when I pass all the files at once to dmd. But compiling each file w/o linking takes 40+ seconds! THIS IS SICK! Now I believe if we had headers it would take few seconds more than my 3... But now my source is so big I'd rather add intellisence to the ide (akide is developed on akide) and let it decide which files need recompiling instead of creating all the headers... If I had to "rebuild all" more often my development should be SIGNIFICANTLY slower. So I don't know if we need tool to create headers or dmd should be smarter...

This is one of the reasons I wrote Build. By default, it only compiles those files that need to be compiled.

-- 
Derek Parnell (skype: derek.j.parnell)
Melbourne, Australia
Download BUILD from ... http://www.dsource.org/projects/build/
v2.09 released 10/Aug/2005
http://www.prowiki.org/wiki4d/wiki.cgi?FrontPage
11/08/2005 9:35:17 AM
Aug 10 2005
prev sibling next sibling parent reply BCSD <BCSD_member pathlink.com> writes:
One solution suggests itself (if I have time I might try implementing it,
later this week maybe).

Use a stack to keep track of the state in each level of nesting. Whenever you
find a '{', go back and check to see what type of block it is, push your current
state, change your state, and keep going. Whenever you find a '}', pop off a
state. You would need to delay output (or back up occasionally), but it shouldn't
be that bad. With a little more work it could even handle braces in quotes.
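A rough sketch of the stack scheme above (in Python, for illustration only; the keyword scan used to classify a block is a simplification, and strings/comments are ignored here, as the post concedes):

```python
import re

# Sketch of the stack-of-states idea: on '{', look back at the text since
# the last brace to classify the block; keep class/struct/enum/version
# bodies, replace function bodies with ';'. The keyword list is a guess
# at what "type of block" detection might check, not a complete D parser.
KEEP = re.compile(r'\b(class|struct|union|enum|interface|version|debug)\b')

def strip_bodies(source: str) -> str:
    out, stack, chunk = [], [], []
    for ch in source:
        if ch == '{':
            head = ''.join(chunk)
            keep = KEEP.search(head) is not None
            if not stack or stack[-1]:          # only emit text in kept regions
                out.append(head + ('{' if keep else ';'))
            stack.append(keep and (not stack or stack[-1]))
            chunk = []
        elif ch == '}':
            if stack.pop():                     # close a kept block
                out.append(''.join(chunk) + '}')
            chunk = []
        else:
            chunk.append(ch)
    out.append(''.join(chunk))                  # trailing top-level text
    return ''.join(out)

print(strip_bodies("class C { int f() { return 1; } int x; }"))
# -> class C { int f() ; int x; }
```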



In article <ddah3h$2mn5$1 digitaldaemon.com>, Jarrett Billingsley says...
Since my project nonagon isn't open-source (though that may change in the 
next few months), I distribute it as a lib with "headers" which just have 
declarations for all the classes and functions and whatnot.  In order to 
generate these "headers," I basically just have to remove all function 
bodies and replace them with semicolons.

The problem is that I did this manually at first, and it took me maybe 5-10 
minutes to do it.  Now that nonagon has grown so much, It would probably 
take me close to an hour to replace all the function bodies with semicolons.

I wrote a "dumb" tool which basically counts the number of left and right 
braces, and based on the nesting level, it either outputs the text it reads 
or it doesn't.  The problem with this is that once I start adding things in 
like version blocks and nested classes, it'll start stripping things out 
that it's not supposed to.  It also doesn't strip out global-level function 
bodies.

Is there any kind of lexer/parser tool for D that will allow me to quickly 
strip out the function bodies?  Surely someone has run across this problem 
before.  Normally, I'd look into using the frontend, but, well, it's 
completely undocumented and written in C++.  And I really don't feel like 
spending a week and a half figuring out how to use the thing and then 
writing a tool.

Wasn't there some kind of D lexer tool written in D several months ago?  I 
can't find the NG thread.. 

Aug 09 2005
next sibling parent Brad Beveridge <brad somewhere.net> writes:
I'm working on just that tool right now.  For me it is the first phase 
of building a tool that creates scripting language bindings for D code. 
  What I have now is pretty basic, but it is starting to get there.  If 
you want to take what I already have and run with it, just ask & I'll 
post it here.
So far I
  - recognise the version statement, cutting out versions that are not 
set & adding versions that are set
  - leave struct and class bodies
  - strip bodies from regular functions and member functions

The code is very simple & probably quite broken in places, so it is at 
your own risk :)

Brad
Aug 09 2005
prev sibling next sibling parent "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"BCSD" <BCSD_member pathlink.com> wrote in message 
news:ddajld$2q7t$1 digitaldaemon.com...
 Use a stack to keep track of the state in each level of nesting. Whenever 
 you
 find a '{' go back and check to se what type of block it is, push your 
 current
 state, change your state and keep going. Whenever you find a '}' pop off a
 state. You would need to delay output (or backup occasionally) but it 
 shouldn't
 be that bad. With a little more work it could even handle braces in 
 quotes.

Sounds similar to what I'm doing now, although it might be a little difficult to determine the type of the block without a lexer / parser. I suppose I could check for a few keywords, like "class," "struct," and "enum," and let those through, and just kill everything else. Thanks!
Aug 09 2005
prev sibling parent Hasan Aljudy <hasan.aljudy gmail.com> writes:
BCSD wrote:
 One solution suggests it’s self (if I have time I might try implementing it,
 later this week maybe).
 
 Use a stack to keep track of the state in each level of nesting. Whenever you
 find a '{' go back and check to se what type of block it is, push your current
 state, change your state and keep going. Whenever you find a '}' pop off a
 state. You would need to delay output (or backup occasionally) but it shouldn’t
 be that bad. With a little more work it could even handle braces in quotes.
 

It's probably better to tokenize the text first. When strings are tokenized, you don't have to worry about whether or not they have a '}' inside them; you just look for '{' and '}' tokens.

I'm *trying* to do something like that, in Java. So far I only have tokenization done, although my concept of a 'token' is probably different from what most people might conceptualize, because I didn't read it in a book or study it in college; I just sort of made up my own concept of it. I'll gladly post it if anyone asks for it, but be aware that I haven't tested it heavily, as I am on my own. Right now I'm trying to parse declarations; if that gets done properly, then I think it would be easy to strip off function bodies and whatnot.
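The tokenize-first point can be illustrated with a small sketch (Python here, not the poster's Java code): a scanner that yields '{'/'}' tokens only outside string literals and comments, so a quoted brace can never upset the nesting count.

```python
# Illustrative sketch (not the poster's tokenizer): emit brace tokens only
# when they occur outside string literals and // or /* */ comments, so a
# '}' inside "..." or a comment cannot confuse a nesting counter.
def brace_tokens(source: str):
    i, n = 0, len(source)
    while i < n:
        ch = source[i]
        if ch == '"':                      # skip a string literal
            i += 1
            while i < n and source[i] != '"':
                i += 2 if source[i] == '\\' else 1   # honor \" escapes
            i += 1
        elif source.startswith('//', i):   # skip a line comment
            while i < n and source[i] != '\n':
                i += 1
        elif source.startswith('/*', i):   # skip a block comment
            end = source.find('*/', i + 2)
            i = n if end == -1 else end + 2
        elif ch in '{}':
            yield ch
            i += 1
        else:
            i += 1

print(list(brace_tokens('void f() { writef("}"); } // }')))
# -> ['{', '}']   (the brace in the string and the comment are ignored)
```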
 
 
 In article <ddah3h$2mn5$1 digitaldaemon.com>, Jarrett Billingsley says...
 
Since my project nonagon isn't open-source (though that may change in the 
next few months), I distribute it as a lib with "headers" which just have 
declarations for all the classes and functions and whatnot.  In order to 
generate these "headers," I basically just have to remove all function 
bodies and replace them with semicolons.

The problem is that I did this manually at first, and it took me maybe 5-10 
minutes to do it.  Now that nonagon has grown so much, It would probably 
take me close to an hour to replace all the function bodies with semicolons.

I wrote a "dumb" tool which basically counts the number of left and right 
braces, and based on the nesting level, it either outputs the text it reads 
or it doesn't.  The problem with this is that once I start adding things in 
like version blocks and nested classes, it'll start stripping things out 
that it's not supposed to.  It also doesn't strip out global-level function 
bodies.

Is there any kind of lexer/parser tool for D that will allow me to quickly 
strip out the function bodies?  Surely someone has run across this problem 
before.  Normally, I'd look into using the frontend, but, well, it's 
completely undocumented and written in C++.  And I really don't feel like 
spending a week and a half figuring out how to use the thing and then 
writing a tool.

Wasn't there some kind of D lexer tool written in D several months ago?  I 
can't find the NG thread.. 


Aug 09 2005
prev sibling next sibling parent "Ben Hinkle" <bhinkle mathworks.com> writes:
 Normally, I'd look into using the frontend, but, well, it's completely 
 undocumented and written in C++.  And I really don't feel like spending a 
 week and a half figuring out how to use the thing and then writing a tool.

The dmdfe "tool starter kit" might make writing the tool a little easier, but one would still have to write some C++ to get it working. Also, dmdfe is probably a few releases behind by now. The dsource page is http://www.dsource.org/projects/dmdfe/

I also remember someone writing a header-generator, but I don't have any links handy. Hopefully they are still around or someone remembers more. Check the wiki or Google or something... good luck
Aug 09 2005
prev sibling next sibling parent John Demme <me teqdruid.com> writes:
On Tue, 2005-08-09 at 18:41 +0300, Niko Korhonen wrote:
 The open source maniac: We don't need that kind of abomination. Let some
 company create a commercial tool for people who need it. If you're not
 going to share your source, you don't deserve to use free tools.
 Developers who are working for free shouldn't waste their time
 programming a tool to help creating nonfree/closed-source software.
 

I'm a F/OSS evangelist, and I definitely disagree. A tool like this would even be useful for open-source projects. Most people don't have the full source code for libraries on their machines, so why should they with D? Eventually, when a library like Mango is distributed in binary form, libmango.a along with the code-stripped source should be distributed, not the full source.

I won't comment about whether or not closed-source guys deserve to use open-source stuff, however. No need for a political debate. I will say that not all F/OSS people are RMS, however.

-John Demme
Aug 09 2005
prev sibling next sibling parent J C Calvarese <technocrat7 gmail.com> writes:
In article <ddah3h$2mn5$1 digitaldaemon.com>, Jarrett Billingsley says...
Is there any kind of lexer/parser tool for D that will allow me to quickly 
strip out the function bodies?  Surely someone has run across this problem 
before.  Normally, I'd look into using the frontend, but, well, it's 
completely undocumented and written in C++.  And I really don't feel like 
spending a week and a half figuring out how to use the thing and then 
writing a tool.

You might have to do that if you really don't want to release the source.
Wasn't there some kind of D lexer tool written in D several months ago?  I 
can't find the NG thread.. 

There's a bunch of different tools that do different things. The tool "digc" might help you out. It has a function called "strip". It's mentioned here:

http://www.prowiki.org/wiki4d/wiki.cgi?ReferenceForTools

This might be the NG post you're thinking of: digitalmars.D/11075

jcc7
Aug 09 2005
prev sibling next sibling parent reply Burton Radons <burton-radons smocky.com> writes:

Jarrett Billingsley wrote:

 Since my project nonagon isn't open-source (though that may change in the 
 next few months), I distribute it as a lib with "headers" which just have 
 declarations for all the classes and functions and whatnot.  In order to 
 generate these "headers," I basically just have to remove all function 
 bodies and replace them with semicolons.
 
 The problem is that I did this manually at first, and it took me maybe 5-10 
 minutes to do it.  Now that nonagon has grown so much, It would probably 
 take me close to an hour to replace all the function bodies with semicolons.
 
 I wrote a "dumb" tool which basically counts the number of left and right 
 braces, and based on the nesting level, it either outputs the text it reads 
 or it doesn't.  The problem with this is that once I start adding things in 
 like version blocks and nested classes, it'll start stripping things out 
 that it's not supposed to.  It also doesn't strip out global-level function 
 bodies.
 
 Is there any kind of lexer/parser tool for D that will allow me to quickly 
 strip out the function bodies?  Surely someone has run across this problem 
 before.  Normally, I'd look into using the frontend, but, well, it's 
 completely undocumented and written in C++.  And I really don't feel like 
 spending a week and a half figuring out how to use the thing and then 
 writing a tool.

digc has a tool for stripping function bodies. The problem is that because of D's VERY complex versioning and debugging syntax there is no way to handle stripping robustly without understanding the entire language; it must parse the whole thing. Which is a pickle, since D is not simple to parse at all and some of its syntax is so subtle that you practically need to depend upon DMD. I think Ben has the right idea.

I've attached the code that's currently in digc which parses D into a syntax tree. I haven't looked at it in months so it might not work with various language features; there are so damned many of them. At least usage is simple:

    char [] dstrip (char [] filename, char [] data, bit [char []] versions = null,
        int version_level = 0, bit [char []] debugs = null, uint debug_level = 0)
    {
        class_lexer lexer = new class_lexer ();
        class_parser parser = new class_parser (lexer);

        lexer.source = type_marker (filename, data);

        type_printer printer;
        class_module mod = parser.parse_module ();

        printer.versions = versions;
        printer.version_level = version_level;
        printer.debugs = debugs;
        printer.debug_level = debug_level;
        printer.load_default_versions ();
        printer.skip_bodies = true;
        printer.pretty = true;
        printer.collapse_versions = true;

        return mod.toString (printer);
    }
Aug 09 2005
parent reply Shammah Chancellor <Shammah_member pathlink.com> writes:
In article <ddamm3$2uu2$1 digitaldaemon.com>, Burton Radons says...

digc has a tool for stripping function bodies.  The problem is that 
because of D's VERY complex versioning and debugging syntax there is no 
way to handle stripping robustly without understanding the entire 
language; it must parse the whole thing.  Which is a pickle, since D is 
not simple to parse at all and some of its syntax is so subtle that you 
practically need to depend upon DMD.  I think Ben has the right idea.

Funny, I thought D was supposed to be easy to parse?
Aug 09 2005
next sibling parent reply Hasan Aljudy <hasan.aljudy gmail.com> writes:
Shammah Chancellor wrote:
 In article <ddamm3$2uu2$1 digitaldaemon.com>, Burton Radons says...
 
 
digc has a tool for stripping function bodies.  The problem is that 
because of D's VERY complex versioning and debugging syntax there is no 
way to handle stripping robustly without understanding the entire 
language; it must parse the whole thing.  Which is a pickle, since D is 
not simple to parse at all and some of its syntax is so subtle that you 
practically need to depend upon DMD.  I think Ben has the right idea.

Funny, I thought D was supposed to be easy to parse?

I guess it's relative. For someone who wrote a C++ compiler, D would probably be very easy to parse.
Aug 09 2005
parent reply "Walter" <newshound digitalmars.com> writes:
"Hasan Aljudy" <hasan.aljudy gmail.com> wrote in message
news:ddbc30$mcj$1 digitaldaemon.com...
 Shammah Chancellor wrote:
 In article <ddamm3$2uu2$1 digitaldaemon.com>, Burton Radons says...
digc has a tool for stripping function bodies.  The problem is that
because of D's VERY complex versioning and debugging syntax there is no
way to handle stripping robustly without understanding the entire
language; it must parse the whole thing.  Which is a pickle, since D is
not simple to parse at all and some of its syntax is so subtle that you
practically need to depend upon DMD.  I think Ben has the right idea.


For someone who wrote a C++ compiler, D would probably be very easy to parse.

Well, that's true <g>.

Some things that make D easy to parse:
1) Clean separation between tokenizing and semantic processing - parsing fits neatly in between.
2) The parser fits in one file, 4592 lines of code.
3) The parser is provided free with D. Is it that hard to understand?

Some things that make D harder to parse:
1) It's not LALR(1); arbitrary lookahead is required for some constructs.
2) There are a lot of constructs.

Pascal is trivial to parse, and Java isn't much harder. D is still easier to parse than C, and orders of magnitude easier than C++.
Aug 09 2005
parent Hasan Aljudy <hasan.aljudy gmail.com> writes:
Walter wrote:
 "Hasan Aljudy" <hasan.aljudy gmail.com> wrote in message
 news:ddbc30$mcj$1 digitaldaemon.com...
 
Shammah Chancellor wrote:

In article <ddamm3$2uu2$1 digitaldaemon.com>, Burton Radons says...

digc has a tool for stripping function bodies.  The problem is that
because of D's VERY complex versioning and debugging syntax there is no
way to handle stripping robustly without understanding the entire
language; it must parse the whole thing.  Which is a pickle, since D is
not simple to parse at all and some of its syntax is so subtle that you
practically need to depend upon DMD.  I think Ben has the right idea.

Funny, I thought D was supposed to be easy to parse?

I guess it's relative. For someone who wrote a C++ compiler, D would probably be very easy to parse.


[snip]
 3) The parser is provided free with D. Is it that hard to understand?

It's not documented.
Aug 09 2005
prev sibling next sibling parent reply J C Calvarese <technocrat7 gmail.com> writes:
In article <ddb9ho$jua$1 digitaldaemon.com>, Shammah Chancellor says...
In article <ddamm3$2uu2$1 digitaldaemon.com>, Burton Radons says...

digc has a tool for stripping function bodies.  The problem is that 
because of D's VERY complex versioning and debugging syntax there is no 
way to handle stripping robustly without understanding the entire 
language; it must parse the whole thing.  Which is a pickle, since D is 
not simple to parse at all and some of its syntax is so subtle that you 
practically need to depend upon DMD.  I think Ben has the right idea.

Funny, I thought D was supposed to be easy to parse?

Compared to C++, I think it is much easier to parse. Also, if the purpose doesn't require perfectly correct parsing (e.g. version isn't used, or nesting comments don't need to be handled quite right), it is pretty easy.

Also, it'd be easier if Walter would quit adding cool new features to the language. It's hard to keep up with all of the constant improvements. ;)

jcc7
Aug 09 2005
parent reply "Walter" <newshound digitalmars.com> writes:
"J C Calvarese" <technocrat7 gmail.com> wrote in message
news:ddbhgl$spl$1 digitaldaemon.com...
 Compared to C++, I think it is much easier to parse. Also, if the purpose
 doesn't require perfectly correct parsing (e.g. version isn't used or

 comments don't need to be quite handled right) it is pretty easy.

Since the source to the lexer is provided, there's no excuse for not doing comments right <g>.
Aug 09 2005
parent J C Calvarese <technocrat7 gmail.com> writes:
In article <ddc0vf$1kj7$3 digitaldaemon.com>, Walter says...
"J C Calvarese" <technocrat7 gmail.com> wrote in message
news:ddbhgl$spl$1 digitaldaemon.com...
 Compared to C++, I think it is much easier to parse. Also, if the purpose
 doesn't require perfectly correct parsing (e.g. version isn't used or

 comments don't need to be quite handled right) it is pretty easy.

Since the source to the lexer is provided, there's no excuse for not doing comments right <g>.

Yes, there is an excuse: I'm not a C++ programmer. So there. When the D front end is written in D, that's when I won't have an excuse. ;)

(But I still appreciate that you open-sourced it. And I like nested comments a lot, so please don't take them out.)

jcc7
Aug 10 2005
prev sibling parent Burton Radons <burton-radons smocky.com> writes:
Shammah Chancellor wrote:
 In article <ddamm3$2uu2$1 digitaldaemon.com>, Burton Radons says...
 
 
digc has a tool for stripping function bodies.  The problem is that 
because of D's VERY complex versioning and debugging syntax there is no 
way to handle stripping robustly without understanding the entire 
language; it must parse the whole thing.  Which is a pickle, since D is 
not simple to parse at all and some of its syntax is so subtle that you 
practically need to depend upon DMD.  I think Ben has the right idea.

Funny, I thought D was supposed to be easy to parse?

That claim needs to be contrasted with other languages which actually are easy to parse: LISP, Self, Smalltalk. D is nowhere near their league. Being easier to parse than C++ - and I think they're about equivalent if you're not tool-dependent - doesn't mean much. That's like a pickup getting better gas mileage than a van.
Aug 11 2005
prev sibling next sibling parent reply pragma <pragma_member pathlink.com> writes:
I think it's been mentioned in this thread that hacking DMDFE may be a way to
go.  Another route would be to hack an existing utility that already understands
version statements well: take a look at the source for 'build' over on
dsource.org.  At the very least, it's already aware of brackets and strings,
which would be the biggest hurdles for writing your own tool.

Good luck.

- EricAnderton at yahoo
Aug 09 2005
parent reply Derek Parnell <derek psych.ward> writes:
On Tue, 9 Aug 2005 17:35:41 +0000 (UTC), pragma wrote:

 I think it's been mentioned in this thread that hacking DMDFE may be a way to
 go.  Another route would be to hack an existing utility that already
understands
 version statements well: take a look at the source for 'build' over on
 dsource.org.  At the very least, it's already aware of brackets and strings,
 which would be the biggest hurdles for writing your own tool.
 
 Good luck.
 
 - EricAnderton at yahoo

Okay, okay, okay! ;-)

I'll add a "-strip" switch to create 'header' modules. Most of the code is already there.

-- 
Derek Parnell
Melbourne, Australia
10/08/2005 8:02:47 AM
Aug 09 2005
next sibling parent Holger <Holger_member pathlink.com> writes:
In article <1gzqddtap8a8$.1wwes1e06us88$.dlg 40tude.net>, Derek Parnell says...
On Tue, 9 Aug 2005 17:35:41 +0000 (UTC), pragma wrote:

 I think it's been mentioned in this thread that hacking DMDFE may be a way to
 go.  Another route would be to hack an existing utility that already
understands
 version statements well: take a look at the source for 'build' over on
 dsource.org.  At the very least, it's already aware of brackets and strings,
 which would be the biggest hurdles for writing your own tool.
 
 Good luck.
 
 - EricAnderton at yahoo

Okay, okay, okay! ;-) I'll add a "-strip" switch to create 'header' modules. Most of the code is already there.

Thanks, that'd be really cool! Holger
-- 
Derek Parnell
Melbourne, Australia
10/08/2005 8:02:47 AM

Aug 09 2005
prev sibling parent "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"Derek Parnell" <derek psych.ward> wrote in message 
news:1gzqddtap8a8$.1wwes1e06us88$.dlg 40tude.net...
 Okay, okay, okay! ;-)

 I'll add a "-strip" switch to create 'header' modules. Most of the code is
 already there.

w00tage.
Aug 09 2005
prev sibling next sibling parent "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"Jarrett Billingsley" <kb3ctd2 yahoo.com> wrote in message 
news:ddah3h$2mn5$1 digitaldaemon.com...

Thanks for all the replies!  I'll try out BCSD's solution, and if that isn't 
quite robust enough, I'll see if I can get digc to work, and if _that_ 
doesn't work, I'll make something with DMDFE.  Thanks! 
Aug 09 2005
prev sibling next sibling parent reply Shammah Chancellor <Shammah_member pathlink.com> writes:
In article <ddah3h$2mn5$1 digitaldaemon.com>, Jarrett Billingsley says...
Since my project nonagon isn't open-source (though that may change in the 
next few months), I distribute it as a lib with "headers" which just have 
declarations for all the classes and functions and whatnot.  In order to 
generate these "headers," I basically just have to remove all function 
bodies and replace them with semicolons.

The problem is that I did this manually at first, and it took me maybe 5-10 
minutes to do it.  Now that nonagon has grown so much, It would probably 
take me close to an hour to replace all the function bodies with semicolons.

I wrote a "dumb" tool which basically counts the number of left and right 
braces, and based on the nesting level, it either outputs the text it reads 
or it doesn't.  The problem with this is that once I start adding things in 
like version blocks and nested classes, it'll start stripping things out 
that it's not supposed to.  It also doesn't strip out global-level function 
bodies.

Is there any kind of lexer/parser tool for D that will allow me to quickly 
strip out the function bodies?  Surely someone has run across this problem 
before.  Normally, I'd look into using the frontend, but, well, it's 
completely undocumented and written in C++.  And I really don't feel like 
spending a week and a half figuring out how to use the thing and then 
writing a tool.

Wasn't there some kind of D lexer tool written in D several months ago?  I 
can't find the NG thread.. 

Is this it?

digitalmars.D/26411
Aug 09 2005
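For reference, the "dumb" brace-counting approach described in the quoted post can be sketched roughly like this (illustrative Python; the original tool's source isn't posted in the thread). The sketch also makes the limitation visible: it tracks only nesting depth, not what opened a brace, so a version block or nested class inside a type gets stripped along with the function bodies.

```python
def strip_bodies(source):
    """Naive stripper in the spirit of the brace-counting tool described
    above: keep depth-0 and depth-1 text, replace any block that opens at
    depth 1 (assumed to be a member function body) with a semicolon.
    Note the two flaws the post mentions: top-level function bodies are
    NOT stripped, and any depth-2 block (version block, nested class)
    is stripped whether it is a function body or not."""
    out = []
    depth = 0
    for ch in source:
        if ch == '{':
            if depth == 0:
                out.append('{')      # keep class/struct braces
            elif depth == 1:
                out.append(';')      # replace a member function body
            depth += 1
        elif ch == '}':
            depth -= 1
            if depth == 0:
                out.append('}')
        elif depth <= 1:
            out.append(ch)
    return ''.join(out)
```

It also shares the other weakness noted later in the thread: braces inside strings and comments are counted as real braces.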
parent reply "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"Shammah Chancellor" <Shammah_member pathlink.com> wrote in message
 Is this it?

 digitalmars.D/26411

Nope, that's not it. I think it was DMDFE that I was thinking about.
Aug 09 2005
parent reply James Dunne <james.jdunne gmail.com> writes:
In article <ddbeqm$pfc$1 digitaldaemon.com>, Jarrett Billingsley says...
"Shammah Chancellor" <Shammah_member pathlink.com> wrote in message
 Is this it?

 digitalmars.D/26411

Nope, that's not it. I think it was DMDFE that I was thinking about.

No, I believe you were thinking of dlexer.d, which is a module that I wrote a while back. It lexes D source code (as of a few versions back) and returns a list of tokens. It works almost exactly like the C++ version that is in the front end, but it's written in D!

dlexer lives in the bindings project on dsource.org, here:
http://svn.dsource.org/projects/bindings/trunk/dlexer.d

Whaddaya know, the guy that's hosting your webspace is the same guy that wrote a module you need. Am I just not useful to you or what? =P Well, that all depends on if you actually use the module in question.

Regards,
James Dunne
Aug 10 2005
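The reason a lexer like dlexer helps is hinted at earlier in the thread: a stripper must ignore braces that sit inside strings and comments. A rough illustration of such scanning (in Python, since dlexer's actual API isn't shown here; nested /+ +/ comments and D's other string forms are deliberately left out of this sketch):

```python
import re

# Token pattern that recognizes string literals and comments so that
# braces inside them are skipped.  This is a sketch only: it does NOT
# handle D's nested /+ +/ comments, backquote strings, or r"" strings.
TOKEN = re.compile(r'''
    "(?:\\.|[^"\\])*"      # double-quoted string
  | '(?:\\.|[^'\\])*'      # char literal
  | //[^\n]*               # line comment
  | /\*.*?\*/              # block comment
  | [{}]                   # braces we care about
  | .                      # anything else
''', re.VERBOSE | re.DOTALL)

def code_braces(source):
    """Return +1/-1 depth deltas only for braces outside strings/comments."""
    deltas = []
    for m in TOKEN.finditer(source):
        t = m.group(0)
        if t == '{':
            deltas.append(+1)
        elif t == '}':
            deltas.append(-1)
    return deltas
```

A brace counter driven by this token stream, rather than raw characters, survives code like `a = "{";` that defeats the naive version.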
next sibling parent Michael <Michael_member pathlink.com> writes:
Hmm the dparse.d file on that site requires the dtypes module. Where is that?

In article <ddddg2$etj$1 digitaldaemon.com>, James Dunne says...
In article <ddbeqm$pfc$1 digitaldaemon.com>, Jarrett Billingsley says...
"Shammah Chancellor" <Shammah_member pathlink.com> wrote in message
 Is this it?

 digitalmars.D/26411

Nope, that's not it. I think it was DMDFE that I was thinking about.

No, I believe you were thinking of dlexer.d, which is a module that I wrote a while back. It lexes D source code (as of a few versions back) and returns a list of tokens. It works almost exactly like the C++ version that is in the front end, but it's written in D! dlexer lives in the bindings project on dsource.org, here: http://svn.dsource.org/projects/bindings/trunk/dlexer.d Whaddaya know, the guy that's hosting your webspace is the same guy that wrote a module you need. Am I just not useful to you or what? =P Well, that all depends on if you actually use the module in question. Regards, James Dunne

Aug 10 2005
prev sibling parent "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"James Dunne" <james.jdunne gmail.com> wrote in message 
news:ddddg2$etj$1 digitaldaemon.com...
 No, I believe you were thinking of dlexer.d, which is a module that I 
 wrote a
 while back.  It lexes D source code (as of a few versions back) and 
 returns a
 list of tokens.  It works almost exactly like the C++ version that is in 
 the
 front end, but it's written in D!

THAT'S THE ONE!
 dlexer lives in the bindings project on dsource.org, here:
 http://svn.dsource.org/projects/bindings/trunk/dlexer.d

 Whaddaya know, the guy that's hosting your webspace is the same guy that 
 wrote a
 module you need.  Am I just not useful to you or what? =P  Well, that all
 depends on if you actually use the module in question.

Haha, what a coincidence! I ended up changing my little tool to look for //STRIP and //NOSTRIP comments in the source file that would turn on and off stripping of code between braces. I only had a few problems, mostly caused by rogue braces that were on the same line as other code (see, I knew my coding style would pay off.. no code on the same line as a brace!), but other than that, it worked great.

I'll definitely check out / update your dlexer, though, as I'd like to write a somewhat more robust tool that can handle it without the //STRIP and //NOSTRIP directives. Thanks!
Aug 10 2005
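The //STRIP / //NOSTRIP scheme described above can be sketched like so (illustrative Python; the real tool's source isn't posted, and this assumes the braces-on-their-own-lines style the post mentions, so each line is either code or a lone brace):

```python
def directive_strip(lines):
    """Line filter for the //STRIP / //NOSTRIP scheme described above.

    Between //STRIP and //NOSTRIP, any brace-delimited block that opens
    is replaced by a semicolon; elsewhere lines pass through untouched.
    Assumes braces sit on their own lines, so lines never mix code and
    braces (the 'rogue brace' case the post mentions would break this).
    """
    out = []
    stripping = False
    depth = 0
    for line in lines:
        text = line.strip()
        if text == '//STRIP':
            stripping = True
            continue
        if text == '//NOSTRIP':
            stripping = False
            continue
        if not stripping:
            out.append(line)
            continue
        if text == '{':
            if depth == 0:
                out.append(';')      # body replaced by a semicolon
            depth += 1
        elif text == '}':
            depth -= 1
        elif depth == 0:
            out.append(line)         # declarations survive
    return out
```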
prev sibling next sibling parent reply Hasan Aljudy <hasan.aljudy gmail.com> writes:
Jarrett Billingsley wrote:
 Since my project nonagon isn't open-source (though that may change in the 
 next few months), I distribute it as a lib with "headers" which just have 
 declarations for all the classes and functions and whatnot.  In order to 
 generate these "headers," I basically just have to remove all function 
 bodies and replace them with semicolons.
 
 The problem is that I did this manually at first, and it took me maybe 5-10 
 minutes to do it.  Now that nonagon has grown so much, It would probably 
 take me close to an hour to replace all the function bodies with semicolons.
 
 I wrote a "dumb" tool which basically counts the number of left and right 
 braces, and based on the nesting level, it either outputs the text it reads 
 or it doesn't.  The problem with this is that once I start adding things in 
 like version blocks and nested classes, it'll start stripping things out 
 that it's not supposed to.  It also doesn't strip out global-level function 
 bodies.
 
 Is there any kind of lexer/parser tool for D that will allow me to quickly 
 strip out the function bodies?  Surely someone has run across this problem 
 before.  Normally, I'd look into using the frontend, but, well, it's 
 completely undocumented and written in C++.  And I really don't feel like 
 spending a week and a half figuring out how to use the thing and then 
 writing a tool.
 
 Wasn't there some kind of D lexer tool written in D several months ago?  I 
 can't find the NG thread.. 
 
 

Ok, I was writing a tool to lex/parse/analyze D code. It's not nearly done at all, but it's in a stage that allowed me to hack it in an attempt to make it a stripper tool. I have absolutely no idea how reliable it is .. if anybody wants to test/experiment/play with it, you're welcome. But I give no warranties at all ..

It's written in Java, and I used a trial version of Excelsior JET to compile it to a native Windows exe.
http://aljudy.org/dstuff/dstrip.jar << contains source
http://aljudy.org/dstuff/dstrip.exe

Here is a stripped version of build.d (of Derek Parnell's build); it was stripped using this tool of mine.
http://aljudy.org/dstuff/strip-build.d

Note that it wasn't meant to be a stripping tool; it's more of a lexer/parser project I'm working on. I just hacked it to make it strip, so it probably has bugs, but hopefully it has some usefulness.
Aug 13 2005
parent reply "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"Hasan Aljudy" <hasan.aljudy gmail.com> wrote in message 
news:ddlres$la8$1 digitaldaemon.com...
 Ok,  I was writing a tool to lex/parse/analyze d code, it's not nearly 
 done at all, but it's in a stage that allowed me to hacke it in an attempt 
 to make it a stripper tool.
 I have absolutly no idea how reliable it is .. if anybody who wants to 
 test/experiment/play with it, you're welcome. But I give no warranties at 
 all ..

 It's written in java, and I used a trial version of exelcsior jet to 
 compile it to a native windows exe.
 http://aljudy.org/dstuff/dstrip.jar << contains source
 http://aljudy.org/dstuff/dstrip.exe

I'd like to try your tool, but (1) I tried running the dstrip.jar with "java -jar dstrip.jar", but it says it can't load the Main-Class manifest attribute (I have no idea what that means as I've never done anything with Java), and (2) the compiled version is missing XKRN37052.DLL. It says you should use the JetPack II distribution utility.
Aug 13 2005
next sibling parent Hasan Aljudy <hasan.aljudy gmail.com> writes:
Jarrett Billingsley wrote:
 "Hasan Aljudy" <hasan.aljudy gmail.com> wrote in message 
 news:ddlres$la8$1 digitaldaemon.com...
 
Ok,  I was writing a tool to lex/parse/analyze d code, it's not nearly 
done at all, but it's in a stage that allowed me to hacke it in an attempt 
to make it a stripper tool.
I have absolutly no idea how reliable it is .. if anybody who wants to 
test/experiment/play with it, you're welcome. But I give no warranties at 
all ..

It's written in java, and I used a trial version of exelcsior jet to 
compile it to a native windows exe.
http://aljudy.org/dstuff/dstrip.jar << contains source
http://aljudy.org/dstuff/dstrip.exe

I'd like to try your tool, but (1) I tried running the dstrip.jar with "java -jar dstrip.jar", but it says it can't load the Main-Class manifest attribute (I have no idea what that means as I've never done anything with Java), and (2) the compiled version is missing XKRN37052.DLL. It says you should use the JetPack II distribution utility.

I don't know how to distribute my Java projects either .. tbh though, I think I agree with John Reimer:
news://news.digitalmars.com:119/ddluoo$nm1$1 digitaldaemon.com
Aug 13 2005
prev sibling parent reply Stefan Zobel <Stefan_member pathlink.com> writes:
In article <ddmd1a$10r8$1 digitaldaemon.com>, Jarrett Billingsley says...

I'd like to try your tool, but (1) I tried running the dstrip.jar with 
"java -jar dstrip.jar", but it says it can't load the Main-Class manifest 
attribute (I have no idea what that means as I've never done anything with 
Java) ...

Hi Jarrett,

you can extract the manifest.mf file with WinZip to META-INF/manifest.mf (the META-INF directory should be created in the directory where dstrip.jar is). Then change the content of manifest.mf to this single line:

Main-Class: lexer.tokens.Stripper

After that, update dstrip.jar with the jar tool from the JDK by giving the following command from the command line (you need to be in the directory where dstrip.jar is):

jar uvfm dstrip.jar META-INF/manifest.mf

After that it is executable by typing: java -jar dstrip.jar

Note: You need to have JDK 1.5 installed in order to run it successfully! Hope this helps a little ...

Kind regards,
Stefan
Aug 14 2005
parent "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"Stefan Zobel" <Stefan_member pathlink.com> wrote in message 
news:ddnjug$202u$1 digitaldaemon.com...
 Hi Jarrett,

 you can extract the manifest.mf file with WinZip to META-INF/manifest.mf
 (the META-INF directory should be created in the directory where 
 dstrip.jar is).
 Then change the content of manifest.mf to this single line:

 Main-Class: lexer.tokens.Stripper

 After that update the dstrip.jar with the jar tool from the jdk by giving
 the following command from the commandline (you need to be in the 
 directory
 where dstrip.jar is).

 jar uvfm dstrip.jar META-INF/manifest.mf

 After that it is executable by typing: java -jar dstrip.jar

 Note: You need to have JDK 1.5 installed in order to run it successfully!
 Hope this helps a little ...

*Slaps forehead* Why of course, it's so obvious! ;) I don't have the JDK, so I guess I'm stuck. Unless I download it of course.
Aug 14 2005
prev sibling next sibling parent reply John Reimer <terminal.node gmail.com> writes:
Jarrett Billingsley wrote:
 Since my project nonagon isn't open-source (though that may change in the 
 next few months), I distribute it as a lib with "headers" which just have 
 declarations for all the classes and functions and whatnot.  In order to 
 generate these "headers," I basically just have to remove all function 
 bodies and replace them with semicolons.
 
 The problem is that I did this manually at first, and it took me maybe 5-10 
 minutes to do it.  Now that nonagon has grown so much, It would probably 
 take me close to an hour to replace all the function bodies with semicolons.
 
 I wrote a "dumb" tool which basically counts the number of left and right 
 braces, and based on the nesting level, it either outputs the text it reads 
 or it doesn't.  The problem with this is that once I start adding things in 
 like version blocks and nested classes, it'll start stripping things out 
 that it's not supposed to.  It also doesn't strip out global-level function 
 bodies.
 
 Is there any kind of lexer/parser tool for D that will allow me to quickly 
 strip out the function bodies?  Surely someone has run across this problem 
 before.  Normally, I'd look into using the frontend, but, well, it's 
 completely undocumented and written in C++.  And I really don't feel like 
 spending a week and a half figuring out how to use the thing and then 
 writing a tool.
 
 Wasn't there some kind of D lexer tool written in D several months ago?  I 
 can't find the NG thread.. 
 
 

These kinds of "strip" tools just amount to the same thing as C headers for a library. D should and could do better. Let's ditch the headers/import idea completely (aka "stripping") and create a tool (integrated into build perhaps?) that just reads the symbols directly from the *.lib file that the project links with (silent "stripping"). That way shipping the library itself would be the only thing necessary for closed projects.

This has been discussed before. We don't really need more header files to mess with. We're drifting back into the C/C++ age again if we go that route.

-JJR
Aug 14 2005
next sibling parent AJG <AJG_member pathlink.com> writes:
Hi,

These kind of "strip" tools just amount to the same thing as C headers 
for a library.  D should and could do better.  Let's ditch the 
headers/import idea completely (aka "stripping") and create a tool 
(integrated into build perhaps?) that just reads the symbols directly 
from the *.lib file that the project links with (silent "stripping"). 
That way shipping the library itself would be the only thing necessary 
for closed projects.  This has been discussed before.  We don't really 
need more header files to mess with.  We're drifting back into the C/C++ 
age again if we go that route.

Yes, please. This would be a big step forward.

Cheers,
--AJG.
Aug 13 2005
prev sibling next sibling parent Dave <Dave_member pathlink.com> writes:
In article <ddluoo$nm1$1 digitaldaemon.com>, John Reimer says...
Jarrett Billingsley wrote:
 Since my project nonagon isn't open-source (though that may change in the 
 next few months), I distribute it as a lib with "headers" which just have 
 declarations for all the classes and functions and whatnot.  In order to 
 generate these "headers," I basically just have to remove all function 
 bodies and replace them with semicolons.
 
 The problem is that I did this manually at first, and it took me maybe 5-10 
 minutes to do it.  Now that nonagon has grown so much, It would probably 
 take me close to an hour to replace all the function bodies with semicolons.
 
 I wrote a "dumb" tool which basically counts the number of left and right 
 braces, and based on the nesting level, it either outputs the text it reads 
 or it doesn't.  The problem with this is that once I start adding things in 
 like version blocks and nested classes, it'll start stripping things out 
 that it's not supposed to.  It also doesn't strip out global-level function 
 bodies.
 
 Is there any kind of lexer/parser tool for D that will allow me to quickly 
 strip out the function bodies?  Surely someone has run across this problem 
 before.  Normally, I'd look into using the frontend, but, well, it's 
 completely undocumented and written in C++.  And I really don't feel like 
 spending a week and a half figuring out how to use the thing and then 
 writing a tool.
 
 Wasn't there some kind of D lexer tool written in D several months ago?  I 
 can't find the NG thread.. 
 
 

These kind of "strip" tools just amount to the same thing as C headers for a library. D should and could do better. Let's ditch the headers/import idea completely (aka "stripping") and create a tool (integrated into build perhaps?) that just reads the symbols directly from the *.lib file that the project links with (silent "stripping"). That way shipping the library itself would be the only thing necessary for closed projects. This has been discussed before. We don't really need more header files to mess with. We're drifting back into the C/C++ age again if we go that route. -JJR

Great idea, and I think you're right: D really does need to do better in this area before it will be taken seriously by a lot of people as a 'commercial quality' tool (IMHO). Not saying they have the right attitude, but that's just the way it will be.

How about just integrating the 'library stripping tool' right into the reference compiler (and therefore the language spec)? If an import couldn't be found in the import path, the compiler would 'strip' the libraries - and/or object files specified on the command line - for symbols?

After a quick glance at obj2asm output, the major challenge as-is looks to be member variable declarations and template code (enough info for top-level function and variable declarations, classes, and structs looks to be available).

- Dave
Aug 14 2005
prev sibling parent reply Vathix <chris dprogramming.com> writes:
 D should and could do better.  Let's ditch the headers/import idea  
 completely (aka "stripping") and create a tool (integrated into build  
 perhaps?) that just reads the symbols directly from the *.lib file that  
 the project links with (silent "stripping"). That way shipping the  
 library itself would be the only thing necessary for closed projects.   
 This has been discussed before.  We don't really need more header files  
 to mess with.  We're drifting back into the C/C++ age again if we go  
 that route.

The problem with that is the current lib files are very low level and dangerous. To distribute a lib file you have to force users to use the same compiler version, and any other libs you use have to be for that same version. Say you want to use libs from 2 different sources and both rely on a different compiler version. Well, now you're pretty much screwed. Even if they successfully link to a program, there still can be hidden problems, such as access violations because there was supposed to be a pointer to something at a certain location but now it's somewhere else.

If D could have a new form of lib file that knew more about the language and didn't depend on compiler version as much, hopefully not even depend on compiler brand (implementation) much, and possibly use an intermediate code form, it would be a lot safer and better in many respects.
Aug 14 2005
next sibling parent reply pragma <pragma_member pathlink.com> writes:
In article <op.svitpec7l2lsvj esi>, Vathix says...
 D should and could do better.  Let's ditch the headers/import idea  
 completely (aka "stripping") and create a tool (integrated into build  
 perhaps?) that just reads the symbols directly from the *.lib file that  
 the project links with (silent "stripping"). That way shipping the  
 library itself would be the only thing necessary for closed projects.   
 This has been discussed before.  We don't really need more header files  
 to mess with.  We're drifting back into the C/C++ age again if we go  
 that route.

The problem with that is the current lib files are very low level and dangerous. To distribute a lib file you have to force users to use the same compiler version and any other libs you use have to be for that same version. Say you want to use libs from 2 different sources and both rely on a different compiler version. Well, now you're pretty much screwed. Even if they successfully link to a program, there still can be hidden problems, such as access violations because there was supposed to be a pointer to something at a certain location but now it's somewhere else.

It's funny that you brought this up. I'm working on a runtime loader/linker for DMD's OMF .obj files. For two days' work, I now have a crude OMF parser that can digest (most) of DMD's output:

http://www.dsource.org/forums/viewtopic.php?p=5691#5691

It's not useful yet, but it does shed some light on what can be accomplished via this route. The big problem you mention, having to wield fixup data and RVAs, is certainly not a trivial task. It happens to be where I'm presently spending most of my time on my OMF loader; it'll take a week or more before I can really work the major kinks out. However, all the needed information *is* in any given object format, and can be made to work. ;)

You're right about compiler dependencies and such, as there is a huge amount of ground to cover to be inclusive to everyone. Just look at the mess GNU Binutils is (which is by no means complete) and you'll see just how bad it is out there: 400+ files and I'm *still* stuck writing a custom loader. BFD indeed. However, the advantages of runtime linking and library introspection are obvious, as has been proven by .NET and Java. Even if you're looking to just merely write a tool to create headers on the fly,
If  
D could have a new form of lib file that knew more about the language and  
didn't depend on compiler version as much, hopefully not even depend on  
compiler brand (implementation) much, and possibly use an intermediate  
code form, it would be a lot safer and better in many respects.

I agree completely. If we're all seriously looking for an inclusive and hassle-free binary format, then ELF will work for just about anything you want to do. It seems easier to parse (anything has to be easier than OMF) and can be made to operate the same way in Windows as well as in Linux (and probably Apple once they go x86).

http://www.skyfree.org/linux/references/ELF_Format.pdf

I'm looking to support this as well as OMF for my project. It's supported under GCC, MinGW and probably a whole mess of others. If that doesn't float your boat, then one can easily write a converter between COFF and ELF (if there isn't one out there already) by way of GNU Binutils.

- EricAnderton at yahoo
Aug 14 2005
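As a taste of how approachable ELF is compared to OMF, here is a minimal sketch (Python, illustrative only) that checks the 16-byte e_ident header described in the ELF spec linked above:

```python
def read_elf_ident(path):
    """Read the 16-byte e_ident header at the start of an ELF file
    (layout per the ELF spec linked above).  Returns (elf_class,
    endianness), or raises ValueError if the magic doesn't match.
    A real loader would go on to parse section and symbol tables."""
    with open(path, 'rb') as f:
        ident = f.read(16)
    if ident[:4] != b'\x7fELF':
        raise ValueError('not an ELF file')
    elf_class = {1: 'ELF32', 2: 'ELF64'}.get(ident[4], 'unknown')
    endian = {1: 'little', 2: 'big'}.get(ident[5], 'unknown')
    return elf_class, endian
```

Fixed offsets and a self-describing magic number are exactly what make ELF friendlier to hand-written parsers than OMF's record-stream format.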
parent John Reimer <terminal.node gmail.com> writes:
Heh... Eric, I was just about to mention you on this one.  Good timing! ;-)

The direction you are taking IS the solution to many of these issues.

Vathix, good points in regards to questionable safety in parsing for 
symbols in the current lib/object formats.  But I think the problem is 
solvable with enough careful planning.  Finding a solution, at least, 
would be well worth the effort for D's sake.  Otherwise D stripped 
imports are no better than C headers: D can not claim any superiority in 
this area at present (as Walter has, at times, been prone to do).

-JJR

pragma wrote:
 In article <op.svitpec7l2lsvj esi>, Vathix says...
 
D should and could do better.  Let's ditch the headers/import idea  
completely (aka "stripping") and create a tool (integrated into build  
perhaps?) that just reads the symbols directly from the *.lib file that  
the project links with (silent "stripping"). That way shipping the  
library itself would be the only thing necessary for closed projects.   
This has been discussed before.  We don't really need more header files  
to mess with.  We're drifting back into the C/C++ age again if we go  
that route.

The problem with that is the current lib files are very low level and dangerous. To distribute a lib file you have to force users to use the same compiler version and any other libs you use have to be for that same version. Say you want to use libs from 2 different sources and both rely on a different compiler version. Well, now you're pretty much screwed. Even if they successfully link to a program, there still can be hidden problems, such as access violations because there was supposed to be a pointer to something at a certain location but now it's somewhere else.

Its funny that that you brought this up. I'm working on a runtime loader/linker for DMD's OMF .obj files. For two day's work, I now have a crude OMF parser that can digest (most) of DMD's output http://www.dsource.org/forums/viewtopic.php?p=5691#5691 Its not useful yet, but it does shed some light at what can be accomplished via this route. The big problem you mention, having to wield fixup data and RVA's, is certainly not a trivial task. It happens to be where I'm presently spending most of my time on my OMF loader; it'll take a week or more before I can really work the major kinks out. However, all the needed information *is* in any given object format, and can be made to work. ;) You're right about compiler dependencies and such, as there is a huge amount of ground to cover to be inclusive to everyone. Just look at the mess GNU Binutils is (which is by no means complete) and you'll see just how bad it is out there: 400+ files and I'm *still* stuck writing a custom loader. BFD indeed. However, the advantages of runtime linking, and library introspection is obvious as has been proven by .NET and Java. Even if you're looking to just merely write a tool to create headers on the fly,
If  
D could have a new form of lib file that knew more about the language and  
didn't depend on compiler version as much, hopefully not even depend on  
compiler brand (implementation) much, and possibly use an intermediate  
code form, it would be a lot safer and better in many respects.

I agree completely. If we're all seriously looking for an inclusive and hassle-free binary format, then ELF will work for just about anything you want to do. They seem easier to parse (anything has to be easier than OMF) and can be made to operate the same way in windows as well as in linux (and probably apple once they go x86). http://www.skyfree.org/linux/references/ELF_Format.pdf I'm looking to support this as well as OMF for my project. Its supported under GCC, MINGW and probably a whole mess of others. If that doesn't float your boat, then one can easily write a converter between COFF and ELF (if there isn't one out there already) by way of GNU Binutils. - EricAnderton at yahoo

Aug 15 2005
prev sibling parent reply Hasan Aljudy <hasan.aljudy gmail.com> writes:
Vathix wrote:
 D should and could do better.  Let's ditch the headers/import idea  
 completely (aka "stripping") and create a tool (integrated into build  
 perhaps?) that just reads the symbols directly from the *.lib file 
 that  the project links with (silent "stripping"). That way shipping 
 the  library itself would be the only thing necessary for closed 
 projects.   This has been discussed before.  We don't really need more 
 header files  to mess with.  We're drifting back into the C/C++ age 
 again if we go  that route.

The problem with that is the current lib files are very low level and dangerous. To distribute a lib file you have to force users to use the same compiler version and any other libs you use have to be for that same version. Say you want to use libs from 2 different sources and both rely on a different compiler version. Well, now you're pretty much screwed. Even if they successfully link to a program, there still can be hidden problems, such as access violations because there was supposed to be a pointer to something at a certain location but now it's somewhere else. If D could have a new form of lib file that knew more about the language and didn't depend on compiler version as much, hopefully not even depend on compiler brand (implementation) much, and possibly use an intermediate code form, it would be a lot safer and better in many respects.

I really don't know anything about lib files, so I don't know how much sense I'm gonna make here, but here's an idea:

How about an intermediate file (between .d and .lib) that contains both stripped declarations and .lib content (whatever that is), so that it's easy for any compiler/linker/whatever to extract both declarations and/or .lib from that file (let's just call that file "dlib" for now). In other words, embed stripped declarations directly into lib files.

What I'm proposing is that instead of having the compiler produce lib files, it produces the "dlib" file, or have it only produce that file on a special compiler switch.

Is that a reasonable proposal? Or is it extremely stupid?
Aug 14 2005
next sibling parent Ant <duitoolkit yahoo.ca> writes:
Hasan Aljudy wrote:
 Vathix wrote:
 
 D should and could do better.  



 ...  that contains both
 stripped declarations and .lib content

doesn't Burton Radon's "make" tool do that already? (I might be mixing up things...)

Antonio
Aug 14 2005
prev sibling parent reply Dave <Dave_member pathlink.com> writes:
In article <ddpan3$6fg$1 digitaldaemon.com>, Hasan Aljudy says...
Vathix wrote:
 D should and could do better.  Let's ditch the headers/import idea  
 completely (aka "stripping") and create a tool (integrated into build  
 perhaps?) that just reads the symbols directly from the *.lib file 
 that  the project links with (silent "stripping"). That way shipping 
 the  library itself would be the only thing necessary for closed 
 projects.   This has been discussed before.  We don't really need more 
 header files  to mess with.  We're drifting back into the C/C++ age 
 again if we go  that route.

The problem with that is the current lib files are very low level and dangerous. To distribute a lib file you have to force users to use the same compiler version and any other libs you use have to be for that same version. Say you want to use libs from 2 different sources and both rely on a different compiler version. Well, now you're pretty much screwed. Even if they successfully link to a program, there still can be hidden problems, such as access violations because there was supposed to be a pointer to something at a certain location but now it's somewhere else. If D could have a new form of lib file that knew more about the language and didn't depend on compiler version as much, hopefully not even depend on compiler brand (implementation) much, and possibly use an intermediate code form, it would be a lot safer and better in many respects.

I really don't know anything about lib files, so I don't know how much sense am I gonna make here, but here's an idea: How about an intermediate file (between .d and .lib) that contains both stripped declarations and .lib content (whatever that is), so that it's easy for any compiler/linker/whatever to extract both declarations and/or .lib from that file (let's just call that file "dlib" for now). In other words, embed stripped declarations directly into lib files. What I'm proposing is that instead of having the compiler produce lib files, it produces the "dlib" file, or have it only produce that file on a special compiler switch. is that a reasonable proposal? or is it extreemly stupid?

I think that is very reasonable, and from what I gather it can actually be done. When the compiler tries to resolve an import that isn't in the path, it looks in the library and/or .obj files, right? I believe it can also be done in such a way that other languages could use the same libraries (they would just ignore the section of the object files containing the D symbols).

Just think:

- Import code and libs in one file (end of version issues between headers and libs).
- One file per library to distribute.
- Reflection/introspection w/o having to strip executable library code (this would make runtime loading/linking much easier and portable, right Pragma?).
- Same library file could be used by other languages (they would have to provide their own forward declarations).

If you also store some type of implementation-specified binary symbols of all of the code instead of just stripped declarations:

- You could potentially distribute closed-source libraries made up of just template code in binary format using the current D instantiation model (w/o implicit function template instantiation).
- The extra info could be used by the compiler, e.g. to inline functions.

The info not referenced would not make it into .exe's (stripped by most modern linkers), so it wouldn't bloat applications.

Now all that would be a big step forward, IMO!

- Dave
Aug 15 2005
parent reply Georg Wrede <georg.wrede nospam.org> writes:
(Sorry for top-posting, I'm in the middle of something right now, and I 
just got a raw idea to throw at you guys!)

The different lib and dll formats specify things quite meticulously. 
Now, if we wanted to make a "file type" that is compatible _and_ 
contains data that we need, then:

why not just slap our own data at the end of the lib/dll?

So, for example, if a file format only describes entry points and 
function names, we might slap a description of return values, function 
parameters, and whatever else we consider important (e.g. compiler 
version, or some such) simply at the end of that file.

This has of course to be tested, so no tools (binutils or other) crash 
with these files. With any good luck, this might be an easy "chewing gum 
and cardboard" solution, that we can use for the time being.
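The trailer trick can be sketched in a few lines (Python purely for illustration; the MAGIC marker value is invented): append the payload plus a fixed-size trailer of magic marker and length, so a reader can locate the data by seeking from the end. Tools that trust the format's own headers should never look at these trailing bytes, which is exactly the part that would need testing.

```python
# Sketch of "slap our own data at the end of the lib/dll": tagged trailer
# with an 8-byte magic marker and a little-endian payload length.
import struct

MAGIC = b"DMETA1\0\0"  # hypothetical 8-byte marker

def append_metadata(image: bytes, meta: bytes) -> bytes:
    """Return the lib/dll image with metadata appended as a tagged trailer."""
    return image + meta + MAGIC + struct.pack("<I", len(meta))

def extract_metadata(data: bytes) -> bytes:
    """Recover the trailer payload, or fail if no marker is present."""
    trailer = data[-(len(MAGIC) + 4):]
    if trailer[:len(MAGIC)] != MAGIC:
        raise ValueError("no metadata trailer")
    (n,) = struct.unpack("<I", trailer[len(MAGIC):])
    return data[-(len(MAGIC) + 4 + n):-(len(MAGIC) + 4)]
```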

(Just a Temporary Solution(TM) that the FAA airliner disaster 
investigators will find 5 years from now still being used. :-)


Dave wrote:
 In article <ddpan3$6fg$1 digitaldaemon.com>, Hasan Aljudy says...
 
Vathix wrote:

D should and could do better. Let's ditch the headers/import idea completely (aka "stripping") and create a tool (integrated into build perhaps?) that just reads the symbols directly from the *.lib file that the project links with (silent "stripping"). That way shipping the library itself would be the only thing necessary for closed projects. This has been discussed before. We don't really need more header files to mess with. We're drifting back into the C/C++ age again if we go that route.

The problem with that is that the current lib files are very low level and dangerous. To distribute a lib file you have to force users to use the same compiler version, and any other libs you use have to be for that same version. Say you want to use libs from two different sources and both rely on a different compiler version. Well, now you're pretty much screwed. Even if they successfully link into a program, there can still be hidden problems, such as access violations because there was supposed to be a pointer to something at a certain location but now it's somewhere else.

If D could have a new form of lib file that knew more about the language and didn't depend on compiler version as much (hopefully not even depend much on compiler brand/implementation), and possibly used an intermediate code form, it would be a lot safer and better in many respects.

I really don't know anything about lib files, so I don't know how much sense I'm gonna make here, but here's an idea: how about an intermediate file (between .d and .lib) that contains both the stripped declarations and the .lib content (whatever that is), so that it's easy for any compiler/linker/whatever to extract the declarations and/or the .lib from that file? (Let's just call that file "dlib" for now.) In other words, embed stripped declarations directly into lib files. What I'm proposing is that instead of having the compiler produce lib files, it produces the "dlib" file, or have it only produce that file on a special compiler switch.

Is that a reasonable proposal? Or is it extremely stupid?

I think that is very reasonable, and from what I gather it can actually be done. When the compiler tries to resolve an import that isn't in the path, it looks in the library and/or .obj files, right? I believe it can also be done in such a way that other languages could use the same libraries (they would just ignore the section of the object files containing the D symbols). Just think:

- Import code and libs. in one file (end of version issues between headers and libs.)
- One file per library to distribute.
- Reflection/introspection w/o having to strip executable library code (this would make runtime loading/linking much easier and portable, right Pragma?).
- Same library file could be used by other languages (they would have to provide their own forward declarations).

If you also store some type of implementation-specific binary symbols for all of the code instead of just stripped declarations:

- You could potentially distribute closed-source libraries made up of just template code in binary format using the current D instantiation model (w/o implicit function template instantiation).
- The extra info. could be used by the compiler, e.g. to inline functions. The info. not referenced would not make it into .exe's (stripped by most modern linkers) so it wouldn't bloat applications.

Now all that would be a big step forward, IMO!

- Dave

Sep 15 2005
parent reply James Dunne <james.jdunne gmail.com> writes:
Why do some people get all up in arms about top-posting?  It's retarded. 
  It absolutely does not change the meaning or flow of conversation in a 
newsgroup.  This might be true if each post were a single sentence, but 
mostly they're not.  Besides, it's not hard to draw the lines between 
posts... Older ones are prefixed in '>'s everywhere, and the new stuff 
is not.  Personally, I don't enjoy scrolling to the very bottom to find 
a new post appended in reply to an older post, especially if that new 
post happens to be rather long.  For the longest time, I thought 
top-posting meant posting a new thread, since it was on the top level of 
discussion.  It must've sounded ridiculous of me to ask how to not 
top-post. =P

Sorry for interrupting the flow here Georg... ;)

An idea to extend existing library/binary file formats with extra 
information for reflection?  Sounds kind of hairy to me.

I would advise to check the specs of most popular executable/linkable 
formats (ELF, COFF, OMF, etc.) to see if such a thing is already 
supported without modification; sort of like a "miscellaneous" section. 
  If such a thing were allowed, then a specification for a custom format 
containing necessary reflection information for D should be drawn up, 
agreed upon, and implemented.  It should store its information within 
that miscellaneous section, with an obvious marker so as to not confuse 
with any other potential information stored within that section.
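ELF, for one, already has such a mechanism: SHT_NOTE sections hold vendor-tagged records, which is close to the "miscellaneous section with an obvious marker" described above. A sketch of building and parsing one note record per the spec's layout (namesz, descsz, type, then the NUL-terminated name and the descriptor, each padded to 4-byte alignment); the vendor name b"D" and type value 1 used in the test are invented markers:

```python
# Build/parse a single ELF-style note record. A D reflection blob could ride
# in such a record inside a SHT_NOTE section, ignored by other tools.
import struct

def _pad4(b: bytes) -> bytes:
    return b + b"\0" * (-len(b) % 4)

def build_note(name: bytes, ntype: int, desc: bytes) -> bytes:
    name_z = name + b"\0"
    return struct.pack("<III", len(name_z), len(desc), ntype) \
        + _pad4(name_z) + _pad4(desc)

def parse_note(data: bytes):
    namesz, descsz, ntype = struct.unpack_from("<III", data, 0)
    off = 12
    name = data[off:off + namesz - 1]   # drop the trailing NUL
    off += (namesz + 3) & ~3            # advance past name padding
    desc = data[off:off + descsz]
    return name, ntype, desc
```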

Georg Wrede wrote:
 (Sorry for top-posting, I'm in the middle of something right now, and I 
 just got a raw idea to throw at you guys!)
 
 The different lib and dll formats specify things quite meticulously. 
 Now, if we wanted to make a "file type" that is compatible _and_ 
 contains data that we need, then:
 
 why not just slap our own data at the end of the lib/dll?
 
 So, for example, if a file format only describes entry points and 
 function names, we might slap a description of return values, function 
 parameters, and whatever else we consider important (e.g. compiler 
 version, or some such) simply at the end of that file.
 
 This has of course to be tested, so no tools (binutils or other) crash 
 with these files. With any good luck, this might be an easy "chewing gum 
 and cardboard" solution, that we can use for the time being.
 
 (Just a Temporary Solution(TM) that the FAA airliner disaster 
 investigators will find 5 years from now still being used. :-)
 
 
 Dave wrote:
 
 In article <ddpan3$6fg$1 digitaldaemon.com>, Hasan Aljudy says...

 Vathix wrote:

 D should and could do better. Let's ditch the headers/import idea completely (aka "stripping") and create a tool (integrated into build perhaps?) that just reads the symbols directly from the *.lib file that the project links with (silent "stripping"). That way shipping the library itself would be the only thing necessary for closed projects. This has been discussed before. We don't really need more header files to mess with. We're drifting back into the C/C++ age again if we go that route.

The problem with that is that the current lib files are very low level and dangerous. To distribute a lib file you have to force users to use the same compiler version, and any other libs you use have to be for that same version. Say you want to use libs from two different sources and both rely on a different compiler version. Well, now you're pretty much screwed. Even if they successfully link into a program, there can still be hidden problems, such as access violations because there was supposed to be a pointer to something at a certain location but now it's somewhere else.

If D could have a new form of lib file that knew more about the language and didn't depend on compiler version as much (hopefully not even depend much on compiler brand/implementation), and possibly used an intermediate code form, it would be a lot safer and better in many respects.

I really don't know anything about lib files, so I don't know how much sense I'm gonna make here, but here's an idea: how about an intermediate file (between .d and .lib) that contains both the stripped declarations and the .lib content (whatever that is), so that it's easy for any compiler/linker/whatever to extract the declarations and/or the .lib from that file? (Let's just call that file "dlib" for now.) In other words, embed stripped declarations directly into lib files. What I'm proposing is that instead of having the compiler produce lib files, it produces the "dlib" file, or have it only produce that file on a special compiler switch.

Is that a reasonable proposal? Or is it extremely stupid?

I think that is very reasonable, and from what I gather it can actually be done. When the compiler tries to resolve an import that isn't in the path, it looks in the library and/or .obj files, right? I believe it can also be done in such a way that other languages could use the same libraries (they would just ignore the section of the object files containing the D symbols). Just think:

- Import code and libs. in one file (end of version issues between headers and libs.)
- One file per library to distribute.
- Reflection/introspection w/o having to strip executable library code (this would make runtime loading/linking much easier and portable, right Pragma?).
- Same library file could be used by other languages (they would have to provide their own forward declarations).

If you also store some type of implementation-specific binary symbols for all of the code instead of just stripped declarations:

- You could potentially distribute closed-source libraries made up of just template code in binary format using the current D instantiation model (w/o implicit function template instantiation).
- The extra info. could be used by the compiler, e.g. to inline functions. The info. not referenced would not make it into .exe's (stripped by most modern linkers) so it wouldn't bloat applications.

Now all that would be a big step forward, IMO!

- Dave


Sep 15 2005
next sibling parent reply Sean Kelly <sean f4.ca> writes:
In article <dgd52g$2ff3$1 digitaldaemon.com>, James Dunne says...
Why do some people get all up in arms about top-posting?

This is Alf P. Steinbach's signature file (from the c++ newsgroups):

---
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
Sep 15 2005
parent reply James Dunne <james.jdunne gmail.com> writes:
Sean Kelly wrote:
 In article <dgd52g$2ff3$1 digitaldaemon.com>, James Dunne says...
 
Why do some people get all up in arms about top-posting?

This is Alf P. Steinbach's signature file (from the c++ newsgroups):

---
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

Yes, I've seen many like that, and it just doesn't hold. Most posts are not one-liners. Such an example is of the most extreme case where everyone uses top-posting. It shouldn't bother you anyway if you've been following the conversation. Just find the new post and read on. It's easier to find it at the top than to start at the bottom and scroll up.
Sep 16 2005
next sibling parent Sean Kelly <sean f4.ca> writes:
In article <dgek0t$s46$1 digitaldaemon.com>, James Dunne says...
Sean Kelly wrote:
 In article <dgd52g$2ff3$1 digitaldaemon.com>, James Dunne says...
 
Why do some people get all up in arms about top-posting?

This is Alf P. Steinbach's signature file (from the c++ newsgroups):

---
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

Yes, I've seen many like that, and it just doesn't hold. Most posts are not one-liners. Such an example is of the most extreme case where everyone uses top-posting. It shouldn't bother you anyway if you've been following the conversation. Just find the new post and read on. It's easier to find it at the top than to start at the bottom and scroll up.

It's really only an issue for me with long discussions, as I might have to read pages of text in reverse to get a context for the current post. Also, I like to insert replies inline, which isn't possible with top-posting.

Sean
Sep 16 2005
prev sibling parent reply "Ameer Armaly" <ameer_armaly hotmail.com> writes:
I agree.  As a blind reader, it is easier for me to read top-posting, though 
body-posting is doable if I scroll by paragraph.
"James Dunne" <james.jdunne gmail.com> wrote in message 
news:dgek0t$s46$1 digitaldaemon.com...
 Sean Kelly wrote:
 In article <dgd52g$2ff3$1 digitaldaemon.com>, James Dunne says...

Why do some people get all up in arms about top-posting?

This is Alf P. Steinbach's signature file (from the c++ newsgroups):

---
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

Yes, I've seen many like that, and it just doesn't hold. Most posts are not one-liners. Such an example is of the most extreme case where everyone uses top-posting. It shouldn't bother you anyway if you've been following the conversation. Just find the new post and read on. It's easier to find it at the top than to start at the bottom and scroll up.

Sep 18 2005
parent Sean Kelly <sean f4.ca> writes:
Interesting.  That never occurred to me.  I suppose there's something to be said
for top-posting after all :)

Sean

In article <dgkvn0$e10$1 digitaldaemon.com>, Ameer Armaly says...
I agree.  As a blind reader, it is easier for me to read top-posting, though 
body-posting is doable if I scroll by paragraph.
"James Dunne" <james.jdunne gmail.com> wrote in message 
news:dgek0t$s46$1 digitaldaemon.com...
 Sean Kelly wrote:
 In article <dgd52g$2ff3$1 digitaldaemon.com>, James Dunne says...

Why do some people get all up in arms about top-posting?

This is Alf P. Steinbach's signature file (from the c++ newsgroups):

---
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

Yes, I've seen many like that, and it just doesn't hold. Most posts are not one-liners. Such an example is of the most extreme case where everyone uses top-posting. It shouldn't bother you anyway if you've been following the conversation. Just find the new post and read on. It's easier to find it at the top than to start at the bottom and scroll up.


Sep 18 2005
prev sibling parent Kyle Furlong <kylefurlong gmail.com> writes:
James Dunne wrote:
 Why do some people get all up in arms about top-posting?  It's retarded. 
  It absolutely does not change the meaning or flow of conversation in a 
 newsgroup.  This might be true if each post were a single sentence, but 
 mostly they're not.  Besides, it's not hard to draw the lines between 
 posts... Older ones are prefixed in '>'s everywhere, and the new stuff 
 is not.  Personally, I don't enjoy scrolling to the very bottom to find 
 a new post appended in reply to an older post, especially if that new 
 post happens to be rather long.  For the longest time, I thought 
 top-posting meant posting a new thread, since it was on the top level of 
 discussion.  It must've sounded ridiculous of me to ask how to not 
 top-post. =P
 
 Sorry for interrupting the flow here Georg... ;)
 
 An idea to extend existing library/binary file formats with extra 
 information for reflection?  Sounds kind of hairy to me.
 
 I would advise to check the specs of most popular executable/linkable 
 formats (ELF, COFF, OMF, etc.) to see if such a thing is already 
 supported without modification; sort of like a "miscellaneous" section. 
  If such a thing were allowed, then a specification for a custom format 
 containing necessary reflection information for D should be drawn up, 
 agreed upon, and implemented.  It should store its information within 
 that miscellaneous section, with an obvious marker so as to not confuse 
 with any other potential information stored within that section.
 
 Georg Wrede wrote:
 
 (Sorry for top-posting, I'm in the middle of something right now, and 
 I just got a raw idea to throw at you guys!)

 The different lib and dll formats specify things quite meticulously. 
 Now, if we wanted to make a "file type" that is compatible _and_ 
 contains data that we need, then:

 why not just slap our own data at the end of the lib/dll?

 So, for example, if a file format only describes entry points and 
 function names, we might slap a description of return values, function 
 parameters, and whatever else we consider important (e.g. compiler 
 version, or some such) simply at the end of that file.

 This has of course to be tested, so no tools (binutils or other) crash 
 with these files. With any good luck, this might be an easy "chewing 
 gum and cardboard" solution, that we can use for the time being.

 (Just a Temporary Solution(TM) that the FAA airliner disaster 
 investigators will find 5 years from now still being used. :-)


 Dave wrote:

 In article <ddpan3$6fg$1 digitaldaemon.com>, Hasan Aljudy says...

 Vathix wrote:

 D should and could do better. Let's ditch the headers/import idea completely (aka "stripping") and create a tool (integrated into build perhaps?) that just reads the symbols directly from the *.lib file that the project links with (silent "stripping"). That way shipping the library itself would be the only thing necessary for closed projects. This has been discussed before. We don't really need more header files to mess with. We're drifting back into the C/C++ age again if we go that route.

The problem with that is that the current lib files are very low level and dangerous. To distribute a lib file you have to force users to use the same compiler version, and any other libs you use have to be for that same version. Say you want to use libs from two different sources and both rely on a different compiler version. Well, now you're pretty much screwed. Even if they successfully link into a program, there can still be hidden problems, such as access violations because there was supposed to be a pointer to something at a certain location but now it's somewhere else.

If D could have a new form of lib file that knew more about the language and didn't depend on compiler version as much (hopefully not even depend much on compiler brand/implementation), and possibly used an intermediate code form, it would be a lot safer and better in many respects.

I really don't know anything about lib files, so I don't know how much sense I'm gonna make here, but here's an idea: how about an intermediate file (between .d and .lib) that contains both the stripped declarations and the .lib content (whatever that is), so that it's easy for any compiler/linker/whatever to extract the declarations and/or the .lib from that file? (Let's just call that file "dlib" for now.) In other words, embed stripped declarations directly into lib files. What I'm proposing is that instead of having the compiler produce lib files, it produces the "dlib" file, or have it only produce that file on a special compiler switch.

Is that a reasonable proposal? Or is it extremely stupid?

I think that is very reasonable, and from what I gather it can actually be done. When the compiler tries to resolve an import that isn't in the path, it looks in the library and/or .obj files, right? I believe it can also be done in such a way that other languages could use the same libraries (they would just ignore the section of the object files containing the D symbols). Just think:

- Import code and libs. in one file (end of version issues between headers and libs.)
- One file per library to distribute.
- Reflection/introspection w/o having to strip executable library code (this would make runtime loading/linking much easier and portable, right Pragma?).
- Same library file could be used by other languages (they would have to provide their own forward declarations).

If you also store some type of implementation-specific binary symbols for all of the code instead of just stripped declarations:

- You could potentially distribute closed-source libraries made up of just template code in binary format using the current D instantiation model (w/o implicit function template instantiation).
- The extra info. could be used by the compiler, e.g. to inline functions. The info. not referenced would not make it into .exe's (stripped by most modern linkers) so it wouldn't bloat applications.

Now all that would be a big step forward, IMO!

- Dave



You guys should check out Eric's project, DDL (D Dynamic Libraries). He's dealing with these sorts of issues, and I think you will be impressed with the progress he's made so far.

http://dsource.org/projects/ddl/
Sep 15 2005
prev sibling next sibling parent reply Traveler Hauptman <Traveler_member pathlink.com> writes:
In article <ddah3h$2mn5$1 digitaldaemon.com>, Jarrett Billingsley says...
Since my project nonagon isn't open-source (though that may change in the 
next few months), I distribute it as a lib with "headers" which just have 
declarations for all the classes and functions and whatnot.  In order to 
generate these "headers," I basically just have to remove all function 
bodies and replace them with semicolons.

I think this is a weak point of D right now. Having to strip code out of a file to make it a header file is just silly.

Special dlib files make me a little nervous though. I need to be able to access the library files from C.

--Traveler Hauptman
Sep 16 2005
next sibling parent Sean Kelly <sean f4.ca> writes:
In article <dgedae$kgp$1 digitaldaemon.com>, Traveler Hauptman says...
In article <ddah3h$2mn5$1 digitaldaemon.com>, Jarrett Billingsley says...
Since my project nonagon isn't open-source (though that may change in the 
next few months), I distribute it as a lib with "headers" which just have 
declarations for all the classes and functions and whatnot.  In order to 
generate these "headers," I basically just have to remove all function 
bodies and replace them with semicolons.

I think this is a weak point of D right now. Having to strip code out of a file to make it a header file is just silly.

Having to manually maintain a separate header is worse. Besides, Java works the same way.
Special dlib files make me a little nervous though. I need to be able to access
the library files from C. 

So declare everything extern (C) and build standard C libraries.

Sean
Sep 16 2005
prev sibling parent reply Sean Kelly <sean f4.ca> writes:
In article <dgedae$kgp$1 digitaldaemon.com>, Traveler Hauptman says...
In article <ddah3h$2mn5$1 digitaldaemon.com>, Jarrett Billingsley says...
Since my project nonagon isn't open-source (though that may change in the 
next few months), I distribute it as a lib with "headers" which just have 
declarations for all the classes and functions and whatnot.  In order to 
generate these "headers," I basically just have to remove all function 
bodies and replace them with semicolons.

I think this is a weak point of D right now. Having to strip code out of a file to make it a header file is just silly.

By the way: did you know that compiled .NET programs actually contain the full source code as well? I assume this is to allow portions to be compiled while the application is running, but I was amazed that the code isn't even obfuscated. You have to buy a third-party tool to obfuscate it if you want to protect your IP when shipping .NET applications.

Sean
Sep 16 2005
parent reply Georg Wrede <georg.wrede nospam.org> writes:
 By the way.  Did you know that compiled .NET programs actually contain the full
 source code as well?  I assume this is to allow portions to be compiled while
 the application is running, but I was amazed that the code isn't even
 obfuscated.  You have to buy a third party tool to do this if you want to
 protect your IP when shipping .NET applications.

Geez! I suggested this (for D code that would be used instead of scripts) quite a long time ago. It didn't catch on too well. It has its merits in many situations, but not when you are selling software. And of course, you have to _buy_ something to fix it. As always.
Sep 16 2005
parent reply James Dunne <james.jdunne gmail.com> writes:
Georg Wrede wrote:
 By the way.  Did you know that compiled .NET programs actually contain 
 the full
 source code as well?  I assume this is to allow portions to be 
 compiled while
 the application is running, but I was amazed that the code isn't even
 obfuscated.  You have to buy a third party tool to do this if you want to
 protect your IP when shipping .NET applications.

Geez! I suggested this for D code that would be used instead of scripts, for quite a long time ago. Didn't catch on too well. It has its merits in many situations, but not when you are selling software. And of course, you have to _buy_ something to fix it. As always.

For .Net? Visual Studio .Net comes with Dotfuscator, which obfuscates the symbol table so that the .Net code cannot be reverse engineered into any *meaningful* representation.

In fact, I don't believe the distributed binaries actually contain the "full source code" of the program, as you state. It's just that the .Net code is compiled into MSIL, which, like Java, is easily reverse engineered. It doesn't help that the binaries also contain all of the symbolic information from the original code, i.e. local variable names are preserved.
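The preserved-symbols point is easy to demonstrate with CPython, another bytecode-compiled runtime: local variable names survive compilation verbatim in the code object, which is a big part of why decompilers for such formats can reproduce readable source.

```python
# Local and parameter names are kept, unobfuscated, in the compiled
# code object -- analogous to the symbolic info shipped in MSIL binaries.
def tax(price, rate):
    subtotal = price * rate
    return price + subtotal

print(tax.__code__.co_varnames)  # ('price', 'rate', 'subtotal')
```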
Sep 16 2005
parent reply Sean Kelly <sean f4.ca> writes:
In article <dgf21b$1b80$1 digitaldaemon.com>, James Dunne says...
Georg Wrede wrote:
 By the way.  Did you know that compiled .NET programs actually contain 
 the full
 source code as well?  I assume this is to allow portions to be 
 compiled while
 the application is running, but I was amazed that the code isn't even
 obfuscated.  You have to buy a third party tool to do this if you want to
 protect your IP when shipping .NET applications.

Geez! I suggested this for D code that would be used instead of scripts, for quite a long time ago. Didn't catch on too well. It has its merits in many situations, but not when you are selling software. And of course, you have to _buy_ something to fix it. As always.

For .Net? Visual Studio .Net comes with Dotfuscator, which obfuscates the symbol table so that the .Net code cannot be reverse engineered into any *meaningful* representation.

I didn't know about that. I'll admit I only know about this because a .NET person in the office told me about it. Perhaps he was unaware of this tool.
In fact, I don't believe the distributed binaries actually contain the 
"full source code" of the program, as you state.  It's just that the 
.Net code is compiled into MSIL, which, like Java, is easily reverse 
engineered.  It doesn't help that the binaries also contain all of the 
symbolic information from the original code, i.e. local variable names 
are preserved.

Exactly. The aforementioned person had a decompiler he'd gotten online somewhere, and I was amazed to see that it spat out our source code exactly as it was written. I expected it would look a bit more like decompiled C++ code. I've wondered whether a JIT build of .NET applications (instead of the .NET assembly format) would contain so much information, but I never bothered to test it.

Sean
Sep 16 2005
parent reply James Dunne <james.jdunne gmail.com> writes:
In article <dgf5n2$1epo$1 digitaldaemon.com>, Sean Kelly says...
In article <dgf21b$1b80$1 digitaldaemon.com>, James Dunne says...
Georg Wrede wrote:
 By the way.  Did you know that compiled .NET programs actually contain 
 the full
 source code as well?  I assume this is to allow portions to be 
 compiled while
 the application is running, but I was amazed that the code isn't even
 obfuscated.  You have to buy a third party tool to do this if you want to
 protect your IP when shipping .NET applications.

Geez! I suggested this for D code that would be used instead of scripts, for quite a long time ago. Didn't catch on too well. It has its merits in many situations, but not when you are selling software. And of course, you have to _buy_ something to fix it. As always.

For .Net? Visual Studio .Net comes with Dotfuscator, which obfuscates the symbol table so that the .Net code cannot be reverse engineered into any *meaningful* representation.

I didn't know about that. I'll admit I only know about this because a .NET person in the office told me about it. Perhaps he was unaware of this tool.
In fact, I don't believe the distributed binaries actually contain the 
"full source code" of the program, as you state.  It's just that the 
.Net code is compiled into MSIL, which, like Java, is easily reverse 
engineered.  It doesn't help that the binaries also contain all of the 
symbolic information from the original code, i.e. local variable names 
are preserved.

Exactly. The aforementioned person had a decompiler he'd gotten online somewhere and I was amazed to see that it spat out our source code exactly as it was written. I expected it would a bit more like decompiled C++ code. I've wondered whether a JIT build of .NET applications (instead of the .NET assembly format) would contain so much information, but I never bothered to test it. Sean

The decompiled output is usually meaningless, depending on the feature set used and the language which generated the MSIL bytecode. For instance, C# operator overloading will look like nothing more than class method calls with special names (depending on the decompiler; some might reverse engineer this correctly). I can't imagine such a thing would be useful for C# 2.0, with generics, as well. But then again, that depends entirely on the implementation. However, for the majority of your straightforward logic/assignment/function-call instructions, the resulting decompiled output should hit fairly close to home.

Such a problem exists for almost any bytecode-compiled language. Perhaps it is due to the simplicity of the virtual machine used (stack-based machines are much easier to decompile than register-based machines are), or perhaps it is due to the wealth of debugging/reflection information provided within the binary itself.

Perhaps that .NET person in your office was aware of the tool, but wrote it off since it does not entirely solve the problem.

Regards,
James Dunne
Sep 16 2005
parent Sean Kelly <sean f4.ca> writes:
In article <dggdrc$2gjp$1 digitaldaemon.com>, James Dunne says...
Perhaps that .NET person in your office was aware of the tool, but wrote it off
since it does not entirely solve the problem.

Probably. Source code security is a huge deal to him and he was pushing to get the best obfuscator around. For what it's worth, this was a VB assembly, so the fairly simple language design is probably why the output was so meaningful (copy-paste exact in this case).

Sean
Sep 17 2005
prev sibling parent J Thomas <jtd514 ameritech.net> writes:
This could be done very easily with the front end code; it could be used 
to emit a public 'header' module from the entire source code.

The front end could also be turned into a metaprogramming system that 
could do compile-time source manipulation for all sorts of powerful 
things, like reflection, emitting, extending the language, etc.
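As a rough illustration of why the thread keeps coming back to a real lexer, here is a toy body-stripping pass (Python, purely illustrative). It improves on a single global brace count by only removing a brace group that directly follows a ')', i.e. a function body, but it still mis-fires on version(...) { } blocks, string literals, comments, and contracts:

```python
# Toy declaration stripper: replace any {...} group that directly follows
# a ')' with ';'. Nesting inside the body is handled by depth counting;
# class-level braces not preceded by ')' are left alone. Known failure
# modes: version(...) { }, strings, comments, in/out/body contracts.
def strip_bodies(src: str) -> str:
    out = []
    i = 0
    while i < len(src):
        c = src[i]
        out.append(c)
        if c == ")":
            # Look past whitespace for an opening brace.
            j = i + 1
            while j < len(src) and src[j] in " \t\r\n":
                j += 1
            if j < len(src) and src[j] == "{":
                # Consume the whole balanced group, emit ';' instead.
                depth = 0
                while j < len(src):
                    if src[j] == "{":
                        depth += 1
                    elif src[j] == "}":
                        depth -= 1
                        if depth == 0:
                            break
                    j += 1
                out.append(";")
                i = j
        i += 1
    return "".join(out)

print(strip_bodies("int foo() { return 1; }"))          # int foo();
print(strip_bodies("class A { void f() { x = 1; } }"))  # class A { void f(); }
```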

Jarrett Billingsley wrote:
 Since my project nonagon isn't open-source (though that may change in the 
 next few months), I distribute it as a lib with "headers" which just have 
 declarations for all the classes and functions and whatnot.  In order to 
 generate these "headers," I basically just have to remove all function 
 bodies and replace them with semicolons.
 
 The problem is that I did this manually at first, and it took me maybe 5-10 
minutes to do it.  Now that nonagon has grown so much, it would probably 
 take me close to an hour to replace all the function bodies with semicolons.
 
 I wrote a "dumb" tool which basically counts the number of left and right 
 braces, and based on the nesting level, it either outputs the text it reads 
 or it doesn't.  The problem with this is that once I start adding things in 
 like version blocks and nested classes, it'll start stripping things out 
 that it's not supposed to.  It also doesn't strip out global-level function 
 bodies.
 
 Is there any kind of lexer/parser tool for D that will allow me to quickly 
 strip out the function bodies?  Surely someone has run across this problem 
 before.  Normally, I'd look into using the frontend, but, well, it's 
 completely undocumented and written in C++.  And I really don't feel like 
 spending a week and a half figuring out how to use the thing and then 
 writing a tool.
 
 Wasn't there some kind of D lexer tool written in D several months ago?  I 
 can't find the NG thread.. 
 
 

Sep 16 2005