
digitalmars.D.announce - std.date proposal

reply Fredrik Olsson <peylow gmail.com> writes:
I have created a proposal for a std.date replacement. And announce it 
here in hopes of some comments and criticism.

The parse code is converted from the PostgreSQL C code base, and is 
quite competent. If SQL99 supports it, then PostgreSQL does. Thanks go 
to the postgres hackers.



The philosophy is that date and time can be represented with four basic 
types:

d_date - For a date with day precision.
d_time - For a time with millisecond precision (No date part).
d_timestamp - For a date+time with better than millisecond precision for
	dates close to present day.
d_duration - A fixed or relative duration of time. Months and years
	are relative, as no two months are guaranteed to have the same
	length, while weeks and hours are fixed (No support for leap
	seconds :) ).
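
The fixed versus relative distinction can be sketched in a few lines. This is only an illustration in Python, not the proposed D module: the function names age_days and age_months are hypothetical, and the clamping rule for short months is an assumption of the sketch.

```python
import calendar
from datetime import date


def age_days(start: date, end: date) -> int:
    """Fixed-unit age: the number of whole days from start to end."""
    return (end - start).days


def age_months(start: date, end: date) -> tuple:
    """Relative age as (whole calendar months, leftover days).

    Assumes start <= end.
    """
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the last month is not complete yet
    # Move start forward by that many calendar months, clamping the day
    # to the length of the target month, then count the remaining days.
    index = start.month - 1 + months
    year, month = start.year + index // 12, index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    anchor = date(year, month, min(start.day, last_day))
    return months, (end - anchor).days


feb = (date(2006, 2, 1), date(2006, 3, 1))
mar = (date(2006, 3, 1), date(2006, 4, 1))
print(age_days(*feb), age_months(*feb))  # 28 (1, 0)
print(age_days(*mar), age_months(*mar))  # 31 (1, 0)
```

Both spans are "1 month" in relative units, yet they differ by three days in fixed units, which is why a month cannot be stored as a fixed number of days.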

The module is kept simple and consistent, all types are handled with 
just a few functions:

toType() - functions, with many overloads to make a time/date/duration
	from a string, different time units, Unix epoch, and more. For
	example toTime(hour, minute, second) and toDuration(string).

age() - functions with overloads to get the duration between two dates,
	times, or timestamps in fixed or relative units. For example:
	age(toDate("2006-02-01"), toDate("2006-03-01"));
	will give "28 days", while:
	age(toDate("2006-02-01"), toDate("2006-03-01"), true);
	will give "1 month".

splitType() - functions with overloads, to split a type into its
	components. For example:
	int year, dayOfYear;
	splitDate(toDate("2006-10-28"), year, dayOfYear);

extract() - With overloads, extract a component of a type, or converts
	the type to special representations. Example:
	extract(toDate("2006-03-28"), DatePart.WEEKDAY);
	will give WeekDay.TUESDAY, and:
	extract(toDate("2006-03-28"), DatePart.EPOCH);
	will give 1143496800.

increment() - With overloads to add both a single date part, and
	duration to each type. For example:
	increment(toTime("01:02:03"), DatePart.HOUR, 4);
	Will give "05:02:03". And:
	increment(toDate("2006-02-12"), toDuration("3 months"));
	will give "2006-05-12".

truncate() - With overloads for each type, truncates a type to a given
	unit. For example:
	truncate(toDate("2006-08-12"), DatePart.QUARTER);
	will give "2006-07-01", and
	truncate(toDate("2006-03-28"), DatePart.WEEK);
	will give "2006-03-27".

toString() - With overloads for each type, converts to strings
	optionally with a formatting string (Not fully implemented).
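
To make the truncate() and extract() examples concrete, here is a rough Python sketch of the same calculations. It is not the proposed D API: the helper names are hypothetical, quarters are assumed to be calendar quarters starting in January, April, July and October, and weeks are assumed to start on Monday. The epoch value quoted in the post matches local midnight at UTC+2, so the sketch takes the UTC offset as an explicit parameter; that offset is an assumption about the author's time zone.

```python
from datetime import date, datetime, timedelta, timezone


def truncate_quarter(d: date) -> date:
    """First day of the calendar quarter containing d."""
    first_month = 3 * ((d.month - 1) // 3) + 1
    return date(d.year, first_month, 1)


def truncate_week(d: date) -> date:
    """Monday of the week containing d (date.weekday() has Monday == 0)."""
    return d - timedelta(days=d.weekday())


def epoch_at_local_midnight(d: date, utc_offset_hours: int) -> int:
    """Unix epoch seconds for 00:00 on d at a fixed UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return int(datetime(d.year, d.month, d.day, tzinfo=tz).timestamp())


print(truncate_quarter(date(2006, 8, 12)))              # 2006-07-01
print(truncate_week(date(2006, 3, 28)))                 # 2006-03-27
print(date(2006, 3, 28).strftime("%A"))                 # Tuesday
print(epoch_at_local_midnight(date(2006, 3, 28), 2))    # 1143496800
```

The last line shows why an EPOCH extraction needs a stated time zone policy: the same calendar date yields 1143504000 at UTC.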



A full documentation is available at:
http://peylow.no-ip.org/~peylow/date.html

The source is available at:
http://www.dsource.org/projects/dlisp/browser/branches?rev=51


Comments on implementation, documents and so forth are requested. I have 
no plans to add any more functions as I like to keep simple things 
simple. But I do plan to add more "DatePart"s, for example JULIANDAY 
for all the astronomers, and also extract date parts from other 
calendars, such as Arabic etc.

I have intentionally not included conversion of weekdays and months into 
English names, as I believe that is up to the GUI and locale parts of 
an application, not the low level lib.


regards
	Fredrik Olsson
Mar 28 2006
next sibling parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
Fredrik Olsson wrote:
 I have created a proposal for a std.date replacement. And announce it 
 here in hopes of some comments and criticism.

Preliminary comments:

1. It's "millennium" (singular), "millennia" (plural). Two 'n's. Not "millenia". And why no CENTURY?

2. What is UDT?

3. I think you should leave out the "3/10/06" format because of the inherent ambiguity. And in English, the third month is called March, not mars.

4. How does your module deal with time zones?

5. I've also written an alternative to std.date. Please check it out:
http://pr.stewartsplace.org.uk/d/sutil/

Stewart.

-- 
-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
GCS/M d- s:- C++ a->--- UB P+ L E W++ N+++ o K- w++ O? M V? PS- PE- Y? PGP- t- 5? X? R b DI? D G e++>++++ h-- r-- !y
------END GEEK CODE BLOCK------

My e-mail is valid but not my primary mailbox. Please keep replies on the 'group where everyone may benefit.
Mar 28 2006
parent reply Fredrik Olsson <peylow gmail.com> writes:
Stewart Gordon wrote:
 Fredrik Olsson wrote:
 
 I have created a proposal for a std.date replacement. And announce it 
 here in hopes of some comments and criticism.

<snip> Preliminary comments: 1. It's "millennuim" (singular), "millennia" (plural). Two 'n's. Not "millenia". And why no CENTURY?

when I was at it.
 2. What is UDT?
 

 3. I think you should leave out the "3/10/06" format because of the 
 inherent ambiguity.  And in English, the third month is called March, 
 not mars.
 

"M/D/Y" will stay I think, it is the US way, ambiguous or not, and there is allot of code/people out there making this assumption. If I could choose myself we would all go ISO :). Perhaps a mode flag should be added, to prefer YMD, DMY or MDY whenever an ambiguity exists?
 4. How does your module deal with time zones?
 

when time is created. Time zones are parsed, but currently ignored. I think the best option is to let the user simply choose if dates should be adjusted to UTC or to local when parsed.
 5. I've also written an alternative to std.date.  Please check it out:
 

contrary I would like yours to grow and complement, as I see it an OOP Date/Time library on top of mine would be the best solution. Sort of like how std.stdio should work as the underlying layer for a more feature complete module on top. I intend to do the bare bones, a solid foundation to build on top, and to be easy to do small stuff. Intervals, timezones, and more advanced stuff should be done with wrappers on top.
 http://pr.stewartsplace.org.uk/d/sutil/
 
 Stewart.
 

Mar 28 2006
next sibling parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
Fredrik Olsson wrote:
<snip>
 Spelling corrected, I bet there is more, way more...

Like your use of "allot" in the next sentence.
 "M/D/Y" will stay I think, it is the US way, ambiguous or not, and there 
 is allot of code/people out there making this assumption. If I could 
 choose myself we would all go ISO :).

And there are probably at least as many people in the world who expect dates to be D/M/Y. People's assumptions will differ even further on what century a two-digit year is in - do you have a policy on this?
 Perhaps a mode flag should be added, to prefer YMD, DMY or MDY whenever 
 an ambiguity exists?

I guess so. But it depends on how you define ambiguous or not. For example, date notation in MS Access confused me the other day until I got my head around how #13/5/06# is interpreted as 13 March and #12/5/06# as 5 December.

Stewart.
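
A mode flag with the kind of fallback described in this post could look roughly like the Python sketch below. Everything here is hypothetical: the name parse_slashed, the order parameter, and the two-digit-year pivot of 70 are assumptions for illustration, not part of the proposal or of any existing library. The fallback rule is that a first field that cannot be a month is treated as the day, so #13/5/06# can only be read as 13 May, while #12/5/06# follows the preferred order.

```python
from datetime import date


def parse_slashed(text: str, order: str = "MDY", pivot: int = 70) -> date:
    """Parse 'a/b/c' using a preferred field order, with a day/month swap
    when the month field is impossible. Two-digit years below `pivot`
    are read as 20xx, the rest as 19xx (an assumed policy)."""
    a, b, c = (int(part) for part in text.strip("#").split("/"))
    if order == "YMD":
        y, m, d = a, b, c
    elif order == "DMY":
        d, m, y = a, b, c
    elif order == "MDY":
        m, d, y = a, b, c
    else:
        raise ValueError("unknown order: " + order)
    if m > 12 and d <= 12:
        m, d = d, m  # the preferred order is impossible, so swap
    if y < 100:
        y += 1900 if y >= pivot else 2000
    return date(y, m, d)


print(parse_slashed("#13/5/06#"))         # 2006-05-13
print(parse_slashed("#12/5/06#"))         # 2006-12-05
print(parse_slashed("12/5/06", "DMY"))    # 2006-05-12
```

The silent swap is exactly what makes such input confusing: two strings that look alike are read with different field orders.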
Mar 29 2006
parent reply Deewiant <deewiant.doesnotlike.spam gmail.com> writes:
Stewart Gordon wrote:
 I guess so.  But it depends on how you define ambiguous or not.  For
 example, date notation in MS Access confused me the other day until I
 got my head around how #13/5/06# is interpreted as 13 March and
 #12/5/06# as 5 December.
 

If 13/5/06 is interpreted as 13 _March_ that is very confusing, indeed. <g>
Mar 29 2006
parent Stewart Gordon <smjg_1998 yahoo.com> writes:
Deewiant wrote:
 Stewart Gordon wrote:
 I guess so.  But it depends on how you define ambiguous or not.  For
 example, date notation in MS Access confused me the other day until I
 got my head around how #13/5/06# is interpreted as 13 March and
 #12/5/06# as 5 December.

If 13/5/06 is interpreted as 13 _March_ that is very confusing, indeed. <g>

Good catch! NTS I meant 13 May.

Stewart.
Mar 29 2006
prev sibling parent reply "John C" <johnch_atms hotmail.com> writes:
 "M/D/Y" will stay I think, it is the US way, ambiguous or not, and there 
 is allot of code/people out there making this assumption. If I could 
 choose myself we would all go ISO :).

Who are these people expecting dates to appear in US format, I wonder? A date library that has no notion of locales has no business making any region-specific assumptions and should just implement ISO8601. After all, that's what it's for. If you must support a common date format, it should be D/M/Y, which is used by the vast majority of countries and accepted internationally. http://en.wikipedia.org/wiki/Calendar_date
Mar 29 2006
next sibling parent reply Thomas Kuehne <thomas-dloop kuehne.cn> writes:
John C wrote on 2006-03-29:
 "M/D/Y" will stay I think, it is the US way, ambiguous or not, and there 
 is allot of code/people out there making this assumption. If I could 
 choose myself we would all go ISO :).

Who are these people expecting dates to appear in US format, I wonder? A date library that has no notion of locales has no business making any region-specific assumptions and should just implement ISO8601. After all, that's what it's for.

Go ISO, go! ISO is most likely the only format that is interpreted correctly throughout the EU and Asia.
 If you must support a common date format, it should be D/M/Y, which is used 
 by the vast majority of countries and accepted internationally. 
 http://en.wikipedia.org/wiki/Calendar_date

Accepted internationally?

So, what date is: 01/02/03

1) in an Arab context
2) in an American context
3) in a British context
4) in a CJK context
5) in a French context
6) in a German context
7) in an Israeli context

Thomas
Mar 29 2006
parent reply "John C" <johnch_atms hotmail.com> writes:
 If you must support a common date format, it should be D/M/Y, which is 
 used
 by the vast majority of countries and accepted internationally.
 http://en.wikipedia.org/wiki/Calendar_date

Accepted internationally?

According to the linked article. But I should have used quotation marks...
 So, what date is: 01/02/03

 1) in an Arab context
 2) in an American context
 3) in a British context
 4) in a CJK context
 5) in a French context
 6) in a German context
 7) in an Israeli context

Ah, a trap. I could say for most of them it's 1 February 2003, but you can't rely on that being so. Really, it's for a good locale library to answer.
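
One way to see the trap is to list every reading of the string that forms a valid date. The Python sketch below does that for the three common field orders; the function name readings and the fixed century of 2000 for two-digit years are assumptions made for the example.

```python
from datetime import date


def readings(text: str, century: int = 2000) -> dict:
    """Map each field order to the date it yields, skipping impossible ones."""
    parts = [int(p) for p in text.split("/")]
    result = {}
    for order in ("DMY", "MDY", "YMD"):
        fields = dict(zip(order, parts))
        try:
            result[order] = date(century + fields["Y"], fields["M"], fields["D"])
        except ValueError:
            pass  # e.g. month 13 or day 99
    return result


for order, value in readings("01/02/03").items():
    print(order, value)
# DMY 2003-02-01
# MDY 2003-01-02
# YMD 2001-02-03
```

A parser could use such a list to reject input with more than one valid reading unless the caller has stated a preferred order.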
Mar 29 2006
next sibling parent kris <foo bar.com> writes:
John C wrote:
So, what date is: 01/02/03

1) in an Arab context
2) in an American context
3) in a British context
4) in a CJK context
5) in a French context
6) in a German context
7) in an Israeli context

Ah, a trap. I could say for most of them it's 1 February 2003, but you can't rely on that being so. Really, it's for a good locale library to answer.

Anyone doing locale-specific formatting should take a look at what John created over here: http://svn.dsource.org/projects/mango/trunk/mango/locale/

... all kinds of handy formatting, including a variety of Calendar types. Here's a snip from the docs:

---------------------
// Format with the user's current culture (eg, en-GB).
Formatter.format("General: {0} Hexadecimal: 0x{0:x4} Numeric: {0:N}", 1000);
// -> General: 1000 Hexadecimal: 0x03e8 Numeric: 1,000.00

// Format using a custom display format, substituting groups with those appropriate for Germany.
Formatter.format(Culture.getCulture("de-DE"), "{0:#,#}", 12345678);
// -> 12.345.678

// Format as a monetary value appropriate for Spain.
Formatter.format(Culture.getCulture("es-ES"), "{0:C}", 59.99);
// -> 59,99 €

// Format today's date as appropriate for France.
Formatter.format(Culture.getCulture("fr-FR"), "{0:D}", DateTime.today);
// -> vendredi 3 mars 2006
---------------------

If that's not sufficient for some, there's always the venerable ICU wrappers.

- Kris
Mar 29 2006
prev sibling parent Thomas Kuehne <thomas-dloop kuehne.cn> writes:
John C wrote on 2006-03-29:
 If you must support a common date format, it should be D/M/Y, which is 
 used
 by the vast majority of countries and accepted internationally.
 http://en.wikipedia.org/wiki/Calendar_date

Accepted internationally?

According to the linked article. But I should have used quotation marks...

The article uses a fuzzy meaning for "D/M/Y" - it includes "D.M.Y", "D.M.YYYY", "D. M. Y" and "D/M-Y" and perhaps a few more that weren't stated in the list explicitly, e.g.:

Strictly speaking 01/02/03 isn't considered a date in Germany, but 01.02.03 and 01.02.2003 are.

The ones rendering the "DD/MM/YY" format totally unusable are often the US with "MM/DD/YY". Whenever there is a slight chance that a US entity was involved: check for non-metric length, weight and odd dates :(

Thomas
Mar 29 2006
prev sibling parent reply Fredrik Olsson <peylow treyst.se> writes:
John C wrote:
 "M/D/Y" will stay I think, it is the US way, ambiguous or not, and there 
 is allot of code/people out there making this assumption. If I could 
 choose myself we would all go ISO :).

Who are these people expecting dates to appear in US format, I wonder? A date library that has no notion of locales has no business making any region-specific assumptions and should just implement ISO8601. After all, that's what it's for. If you must support a common date format, it should be D/M/Y, which is used by the vast majority of countries and accepted internationally. http://en.wikipedia.org/wiki/Calendar_date

Ok, let me argue for my point, and then you argue why not :). I have chosen the implementation for one single reason; I do as the SQL99 standard does. Instead of inventing my own scheme I have chosen a scheme I know, and is used by many. I could dumb it down, and greatly reduce code size, and only allow for ISO 8601 formatting, but as I rewrote the PostgreSQL parser implementation I deliberately kept the SQL way. Because it is a known standard, and allows for some flexibility.

// Fredrik
Mar 30 2006
next sibling parent "John C" <johnch_atms hotmail.com> writes:
"Fredrik Olsson" <peylow treyst.se> wrote in message 
news:e0g6r2$206b$1 digitaldaemon.com...
 John C wrote:
 "M/D/Y" will stay I think, it is the US way, ambiguous or not, and there 
 is allot of code/people out there making this assumption. If I could 
 choose myself we would all go ISO :).

Who are these people expecting dates to appear in US format, I wonder? A date library that has no notion of locales has no business making any region-specific assumptions and should just implement ISO8601. After all, that's what it's for. If you must support a common date format, it should be D/M/Y, which is used by the vast majority of countries and accepted internationally. http://en.wikipedia.org/wiki/Calendar_date

Ok, let me argue for my point, and then you argue why not :). I have chosen the implementation for one single reason; I do as the SQL99 standard does. Instead of inventing my own scheme I have chosen a scheme I know, and is used by many. I could dumb it down, and greatly reduce code size, and only allow for ISO 8601 formatting, but as I rewrote the PostgreSQL parser implementation I deliberately kept the SQL way. Because it is a known standard, and allows for some flexibility.

Those are fair points and I've been known to model code on other libraries myself (coming up with original APIs is hard). But it seems to me that you're copying a dubious decision made by its developers. No doubt I'm not exempt from that charge either.
 // Fredrik 

Mar 30 2006
prev sibling parent reply Lucas Goss <lgoss007 gmail.com> writes:
Fredrik Olsson wrote:
 John C wrote:
 Who are these people expecting dates to appear in US format, I wonder?

 A date library that has no notion of locales has no business making 
 any region-specific assumptions and should just implement ISO8601. 
 After all, that's what it's for.

 If you must support a common date format, it should be D/M/Y, which is 
 used by the vast majority of countries and accepted internationally. 
 http://en.wikipedia.org/wiki/Calendar_date

Ok, let me argue for my point, and then you argue why not :). I have chosen the implementation for one single reason; I do as the SQL99 standard does. Instead of inventing my own scheme I have chosen a scheme I know, and is used by many. I could dumb it down, and greatly reduce code size, and only allow for ISO 8601 formatting, but as I rewrote the PostgreSQL parser implementation I deliberately kept the SQL way. Because it is a known standard, and allows for some flexibility.

I believe you said this earlier:
 I intend to do the bare bones, a solid foundation to build on top,
 and to be easy to do small stuff. Intervals, timezones, and more
 advanced stuff should be done with wrappers on top.

To me a good "bare bones" base would be the ISO8601 (even though I'm in the US), allowing small stuff on top, like the SQL99 stuff. I think a lot of libraries try to do too much.
Mar 30 2006
parent reply Fredrik Olsson <peylow gmail.com> writes:
Lucas Goss wrote:
 Fredrik Olsson wrote:
 
 John C wrote:

 Who are these people expecting dates to appear in US format, I wonder?

 A date library that has no notion of locales has no business making 
 any region-specific assumptions and should just implement ISO8601. 
 After all, that's what it's for.

 If you must support a common date format, it should be D/M/Y, which 
 is used by the vast majority of countries and accepted 
 internationally. http://en.wikipedia.org/wiki/Calendar_date

Ok, let me argue for my point, and then you argue why not :). I have chosen the implementation for one single reason; I do as the SQL99 standard does. Instead of inventing my own scheme I have chosen a scheme I know, and is used by many. I could dumb it down, and greatly reduce code size, and only allow for ISO 8601 formatting, but as I rewrote the PostgreSQL parser implementation I deliberately kept the SQL way. Because it is a known standard, and allows for some flexibility.

I believe you said this earlier:
 I intend to do the bare bones, a solid foundation to build on top,
 and to be easy to do small stuff. Intervals, timezones, and more
 advanced stuff should be done with wrappers on top.

To me a good "bare bones" base would be the ISO8601 (even though I'm in the US), allowing small stuff on top, like the SQL99 stuff. I think a lot of libraries try to do too much.

You have used my own words against me well.

I am thinking of rewriting date.d to only allow for properly formatted dates, times and durations according to ISO8601. And then let dateparse.d be an entity of its own, allowing for more "complex" parsing and formatting.

Is that a sound idea?

// Fredrik
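
As a rough idea of what "only properly formatted" input could mean for dates, here is a minimal Python sketch of a strict parser. It accepts only the ISO 8601 extended calendar form YYYY-MM-DD and rejects everything else; the name parse_iso_date is hypothetical, and times, durations, week dates and ordinal dates are deliberately left out of the sketch.

```python
import re
from datetime import date

# Exactly four, two and two ASCII digits separated by hyphens.
_ISO_DATE = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def parse_iso_date(text: str) -> date:
    """Parse YYYY-MM-DD and nothing else; raise ValueError otherwise."""
    match = _ISO_DATE.fullmatch(text)
    if match is None:
        raise ValueError("not an ISO 8601 extended calendar date: " + repr(text))
    year, month, day = (int(group) for group in match.groups())
    return date(year, month, day)  # also rejects e.g. 2006-02-30


print(parse_iso_date("2006-03-28"))  # 2006-03-28
```

A looser parser for forms such as "3/10/06" can then live in a separate module and hand its result to the same date constructor.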
Mar 30 2006
parent reply Lucas Goss <lgoss007 gmail.com> writes:
Fredrik Olsson wrote:
 You have used my own words against me well.

lol...
 I am thinking of rewriting date.d to only allow for properly formatted 
 dates, times and durations according to ISO8601. And then let 
 dateparse.d, be an entity of its own allowing for more "complex" 
 parsing, and formatting.
 
 Is that a sound idea?

That sounds much better to me. Any other ideas from anyone? I don't claim to be an expert in library development... but I'm trying to get better at it.
Mar 31 2006
parent Georg Wrede <georg.wrede nospam.org> writes:
Lucas Goss wrote:
 Fredrik Olsson wrote:
 
 You have used my own words against me well.

lol...
 I am thinking of rewriting date.d to only allow for properly formatted 
 dates, times and durations according to ISO8601. And then let 
 dateparse.d, be an entity of its own allowing for more "complex" 
 parsing, and formatting.

 Is that a sound idea?

That sounds much better to me. Any other ideas from anyone? I don't claim to be an expert in library development... but I'm trying to get better at it.

If you really want a date parser that can do amazing stuff, you might want to look at the *nix 'at' command. It parses dates in a wide variety of formats. Check the sources. It's quite anglo-oriented, but with a little imagination one should be able to widen it to other nationalities too.
Apr 01 2006
prev sibling parent reply Frank Benoit <frank nix.de> writes:
Timestamp is double.
I don't think a date dependent precision is a good choice.
I vote for

/**
 Nanoseconds since 1/1/2000
 */
 */
public typedef long d_timestamp;

This is enough for +-270 Years and has an accuracy which is good enough
for me :)
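
The range of a signed 64-bit tick counter is easy to check. The short Python calculation below is a sketch under two assumptions: a mean Gregorian year of 365.2425 days, and a helper name (range_years) invented for the example. With nanosecond ticks it comes out at roughly 292 years either side of the epoch, the same order of magnitude as the figure quoted in the post.

```python
INT64_MAX = 2**63 - 1
SECONDS_PER_YEAR = 365.2425 * 86400  # mean Gregorian year


def range_years(ticks_per_second: int) -> float:
    """Years representable on each side of the epoch by a signed 64-bit count."""
    return INT64_MAX / ticks_per_second / SECONDS_PER_YEAR


print(round(range_years(10**9)))  # nanosecond ticks -> 292
print(round(range_years(10**6)))  # microsecond ticks
print(round(range_years(10**3)))  # millisecond ticks
```

Each factor of 1000 given up in resolution buys a factor of 1000 in range, so the choice of tick size is a direct trade between the two.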

Frank
Mar 29 2006
parent reply Fredrik Olsson <peylow treyst.se> writes:
Frank Benoit wrote:
 Timestamp is double.
 I don't think a date dependent precision is a good choice.
 I vote for
 
 /**
  Nanoseconds since 1/1/2000
  */
  */
 public typedef long d_timestamp;
 
 This is enough for +-270 Years and has an accuracy which is good enough
 for me :)
 

I see your point, and will try to explain why I have chosen double as I have.

Using double I get the same scale for timestamps as for dates; the integer part is days.

Having dates as days with times as fractions is also how the astronomers do it; they call it Julian Days, and base it on Monday, January 1, 4713 BCE as the epoch. But the idea is the same.

It is the data type used by many database implementations (PostgreSQL, MySQL, MS SQL Server 7 (and beyond?)).

A double can represent infinity and -infinity, and not a number can be not a date.

+-270 years is sort of a limitation :), even a simple genealogy application would hit that limit quite soon. Using a double is based on the idea that the farther away from today, the less relevant is precision.

// Fredrik
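
How much precision a day-count double actually gives depends on how far the value is from zero. The Python sketch below estimates the time resolution at a given day number from the spacing of adjacent doubles. The helper names are hypothetical, and since the post does not say which epoch d_timestamp uses, the two sample values (days since 2000-01-01 and a Julian Day number for the same date) are assumptions chosen only to show the effect.

```python
import math

SECONDS_PER_DAY = 86400.0


def spacing(x: float) -> float:
    """Gap between x and the next larger double (x finite, normal, > 0)."""
    _, exponent = math.frexp(x)  # x == mantissa * 2**exponent, 0.5 <= mantissa < 1
    return math.ldexp(1.0, exponent - 53)


def resolution_seconds(days: float) -> float:
    """Smallest time step representable near the given day count."""
    return spacing(days) * SECONDS_PER_DAY


print(resolution_seconds(2277.0))      # 2006-03-28 counted from 2000-01-01
print(resolution_seconds(2453822.5))   # the same date as a Julian Day number

# Special values come for free with the representation.
print(float("inf") > 1e300, math.isnan(float("nan")))  # True True
```

Under these assumptions the step is tens of nanoseconds with a nearby epoch and tens of microseconds with the Julian epoch, so both stay below a millisecond, but the precision does depend on the date.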
Mar 30 2006
next sibling parent reply Walter Bright <newshound digitalmars.com> writes:
Fredrik Olsson wrote:
 I see your point, and will try to explain why I have chosen double as I 
 have.
 
 Using double I get the same scale for timestamps as for dates; the 
 integer part is days.
 
 Having dates as days with times as fractions is also how the astronomers 
 do it; they call it Julian Days, and base it on Monday, January 1, 4713 
 BCE as the epoch. But the idea is the same.
 
 It is the data type used by many database implementations (PostgreSQL, MySQL, 
 MS SQL Server 7 (and beyond?)).
 
 A double can represent infinity and -infinity, and not a number can be not 
 a date.
 
 +-270 years is sort of a limitation :), even a simple genealogy 
 application would hit that limit quite soon. Using a double is based on 
 the idea that the farther away from today, the less relevant is precision.

Double has another problem when used as a date - there are embedded processors in wide use that don't have floating point hardware. This means that double shouldn't be used in core routines that are not implicitly related to doing floating point calculations. A time/date package should pick one representation for time, and stick with it. Other representations should be supported only by converting them back and forth to the one format.
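
The "one representation, convert at the edges" idea can be sketched with integer arithmetic only. In the Python illustration below the single internal format is assumed to be milliseconds since the Unix epoch; that choice, and all function names, are assumptions of the sketch and not something stated in the thread. Every operation uses integer multiplication, floor division and remainder, so nothing here needs floating point hardware.

```python
MS_PER_SECOND = 1000
MS_PER_MINUTE = 60 * MS_PER_SECOND
MS_PER_HOUR = 60 * MS_PER_MINUTE
MS_PER_DAY = 24 * MS_PER_HOUR


def from_unix_seconds(seconds: int) -> int:
    """Convert an external format into the single internal one."""
    return seconds * MS_PER_SECOND


def to_unix_seconds(ts_ms: int) -> int:
    """Convert back out, flooring towards negative infinity."""
    return ts_ms // MS_PER_SECOND


def split(ts_ms: int) -> tuple:
    """Break a timestamp into (day number, hour, minute, second, millisecond)."""
    days, rest = divmod(ts_ms, MS_PER_DAY)
    hours, rest = divmod(rest, MS_PER_HOUR)
    minutes, rest = divmod(rest, MS_PER_MINUTE)
    seconds, millis = divmod(rest, MS_PER_SECOND)
    return days, hours, minutes, seconds, millis


def join(days: int, hours: int, minutes: int, seconds: int, millis: int) -> int:
    """Inverse of split()."""
    return (days * MS_PER_DAY + hours * MS_PER_HOUR
            + minutes * MS_PER_MINUTE + seconds * MS_PER_SECOND + millis)


print(split(from_unix_seconds(1143504000)))  # (13235, 0, 0, 0, 0)
print(split(-1))                             # (-1, 23, 59, 59, 999)
```

Floor division keeps the time-of-day part non-negative for instants before the epoch, which is the detail most often got wrong when porting this to a language whose integer division truncates towards zero.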
Apr 03 2006
parent reply Georg Wrede <georg.wrede nospam.org> writes:
Walter Bright wrote:
 Double has another problem when used as a date - there are embedded 
 processors in wide use that don't have floating point hardware. This
 means that double shouldn't be used in core routines that are not 
 implicitly related to doing floating point calculations.

Ignoring the issue of date, I have a comment on processors:

IIRC, D will never be found on a processor less than 32 bits. Further, it may take some time before D actually gets used in something embedded. By that time, IMHO, it is unlikely that a 32b processor would not contain a math unit.

---

Of course this may warrant a discussion here, which is good, because then we might end up with a more clear set of goals, both for library development and for D itself.
Apr 04 2006
next sibling parent reply Walter Bright <newshound digitalmars.com> writes:
Georg Wrede wrote:
 Walter Bright wrote:
 Double has another problem when used as a date - there are embedded 
 processors in wide use that don't have floating point hardware. This
 means that double shouldn't be used in core routines that are not 
 implicitly related to doing floating point calculations.

Ignoring the issue of date, I have a comment on processors: IIRC, D will never be found on a processor less than 32 bits. Further, it may take some time before D actually gets used in something embedded. By that time, IMHO, it is unlikely that a 32b processor would not contain a math unit. --- Of course this may warrant a discussion here, which is good, because then we might end up with a more clear set of goals, both for library development and for D itself.

Although it was decided at the start that D wasn't going to accommodate 16 bit processors, for very good reasons, there are 32 bit processors in wide use in the embedded market that do not have hardware floating point. There is no reason to gratuitously not run on those systems.
Apr 04 2006
parent reply Georg Wrede <georg.wrede nospam.org> writes:
Walter Bright wrote:
 Georg Wrede wrote:
 Walter Bright wrote:
 
 Double has another problem when used as a date - there are
 embedded processors in wide use that don't have floating point
 hardware. This means that double shouldn't be used in core
 routines that are not implicitly related to doing floating point
 calculations.

Ignoring the issue of date, I have a comment on processors: IIRC, D will never be found on a processor less than 32 bits. Further, it may take some time before D actually gets used in something embedded. By that time, IMHO, it is unlikely that a 32b processor would not contain a math unit. --- Of course this may warrant a discussion here, which is good, because then we might end up with a more clear set of goals, both for library development and for D itself.

At the start that D wasn't going to accommodate 16 bit processors for very good reasons, there are 32 bit processors in wide use in the embedded market that do not have hardware floating point. There is no reason to gratuitously not run on those systems.

Ok, that was exactly the answer I thought I'd get.

Currently, this issue is not entirely foreign to me. I'm delivering a HW + SW solution to a manufacturer of plastics processing machines, where my solution will supervise the process and alert an operator whenever the machine "wants hand-holding". For that purpose, the choice is between an 8-bit and a 16-bit processor. Very probably a PIC. (So no D here. :-), I'll end up doing it in C.)

Now, considering Moore, and the fact that the 80387 math coprocessor didn't have all too many transistors, the marginal price of math is plummeting. Especially compared with the minimum number of transistors needed for a (general purpose) 32-bit CPU.

Also, since the purveyors of 32-bit processors are keen on showing the ease of use and versatility of their processors, it is likely that even if math is not on the chip, they at least deliver suitable libraries to emulate that in software.

---

As I see it, there are mainly two use cases for D with embedded processors (correct me if I'm wrong):

First (and probably the more popular scenario), there either exists a rudimentary (probably even a real-time) OS for the processor (or application domain), delivered (for free) by the HW manufacturer, or, they deliver the necessary libraries to be used either with their compiler or for GCC cross compiling.

Second use case being, one is about to develop the entire SW for an application "from scratch".

Now, in the former case, math is either on-chip, or included in the libraries. In the latter, either we don't use math, or we make (or acquire) the necessary functions from other sources.

---

The second use case worries me. (Possibly unduly?) D not being entirely decoupled from Phobos, at least creates an illusion of potential problems for "from-scratch" SW development for embedded HW.

---

We do have to remember the reasons leading to choosing a 32-bit processor in the first place: if the process to be controlled is too complicated or otherwise needs more power than a 16-bit CPU can deliver, only then should one choose a 32-bit CPU. Now, at that time, it is likely that requirements for RAM, address space, speed, and other things are big enough that the inclusion of math (in HW or library) becomes minor.

(Oh, and some of the current 16-bit (and even some 8-bit) processors do actually deliver astonishing horsepower already.)

So, assuming D has access to math on _all_ of the processors and HW it'll ever be on, suddenly doesn't seem so arbitrary.
Apr 04 2006
parent reply Walter Bright <newshound digitalmars.com> writes:
Georg Wrede wrote:
 Walter Bright wrote:
 At the start that D wasn't going to accommodate 16 bit processors for
 very good reasons, there are 32 bit processors in wide use in the 
 embedded market that do not have hardware floating point. There is no
 reason to gratuitously not run on those systems.

Ok, that was exactly the answer I thought I'd get. Currently, this issue is not entirely foreign to me. I'm delivering a HW + SW solution to a manufacturer of plastics processing machines, where my solution will supervise the process and alert an operator whenever the machine "wants hand-holding". For that purpose, the choice is between an 8-bit and a 16-bit processor. Very probably a PIC. (So no D here. :-), I'll end up doing it in C.)

So, you're not even using a 32 bit processor, but a 16 bit design. I know for a fact that there are *new* embedded systems designs going on using 32 bit processors that don't have FPUs.
 Now, considering Moore, and the fact that the 80387 math coprocessor 
 didn't have all too many transistors, the marginal price of math is 
 plummeting. Especially compared with the minimum number of transistors 
 needed for a (general purpose) 32-bit CPU.

So why are you using a 16 bit design? I can guess - cost. And that's why embedded systems for 32 bit processors often don't have FPUs. Cost, where even a few cents matter. (Also power consumption.)
 Also, since the purveyors of 32-bit processors are keen on showing the 
 ease of use and versatility of their processors, it is likely that even 
 if math is not on the chip, they at least deliver suitable libraries to 
 emulate that in software.

I have such a library (needed for the DOS-32 support). Although it works fine, it is 100 times slower than hardware floating point. Embedded CPUs are often strapped for speed, so why gratuitously require floating point?
 Now, in the former case, math is either on-chip, or included in the 
 libraries. In the latter, either we don't use math, or we make (or 
 acquire) the necessary functions from other sources.

Or design out unnecessary uses of floating point.
 The second use case worries me. (Possibly unduly?) D not being entirely 
 decoupled from Phobos, at least creates an illusion of potential 
 problems for "from-scratch" SW development for embedded HW.

Phobos doesn't require floating point support from the processor unless one actually uses floating point in the application code. I also really don't understand why anyone using D would require not using Phobos. What's the problem?
 We do have to remember the reasons leading to choosing a 32-bit 
 processor in the first place: if the process to be controlled is too 
 complicated or otherwise needs more power than a 16-bit CPU can deliver, 
 only then should one choose a 32-bit CPU. Now, at that time, it is 
 likely that requirements for RAM, address space, speed, and other things 
 are big enough that the inclusion of math (in HW or library) becomes 
 minor.

All I can say is I posed the same question to embedded systems people using 32 bit CPUs sans FPU, and they tell me the costs are not minor - either in money or power consumption.
Apr 04 2006
next sibling parent kris <foo bar.com> writes:
Walter Bright wrote:
[snip]
 Phobos doesn't require floating point support from the processor unless 
 one actually uses floating point in the application code.
 
 I also really don't understand why anyone using D would require not 
 using Phobos. What's the problem?

Phobos does not suit everyone's ideal of a runtime library. Enforcing its usage as part of the D language is no better than the tight coupling of the Java libraries that you've happily denigrated in the past. There would be no problem with Phobos at all, if you'd avoid hooking it directly into the language. For example, TypeInfo recently changed to import std.string, which itself imports a slew of otherwise redundant code. I truly hope you can see the ironic humour in that :)
 We do have to remember the reasons leading to choosing a 32-bit 
 processor in the first place: if the process to be controlled is too 
 complicated or otherwise needs more power than a 16-bit CPU can 
 deliver, only then should one choose a 32-bit CPU. Now, at that time, 
 it is likely that requirements for RAM, address space, speed, and 
 other things are big enough that the inclusion of math (in HW or 
 library) becomes minor.

All I can say is I posed the same question to embedded systems people using 32 bit CPUs sans FPU, and they tell me the costs are not minor - either in money or power consumption.

I spend a lot of time with MCUs. The cost issue is not so much the register width, but the pin count. That is, a 32-bit device, perhaps with embedded FPU, is not really such a big cost issue (even for battery life, when you talk about static-CMOS design at 10MHz to 100MHz). But you need to feed it with something useful, which tends to increase the trace-count quite quickly (which then leads to other costs, etc, etc).

On the other hand, 8-bit designs are often implemented with as few as 14 pins. That makes an entire system trivial to produce. Heck, there's a Hitachi MCU with 32bit registers on a 64pin package, just to keep the pin-count down (it can address only a few KB though). I'm rather familiar with that one, and can attest to it being able to execute realtime FFTs at 20MHz, via FP emulation using its wide registers. Without those 32bit registers, that just wouldn't be feasible.

Once you get to PDA/Phone land, one is generally talking about 200+ pins on the MCU. Overall costs are up notably at that point, but then the devices support vast address spaces (now heading for the GB range). Such devices are now starting to gain dedicated 3D graphics coprocessors on the board (jeez!), so adding FPU support is surely not a cost issue there?

Still; at both ends of the scale, it's quite likely that one would wind up facing a DSP-oriented design instead of an MCU+FPU design ~ simply because they're readily available and highly competitive (and with formidable libraries available).

I think the upshot is that one probably shouldn't /rely/ on FP support on MCUs, and thus a DateTime library targeted at such devices would be a trifle foolhardy to do so ~ especially when the alternatives are typically just fine?

I'm sure this has now gone completely off-topic;
Apr 04 2006
prev sibling parent reply Georg Wrede <georg.wrede nospam.org> writes:
Walter Bright wrote:
 Georg Wrede wrote:
 Walter Bright wrote:

 At the start that D wasn't going to accommodate 16 bit processors for
 very good reasons, there are 32 bit processors in wide use in the 
 embedded market that do not have hardware floating point. There is no
 reason to gratuitously not run on those systems.

Ok, that was exactly the answer I thought I'd get. Currently, this issue is not entirely foreign to me. I'm delivering a HW + SW solution to a manufacturer of plastics processing machines, where my solution will supervise the process and alert an operator whenever the machine "wants hand-holding". For that purpose, the choice is between an 8-bit and a 16-bit processor. Very probably a PIC. (So no D here. :-), I'll end up doing it in C.)

So, you're not even using a 32 bit processor, but a 16 bit design. I know for a fact that there are *new* embedded systems designs going on using 32 bit processors that don't have FPUs.

True. I understand they are targeted to big manufacturers, who know exactly the use, and do large production runs. And where ASICs would be too expensive, considering the width of the task. (Set top boxes, automotive control subsystems, telecomms network equipment, etc.)

Making significant inroads to those areas, however, may be asking for too much. The manufacturers are big corporations, they have an established (and massive) infrastructure already in place, and that is either directly or indirectly relying on C, whose track record is unparalleled. So for them to even glimpse at D, D would have to offer something significantly better for that domain. (Which (I'm sorry) I don't currently see.)
 Now, considering Moore, and the fact that the 80387 math coprocessor 
 didn't have all too many transistors, the marginal price of math is 
 plummeting. Especially compared with the minimum number of transistors 
 needed for a (general purpose) 32-bit CPU.

So why are you using a 16 bit design? I can guess - cost. And that's why embedded systems for 32 bit processors often don't have FPUs. Cost, where even a few cents matter. (Also power consumption.)

The task is simple enough for an 8-bit processor to handle just fine. (In my case, the thing is mains-operated, and the CPU cost is negligible compared with the rest of the delivery, so my reason is just easier programming.)
 Also, since the purveyors of 32-bit processors are keen on showing the 
 ease of use and versatility of their processors, it is likely that 
 even if math is not on the chip, they at least deliver suitable 
 libraries to emulate that in software.

I have such a library (needed for the DOS-32 support). Although it works fine, it is 100 times slower than hardware floating point. Embedded CPUs are often strapped for speed, so why gratuitously require floating point?

It should be much slower. Otherwise FPUs would not be popular. :-) Otoh, it doesn't slow down anything else, so in many cases the total performance hit is minor. And a good compiler/linker would in any case include only the actually used routines. You may even use this in combination with lookup tables, if profiling results show the need. (Why use FP hardware or libraries at all, is the same kind of question as why use automatic memory management! Right? One can get by without, but if it's there, why not use it.)
 Now, in the former case, math is either on-chip, or included in the 
 libraries. In the latter, either we don't use math, or we make (or 
 acquire) the necessary functions from other sources.

Or design out unnecessary uses of floating point.

For speed, of course. But for size, with D, it's not that simple. With the current size of "Hello World" (if done without C's printf), a basic FP library starts to feel small.
 The second use case worries me. (Possibly unduly?) D not being 
 entirely decoupled from Phobos, at least creates an illusion of 
 potential problems for "from-scratch" SW development for embedded HW.

Phobos doesn't require floating point support from the processor unless one actually uses floating point in the application code.

Turbo Pascal has had this since the '80s. You could turn a switch so that it automatically uses the FP library if float was used and the runtime computer didn't have an FPU.
 I also really don't understand why anyone using D would require not 
 using Phobos. What's the problem?

I admit this is a "feelings based" thing with most people I've talked with. It seems that on embedded platforms, many expect to write all the needed code themselves. It's also felt (possibly unduly??) that Phobos (or whatever general Win+*nix standard library) is mostly useless in embedded applications. Of course, this may also be due to lack of information on their side?
 We do have to remember the reasons leading to choosing a 32-bit 
 processor in the first place: if the process to be controlled is too 
 complicated or otherwise needs more power than a 16-bit CPU can 
 deliver, only then should one choose a 32-bit CPU. Now, at that time, 
 it is likely that requirements for RAM, address space, speed, and 
 other things are big enough that the inclusion of math (in HW or 
 library) becomes minor.

All I can say is I posed the same question to embedded systems people using 32 bit CPUs sans FPU, and they tell me the costs are not minor - either in money or power consumption.

The world is going towards an increasing number of small and midsize companies entering the embedded arena. They typically would want to work with a single architecture, for obvious reasons. A CPU that contains math hardware which can be turned off when not needed (as well as other systems that can be powered off), seems to answer that kind of need. I increasingly see CPU designers understanding this trend.

---

To give a parallel (to explain my view here): There are many Linux distributions that are compiled with 386 as target. At the same time, their specs for memory, clock speed, etc. _in_practice_ rule out any machine not using recent Intel processors. I see this as a joke.

Call this inconsistent specs. I'm discussing here so D would avoid this kind of inconsistencies.

Insisting on not needing hardware FP is ok. But to legitimize that, one has to cater to scarce resources in other areas too. Conversely, not genuinely making the language usable in smaller environments, makes striving to independence of FPU not worth the effort and inconvenience.
Apr 06 2006
parent reply Walter Bright <newshound digitalmars.com> writes:
Georg Wrede wrote:
 I admit this is a "feelings based" thing with most people I've talked 
 with. It seems that on embedded platforms, many expect to write all the 
 needed code themselves. It's also felt (possibly unduly??) that Phobos 
 (or whatever general Win+*nix standard library) is mostly useless in 
 embedded applications.

I'd like to get to the bottom of this feeling. For example, Kris was unhappy that typeinfo imported std.strings. I can't figure out what the problem with that is.
 To give a parallel (to explain my view here): There are many Linux 
 distributions that are compiled with 386 as target. At the same time, 
 their specs for memory, clock speed, etc. _in_practice_ rule out any 
 machine not using recent Intel processors. I see this as a joke.
 
 Call this inconsistent specs. I'm discussing here so D would avoid this 
 kind of inconsistencies.

For the embedded people I've talked with, D without floating point would have been a good match.
 Insisting on not needing hardware FP is ok. But to legitimize that, one 
 has to cater to scarce resources in other areas too. Conversely, not 
 genuinely making the language usable in smaller environments, makes 
 striving to independence of FPU not worth the effort and inconvenience.

It isn't necessary to strive to not use the FPU. Just don't use it unless floating point is actually needed. There is no need nor benefit to use floating point for calendar time. I've also seen people use floating point for random number generators - this is also neither necessary nor beneficial.
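As a rough illustration of the point above, calendar arithmetic and a simple pseudo-random generator can both be written with integer operations only. The sketch below is not the std.date API or the proposal's code; the function names (`daysFromCivil`, `weekdayFromDays`, `toMillis`, `nextRandom`) are hypothetical, the day-count formula is the widely used "civil from days" integer method, and the generator is a plain 32-bit linear congruential step with commonly published constants. It assumes a current D compiler.

```d
// Integer-only sketch; every name here is illustrative, not part of Phobos.

/// Days since 1970-01-01 for a proleptic Gregorian date (may be negative).
long daysFromCivil(int year, int month, int day) pure nothrow @nogc @safe
{
    // Treat March as the first month so the leap day falls at year end.
    long y = month <= 2 ? year - 1 : year;
    long era = (y >= 0 ? y : y - 399) / 400;            // 400-year cycle
    long yoe = y - era * 400;                           // year of era, 0..399
    long mp = (month + 9) % 12;                         // March = 0 ... February = 11
    long doy = (153 * mp + 2) / 5 + day - 1;            // day of shifted year
    long doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;   // day of era
    return era * 146097 + doe - 719468;
}

/// Day of week, 0 = Sunday ... 6 = Saturday (1970-01-01 was a Thursday).
int weekdayFromDays(long days) pure nothrow @nogc @safe
{
    long w = (days + 4) % 7;
    return cast(int)(w < 0 ? w + 7 : w);
}

/// Milliseconds since the epoch, held in a 64-bit integer.
long toMillis(long days, int hour, int minute, int second, int ms)
    pure nothrow @nogc @safe
{
    return (((days * 24 + hour) * 60 + minute) * 60 + second) * 1000 + ms;
}

/// One step of a 32-bit linear congruential generator (wraps modulo 2^32).
uint nextRandom(ref uint state) pure nothrow @nogc @safe
{
    state = state * 1664525u + 1013904223u;
    return state;
}
```

With these definitions, `daysFromCivil(2006, 3, 28)` is 13235 and `weekdayFromDays(13235)` is 2, a Tuesday. No floating point type appears anywhere, so nothing here should force FPU support or a software emulation library to be linked.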
Apr 06 2006
next sibling parent Georg Wrede <georg.wrede nospam.org> writes:
Walter Bright wrote:

 There is no need nor benefit to use floating point for calendar time.
 I've also seen people use floating point for random number generators
 - this is also neither necessary nor beneficial.

Heh, that's why I changed the subject. :-) I'm not against the calendar thing. I'm only talking processors and system requirements, etc. here. And consistency of specs. I'll think some more before actually commenting on the rest of your post.
Apr 06 2006
prev sibling next sibling parent reply kris <foo bar.com> writes:
Walter Bright wrote:
 Georg Wrede wrote:
 
 I admit this is a "feelings based" thing with most people I've talked 
 with. It seems that on embedded platforms, many expect to write all 
 the needed code themselves. It's also felt (possibly unduly??) that 
 Phobos (or whatever general Win+*nix standard library) is mostly 
 useless in embedded applications.

I'd like to get to the bottom of this feeling. For example, Kris was unhappy that typeinfo imported std.strings. I can't figure out what the problem with that is.

I'll try to explain it from my perspective:

1) You show an aversion to tightly coupled library modules ~ made a number of negative comments about the Java libraries in that respect ~ and have spelled out in the past a desire to duplicate code as necessary to avoid said tight coupling. This is good design, and it's one of the harder things to balance when building a library. Yet, there's flagrant cases where D tosses this out of the window along with the bathwater.

Instead of adding a duplicate itoa() method (about 60 bytes of code), or perhaps linking to the C library version, TypeInfo gratuitously imports std.string and all its vast array of baggage. Heck, everyone makes mistakes, but your comment above indicates you feel this kind of tight coupling is perfectly fine?

Then, there's printf() being linked via Object ~ it's been 2 years since the push to have that removed (which you agreed to), yet it's now clear there's no intent to do so. So what's wrong with printf()? Well, it brings along with it almost the entire C IO library, including most of the wide-char processing and, of course, all the floating-point support, setup, and management. All completely unnecessary where one doesn't use it. And it's linked at the base of the Object tree.

Without wishing to put too fine a point on it, you continue to do just exactly what you preach against; and there's apparently no good reason for it.

2) Not everyone likes Phobos. One might think you'd be happy to encourage (support, even!) alternate libraries that might far exceed the utility and/or design parameters of Phobos itself. Yet, by tightly coupling the D language to Phobos, you make life difficult for those alternate libraries. And for what reason? It makes no sense at all.

If you'd decouple the language from the library, you'd end up with something very close to Ares. It's lean, efficient, and very flexible. In fact, it lends itself very well to others building a working D compiler & environment ~ the kind of thing that helps increase adoption by decreasing concerns. By encouraging development of alternate libraries, the D language stands a better chance of having a good one. Via judicious decoupling, you can keep the resultant executable lean and mean ~ making it more attractive to the embedded market, amongst others. D might even avoid having to link the FPU management code by default :-P

You said "Kris was unhappy that typeinfo imported std.strings. I can't figure out what the problem with that is" ~~ I hope this helps you get there.
Apr 06 2006
parent reply Walter Bright <newshound digitalmars.com> writes:
kris wrote:
 Walter Bright wrote:
 I'd like to get to the bottom of this feeling. For example, Kris was 
 unhappy that typeinfo imported std.strings. I can't figure out what 
 the problem with that is.

I'll try to explain it from my perspective:

1) You show an aversion to tightly coupled library modules ~ made a number of negative comments about the Java libraries in that respect ~ and have spelled out in the past a desire to duplicate code as necessary to avoid said tight coupling. This is good design, and it's one of the harder things to balance when building a library. Yet, there's flagrant cases where D tosses this out of the window along with the bathwater.

The trouble with Java was that even the most trivial program pulled in *everything*, including the graphics library. The typeinfo for a particular type is only linked in if that type is actually used in a user program.
 Instead of adding a duplicate itoa() method (about 60 bytes of code), or 
 perhaps linking to the C library version, TypeInfo gratuitously imports 
 std.string and all its vast array of baggage. Heck, everyone makes 
 mistakes, but your comment above indicates you feel this kind of tight 
 coupling is perfectly fine?

Although there is a lot of code in std.string, unreferenced free functions in it should be discarded by the linker. A check of the generated .map file should verify this - it is certainly supposed to work that way. One problem Java has is that there are no free functions, so referencing one function wound up pulling in every part of the class the function resided in.
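The distinction drawn here between free functions and class members can be sketched in a few lines of D. This is only an illustration with made-up names (`usedFree`, `unusedFree`, `Bundle`, `run`), not code from Phobos. Whether a given linker really drops the unreferenced function depends on the toolchain and its settings, which is why checking the .map file is the reliable test.

```d
// Illustrative names only; nothing here is taken from Phobos.

int usedFree() { return 1; }

// Never referenced: a linker that discards unreferenced functions
// is free to leave this one out of the executable.
int unusedFree() { return 2; }

class Bundle
{
    int used() { return 3; }

    // Never called, but D class methods are virtual by default, so the
    // class's vtable holds a reference to it. Once Bundle is used at all,
    // this method is generally kept along with the rest of the class.
    int neverCalled() { return 4; }
}

int run()
{
    auto b = new Bundle;
    return usedFree() + b.used();
}
```

Calling `run()` returns 4. The interesting part is not the result but the symbol list: under the assumptions above, one would expect `neverCalled` to appear in the map file and `unusedFree` to be absent.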
 Then, there's printf() being linked via Object ~ it's been 2 years since 
 the push to have that removed (which you agreed to), yet it's now clear 
 there's no intent to do so. So what's wrong with printf()? Well, it 
 brings along with it almost the entire C IO library, including most of 
 the wide-char processing and, of course, all the floating-point support, 
 setup, and management. All completely unnecessary where one doesn't use 
 it. And it's linked at the base of the Object tree.

printf doesn't pull in the floating point library (I went to a lot of effort to make that so!). It does pull in the C IO library, which is very hard to not pull in (there always seems to be something referencing it). It shouldn't pull in the C wide character stuff. D's IO (writefln) will pull in C's IO anyway, so the only thing extra is the integer version of the specific printf code (about 4K).
 2) Not everyone likes Phobos. One might think you'd be happy to 
 encourage (support, even!) alternate libraries that might far exceed the 
 utility and/or design parameters of Phobos itself. Yet, by tightly 
 coupling the D language to Phobos, you make life difficult for those 
 alternate libraries. And for what reason? It makes no sense at all.

The only parts of phobos directly referenced by the compiler are typeinfo, object, and the code in internal.
 If you'd decouple the language from the library, you'd end up with 
 something very close to Ares. It's lean, efficient, and very flexible. 
 In fact, it lends itself very well to others building a working D 
 compiler & environment ~ the kind of thing that helps increase adoption 
 by decreasing concerns. By encouraging development of alternate 
 libraries, the D language stands a better chance of having a good one. 
 Via judicious decoupling, you can keep the resultant executable lean and 
 mean ~ making it more attractive to the embedded market, amongst others. 
 D might even avoid having to link the FPU management code by default :-P
 
 
 You said "Kris was unhappy that typeinfo imported std.strings. I can't 
 figure out what the problem with that is" ~~ I hope this helps you get 
 there.

And I hope I responded adequately.
Apr 06 2006
next sibling parent reply kris <foo bar.com> writes:
As I said, that viewpoint is from my perspective ~ the intent was 
certainly not to elicit a defensive response. Instead, I'd hoped you'd 
be open to some suggestions;

More inline:


Walter Bright wrote:
 kris wrote:
 
 Walter Bright wrote:

 I'd like to get to the bottom of this feeling. For example, Kris was 
 unhappy that typeinfo imported std.strings. I can't figure out what 
 the problem with that is.

I'll try to explain it from my perspective:

1) You show an aversion to tightly coupled library modules ~ made a number of negative comments about the Java libraries in that respect ~ and have spelled out in the past a desire to duplicate code as necessary to avoid said tight coupling. This is good design, and it's one of the harder things to balance when building a library. Yet, there's flagrant cases where D tosses this out of the window along with the bathwater.

The trouble with Java was that even the most trivial program pulled in *everything*, including the graphics library. The typeinfo for a particular type is only linked in if that type is actually used in a user program.

Yes, that's correct. But typeinfo is a rather rudimentary part of the language support. Wouldn't you agree? If I, for example, declare an array of 10 bytes (static byte[10]) then I'm bound over to import std.string ~ simply because TypeInfo_StaticArray wants to use std.string.toString(int), rather than the C library version of itoa() or a "low-level support" version instead. That's tight-coupling within very low-level language support. Uncool. Wouldn't you at least agree that specific instance is hardly an absolute necessity?
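For readers wondering what a "low-level support" version might look like, here is a minimal sketch of an integer-to-text routine that imports nothing. It is not the code in TypeInfo_StaticArray or std.string; the name `intToString` and the caller-supplied buffer are assumptions made for the example, and it targets a current D compiler rather than the 2006 one.

```d
// Hypothetical helper: formats an int into a caller-supplied buffer
// without importing std.string or calling the C library.
char[] intToString(int value, char[] buf) pure nothrow @nogc @safe
{
    // "-2147483648" is the longest possible result: 11 characters.
    assert(buf.length >= 11);

    // Work on the unsigned magnitude so that int.min does not overflow.
    uint u = value < 0 ? 0u - cast(uint) value : cast(uint) value;

    // Fill the buffer from the end, least significant digit first.
    size_t i = buf.length;
    do
    {
        buf[--i] = cast(char)('0' + u % 10);
        u /= 10;
    } while (u != 0);

    if (value < 0)
        buf[--i] = '-';

    // The result is a slice of buf; no allocation takes place.
    return buf[i .. $];
}
```

Given `char[11] buf;`, the call `intToString(-42, buf[])` yields the slice "-42". Whether duplicating such a routine actually makes an executable smaller than reusing std.string.toString is exactly what the thread disputes, so the sketch only shows that the routine itself is small.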
 Instead of adding a duplicate itoa() method (about 60 bytes of code), 
 or perhaps linking to the C library version, TypeInfo gratuitously 
 imports std.string and all its vast array of baggage. Heck, everyone 
 makes mistakes, but your comment above indicates you feel this kind of 
 tight coupling is perfectly fine?

Although there is a lot of code in std.string, unreferenced free functions in it should be discarded by the linker. A check of the generated .map file should verify this - it is certainly supposed to work that way. One problem Java has is that there are no free functions, so referencing one function wound up pulling in every part of the class the function resided in.

This is exactly the case with printf <g>. It winds up linking the world because it's a general purpose utility function that does all kinds of conversion and all kinds of IO. Printf() is an all or nothing design ~ you can't selectively link pieces of it. That's usually not a problem. However, you've chosen to bind it to low-level language support (in the root Object). That choice causes tight coupling between the language low-level support and a high-level library function ~ one which ought to be optional. Wouldn't you at least agree this specific case is not necessary for the D language to function correctly? That there are other perfectly workable alternatives?
 Then, there's printf() being linked via Object ~ it's been 2 years 
 since the push to have that removed (which you agreed to), yet it's 
 now clear there's no intent to do so. So what's wrong with printf()? 
 Well, it brings along with it almost the entire C IO library, 
 including most of the wide-char processing and, of course, all the 
 floating-point support, setup, and management. All completely 
 unnecessary where one doesn't use it. And it's linked at the base of 
 the Object tree.

printf doesn't pull in the floating point library (I went to a lot of effort to make that so!). It does pull in the C IO library, which is very hard to not pull in (there always seems to be something referencing it). It shouldn't pull in the C wide character stuff. D's IO (writefln) will pull in C's IO anyway, so the only thing extra is the integer version of the specific printf code (about 4K).

How can it convert %f, %g and so on if it doesn't use FP support at all? Either way, it's not currently possible to build a D program without a swathe of FP support code, printf, the entire C IO package, wide-char support, and a whole lot more besides. I'd assumed the linked FP support was for printf, but perhaps it's for std.string instead? I've posted the linker maps (in the past) to illustrate exactly this. There's no absolute need for most of this stuff. It shouldn't be bound at the low level.
 2) Not everyone likes Phobos. One might think you'd be happy to 
 encourage (support, even!) alternate libraries that might far exceed 
 the utility and/or design parameters of Phobos itself. Yet, by tightly 
 coupling the D language to Phobos, you make life difficult for those 
 alternate libraries. And for what reason? It makes no sense at all.

The only parts of phobos directly referenced by the compiler are typeinfo, object, and the code in internal.

No argument there. Yet those modules are slowly importing chunks of Phobos, making them dependencies also. My point is, and has always been, there's no need for those secondary dependencies. Especially at the level of language-support (like typeinfo and object).
 
 
 If you'd decouple the language from the library, you'd end up with 
 something very close to Ares. It's lean, efficient, and very flexible. 
 In fact, it lends itself very well to others building a working D 
 compiler & environment ~ the kind of thing that helps increase 
 adoption by decreasing concerns. By encouraging development of 
 alternate libraries, the D language stands a better chance of having a 
 good one. Via judicious decoupling, you can keep the resultant 
 executable lean and mean ~ making it more attractive to the embedded 
 market, amongst others. D might even avoid having to link the FPU 
 management code by default :-P


 You said "Kris was unhappy that typeinfo imported std.strings. I can't 
 figure out what the problem with that is" ~~ I hope this helps you get 
 there.

And I hope I responded adequately.

Are you not at all interested in improving this aspect of the language usage?
Apr 06 2006
parent reply Walter Bright <newshound digitalmars.com> writes:
kris wrote:
 Yes, that's correct. But typeinfo is a rather rudimentary part of the 
 language support. Wouldn't you agree? If I, for example, declare an 
 array of 10 bytes (static byte[10]) then I'm bound over to import 
 std.string ~ simply because TypeInfo_StaticArray wants to use 
 std.string.toString(int), rather than the C library version of itoa() or 
 a "low-level support" version instead.

It has nothing to do with having a static byte[10] declaration. For the program:

    void main()
    {
        static byte[10] b;
    }

The only things referenced by the object file are _main, __acrtused_con, and __Dmain. You can verify this by running obj2asm on the output, which gives:

-------------------------------------
_TEXT   segment dword use32 public 'CODE'   ;size is 0
_TEXT   ends
_DATA   segment para use32 public 'DATA'    ;size is 0
_DATA   ends
CONST   segment para use32 public 'CONST'   ;size is 0
CONST   ends
_BSS    segment para use32 public 'BSS'     ;size is 10
_BSS    ends
FLAT    group
includelib phobos.lib
        extrn   _main
        extrn   __acrtused_con
        extrn   __Dmain

__Dmain COMDAT flags=x0 attr=x0 align=x0

_TEXT   segment
        assume  CS:_TEXT
_TEXT   ends
_DATA   segment
_DATA   ends
CONST   segment
CONST   ends
_BSS    segment
_BSS    ends
__Dmain comdat
        assume  CS:__Dmain
                xor     EAX,EAX
                ret
__Dmain ends
        end
----------------------------------

Examining the .map file produced shows that only these functions are pulled in from std.string:

0002:00002364       _D3std6string7iswhiteFwZi  00404364
0002:000023A4       _D3std6string3cmpFAaAaZi   004043A4
0002:000023E8       _D3std6string4findFAawZi   004043E8
0002:00002450       _D3std6string8toStringFkZAa 00404450
0002:000024CC       _D3std6string9inPatternFwAaZi 004044CC
0002:00002520       _D3std6string6columnFAaiZk 00404520

I do not know offhand why a couple of those are pulled in, but I suggest that obj2asm and the generated .map files are invaluable at determining what pulls in what. Sometimes the results are surprising.
 That's tight-coupling within very low-level language support. Uncool.
 Wouldn't you at least agree that specific instance is hardly an absolute 
 necessity?

std.string.toString is 124 bytes long, and doesn't pull anything else in (except see below). Writing another version of it in typeinfo isn't going to reduce the size of the program *at all*, in fact, it will likely increase it because now there'll be two versions of it.
 Although there is a lot of code in std.string, unreferenced free 
 functions in it should be discarded by the linker. A check of the 
 generated .map file should verify this - it is certainly supposed to 
 work that way. One problem Java has is that there are no free 
 functions, so referencing one function wound up pulling in every part 
 of the class the function resided in.


 This is exactly the case with printf <g>. It winds up linking the world

No, it does not link in the world, floating point, or graphics libraries. It links in C's standard I/O (which usually gets linked in anyway), and about 4000 bytes of code. That's somewhat less than a megabyte <g>.
 because it's a general purpose utility function that does all kinds of 
 conversion and all kinds of IO. Printf() is an all or nothing design ~ 
 you can't selectively link pieces of it.
 
 That's usually not a problem. However, you've chosen to bind it to 
 low-level language support (in the root Object). That choice causes 
 tight coupling between the language low-level support and a high-level 
 library function ~ one which ought to be optional.
 
 Wouldn't you at least agree this specific case is not necessary for the 
 D language to function correctly? That there are other perfectly 
 workable alternatives?

It's just not a big deal. Try the following:

    extern (C) int printf(char* f, ...) { return 0; }

    void main()
    {
        static byte[10] b;
    }

and compare the difference in exe file sizes, with and without the printf stub.
 printf doesn't pull in the floating point library (I went to a lot of 
 effort to make that so!). It does pull in the C IO library, which is 
 very hard to not pull in (there always seems to be something 
 referencing it). It shouldn't pull in the C wide character stuff. D's 
 IO (writefln) will pull in C's IO anyway, so the only thing extra is 
 the integer version of the specific printf code (about 4K).


 How can it convert %f, %g and so on if it doesn't use FP support at all?

It's magic! Naw, it's just that if you actually use floating point in a program, the compiler emits a special extern reference (to __fltused) which pulls in the floating point IO formatting code. Otherwise, it defaults to just a stub. Try it.
 Either way, it's not currently possible to build a D program without a 
 swathe of FP support code,
 printf,
 the entire C IO package,
 wide-char support,
 and a whole lot more besides. I'd assumed the linked FP support 
 was for printf, but perhaps it's for std.string instead? I've posted the 
 linker maps (in the past) to illustrate exactly this.

My point is that assuming what is pulled in by what is about as reliable as guessing where the bottlenecks in one's code are. You can't tell bottlenecks without a profiler, and you've got both hands tied behind your back trying to figure out who pulls in what if you're not using .map files, grep, and obj2asm.
 Are you not at all interested in improving this aspect of the language 
 usage?

Sure, but based on accurate information. Pulling printf won't do anything. Try it if you don't agree.

For example, which modules pull in the floating point formatting code? It isn't printf. We can find out by doing a grep for __fltused:

boxer.obj: __fltused
complex.obj: __fltused
conv.obj: __fltused
date.obj: __fltused
demangle.obj: __fltused
format.obj: __fltused
gamma.obj: __fltused
math.obj: __fltused
math2.obj: __fltused
outbuffer.obj: __fltused
stream.obj: __fltused
string.obj: __fltused
ti_Acdouble.obj: __fltused
ti_Acfloat.obj: __fltused
ti_Acreal.obj: __fltused
ti_Adouble.obj: __fltused
ti_Afloat.obj: __fltused
ti_Areal.obj: __fltused
ti_cdouble.obj: __fltused
ti_cfloat.obj: __fltused
ti_creal.obj: __fltused
ti_double.obj: __fltused
ti_float.obj: __fltused
ti_real.obj: __fltused

Some examination of the .map file shows that the only one of these pulled in by default is std.string. So I think a reasonable approach would be to look at removing the floating point from std.string - printf isn't the problem, nor is referencing a function in std.string.
Apr 06 2006
next sibling parent reply Sean Kelly <sean f4.ca> writes:
Walter Bright wrote:
 kris wrote:
 Yes, that's correct. But typeinfo is a rather rudimentary part of the 
 language support. Wouldn't you agree? If I, for example, declare an 
 array of 10 bytes (static byte[10]) then I'm bound over to import 
 std.string ~ simply because TypeInfo_StaticArray wants to use 
 std.string.toString(int), rather than the C library version of itoa() 
 or a "low-level support" version instead.

It has nothing to do with having a static byte[10] declaration. For the program:

    void main()
    {
        static byte[10] b;
    }

The only things referenced by the object file are _main, __acrtused_con, and __Dmain. You can verify this by running obj2asm on the output, which gives:

-------------------------------------
_TEXT   segment dword use32 public 'CODE'       ;size is 0
_TEXT   ends
_DATA   segment para use32 public 'DATA'        ;size is 0
_DATA   ends
CONST   segment para use32 public 'CONST'       ;size is 0
CONST   ends
_BSS    segment para use32 public 'BSS' ;size is 10
_BSS    ends
FLAT    group
includelib phobos.lib
        extrn   _main
        extrn   __acrtused_con
        extrn   __Dmain

__Dmain COMDAT flags=x0 attr=x0 align=x0

_TEXT   segment
        assume  CS:_TEXT
_TEXT   ends
_DATA   segment
_DATA   ends
CONST   segment
CONST   ends
_BSS    segment
_BSS    ends
__Dmain comdat
        assume  CS:__Dmain
                xor     EAX,EAX
                ret
__Dmain ends
        end
----------------------------------

As expected, building this against Ares produces the exact same output.
 Examining the .map file produced shows that only these functions are 
 pulled in from std.string:
 
 0002:00002364       _D3std6string7iswhiteFwZi  00404364
 0002:000023A4       _D3std6string3cmpFAaAaZi   004043A4
 0002:000023E8       _D3std6string4findFAawZi   004043E8
 0002:00002450       _D3std6string8toStringFkZAa 00404450
 0002:000024CC       _D3std6string9inPatternFwAaZi 004044CC
 0002:00002520       _D3std6string6columnFAaiZk 00404520
 
 I do not know offhand why a couple of those are pulled in, but I suggest 
 that obj2asm and the generated .map files are invaluable at determining 
 what pulls in what. Sometimes the results are surprising.
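
The kind of check suggested above, pulling one module's entries out of a .map file, can be sketched as follows. It is a rough illustration that assumes symbol lines have exactly three whitespace-separated fields (segment:offset, symbol, flat address), as in the lines quoted in this thread; the function name symbols_for_module is hypothetical, and real map files may contain other line shapes that this ignores.

```python
def symbols_for_module(map_text, mangled_prefix):
    """Return (symbol, address) pairs whose symbol starts with mangled_prefix."""
    results = []
    for line in map_text.splitlines():
        fields = line.split()
        # Expected shape: segment:offset, symbol, flat address.
        # Header and segment-summary lines have a different field count.
        if len(fields) == 3 and fields[1].startswith(mangled_prefix):
            results.append((fields[1], int(fields[2], 16)))
    return results

# Lines copied from the posts in this thread.
map_text = """\
 Start         Length     Name                   Class
 0002:00000000 0000E1B8H  _TEXT                  CODE 32-bit
 0002:00002364       _D3std6string7iswhiteFwZi  00404364
 0002:000023A4       _D3std6string3cmpFAaAaZi   004043A4
 0002:000023E8       _D3std6string4findFAawZi   004043E8
 0002:00002450       _D3std6string8toStringFkZAa 00404450
 0002:000024CC       _D3std6string9inPatternFwAaZi 004044CC
 0002:00002520       _D3std6string6columnFAaiZk 00404520
"""

string_symbols = symbols_for_module(map_text, "_D3std6string")
for name, address in string_symbols:
    print(f"{address:08X}  {name}")
```

The prefix _D3std6string is simply the mangled form of the module name std.string, so the same helper works for any other module prefix.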

Do I have to do anything special to get this data in the .map file? Mine contains no function references at all. Here's the first few lines (where it seems the function data should be):

    Start         Length     Name                   Class
    0002:00000000 0000E1B8H  _TEXT                  CODE 32-bit
    0002:0000E1B8 00000162H  ICODE                  ICODE 32-bit
    0003:00000000 00000004H  .CRT$XIA               DATA 32-bit
 It's just not a big deal. Try the following:
 
 extern (C) int printf(char* f, ...) { return 0; }
 
 void main()
 {
     static byte[10] b;
 }
 
 and compare the difference in exe file sizes, with and without the 
 printf stub.

Compiled against Ares with "-release" specified, the EXE is 82,972 bytes without the stub and 82,972 bytes with the stub. Compiled against Phobos, it's 87,068 bytes without the stub and 86,556 with the stub. So you're right, it's not a big difference at all. And neither is the ~5K executable size difference--I think the gap has actually closed over time, as I remember it being wider.

The zero byte difference for Ares is a bit confusing though. I'll take a look at the binaries on my way home and see if I can suss out the differences.

Sean
Apr 06 2006
next sibling parent Derek Parnell <derek psych.ward> writes:
On Thu, 06 Apr 2006 17:06:15 -0700, Sean Kelly wrote:

 Walter Bright wrote:
 Do I have to do anything special to get this data in the .map file? 
 Mine contains no function references at all.  Here's the first few lines 
 (where it seems the function data should be):
 
   Start         Length     Name                   Class
   0002:00000000 0000E1B8H  _TEXT                  CODE 32-bit
   0002:0000E1B8 00000162H  ICODE                  ICODE 32-bit
   0003:00000000 00000004H  .CRT$XIA               DATA 32-bit

    dmd yourprog.d -L/map

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocracy!"
7/04/2006 10:25:31 AM
Apr 06 2006
prev sibling parent Walter Bright <newshound digitalmars.com> writes:
Sean Kelly wrote:
 Do I have to do anything special to get this data in the .map file? Mine 
 contains no function references at all.

Add the switch -L/map to the dmd command line.
 Compiled against Ares with "-release" specified, the EXE is 82,972 bytes 
 without the stub and 82,972 bytes with the stub.  Compiled against 
 Phobos, it's 87,068 bytes without the stub and 86,556 with the stub.  So 
 you're right, it's not a big difference at all.  And neither is the ~5K 
 executable size difference--I think the gap has actually closed over 
 time, as I remember it being wider.  The zero byte difference for Ares 
 is a bit confusing though.  I'll take a look at the binaries on my way 
 home and see if I can suss out the differences.

Segment sizes get rounded up, I think to the page size.
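
A small Python sketch of why rounding can hide a few bytes of difference. The 4096-byte alignment is an assumption taken from the "page size" guess above, and the raw sizes are made-up numbers chosen only to illustrate the effect; they are not measurements from the Ares or Phobos builds.

```python
def round_up(size, alignment):
    """Round size up to the next multiple of alignment."""
    return (size + alignment - 1) // alignment * alignment

ALIGNMENT = 4096  # assumed page size; the linker's real granularity may differ

# Hypothetical raw segment sizes, without and with a few bytes of stub code.
without_stub = 57000
with_stub = 57010

print(round_up(without_stub, ALIGNMENT))
print(round_up(with_stub, ALIGNMENT))
```

Both values land in the same 4096-byte bucket, so the rounded sizes are identical even though the raw sizes differ by ten bytes.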
Apr 06 2006
prev sibling next sibling parent reply kris <foo bar.com> writes:
Long post; sorry about that.


Walter Bright wrote:
 kris wrote:
 
 Yes, that's correct. But typeinfo is a rather rudimentary part of the 
 language support. Wouldn't you agree? If I, for example, declare an 
 array of 10 bytes (static byte[10]) then I'm bound over to import 
 std.string ~ simply because TypeInfo_StaticArray wants to use 
 std.string.toString(int), rather than the C library version of itoa() 
 or a "low-level support" version instead.

It has nothing to do with having a static byte[10] declaration. For the program:

    void main()
    {
        static byte[10] b;
    }

The only things referenced by the object file are _main, __acrtused_con, and __Dmain. You can verify this by running obj2asm on the output, which gives:

-------------------------------------
_TEXT   segment dword use32 public 'CODE'       ;size is 0
_TEXT   ends
_DATA   segment para use32 public 'DATA'        ;size is 0
_DATA   ends
CONST   segment para use32 public 'CONST'       ;size is 0
CONST   ends
_BSS    segment para use32 public 'BSS' ;size is 10
_BSS    ends
FLAT    group
includelib phobos.lib
        extrn   _main
        extrn   __acrtused_con
        extrn   __Dmain

__Dmain COMDAT flags=x0 attr=x0 align=x0

_TEXT   segment
        assume  CS:_TEXT
_TEXT   ends
_DATA   segment
_DATA   ends
CONST   segment
CONST   ends
_BSS    segment
_BSS    ends
__Dmain comdat
        assume  CS:__Dmain
                xor     EAX,EAX
                ret
__Dmain ends
        end
----------------------------------

It would help if you'd note under what circumstances the TypeInfo /is/ included, then. For example, this program:

    void main()
    {
        throw new Exception ("");
    }

causes all kinds of TypeInfo to be linked:

    _D3std8typeinfo2Aa11TypeInfo_Aa5tsizeFZk 004074E8
    _D3std8typeinfo2Aa11TypeInfo_Aa6equalsFPvPvZi 00407470
    _D3std8typeinfo2Aa11TypeInfo_Aa7compareFPvPvZi 004074CC
    _D3std8typeinfo2Aa11TypeInfo_Aa7getHashFPvZk 00407430
    _D3std8typeinfo2Aa11TypeInfo_Aa8toStringFZAa 00407424
    _D3std8typeinfo7ti_char10TypeInfo_a4swapFPvPvZv 0040466C
    _D3std8typeinfo7ti_char10TypeInfo_a5tsizeFZk 00404664
    _D3std8typeinfo7ti_char10TypeInfo_a6equalsFPvPvZi 00404630
    _D3std8typeinfo7ti_char10TypeInfo_a7compareFPvPvZi 0040464C
    _D3std8typeinfo7ti_char10TypeInfo_a7getHashFPvZk 00404624
    _D3std8typeinfo7ti_char10TypeInfo_a8toStringFZAa 00404618
    _D3std8typeinfo7ti_uint10TypeInfo_k4swapFPvPvZv 00407400
    _D3std8typeinfo7ti_uint10TypeInfo_k5tsizeFZk 004073F8
    _D3std8typeinfo7ti_uint10TypeInfo_k6equalsFPvPvZi 004073B0
    _D3std8typeinfo7ti_uint10TypeInfo_k7compareFPvPvZi 004073CC
    _D3std8typeinfo7ti_uint10TypeInfo_k7getHashFPvZk 004073A4
    _D3std8typeinfo7ti_uint10TypeInfo_k8toStringFZAa 00407398
    _D6object14TypeInfo_Array4swapFPvPvZv 004028A8
    _D6object14TypeInfo_Array5tsizeFZk 004028A0
    _D6object14TypeInfo_Array6equalsFPvPvZi 00402778
    _D6object14TypeInfo_Array7compareFPvPvZi 00402808
    _D6object14TypeInfo_Array7getHashFPvZk 0040271C
    _D6object14TypeInfo_Array8toStringFZAa 004026F8
    _D6object14TypeInfo_Class5tsizeFZk 00402C48
    _D6object14TypeInfo_Class6equalsFPvPvZi 00402BB8
    _D6object14TypeInfo_Class7compareFPvPvZi 00402C00
    _D6object14TypeInfo_Class7getHashFPvZk 00402BA8
    _D6object14TypeInfo_Class8toStringFZAa 00402B9C
    _D6object15TypeInfo_Struct5tsizeFZk 00402D3C
    _D6object15TypeInfo_Struct6equalsFPvPvZi 00402C94
    _D6object15TypeInfo_Struct7compareFPvPvZi 00402CE8
    _D6object15TypeInfo_Struct7getHashFPvZk 00402C58
    _D6object15TypeInfo_Struct8toStringFZAa 00402C50
    _D6object16TypeInfo_Pointer4swapFPvPvZv 004026E0
    _D6object16TypeInfo_Pointer5tsizeFZk 004026D8
    _D6object16TypeInfo_Pointer6equalsFPvPvZi 004026AC
    _D6object16TypeInfo_Pointer7compareFPvPvZi 004026C8
    _D6object16TypeInfo_Pointer7getHashFPvZk 004026A0
    _D6object16TypeInfo_Pointer8toStringFZAa 0040267C
    _D6object16TypeInfo_Typedef4swapFPvPvZv 00402664
    _D6object16TypeInfo_Typedef5tsizeFZk 00402658
    _D6object16TypeInfo_Typedef6equalsFPvPvZi 00402628
    _D6object16TypeInfo_Typedef7compareFPvPvZi 00402640
    _D6object16TypeInfo_Typedef7getHashFPvZk 00402618
    _D6object16TypeInfo_Typedef8toStringFZAa 00402610
    _D6object17TypeInfo_Delegate5tsizeFZk 00402B94
    _D6object17TypeInfo_Delegate8toStringFZAa 00402B70
    _D6object17TypeInfo_Function5tsizeFZk 00402B6C
    _D6object17TypeInfo_Function8toStringFZAa 00402B48
    _D6object20TypeInfo_StaticArray4swapFPvPvZv 00402A40
    _D6object20TypeInfo_StaticArray5tsizeFZk 00402A2C
    _D6object20TypeInfo_StaticArray6equalsFPvPvZi 00402960
    _D6object20TypeInfo_StaticArray7compareFPvPvZi 004029BC
    _D6object20TypeInfo_StaticArray7getHashFPvZk 00402924
    _D6object20TypeInfo_StaticArray8toStringFZAa 004028E4
    _D6object25TypeInfo_AssociativeArray5tsizeFZk 00402B40
    _D6object25TypeInfo_AssociativeArray8toStringFZAa 00402AFC

Where did all that come from? I suspect you're looking at this concern with a microscope only, while I think the bigger picture is perhaps more important.
 Examining the .map file produced shows that only these functions are 
 pulled in from std.string:
 
 0002:00002364       _D3std6string7iswhiteFwZi  00404364
 0002:000023A4       _D3std6string3cmpFAaAaZi   004043A4
 0002:000023E8       _D3std6string4findFAawZi   004043E8
 0002:00002450       _D3std6string8toStringFkZAa 00404450
 0002:000024CC       _D3std6string9inPatternFwAaZi 004044CC
 0002:00002520       _D3std6string6columnFAaiZk 00404520
 
 I do not know offhand why a couple of those are pulled in, but I suggest 
 that obj2asm and the generated .map files are invaluable at determining 
 what pulls in what. Sometimes the results are surprising.

Yes they are surprising ~ partly because there's more than one might imagine:

    0003:00000D74       _D3std6string10whitespaceG6a 00411D74
    0003:00000D7C       _D3std6string2LSw          00411D7C
    0003:00000D80       _D3std6string2PSw          00411D80
    0002:00002464       _D3std6string3cmpFAaAaZi   00404464
    0002:000024A8       _D3std6string4findFAawZi   004044A8
    0002:000025E4       _D3std6string6columnFAaiZi 004045E4
    0003:00000CF4       _D3std6string6digitsG10a   00411CF4
    0002:00002424       _D3std6string7iswhiteFwZi  00404424
    0003:00000D40       _D3std6string7lettersG52a  00411D40
    0003:00000D84       _D3std6string7newlineG2a   00411D84
    0002:00002514       _D3std6string8toStringFkZAa 00404514
    0003:00000CE4       _D3std6string9hexdigitsG16a 00411CE4
    0002:00002590       _D3std6string9inPatternFwAaZi 00404590
    0003:00000D08       _D3std6string9lowercaseG26a 00411D08
    0003:00000D00       _D3std6string9octdigitsG8a 00411D00
    0003:00000D24       _D3std6string9uppercaseG26a 00411D24

Please see the extensive list at the end for some further surprises
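
The mangled names in these listings are easier to read once the dotted name is recovered. The Python sketch below handles only the simple form seen in this thread: the _D prefix followed by length-prefixed identifiers, with whatever follows (the type encoding) ignored. The function name qualified_name is hypothetical, and this is not a full demangler.

```python
def qualified_name(mangled):
    """Extract the dotted name from a symbol such as _D3std6string3cmpFAaAaZi."""
    if not mangled.startswith("_D"):
        return None
    pos = 2
    parts = []
    # Each component is a decimal length followed by that many characters.
    while pos < len(mangled) and mangled[pos].isdigit():
        end = pos
        while end < len(mangled) and mangled[end].isdigit():
            end += 1
        length = int(mangled[pos:end])
        parts.append(mangled[end:end + length])
        pos = end + length
    # Anything left after the last component encodes the type and is ignored.
    return ".".join(parts) if parts else None

for symbol in ("_D3std6string10whitespaceG6a",
               "_D3std6string8toStringFkZAa",
               "_D6object20TypeInfo_StaticArray8toStringFZAa"):
    print(qualified_name(symbol))
```

Read this way, the data entries (whitespace, digits, letters and so on) are easy to tell apart from the functions (cmp, find, toString).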
 
 That's tight-coupling within very low-level language support. Uncool.
 Wouldn't you at least agree that specific instance is hardly an 
 absolute necessity?

std.string.toString is 124 bytes long, and doesn't pull anything else in (except see below). Writing another version of it in typeinfo isn't going to reduce the size of the program *at all*, in fact, it will likely increase it because now there'll be two versions of it.

You're focusing purely on the fact that adding an itoa() would increase the executable size. At the same time, completely ignoring the explicit mention of using the C runtime function instead (which is usually linked also), and the clear fact that importing std.string brings along with it the following:

    0003:00000D74       _D3std6string10whitespaceG6a 00411D74
    0003:00000D7C       _D3std6string2LSw          00411D7C
    0003:00000D80       _D3std6string2PSw          00411D80
    0002:00002464       _D3std6string3cmpFAaAaZi   00404464
    0002:000024A8       _D3std6string4findFAawZi   004044A8
    0002:000025E4       _D3std6string6columnFAaiZi 004045E4
    0003:00000CF4       _D3std6string6digitsG10a   00411CF4
    0002:00002424       _D3std6string7iswhiteFwZi  00404424
    0003:00000D40       _D3std6string7lettersG52a  00411D40
    0003:00000D84       _D3std6string7newlineG2a   00411D84
    0002:00002514       _D3std6string8toStringFkZAa 00404514
    0003:00000CE4       _D3std6string9hexdigitsG16a 00411CE4
    0002:00002590       _D3std6string9inPatternFwAaZi 00404590
    0003:00000D08       _D3std6string9lowercaseG26a 00411D08
    0003:00000D00       _D3std6string9octdigitsG8a 00411D00
    0003:00000D24       _D3std6string9uppercaseG26a 00411D24

Along with a number of dependencies.

And, apparently, you think it's perhaps responsible for bringing in the floating point support too.

The point being made is that of coupling between low and high levels ~ illustrated quite well by the above. I think this kind of thing is worth addressing, for a number of reasons.
 Although there is a lot of code in std.string, unreferenced free 
 functions in it should be discarded by the linker. A check of the 
 generated .map file should verify this - it is certainly supposed to 
 work that way. One problem Java has is that there are no free 
 functions, so referencing one function wound up pulling in every part 
 of the class the function resided in.

This is exactly the case with printf <g>. It winds up linking the world

No, it does not link in the world, floating point, or graphics libraries. It links in C's standard I/O (which usually gets linked in anyway), and about 4000 bytes of code. That's somewhat less than a megabyte <g>.

Who says the standard C IO should /always/ get linked in? D currently /enforces/ that, whereas it's not a requirement at all for valid operation. What's more, the enforcement is simply because Object.d has a print() method, which uses printf() like so:

    print ()
    {
        printf ("%.*s", toString());
    }

Why not just use ConsoleWrite(), or anything but printf()? There's a number of valid (and decoupled) alternatives to this approach. Why can't they be used instead? Your answer is "well, it doesn't make any difference anyway". That's entirely silly. Yes, the C-library console-startup wrapper causes the IO system to be linked also. But that can be replaced, since it's not directly part of the D runtime support.

To make things worse, Object.print() is perhaps the least used method in all of D! Thus, it tends to place this whole issue on the verge of ridiculous. Why not just remove the dependency instead?

One of the tenets of good library design is to build in layers, and then ensure there's no dependencies between a lower layer and any of the higher ones. Here's two cases of just such a dependency ~ they are almost trivial to fix, yet nothing happens ... why?

Thus, I really don't wish to argue with you on this one, Walter. If you simply refuse to accept that any system might prefer to avoid the default IO platform, for whatever valid reason it may have, then there's little point in even discussing the nature of tight-coupling.

One can hack the internal dependencies in an attempt to rectify the concerns; yet why? Better to leave all of /internal and friends as it stands to avoid branching the code. I really thought you'd understand the value in making that part platform (library) agnostic. And for such a minor cost, too.
 because it's a general purpose utility function that does all kinds of 
 conversion and all kinds of IO. Printf() is an all or nothing design ~ 
 you can't selectively link pieces of it.

 That's usually not a problem. However, you've chosen to bind it to 
 low-level language support (in the root Object). That choice causes 
 tight coupling between the language low-level support and a high-level 
 library function ~ one which ought to be optional.

 Wouldn't you at least agree this specific case is not necessary for 
 the D language to function correctly? That there are other perfectly 
 workable alternatives?

It's just not a big deal. Try the following:

    extern (C) int printf(char* f, ...) { return 0; }

    void main()
    {
        static byte[10] b;
    }

and compare the difference in exe file sizes, with and without the printf stub.

Funny :-D

It makes little difference because all the other dependency code is linked in from other places, Walter. It can be fixed one step at a time. What you're saying here is the following. Take a shotgun, and pepper the boat you're standing in with holes. Now, see? When you plug up this one hole, it really doesn't stop the water coming in? See? Hardly any difference!

Needless to say, I think you're being somewhat disingenuous. Or, at least trying to obfuscate a simple case of unnecessary low-high coupling in D. But let's move on ...
 printf doesn't pull in the floating point library (I went to a lot of 
 effort to make that so!). It does pull in the C IO library, which is 
 very hard to not pull in (there always seems to be something 
 referencing it). It shouldn't pull in the C wide character stuff. D's 
 IO (writefln) will pull in C's IO anyway, so the only thing extra is 
 the integer version of the specific printf code (about 4K).

How can it convert %f, %g and so on if it doesn't use FP support at all?

It's magic! Naw, it's just that if you actually use floating point in a program, the compiler emits a special extern reference (to __fltused) which pulls in the floating point IO formatting code. Otherwise, it defaults to just a stub. Try it.

    void main()
    {
        throw new Exception ("");
    }

I'm quite familiar with __fltused. It's clearly used by the little example program above, given that this stuff is linked in:

    0003:00007150       ___wpscanfloat             00418150
    0003:00007154       ___wpfloatfmt              00418154
    0003:00007158       ___pscanfloat              00418158
    0003:0000715C       ___pfloatfmt               0041815C
    0003:0000453C       __8087                     0041553C
    0003:0000453C       __80x87                    0041553C
    0002:0000E560       __8087_init                00410560
    0002:0000E9B0       __FCOMPP                   004109B0
    0002:0000E9CE       __FTEST0                   004109CE
    0002:0000E9EE       __FTEST                    004109EE
    0002:0000EA06       __DTST87                   00410A06
    0002:0000EA0A       __87TOPSW                  00410A0A
    0002:0000EA0F       __DBLTO87                  00410A0F
    0002:0000EA1A       __DBLINT87                 00410A1A
    0002:0000EA3B       __DBLLNG87                 00410A3B
    0002:0000EA57       __FLTTO87                  00410A57
    0002:0000EA5E       __status87                 00410A5E
    0002:0000EA63       __clear87                  00410A63
    0002:0000EA6C       __control87                00410A6C
    0002:0000EA93       __fpreset                  00410A93

That looks rather like floating point support; Where in the program is floating point actually used? I don't get it.
 Either way, it's not currently possible to build a D program without a 
 swathe of FP support code,
 printf,
 the entire C IO package,
 wide-char support,
 and a whole lot more besides. I'd assumed the linked FP support was 
 for printf, but perhaps it's for std.string instead? I've posted the 
 linker maps (in the past) to illustrate exactly this.

My point is that assuming what is pulled in by what is about as reliable as guessing where the bottlenecks in one's code are. You can't tell bottlenecks without a profiler, and you've got both hands tied behind your back trying to figure out who pulls in what if you're not using .map files, grep, and obj2asm.
 Are you not at all interested in improving this aspect of the language 
 usage?

Sure, but based on accurate information.

*Cough*
 Pulling printf won't do 
 anything. Try it if you don't agree.

That's your claim, not mine :) See the analogy above.
 
 For example, which modules pull in the floating point formatting code? 
 It isn't printf. We can find out by doing a grep for __fltused:
 
 boxer.obj:      __fltused
 complex.obj:    __fltused
 conv.obj:       __fltused
 date.obj:       __fltused
 demangle.obj:   __fltused
 format.obj:     __fltused
 gamma.obj:      __fltused
 math.obj:       __fltused
 math2.obj:      __fltused
 outbuffer.obj:  __fltused
 stream.obj:     __fltused
 string.obj:     __fltused
 ti_Acdouble.obj:        __fltused
 ti_Acfloat.obj: __fltused
 ti_Acreal.obj:  __fltused
 ti_Adouble.obj: __fltused
 ti_Afloat.obj:  __fltused
 ti_Areal.obj:   __fltused
 ti_cdouble.obj: __fltused
 ti_cfloat.obj:  __fltused
 ti_creal.obj:   __fltused
 ti_double.obj:  __fltused
 ti_float.obj:   __fltused
 ti_real.obj:    __fltused
 
 Some examination of the .map file shows that the only one of these 
 pulled in by default is std.string. So I think a reasonable approach 
 would be to look at removing the floating point from std.string 

So importing std.string is causing FP support to be imported? No surprises there; something is certainly bringing it in. Along with the "world", as one can see from the attached .map of the example program:

    void main()
    {
        throw new Exception ("");
    }

Keep in mind it's not the number of entries, but the number of superfluous entries that are of concern (I removed all Win32 imports in an attempt to make the list more manageable).

Also, please keep in mind that the concern is one of unnecessary coupling from the low-level runtime support, into the high-level library functions. This will often result in a cascade of dependencies, much like what we see below. Not only does it cause code-bloat, but it makes the language-support dependent upon a specific high-level library. These dependencies are /very/ easy to remedy, with an appropriate reduction in code size as a bonus.

The map file is here, since it's too big to attach:

http://www.dsource.org/projects/mango/browser/trunk/doc/map.txt?rev=818&format=raw
Apr 06 2006
parent reply Walter Bright <newshound digitalmars.com> writes:
kris wrote:
 It would help if you'd note under what circumstances the TypeInfo /is/ 
 included, then. For example, this program:
 
 void main()
 {
         throw new Exception ("");
 }
 
 
 causes all kinds of TypeInfo to be linked:

In general, an easy way to see why a particular module is being pulled in is to temporarily remove it from the library (lib phobos -foo;), link, and see where the undefined reference is coming from. I'd start by running obj2asm on the module you just compiled, and see what extern directives it puts out.

I'm not trying to be a jerk by telling you this procedure rather than just giving the answer, but

1) I don't know the answer offhand and I'd have to follow the same procedure to figure it out and

2) I hope that by giving you the tools and methodology for figuring it out, this kind of question won't repeatedly come up (and yes, it has come up repeatedly).

3) I hope that anyone else with these kinds of questions will get familiar with how to use these tools, too.

It's a lot better than guessing and assuming. Tools like lib, obj2asm, and grep are incredibly useful.
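
The first step of that procedure, listing the extern directives obj2asm puts out, is easy to automate. The Python sketch below assumes each directive looks like "extrn name", one symbol per line, which matches the listing quoted earlier in this thread; the helper name external_references is hypothetical, and other obj2asm output may carry extra decoration that this does not handle.

```python
def external_references(listing):
    """Collect symbol names from 'extrn' directives in an obj2asm-style listing."""
    names = []
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].lower() == "extrn":
            names.append(fields[1])
    return names

# Excerpt of the obj2asm output shown earlier in the thread.
listing = """\
FLAT    group
includelib phobos.lib
        extrn   _main
        extrn   __acrtused_con
        extrn   __Dmain
__Dmain COMDAT flags=x0 attr=x0 align=x0
"""

print(external_references(listing))
```

Anything that shows up here is something the linker must resolve, which makes it a reasonable starting point for working out which library modules get pulled in.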
 Where did all that come from? I suspect you're looking at this concern 
 with a microscope only, while I think the bigger picture is perhaps more 
 important.

I don't think there is a bigger picture. There's only a case by case analysis of what is needed and what isn't.
 Yes they are surprising ~ partly because there's more than one might 
 imagine:
 
  0003:00000D74       _D3std6string10whitespaceG6a 00411D74
  0003:00000D7C       _D3std6string2LSw          00411D7C
  0003:00000D80       _D3std6string2PSw          00411D80
  0002:00002464       _D3std6string3cmpFAaAaZi   00404464
  0002:000024A8       _D3std6string4findFAawZi   004044A8
  0002:000025E4       _D3std6string6columnFAaiZi 004045E4
  0003:00000CF4       _D3std6string6digitsG10a   00411CF4
  0002:00002424       _D3std6string7iswhiteFwZi  00404424
  0003:00000D40       _D3std6string7lettersG52a  00411D40
  0003:00000D84       _D3std6string7newlineG2a   00411D84
  0002:00002514       _D3std6string8toStringFkZAa 00404514
  0003:00000CE4       _D3std6string9hexdigitsG16a 00411CE4
  0002:00002590       _D3std6string9inPatternFwAaZi 00404590
  0003:00000D08       _D3std6string9lowercaseG26a 00411D08
  0003:00000D00       _D3std6string9octdigitsG8a 00411D00
  0003:00000D24       _D3std6string9uppercaseG26a 00411D24
 
 Please see the extensive list at the end for some further surprises

All those other names are the static data. Things like:

    const dchar LS = '\u2028';      /// UTF line separator
    const dchar PS = '\u2029';      /// UTF paragraph separator

I submit that they aren't significant. The significant thing is the entire std.string.obj is not linked in.
 You're focusing purely on the fact that adding an itoa() would increase 
 the executable size.

Yes.
 At the same time, completely ignoring the explicit
 mention of using the C runtime function instead (which is usually linked 
 also), and the clear fact that importing std.string brings along with it 
 the following:

And the only possible problem I see there is worrying about executable size.
 
  0003:00000D74       _D3std6string10whitespaceG6a 00411D74
  0003:00000D7C       _D3std6string2LSw          00411D7C
  0003:00000D80       _D3std6string2PSw          00411D80
  0002:00002464       _D3std6string3cmpFAaAaZi   00404464
  0002:000024A8       _D3std6string4findFAawZi   004044A8
  0002:000025E4       _D3std6string6columnFAaiZi 004045E4
  0003:00000CF4       _D3std6string6digitsG10a   00411CF4
  0002:00002424       _D3std6string7iswhiteFwZi  00404424
  0003:00000D40       _D3std6string7lettersG52a  00411D40
  0003:00000D84       _D3std6string7newlineG2a   00411D84
  0002:00002514       _D3std6string8toStringFkZAa 00404514
  0003:00000CE4       _D3std6string9hexdigitsG16a 00411CE4
  0002:00002590       _D3std6string9inPatternFwAaZi 00404590
  0003:00000D08       _D3std6string9lowercaseG26a 00411D08
  0003:00000D00       _D3std6string9octdigitsG8a 00411D00
  0003:00000D24       _D3std6string9uppercaseG26a 00411D24
 
 
 Along with a number of dependencies.

Take a look at those functions and data - what dependencies?
 And, apparently, you think it's perhaps responsible for bringing in the 
 floating point support too.

That is a problem, and I can fix that. No big deal - it wasn't printf bringing in the floating point - and a reengineering or rewrite of Phobos is not necessary. I don't even need to change any library source code.
 The point being made is that of coupling between low and high levels ~ 
 illustrated quite well by the above.
 I think this kind of thing is worth addressing, for a number of reasons.

I think you're seeing an effect that is an issue, but are mistaken as to the cause of the problem.
 Who says the standard C IO should /always/ get linked in? D currently 
 /enforces/ that, whereas it's not a requirement at all for valid 
 operation.

There isn't that much to it, and it doesn't hurt anything.
 What's more, the enforcement is simply because Object.d has a 
 print() method, which uses printf() like so:
 
 print ()
 {
     printf ("%.*s", toString());
 }

Again, it isn't necessarily printf doing that. Try the code I posted in the last message that stubs out printf, which will *prevent* it from being linked in from the library. Compile/link it, and examine the .map file. (The stubbing out method is another technique for figuring out what pulls in what.)
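
Why a stub keeps the library's version out can be seen in a toy model of library resolution, sketched here in Python. Everything in it is an assumption made for illustration: the object names, the symbols, and the rule that a library member is pulled in only to satisfy a symbol that is still undefined. A real linker is considerably more involved.

```python
def link(program, library):
    """Return the sorted names of library members pulled in to resolve references.

    Both arguments map object name -> (defined symbols, referenced symbols).
    """
    defined = set()
    pending = []
    for defs, refs in program.values():
        defined |= set(defs)
        pending.extend(refs)

    pulled = set()
    while pending:
        symbol = pending.pop()
        if symbol in defined:
            continue  # already satisfied, e.g. by a stub in the program
        for name, (defs, refs) in library.items():
            if symbol in defs:
                pulled.add(name)
                defined |= set(defs)
                pending.extend(refs)
                break
        # symbols that no member defines are silently ignored in this toy model
    return sorted(pulled)

# Hypothetical library layout, for illustration only.
library = {
    "object.obj":  (["Object.print"], ["printf"]),
    "printf.obj":  (["printf"], ["fwrite"]),
    "stdio.obj":   (["fwrite"], []),
    "startup.obj": (["crt_startup"], ["fwrite"]),
}

plain = {"main.obj": (["main"], ["Object.print", "crt_startup"])}
stubbed = {"main.obj": (["main", "printf"], ["Object.print", "crt_startup"])}

print(link(plain, library))
print(link(stubbed, library))
```

In this made-up layout the stub removes printf.obj from the link, but stdio.obj still arrives through startup.obj, which mirrors the argument that printf alone is not what drags in the C IO code.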
 Why not just use ConsoleWrite(), or anything but printf()?

Because it's not portable (what should the Linux one look like?), and does not deliver the billed benefits. But the worst thing about calling ConsoleWrite() directly is that it does not play well with any other IO the user may have done or be in the process of doing. What will happen is that any object.print()'s will not be synchronized with the output from writef, printf, or any other of the stdout functions.
 There's a 
 number of valid (and decoupled) alternatives to this approach. Why can't 
 they be used instead? Your answer is "well, it doesn't make any 
 difference anyway". That's entirely silly. Yes, the C-library 
 console-startup wrapper causes the IO system to be linked also. But that 
 can be replaced, since it's not directly part of the D runtime support.

Why does the C library need replacing? I honestly don't get it.
 To make things worse, Object.print() is perhaps the least used method in 
 all of D! Thus, it tends to place this whole issue on the verge of 
 ridiculous.
 Why not just remove the dependency instead?

Because it doesn't buy anything to remove it. Try it and see (or even easier, try the source I posted with the stubbed out printf - that will absolutely, positively prevent printf from being linked in from the library, without needing to change or recompile object.d at all).
 One of the tenets of good library design is to build in layers, and then 
 ensure there's no dependencies between a lower layer and any of the 
 higher ones. Here's two cases of just such a dependency ~ they are 
 almost trivial to fix, yet nothing happens ... why?
 
 Thus, I really don't wish to argue with you on this one, Walter. If you 
 simply refuse to accept that any system might prefer to avoid the 
 default IO platform, for whatever valid reason it may have, then there's 
 little point in even discussing the nature of tight-coupling.

If you want to use a system that for some reason can't have C's IO subsystem, then just include the one liner:

    extern (C) int printf(char* f, ...) { return 0; }

somewhere in your code, and it's gone.
 One can hack the internal dependencies in an attempt to rectify the 
 concerns; yet why? Better to leave all of /internal and friends as it 
 stands to avoid branching the code. I really thought you'd understand 
 the value in making that part platform (library) agnostic. And for such 
 a minor cost, too.

You don't need to hack the internals to get rid of any vestige of printf. Just stub it out.
 Or, at 
 least trying to obfuscate a simple case of unnecessary low-high coupling 
 in D. But let's move on ...

I'm trying to point out that things aren't so simple.
 I'm quite familiar with __fltused.

Your questions about how printf avoided linking in %f support indicated otherwise.
 It's clearly used by the little 
 example program above, given that this stuff is linked in:

 That looks rather like floating point support; Where in the program is 
 floating point actually used? I don't get it.

I went over that in my last post, too.
 Pulling printf won't do anything. Try it if you don't agree.


You don't have to believe me, that's why I encourage you to try it and give you the tools and methodology to figure these things out.
 Keep in mind it's not the number of entries, but the number of 
 superfluous entries that are of concern (I removed all Win32 imports in 
 an attempt to make the list more manageable).

Until you've tracked down each and every one and understand where it is pulled in from and why it is there, there is no way to decide which ones are superfluous or not. There's an awful lot of startup and shutdown going on - stuff that is required for D (or the C runtime library, for that matter) to function. An awful lot is required for the exception handling support to work - that has to be in all programs. For the gc to start up and shut down gracefully. It goes on.
 Also, please keep in mind that the concern is one of unnecessary coupling 
 from the low-level runtime support, into the high-level library 
 functions. This will often result in a cascade of dependencies, much 
 like what we see below. Not only does it cause code-bloat, but it makes 
 the language-support dependent upon a specific high-level library. These 
 dependencies are /very/ easy to remedy, with an appropriate reduction in 
 code size as a bonus.

As we've discovered, pulling printf out of object.d isn't going to remedy anything. It just is not that simple.
Apr 06 2006
next sibling parent reply kris <foo bar.com> writes:
Walter Bright wrote:
 kris wrote:
 
 It would help if you'd note under what circumstances the TypeInfo /is/ 
 included, then. For example, this program:

 void main()
 {
         throw new Exception ("");
 }


 causes all kinds of TypeInfo to be linked:

In general, an easy way to see why a particular module is being pulled in is to temporarily remove it from the library (lib phobos -foo;), link, and see where the undefined reference is coming from. I'd start by running obj2asm on the module you just compiled, and see what extern directives it puts out.

I'm not trying to be a jerk by telling you this procedure rather than just giving the answer, but

1) I don't know the answer offhand and I'd have to follow the same procedure to figure it out and

2) I hope that by giving you the tools and methodology for figuring it out, this kind of question won't repeatedly come up (and yes, it has come up repeatedly).

3) I hope that anyone else with these kinds of questions will get familiar with how to use these tools, too.

It's a lot better than guessing and assuming. Tools like lib, obj2asm, and grep are incredibly useful.

Yes, they are useful. And I'm not trying to be a jerk by pointing out that the D runtime is missing some much needed TLC (to put it very nicely). Why do you think Ares exists anyway?
 Where did all that come from? I suspect you're looking at this concern 
 with a microscope only, while I think the bigger picture is perhaps 
 more important.

I don't think there is a bigger picture. There's only a case by case analysis of what is needed and what isn't.

I see.
 All those other names are the static data. Things like:
 
 const dchar LS = '\u2028';      /// UTF line separator
 const dchar PS = '\u2029';      /// UTF paragraph separator
 
 I submit that they aren't significant. 

Yes, you're right. Unless they're huge tables (such as the Unicode character map; oh wait; is that linked by default? :)
 You're focusing purely on the fact that adding an itoa() would 
 increase the executable size.

Yes.
 At the same time, completely ignoring the explicit
 mention of using the C runtime function instead (which is usually 
 linked also), and the clear fact that importing std.string brings 
 along with it the following:

And the only possible problem I see there is worrying about executable size.

Forgive me, but, there's a certain unattributed reputation for avoiding any and all important and/or salient points whenever it suits ~
 And, apparently, you think it's perhaps responsible for bringing in 
 the floating point support too.

That is a problem, and I can fix that. No big deal - it wasn't printf bringing in the floating point - and a reengineering or rewrite of Phobos is not necessary. I don't even need to change any library source code.

What's all this about necessary re-engineering and rewriting of Phobos? Where the heck did that come from?
 The point being made is that of coupling between low and high levels ~ 
 illustrated quite well by the above.
 I think this kind of thing is worth addressing, for a number of reasons.

I think you're seeing an effect that is an issue, but are mistaken as to the cause of the problem.

I see low-level code being dependent upon high-level. I also see a large brick wall with an entirely unresponsive mason sitting atop.
 Again, it isn't necessarily printf doing that. Try the code I posted in 
 the last message that stubs out printf, which will *prevent* it from 
 being linked in from the library. Compile/link it, and examine the .map 
 file.

Sigh. I did that last year, as you well know. After all, that's the reason you sent me the source code.
 Because it's not portable (what should the Linux one look like?), and 
 does not deliver the billed benefits. But the worst thing about calling 
 ConsoleWrite() directly is that it does not play well with any other IO 
 the user may have done or be in the process of doing. What will happen 
 is that any object.print()'s will not be synchronized with the output 
 from writef, printf, or any other of the stdout functions.

Fair point about the synchronization aspect. I'm glad you brought that up. This is when library designers do one of two things: insert an indirect hook (like Sean has done in many places), or remove the functionality from that layer (again, like Sean has done). The worst possible thing to do is just leave it in there. It becomes part of the "legacy" and thus is impossible to remove cleanly at some future date; and it tends to negate reasonable attempts to clean up other similar concerns.

Heck, object.print() should probably be entirely removed anyway; the functionality provided is of dubious value, other than for some truly lazy and limited debugging; it caused enough concern that there was a general consensus to remove it, *or at least make it a null-op*, two years ago; and it's not even applied to any extent. The toString() method provides similar capability without the coupling issues. Again, it could simply become a null-op, or be removed. There's no reasonable balanced need for it to exist as it does today.

But then, why ever bother cleaning /anything/ up, when it won't make any difference anyway?
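The "indirect hook" option mentioned above can be sketched roughly as follows. This is only an illustration in C, not code from Phobos or Ares; every name in it (rt_print, rt_set_print_hook, capture_sink) is hypothetical. The low-level layer calls through a function pointer that defaults to a no-op, so it links no I/O code itself, and a higher-level library may install a real sink.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical hook type: receives a text buffer and its length. */
typedef void (*rt_print_fn)(const char *text, size_t len);

/* Default sink does nothing, so the low-level layer needs no I/O code. */
static void rt_print_null(const char *text, size_t len)
{
    (void)text;
    (void)len;
}

static rt_print_fn rt_print_hook = rt_print_null;

/* A higher-level library calls this to route output wherever it likes.
   Passing NULL restores the no-op default. Returns the previous hook. */
rt_print_fn rt_set_print_hook(rt_print_fn fn)
{
    rt_print_fn old = rt_print_hook;
    rt_print_hook = fn ? fn : rt_print_null;
    return old;
}

/* What a low-level "print" would call instead of printf. */
void rt_print(const char *text)
{
    rt_print_hook(text, strlen(text));
}

/* Example sink a higher layer might install: it keeps the last message. */
static char   captured[128];
static size_t captured_len;

static void capture_sink(const char *text, size_t len)
{
    if (len > sizeof captured)
        len = sizeof captured;      /* truncate rather than overflow */
    memcpy(captured, text, len);
    captured_len = len;
}
```

With this shape, a program that never installs a sink pays only for a pointer and a no-op function; one that wants stdio-synchronized output would install a sink that writes to stdout.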
 Why does the C library need replacing? I honestly don't get it.

Who said it /needed/ replacing? I'm simply talking about applying a small sprinkling of decoupling dust
 Why not just remove the dependency instead?

Because it doesn't buy anything to remove it. Try it and see (or even easier, try the source I posted with the stubbed out printf - that will absolutely, positively prevent printf from being linked in from the library, without needing to change or recompile object.d at all).

You snipped what I thought a fair and appropriate analogy, so I'll repeat it: "Take a shotgun, and pepper the boat you're standing in with holes. Now, see? When you plug up this one hole, it really doesn't stop the water coming in? See? Hardly any difference" There /is/ a bigger picture here. One has to first see it, and then approach a resolution in small steps. Unfortunately the "it doesn't buy anything to remove it" outlook is entirely non-conducive to stepwise refinement. Nothing will ever get improved with that attitude.
 If you want to use a system that for some reason can't have C's IO 
 subsystem, then just include the one liner:
 
 extern (C) int printf(char* f, ...) { return 0; }
 
 somewhere in your code, and it's gone.

Forgive me, but that's just printf(). If I have, as you say, a system that can't have C's IO subsystem, the above will hardly help me at all. I suspect you're well aware the C IO subsystem consists of significantly more than just printf? Say, perhaps 50-ish functions? You're going to suggest I stub them all out just because Object.print() calls printf()?
 I'm trying to point out that things aren't so simple.

And I'm trying to point out just how simple it is to /start/ the process of eliminating questionable couplings that lead to Derek's sad example.
 Until you've tracked down each and every one and understand where it is 
 pulled in from and why it is there, there is no way to decide which ones 
 are superfluous or not.

There's rarely a need to do any of that when one is careful to decouple responsibilities.
 There's an awful lot of startup and shutdown going on - stuff that is 
 required for D (or the C runtime library, for that matter) to function. 
 An awful lot is required for the exception handling support to work - 
 that has to be in all programs. For the gc to start up and shut down 
 gracefully. It goes on.

Sure; that's a given. Yet it's also quite clear the D runtime links the kitchen-sink too. Given Derek's example:

void main() {}

The .map for that is a fine specimen of "unexpected" coupling. It does seem as though much of that is actually in the C library, but then you're arguing most fervently against doing anything to fix any such things.
 Also, please keep in mind that the concern is one of unnecessary 
 coupling from the low-level runtime support, into the high-level 
 library functions. This will often result in a cascade of 
 dependencies, much like what we see below. Not only does it cause 
 code-bloat, but it makes the language-support dependent upon a 
 specific high-level library. These dependencies are /very/ easy to 
 remedy, with an appropriate reduction in code size as a bonus.

As we've discovered, pulling printf out of object.d isn't going to remedy anything.

Sigh. I'm now at a complete loss at how to respond. Instead, I'll remind you what prompted this exercise in futility:

~~~~~~~~~~~~~~~~~~~~

Walter Bright wrote:
 Phobos doesn't require floating point support from the processor
 unless one actually uses floating point in the application code.

That turned out to be somewhat less than truthful.
 I also really don't understand why anyone using D would require
 not using Phobos. What's the problem?

With much respect, the 'problem' is perhaps that you don't see any?
Apr 06 2006
parent reply Walter Bright <newshound digitalmars.com> writes:
kris wrote:
 Yes, they are useful. And I'm not trying to be a jerk by pointing out 
 that the D runtime is missing some much needed TLC (to put it very 
 nicely). Why do you think Ares exists anyway?

I'm trying to understand.
 All those other names are the static data. Things like:

 const dchar LS = '\u2028';      /// UTF line separator
 const dchar PS = '\u2029';      /// UTF paragraph separator

 I submit that they aren't significant. 

character map; oh wait; is that linked by default? :)

If you're talking about the one in std.uni, it took me a minute to figure that one out. The reason it's pulled in is because of an error in the compiler, where lambda functions should be generated as comdats, but aren't. It is not a problem with the library design.
 What's all this about necessary re-engineering and rewriting of Phobos? 
 Where the heck did that come from?

You asked what the motivation for Ares was.
 Why does the C library need replacing? I honestly don't get it.

small sprinkling of decoupling dust

You write: "If I have, as you say, a system that can't have C's IO subsystem, the above will hardly help me at all."
 Forgive me, but that's just printf(). If I have, as you say, a system 
 that can't have C's IO subsystem, the above will hardly help me at all. 
 I suspect you're well aware the C IO subsystem consists of significantly 
 more than just printf? Say, perhaps 50-ish functions? You're going to 
 suggest I stub them all out just because Object.print() calls printf()?

You only need to stub out the functions that pull in the rest. If it's only printf, then that's all you need to stub out. If you implement your own printf, then it will 'hook' any other calls to printf, and you can use it to call your own stuff. It's a common technique - heck, I've done it to 'hook' printf to write to a window instead of stdout.
 Given Derek's example:
 
 void main() {}
 
 The .map for that is a fine specimen of "unexpected" coupling. It does 
 seem as though much of that is actually in the C library, but then 
 you're arguing most fervently against doing anything to fix any such 
 things.

It is mostly the C runtime library. And I've lived with this for a very long time, don't you think I've looked at it to try to reduce the minimum required to be pulled in?
 Walter Bright wrote:
  > Phobos doesn't require floating point support from the processor
  > unless one actually uses floating point in the application code.
 
 That turned out to be somewhat less than truthful.

It will be in the next update. I had assumed it was working properly when it wasn't - however, printf had nothing to do with the problem. I don't know why you call it "futility" - the problem will get fixed. The solutions you proposed wouldn't have fixed it.
  > I also really don't understand why anyone using D would require
  > not using Phobos. What's the problem?
 
 With much respect, the 'problem' is perhaps that you don't see any?

Here's the issue I have - postings that adamantly offer solutions without identifying the actual problem. Here are some examples:

Solution: remove printf
Alleged problem: printf pulls in floating point formatting code
Actual problem: std.string pulls in floating point formatting code due to reference to __fltused. printf does not pull in floating point formatting code.
Correct solution: fix compiler to not generate __fltused references in library code

Solution: implement separate itoa() for typeinfo
Alleged problem: calling one function in std.string pulls in everything in std.string
Actual problem: only a small portion of std.string is actually linked in because the free functions are implemented as COMDATs, but due to an error in the compiler, lambda functions are not written as COMDATs. This causes a reference to std.uni to still be pulled in, pulling in a large table in std.uni
Correct solution: fix compiler to generate COMDATs for lambda functions.

I'm glad these two problems came to light, and I'm happy to fix them. It's why I ask "What's the problem?" rather than just applying solutions without understanding what problem the solution is aimed at. In the two cases above, the real problems were found only after I kept asking seemingly stupid questions about what problem you were trying to solve by removing printf and the call to std.string.toString.

I know you're still bothered by the issue of C's IO being pulled in. I have another utility for you - \dm\bin\libunres. Libunres will identify any symbols unresolved by a library.
Running libunres on phobos.lib gives: Unresolved externals: ??2 YAPAXI Z ??2 YAPAXIPAX Z ??3 YAXPAX Z ?__stl_throw_length_error std YAXPBD Z ?__stl_throw_out_of_range std YAXPBD Z _ExpandEnvironmentStringsA 12 _FreeLibrary 4 _GetFileType 4 _GetProcAddress 8 _GetVersion 0 _IID_IUnknown _LoadLibraryA 4 _QueryPerformanceCounter 4 _RegCloseKey 4 _RegCreateKeyExA 36 _RegDeleteKeyA 8 _RegDeleteValueA 8 _RegEnumKeyExA 32 _RegEnumValueA 32 _RegFlushKey 4 _RegOpenKeyA 12 _RegOpenKeyExA 20 _RegQueryInfoKeyA 48 _RegQueryValueExA 24 _RegSetValueExA 24 _WSACleanup 0 _WSAGetLastError 0 _WSAStartup 8 __Ccmp __Dmain __LCMP __LDIV __U64_LDBL __ULDIV ___alloca ___fp_lock ___fp_unlock ___pfloatfmt __assert __beginthreadex __end __except_list __fltused __fputc_nlock __fputwc_nlock __global_unwind __imp__CloseHandle 4 __imp__CopyFileA 12 __imp__CopyFileW 12 __imp__CreateDirectoryA 8 __imp__CreateDirectoryW 8 __imp__CreateFileA 28 __imp__CreateFileMappingA 24 __imp__CreateFileW 28 __imp__CreateWindowExA 48 __imp__DeleteCriticalSection 4 __imp__DeleteFileA 4 __imp__DeleteFileW 4 __imp__DuplicateHandle 28 __imp__EnterCriticalSection 4 __imp__FileTimeToSystemTime 8 __imp__FindClose 4 __imp__FindFirstFileA 8 __imp__FindFirstFileW 8 __imp__FindNextFileA 8 __imp__FindNextFileW 8 __imp__FlushViewOfFile 8 __imp__FormatMessageA 28 __imp__GetCurrentDirectoryA 8 __imp__GetCurrentDirectoryW 8 __imp__GetCurrentProcess 0 __imp__GetCurrentThread 0 __imp__GetCurrentThreadId 0 __imp__GetFileAttributesA 4 __imp__GetFileAttributesW 4 __imp__GetFileSize 8 __imp__GetFullPathNameA 16 __imp__GetLastError 0 __imp__GetModuleFileNameA 12 __imp__GetProcessTimes 20 __imp__GetSystemInfo 4 __imp__GetSystemTime 4 __imp__GetThreadContext 8 __imp__GetThreadTimes 20 __imp__GetTickCount 0 __imp__GetTimeZoneInformation 4 __imp__GetVersionExA 4 __imp__InitializeCriticalSection 4 __imp__InterlockedDecrement 4 __imp__InterlockedExchange 8 __imp__InterlockedIncrement 4 __imp__LeaveCriticalSection 4 __imp__LocalFree 4 
__imp__MapViewOfFileEx 24 __imp__MoveFileA 8 __imp__MoveFileW 8 __imp__MultiByteToWideChar 24 __imp__QueryPerformanceCounter 4 __imp__QueryPerformanceFrequency 4 __imp__RaiseException 16 __imp__ReadFile 20 __imp__RemoveDirectoryA 4 __imp__RemoveDirectoryW 4 __imp__ResumeThread 4 __imp__SetCurrentDirectoryA 4 __imp__SetCurrentDirectoryW 4 __imp__SetFilePointer 16 __imp__SetThreadPriority 8 __imp__Sleep 4 __imp__SuspendThread 4 __imp__UnmapViewOfFile 4 __imp__VirtualAlloc 16 __imp__VirtualFree 12 __imp__WaitForSingleObject 8 __imp__WideCharToMultiByte 32 __imp__WriteFile 20 __imp__lstrcatA 8 __imp__lstrcmpA 8 __imp__lstrcpyA 8 __imp__lstrlenA 4 __iob __local_except_handler __snprintf __vsnprintf __xi_a _accept 12 _acosl _asinl _atan2l _atanl _atoi _bind 12 _calloc _cbrtl _ceill _closesocket 4 _connect 12 _coshl _erfcl _erfl _errno _execv _execve _execvp _execvpe _exit _exp2l _expl _expm1l _fclose _fdopen _feof _fflush _fgetc _floorl _fopen _fprintf _fputc _fread _free _fseek _ftell _fwide _fwrite _gethostbyaddr 12 _gethostbyname 4 _gethostname 8 _getpeername 12 _getprotobyname 4 _getprotobynumber 4 _getservbyname 8 _getservbyport 8 _getsockname 12 _getsockopt 20 _ilogbl _inet_addr 4 _inet_ntoa 4 _ioctlsocket 12 _lgammal _listen 8 _log10l _log1pl _log2l _logbl _logl _malloc _memchr _memcmp _memcpy _memicmp _memmove _memset _modfl _nanl _nearbyintl _powl _printf _realloc _recv 16 _recvfrom 24 _remainderl _roundl _select 20 _send 16 _sendto 24 _setsockopt 20 _shutdown 8 _sinhl _socket 12 _spawnvp _sprintf _strcat _strcmp _strcpy _strerror _strlen _strncpy _strrchr _strtod _strtof _strtold _system _tanhl _tgammal _toupper _truncl _ungetc _vsnprintf _wcscmp _wcslen And that is the *entirety* of what phobos.lib needs for everything in phobos. A big chunk of it is Windows API imports. 
A quick look through it shows the following related to C I/O:

_printf
___fp_lock
___fp_unlock
__fputc_nlock
__fputwc_nlock
__iob
_exit
_fclose
_fdopen
_feof
_fflush
_fgetc
_fopen
_fprintf
_fputc
_fread
_free
_fseek
_ftell
_fwide
_fwrite
_ungetc

Applying grep tells us where these are used: modules like std.cstream, which is "C" streams, no surprise there. gzio.c, a C function that's part of zlib, no surprise there, either. trace's file logging function, no big deal, that wouldn't get shipped with a released application. std.stdio, no surprise there either, it needs to sync with C's stdio.

There isn't anywhere near 50 routines, and if you don't use cstream, zlib, trace, or writef, it's hard to see any difficulty at all. In fact, I find it hard to find any other uses of stdio other than printf (perhaps I overlooked some?). So why can't 'hooking' printf, as outlined above, work?

BTW, internal\dmain2.d also uses printf to print out an error message relating to any uncaught exceptions.
Apr 07 2006
next sibling parent reply kris <foo bar.com> writes:
Walter Bright wrote:
 kris wrote:
 Why do you think Ares exists anyway?


 I'm trying to understand.

Ah. That makes some sense now ~ so, you're saying you don't see any need for a better library? That you're trying to understand what such a need might be?
 Here's the issue I have - postings that adamantly offer solutions 
 without identifying the actual problem. Here are some examples:
 
 Solution: remove printf
 Alleged problem: printf pulls in floating point formatting code
 Actual problem: std.string pulls in floating point formatting code due 
 to reference to __fltused. printf does not pull in floating point 
 formatting code.
 Correct solution: fix compiler to not generate __fltused references in 
 library code
 
 Solution: implement separate itoa() for typeinfo
 Alleged problem: calling one function in std.string pulls in everything 
 in std.string
 Actual problem: only a small portion of std.string is actually linked in 
 because the free functions are implemented as COMDATs, but due to an 
 error in the compiler, lambda functions are not written as COMDATs. This 
 causes a reference to std.uni to still be pulled in, pulling in a large 
 table in std.uni
 Correct solution: fix compiler to generate COMDATs for lambda functions.

If you'll read the posts again, sans prejudice, I hope you'll find that they are about decoupling (like the title says). If you haven't heard of it before, it's a generally and widely applicable concept in software design. Been applied for probably 40 years now. If that thrust is not clear from my writing, then I've been totally lacking in my own conviction.

The 'solutions' you note above are being rather frugal with the truth:

Let's face it; printf should probably be removed from object.d purely because it represents poor design judgement (yes; my opinion. I have one). That aside; you make much of the fact that it can be stubbed out, whilst staunchly "refusing" to examine why it should be present in the first place. Instead, there's the persistent "it doesn't make any difference to remove it" stonewall. That approach inevitably leads to a less than fruitful discourse, and fits the description of futile (since you asked).

TypeInfo et al. should most probably strive to be as isolated as they can be from higher level modules (which includes printf). Hooking it up contrary to this manner is sometimes called "tight coupling", and it's what this topic is about. Implementing a local itoa() is one way to decouple TypeInfo; linking to the C lib itoa() is another. You made it implicitly clear there was no way you'd consider isolating object.d from std.string in order to make the former more amenable to alternate libraries. Thus, that aspect was completely ignored also.

Looking again at your recital of adamant examples, I'm rather sorry, and entirely disappointed that's all you got from this exchange. Yes, there's certainly truth there; but it apparently makes a point of purging all mention of decoupling ~ that's where frugality lies.
 And that is the *entirety* of what phobos.lib needs for everything in 
 phobos. A big chunk of it is Windows API imports. A quick look through 
 it shows the following related to C I/O:
 
 _printf
 ___fp_lock
 ___fp_unlock
 __fputc_nlock
 __fputwc_nlock
 __iob
 _exit
 _fclose
 _fdopen
 _feof
 _fflush
 _fgetc
 _fopen
 _fprintf
 _fputc
 _fread
 _free
 _fseek
 _ftell
 _fwide
 _fwrite
 _ungetc
 
 Applying grep tells us where these are used: modules like std.cstream, 
 which is "C" streams, no surprise there. gzio.c, a C function that's 
 part of zlib, no surprise there, either. trace's file logging function, 
 no big deal, that wouldn't get shipped with a released application. 
 std.stdio, no surprise there either, it needs to sync with C's stdio.
 
 There isn't anywhere near 50 routines, and if you don't use cstream, 
 zlib, trace, or writef, it's hard to see any difficulty at all. In fact, 
 I find it hard to find any other uses of stdio other than printf 
 (perhaps I overlooked some?). So why can't 'hooking' printf, as outlined 
 above, work?

Yeah ~ I posted the same lists last year, gained simply by hiding the C library.

You've again omitted a crucial part. The C runtime itself apparently has all kinds of interdependencies (the console startup/exit code is a prime example). Thus, the lists you show are simply the tip of the iceberg. I imagine you already know this quite intimately, so will suggest you do a step-by-step examination of the .map file for Derek's example:

void main() {}

and ask yourself just why and where the kitchen-sink is linked? I'm not telling you this to be a jerk ~ it would surely be of benefit to D if you were to understand where the dependencies actually lie.
 
 BTW, internal\dmain2.d also uses printf to print out an error message 
 relating to any uncaught exceptions.

I had not forgotten it. There's a time for that one also.
Apr 07 2006
next sibling parent reply =?ISO-8859-1?Q?Anders_F_Bj=F6rklund?= <afb algonet.se> writes:
kris wrote:

 BTW, internal\dmain2.d also uses printf to print out an error message 
 relating to any uncaught exceptions.

I had not forgotten it. There's a time for that one also.

Maybe then the bug with it printing to the wrong stream can be fixed ? Errors should be printed on stderr (fprintf), not on stdout (printf)... --anders
Apr 07 2006
parent reply Sean Kelly <sean f4.ca> writes:
Anders F Björklund wrote:
 kris wrote:
 
 BTW, internal\dmain2.d also uses printf to print out an error message 
 relating to any uncaught exceptions.

I had not forgotten it. There's a time for that one also.

Maybe then the bug with it printing to the wrong stream can be fixed ? Errors should be printed on stderr (fprintf), not on stdout (printf)...

An easy change. Just use fprintf and specify stderr as the output file. Replacing printf entirely with something a bit less complex would be quite easy as well, as the error messages are just strings--there's no need for all the fancy stuff printf does. I'll do this at some point for the DMD runtime in Ares, but no one's complained about it yet so I've put it off for now. Sean
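The two options described above could look roughly like this in C. It is a sketch only: the function names and the "Error: " prefix are hypothetical, and this is not the actual dmain2.d or Ares code. The first variant keeps printf's formatting engine but targets stderr; the second drops format parsing entirely, since the message is just a string.

```c
#include <stdio.h>

/* Variant 1: same formatting machinery as printf, but on stderr.
   Returns the number of characters written, or a negative value. */
int report_uncaught(const char *msg)
{
    return fprintf(stderr, "Error: %s\n", msg);
}

/* Variant 2: no format string at all, only plain string output.
   Returns 0 on success, EOF on a write error. */
int report_uncaught_plain(const char *msg)
{
    if (fputs("Error: ", stderr) == EOF)
        return EOF;
    if (fputs(msg, stderr) == EOF)
        return EOF;
    return fputc('\n', stderr) == EOF ? EOF : 0;
}
```

Both variants still go through the C stdio layer, so they stay consistent with any other output the program sends to stderr.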
Apr 07 2006
parent reply =?ISO-8859-1?Q?Anders_F_Bj=F6rklund?= <afb algonet.se> writes:
Sean Kelly wrote:

 Errors should be printed on stderr (fprintf), not on stdout (printf)...

An easy change. Just use fprintf and specify stderr as the output file.

For some reason this change has been rejected earlier. I don't know why. Just thought that if the file is revised, then maybe it can be included? digitalmars.D.bugs/2001 digitalmars.D.bugs/4368 --anders
Apr 07 2006
parent Sean Kelly <sean f4.ca> writes:
Anders F Björklund wrote:
 Sean Kelly wrote:
 
 Errors should be printed on stderr (fprintf), not on stdout (printf)...

An easy change. Just use fprintf and specify stderr as the output file.

For some reason this change has been rejected earlier. I don't know why. Just thought that if the file is revised, then maybe it can be included? digitalmars.D.bugs/2001 digitalmars.D.bugs/4368

No idea. I do suggest using fprintf instead of fwritefln, but I can't think of a reason not to do this. I made the change to Ares quite a while ago. Sean
Apr 07 2006
prev sibling parent reply Walter Bright <newshound digitalmars.com> writes:
kris wrote:
 If you'll read the posts again, sans prejudice, I hope you'll find that 
 they are about decoupling (like the title says).

What I did is ask what is being coupled that needs decoupling, i.e. trying to drill down to find the *real* issue with printf and std.string.toString. I don't agree with the notion that printf and its 4K of code is in the same category as Java's entire runtime library. In other words, I do not agree with absolutes like "coupling is always bad." Each case should be looked at individually on its merits.

For printf, the actual coupling problem is:

1) It pulls in floating point formatting code. This turns out to not be correct.
2) It's bloated. The bloat turns out to be 4K code - a big problem on an 8 bit machine to be sure, but not on a 32 bit one.
3) It pulls in the entire C I/O system. This turns out to also not be correct. It pulls in a reasonable portion of it.
4) It should be replaceable/hookable. Yes, it is.

On the other hand, the advantages of having a print in Object are:

1) Every object can be relied upon to have some sort of print method.
2) Substituting a toString() is problematic because it usually requires extra allocation and hence double buffering - so most I/O systems avoid such designs. Also, the print is often used for debugging, and having it be forced to use an allocation can upset what one is trying to debug - by causing a gc collection cycle, for example.
3) It being synchronized with C's IO means it doesn't screw up when mixed with normal IO code.

If printf pulled in a megabyte graphics library, sure, that's unreasonable coupling. But it doesn't. So, in my judgment, its benefits outweigh its disadvantages. You're free to disagree, but disagreement on a judgment call doesn't mean I or you are secretly convinced by the other's argument and lying about it.

Let's look at the 'coupling' problem of typeinfo calling std.string.toString:

1) It's assumed to link in all of std.string, and everything std.string references. This is not correct.
2) std.string does pull in the floating point code, but this is a compiler problem and easily (and already in my working version) fixed.
3) There's another problem where it pulled in std.uni, but again, this is a compiler problem and easily fixed.

So what std.string.toString actually pulls in (given the compiler fixes) is just std.string.toString and a few bytes of static data. So, in the end, the coupling turns out to be nonexistent. The assumption that calling one routine in a module brings in the entire module is not correct (and hasn't been since the late 80's).

The proposed solution, calling C's itoa(), is problematic because it requires another allocation (itoa expects a 0 terminated string). Furthermore, itoa() is not a standard C function, so such dependency won't be portable. Writing another itoa unique to typeinfo kinda defeats the purpose of writing a library with reusable code.
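For reference, the allocation concern raised above can be avoided by a conversion routine that writes into a caller-supplied buffer and returns a length instead of relying on a terminating 0. The following C sketch is purely illustrative; int_to_chars is a hypothetical name, not a Phobos, Ares, or C library function, and it does not settle the question of whether duplicating such a routine is worthwhile.

```c
#include <stddef.h>

/* Hypothetical helper: writes the decimal form of `value` into `buf`
   (capacity `cap`) with no terminating NUL and no heap allocation.
   Returns the number of characters written, or 0 if `buf` is too small. */
size_t int_to_chars(long value, char *buf, size_t cap)
{
    char tmp[24];                 /* room for a 64-bit long plus sign */
    size_t n = 0;

    /* Take the magnitude as unsigned so LONG_MIN is handled safely. */
    unsigned long mag = value < 0 ? 0UL - (unsigned long)value
                                  : (unsigned long)value;

    do {                          /* digits come out least significant first */
        tmp[n++] = (char)('0' + mag % 10);
        mag /= 10;
    } while (mag != 0);

    if (value < 0)
        tmp[n++] = '-';

    if (n > cap)
        return 0;

    for (size_t i = 0; i < n; i++)   /* reverse into the caller's buffer */
        buf[i] = tmp[n - 1 - i];

    return n;
}
```

A caller would typically pass a small stack buffer and use the returned length as the slice length, which matches D's length-carrying arrays better than a 0-terminated string does.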
 You made it 
 implicitly clear there was no way you'd consider isolating object.d from 
 std.string in order to make the former more amenable to alternate 
 libraries. Thus, that aspect was completely ignored also.

No, I did not ignore it. I don't feel it's productive to rewrite the functions in std.string in each module. std.string is not a burdensome piece of code. Should I also rewrite memcpy() in every module? How far do you want to go to pursue decoupling for decoupling's sake? And suppose Fred discovers a way to double the speed of std.string.toString - pursuing your approach would mean none of the rest of the library would benefit from that. In my not-so-humble (!) opinion, when you're using copy/paste across modules, that's a red flag something is wrong with the design. I'll agree with you that pointless and gratuitous coupling should be avoided, but that is not the case with the two examples we're discussing.
 Looking again at your recital of adamant examples, I'm rather sorry, and 
 entirely disappointed that's all you got from this exchange.

I used the word adamant because you haven't acknowledged that I've addressed the underlying __fltused and std.uni issues, you haven't acknowledged that they are solvable compiler issues rather than library design issues, and just keep pushing the same solution.
 Yes, there's certainly truth there; but it apparently makes a point of 
 purging all mention of decoupling ~ that's where frugality lies.

I shall reiterate that what I was doing was addressing what the underlying issue with coupling was. If those can be successfully addressed, then there is no coupling issue.
 You've again omitted a crucial part. The C runtime itself apparently has 
 all kinds of interdependencies (the console startup/exit code is a prime 
 example).

I did not omit it. I addressed that in my last post.
 Thus, the lists you show are simply the tip of the iceberg. I 
 imagine you already know this quite intimately, so will suggest you do a 
 step-by-step examination of the .map file for Derek's example:
 
 void main() {}
 and ask yourself just why and where the kitchen-sink is linked?

And as I've already told you, I've already done that and that there are good reasons for the code that is linked in.
Apr 07 2006
parent kris <foo bar.com> writes:
Walter Bright wrote:
 kris wrote:

I see you snipped the questions about trying to understand the need for Ares? I thought those were realistic, reasonable, and pertinent questions in the search to understand why you can't understand Ares exists. Are they just not important enough to warrant attention? Here they are again:

<repost>
Kris: Why do you think Ares exists anyway?
Walter: I'm trying to understand.
Kris: Ah. That makes some sense now ~ so, you're saying you don't see any need for a better library? That you're trying to understand what such a need might be?
</repost>

Those are important questions, since they'll help clarify the apparently vast divide between what you think of as a library, and what I do. I doubt very much that I'm alone upon that sea of confusion. If I were, Ares would not exist. I feel it's important for you to be open about your perspective on "alternate" libraries vs Phobos ~ after all, there's a lot of people putting a lot of effort into making those available ...

BTW: the phrase "better library" is used there because that's what Ares was designed to be. I now realise that you may have interpreted that as some kind of an "insult" to Phobos. Many pardons.
 For printf, the actual coupling problem is:
 
 1) It pulls in floating point formatting code. This turns out to not be 
 correct.

Mea culpa; easy mistake to make when there's nothing else referencing FP, wouldn't you say? Who might have guessed it were std.string instead of printf() ?
 2) It's bloated. The bloat turns out to be 4K code - a big problem on an 
 8 bit machine to be sure, but not on a 32 bit one.
 3) It pulls in the entire C I/O system. This turns out to also not be 
 correct. It pulls in a reasonable portion of it.

Smoke & mirrors: If the C IO were not already leaking in via a number of other holes in the boat, then the reference within object.d would be solely responsible for pulling in everything that printf references. This does amount to a rather large majority of the C IO subsystem, as you know. Yes, you've made it perfectly clear that you have no interest in even trying to see any relationship therein.
 4) It should be replaceable/hookable. Yes, it is.

By stubbing out dependencies? That's not what I'd call good design. If you think it is, then we can agree to disagree.
 On the other hand, the advantages of having a print in Object are:
 
 1) Every object can be relied upon to have some sort of print method.

If that has such an immeasurable value, why is it not used more? I submit this particular "reason to exist" is of questionable merit. D deserves better debugging support than object.print() ~ there's a decent logging package in mango.log ... doesn't even allocate memory.
 2) Substituting a toString() is problematic because it usually requires 
 extra allocation and hence double buffering - so most I/O systems avoid 
 such designs. 

I see. The type of trivial, lazy, and inflexible debugging-output represented by object.print() needs to avoid double buffering at all costs?
 Also, the print is often used for debugging, and having it 
 be forced to use an allocation can upset what one is trying to debug - 
 by causing a gc collection cycle, for example.

Pardon me; std.string is one of the worst offenders in terms of wasteful memory allocation that I've seen in decades. It appears that every single toString() method allocates from the heap, Walter. I would have mentioned this before, but we couldn't even get past the introduction. To use your own words: "Also, the print is often used for debugging, and having it be forced to use an allocation can upset what one is trying to debug - by causing a gc collection cycle, for example." I'm sure you realise that the print() method invokes toString(), quite possibly resulting in calls to std.string.toString()? Which does wholly unnecessary memory allocations? Pardon me, but std.string is hardly the shining light to be held aloft in such matters.
 3) It being synchronized with C's IO means it doesn't screw up when 
 mixed with normal IO code.

That's really stretching: you don't need printf() to synchronize with normal IO. Use one of the lower layers instead, since none of the formatting options are actually used ~ as you know. This is a good example of using an "expensive" library function where a "less costly" one would suffice perfectly well. Again, the functionality is of questionable value anyway.
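To illustrate the "lower layer" suggestion above, here is a minimal sketch of writing a string through C stdio without going through printf's format parsing. It assumes a current D compiler, where the C bindings live in core.stdc.stdio (the 2006 Phobos used std.c.stdio), and printLine is a hypothetical helper, not the code in object.d.

```d
import core.stdc.stdio : EOF, FILE, fputc, fwrite, stdout;

// Hypothetical helper: writes `text` and a newline to a C stream.
// fwrite and fputc use the same FILE buffer as printf, so output stays
// ordered with other C stdio calls, but no format string is parsed.
// Returns the number of bytes written.
size_t printLine(const(char)[] text, FILE* stream)
{
    size_t written = fwrite(text.ptr, 1, text.length, stream);
    if (fputc('\n', stream) != EOF)
        ++written;
    return written;
}

unittest
{
    // Six characters plus the newline.
    assert(printLine("Object", stdout) == 7);
}
```

With a current dmd this can be tried with `dmd -unittest -main -run` on the file. Whether the saving matters in practice depends on what else in the program already references printf.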
 No, I did not ignore it. I don't feel it's productive to rewrite the 
 functions in std.string in each module. std.string is not a burdensome 
 piece of code. 

It is most certainly burdensome, Walter. Allocating memory is one of the most expensive things one can do in this realm ~ std.string does it with gay abandon. How does that fit with your position?
 I'll agree with you that pointless and gratuitous coupling should be 
 avoided, but that is not the case with the two examples we're discussing.

I submit that printf() within object.d is gratuitous and almost entirely pointless. And not even used appropriately. There's fair reason why one won't see that kind of appendage in mainstream designs. The coupling to std.string is indeed pulling in unwarranted FP support, so there is clearly an issue there. Std.string also happens to allocate memory for each toString() method (looks like all of them do?). That is hardly wise to do at a low level, especially where it can be easily avoided. You spelt out some reasons to avoid that yourself, vis-a-vis allocation whilst debugging.
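As a rough sketch of how such an allocation could be avoided, the following formats an unsigned integer into a buffer supplied by the caller, so nothing is taken from the GC heap. formatUint is a hypothetical name and is not part of Phobos, Ares, or Mango; the code assumes a current D compiler (the @nogc attribute did not exist in 2006).

```d
// Hypothetical helper: formats `value` as decimal digits at the end of
// the caller's buffer and returns the slice holding those digits.
// No heap allocation takes place, which @nogc lets the compiler check.
char[] formatUint(uint value, char[] buffer) @nogc nothrow
{
    size_t pos = buffer.length;
    do
    {
        assert(pos > 0, "buffer too small");
        buffer[--pos] = cast(char)('0' + value % 10);
        value /= 10;
    } while (value != 0);
    return buffer[pos .. $];
}

unittest
{
    char[10] scratch; // uint.max has 10 decimal digits
    assert(formatUint(0, scratch[]) == "0");
    assert(formatUint(10, scratch[]) == "10");
    assert(formatUint(uint.max, scratch[]) == "4294967295");
}
```

The returned slice aliases the caller's buffer, so it is only valid for as long as that buffer is, which is the usual trade-off against a toString() that returns freshly allocated memory.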
 
 Looking again at your recital of adamant examples, I'm rather sorry, 
 and entirely disappointed that's all you got from this exchange.

I used the word adamant because you haven't acknowledged that I've addressed the underlying __fltused and std.uni issues, you haven't acknowledged that they are solvable compiler issues rather than library design issues, and just keep pushing the same solution.

Sigh
 I shall reiterate that what I was doing was addressing what the 
 underlying issue with coupling was. If those can be successfully 
 addressed, then there is no coupling issue.

I'm very happy you found some problems and were able to rectify them. That's great! But that doesn't, in any shape or form, mean there are no coupling issues. You think these are the only examples? You feel it's OK for std.string to be allocating from the heap, when it can easily be avoided instead? That it's cool for low level language-support code (like object.d) to be invoking the GC where it could easily be avoided? I don't. But, I'd be perfectly content if you were to leave it just as it is
Apr 07 2006
prev sibling next sibling parent reply Sean Kelly <sean f4.ca> writes:
Walter Bright wrote:
 
 BTW, internal\dmain2.d also uses printf to print out an error message 
 relating to any uncaught exceptions.

Ares currently uses fprintf for this purpose, which is essentially the same thing. And as you say, it would be easy enough to hook the function if this behavior isn't desirable. I'm glad the compiler issues came to light however. I don't suppose this will have any impact on the template code generation problem in libraries? Sean
Apr 07 2006
next sibling parent Walter Bright <newshound digitalmars.com> writes:
Sean Kelly wrote:
 Ares currently uses fprintf for this purpose, which is essentially the 
 same thing.  And as you say, it would be easy enough to hook the 
 function if this behavior isn't desirable.
 
 I'm glad the compiler issues came to light however.  I don't suppose 
 this will have any impact on the template code generation problem in 
 libraries?

Unfortunately, no, as I recall that is a limitation in the object file format.
Apr 07 2006
prev sibling parent reply Dave <Dave_member pathlink.com> writes:
Sean Kelly wrote:
 Walter Bright wrote:
 BTW, internal\dmain2.d also uses printf to print out an error message 
 relating to any uncaught exceptions.

Ares currently uses fprintf for this purpose, which is essentially the same thing. And as you say, it would be easy enough to hook the function if this behavior isn't desirable. I'm glad the compiler issues came to light however. I don't suppose this will have any impact on the template code generation problem in libraries? Sean

Sean - could you point me to some links or post some example code on this? Thanks, - Dave
Apr 08 2006
parent reply Sean Kelly <sean f4.ca> writes:
Dave wrote:
 Sean Kelly wrote:
 Walter Bright wrote:
 BTW, internal\dmain2.d also uses printf to print out an error message 
 relating to any uncaught exceptions.

Ares currently uses fprintf for this purpose, which is essentially the same thing. And as you say, it would be easy enough to hook the function if this behavior isn't desirable. I'm glad the compiler issues came to light however. I don't suppose this will have any impact on the template code generation problem in libraries?

Sean - could you point me to some links or post some example code on this?

http://svn.dsource.org/projects/ares/trunk/src/dmdrt/dmain2.d See the "catch" blocks near the end of the file. Sean
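For readers without access to that file, the general shape of such a catch block can be sketched as below. This is not the Ares or Phobos source: runGuarded is a hypothetical stand-in for the runtime's entry wrapper, and the code assumes a current D compiler with core.stdc.stdio.

```d
import core.stdc.stdio : FILE, fprintf, stderr;

// Hypothetical stand-in for a runtime entry wrapper: runs the user's
// main and reports any uncaught exception through C stdio.
int runGuarded(int function() userMain, FILE* errStream)
{
    try
    {
        return userMain();
    }
    catch (Exception e)
    {
        // %.*s prints a length-delimited D string, which is not
        // guaranteed to be NUL-terminated.
        fprintf(errStream, "Error: %.*s\n",
                cast(int) e.msg.length, e.msg.ptr);
        return 1;
    }
}

unittest
{
    assert(runGuarded(function int() { return 7; }, stderr) == 7);
    // The throwing case writes "Error: boom" to stderr and returns 1.
    assert(runGuarded(function int() { throw new Exception("boom"); },
                      stderr) == 1);
}
```

Because the report goes through a single fprintf call, replacing it (with a stub or a different sink) changes the behaviour for every uncaught exception, which is the "hook" being discussed.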
Apr 08 2006
parent reply Dave <Dave_member pathlink.com> writes:
Sean Kelly wrote:
 Dave wrote:
 Sean Kelly wrote:
 Walter Bright wrote:
 BTW, internal\dmain2.d also uses printf to print out an error 
 message relating to any uncaught exceptions.

Ares currently uses fprintf for this purpose, which is essentially the same thing. And as you say, it would be easy enough to hook the function if this behavior isn't desirable. I'm glad the compiler issues came to light however. I don't suppose this will have any impact on the template code generation problem in libraries?

Sean - could you point me to some links or post some example code on this?

http://svn.dsource.org/projects/ares/trunk/src/dmdrt/dmain2.d See the "catch" blocks near the end of the file. Sean

I still don't understand in the context of this message - what are the error messages you're getting? Thanks, - Dave
Apr 08 2006
parent Sean Kelly <sean f4.ca> writes:
Dave wrote:
 Sean Kelly wrote:
 Dave wrote:
 Sean Kelly wrote:
 Walter Bright wrote:
 BTW, internal\dmain2.d also uses printf to print out an error 
 message relating to any uncaught exceptions.

Ares currently uses fprintf for this purpose, which is essentially the same thing. And as you say, it would be easy enough to hook the function if this behavior isn't desirable. I'm glad the compiler issues came to light however. I don't suppose this will have any impact on the template code generation problem in libraries?

Sean - could you point me to some links or post some example code on this?

http://svn.dsource.org/projects/ares/trunk/src/dmdrt/dmain2.d See the "catch" blocks near the end of the file.

I still don't understand in the context of this message - what are the error messages you're getting?

Oops, I think I misunderstood. Were you asking about the template link problems? See this thread: http://www.digitalmars.com/d/archives/digitalmars/D/23685.html Sean
Apr 08 2006
prev sibling parent reply Sean Kelly <sean f4.ca> writes:
Walter Bright wrote:
 
 I know you're still bothered by the issue of C's IO being pulled in. I 
 have another utility for you - \dm\bin\libunres. Libunres will identify 
 any symbols unresolved by a library.

For reference, here's the result of running this utility on the Ares version of the DMD GC and DMD runtime, respectively. The dependencies should be roughly equivalent to the appropriate portions of Phobos. Note that "_gc_*" "_thread_*" and "_on*" are defined by the GC or standard library:

C:\bin\dmd\lib>libunres dmdgc.lib
Unresolved externals:
_D6object6Object5opCmpFC6ObjectZi
_D6object6Object6toHashFZk
_D6object6Object8opEqualsFC6ObjectZi
_D6object6Object8toStringFZAa
__Class_6Object
__d_framehandler
__d_local_unwind2
__d_monitorenter
__d_monitorexit
__end
__except_list
__imp__GetCurrentThreadId 0
__imp__VirtualAlloc 16
__imp__VirtualFree 12
__nullext
__vtbl_9ClassInfo
__xi_a
_calloc
_free
_malloc
_memcpy
_memmove
_memset
_onOutOfMemory
_realloc
_thread_init
_thread_needLock
_thread_resumeAll
_thread_scanAll
_thread_suspendAll

C:\bin\dmd\lib>libunres dmdrt.lib
Unresolved externals:
__Ccmp
__Dmain
__LCMP
__LDIV
__ModuleInfo_3std1c4math
__ModuleInfo_3std1c5ctype
__ModuleInfo_3std1c5stdio
__ModuleInfo_3std1c6stdarg
__ModuleInfo_3std1c6stddef
__ModuleInfo_3std1c6stdlib
__ModuleInfo_3std1c6string
__ModuleInfo_3std1c7stdbool
__ULDIV
___alloca
___fpclassify_d
___fpclassify_f
___fpclassify_ld
__assert
__except_list
__fltused
__global_unwind
__imp__DeleteCriticalSection 4
__imp__EnterCriticalSection 4
__imp__InitializeCriticalSection 4
__imp__LeaveCriticalSection 4
__imp__QueryPerformanceCounter 4
__imp__QueryPerformanceFrequency 4
__imp__RaiseException 16
__iob
__local_except_handler
_calloc
_exit
_fclose
_fgetc
_fopen
_fprintf
_free
_gc_calloc
_gc_free
_gc_init
_gc_malloc
_gc_realloc
_gc_setFinalizer
_gc_sizeOf
_gc_term
_isalpha
_isgraph
_isspace
_malloc
_memcmp
_memcpy
_memicmp
_memmove
_memset
_onArrayBoundsError
_onAssert
_onOutOfMemory
_onSwitchError
_onUnicodeError
_printf
_qsort
_sprintf
_strlen
_strtoul
_strtoull
_unmangle_ident

The ModuleInfo dependencies could be eliminated by explicitly declaring forward references to the appropriate C routines instead of importing the D header modules. The remaining dependency list is really pretty minimal. I am somewhat surprised to see printf in there, though--I must have missed a "debug" prefix somewhere. Sean
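The forward-reference idea mentioned above can be sketched as follows: the C routine is declared by hand in the module that needs it, rather than imported from a D header module. This assumes a current D compiler; argLength is a hypothetical caller, and whether a ModuleInfo reference is actually dropped depends on the compiler version, so the sketch only shows the declaration technique.

```d
// Forward declaration of the C routine, written out by hand instead of
// `import core.stdc.string;`. extern (C) symbols carry no module name
// in their mangling, so the linker resolves this against the C library.
extern (C) size_t strlen(const(char)* s) @nogc nothrow;

// Hypothetical caller standing in for runtime code that needs strlen.
size_t argLength(const(char)* arg) @nogc nothrow
{
    return arg is null ? 0 : strlen(arg);
}

unittest
{
    // D string literals are NUL-terminated and convert to const(char)*.
    assert(argLength("dmdrt") == 5);
    assert(argLength(null) == 0);
}
```

The cost of this approach is that the hand-written prototype must be kept in step with the C library by the author, since no header module checks it.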
Apr 07 2006
parent reply Sean Kelly <sean f4.ca> writes:
Sean Kelly wrote:
 I am somewhat surprised to see printf in there, though--I must 
 have missed a "debug" prefix somewhere.

Quick follow-up. All printf calls in the DMD runtime are either commented out or are prefixed by a debug qualifier. And the library was built without -debug set. Any ideas why it would be listed as a dependency? A grep of dmdrt.lib listed this:

._fprintf
._printf
._fprintf
._sprintf

so perhaps calling fprintf creates the other dependencies as well? I'll give the map a look and see if it offers any clues. Sean
Apr 07 2006
parent reply Walter Bright <newshound digitalmars.com> writes:
Sean Kelly wrote:
 Quick follow-up.  All printf calls in the DMD runtime are either 
 commented out or are prefixed by a debug qualifier.  And the library was 
 built without -debug set.  Any ideas why it would be listed as a 
 dependency?  A grep of dmdrt.lib listed this:
 
 ._fprintf
 ._printf
 ._fprintf
 ._sprintf
 
 so perhaps calling fprintf creates the other dependencies as well?  I'll 
 give the map a look and see if it offers any clues.

You can see which module(s) is referencing them by using grep across the .obj files.
Apr 07 2006
parent Sean Kelly <sean f4.ca> writes:
Walter Bright wrote:
 
 You can see which module(s) is referencing them by using grep across the 
 .obj files.

Thanks. Turns out it was a debug printf I'd left in place by accident. Sean
Apr 07 2006
prev sibling parent "Derek Parnell" <derek psych.ward> writes:
On Fri, 07 Apr 2006 14:05:02 +1000, Walter Bright  
<newshound digitalmars.com> wrote:


 If you want to use a system that for some reason can't have C's IO  
 subsystem, then just include the one liner:

 extern (C) int printf(char* f, ...) { return 0; }

 somewhere in your code, and it's gone.

So I did this, and it reduced the object size by 118 bytes. However, it still seems all the C I/O system is linked in as the only difference was that _vprintf was not linked in. Every other symbol was still linked. -- Derek Parnell Melbourne, Australia
Apr 07 2006
prev sibling parent Georg Wrede <georg nospam.org> writes:
Walter Bright wrote:
 kris wrote:
 
 Yes, that's correct. But typeinfo is a rather rudimentary part of the 
 language support. Wouldn't you agree? If I, for example, declare an 
 array of 10 bytes (static byte[10]) then I'm bound over to import 
 std.string ~ simply because TypeInfo_StaticArray wants to use 
 std.string.toString(int), rather than the C library version of itoa() 
 or a "low-level support" version instead.

It has nothing to do with having a static byte[10] declaration. For the program:

    void main()
    {
        static byte[10] b;
    }

The only things referenced by the object file are _main, __acrtused_con, and __Dmain. You can verify this by running obj2asm on the output, which gives:

 
 It's just not a big deal. Try the following:
 
 extern (C) int printf(char* f, ...) { return 0; }
 
 void main()
 {
     static byte[10] b;
 }
 
 and compare the difference in exe file sizes, with and without the 
 printf stub.
 
 
 printf doesn't pull in the floating point library (I went to a lot of 
 effort to make that so!). It does pull in the C IO library, which is 
 very hard to not pull in (there always seems to be something 
 referencing it). It shouldn't pull in the C wide character stuff. D's 
 IO (writefln) will pull in C's IO anyway, so the only thing extra is 
 the integer version of the specific printf code (about 4K).

How can it convert %f, %g and so on if it doesn't use FP support at all?

It's magic! Naw, it's just that if you actually use floating point in a program, the compiler emits a special extern reference (to __fltused) which pulls in the floating point IO formatting code. Otherwise, it defaults to just a stub. Try it.
 Either way, it's not currently possible to build a D program without a 
 swathe of FP support code,
 printf,
 the entire C IO package,
 wide-char support,
 and a whole lot more besides. I'd assumed the linked FP support was 
 for printf, but perhaps it's for std.string instead? I've posted the 
 linker maps (in the past) to illustrate exactly this.

My point is that assuming what is pulled in by what is about as reliable as guessing where the bottlenecks in one's code are.

Apr 13 2006
prev sibling next sibling parent "Derek Parnell" <derek psych.ward> writes:
On Fri, 07 Apr 2006 06:52:08 +1000, Walter Bright  
<newshound digitalmars.com> wrote:


 And I hope I responded adequately.

I don't think you did. Once again you skirted the issues with politician-like answers. -- Derek Parnell Melbourne, Australia
Apr 06 2006
prev sibling next sibling parent Derek Parnell <derek psych.ward> writes:
On Thu, 06 Apr 2006 13:52:08 -0700, Walter Bright wrote:


 Instead of adding a duplicate itoa() method (about 60 bytes of code), or 
 perhaps linking to the C library version, TypeInfo gratuitously imports 
 std.string and all its vast array of baggage. Heck, everyone makes 
 mistakes, but your comment above indicates you feel this kind of tight 
 coupling is perfectly fine?

Although there is a lot of code in std.string, unreferenced free functions in it should be discarded by the linker. A check of the generated .map file should verify this - it is certainly supposed to work that way. One problem Java has is that there are no free functions, so referencing one function wound up pulling in every part of the class the function resided in.

The following program ... void main() {} linked in the following modules (from the .map file) ---------------- ??0RTLHeap QAE XZ ??0RTLHeapBlock QAE IAAV0 Z ??0RTLHeapBlock QAE XZ ??0RTLHeapBlockHeader QAE I Z ??0RTLMultiPool QAE II Z ??0RTLPool QAE II Z ??0Type_info AAA ABV0 Z ??0bad_cast std QAE ABV01 Z ??0bad_cast std QAE XZ ??0exception std QAE XZ ??0type_info std IAE XZ ??1RTLMultiPool QAE XZ ??1Type_info UAA XZ ??1__eh_cv QAE XZ ??1bad_cast std UAE XZ ??1bad_exception std UAE XZ ??1bad_typeid std UAE XZ ??1exception std UAE XZ ??1type_info std UAA XZ ??2 YAPAXI Z ??3 YAXPAX Z ??_GType_info UAEPAXI Z ??_Gbad_cast std UAEPAXI Z ??_Gbad_exception std UAEPAXI Z ??_Gbad_typeid std UAEPAXI Z ??_Gexception std UAEPAXI Z ??_Gtype_info std UAEPAXI Z ??_QType_info 6B ??_Qbad_cast std 6B ??_Qbad_exception std 6B ??_Qbad_typeid std 6B ??_Qexception std 6B ??_Qtype_info std 6B ?Alloc RTLHeap QAEPAXK Z ?Alloc RTLMultiPool QAEPAXK Z ?Alloc RTLPool QAEPAXXZ ?Claim RTLHeapBlock QAEHI Z ?CreateMainHeap RTLHeap SAXXZ ?CreateMainHeap RTLMultiPool SAXXZ ?FixSize RTLHeap AAEII Z ?Free RTLHeap QAEXPAX Z ?Free RTLMultiPool QAEXPAX Z ?Free RTLPool QAEXPAX Z ?GetNext RTLHeapBlock QAEAAV1 XZ ?Handle_VC_Exception YAPAUEhstack PAU_EXCEPTION_RECORD H Z ?InsertAfter RTLHeapBlock QAEXAAV1 Z ?MergeBackward RTLHeapBlock QAEHXZ ?MergeForward RTLHeapBlock QAEHXZ ?MoreCore RTLHeap AAEPAVRTLHeapBlock I Z ?Realloc RTLHeap QAEPAXPAXK Z ?Realloc RTLMultiPool QAEPAXPAXK Z ?Reclaim RTLHeapBlock QAEHIAAV1 Z ?Remove RTLHeapBlock QAEXXZ ?SelectFree RTLMultiPool AAEXPAX Z ?ThreadNewBlock RTLPool AAEXXZ ?__cpp_local_unwind YAXPAU_CPP_Establisher_Frame PADH Z ?__do_newalloc YAPAXI Z ?__eh_delete YAXPAX Z ?__eh_delp 3P6AXPAX ZA ?__eh_error YAXXZ ?__eh_new YAPAXI Z ?__eh_newp 3P6APAXI ZA ?__eh_throw YAXPBDP6CHXZIZZ ?__internal_cpp_framehandler YA?AW4_EXCEPTION_DISPOSITION PAUfunc_data PAU_EXCEPTION_RECORD PAU_CPP_Establisher_Frame PAU_CONTEXT PAX Z ?__new_handler_type 3HA ?__rtti_cast YAPAXPAX0PBD1H Z 
?__rtti_match YAHPBD0PAI Z ?_call_catch_block YAXJIP6CHXZ Z ?_new_handler 3P6AHI ZA ?match_with_vc_throw_type YAHPAUThrowInfo PADPAI Z ?pMainHeap RTLHeap 0PAV1 A ?pMainHeap RTLMultiPool 0PAV1 A ?pPools RTLPool 0PAV1 A ?set_terminate YAP6AXXZP6AXXZ Z ?set_terminate std YAP6AXXZP6AXXZ Z ?set_unexpected YAP6AXXZP6AXXZ Z ?set_unexpected std YAP6AXXZP6AXXZ Z ?terminate YAXXZ ?terminate_fp 3P6AXXZA ?unexpected YAXXZ ?unexpected_fp 3P6AXXZA ?what bad_cast std UBEPBDXZ ?what bad_exception std UBEPBDXZ ?what bad_typeid std UBEPBDXZ ?what exception std UBEPBDXZ _CW_USEDEFAULT _CloseHandle 4 _CreateSemaphoreA 16 _CreateThread 24 _D3gcx10notbinsizeG12k _D3gcx14SENTINEL_EXTRAk _D3gcx2GC10genCollectFZv _D3gcx2GC10initializeFZv _D3gcx2GC10removeRootFPvZv _D3gcx2GC11__invariantFZv _D3gcx2GC11fullCollectFZv _D3gcx2GC11removeRangeFPvZv _D3gcx2GC12mallocNoSyncFkZPv _D3gcx2GC12setFinalizerFPvPFPvPvZvZv _D3gcx2GC14scanStaticDataFC3gcx2GCZv _D3gcx2GC14setStackBottomFPvZv _D3gcx2GC18fullCollectNoStackFZv _D3gcx2GC4DtorFZv _D3gcx2GC4filePa _D3gcx2GC4freeFPvZv _D3gcx2GC4linek _D3gcx2GC5checkFPvZv _D3gcx2GC6callocFkkZPv _D3gcx2GC6enableFZv _D3gcx2GC6gcLockC9ClassInfo _D3gcx2GC6mallocFkZPv _D3gcx2GC7addRootFPvZv _D3gcx2GC7disableFZv _D3gcx2GC7reallocFPvkZPv _D3gcx2GC8addRangeFPvPvZv _D3gcx2GC8capacityFPvZk _D3gcx2GC8getStatsFJS7gcstats7GCStatsZv _D3gcx2GC8minimizeFZv _D3gcx3Gcx10removeRootFPvZv _D3gcx3Gcx11fullcollectFPvZk _D3gcx3Gcx11removeRangeFPvZv _D3gcx3Gcx16fullcollectshellFZk _D3gcx3Gcx4DtorFZv _D3gcx3Gcx4markFPvPvZv _D3gcx3Gcx7addRootFPvZv _D3gcx3Gcx7findBinFkZh _D3gcx3Gcx7newPoolFkZPS3gcx4Pool _D3gcx3Gcx8addRangeFPvPvZv _D3gcx3Gcx8bigAllocFkZPv _D3gcx3Gcx8findPoolFPvZPS3gcx4Pool _D3gcx3Gcx8findSizeFPvZk _D3gcx3Gcx9allocPageFhZi _D3gcx4Pool10allocPagesFkZk _D3gcx4Pool10initializeFkZv _D3gcx4Pool4DtorFZv _D3gcx7binsizeG12k _D3gcx9GCVERSIONk _D3std10moduleinit12_moduleCtor2FAC10ModuleInfoiZv _D3std10moduleinit15ModuleCtorError5_ctorFC10ModuleInfoZC3std10moduleinit15ModuleCtorError 
_D3std10moduleinit17_moduleinfo_dtorsAC10ModuleInfo _D3std10moduleinit19_moduleinfo_dtors_ik _D3std11outofmemory20OutOfMemoryException1sAa _D3std11outofmemory20OutOfMemoryException8toStringFZAa _D3std1c6stdarg15__T8va_startTkZ8va_startFJPvKkZv _D3std1c6stdarg6va_endFPvZv _D3std2gc11newCapacityFkkZk _D3std2gc11newCapacityFkkZk9log2plus1FkZi _D3std2gc13new_finalizerFPvPvZv _D3std2gc3_gcC3gcx2GC _D3std3uni10isUniAlphaFwZi _D3std3uni10isUniLowerFwZi _D3std3uni10isUniUpperFwZi _D3std3uni10toUniLowerFwZw _D3std3uni10toUniUpperFwZw _D3std3utf10UTF8strideG256h _D3std3utf12UtfException5_ctorFAakZC3std3utf12UtfException _D3std3utf12isValidDcharFwZx _D3std3utf6decodeFAaKkZw _D3std3utf6encodeFKAawZv _D3std3utf6strideFAakZk _D3std3utf6toUTF8FG4awZAa _D3std5array16ArrayBoundsError5_ctorFAakZC3std5array16ArrayBoundsError _D3std6string10whitespaceG6a _D3std6string2LSw _D3std6string2PSw _D3std6string3cmpFAaAaZi _D3std6string4findFAawZi _D3std6string6columnFAaiZi _D3std6string6digitsG10a _D3std6string7iswhiteFwZi _D3std6string7lettersG52a _D3std6string7newlineG2a _D3std6string8toStringFkZAa _D3std6string9hexdigitsG16a _D3std6string9inPatternFwAaZi _D3std6string9lowercaseG26a _D3std6string9octdigitsG8a _D3std6string9uppercaseG26a _D3std6thread11ThreadError5_ctorFAaZC3std6thread11ThreadError _D3std6thread20os_query_stackBottomFZPv _D3std6thread6Thread10allThreadsG1024C3std6thread6Thread _D3std6thread6Thread10threadLockC6Object _D3std6thread6Thread11_staticDtorFZv _D3std6thread6Thread11setPriorityFE3std6thread6Thread8PRIORITYZv _D3std6thread6Thread11thread_initFZv _D3std6thread6Thread13allThreadsDimk _D3std6thread6Thread22getCurrentThreadHandleFZT3std1c7windows7windows6HANDLE _D3std6thread6Thread3runFZi _D3std6thread6Thread4waitFZv _D3std6thread6Thread4waitFkZv _D3std6thread6Thread5_ctorFZC3std6thread6Thread _D3std6thread6Thread5errorFAaZv _D3std6thread6Thread5pauseFZv _D3std6thread6Thread5startFZv _D3std6thread6Thread6getAllFZAC3std6thread6Thread _D3std6thread6Thread6resumeFZv 
_D3std6thread6Thread7getThisFZC3std6thread6Thread _D3std6thread6Thread8getStateFZE3std6thread6Thread2TS _D3std6thread6Thread8nthreadsk _D3std6thread6Thread8pauseAllFZv _D3std6thread6Thread9resumeAllFZv _D3std8typeinfo2Aa11TypeInfo_Aa5tsizeFZk _D3std8typeinfo2Aa11TypeInfo_Aa6equalsFPvPvZi _D3std8typeinfo2Aa11TypeInfo_Aa7compareFPvPvZi _D3std8typeinfo2Aa11TypeInfo_Aa7getHashFPvZk _D3std8typeinfo2Aa11TypeInfo_Aa8toStringFZAa _D3std8typeinfo7ti_char10TypeInfo_a4swapFPvPvZv _D3std8typeinfo7ti_char10TypeInfo_a5tsizeFZk _D3std8typeinfo7ti_char10TypeInfo_a6equalsFPvPvZi _D3std8typeinfo7ti_char10TypeInfo_a7compareFPvPvZi _D3std8typeinfo7ti_char10TypeInfo_a7getHashFPvZk _D3std8typeinfo7ti_char10TypeInfo_a8toStringFZAa _D3std8typeinfo7ti_uint10TypeInfo_k4swapFPvPvZv _D3std8typeinfo7ti_uint10TypeInfo_k5tsizeFZk _D3std8typeinfo7ti_uint10TypeInfo_k6equalsFPvPvZi _D3std8typeinfo7ti_uint10TypeInfo_k7compareFPvPvZi _D3std8typeinfo7ti_uint10TypeInfo_k7getHashFPvZk _D3std8typeinfo7ti_uint10TypeInfo_k8toStringFZAa _D5win3210os_mem_mapFkZPv _D5win3212os_mem_unmapFPvkZi _D5win3213os_mem_commitFPvkkZi _D5win3215os_mem_decommitFPvkkZi _D5win3220os_query_stackBottomFZPv _D5win3222os_query_staticdatasegFPPvPkZv _D6gcbits6GCBits10BITS_SHIFTi _D6gcbits6GCBits13BITS_PER_WORDi _D6gcbits6GCBits3setFkZv _D6gcbits6GCBits4DtorFZv _D6gcbits6GCBits4baseFZPk _D6gcbits6GCBits4copyFPS6gcbits6GCBitsZv _D6gcbits6GCBits4testFkZk _D6gcbits6GCBits4zeroFZv _D6gcbits6GCBits5allocFkZv _D6gcbits6GCBits9BITS_MASKi _D6gcbits6GCBits9testClearFkZk _D6object14TypeInfo_Array4swapFPvPvZv _D6object14TypeInfo_Array5tsizeFZk _D6object14TypeInfo_Array6equalsFPvPvZi _D6object14TypeInfo_Array7compareFPvPvZi _D6object14TypeInfo_Array7getHashFPvZk _D6object14TypeInfo_Array8toStringFZAa _D6object14TypeInfo_Class5tsizeFZk _D6object14TypeInfo_Class6equalsFPvPvZi _D6object14TypeInfo_Class7compareFPvPvZi _D6object14TypeInfo_Class7getHashFPvZk _D6object14TypeInfo_Class8toStringFZAa _D6object15TypeInfo_Struct5tsizeFZk 
_D6object15TypeInfo_Struct6equalsFPvPvZi _D6object15TypeInfo_Struct7compareFPvPvZi _D6object15TypeInfo_Struct7getHashFPvZk _D6object15TypeInfo_Struct8toStringFZAa _D6object16TypeInfo_Pointer4swapFPvPvZv _D6object16TypeInfo_Pointer5tsizeFZk _D6object16TypeInfo_Pointer6equalsFPvPvZi _D6object16TypeInfo_Pointer7compareFPvPvZi _D6object16TypeInfo_Pointer7getHashFPvZk _D6object16TypeInfo_Pointer8toStringFZAa _D6object16TypeInfo_Typedef4swapFPvPvZv _D6object16TypeInfo_Typedef5tsizeFZk _D6object16TypeInfo_Typedef6equalsFPvPvZi _D6object16TypeInfo_Typedef7compareFPvPvZi _D6object16TypeInfo_Typedef7getHashFPvZk _D6object16TypeInfo_Typedef8toStringFZAa _D6object17TypeInfo_Delegate5tsizeFZk _D6object17TypeInfo_Delegate8toStringFZAa _D6object17TypeInfo_Function5tsizeFZk _D6object17TypeInfo_Function8toStringFZAa _D6object20TypeInfo_StaticArray4swapFPvPvZv _D6object20TypeInfo_StaticArray5tsizeFZk _D6object20TypeInfo_StaticArray6equalsFPvPvZi _D6object20TypeInfo_StaticArray7compareFPvPvZi _D6object20TypeInfo_StaticArray7getHashFPvZk _D6object20TypeInfo_StaticArray8toStringFZAa _D6object25TypeInfo_AssociativeArray5tsizeFZk _D6object25TypeInfo_AssociativeArray8toStringFZAa _D6object5Error5_ctorFAaZC6object5Error _D6object6Object5opCmpFC6ObjectZi _D6object6Object5printFZv _D6object6Object6toHashFZk _D6object6Object8opEqualsFC6ObjectZi _D6object6Object8toStringFZAa _D6object8TypeInfo4swapFPvPvZv _D6object8TypeInfo5opCmpFC6ObjectZi _D6object8TypeInfo5tsizeFZk _D6object8TypeInfo6equalsFPvPvZi _D6object8TypeInfo6toHashFZk _D6object8TypeInfo7compareFPvPvZi _D6object8TypeInfo7getHashFPvZk _D6object8TypeInfo8opEqualsFC6ObjectZi _D6object9Exception5_ctorFAaZC9Exception _D6object9Exception5printFZv _D6object9Exception8toStringFZAa _DeleteCriticalSection 4 _DeleteFileA 4 _DuplicateHandle 28 _EnterCriticalSection 4 _ExitProcess 4 _ExitThread 4 _FREEBUF _FileTimeToDosDateTime 12 _FindClose 4 _FindFirstFileA 8 _FindNextFileA 8 _FreeEnvironmentStringsA 4 _GetACP 0 _GetCPInfo 8 _GetCommandLineA 0 
_GetCurrentProcess 0 _GetCurrentThread 0 _GetCurrentThreadId 0 _GetEnvironmentStrings 0 _GetFileType 4 _GetLastError 0 _GetModuleFileNameA 12 _GetModuleHandleA 4 _GetOEMCP 0 _GetStdHandle 4 _GetStringTypeA 20 _GetThreadContext 8 _GetTickCount 0 _GetVersion 0 _GlobalAlloc 8 _GlobalFree 4 _HKEY_CLASSES_ROOT _HKEY_CURRENT_CONFIG _HKEY_CURRENT_USER _HKEY_DYN_DATA _HKEY_LOCAL_MACHINE _HKEY_PERFORMANCE_DATA _HKEY_USERS _HWND_DESKTOP _IDC_ARROW _IDC_CROSS _IDI_APPLICATION _INVALID_FILE_SIZE _INVALID_HANDLE_VALUE _INVALID_SET_FILE_POINTER _InitializeCriticalSection 4 _LCMapStringA 24 _LeaveCriticalSection 4 _MAILSLOT_NO_MESSAGE _MAILSLOT_WAIT_FOREVER _MessageBoxA 16 _MultiByteToWideChar 24 _REG_CREATED_NEW_KEY _REG_OPENED_EXISTING_KEY _RTLPoolCreate _RaiseException 16 _ReadFile 20 _ReleaseSemaphore 12 _ResumeThread 4 _RtlUnwind 16 _SetConsoleCtrlHandler 8 _SetFilePointer 16 _SetHandleCount 4 _SetThreadPriority 8 _SetUnhandledExceptionFilter 4 _SuspendThread 4 _ThreadStarted _UnhandledExceptionFilter 4 _VirtualAlloc 16 _VirtualFree 12 _WaitForSingleObject 8 _WideCharToMultiByte 32 _WriteConsoleA 20 _WriteFile 20 __8087 __8087_init __80x87 __87TOPSW __Class_10ModuleInfo __Class_10TypeInfo_a __Class_10TypeInfo_k __Class_11TypeInfo_Aa __Class_13TypeInfo_Enum __Class_14TypeInfo_Array __Class_14TypeInfo_Class __Class_15TypeInfo_Struct __Class_16TypeInfo_Pointer __Class_16TypeInfo_Typedef __Class_17TypeInfo_Delegate __Class_17TypeInfo_Function __Class_20TypeInfo_StaticArray __Class_25TypeInfo_AssociativeArray __Class_3gcx2GC __Class_3gcx6GCLock __Class_3std10moduleinit15ModuleCtorError __Class_3std11outofmemory20OutOfMemoryException __Class_3std3utf12UtfException __Class_3std3utf8UtfError __Class_3std5array16ArrayBoundsError __Class_3std6string15StringException __Class_3std6thread11ThreadError __Class_3std6thread6Thread __Class_6Object __Class_6object5Error __Class_8TypeInfo __Class_9ClassInfo __Class_9Exception __D3std6thread6Thread11threadstartWPvZk 4 __DBLINT87 __DBLLNG87 
__DBLTO87 __DOSIGN __DTST87 __DestroySemaphore __Dmain __Exit __FCOMPP __FEEXCEPT __FE_DFL_ENV __FLOATCVT __FLTTO87 __FTEST0 __FTEST __InitSemaphoreSys __LCMP __LDIV __LMUL __ModuleInfo_3adi __ModuleInfo_3gcx __ModuleInfo_3std10moduleinit __ModuleInfo_3std2gc __ModuleInfo_3std3utf __ModuleInfo_3std5ctype __ModuleInfo_3std5stdio __ModuleInfo_3std6format __ModuleInfo_3std6string __ModuleInfo_3std6thread __ModuleInfo_3std8typeinfo2Aa __ModuleInfo_6aApply __ModuleInfo_6dmain2 __ModuleInfo_6gcbits __ModuleInfo_6object __ModuleInfo_8arraycat __ReleaseSemaphore __SCANFLOAT __SET_ERRNO __SET_NT_ERRNO __STD_critical __STD_init_std_files __STD_monitor_staticdtor __STD_signal __STI_critical __STI_fltused __STI_init_std_files __STI_io_ctor __STI_monitor_staticctor __STI_signal __ULDIV __WDOSIGN __WSCANFLOAT __WaitSemaphore ___CPPExceptionFilter 4 ___SD?%___heap32_multpool_cpp1525433864_ ___SI?%___heap32_multpool_cpp1525433864_ ___alloca ___argc ___argv ___build_environ ___callve ___cpp_init ___cpp_init_ptr ___faterr ___fcloseall ___fhnd_info ___floatfmt ___fp_lock ___fp_sigfunc ___fp_unlock ___fpclassify_ld ___init_environ_ptr ___init_mbctype_ptr ___initmbctype ___itoa ___locale_chars ___locale_codepage ___locale_decimal_const ___locale_decpoint ___locale_ident ___locale_mbsize ___mb_cpinfo ___mb_cpinfoN ___mbcodepage ___mbcpinfo ___mblcid ___module_handle ___pfloatfmt ___pscanfloat ___rtl_clean_cppexceptions ___rtl_criticalsection ___rtl_init_cppexceptions ___rtti?AVType_info ___rtti?AVbad_cast std ___rtti?AVbad_exception std ___rtti?AVbad_typeid std ___rtti?AVexception std ___rtti?AVtype_info std ___sigill_sigfunc ___sigsegv_sigfunc ___terminate_done ___thdtbl ___threadstartex 4 ___ti?AVType_info ___ti?AVbad_cast std ___ti?AVbad_exception std ___ti?AVbad_typeid std ___ti?AVexception std ___ti?AVtype_info std ___tiX ___unmangle_vc_exception_type ___ve_debughook ___wargv ___wfloatfmt ___wildcard ___wpfloatfmt ___wpscanfloat ___write_fp ___xcfilter __aApplycd1 __aApplycd2 
__acmdln __acrtused_con __adDup __adEq __addthreadtableentry __alloca_probe __allocinit __argc __argv __assert __atexit_max __atexit_tbl __atexitp __atopsp __beginthreadex __call_except_block __chkstack __chkstk __cinit __clear87 __clearbss __control87 __cpp_framehandler __cppeh_sv __cpumode __ctype __d_OutOfMemory __d_arrayappend __d_arrayappendc __d_arraycat __d_arraycatn __d_arraysetlength __d_callfinalizer __d_create_exception_object __d_delmemory __d_exception_filter __d_framehandler __d_isbaseof __d_local_unwind __d_local_unwind2 __d_monitor_epilog __d_monitor_handler __d_monitor_prolog __d_monitorenter __d_monitorexit __d_monitorrelease __d_new __d_newarrayi __d_newclass __d_throw 4 __d_translate_se_to_d_exception __dodtors __dos_sethandlecount __doserrno __edata __end __endthreadex __environ __envptr __errno_set __except_handler3 __except_list __fatexit __fcloseallp __fe_cur_env __fgetc_nlock __fillbuf __fltused __flushbu __flushterm __fpreset __fputc_nlock __getthreaddata __global_unwind __hdlSemHandles __hookexitp __iSemLockCtrs __iSemNestCount __iSemThreadIds __imp__CloseHandle 4 __imp__CreateSemaphoreA 16 __imp__CreateThread 24 __imp__DeleteCriticalSection 4 __imp__DeleteFileA 4 __imp__DuplicateHandle 28 __imp__EnterCriticalSection 4 __imp__ExitProcess 4 __imp__ExitThread 4 __imp__FileTimeToDosDateTime 12 __imp__FindClose 4 __imp__FindFirstFileA 8 __imp__FindNextFileA 8 __imp__GetACP 0 __imp__GetCPInfo 8 __imp__GetCommandLineA 0 __imp__GetCurrentProcess 0 __imp__GetCurrentThread 0 __imp__GetCurrentThreadId 0 __imp__GetFileType 4 __imp__GetLastError 0 __imp__GetModuleFileNameA 12 __imp__GetModuleHandleA 4 __imp__GetOEMCP 0 __imp__GetStdHandle 4 __imp__GetStringTypeA 20 __imp__GetThreadContext 8 __imp__GetTickCount 0 __imp__InitializeCriticalSection 4 __imp__LCMapStringA 24 __imp__LeaveCriticalSection 4 __imp__MessageBoxA 16 __imp__MultiByteToWideChar 24 __imp__RaiseException 16 __imp__ReleaseSemaphore 12 __imp__ResumeThread 4 __imp__SetConsoleCtrlHandler 
8 __imp__SetFilePointer 16 __imp__SetHandleCount 4 __imp__SetThreadPriority 8 __imp__SetUnhandledExceptionFilter 4 __imp__SuspendThread 4 __imp__UnhandledExceptionFilter 4 __imp__VirtualAlloc 16 __imp__VirtualFree 12 __imp__WaitForSingleObject 8 __imp__WideCharToMultiByte 32 __imp__WriteConsoleA 20 __init_10ModuleInfo __init_10TypeInfo_a __init_10TypeInfo_k __init_11TypeInfo_Aa __init_13TypeInfo_Enum __init_14TypeInfo_Array __init_14TypeInfo_Class __init_15TypeInfo_Struct __init_16TypeInfo_Pointer __init_16TypeInfo_Typedef __init_17TypeInfo_Delegate __init_17TypeInfo_Function __init_20TypeInfo_StaticArray __init_25TypeInfo_AssociativeArray __init_3adi5Array __init_3gcx2GC __init_3gcx3Gcx __init_3gcx4List __init_3gcx4Pool __init_3gcx5Range __init_3gcx6GCLock __init_3std10moduleinit15ModuleCtorError __init_3std11outofmemory20OutOfMemoryException __init_3std1c7windows7windows10BITMAPINFO __init_3std1c7windows7windows10LOGPALETTE __init_3std1c7windows7windows10OVERLAPPED __init_3std1c7windows7windows10SYSTEMTIME __init_3std1c7windows7windows11DLGTEMPLATE __init_3std1c7windows7windows11PAINTSTRUCT __init_3std1c7windows7windows11TEXTMETRICA __init_3std1c7windows7windows11WNDCLASSEXA __init_3std1c7windows7windows12MEMORYSTATUS __init_3std1c7windows7windows12PALETTEENTRY __init_3std1c7windows7windows13OPENFILENAMEA __init_3std1c7windows7windows13OPENFILENAMEW __init_3std1c7windows7windows15WIN32_FIND_DATA __init_3std1c7windows7windows16BITMAPINFOHEADER __init_3std1c7windows7windows16WIN32_FIND_DATAW __init_3std1c7windows7windows18FLOATING_SAVE_AREA __init_3std1c7windows7windows19SECURITY_ATTRIBUTES __init_3std1c7windows7windows21PIXELFORMATDESCRIPTOR __init_3std1c7windows7windows21TIME_ZONE_INFORMATION __init_3std1c7windows7windows24MEMORY_BASIC_INFORMATION __init_3std1c7windows7windows3MSG __init_3std1c7windows7windows4RECT __init_3std1c7windows7windows5POINT __init_3std1c7windows7windows6BITMAP __init_3std1c7windows7windows6LOGPEN __init_3std1c7windows7windows7CONTEXT 
__init_3std1c7windows7windows7RGBQUAD __init_3std1c7windows7windows8FILETIME __init_3std1c7windows7windows8LOGFONTA __init_3std1c7windows7windows9WNDCLASSA __init_3std2gc5Array __init_3std3utf12UtfException __init_3std3utf8UtfError __init_3std5array16ArrayBoundsError __init_3std6string15StringException __init_3std6thread11ThreadError __init_3std6thread6Thread __init_6Object __init_6gcbits6GCBits __init_6object5Error __init_6object9Interface __init_8TypeInfo __init_9ClassInfo __init_9Exception __iob __isctype __ismbblead __ismbcdigit __local_except_handler __local_unwind __mbctype __mbschr __minit __moddtor_3std6thread __moduleCtor __moduleDtor __moduleUnitTests __moduleinfo_array __nullext __opti_stosd __osfhnd __osmode __osver __pCPPExceptionFilter __pPreviousUnhandledExceptionFilter __pctype __pformat __pgmptr __pwctype __rtl_critical_enter __rtl_critical_exit __semerr __setargv __setenvp __setmbcp __status87 __szSemPrefix __szSemPrefixLen __tab_size __thread1 __thread_init __vtbl_10ModuleInfo __vtbl_10TypeInfo_a __vtbl_10TypeInfo_k __vtbl_11TypeInfo_Aa __vtbl_13TypeInfo_Enum __vtbl_14TypeInfo_Array __vtbl_14TypeInfo_Class __vtbl_15TypeInfo_Struct __vtbl_16TypeInfo_Pointer __vtbl_16TypeInfo_Typedef __vtbl_17TypeInfo_Delegate __vtbl_17TypeInfo_Function __vtbl_20TypeInfo_StaticArray __vtbl_25TypeInfo_AssociativeArray __vtbl_3gcx2GC __vtbl_3gcx6GCLock __vtbl_3std10moduleinit15ModuleCtorError __vtbl_3std11outofmemory20OutOfMemoryException __vtbl_3std3utf12UtfException __vtbl_3std3utf8UtfError __vtbl_3std5array16ArrayBoundsError __vtbl_3std6string15StringException __vtbl_3std6thread11ThreadError __vtbl_3std6thread6Thread __vtbl_6Object __vtbl_6object5Error __vtbl_8TypeInfo __vtbl_9ClassInfo __vtbl_9Exception __wargv __wenviron __wenvptr __win32_faterr __win32_stderrmsg __winmajor __winminor __winver __wpgmptr __xc_a __xc_z __xi_a __xi_z __xp_a __xp_z __xt_a __xt_z _abort _calloc _close _edata _end _errno _exit _fclose _fegetenv _fesetenv _fflush _fgetc _findfirst 
_findnext _flushall _fprintf _fputc _free _gc_init _gc_term _isatty _itoa _lseek _main _mainCRTStartup _malloc _mbtowc _memchr _memcmp _memcpy _memicmp _memmove _memset _no_catch_exceptions _printf _raise _read _realloc _remove _sbrk _setvbuf _signal _sprintf _strcat _strchr _strcmp _strcpy _strdup _strlen _strtold _tolower _ultoa _unmangle_ident _unmangle_pt _vfprintf _vprintf _vsprintf _wcscpy _wcslen _wctomb _write roundto0
----------------

So either the map file is wrong (or I misunderstood it), the linker is not stripping out 'unreferenced' functions, or there are many more referenced functions than you suspected.

-- 
Derek (skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocracy!"
7/04/2006 9:39:07 AM
Apr 06 2006
prev sibling parent reply Thomas Kuehne <thomas-dloop kuehne.cn> writes:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Walter Bright wrote on 2006-04-06:
 Although there is a lot of code in std.string, unreferenced free 
 functions in it should be discarded by the linker. A check of the 
 generated .map file should verify this - it is certainly supposed to 
 work that way.

That's not what is happening on Linux:
 extern(C) int printf(char* x, ...){
	*(cast(char*)0) = 'a';
 }

 int main(){
	return 0;
 }

Symbols present in the executable from std.string:
_D3std6string10capitalizeFAaZAa _D3std6string10countcharsFAaAaZk _D3std6string10expandtabsFAaiZAa _D3std6string10splitlinesFAaZAAa _D3std6string11removecharsFAaAaZAa _D3std6string12replaceSliceFAaAaAaZAa _D3std6string15StringException5_ctorFAaZC3std6string15StringException _D3std6string2trFAaAaAaAaZAa _D3std6string3cmpFAaAaZi _D3std6string4atofFAaZe _D3std6string4atoiFAaZl _D3std6string4chopFAaZAa _D3std6string4findFAaAaZi _D3std6string4findFAawZi _D3std6string4icmpFAaAaZi _D3std6string4joinFAAaAaZAa _D3std6string4succFAaZAa _D3std6string4wrapFAaiAaAaiZAa _D3std6string5chompFAaAaZAa _D3std6string5countFAaAaZk _D3std6string5entabFAaiZAa _D3std6string5ifindFAaAaZi _D3std6string5ifindFAawZi _D3std6string5rfindFAaAaZi _D3std6string5rfindFAawZi _D3std6string5splitFAaAaZAAa _D3std6string5splitFAaZAAa _D3std6string5stripFAaZAa _D3std6string5zfillFAaiZAa _D3std6string6abbrevFAAaZHAaAa _D3std6string6centerFAaiZAa _D3std6string6columnFAaiZi _D3std6string6formatFYAa _D3std6string6insertFAakAaZAa _D3std6string6irfindFAaAaZi _D3std6string6irfindFAawZi _D3std6string6repeatFAakZAa _D3std6string6striplFAaZAa _D3std6string6striprFAaZAa _D3std6string7iswhiteFwZi _D3std6string7replaceFAaAaAaZAa _D3std6string7sformatFAaYAa _D3std6string7soundexFAaAaZAa _D3std6string7squeezeFAaAaZAa _D3std6string7toCharzFAaZPa _D3std6string7tolowerFAaZAa _D3std6string7toupperFAaZAa _D3std6string8capwordsFAaZAa _D3std6string8ljustifyFAaiZAa _D3std6string8rjustifyFAaiZAa _D3std6string8toStringFPaZAa _D3std6string8toStringFaZAa _D3std6string8toStringFcZAa _D3std6string8toStringFdZAa _D3std6string8toStringFeZAa _D3std6string8toStringFfZAa _D3std6string8toStringFgZAa _D3std6string8toStringFhZAa _D3std6string8toStringFiZAa _D3std6string8toStringFjZAa _D3std6string8toStringFkZAa _D3std6string8toStringFlZAa _D3std6string8toStringFlkZAa _D3std6string8toStringFmZAa _D3std6string8toStringFmkZAa _D3std6string8toStringFoZAa _D3std6string8toStringFpZAa 
_D3std6string8toStringFqZAa _D3std6string8toStringFrZAa _D3std6string8toStringFsZAa _D3std6string8toStringFtZAa _D3std6string8toStringFxZAa _D3std6string9inPatternFwAAaZi _D3std6string9inPatternFwAaZi _D3std6string9isNumericFAC8TypeInfoPvZx _D3std6string9isNumericFAaxZx _D3std6string9isNumericFYx _D3std6string9maketransFAaAaZAa _D3std6string9toStringzFAaZPa _D3std6string9translateFAaAaAaZAa

dmd a.d -L--cref:

internal/arraycat.d:101
current:
 throw new Error(std.string.format("lengths don't match for array copy,
 %s = %s", to.length, from.length));

suggested:
 throw new Error("lengths don't match for array copy," ~
 toString(to.length) ~ " = " ~ toString(from.length));

Thomas

-----BEGIN PGP SIGNATURE-----

iD8DBQFENtSR3w+/yD4P9tIRAo7eAJwKqOKEYb8/LPQ0E+wTSZk4yCBnzgCfX0Mi
xuBx2eSSsWztBTJWcHjxuRc=
=dajr
-----END PGP SIGNATURE-----
Apr 07 2006
next sibling parent Walter Bright <newshound digitalmars.com> writes:
Thomas Kuehne wrote:
 Walter Bright wrote on 2006-04-06:
 Although there is a lot of code in std.string, unreferenced free 
 functions in it should be discarded by the linker. A check of the 
 generated .map file should verify this - it is certainly supposed to 
 work that way.

That's not what is happening on Linux:

 dmd a.d -L--cref:

Should compile with -O -release to check this.
 internal/arraycat.d:101
 current:
 throw new Error(std.string.format("lengths don't match for array copy,
 %s = %s", to.length, from.length));

suggested:
 throw new Error("lengths don't match for array copy," ~
 toString(to.length) ~ " = " ~ toString(from.length));


Thanks, I'll do that.
Apr 07 2006
prev sibling parent reply Walter Bright <newshound digitalmars.com> writes:
Thomas Kuehne wrote:
 Walter Bright wrote on 2006-04-06:
 Although there is a lot of code in std.string, unreferenced free 
 functions in it should be discarded by the linker. A check of the 
 generated .map file should verify this - it is certainly supposed to 
 work that way.

That's not what is happening on Linux:

Hmm. I tried --gc-symbols, which is supposed to do it, but nothing happens (even when I try it on C++ files). Can you try --gc-symbols on your system?
Apr 07 2006
parent reply Thomas Kuehne <thomas-dloop kuehne.cn> writes:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Walter Bright wrote on 2006-04-07:
 Thomas Kuehne wrote:
 Walter Bright wrote on 2006-04-06:
 Although there is a lot of code in std.string, unreferenced free 
 functions in it should be discarded by the linker. A check of the 
 generated .map file should verify this - it is certainly supposed to 
 work that way.

That's not what is happening on Linux:

Hmm. I tried --gc-symbols, which is supposed to do it, but nothing happens (even when I try it on C++ files). Can you try --gc-symbols on your system?

doesn't work: --gc-symbols
seems to work: --gc-sections

Thomas

-----BEGIN PGP SIGNATURE-----

iD8DBQFENxYL3w+/yD4P9tIRAsRUAJ9/D9yEGexwjIqZT0HPWoINl3OLvgCgzmyT
Om/028iE19za+NMPBQMi8os=
=S9NM
-----END PGP SIGNATURE-----
Apr 07 2006
next sibling parent Walter Bright <newshound digitalmars.com> writes:
Thomas Kuehne wrote:
 -----BEGIN PGP SIGNED MESSAGE-----
 Hash: SHA1
 
 Walter Bright wrote on 2006-04-07:
 Thomas Kuehne wrote:
 Walter Bright wrote on 2006-04-06:
 Although there is a lot of code in std.string, unreferenced free 
 functions in it should be discarded by the linker. A check of the 
 generated .map file should verify this - it is certainly supposed to 
 work that way.


Hmm. I tried --gc-symbols, which is supposed to do it, but nothing happens (even when I try it on C++ files). Can you try --gc-symbols on your system?

doesn't work: --gc-symbols
seems to work: --gc-sections

Right, I typo'd it. However, --gc-sections doesn't work on my system - everything is still linked in. But it does on yours - not all of std.string is linked in?
Apr 07 2006
prev sibling parent reply Walter Bright <newshound digitalmars.com> writes:
Thomas Kuehne wrote:
 Walter Bright wrote on 2006-04-07:
 Thomas Kuehne wrote:
 Walter Bright wrote on 2006-04-06:
 Although there is a lot of code in std.string, unreferenced free 
 functions in it should be discarded by the linker. A check of the 
 generated .map file should verify this - it is certainly supposed to 
 work that way.


Hmm. I tried --gc-symbols, which is supposed to do it, but nothing happens (even when I try it on C++ files). Can you try --gc-symbols on your system?

doesn't work: --gc-symbols
seems to work: --gc-sections

Also, could you post the full, exact command you used? Perhaps I have botched that up, too.
Apr 07 2006
parent reply Thomas Kuehne <thomas-dloop kuehne.cn> writes:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Walter Bright wrote on 2006-04-08:
 Thomas Kuehne wrote:
 Walter Bright wrote on 2006-04-07:
 Thomas Kuehne wrote:
 Walter Bright wrote on 2006-04-06:
 Although there is a lot of code in std.string, unreferenced free
 functions in it should be discarded by the linker. A check of the
 generated .map file should verify this - it is certainly supposed to
 work that way.


Hmm. I tried --gc-symbols, which is supposed to do it, but nothing happens (even when I try it on C++ files). Can you try --gc-symbols on your system?

doesn't work: --gc-symbols
seems to work: --gc-sections


 Right, I typo'd it. However, --gc-sections doesn't work on my system -
 everything is still linked in. But it does on yours - not all of
 std.string is linked in?

Correct
 Also, could you post the full, exact command you used? Perhaps I have
 botched that up, too.

$ dmd a.d -ofx
gcc a.o -o x -lphobos -lpthread -lm -Xlinker -L/opt/dmd/dmd/lib

$ dmd a.d -ofy -L--gc-sections
gcc a.o -o y -lphobos -lpthread -lm -Xlinker -L/opt/dmd/dmd/lib -Xlinker --gc-sections

$ ls -sh1 x y
212K x
136K y

$ objdump -t y | awk "{ print (\$4); }" | grep "6string" | sed 's/\.gnu\.linkonce\.t//' | sort -u
_D3std6string3cmpFAaAaZi
_D3std6string4findFAawZi
_D3std6string6columnFAaiZi
_D3std6string7iswhiteFwZi
_D3std6string8toStringFkZAa
_D3std6string9inPatternFwAaZi

$ gcc --version
gcc (GCC) 3.4.4 (Gentoo 3.4.4-r1, ssp-3.4.4-1.0, pie-8.7.8)

$ ld --version
GNU ld version 2.16.1

Thomas

-----BEGIN PGP SIGNATURE-----

iD8DBQFEN2Nb3w+/yD4P9tIRArpyAKC0W5zNDTYnugmWjQE5gj09Ww0ZLgCgxsER
jlALdPoz8gHsb2Bw6DWDQ5A=
=8ZHQ
-----END PGP SIGNATURE-----
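For readers without awk and sed to hand, the symbol-filtering pipeline above can be approximated in a few lines of Python. This is only a sketch: the function name `std_string_symbols` and the sample text are made up for illustration, and it assumes, as the pipeline does, that the fourth whitespace-separated field of each `objdump -t` line carries the section name.

```python
def std_string_symbols(objdump_output, needle="6string"):
    """Rough equivalent of:
    awk '{print $4}' | grep needle | sed 's/.gnu.linkonce.t//' | sort -u
    """
    found = set()
    for line in objdump_output.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # awk would print an empty line here; grep drops it
        name = fields[3]
        if needle not in name:
            continue
        # Like sed without the g flag: strip only the first occurrence.
        found.add(name.replace(".gnu.linkonce.t", "", 1))
    return sorted(found)


if __name__ == "__main__":
    # Hypothetical objdump-style lines, not real output.
    sample = (
        "0804a000 l    d  .gnu.linkonce.t_D3std6string3cmpFAaAaZi 00000000 x\n"
        "0804b000 l    d  .text 00000000 .text\n"
    )
    for symbol in std_string_symbols(sample):
        print(symbol)
```

Running the same filter over the output for both `x` and `y` and comparing the two lists would show which std.string functions `--gc-sections` removed.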
Apr 07 2006
parent reply Walter Bright <newshound digitalmars.com> writes:
Thomas Kuehne wrote:
 Walter Bright wrote on 2006-04-08:
 Right, I typo'd it. However, --gc-sections doesn't work on my system -
 everything is still linked in. But it does on yours - not all of
 std.string is linked in?

Correct

This is good news. The documentation on --gc-sections is pretty sparse, I wasn't sure it was intended to do that, and it does nothing on my system.
 $ ld --version
 GNU ld version 2.16.1

I'm using ld 2.13. The release notes say that --gc-sections works for it, but I guess it doesn't actually :-(. Anyhow, it is good news that it is working on your system. I'll modify dmd to pass the --gc-sections command through to the linker, and I'll not worry about older ld's ignoring the switch. Thanks for helping out with this.
Apr 08 2006
parent reply Dave <Dave_member pathlink.com> writes:
In article <e17um0$1hts$1 digitaldaemon.com>, Walter Bright says...
Thomas Kuehne wrote:
 Walter Bright wrote on 2006-04-08:
 Right, I typo'd it. However, --gc-sections doesn't work on my system -
 everything is still linked in. But it does on yours - not all of
 std.string is linked in?

Correct

This is good news. The documentation on --gc-sections is pretty sparse, I wasn't sure it was intended to do that, and it does nothing on my system.
 $ ld --version
 GNU ld version 2.16.1

I'm using ld 2.13. The release notes say that --gc-sections works for it, but I guess it doesn't actually :-(. Anyhow, it is good news that it is working on your system. I'll modify dmd to pass the --gc-sections command through to the linker, and I'll not worry about older ld's ignoring the switch. Thanks for helping out with this.

Might want to take a look at the gcc '-s' switch as well. Man page (under gcc linking options):

-s  Remove all symbol table and relocation information from the executable.

It dramatically reduces the size of the executables on my system. Don't have much experience with it yet though.

- Dave
Apr 08 2006
parent reply Walter Bright <newshound digitalmars.com> writes:
Dave wrote:
 Might want to take a look at the gcc '-s' switch as well. Man page (under gcc
 linking options):
 
 -s  Remove all symbol table and relocation information from the executable.
 
 It dramatically reduces the size of the executables on my system. Don't have
 much experience with it yet though.

I would rather that not be done automatically, as often the symbols are needed. I think the user can run strip afterwards anyway to get the same result.
Apr 08 2006
parent Dave <Dave_member pathlink.com> writes:
Walter Bright wrote:
 Dave wrote:
 Might want to take a look at the gcc '-s' switch as well. Man page 
 (under gcc
 linking options):

 -s  Remove all symbol table and relocation information from the 
 executable.

 It dramatically reduces the size of the executables on my system. 
 Don't have
 much experience with it yet though.

I would rather that not be done automatically, as often the symbols are needed. I think the user can run strip afterwards anyway to get the same result.

Ok - just thought I'd throw that out there. - Dave
Apr 08 2006
prev sibling next sibling parent reply Georg Wrede <georg.wrede nospam.org> writes:
Walter Bright wrote:
 Georg Wrede wrote:
 
 I admit this is a "feelings based" thing with most people I've talked 
 with. It seems that on embedded platforms, many expect to write all 
 the needed code themselves. It's also felt (possibly unduly??) that 
 Phobos (or whatever general Win+*nix standard library) is mostly 
 useless in embedded applications.

I'd like to get to the bottom of this feeling. For example, Kris was unhappy that typeinfo imported std.string. I can't figure out what the problem with that is.
 To give a parallel (to explain my view here): There are many Linux 
 distributions that are compiled with 386 as target. At the same time, 
 their specs for memory, clock speed, etc. _in_practice_ rule out any 
 machine not using recent Intel processors. I see this as a joke.

 Call this inconsistent specs. I'm discussing here so D would avoid 
 this kind of inconsistencies.

For the embedded people I've talked with, D without floating point would have been a good match.

Uh-oh, after having read what Kris and others have posted as replies to your post, I can't push D for embedded development. At least until the issues they've brought up are resolved.
 Insisting on not needing hardware FP is ok. But to legitimize that, 
 one has to cater to scarce resources in other areas too. Conversely, 
 not genuinely making the language usable in smaller environments, 
 makes striving to independence of FPU not worth the effort and 
 inconvenience.


Apr 06 2006
next sibling parent reply Sean Kelly <sean f4.ca> writes:
Georg Wrede wrote:
 
 Uh-oh, after having read what Kris and others have posted as replies to 
 your post, I can't push D for embedded development. At least until the 
 issues they've brought up are resolved.

For what it's worth, the D spec doesn't require any of the behavior Kris has been talking about. I'd consider most of it specific to the DMD implementation. Sean
Apr 06 2006
parent kris <foo bar.com> writes:
Sean Kelly wrote:
 Georg Wrede wrote:
 
 Uh-oh, after having read what Kris and others have posted as replies 
 to your post, I can't push D for embedded development. At least until 
 the issues they've brought up are resolved.

For what it's worth, the D spec doesn't require any of the behavior Kris has been talking about. I'd consider most of it specific to the DMD implementation.

Agreed. If there were a good embedded compiler available (via GDC) then I'd certainly use Ares plus an appropriate C lib.
Apr 06 2006
prev sibling parent reply Walter Bright <newshound digitalmars.com> writes:
Georg Wrede wrote:
 Uh-oh, after having read what Kris and others have posted as replies to 
 your post, I can't push D for embedded development. At least until the 
 issues they've brought up are resolved.

Which particular issue?
Apr 06 2006
parent Georg Wrede <georg.wrede nospam.org> writes:
Walter Bright wrote:
 Georg Wrede wrote:
 
 Uh-oh, after having read what Kris and others have posted as
 replies to your post, I can't push D for embedded development. At
 least until the issues they've brought up are resolved.

Which particular issue?

Well,
 For what it's worth, the D spec doesn't require any of the behavior
 Kris has been talking about.  I'd consider most of it specific to the
 DMD implementation.
 
 Sean

when Sean pointed this out, those issues of mine vanished, Poof!

Since DMD is Win+Lin _only_ (IIUC), none of the embedded woes apply.

It's happened to me before, too, that I forget that DMD is not the same as the D spec. And that Phobos is not part of the spec.

So, actually, any discussions about embedded systems belong to D.gnu instead.

---

Still, I do agree with Kris on separation of concern, clear and consistent compartmentalization of dependencies, low level dependencies on high level code, modularization, etc., in Phobos.

Not addressing those issues, at least makes it harder to maintain and develop Phobos in the long run. IMHO.
Apr 07 2006
prev sibling parent reply Fredrik Olsson <peylow treyst.se> writes:
Walter Bright wrote:
 Georg Wrede wrote:
 I admit this is a "feelings based" thing with most people I've talked 
 with. It seems that on embedded platforms, many expect to write all 
 the needed code themselves. It's also felt (possibly unduly??) that 
 Phobos (or whatever general Win+*nix standard library) is mostly 
 useless in embedded applications.

I'd like to get to the bottom of this feeling. For example, Kris was unhappy that typeinfo imported std.string. I can't figure out what the problem with that is.

Importing std.string while only using a single function still gives the impression of needing the whole module.

Perhaps having a module scope of hmm... sys where typeinfo, object and anything needed by compiler, and runtime resides is a good idea. Totally forbid anything in "sys" to import/depend on anything from the outside. That way there would be no question for anyone about "how much is safe to strip"?

And besides, is it wise to depend on what a linker "should do"? If current build chain nicely throws out what is not needed, does that make it right to assume that all build chains should behave as such?

As I see it each module in std should be as self contained as ever possible. I know the std.date I proposed imports std.conv, std.stdio, std.string and std.c.time, but my intent is to not import any of them when finished.

But then my intention was never to bring up the internals to much debate, I wanted to have input on the externals, how you as developers use the code. That was perhaps futile, but I still think my approach of a few but flexible, and overloaded functions is the best approach.

// Fredrik
Apr 07 2006
parent reply Sean Kelly <sean f4.ca> writes:
Fredrik Olsson wrote:
 Walter Bright wrote:
 Georg Wrede wrote:
 I admit this is a "feelings based" thing with most people I've talked 
 with. It seems that on embedded platforms, many expect to write all 
 the needed code themselves. It's also felt (possibly unduly??) that 
 Phobos (or whatever general Win+*nix standard library) is mostly 
 useless in embedded applications.

I'd like to get to the bottom of this feeling. For example, Kris was unhappy that typeinfo imported std.string. I can't figure out what the problem with that is.

Importing std.string while only using a single function still gives the impression of needing the whole module.

As I don't do much embedded programming, the issue for me is somewhat different, though the goals are similar: I want there to be a clean separation or clearly defined interaction between the runtime and standard library code to allow for the link-time integration of third-party standard libraries and garbage collectors. I have little problem with using C library calls in the runtime however, as I see that as largely unavoidable. Rather, my primary concern is that the runtime should not have any compile-time dependencies on standard library or GC code.
 Perhaps having a module scope of hmm... sys where typeinfo, object and 
 anything needed by compiler, and runtime resides is a good idea. Totally 
 forbid anything in "sys" to import/depend on anything from the outside. 
 That way there would be no question for anyone about "how much is safe 
 to strip"?

See Ares for another possible solution.
 And besides, is it wise to depend on what a linker "should do"? If 
 current build chain nicely throws out what is not needed, does that make 
 it right to assume that all build chains should behave as such?

I think this is a reasonable assumption, as to do otherwise necessitates design compromises to keep modules as small and isolated as possible. And while this may be reasonable for small projects, I can't see it working very well for large ones.
 As I see it each module in std should be as self contained as ever 
 possible. I know the std.date I proposed imports std.conv, std.stdio, 
 std.string and std.c.time, but my intent is to not import any of them 
 when finished.

Again, I don't see much of a problem with calling C library functions in general, though I would prefer not doing so if it brings in an entire subsystem that is unrelated to anything done by the client code.
 But then my intention was never to bring up the internals to much 
 debate, I wanted to have input on the externals, how you as developers 
 use the code. That was perhaps futile, but I still think my approach of 
 a few but flexible, and overloaded functions is the best approach.

That leaves a lot of room for interpretation, but I tentatively agree.

Sean
Apr 07 2006
parent reply Walter Bright <newshound digitalmars.com> writes:
Sean Kelly wrote:
 Fredrik Olsson wrote:
 And besides, is it wise to depend on what a linker "should do"? If 
 current build chain nicely throws out what is not needed, does that 
 make it right to assume that all build chains should behave as such?

I think this is a reasonable assumption, as to do otherwise necessitates design compromises to keep modules as small and isolated as possible. And while this may be reasonable for small projects, I can't see it working very well for large ones.

This capability of linkers (eliminating unreferenced functions) first appeared in the late 80's, and quickly became standard practice. If you've got a linker that doesn't support that, you're likely to have many other serious problems with it, as D (and C++) depend on other linker features introduced in the late 80's. D doesn't require anything of a linker that C++ doesn't already realistically require.
Apr 07 2006
parent reply Nic Tiger <g_tiger progtech.ru> writes:
Walter Bright wrote:
 This capability of linkers (eliminating unreferenced functions) first 
 appeared in the late 80's, and quickly became standard practice. If 
 you've got a linker that doesn't support that, you're likely to have 
 many other serious problems with it, as D (and C++) depend on other 
 linker features introduced in the late 80's.
 
 D doesn't require anything of a linker that C++ doesn't already 
 realistically require.

For the last 3 years I have had to use the Microsoft linker (from VS98, VS2003), and it is really foolish (especially compared with the DMC optimizing linker).

I investigated, and it seems the MS linker uses a "greedy" algorithm: it wants every extern symbol declared in a module, even if it is not used, and then walks through all the objects in the libs in search of those symbols; meanwhile it includes all the data found in those objects. After this cycle, the /optref switch tries to drop unreferenced functions, but a lot of data gets linked in anyway, and sometimes this unnecessary data references a lot of code.

Most of the problems we had were with static objects of virtual classes, where the mere existence of such an object added references to all its virtual methods, and so on.

The only remedy for this awful linker strategy was to split modules into *very* small files, manually predict dependencies, and try to prevent the linker from wanting something that nobody else wants. It was hard.

For the previous 3 years with DMC I never needed such handwork. The Digital Mars linker is great in this field, but you definitely cannot assume that all linkers since the late 80's are as smart as DMC.

Nic Tiger.
Apr 08 2006
parent reply Walter Bright <newshound digitalmars.com> writes:
Nic Tiger wrote:
 Walter Bright wrote:
 This capability of linkers (eliminating unreferenced functions) first 
 appeared in the late 80's, and quickly became standard practice. If 
 you've got a linker that doesn't support that, you're likely to have 
 many other serious problems with it, as D (and C++) depend on other 
 linker features introduced in the late 80's.

 D doesn't require anything of a linker that C++ doesn't already 
 realistically require.

 Digital Mars linker is great in this field, but you definitely cannot 
 assume that all linkers since late 80's are as smart as DMC.

I must say I'm surprised that 15+ year old linker technology that is easy to implement and once upon a time was much discussed (idiotically dubbed 'smart linking') seems to have fallen by the wayside. At least the linkers D uses (optlink and newer versions of ld) support it. Optlink does it by default, and ld via a badly underdocumented obscure switch.
Apr 08 2006
parent Sean Kelly <sean f4.ca> writes:
Walter Bright wrote:
 
 I must say I'm surprised that 15+ year old linker technology that is 
 easy to implement and once upon a time was much discussed (idiotically 
 dubbed 'smart linking') seems to have fallen by the wayside.
 
 At least the linkers D uses (optlink and newer versions of ld) support 
 it. Optlink does it by default, and ld via a badly underdocumented 
 obscure switch.

I think this is one cause of the confusion about how Optlink behaves. But I do think that it is quite reasonable to expect such behavior, much as one expects optimization features in compilers. Sean
Apr 08 2006
prev sibling parent Nic Tiger <g_tiger progtech.ru> writes:
Georg Wrede wrote:
 Walter Bright wrote:
 Double has another problem when used as a date - there are embedded 
 processors in wide use that don't have floating point hardware. This
 means that double shouldn't be used in core routines that are not 
 implicitly related to doing floating point calculations.

Ignoring the issue of date, I have a comment on processors: IIRC, D will never be found on a processor less than 32 bits. Further, it may take some time before D actually gets used in something embedded. By that time, IMHO, it is unlikely that a 32b processor would not contain a math unit.

I have developed for UltraSparc (Sparc V8) CPUs (in millions of PowerTV boxes) that don't have an FPU (or, what is much worse, have a damaged FPU due to a faulty manufacturing process). We had to force a software FP implementation when compiling code with GCC; otherwise it could just hang at the first FP instruction encountered.

And since we are talking about double (64-bit), I think a 64-bit integer would be enough to pack time with microsecond accuracy until at least the year 137438.

If we need another (higher) accuracy, a general time/date format is useless anyway.
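The range claim is easy to sanity-check. The sketch below, in Python, assumes the simplest possible layout, a plain signed integer counting ticks from some epoch, which is an assumption and not necessarily the packing the poster had in mind. The helper name `span_years` is made up for illustration.

```python
SECONDS_PER_YEAR = 365.2425 * 86_400  # mean Gregorian year, in seconds


def span_years(bits=64, ticks_per_second=1_000_000):
    """Years representable on one side of the epoch by a signed counter."""
    max_ticks = 2 ** (bits - 1) - 1
    return max_ticks / ticks_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    print("microseconds in int64: about %.0f years" % span_years())
    print("milliseconds in int64: about %.0f years" % span_years(ticks_per_second=1_000))
```

Under that assumption a microsecond counter spans several hundred thousand years in each direction, so the figure quoted above is on the conservative side and the conclusion holds either way.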
Apr 04 2006
prev sibling parent reply Walter Bright <newshound digitalmars.com> writes:
Fredrik Olsson wrote:
 I see your point, and will try to explain why I have chosen double as I 
 have.

I have some more thoughts on using double <g>.
 Using double I get the same scale for timestamps as for 
 dates; the integer part is days.
 
 Having dates as days with times as fractions is also how the astronomers 
 do it, they call it Julian Days, and base it on Monday, January 1, 4713 
 BCE as the epoch. But the idea is the same.
 
 It is datatype used by many database implementations (PostgreSQL, MySQL, 
 MS SQL Server 7 (and beyond?)).

This may be a consequence of Microsoft Basic implementing time as a double. Chicken or egg?
 
 A double can represent infinity, -infinity, and not a number can be not 
 a date.

std.date's d_time offers d_time_nan, which fills the role of nan for times. I don't see a purpose for infinity or -infinity when dealing with calendar dates or file times. There is a purpose for such when doing physics math, but that is way beyond the scope of std.date.
 +-270 years is sort of a limitation :), even a simple genealogy 
 application would hit that limit quite soon. Using a double is based on 
 the idea that the farther away from today, the less relevant is precision.

Double can appear to represent far more precision. But the system clocks give quantized time (usually in millisecond precision). Doubles cannot exactly represent milliseconds, so when you convert from system time to doubles and back to system time, it's very possible that you can get a different system time. This will play havoc with file utilities and programs like make.
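The round-trip concern can be tried out directly. The following Python sketch assumes the proposal's representation (a double holding days, with the time as a fraction of a day) and a millisecond system clock. The helper names and the sample timestamp are made up for illustration. It compares converting back by truncation with converting back by rounding to the nearest millisecond.

```python
MS_PER_DAY = 86_400_000


def to_days(ms):
    """System time in milliseconds -> fractional days stored in a double."""
    return ms / MS_PER_DAY


def back_truncating(days):
    """Convert back the naive way, dropping the fractional millisecond."""
    return int(days * MS_PER_DAY)


def back_rounding(days):
    """Convert back, rounding to the nearest millisecond."""
    return int(round(days * MS_PER_DAY))


def count_mismatches(start_ms, n, back):
    """How many of n consecutive millisecond values fail to round-trip."""
    return sum(1 for ms in range(start_ms, start_ms + n)
               if back(to_days(ms)) != ms)


if __name__ == "__main__":
    start = 1_144_368_000_000  # hypothetical timestamp, roughly April 2006
    print("truncating:", count_mismatches(start, 100_000, back_truncating))
    print("rounding:  ", count_mismatches(start, 100_000, back_rounding))
```

Any mismatch in the truncating variant is exactly the kind of off-by-one-millisecond drift that would confuse make. Rounding hides the problem for present-day magnitudes, but every conversion site has to remember to do it.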
Apr 03 2006
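
The round-trip concern above can be explored with a small experiment. This is a minimal sketch, assuming timestamps are stored as fractional days in a double and system time is a 64-bit millisecond count. All names (`msPerDay`, `toDays`, `toMsTruncated`, `toMsRounded`, `countTruncationMismatches`) are hypothetical and not part of std.date or the proposal.

```d
import std.math : round;

enum double msPerDay = 86_400_000.0;

/// Milliseconds since some epoch -> fractional days stored in a double.
double toDays(long ms) { return ms / msPerDay; }

/// Fractional days -> milliseconds, truncating toward zero.
long toMsTruncated(double days) { return cast(long)(days * msPerDay); }

/// Fractional days -> milliseconds, rounding to the nearest millisecond.
long toMsRounded(double days) { return cast(long) round(days * msPerDay); }

/// Counts how many of `count` consecutive millisecond values, starting at
/// `start`, do not survive a truncating round trip through a double.
size_t countTruncationMismatches(long start, size_t count)
{
    size_t mismatches = 0;
    foreach (i; 0 .. count)
    {
        immutable long ms = start + cast(long) i;
        if (toMsTruncated(toDays(ms)) != ms)
            ++mismatches;
    }
    return mismatches;
}

unittest
{
    // At this magnitude the accumulated error is far below half a
    // millisecond, so rounding to nearest recovers every value.
    foreach (long ms; 1_144_000_000_000 .. 1_144_000_100_000)
        assert(toMsRounded(toDays(ms)) == ms);
}
```

Calling `countTruncationMismatches(1_144_000_000_000, 100_000)` reports how often a naive truncating conversion returns a different millisecond. The count may depend on the platform, since intermediate results may be kept at higher precision. Rounding hides the problem at this magnitude, but every conversion site has to remember to do it, which an integer representation avoids.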
parent reply Fredrik Olsson <peylow treyst.se> writes:
Walter Bright skrev:
 Fredrik Olsson wrote:
 I see your point, and will try to explain why I have chosen double as 
 I have.

I have some more thoughts on using double <g>.
 Using double I get the same scale for timestamps as for 
 dates; the integer part is days.

 Having dates as days with times as fractions is also how the 
 astronomers do it, they call it Julian Days, and base it on Monday, 
 January 1, 4713 BCE as the epoch. But the idea is the same.

 It is the datatype used by many database implementations (PostgreSQL, 
 MySQL, MS SQL Server 7 (and beyond?)).

This may be a consequence of Microsoft Basic implementing time as a double. Chicken or egg?

OLE/COM/ActiveX components in general are another question. For PostgreSQL I guess they must have a reason. You can choose to use a 64-bit int for timestamps when compiling PostgreSQL, but double is still the default.
 A double can represent infinity, -infinity, and not a number can be 
 not a date.

std.date's d_time offers d_time_nan, which fills the role of nan for times. I don't see a purpose for infinity or -infinity when dealing with calendar dates or file times. There is a purpose for such when doing physics math, but that is way beyond the scope of std.date.

like: isInRange(aDate, now(), infinity); A date way into the future would be just as good for most purposes, but clean and readable code is nice.
 +-270 years is sort of a limitation :), even a simple genealogy 
 application would hit that limit quite soon. Using a double is based 
 on the idea that the farther away from today, the less relevant is 
 precision.

Double can appear to represent far more precision. But the system clocks give quantized time (usually in millisecond precision). Doubles cannot exactly represent milliseconds, so when you convert from system time to doubles and back to system time, it's very possible that you can get a different system time. This will play havoc with file utilities and programs like make.

This, along with floating point not always being supported, has convinced me though. It is rewritten with d_timestamp as a 64-bit long. // Fredrik
Apr 05 2006
parent reply Walter Bright <newshound digitalmars.com> writes:
Fredrik Olsson wrote:
 A double can represent infinity, -infinity, and not a number can be 
 not a date.

std.date's d_time offers d_time_nan, which fills the role of nan for times. I don't see a purpose for infinity or -infinity when dealing with calendar dates or file times. There is a purpose for such when doing physics math, but that is way beyond the scope of std.date.

like: isInRange(aDate, now(), infinity); A date way into the future would be just as good for most purposes, but clean and readable code is nice.

Why not write it as: if (now() <= aDate) ... ?
Apr 05 2006
parent reply Fredrik Olsson <peylow treyst.se> writes:
Walter Bright skrev:
 Fredrik Olsson wrote:
 A double can represent infinity, -infinity, and not a number can be 
 not a date.

std.date's d_time offers d_time_nan, which fills the role of nan for times. I don't see a purpose for infinity or -infinity when dealing with calendar dates or file times. There is a purpose for such when doing physics math, but that is way beyond the scope of std.date.

like: isInRange(aDate, now(), infinity); A date way into the future would be just as good for most purposes, but clean and readable code is nice.

Why not write it as: if (now() <= aDate) ... ?

Perhaps a better example:

Item[] itemsInRange(Item[] items, d_date start, d_date end) {
  Item[] ret;
  foreach (Item item; items) {
    if (isInRange(item.date, start, end))
      ret ~= item;
  }
  return ret;
}

Introducing itemsBefore() and itemsAfter() could be done, but less code for the same functionality would be to simply send "infinity" as itemsInRange's start or end. It would then be nice to have a set standard for "what is infinity". Best would be if the properties min and max could be defined for typedefs, and maybe introduce your own, such as nad for "not a date".

// Fredrik
Apr 05 2006
parent Walter Bright <newshound digitalmars.com> writes:
Fredrik Olsson wrote:
 Perhaps a better example:
 Item[] itemsInRange(Item[] items, d_date start, d_date end) {
   Item[] ret;
   foreach (Item item; items) {
     if (isInRange(item.date, start, end))
       ret ~= item;
   }
   return ret;
 }
 
 Introducing itemsBefore() and itemsAfter() could be done, but less code 
 for the same functionality would be to simply send "infinity" to 
 itemsInRange's start or end. And now it would be nice with a set 
 standard for "what is infinity".

You can use d_time.max and d_time.min. I also don't understand why use isInRange rather than < and >.
 Best would be if the properties min and max could be made for typedefs, 
 and maybe introduce your own, such as nad for "not a date".

There's already a d_time_nan for just that purpose.
Apr 06 2006
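
The two suggestions in this exchange (use the type's min and max as the open ends, and compare with plain relational operators) can be combined. The following is only a sketch, assuming an integer-based timestamp. `Stamp` and `Item` are hypothetical stand-ins and not the actual std.date or proposal types.

```d
// Hypothetical stand-in for an integer-based date/time type.
alias Stamp = long;

struct Item
{
    string name;
    Stamp date;
}

/// Items whose date lies in [start, end]. Omitting a bound leaves that
/// side open, with Stamp.min / Stamp.max playing the role of -infinity
/// and +infinity.
Item[] itemsInRange(Item[] items, Stamp start = Stamp.min, Stamp end = Stamp.max)
{
    Item[] ret;
    foreach (item; items)
    {
        if (start <= item.date && item.date <= end)
            ret ~= item;
    }
    return ret;
}

unittest
{
    auto items = [Item("a", 10), Item("b", 20), Item("c", 30)];
    assert(itemsInRange(items, 15, 25).length == 1);
    assert(itemsInRange(items, 15).length == 2);            // open upper bound
    assert(itemsInRange(items, Stamp.min, 25).length == 2); // open lower bound
    assert(itemsInRange(items).length == 3);                // both ends open
}
```

With default arguments a single function covers the itemsBefore() and itemsAfter() cases without a floating-point infinity. If one of the extreme values is also reserved as a "not a date" marker, it can no longer double as an open bound, so that choice needs care.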