
digitalmars.D - Proposal for custom time string formatting in std.datetime

reply Jonathan M Davis <jmdavisProg gmx.com> writes:
Okay. At the moment, the time point types in std.datetime have functions for 
converting to and from strings of standard formats but not custom formats, so 
functions for that need to be added. I've come up with a proposal for how 
they're going to work and would like some feedback on it.
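
For context, here is a minimal sketch of the standard-format conversions mentioned above, assuming a reasonably current Phobos. It uses DateTime rather than SysTime only to keep the output independent of the local time zone; the custom-format functions under discussion are not shown because they did not exist yet.

```d
import std.datetime : DateTime;
import std.stdio : writeln;

void main()
{
    auto dt = DateTime(2011, 12, 21, 20, 12, 0);

    // Three fixed, standard renderings of the same time point.
    writeln(dt.toISOString());     // 20111221T201200
    writeln(dt.toISOExtString());  // 2011-12-21T20:12:00
    writeln(dt.toSimpleString());  // 2011-Dec-21 20:12:00

    // Parsing accepts the same fixed formats, not arbitrary layouts.
    assert(DateTime.fromISOExtString("2011-12-21T20:12:00") == dt);
}
```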

Originally, I was going to make them work like strftime and strptime, since it 
was my understanding that those functions were fairly standard among various 
programming languages. And it _does_ look like a variety of programming 
languages have something similar (Java, Ruby, Python, etc.), but the exact set 
of flags that they use is not standard, so there _isn't_ really a standard to 
follow, just similar functions across a variety of programming languages. And 
honestly, strftime and strptime aren't very good. They're fairly limited IMHO, 
and the choice of flags is fairly arbitrary, so it seems like a good idea to 
design our own, assuming that we can make something better.

Stewart Gordon has a library that takes a different approach ( 
http://pr.stewartsplace.org.uk/d/sutil/datetime_format.html ). It does away 
with % flags and uses maximal munch, with each of the flags being named such that 
they don't overlap in a way that would make certain combinations of flags 
impossible. It then requires that characters which are not part of the flags be 
surrounded by single quotes. It's an interesting approach, but it isn't as 
flexible as it could be because of its use of maximal munch instead of % flags.
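
To make the maximal-munch idea concrete, here is a rough sketch. The flag names (yyyy, yy, mm, dd) and the function formatDate are invented for illustration and are not taken from Stewart Gordon's library. The scanner always tries the longest flag first and requires all other text to be single-quoted.

```d
import std.algorithm : startsWith;
import std.array : appender;
import std.format : format;

// Hypothetical flags: yyyy, yy, mm, dd. Everything else must be 'quoted'.
string formatDate(string fmt, int year, int month, int day)
{
    auto result = appender!string();
    size_t i = 0;
    while (i < fmt.length)
    {
        auto rest = fmt[i .. $];
        if (rest[0] == '\'')
        {
            // Literal text runs up to the closing quote.
            size_t close = 1;
            while (close < rest.length && rest[close] != '\'')
                ++close;
            if (close == rest.length)
                throw new Exception("unterminated quoted literal");
            result.put(rest[1 .. close]);
            i += close + 1;
        }
        // Maximal munch: the longer flag is tried before its prefix.
        else if (rest.startsWith("yyyy")) { result.put(format("%04d", year)); i += 4; }
        else if (rest.startsWith("yy")) { result.put(format("%02d", year % 100)); i += 2; }
        else if (rest.startsWith("mm")) { result.put(format("%02d", month)); i += 2; }
        else if (rest.startsWith("dd")) { result.put(format("%02d", day)); i += 2; }
        else
            throw new Exception("unquoted character in format string: " ~ rest[0]);
    }
    return result.data;
}

void main()
{
    assert(formatDate("yyyy'-'mm'-'dd", 2011, 12, 21) == "2011-12-21");
    assert(formatDate("dd'/'mm'/'yy", 2011, 12, 21) == "21/12/11");
}
```

The limitation is visible in the sketch: "yyyy" can never mean two-digit year twice in a row, because the longest match always wins.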

So, I've come up with something new which tries to take the best of both. On 
the whole, I think that it's fairly straightforward, and the flags are 
generally recognizable and memorable (though there are a lot). It's also 
definitely extremely flexible (e.g. you can pass it functions to generate 
portions of the string if the existing flags don't get you quite what you 
need). But I'd like some feedback on it before I spend a lot of time on the 
implementation.

This page has the docs for std.datetime with everything else but the proposed 
custom formatting functions for SysTime stripped out of it:

http://jmdavis.github.com/d-programming-language.org/std_datetime.html

So, what do you think?

- Jonathan M Davis
Dec 21 2011
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
My first thought is that std.datetime is already very large. Few will need a 
custom date formatter, so it should be in a separate module to:

1. reduce cognitive load on the programmer

2. reduce the overhead pulled in for every program that may want to use an 
std.datetime function, but not need custom formatting
Dec 21 2011
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/21/2011 8:12 PM, Jonathan M Davis wrote:
 Yes, putting the custom formatting functions in increases the size of the
 module, but I think that if we want to do something about the size of
 std.datetime, it would make more sense to move some of its existing pieces out
 than to not put the custom time formatting on the types themselves.

I would seriously like to change to a "pay only for what you use" model for Phobos. Note that a module can be split into sub-modules without changing the interface for the user.
Dec 21 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/21/2011 10:36 PM, Jonathan M Davis wrote:
 In general, I agree that that's a good policy. How expensive would you
 consider templated functions which aren't used to be with regards to that?
 They don't cost nothing, since they still have to be lexed and parsed, but
 they don't get fully compiled.

The test is to import the module without referencing any functions in it. Check if the resulting executable increases in size.
 Note that a module can be split into sub-modules without changing the
 interface for the user.

You mean create other modules that get publicly imported by one module? I do think that sections of std.datetime would benefit from that.

Yes.
Dec 21 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/21/2011 10:56 PM, Jonathan M Davis wrote:
 On Wednesday, December 21, 2011 22:46:37 Walter Bright wrote:
 On 12/21/2011 10:36 PM, Jonathan M Davis wrote:
 In general, I agree that that's a good policy. How expensive would you
 consider templated functions which aren't used to be with regards to
 that? They don't cost nothing, since they still have to be lexed and
 parsed, but they don't get fully compiled.

The test is to import the module without referencing any functions in it. Check if the resulting executable increases in size.

Isn't that a compiler and/or linker issue? I mean, if _nothing_ is referenced in the module, then shouldn't it never pull anything in and therefore never increase the size of the executable?

We must deal with our imperfect tools as they are. For example, I am not going to rewrite the Linux ld linker, and the OSX linker, and the FreeBSD linker, etc.
Dec 22 2011
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/22/2011 12:14 AM, Jonathan M Davis wrote:
 On Thursday, December 22, 2011 00:05:58 Walter Bright wrote:
 We must deal with our imperfect tools as they are. For example, I am not
 going to rewrite the Linux ld linker, and the OSX linker, and the FreeBSD
 linker, etc.

If the compiler and/or linker don't strip unused symbols, then how on earth is importing the module _not_ going to pull in everything in it save for uninstantiated templates?

There is data that needs to be there but is never symbolically referenced, for example, the exception handler tables. The compiler does the best it can, like emitting one object file per function, but it always must behave conservatively. The ELF object file format isn't going to change, we aren't going to rewrite the GNU linker, nor will we rewrite the back ends of gcc and lcc.

There isn't any one simple rule to avoid unnecessary bloat; the only way is to get somewhat familiar with how things work. For example, using 'class' as a namespace generates a useless vtbl[] and a typeinfo instance, and in the future will generate useless reflection data. It's better to use a module as a namespace, as that is what it is designed for.

The most important thing is to design the boxes around the various units of functionality such that there are minimal lines between those boxes. For example, if I want to see if one file is newer than another (like for a make program), it should not pull in timezone processing or date formatting code.
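
A small sketch of the class-as-namespace point, with made-up names (MathUtil, twice, thrice). It only shows that the runtime type information and vtbl[] exist for a class that is never instantiated; it does not measure executable size.

```d
import std.stdio : writeln;

// A class used purely as a namespace: it is never instantiated, yet a
// TypeInfo instance and a vtbl[] are still generated for it.
final abstract class MathUtil
{
    static int twice(int x) { return 2 * x; }
}

// The same kind of function at module scope needs no class metadata.
int thrice(int x) { return 3 * x; }

void main()
{
    // The vtbl[] is there even though MathUtil declares no virtual
    // functions of its own (it inherits Object's).
    writeln(typeid(MathUtil).vtbl.length);
    assert(typeid(MathUtil).vtbl.length > 0);
    assert(MathUtil.twice(4) == 8 && thrice(4) == 12);
}
```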
Dec 22 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/22/2011 1:04 AM, Jonathan M Davis wrote:
 On Thursday, December 22, 2011 00:54:49 Walter Bright wrote:
 The most important thing is to design the boxes around the various units of
 functionality such that there are minimal lines between those boxes. For
 example, if I want to see if one file is newer than another (like for a make
 program), it should not pull in timezone processing or date formatting
 code.

Both of those are part of SysTime, and the time zone in particular is an integral part of that. You can't do _anything_ with a SysTime without a time zone. That's part of the point of the design. It avoids time conversion issues. Sure, if stuff was rearranged, only TimeZone and LocalTime would have to be pulled in (since LocalTime is the default and TimeZone is its base class), but they have to be there regardless.

Timezone information is not necessary to measure elapsed time or relative time, for example.
 So, in principle, reducing how much has to be pulled in is very much
 desirable, but in this case, it doesn't make sense. The time zone stuff is
 required, and there's a usability issue if you split out the date formatting.
 It's already built into the type.

Can it be added with PIMPL? PIMPL is good for more than just information hiding, it can also be a fine way to avoid pulling in things unless they are actually dynamically used (as opposed to pulling them in if they are statically referenced).
 It's just the custom formatting which isn't
 there yet, but that's going to be templated, so it shouldn't pull in any
 additional symbols unless it's used.

Please check and see if additional symbols are pulled in or not. I've seen a lot of template code where the author of that code never checked and was surprised at the instantiations that were happening that he overlooked.
Dec 22 2011
next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2011-12-22 11:03:13 +0000, Jonathan M Davis <jmdavisProg gmx.com> said:

 On Thursday, December 22, 2011 02:12:31 Walter Bright wrote:
 Timezone information is not necessary to measure elapsed time or relative
 time, for example.

The type requires it though. No, comparison doesn't require the time zone, but many (most?) of the other operations do. And the type can't be separated from the time zone. That's part of the whole point of how SysTime is designed. It holds its time internally in UTC and then uses the time zone to adjust the time whenever a property or other function is used which requires the time in that time zone. That way, you avoid all of the issues and bugs that result from converting the time. The cost of that is that you can't not have a time zone and use SysTime. So, if someone cares about saving that little bit of extra size in their executable by not using the time zone, they're going to have to use the C functions or design their own time code.

I'd tend to say that for general purpose time representation not involving local time, SysTime is suboptimal because it forces you to carry around a pointer to a time zone. Imagine an array of SysTime all in UTC and the space wasted with all those pointers referencing the UTC time zone object.

It should be very easy to make a separate type, let's say UTCTime, and allow SysTime to be constructed from it and to be implicitly converted to it (with alias this). Then put UTCTime in a different module from SysTime and you can deal with time in UTC without having to ever import the module with SysTime and the time zone class it wants.

Then redefine all APIs not dealing with local time so they work with UTCTime instead of SysTime.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
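
A minimal sketch of that layering, under the assumption that the zoned type simply wraps the UTC one. UTCTime is the name suggested above; ZonedTime, zoneName and isNewer are placeholders invented here and stand in for SysTime and its TimeZone reference. In the real proposal the two structs would live in separate modules.

```d
// Time point with no time zone attached.
struct UTCTime
{
    long stdTime; // hnsecs since midnight, January 1st, 1 A.D. UTC

    int opCmp(const UTCTime rhs) const
    {
        return stdTime < rhs.stdTime ? -1 : (stdTime > rhs.stdTime ? 1 : 0);
    }
}

// Stand-in for a zone-carrying type such as SysTime.
struct ZonedTime
{
    UTCTime utc;
    string zoneName;   // placeholder for a TimeZone reference
    alias utc this;    // implicit conversion to UTCTime
}

// An API that does not care about local time only needs UTCTime.
bool isNewer(UTCTime a, UTCTime b) { return a > b; }

void main()
{
    auto source = ZonedTime(UTCTime(200), "Europe/London");
    auto target = UTCTime(100);
    assert(isNewer(source, target)); // ZonedTime converts via alias this
    assert(UTCTime.sizeof < ZonedTime.sizeof);
}
```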
Dec 22 2011
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/22/2011 7:24 PM, Jonathan M Davis wrote:
 So, it harms usability IMHO to use something like UTCTime instead of
 SysTime, and just to save yourself the cost of the reference for the time
 zone?

No, it's not the cost of the reference. It's the cost of pulling in all the code to deal with that reference.
Dec 22 2011
prev sibling parent Michel Fortin <michel.fortin michelf.com> writes:
On 2011-12-23 03:24:27 +0000, Jonathan M Davis <jmdavisProg gmx.com> said:

 On Thursday, December 22, 2011 10:29:59 Michel Fortin wrote:
 I'd tend to say that for general purpose time representation not
 involving local time, SysTime is suboptimal because it forces you to
 carry around a pointer to a time zone. Imagine an array of SysTime all
 in UTC and the space wasted with all those pointers referencing the UTC
 time zone object.
 
 It should be very easy to make a separate type, let's say UTCTime, and
 allow SysTime to be constructed from it and to be implicitly converted
 to it (with alias this). Then put UTCTime in a different module from
 SysTime and you can deal with time in UTC without having to ever import
 the module with SysTime and the time zone class it wants.
 
 Then redefine all APIs not dealing with local time so they work with
 UTCTime instead of SysTime.

That could certainly be done, but it complicates things that much more. The idea, at least, of SysTime was that it would deal with all of the time zone stuff correctly without you having to worry about it unless you wanted to deal with the time zone stuff, in which case it would give you those capabilities. That requires having it carry the time zone around.

Something like UTCTime would allow you to carry the time around without the time zone, but then anyone who wants to be able to do stuff like convert it to a string or get its year or anything like that is almost certainly going to want it in a particular time zone (probably local time) rather than UTC, so it increases the burden on the programmer to deal with something like UTCTime. If you're dealing with anything beyond comparing times or adding durations to them, you're going to need the time zone, and in most cases, UTC is not the one that people are going to want.

So, it harms usability IMHO to use something like UTCTime instead of SysTime, and just to save yourself the cost of the reference for the time zone? If you're _that_ worried about the space, you can always use a SysTime's stdTime property or toUnixTime and get an integral value to store. Granted, that's not as safe as something like UTCTime, since it's a naked number, but I really don't think that the cost of that reference is generally an issue.
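
For reference, a short sketch of storing a SysTime as a bare integer through stdTime or toUnixTime and rebuilding it later, assuming a current Phobos. The time zone has to be supplied again on reconstruction, which is the safety trade-off mentioned above.

```d
import std.datetime : DateTime, SysTime, UTC;

void main()
{
    auto st = SysTime(DateTime(2011, 12, 22, 0, 0, 0), UTC());

    // Naked integral values: compact, but they carry no time zone.
    long hnsecs = st.stdTime;    // hnsecs since midnight, 1 A.D. UTC
    auto unix = st.toUnixTime(); // seconds since 1970-01-01 UTC
    assert(unix == 1_324_512_000);

    // Rebuilding requires stating the zone again.
    auto restored = SysTime(hnsecs, UTC());
    assert(restored == st);
}
```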

Well, what I'm getting at is that most of the time you don't care which time zone the time was recorded in, so you don't need to attach a time zone to it. You only need to take the time zone into consideration when formatting as a string, and then you almost always use local time. The real issue remains that you can't use SysTime without including all the code for all the time zones.

Think about this: if you don't need to carry around the time zones but instead only ask for a time zone when formatting as a string, you have much less need for time zones to be polymorphic. The time zone could be a template argument to the formatting functions, for instance. On the other hand, if you need to carry the associated time zone along with the time, then things get more complicated and a polymorphic time zone type tends to solve that problem well. But how many of us need to carry a time zone with a time value?

So in my opinion associating a time zone with SysTime was a mistake. Not just because it forces you to carry around an extra pointer, but mostly because it forces time zones to be polymorphic, which brings all the drawbacks of a class: less inlining and worse performance due to virtual functions, and all the virtual functions need to be included in the binary even if you don't use them. It does benefit the use case where you need to tag a time with a specific time zone, but that sounds rather specialized to me.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
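
One way the "time zone as a template argument" idea could look, as a sketch only. UTCZone, FixedOffset, utcToLocal and toClockString are invented names, and real zones with DST rules would need more than a fixed offset. The point is that the zone is resolved at compile time, so no class or virtual call is involved.

```d
import std.format : format;

// Zones are plain types with a static conversion function; no classes.
struct UTCZone
{
    static long utcToLocal(long unixSecs) { return unixSecs; }
}

struct FixedOffset(int minutes)
{
    static long utcToLocal(long unixSecs) { return unixSecs + minutes * 60; }
}

// The zone is only named at the point where a string is produced.
string toClockString(Zone)(long unixSecs)
{
    immutable local = Zone.utcToLocal(unixSecs);
    immutable secsOfDay = ((local % 86_400) + 86_400) % 86_400;
    return format("%02d:%02d:%02d",
                  secsOfDay / 3600, secsOfDay / 60 % 60, secsOfDay % 60);
}

void main()
{
    enum t = 1_324_512_000L; // 2011-12-22 00:00:00 UTC
    assert(toClockString!UTCZone(t) == "00:00:00");
    assert(toClockString!(FixedOffset!(-300))(t) == "19:00:00");
}
```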
Dec 23 2011
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/22/2011 3:03 AM, Jonathan M Davis wrote:
 I don't follow you. You mean use PIMPL for the time zone? I haven't a clue how
 you're going to do PIMPL without .di files,

PIMPL means you have an opaque pointer to something. It can be a function pointer, or a data pointer. It gets filled in at runtime. It has nothing to do with .di files.

This was used in the olden days for printf floating point formatting. If a program didn't use fp, it was bad to pull in the (large) floating point formatting package which printf must support. So, printf used an opaque pointer to the printf formatting package, which was null. If any other floating point code appeared in the source code, the compiler would emit a hook to initialize that pointer, and hence pull in the fp code.

In other words, it works a lot like a 'weak' reference. I've done similar things to prevent pulling in multithreaded code when an app was single threaded.

(OOP programming with derived classes and such is a more formalized implementation of PIMPL. The user of the class has no idea which functions he actually is winding up calling.)
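
A toy version of the printf hook described above, with invented names (fpFormatter, enableFloatFormatting, formatValue). In this single-file sketch the implementation is linked in regardless; the size saving only appears when formatDoubleImpl and enableFloatFormatting live in a separate module or object file that is pulled in only by programs that need it.

```d
import std.conv : to;

// Opaque hook: stays null until something asks for floating-point support.
private string function(double) fpFormatter;

// The "large" package. In a real layout this would sit in its own module.
private string formatDoubleImpl(double v) { return to!string(v); }

// Called only by code that actually uses floating point.
void enableFloatFormatting() { fpFormatter = &formatDoubleImpl; }

// The general-purpose formatter never names the implementation directly.
string formatValue(double v)
{
    if (fpFormatter is null)
        return "(floating point not loaded)";
    return fpFormatter(v);
}

void main()
{
    assert(formatValue(1.5) == "(floating point not loaded)");
    enableFloatFormatting();
    assert(formatValue(1.5) == "1.5");
}
```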
Dec 22 2011
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/22/2011 3:03 AM, Jonathan M Davis wrote:
 On Thursday, December 22, 2011 02:12:31 Walter Bright wrote:
 Timezone information is not necessary to measure elapsed time or relative
 time, for example.

The type requires it though. No, comparison doesn't require the time zone, but many (most?) of the other operations do. And the type can't be separated from the time zone. That's part of the whole point of how SysTime is designed. It holds its time internally in UTC and then uses the time zone to adjust the time whenever a property or other function is used which requires the time in that time zone. That way, you avoid all of the issues and bugs that result from converting the time. The cost of that is that you can't not have a time zone and use SysTime. So, if someone cares about saving that little bit of extra size in their executable by not using the time zone, they're going to have to use the C functions or design their own time code.

The time zone info can be lazily initialized only by those operations that need a time zone.
Dec 22 2011
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/22/11 1:32 PM, Jonathan M Davis wrote:
[snip]

Now that we got to talk about std.datetime, here are three things that I 
think we could do to make it more manageable.

1. Put files in data. I find it a tad awkward that we have time zone 
information in hardcoded strings inside the code. That means any such 
change would have us redistribute Phobos. I'm thinking a small data 
file would be more appropriate. Better yet, hook into the OS's timezone 
information and let the OS worry about keeping that up to date.

2. datetime == time + date. We could reduce std.datetime to "public 
import std.time, std.date;" and define:

(a) std.time -> everything having to do with sheer time information, no 
date-related oddities. That means the largest formalized interval would 
be the week.

(b) std.date -> all of the bizarre calendar stuff, dealing with months 
and more. Naturally std.date would use std.time.

3. Using loops in unittest instead of rote repetition - this is already 
underway. We could actually use data files with unittests if that's helpful.
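
As an illustration of item 3, a table-driven unittest might look like the sketch below. The helper isLeapYear and the Case struct are written here for the example and are not the std.datetime code; the unittest block only runs when compiled with -unittest.

```d
// Hypothetical helper under test.
bool isLeapYear(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

unittest
{
    // One table and one loop instead of one assert per case.
    static struct Case { int year; bool expected; }
    foreach (c; [Case(1900, false), Case(2000, true),
                 Case(2011, false), Case(2012, true)])
    {
        assert(isLeapYear(c.year) == c.expected);
    }
}

void main() {}
```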


Andrei
Dec 22 2011
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/22/11 1:38 PM, Andrei Alexandrescu wrote:
 On 12/22/11 1:32 PM, Jonathan M Davis wrote:
 [snip]

 Now that we got to talk about std.datetime, here are three things that I
 think we could do to make it more manageable.

 1. Put files in data.

I meant "put data in files" :o).

Also:

4. Move some of the benchmarking clock stuff from std.datetime into the fledgling std.benchmark. I'm on that.

Andrei
Dec 22 2011
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2011-12-22 20:38, Andrei Alexandrescu wrote:
 On 12/22/11 1:32 PM, Jonathan M Davis wrote:
 [snip]

 Now that we got to talk about std.datetime, here are three things that I
 think we could do to make it more manageable.

 1. Put files in data. I find it a tad awkward that we have time zone
 information in hardcoded strings inside the code. That means any such
 change would have us redistributed Phobos. I'm thinking a small data
 file would be more appropriate. Better yet, hook into OSs timezone
 information and let the OS worry about keeping that timely.

 2. datetime == time + date. We could reduce std.datetime to "public
 import std.time, std.date;" and define:

 (a) std.time -> everything having to do with sheer time information, no
 date-related oddities. That means the largest formalized interval would
 be the week.

 (b) std.date -> all of the bizarre calendar stuff, dealing with months
 and more. Naturally std.date would use std.time.

That seems like a good start to divide the module.

-- 
/Jacob Carlborg
Dec 22 2011
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2011-12-23 03:38, Jonathan M Davis wrote:
 On Thursday, December 22, 2011 13:38:51 Andrei Alexandrescu wrote:
 On 12/22/11 1:32 PM, Jonathan M Davis wrote:
 [snip]

 Now that we got to talk about std.datetime, here are three things that I
 think we could do to make it more manageable.

 1. Put files in data. I find it a tad awkward that we have time zone
 information in hardcoded strings inside the code. That means any such
 change would have us redistributed Phobos. I'm thinking a small data
 file would be more appropriate. Better yet, hook into OSs timezone
 information and let the OS worry about keeping that timely.

The only reason there are any hard-coded time zone names is that they're required to convert between the names used by Posix and those used by Windows. So, you _can't_ hook into the OS information and get them. Now, conceivably, you could move that information into a file and then parse the file when it's needed. That would obviously be less efficient, but creating a WindowsTimeZone or PosixTimeZone (which is what they'd most frequently be needed for) isn't exactly terribly efficient to begin with, since you have to read in the time zone information from the disk or from the registry (which is probably on disk). So, that's not unreasonable.
 2. datetime == time + date. We could reduce std.datetime to "public
 import std.time, std.date;" and define:

 (a) std.time ->  everything having to do with sheer time information, no
 date-related oddities. That means the largest formalized interval would
 be the week.

 (b) std.date ->  all of the bizarre calendar stuff, dealing with months
 and more. Naturally std.date would use std.time.

Well, the only time point type which only deals with time and not dates is TimeOfDay. Date, DateTime, and SysTime all deal with dates. You can't really get away from dealing with dates once your type holds more than 24 hours worth of time unless it's a duration as opposed to a time point. So, I really don't think that trying to split std.datetime into std.date and std.time makes much sense.

A better division would be to put SysTime in a module and TimeOfDay, Date, and DateTime in another. SysTime deals with the system time, has a time zone, and is intended for use with stuff which isn't calendar-based (timestamps and file times being good examples - anything where you need the absolute time when it occurred), whereas the others don't have a time zone and therefore _are_ calendar-based. However, they all share common code, so they'd either need to duplicate that code or any modules that they're split up into need to be in the same package. They _could_ both be sitting in std directly, but that would give package access to completely unrelated functions. It's also possible that we'll have more date and/or time related modules in the future (for instance, having one for handling date-recurrence patterns would be nice), and if that occurs, it makes that much more sense to use a sub-package rather than std.

If we're splitting it up, there's also the question of how far we want to split it up. In the extreme case, we could put every struct and class in its own module, though that's going too far IMHO. But we're probably going to want to go farther than just splitting it in two. In addition to the benchmarking functionality, I'd like to see the time interval and range functionality in a separate module, and there's the question of whether the time zone stuff should be in its own module - though there's not much point to the time zones without SysTime, so I'm not sure whether that's really valuable.

In any case, if we keep std.datetime and have it publicly import the other modules, we can split it up more or less however we like, but having a sub-package would make the most sense IMHO. My original std.datetime proposal was that way, but it was split badly, and we didn't really have any sub-package stuff in Phobos beyond the C stuff and Windows stuff at the time, but that has been slowly changing.

- Jonathan M Davis

I think a sub-package would be nice for all time and date related code.

-- 
/Jacob Carlborg
Dec 23 2011
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/22/2011 11:32 AM, Jonathan M Davis wrote:
 On Thursday, December 22, 2011 10:26:27 Walter Bright wrote:
 PIMPL means you have an opaque pointer to something. It can be a function
 pointer, or a data pointer. It gets filled in at runtime. It has nothing to
 do with .di files.

Well, I have no idea how you'd do that in this case without hiding SysTime's implementation,

Why not hide it?
 since it has to call TimeZone's functions and therefore needs
 to know that they exist. They're polymorphic, so the exact type of the
 TimeZone could be unknown and unseen by SysTime, but it has to know about
 TimeZone. And if you hid the function bodies in an effort to make TimeZone's
 usage opaque, then you couldn't inline those functions anymore, and I would
 consider the efficiency of the functions to be more important

Why is the efficiency of those functions important? I cannot think of an application that needs high performance timezone calculations. (Benchmark timing code does not need timezones.)
 than trying to
 avoid pulling in the TimeZone class just to avoid a few KB in the executable.

KB in the executable does negatively impact performance. The issue with something being only a few KB is that when everyone thinks that, we wind up with a 1MB "hello world" program.
 On Thursday, December 22, 2011 10:28:52 Walter Bright wrote:
 The time zone info can be lazily initialized only by those operations that
 need a time zone.

I don't think that that would really buy you anything. SysTime is default-initialized to use LocalTime, which is a singleton, so it's not like you're allocating a new TimeZone every time that you create or use a SysTime. Currently, the singleton is initialized by a static constructor, but that's going to be changed to be lazily initialized (which should get rid of the static constructors and their cost). So, there _is_ still some cost on the _first_ SysTime that gets created in the program, but after that, there isn't really.

And doing a lazy initialization of the TimeZone within the SysTime in the case where the programmer does not specify a TimeZone would just increase the cost of most of SysTime's functions, since most of them would have to be checking whether the TimeZone had been initialized or not. With the singleton, such a check only occurs when the SysTime is created. And at some point, the singleton will probably be changed to use emplace, which will allow it to completely avoid the GC heap, which will make the singleton cost that much less. So, the cost of the time zone from the perspective of execution speed is minimal. It sounds like it's just the fact that using a class increases the symbols in the executable that's the problem.

This conflates performance with allocated memory consumption and with static memory consumption. I am talking about minimizing the executable file size on disk.
Dec 22 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/22/2011 7:13 PM, Jonathan M Davis wrote:
 Okay. Assuming that I'm going to try and make TimeZone opaque within SysTime,
 does that require a pointer rather than a reference? And I assume then that
 the time zone stuff would need to be in a separate module than SysTime. That
 being the case, how would SysTime be able to use the time zone without
 importing that module? Does the C++ solution of forward declaring it like

 class TimeZone;

 work in D?

It'll still put a reference to TimeZone in the ModuleInfo. I suggest:

    void* tz;

The functions that don't need it, just ignore it. The functions that do need TimeZone, do:

    class TimeZone { void foo() { ... } }

    if (!tz)
        tz = initTimeZone();
    auto t = cast(TimeZone)tz;
    t.foo(); // call members of TimeZone

Put the functions that do need TimeZone in a separate module from the ones that don't.
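
A compilable variant of that suggestion, collapsed into one file for illustration. Stamp, isAfter, zoneName and zoneLoaded are placeholder names, and this TimeZone is a stub rather than the std.datetime class. In the layout described above, zoneName and the class would sit in a different module from the struct and isAfter.

```d
// Stub standing in for the real time zone class.
class TimeZone
{
    string name() { return "UTC"; }
}

TimeZone initTimeZone() { return new TimeZone; }

struct Stamp
{
    long stdTime;
    private void* tz; // opaque; stays null until a zone-dependent call

    // Needs no time zone, so it never touches tz.
    bool isAfter(const Stamp rhs) const { return stdTime > rhs.stdTime; }

    // Zone-dependent: fills in the opaque pointer on first use.
    string zoneName()
    {
        if (!tz)
            tz = cast(void*) initTimeZone();
        auto t = cast(TimeZone) tz;
        return t.name();
    }

    bool zoneLoaded() const { return tz !is null; }
}

void main()
{
    auto a = Stamp(200), b = Stamp(100);
    assert(a.isAfter(b));
    assert(!a.zoneLoaded());     // comparison did not create a zone
    assert(a.zoneName() == "UTC");
    assert(a.zoneLoaded());
}
```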
Dec 22 2011
parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/22/2011 11:12 PM, Jonathan M Davis wrote:
 On Thursday, December 22, 2011 21:30:46 Walter Bright wrote:
 On 12/22/2011 7:13 PM, Jonathan M Davis wrote:
 Okay. Assuming that I'm going to try and make TimeZone opaque within
 SysTime, does that require a pointer rather than a reference? And I
 assume then that the time zone stuff would need to be in a separate
 module than SysTime. That being the case, how would SysTime be able to
 use the time zone without importing that module? Does the C++ solution
 of forward declaring it like

 class TimeZone;

 work in D?

It'll still put a reference to TimeZone in the ModuleInfo.

Will that still happen if the TimeZone is used in templated functions? SysTime has several functions that use TimeZone explicitly - e.g. the timezone property. It needs to be able to take and return a TimeZone. However, it _could_ be templatized with an empty template parameter list. Would that avoid pulling in the information on TimeZone if those functions aren't instantiated? Or would it still pull it in?

- Jonathan M Davis

Templates, after instantiation, are exactly like their non-templated equivalents. Before instantiation, they are not even semantically analyzed.
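
A quick way to see the "not even semantically analyzed" part, using made-up names. The template below has an empty parameter list and refers to a symbol that does not exist, yet the module compiles as long as nothing instantiates it. This says nothing about what ends up in the ModuleInfo once it is instantiated.

```d
// Empty template parameter list: the body is parsed, but not
// semantically analyzed until the template is instantiated.
int needsZone()()
{
    return lookupZoneOffset(); // deliberately undefined symbol
}

// Ordinary function, always compiled.
int plain() { return 42; }

// Instantiating the template fails, which shows the body was never
// checked before this point.
static assert(!__traits(compiles, needsZone()));

void main()
{
    assert(plain() == 42);
}
```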
Dec 23 2011
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2011-12-22 09:14, Jonathan M Davis wrote:
 On Thursday, December 22, 2011 00:05:58 Walter Bright wrote:
 We must deal with our imperfect tools as they are. For example, I am not
 going to rewrite the Linux ld linker, and the OSX linker, and the FreeBSD
 linker, etc.

If the compiler and/or linker don't strip unused symbols, then how on earth is importing the module _not_ going to pull in everything in it save for uninstantiated templates?

- Jonathan M Davis

Isn't that what we're trying to find out by testing it?

-- 
/Jacob Carlborg
Dec 22 2011
prev sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2011-12-22 08:14:21 +0000, Jonathan M Davis <jmdavisProg gmx.com> said:

 On Thursday, December 22, 2011 00:05:58 Walter Bright wrote:
 We must deal with our imperfect tools as they are. For example, I am not
 going to rewrite the Linux ld linker, and the OSX linker, and the FreeBSD
 linker, etc.

If the compiler and/or linker don't strip unused symbols, then how on earth is importing the module _not_ going to pull in everything in it save for uninstantiated templates?

I'm pretty sure, even though it's been a long time since I've looked at those details, that the problem is caused by classes. Basically, the typeinfo for all classes in a module are referenced by the generated module info structure. The class's typeinfo points to many things including the default constructor and vtable, the vtable points to all virtual functions, which finally points to everything used by those functions. You can thus easily have all the code in the module referenced indirectly by the module info, so the linker can't strip it.

That might be why std.datetime has a heavy footprint, even when you use only a small part of it. But still, you should look at the generated code to test this hypothesis.

The benefit of referencing classes within module info: you can instantiate them using Object.factory, if they have a default constructor. We pay a heavy price compared to what we get with this very limited runtime reflection.

Things you can do if this hypothesis proves to be correct: make it so that any virtual function in your classes are short and do not depend on heavy functions, as all the code referenced by those functions will be dragged along. Don't forget that final functions are still virtual if they override a function in a base class or defined in an interface. Also, don't use a class if you don't need one, or if you need one put it in a separate module that you only import when you need it.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
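
For readers unfamiliar with Object.factory, a short sketch of the runtime reflection being paid for, assuming a druntime that still provides it. Greeter is an example class; its fully qualified name is read from the typeinfo here only so the sketch does not depend on the file name.

```d
class Greeter
{
    string greet() { return "hello"; }
}

void main()
{
    // Fully qualified name, i.e. "<module name>.Greeter".
    string fqn = typeid(Greeter).name;

    // Looks the class up by name through the module info and calls its
    // default constructor.
    Object o = Object.factory(fqn);
    auto g = cast(Greeter) o;
    assert(g !is null && g.greet() == "hello");

    // Unknown names yield null rather than throwing.
    assert(Object.factory("no.such.Class") is null);
}
```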
Dec 22 2011
parent reply Jacob Carlborg <doob me.com> writes:
On 2011-12-22 16:56, Michel Fortin wrote:
 The benefit of referencing classes within module info: you can
 instantiate them using Object.factory, if they have a default
 constructor. We pay a heavy price compared to what we get with this very
 limited runtime reflection.

It's a really nice feature to have when implementing serialization. -- /Jacob Carlborg
Dec 22 2011
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/22/2011 9:22 AM, Jacob Carlborg wrote:
 On 2011-12-22 16:56, Michel Fortin wrote:
 The benefit of referencing classes within module info: you can
 instantiate them using Object.factory, if they have a default
 constructor. We pay a heavy price compared to what we get with this very
 limited runtime reflection.

It's a really nice feature to have when implementing serialization.

Sure, but we need to be aware of class overhead, and not use classes unless necessary. I.e. a class shouldn't be used to merely create a namespace. Classes also should not be used for types that are not intended to be polymorphic.
Dec 22 2011
parent reply Jacob Carlborg <doob me.com> writes:
On 2011-12-22 19:32, Walter Bright wrote:
 On 12/22/2011 9:22 AM, Jacob Carlborg wrote:
 On 2011-12-22 16:56, Michel Fortin wrote:
 The benefit of referencing classes within module info: you can
 instantiate them using Object.factory, if they have a default
 constructor. We pay a heavy price compared to what we get with this very
 limited runtime reflection.

It's a really nice feature to have when implementing serialization.

Sure, but we need to be aware of class overhead, and not use classes unless necessary. I.e. a class shouldn't be used to merely create a namespace. Classes also should not be used for types that are not intended to be polymorphic.

Exactly. But I'm referring to deserializing classes, I don't care what they're used for. -- /Jacob Carlborg
Dec 22 2011
parent reply Joshua Reusch <yoschi arkandos.de> writes:
Am 22.12.2011 22:57, schrieb Jacob Carlborg:
 On 2011-12-22 19:32, Walter Bright wrote:
 On 12/22/2011 9:22 AM, Jacob Carlborg wrote:
 On 2011-12-22 16:56, Michel Fortin wrote:
 The benefit of referencing classes within module info: you can
 instantiate them using Object.factory, if they have a default
 constructor. We pay a heavy price compared to what we get with this
 very
 limited runtime reflection.

It's a really nice feature to have when implementing serialization.

Sure, but we need to be aware of class overhead, and not use classes unless necessary. I.e. a class shouldn't be used to merely create a namespace. Classes also should not be used for types that are not intended to be polymorphic.

Exactly. But I'm referring to deserializing classes, I don't care what they're used for.

IMHO, the user should know the type of the object he wants to deserialize, so it can be done with compile-time reflection alone.
Dec 23 2011
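
A rough sketch of what deserializing with compile-time reflection alone could look like when the target type is known statically. It uses a struct rather than a class to keep it short, and the names `Point` and `fromFields`, as well as the "one string per field" input format, are assumptions made for this illustration. They are not taken from Orange or any other library mentioned in the thread.

```d
import std.conv : to;

// Hypothetical type, used only for illustration.
struct Point
{
    int x;
    double y;
    string label;
}

// Fill each field of T, in declaration order, from the matching string.
// The loop over .tupleof is unrolled at compile time, so no runtime
// type information is consulted.
T fromFields(T)(string[] fields)
{
    T result;
    assert(fields.length == result.tupleof.length, "field count mismatch");
    foreach (i, ref member; result.tupleof)
        member = to!(typeof(member))(fields[i]);
    return result;
}

void main()
{
    auto p = fromFields!Point(["3", "4.5", "origin"]);
    assert(p.x == 3);
    assert(p.y == 4.5);
    assert(p.label == "origin");
}
```

The limitation is the one raised in the thread: the caller has to name the type (`fromFields!Point`), so this does not cover the case where only a class name read from the serialized data is available.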
parent Jacob Carlborg <doob me.com> writes:
On 2011-12-23 11:14, Joshua Reusch wrote:
 Am 22.12.2011 22:57, schrieb Jacob Carlborg:
 On 2011-12-22 19:32, Walter Bright wrote:
 On 12/22/2011 9:22 AM, Jacob Carlborg wrote:
 On 2011-12-22 16:56, Michel Fortin wrote:
 The benefit of referencing classes within module info: you can
 instantiate them using Object.factory, if they have a default
 constructor. We pay a heavy price compared to what we get with this
 very
 limited runtime reflection.

It's a really nice feature to have when implementing serialization.

Sure, but we need to be aware of class overhead, and not use classes unless necessary. I.e. a class shouldn't be used to merely create a namespace. Classes also should not be used for types that are not intended to be polymorphic.

Exactly. But I'm referring to deserializing classes, I don't care what they're used for.

IMHO, the user should know the type of the object he wants to deserialize, so it can be done with compile-time reflection alone.

That might work, I haven't thought about it. -- /Jacob Carlborg
Dec 23 2011
prev sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2011-12-22 17:22:59 +0000, Jacob Carlborg <doob me.com> said:

 On 2011-12-22 16:56, Michel Fortin wrote:
 The benefit of referencing classes within module info: you can
 instantiate them using Object.factory, if they have a default
 constructor. We pay a heavy price compared to what we get with this very
 limited runtime reflection.

It's a really nice feature to have when implementing serialization.

True, for serialization and other things. I'm not arguing against the feature. What I am observing is that the capabilities of D's runtime reflection are quite small compared to the footprint it has. We often hear on this list about how adding more information about functions and fields to typeinfo would bloat executables, yet I am under the impression the biggest part of that bloat (the code of all the virtual functions and whatever they call, even if you don't use those functions or even the class) is already part of each and every D executable, only we don't realize this because it is not exposed in the API, it just sits there in a mostly non-interpretable form (the vtable). -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Dec 22 2011
parent Jacob Carlborg <doob me.com> writes:
On 2011-12-23 02:10, Michel Fortin wrote:
 On 2011-12-22 17:22:59 +0000, Jacob Carlborg <doob me.com> said:

 On 2011-12-22 16:56, Michel Fortin wrote:
 The benefit of referencing classes within module info: you can
 instantiate them using Object.factory, if they have a default
 constructor. We pay a heavy price compared to what we get with this very
 limited runtime reflection.

It's a really nice feature to have when implementing serialization.

True, for serialization and other things. I'm not arguing against the feature. What I am observing is that the capabilities of D's runtime reflection are quite small compared to the footprint it has. We often hear on this list about how adding more information about functions and fields to typeinfo would bloat executables, yet I am under the impression the biggest part of that bloat (the code of all the virtual functions and whatever they call, even if you don't use those functions or even the class) is already part of each and every D executable, only we don't realize this because it is not exposed in the API, it just sits there in a mostly non-interpretable form (the vtable).

If that's the case I think we should take advantage of the already available data in the executables to implement better runtime reflection. -- /Jacob Carlborg
Dec 23 2011
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2011-12-22 04:57, Walter Bright wrote:
 My first thought is that std.datetime is already very large. Few will
 need a custom date formatter, so it should be in a separate module to:

 1. reduce cognitive load on the programmer

 2. reduce the overhead pulled in for every program that may want to use
 an std.datetime function, but not need custom formatting

As I've said several times, std.datetime is way too large and should be a package. -- /Jacob Carlborg
Dec 21 2011
parent reply Jacob Carlborg <doob me.com> writes:
On 2011-12-22 08:58, Jonathan M Davis wrote:
 On Thursday, December 22, 2011 08:34:23 Jacob Carlborg wrote:
 On 2011-12-22 04:57, Walter Bright wrote:
 My first thought is that std.datetime is already very large. Few will
 need a custom date formatter, so it should be in a separate module to:

 1. reduce cognitive load on the programmer

 2. reduce the overhead pulled in for every program that may want to use
 an std.datetime function, but not need custom formatting

As I've said several times, std.datetime is way too large and should be a package.

That's the way that it was originally proposed (albeit split in a poor manner), and a number of people (particularly Phobos devs) were against it, and have been any time that I've suggested splitting out part of it. The fact that Phobos has historically shunned sub-packages probably has a lot to do with that (though we're starting to have some).

Yeah, I don't get this. Most modules in Phobos are too large, in my opinion.
 To make it a package at this point would probably break a fair bit of code
 unless it were a new package which std.datetime publicly imported, so simply
 making std.datetime a package isn't necessarily a great idea, and there's a
 definite benefit IMHO in being able to just import it all with one import. But
 it's not necessarily unreasonable to create a new package which holds the
 various pieces and have std.datetime import them all.

 - Jonathan M Davis

It would break code, but it's better to break it sooner than later. Seems more and more like D should support importing all modules in a package, as Java does:

 import foo.*;

And yes, I know this can be done manually with public imports. I think it's a benefit to have smaller modules and that a given module is only responsible for one specific thing.

-- /Jacob Carlborg
Dec 22 2011
next sibling parent reply Somedude <lovelydear mailmetrash.com> writes:
Le 22/12/2011 11:40, Jacob Carlborg a écrit :
 
 Yeah, I don't get this. Most modules in Phobos are too large, in my
 opinion.
 

It largely is a matter of taste, I think. There are advantages in minimizing the size of files but there are also advantages in minimizing the number of files. But for datetime.d, it has largely gone beyond my own point of acceptability (which is about 5,000 lines, if that means anything).
Dec 22 2011
next sibling parent Jacob Carlborg <doob me.com> writes:
On 2011-12-22 12:01, Somedude wrote:
 Le 22/12/2011 11:40, Jacob Carlborg a écrit :
 Yeah, I don't get this. Most modules in Phobos are too large, in my
 opinion.

It largely is a matter of taste, I think. There are advantages in minimizing the size of files but there are also advantages in minimizing the number of files. But for datetime.d, it has largely gone beyond my own point of acceptability (which is about 5,000 lines, if that means anything).

If there are too many files you divide them up in several packages. If there are too many packages you divide them up in several sub-packages or libraries/projects. This approach should be taken throughout the whole code. From statements, via functions, classes, modules and packages, up to libraries/projects. -- /Jacob Carlborg
Dec 22 2011
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2011-12-22 12:27, Jonathan M Davis wrote:
 On Thursday, December 22, 2011 12:01:31 Somedude wrote:
 Le 22/12/2011 11:40, Jacob Carlborg a écrit :
 Yeah, I don't get this. Most modules in Phobos are too large, in my
 opinion.

It largely is a matter of taste, I think. There are advantages in minimizing the size of files but there are also advantages in minimizing the number of files. But for datetime.d, it has largely gone beyond my own point of acceptability (which is about 5,000 lines, if that means anything).

Well, a large portion of the file is documentation and unit tests, and the number of lines that the unit tests take up should go down as I refactor them (which I've done some of, but I've still got a long way to go), but it's never going to be anywhere near as small as 5,000 lines. SysTime alone is over 5,000 lines (though again, much of that is documentation and unit tests). But ultimately, I think that whether a module is too large or not is a function of its API rather than the amount of source code. It's a question of how digestible the documentation is. And by that count, std.datetime is still quite large, but it's a very different measurement. - Jonathan M Davis

Even if you cut it in half, I think it's way too large. I think 5,000 lines is too large as well. I don't agree with what you're saying about the API. If a module has 5k+ lines and only one public function, and the rest are private functions, I will still think it's too large.

About the unit tests: if they take up so much of the module, then move them into their own module(s). And now everyone will say that it's very useful to have the unit tests next to the function. I don't agree with that when the unit test is more than around five lines.

I will have the same problem with std.serialization/Orange if that ends up in Phobos. In Orange I'm testing one feature in one module, and all modules are located in a specific directory. The shortest testing module is 54 lines. The average is probably around 70 lines. I'm not particularly happy about putting all those tests in one file, and even less happy about putting them next to all the regular code, making those modules EVEN larger.

You should treat your testing code just as you treat your "regular" code. Just as well designed, just as modularized, just as effective, just as clean. The testing code is in fact just as much part of the "regular" code as the rest of the code.

-- /Jacob Carlborg
Dec 22 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/22/11 6:50 AM, Jacob Carlborg wrote:
 You should treat your testing code just as you treat your "regular"
 code. Just as well designed, just as modularized, just as effective,
 just as clean. The testing code is in fact just as much part of the
 "regular" code as the rest of the code.

This. YES. A liability of the current std.datetime is that it assumes that unittest code is exempt from the rules that apply to regular code. I am increasingly worried about that module. It has been argued that its sheer size is not a problem, but somehow the task of accounting for that has taken on a life of its own - e.g. we can't test std.datetime like everything else in Phobos, it needs its own version. Andrei
Dec 22 2011
parent reply Jacob Carlborg <doob me.com> writes:
On 2011-12-22 17:59, Andrei Alexandrescu wrote:
 On 12/22/11 6:50 AM, Jacob Carlborg wrote:
 You should treat your testing code just as you treat your "regular"
 code. Just as well designed, just as modularized, just as effective,
 just as clean. The testing code is in fact just as much part of the
 "regular" code as the rest of the code.

This. YES. A liability of the current std.datetime is that it assumes that unittest code is exempt from the rules that apply to regular code. I am increasingly worried about that module. It has been argued that its sheer size is not a problem, but somehow the task of accounting for that has taken on a life of its own - e.g. we can't test std.datetime like everything else in Phobos, it needs its own version. Andrei

That doesn't sound right. If std.datetime can't be tested like the rest of Phobos there's something quite seriously wrong with it. -- /Jacob Carlborg
Dec 22 2011
parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/22/2011 10:13 AM, Jonathan M Davis wrote:
 It can be, and it is. Previously, there were issues compiling it on Windows
 which caused the compiler to run out of memory. So, I added a version
 identifier for the unit tests which was disabled on Windows. Those issues have
 been fixed, so the version identifier is now always enabled. I left it in,
 because it made Don's life easier when he was trying to reduce compiler bugs
 (since, without the unit tests, less gets pulled in, and less of std.datetime
 gets instantiated). The version identifier could be removed, but the ease of
 disabling the unit tests if necessary merited leaving it in.

I can't run the unittests on my old FreeBSD machine because it runs out of memory on std.datetime.
Dec 22 2011
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/22/11 4:40 AM, Jacob Carlborg wrote:
 It would break code, but it's better to break it sooner than later.
 Seems more and more like D should support importing all modules in a
 package, as Java does:

 import foo.*;

I think this style is currently discouraged in Java. Andrei
Dec 22 2011
parent reply Jacob Carlborg <doob me.com> writes:
On 2011-12-22 17:50, Andrei Alexandrescu wrote:
 On 12/22/11 4:40 AM, Jacob Carlborg wrote:
 It would break code, but it's better to break it sooner than later.
 Seems more and more like D should support importing all modules in a
 package, as Java does:

 import foo.*;

I think this style is currently discouraged in Java. Andrei

Well, it seems like everyone wants it and there are several libraries that support "import foo.all;" to include the whole "foo" package, including some of my own. I know that it's used in the SWT library at least. -- /Jacob Carlborg
Dec 22 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/22/11 11:29 AM, Jacob Carlborg wrote:
 On 2011-12-22 17:50, Andrei Alexandrescu wrote:
 On 12/22/11 4:40 AM, Jacob Carlborg wrote:
 It would break code, but it's better to break it sooner than later.
 Seems more and more like D should support importing all modules in a
 package, as Java does:

 import foo.*;

I think this style is currently discouraged in Java. Andrei

Well, it seems like everyone wants it and there are several libraries that support "import foo.all;" to include the whole "foo" package, including some of my own.

The authors of Java apparently thought the same. It is _experience_ with the actual feature that turned out to be bad. Last time I used Eclipse it underlined with red the lines with .*. Andrei
Dec 22 2011
next sibling parent reply Piotr Szturmaj <bncrbme jadamspam.pl> writes:
Andrei Alexandrescu wrote:
 On 12/22/11 11:29 AM, Jacob Carlborg wrote:
 On 2011-12-22 17:50, Andrei Alexandrescu wrote:
 On 12/22/11 4:40 AM, Jacob Carlborg wrote:
 It would break code, but it's better to break it sooner than later.
 Seems more and more like D should support importing all modules in a
 package, as Java does:

 import foo.*;

I think this style is currently discouraged in Java. Andrei

Well, it seems like everyone wants it and there are several libraries that support "import foo.all;" to include the whole "foo" package, including some of my own.

The authors of Java apparently thought the same. It is _experience_ with the actual feature that turned out to be bad. Last time I used Eclipse it underlined with red the lines with .*.

I wish D could support partial modules - partial as analogy to C#'s partial classes.

module std.datetime-unit1;
import std.datetime-unit2;
// dash allowed only in submodules with the same module name
...

module std.datetime-unit2;
import std.datetime-unit1;
...

// then

module whatever;
import std.datetime; // as usual
Dec 22 2011
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/22/11 1:25 PM, Piotr Szturmaj wrote:
 I wish D could support partial modules - partial as analogy to C#'s
 partial classes.

 module std.datetime-unit1;
 import std.datetime-unit2;
 // dash allowed only in submodules with the same module name
 ....

 module std.datetime-unit2;
 import std.datetime-unit1;
 ....

 // then

 module whatever;
 import std.datetime; // as usual

I think there's a lot of mileage in the 1:1 correspondence between files and modules, and between directories and packages. We should keep it that way. Andrei
Dec 22 2011
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2011-12-22 20:27, Andrei Alexandrescu wrote:
 On 12/22/11 1:25 PM, Piotr Szturmaj wrote:
 I wish D could support partial modules - partial as analogy to C#'s
 partial classes.

 module std.datetime-unit1;
 import std.datetime-unit2;
 // dash allowed only in submodules with the same module name
 ....

 module std.datetime-unit2;
 import std.datetime-unit1;
 ....

 // then

 module whatever;
 import std.datetime; // as usual

I think there's a lot of mileage in the 1:1 correspondence between files and modules, and between directories and packages. We should keep it that way. Andrei

I like the 1:1 mapping between file and modules, and folders and packages as well. -- /Jacob Carlborg
Dec 22 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/22/2011 2:00 PM, Jacob Carlborg wrote:
 I like the 1:1 mapping between file and modules, and folders and packages as
well.

That feature is something that took some getting used to for a lot of people, but when they got used to it they preferred it.
Dec 22 2011
parent Jacob Carlborg <doob me.com> writes:
On 2011-12-23 04:15, Walter Bright wrote:
 On 12/22/2011 2:00 PM, Jacob Carlborg wrote:
 I like the 1:1 mapping between file and modules, and folders and
 packages as well.

That feature is something that took some getting used to for a lot of people, but when they got used to it they preferred it.

I had no problems with it. Although I came from a Java background and hadn't used C++ before I learned D. -- /Jacob Carlborg
Dec 23 2011
prev sibling parent Piotr Szturmaj <bncrbme jadamspam.pl> writes:
Andrei Alexandrescu wrote:
 On 12/22/11 1:25 PM, Piotr Szturmaj wrote:
 I wish D could support partial modules - partial as analogy to C#'s
 partial classes.

 module std.datetime-unit1;
 import std.datetime-unit2;
 // dash allowed only in submodules with the same module name
 ....

 module std.datetime-unit2;
 import std.datetime-unit1;
 ....

 // then

 module whatever;
 import std.datetime; // as usual

I think there's a lot of mileage in the 1:1 correspondence between files and modules, and between directories and packages. We should keep it that way.

If you mean compatibility with build tools/compilers/parsers/IDEs/etc. then of course you are right. It's better to leave it as is.
Dec 23 2011
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 12/22/2011 11:25 AM, Piotr Szturmaj wrote:
 I wish D could support partial modules - partial as analogy to C#'s partial
 classes.

 module std.datetime-unit1;
 import std.datetime-unit2;
 // dash allowed only in submodules with the same module name
 ...

 module std.datetime-unit2;
 import std.datetime-unit1;
 ...

 // then

 module whatever;
 import std.datetime; // as usual

I have no idea why anyone would want this. (Is it because the file is too big to fit on a floppy disk? <g>)
Dec 22 2011
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2011-12-23 03:21, Walter Bright wrote:
 On 12/22/2011 11:25 AM, Piotr Szturmaj wrote:
 I wish D could support partial modules - partial as analogy to C#'s
 partial
 classes.

 module std.datetime-unit1;
 import std.datetime-unit2;
 // dash allowed only in submodules with the same module name
 ...

 module std.datetime-unit2;
 import std.datetime-unit1;
 ...

 // then

 module whatever;
 import std.datetime; // as usual

I have no idea why anyone would want this. (Is it because the file is too big to fit on a floppy disk? <g>)

std.datetime is kind of hard on IDEs and text editors. I know TextMate has problems with it and Descent has problems with a lot less code. -- /Jacob Carlborg
Dec 23 2011
parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/23/2011 7:19 AM, Jacob Carlborg wrote:
 On 2011-12-23 03:21, Walter Bright wrote:
 On 12/22/2011 11:25 AM, Piotr Szturmaj wrote:
 I wish D could support partial modules - partial as analogy to C#'s
 partial
 classes.

 module std.datetime-unit1;
 import std.datetime-unit2;
 // dash allowed only in submodules with the same module name
 ...

 module std.datetime-unit2;
 import std.datetime-unit1;
 ...

 // then

 module whatever;
 import std.datetime; // as usual

I have no idea why anyone would want this. (Is it because the file is too big to fit on a floppy disk? <g>)

std.datetime is kind of hard on IDEs and text editors. I know TextMate has problems with it and Descent has problems with a lot less code.

I'm not going to defend programs that can't handle large text files, but in D one could always:

module std.datetime;
mixin(import("datetime-unit1.d"));
mixin(import("datetime-unit2.d"));
Dec 23 2011
prev sibling next sibling parent reply zhang <bitworld qq.com> writes:
 On Fri, 23 Dec 2011 02:21:37 -0000, Walter Bright  
 <newshound2 digitalmars.com> wrote:
 
 On 12/22/2011 11:25 AM, Piotr Szturmaj wrote:
 I wish D could support partial modules - partial as analogy to C#'s  
 partial
 classes.

 module std.datetime-unit1;
 import std.datetime-unit2;
 // dash allowed only in submodules with the same module name
 ...

 module std.datetime-unit2;
 import std.datetime-unit1;
 ...

 // then

 module whatever;
 import std.datetime; // as usual

I have no idea why anyone would want this. (Is it because the file is too big to fit on a floppy disk? <g>)


As for big modules, my solutions are:

1) put related modules into a package (or directory)
2) add a module named all.d to the directory; this module publicly imports all the other modules
3) now just import the *all* module when needed

For example, we have these modules:

std\datetime\all.d
std\datetime\unit1.d
std\datetime\unit2.d
std\datetime\unit3.d

in a package called std.datetime (a directory named std\datetime). To import the datetime package, we use this:

import std.datetime.all;

So, my suggestions are:

1) Can we import the *all* module by default (maybe under another module name)? Then we can use "import std.datetime;" instead of "import std.datetime.all;".
2) The compiler can recognise the imported name as a package and then import all modules in the package automatically (including sub-packages or not?). Then we can just use "import std;" to import all the modules in the standard package.
 
 It's of most benefit (IMO) for the Visual Studio IDE/GUI designer code.   
 The automated code generation goes into one source file, in a partial  
 class.  The user defined code into another source file/partial class.  It  
 makes life easier for both the developer and the GUI designer code itself.

Maybe this is another thing. The "partial class" seems important for auto-code-generation or function-extension. ---------- Zhang <bitworld qq.com>
Dec 24 2011
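
The all.d convention described in the post above can be sketched roughly as follows. This is a layout of four separate files shown in one listing, not a single compilable file, and the package name `mylib` and its contents are hypothetical (chosen to avoid clashing with the real std.datetime).

```d
// --- file: mylib/unit1.d ---
module mylib.unit1;

int one() { return 1; }

// --- file: mylib/unit2.d ---
module mylib.unit2;

int two() { return 2; }

// --- file: mylib/all.d ---
module mylib.all;

// A public import re-exports the imported module's symbols
// to every module that imports mylib.all.
public import mylib.unit1;
public import mylib.unit2;

// --- file: app.d ---
module app;

import mylib.all; // pulls in unit1 and unit2 through the public imports

void main()
{
    assert(one() + two() == 3);
}
```

The maintenance cost mentioned elsewhere in the thread is visible here: every new module in the package needs a matching `public import` line in all.d.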
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2011-12-25 04:05, zhang wrote:
 On Fri, 23 Dec 2011 02:21:37 -0000, Walter Bright
 <newshound2 digitalmars.com>  wrote:

 On 12/22/2011 11:25 AM, Piotr Szturmaj wrote:
 I wish D could support partial modules - partial as analogy to C#'s
 partial
 classes.

 module std.datetime-unit1;
 import std.datetime-unit2;
 // dash allowed only in submodules with the same module name
 ...

 module std.datetime-unit2;
 import std.datetime-unit1;
 ...

 // then

 module whatever;
 import std.datetime; // as usual

I have no idea why anyone would want this. (Is it because the file is too big to fit on a floppy disk?<g>)


As for big module, my solutions are:
1) put related modules into a package (or directory)
2) add a module named all.d into the directory, and this module will import all the other modules publicly
3) now just import the *all* module when needed

Here we have yet another example of someone who wants to use "import foo.*;". -- /Jacob Carlborg
Dec 25 2011
parent reply Jacob Carlborg <doob me.com> writes:
On 2011-12-25 15:31, Jakob Ovrum wrote:
 On Sunday, 25 December 2011 at 14:00:58 UTC, Jacob Carlborg wrote:
 On 2011-12-25 04:05, zhang wrote:
 On Fri, 23 Dec 2011 02:21:37 -0000, Walter Bright
 <newshound2 digitalmars.com> wrote:

 On 12/22/2011 11:25 AM, Piotr Szturmaj wrote:
 I wish D could support partial modules - partial as analogy to C#'s
 partial
 classes.

 module std.datetime-unit1;
 import std.datetime-unit2;
 // dash allowed only in submodules with the same module name
 ...

 module std.datetime-unit2;
 import std.datetime-unit1;
 ...

 // then

 module whatever;
 import std.datetime; // as usual

I have no idea why anyone would want this. (Is it because the file is too big to fit on a floppy disk?<g>)


As for big module, my solutions are:
1) put related modules into a package (or directory)
2) add a module named all.d into the directory, and this module will import all the other modules publicly
3) now just import the *all* module when needed

Here we have yet another example of someone who wants to use "import foo.*;".

I agree it would be nice to have a proper way to do this. Providing .all modules feels hacky and they need to be manually maintained - it also doesn't look as good.

The _really important thing_ to note here is, there are at least two kinds of D libraries when it comes to this issue. Some libraries, including Phobos, prefer putting a lot of stuff in one module rather than splitting it up into a package. But many other libraries prefer splitting a library over multiple modules and using the package system to tie them together.

Arguing over which approach is better comes down to a wide range of arguments, many quite subjective. I do not believe either approach is always better than the other. But I do believe it's important not to disregard the package approach, because it has advantages and adherents as well.

Right now, the language favours the single module approach. When packages are used instead, it's clunky to provide a convenient interface to the entire library, and it's not very intuitive to look for a ".all" module, which could be named anything.

I really like the idea of simply adding "import myPackage;", behaving like your average ".all" module.

Doing it this way solves at least three problems: No more clunky maintenance of convenience modules, big modules can later be split up into a package without breaking any client code, and we don't have to worry about a ".all" module for Phobos anymore (which is a suggestion that has been on the table several times).

I'm not sure if I like that syntax because you wouldn't be able to tell the difference between importing a package and a module. But perhaps that's the point. Otherwise I agree. -- /Jacob Carlborg
Dec 25 2011
parent Stewart Gordon <smjg_1998 yahoo.com> writes:
On 26/12/2011 12:51, zhang wrote:
<snip>
 The compiler should do this. A package is a directory, and a module just a
file.
 When importing a package, the compiler will import all the modules in the
package.
 The user doesn't care about this.

The user may well care if he/she is compiling someone else's project that imports the whole of some huge library despite using only a little bit of it, and this greatly increases the time it takes to compile, both because of the time it takes to load the modules and because of the larger symbol table to look through to resolve symbols as and when they are used.

Moreover, there may be modules that are intended primarily for a library's internal use, which would get imported and thereby clutter the symbol table.

Stewart.
Dec 26 2011
prev sibling parent zhang <bitworld qq.com> writes:
 As for big module, my solutions are:
 1) put related modules into a package (or directory)
 2) add a module named all.d into the directory, and this module will import
all the other modules publicly
 3) now just import the *all* module when needed

Here we have yet another example of someone who wants to use "import foo.*;".

---------- Zhang <bitworld qq.com>
Dec 26 2011
prev sibling parent reply "Jakob Ovrum" <jakobovrum gmail.com> writes:
On Sunday, 25 December 2011 at 14:00:58 UTC, Jacob Carlborg wrote:
 On 2011-12-25 04:05, zhang wrote:
 On Fri, 23 Dec 2011 02:21:37 -0000, Walter Bright
 <newshound2 digitalmars.com>  wrote:

 On 12/22/2011 11:25 AM, Piotr Szturmaj wrote:
 I wish D could support partial modules - partial as analogy 
 to C#'s
 partial
 classes.

 module std.datetime-unit1;
 import std.datetime-unit2;
 // dash allowed only in submodules with the same module name
 ...

 module std.datetime-unit2;
 import std.datetime-unit1;
 ...

 // then

 module whatever;
 import std.datetime; // as usual

I have no idea why anyone would want this. (Is it because the file is too big to fit on a floppy disk?<g>)


As for big module, my solutions are:
1) put related modules into a package (or directory)
2) add a module named all.d into the directory, and this module will import all the other modules publicly
3) now just import the *all* module when needed

Here we have yet another example of someone who wants to use "import foo.*;".

I agree it would be nice to have a proper way to do this. Providing .all modules feels hacky and they need to be manually maintained - it also doesn't look as good.

The _really important thing_ to note here is, there are at least two kinds of D libraries when it comes to this issue. Some libraries, including Phobos, prefer putting a lot of stuff in one module rather than splitting it up into a package. But many other libraries prefer splitting a library over multiple modules and using the package system to tie them together.

Arguing over which approach is better comes down to a wide range of arguments, many quite subjective. I do not believe either approach is always better than the other. But I do believe it's important not to disregard the package approach, because it has advantages and adherents as well.

Right now, the language favours the single module approach. When packages are used instead, it's clunky to provide a convenient interface to the entire library, and it's not very intuitive to look for a ".all" module, which could be named anything.

I really like the idea of simply adding "import myPackage;", behaving like your average ".all" module.

Doing it this way solves at least three problems: No more clunky maintenance of convenience modules, big modules can later be split up into a package without breaking any client code, and we don't have to worry about a ".all" module for Phobos anymore (which is a suggestion that has been on the table several times).
Dec 25 2011
parent zhang <bitworld qq.com> writes:
 I really like the idea of simply adding "import myPackage;", behaving
 like your average ".all" module.

 Doing it this way solves at least three problems: No more clunky
 maintenance of convenience modules, big modules can later be split up
 into a package without breaking any client code, and we don't have to
 worry about a ".all" module for Phobos anymore (which is a suggestion
 that has been on the table several times).

I'm not sure if I like that syntax because you wouldn't be able to tell the difference between importing a package and a module. But perhaps that's the point. Otherwise I agree.

The compiler should do this. A package is a directory, and a module just a file. When importing a package, the compiler will import all the modules in the package. The user doesn't care about this. ---------- Zhang <bitworld qq.com>
Dec 26 2011
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2011-12-22 18:36, Andrei Alexandrescu wrote:
 On 12/22/11 11:29 AM, Jacob Carlborg wrote:
 On 2011-12-22 17:50, Andrei Alexandrescu wrote:
 On 12/22/11 4:40 AM, Jacob Carlborg wrote:
 It would break code, but it's better to break it sooner than later.
 Seems more and more like D should support importing all modules in a
 package, that Java does:

 import foo.*;

I think this style is currently discouraged in Java. Andrei

Well, it seems like everyone wants it and there are several libraries that support "import foo.all;" to include the whole "foo" package, including some of my own.

The authors of Java apparently thought the same. It is _experience_ with the actual feature that turned out to be bad. Last time I used Eclipse it underlined with red the lines with .*. Andrei

Really? I have not seen that. I have Eclipse 3.7.0 installed and it doesn't indicate an error with "import package.*;". Latest Eclipse is 3.7.1. -- /Jacob Carlborg
Dec 22 2011
next sibling parent David Gileadi <gileadis NSPMgmail.com> writes:
On 12/22/11 2:32 PM, Jacob Carlborg wrote:
 Really? I have not seen that. I have Eclipse 3.7.0 installed and it
 doesn't indicate an error with "import package.*;". Latest Eclipse is
 3.7.1.

Mine doesn't either, but if you use Eclipse's awesome (and weakly named) Organize Imports feature (Ctrl+Shift+O), by default it replaces the .* with the specific imports. So I think Eclipse tends to discourage .* in general. It makes sense to me that it's discouraged in Java; when reading third-party code (on the web for instance) it can be very difficult to tell where a type came from if it was part of a .* import.
Dec 22 2011
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/22/11 3:32 PM, Jacob Carlborg wrote:
 On 2011-12-22 18:36, Andrei Alexandrescu wrote:
 On 12/22/11 11:29 AM, Jacob Carlborg wrote:
 On 2011-12-22 17:50, Andrei Alexandrescu wrote:
 On 12/22/11 4:40 AM, Jacob Carlborg wrote:
 It would break code, but it's better to break it sooner than later.
 Seems more and more like D should support importing all modules in a
 package, that Java does:

 import foo.*;

I think this style is currently discouraged in Java. Andrei

Well, it seems like everyone wants it and there are several libraries that support "import foo.all;" to include the whole "foo" package, including some of my own.

The authors of Java apparently thought the same. It is _experience_ with the actual feature that turned out to be bad. Last time I used Eclipse it underlined with red the lines with .*. Andrei

Really? I have not seen that. I have Eclipse 3.7.0 installed and it doesn't indicate an error with "import package.*;". Latest Eclipse is 3.7.1.

I stand corrected (it was a while ago and my memory is hazy). Found a related discussion at http://stackoverflow.com/questions/1983435/eclipse-java-is-it-harmful-to-import-java-namespace Andrei
Dec 22 2011
parent Jacob Carlborg <doob me.com> writes:
On 2011-12-22 22:58, Andrei Alexandrescu wrote:
 On 12/22/11 3:32 PM, Jacob Carlborg wrote:
 Really? I have not seen that. I have Eclipse 3.7.0 installed and it
 doesn't indicate an error with "import package.*;". Latest Eclipse is
 3.7.1.

I stand corrected (it was a while ago and my memory is hazy). Found a related discussion at http://stackoverflow.com/questions/1983435/eclipse-java-is-it-harmful-to-import-java-namespace Andrei

According to that the only problem seems to be the possibility of namespace collision. -- /Jacob Carlborg
Dec 22 2011
prev sibling parent Somedude <lovelydear mailmetrash.com> writes:
On 22/12/2011 22:32, Jacob Carlborg wrote:
 The authors of Java apparently thought the same. It is _experience_ with
 the actual feature that turned out to be bad. Last time I used Eclipse
 it underlined with red the lines with .*.


 Andrei

Really? I have not seen that. I have Eclipse 3.7.0 installed and it doesn't indicate an error with "import package.*;". Latest Eclipse is 3.7.1.

It used to be a warning, if I remember well. In the current version of Eclipse, all the * are automatically replaced by the full names when saving. That's how it behaves here, at least. And Andrei is correct about the fact that the star notation is considered sloppy, both in Java and in Python.
Dec 23 2011
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/22/2011 9:36 AM, Andrei Alexandrescu wrote:
 The authors of Java apparently thought the same. It is _experience_ with the
 actual feature that turned out to be bad. Last time I used Eclipse it underlined
 with red the lines with .*.

Checked exceptions is another feature that looks great on paper; it's only after years of use one discovers what a perniciously bad feature it is. C++11 dropped it, too.
Dec 22 2011
prev sibling parent Somedude <lovelydear mailmetrash.com> writes:
On 22/12/2011 18:36, Andrei Alexandrescu wrote:
 On 12/22/11 11:29 AM, Jacob Carlborg wrote:
 On 2011-12-22 17:50, Andrei Alexandrescu wrote:
 On 12/22/11 4:40 AM, Jacob Carlborg wrote:
 It would break code, but it's better to break it sooner than later.
 Seems more and more like D should support importing all modules in a
 package, that Java does:

 import foo.*;

I think this style is currently discouraged in Java. Andrei

Well, it seems like everyone wants it and there are several libraries that support "import foo.all;" to include the whole "foo" package, including some of my own.

The authors of Java apparently thought the same. It is _experience_ with the actual feature that turned out to be bad. Last time I used Eclipse it underlined with red the lines with .*. Andrei

Yes, this style *is* discouraged in Java. In Python as well. Modern IDEs like Eclipse do all the tedious work automatically for you, so it's no longer a pain to have the full list of imports.
Dec 23 2011
prev sibling parent reply Piotr Szturmaj <bncrbme jadamspam.pl> writes:
Walter Bright wrote:
 My first thought is that std.datetime is already very large. Few will
 need a custom date formatter, so it should be in a separate module to:

 1. reduce cognitive load on the programmer

 2. reduce the overhead pulled in for every program that may want to use
 an std.datetime function, but not need custom formatting

Why not just extract unittest code to separate module?
Dec 22 2011
parent Jacob Carlborg <doob me.com> writes:
On 2011-12-22 15:44, Piotr Szturmaj wrote:
 Walter Bright wrote:
 My first thought is that std.datetime is already very large. Few will
 need a custom date formatter, so it should be in a separate module to:

 1. reduce cognitive load on the programmer

 2. reduce the overhead pulled in for every program that may want to use
 an std.datetime function, but not need custom formatting

Why not just extract unittest code to separate module?

Exactly, that's what I've said several times. See my reply to Jonathan: http://dfeed.kimsufi.thecybershadow.net/discussion/post/jcv92h$1kus$1 digitalmars.com -- /Jacob Carlborg
Dec 22 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, December 21, 2011 19:57:28 Walter Bright wrote:
 My first thought is that std.datetime is already very large. Few will need a
 custom date formatter, so it should be in a separate module to:
 
 1. reduce cognitive load on the programmer
 
 2. reduce the overhead pulled in for every program that may want to use an
 std.datetime function, but not need custom formatting

It makes by far the most sense to put it on the types themselves IMHO (especially since all of the other string functions are that way), and the functions are templated, so the overhead is reduced if you don't use them. If we want to address the size of std.datetime, I believe that there are far better ways to do it. Breaking out the benchmarking stuff (which we're likely to do) would be one. Another would be to take the interval and range stuff and put it in a separate module. It uses the time point stuff, but doesn't need to be in the same module to do what it does. Yes, putting the custom formatting functions in increases the size of the module, but I think that if we want to do something about the size of std.datetime, it would make more sense to move some of its existing pieces out than to not put the custom time formatting on the types themselves. - Jonathan M Davis
Dec 21 2011
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Thursday, 22 December 2011 at 03:42:32 UTC, Jonathan M Davis 
wrote:
 http://jmdavis.github.com/d-programming-language.org/std_datetime.html

What is the purpose of %nyplus ? There should be presets for common standard date formats, like here: http://php.net/manual/en/class.datetime.php#datetime.constants.types
Dec 21 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, December 22, 2011 05:19:07 Vladimir Panteleev wrote:
 On Thursday, 22 December 2011 at 03:42:32 UTC, Jonathan M Davis
 
 wrote:
 http://jmdavis.github.com/d-programming-language.org/std_datetime.html

What is the purpose of %nyplus ?

It's what the ISO formats use. They put a + in front of the number if it's positive and exceeds 4 digits. %yplus would put the + there as long as the year is positive, whereas %4yplus puts it there if the year is > 9999.
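The rule described above can be sketched in a few lines of D. This is only an illustration under stated assumptions: the helper name isoStyleYear is hypothetical and is not part of std.datetime or of the proposal, it handles non-negative years only, and it models the combination of %4yplus followed by a zero-padded four-digit year.

```d
import std.format : format;

/// Hypothetical helper (not part of std.datetime): renders a non-negative
/// year padded to at least four digits, with a leading '+' only when the
/// year needs more than four digits, as the post describes for %4yplus.
string isoStyleYear(int year)
{
    // Assumption: negative years are out of scope for this sketch.
    assert(year >= 0, "this sketch handles non-negative years only");

    // '+' appears only when the year exceeds 9999.
    immutable sign = year > 9999 ? "+" : "";
    return sign ~ format("%04d", year);
}

unittest
{
    assert(isoStyleYear(2011) == "2011");   // four digits, no sign
    assert(isoStyleYear(33) == "0033");     // padded, no sign
    assert(isoStyleYear(12345) == "+12345"); // more than four digits
}
```

A plain %yplus, as described, would instead add the '+' for every positive year, which would amount to changing the condition to year > 0.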
 There should be presets for common standard date formats, like
 here:
 http://php.net/manual/en/class.datetime.php#datetime.constants.types

Those could be added, though I'd probably add them as additional flags. It already has %ctime and %mpeg7, but those are some good ones to add as well. - Jonathan M Davis
Dec 21 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, December 21, 2011 22:22:40 Walter Bright wrote:
 On 12/21/2011 8:12 PM, Jonathan M Davis wrote:
 Yes, putting the custom formatting functions in increases the size of
 the
 module, but I think that if we want to do something about the size of
 std.datetime, it would make more sense to move some of its existing
 pieces out than to not put the custom time formatting on the types
 themselves.

Phobos.

In general, I agree that that's a good policy. How expensive would you consider templated functions which aren't used to be with regards to that? They don't cost nothing, since they still have to be lexed and parsed, but they don't get fully compiled.
 Note that a module can be split into sub-modules without changing the
 interface for the user.

You mean create other modules that get publicly imported by one module? I do think that sections of std.datetime would benefit from that. - Jonathan M Davis
Dec 21 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, December 21, 2011 22:46:37 Walter Bright wrote:
 On 12/21/2011 10:36 PM, Jonathan M Davis wrote:
 In general, I agree that that's a good policy. How expensive would you
 consider templated functions which aren't used to be with regards to
 that? They don't cost nothing, since they still have to be lexed and
 parsed, but they don't get fully compiled.

The test is to import the module without referencing any functions in it. Check if the resulting executable increases in size.

Isn't that a compiler and/or linker issue? I mean, if _nothing_ is referenced in the module, then shouldn't it never pull anything in and therefore never increase the size of the executable? - Jonathan M Davis
Dec 21 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, December 22, 2011 08:34:23 Jacob Carlborg wrote:
 On 2011-12-22 04:57, Walter Bright wrote:
 My first thought is that std.datetime is already very large. Few will
 need a custom date formatter, so it should be in a separate module to:
 
 1. reduce cognitive load on the programmer
 
 2. reduce the overhead pulled in for every program that may want to use
 an std.datetime function, but not need custom formatting

As I've said several times, std.datetime is way too large and should be a package.

That's the way that it was originally proposed (albeit split in a poor manner), and a number of people (particularly Phobos devs) were against it, and have been any time that I've suggested splitting out part of it. The fact that Phobos has historically shunned sub-packages probably has a lot to do with that (though we're starting to have some). To make it a package at this point would probably break a fair bit of code unless it were a new package which std.datetime publicly imported, so simply making std.datetime a package isn't necessarily a great idea, and there's a definite benefit IMHO in being able to just import it all with one import. But it's not necessarily unreasonable to create a new package which holds the various pieces and have std.datetime import them all. - Jonathan M Davis
Dec 21 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, December 22, 2011 00:05:58 Walter Bright wrote:
 We must deal with our imperfect tools as they are. For example, I am not
 going to rewrite the Linux ld linker, and the OSX linker, and the FreeBSD
 linker, etc.

If the compiler and/or linker don't strip unused symbols, then how on earth is importing the module _not_ going to pull in everything in it save for uninitialized templates? - Jonathan M Davis
Dec 22 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, December 22, 2011 00:14:21 Jonathan M Davis wrote:
 On Thursday, December 22, 2011 00:05:58 Walter Bright wrote:
 We must deal with our imperfect tools as they are. For example, I am not
 going to rewrite the Linux ld linker, and the OSX linker, and the
 FreeBSD
 linker, etc.

If the compiler and/or linker don't strip unused symbols, then how on earth is importing the module _not_ going to pull in everything in it save for uninitialized templates?

Or rather, _uninstantiated_ templates. - Jonathan M Davis
Dec 22 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, December 22, 2011 00:54:49 Walter Bright wrote:
 The most important thing is to design the boxes around the various units of
 functionality such that there are minimal lines between those boxes. For
 example, if I want to see if one file is newer than another (like for a make
 program), it should not pull in timezone processing or date formatting
 code.

Both of those are part of SysTime, and the time zone in particular is an integral part of that. You can't do _anything_ with a SysTime without a time zone. That's part of the point of the design. It avoids time conversion issues. Sure, if stuff was rearranged, only TimeZone and LocalTime would have to be pulled in (since LocalTime is the default and TimeZone is its base class), but they have to be there regardless. So, in principle, reducing how much has to be pulled in is very much desirable, but in this case, it doesn't make sense. The time zone stuff is required, and there's a usability issue if you split out the date formatting. It's already built into the type. It's just the custom formatting which isn't there yet, but that's going to be templated, so it shouldn't pull in any additional symbols unless it's used. - Jonathan M Davis
Dec 22 2011
prev sibling next sibling parent Kapps <Kapps NotValidEmail.com> writes:
On 21/12/2011 9:41 PM, Jonathan M Davis wrote:
 Okay. At the moment, the time point types in std.datetime have functions for
 converting to and from strings of standard formats but not custom formats, so
 functions for that need to be added. I've come up with a proposal for how
 they're going to work and would like some feedback on it.

 Originally, I was going to make them work like strftime and strptime, since it
 was my understanding that those functions were fairly standard among various
 programming languages. And it _does_ look like a variety of programming
 languages have something similar (Java, Ruby, Python, etc.), but the exact set
 of flags that they use is not standard, so there _isn't_ really a standard to
 follow, just similar functions across a variety of programming languages. And
 honestly, strftime and strptime aren't very good. They're fairly limited IMHO,
 and the choice of flags is fairly arbitrary, so it seems like a good idea to

Every language seems to force you to memorize a bunch of format strings for different toString functions, which is quite annoying. I highly doubt I'm the only person who has to look up the format strings every single time I want to use a custom toString (besides an obvious one like YYYY). It would be nice if D did better in this regard. Something taking advantage of enums, such as 'time.toCustomString!(Year, ", ", LongHour, ":", LongMinute, " ", AMPM)'. The problem with that particular example is that it's... quite ugly. And bloated. Perhaps there would be a way to make it not quite so bloated, however, while still retaining the simplicity of compile-time enums. Perhaps this doesn't matter so much in D though, given that template arguments are passed in and thus can be evaluated at compile-time. I definitely like the current approach shown in docs, but it's still painful to try and figure out something like assert(st.toCustomString!"%4yplus%04+Y-%emon-%02D%02H:%02m:%02s%f%tz"() == "2010-Jul-04 07:06:12").
Dec 22 2011
prev sibling next sibling parent Somedude <lovelydear mailmetrash.com> writes:
On 22/12/2011 04:41, Jonathan M Davis wrote:
 
 So, what do you think?
 
 - Jonathan M Davis

Honestly? The simpler, the better. I've hardly ever seen anyone complain about the primitive formatting functions available in other languages. It's not like anyone wants something super sophisticated when doing date formatting. Experience tells me that for such mundane tasks, what you want is:
- something that does what you want simply
- something fast

And that's about it. In fact, I am not even sure that custom formats are useful, when standard ones are perfectly suited for the task.
Dec 22 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, December 22, 2011 11:51:55 Somedude wrote:
 And that's about it. In fact, I am not even sure that custom formats are
 useful, when standard ones are perfectly suited for the task.

The standard ones are there, and I'd definitely recommend using them in the general case, but some people do require custom formatting, so the functions are needed. But yes, if you don't need them, then don't use them. The standard formats are standard for a reason. - Jonathan M Davis
Dec 22 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, December 22, 2011 02:12:31 Walter Bright wrote:
 Timezone information is not necessary to measure elapsed time or relative
 time, for example.

The type requires it though. No, comparison doesn't require the time zone, but many (most?) of the other operations do. And the type can't be separated from the time zone. That's part of the whole point of how SysTime is designed. It holds its time internally in UTC and then uses the time zone to adjust the time whenever a property or other function is used which requires the time in that time zone. That way, you avoid all of the issues and bugs that result from converting the time. The cost of that is that you can't not have a time zone and use SysTime. So, if someone cares about saving that little bit of extra size in their executable by not using the time zone, they're going to have to use the C functions or design their own time code.
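The design point above (one internal UTC value, with the time zone applied only when a property is read) can be seen with the existing SysTime API. This is a minimal sketch using calls that exist in current Phobos (SysTime's constructor, UTC(), toLocalTime, stdTime); the chosen date is arbitrary.

```d
import std.datetime : DateTime, SysTime, UTC;

unittest
{
    // A point in time, given as a wall-clock reading in UTC.
    auto utcTime = SysTime(DateTime(2011, 12, 22, 12, 0, 0), UTC());

    // The same instant, viewed through the machine's local time zone.
    auto localTime = utcTime.toLocalTime();

    // The internal value (hnsecs since midnight, January 1st, 1 A.D. UTC)
    // is identical; only the attached time zone differs.
    assert(utcTime.stdTime == localTime.stdTime);

    // Comparison works on the internal UTC value, so the two are equal
    // regardless of what the local time zone happens to be.
    assert(utcTime == localTime);

    // Properties are computed through the time zone, so the UTC view
    // reads back the hour it was constructed with.
    assert(utcTime.hour == 12);
}
```

Reading localTime.hour would go through LocalTime instead, which is why the type cannot be used without some TimeZone being present.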
 Can it be added with PIMPL? PIMPL is good for more than just information
 hiding, it can also be a fine way to avoid pulling in things unless they
 are actually dynamically used (as opposed to pulling them in if they are
 statically referenced).

I don't follow you. You mean use PIMPL for the time zone? I haven't a clue how you're going to do PIMPL without .di files, and Phobos doesn't use .di files (and arguably _can't_, because it would destroy inlining and CTFE). Not to mention, PIMPL would make SysTime less efficient, because in order to avoid needing to know what the functions are on a TimeZone, you'd have to hide the bodies of a number of SysTime's functions, which would disallow inlining and CTFE (not that CTFE is terribly likely to be used with SysTime, but inlining could be very important). You'd be losing efficiency of execution just to save a few KB in the executable. Rearranging stuff to save some size in the executable without costing efficiency is one thing, but if it's going to cost efficiency, then I'm generally going to be against it.
 Please check and see if additional symbols are pulled in or not. I've seen a
 lot of template code where the author of that code never checked and was
 surprised at the instantiations that were happening that he overlooked.

Nothing would pull in toCustomString or fromCustomString unless the user decided to call them, because they're templated and no other functions in Phobos are going to use them at this point (if ever). What exactly will be pulled in when they _are_ called, I don't know, because they're not completed yet. It probably wouldn't be much though, since toCustomString is basically a fancy toString. But there's a good chance that it's stuff that's already being pulled in for the standard string functions (e.g. toISOExtString). - Jonathan M Davis
Dec 22 2011
prev sibling next sibling parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
On 22/12/2011 03:41, Jonathan M Davis wrote:
<snip>
 Stewart Gordon has a library that takes a different approach (
 http://pr.stewartsplace.org.uk/d/sutil/datetime_format.html ). It does away
 with % flags and uses maximal munch with each of the flags being named such that
 they don't overlap in a way that would make certain combinations of flags
 impossible.

If you mean such things as writing a datum twice consecutively in two different formats, it can be done using an empty literal. For example, "Mmm''m" would today generate "Dec12". Not that I can see any use for such a format, just showing that it can be done.
 It then requires that characters which are not part of the flags be
 surrounded by single quotes.

Wrong. It requires _letters_ that aren't flags to be literalised, and that can be done either by surrounding with '...' or by prefixing with `. Other characters that have no meaning are automatically literal.
 It's an interesting approach, but it isn't as
 flexible as it could be because of its use of maximal munch instead of % flags.

How do you mean?
 So, I've come up with something new which tries to take the best of both. On
 the whole, I think that it's fairly straightforward, and the flags are
 generally recognizable and memorable (though there are a lot). It's also
 definitely extremely flexible (e.g. you can pass it functions to generate
 portions of the string if the existing flags don't get you quite what you
 need). But I'd like some feedback on it before I spend a lot of time on the
 implementation.

 This page has the docs for std.datetime with everything else but the proposed
 custom formatting functions for SysTime stripped out of it:

 http://jmdavis.github.com/d-programming-language.org/std_datetime.html

Looks complicated compared to mine at first sight. Maybe I just need to spend a bit of time looking at it in more detail. Stewart.
Dec 22 2011
next sibling parent David Gileadi <gileadis NSPMgmail.com> writes:
On 12/22/11 4:20 AM, Stewart Gordon wrote:
 Looks complicated compared to mine at first sight.

++ I strongly prefer the look of Stewart's formatting strings; at a glance it is much easier for me to understand "yyyymmdd`THHiisszzz" than "%4yplus%04+Y%02M%02D{T}%02H%02m%02s%f%tz".
Dec 22 2011
prev sibling parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
On 23/12/2011 04:05, Jonathan M Davis wrote:
<snip>
 I mean that you have to be way more careful about how you name the flags. For
 instance, if you have

 MMM

 and

 Month

 you have issues with stuff like MMMonth.

To quote directly from the spec of my scheme: "Each specifier is a letter, or two or more of the same letter consecutively (picked out by maximal munch before lookup in the following table)." So this is a non-issue.
 It can definitely work, but the more
 flags that you have, the more problematic it becomes. It's also easier to
 separate out consecutive flags when reading them if you have %.

Here's the little bit of code in my library that finds the end of a flag:

    char letter = cast(char) std.ctype.tolower(*charPos);
    CPtr!(char) beginSpec = charPos;
    do {
        ++charPos;
    } while (std.ctype.tolower(*charPos) == letter);

Seems to me pretty straightforward.
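For readers who want to try the same-letter maximal munch rule in isolation, here is a minimal standalone sketch. It is not the library's code: the function name splitSpecifiers is hypothetical, it uses std.ascii from current Phobos rather than std.ctype, it works on array slices instead of pointers, and it deliberately ignores the '...' and ` literal quoting that the real scheme supports.

```d
import std.ascii : isAlpha, toLower;

/// Hypothetical helper: splits a format string into tokens. A run of the
/// same letter (compared case-insensitively) becomes one token; every
/// other character becomes a one-character token. Quoting is not handled.
string[] splitSpecifiers(string fmt)
{
    string[] tokens;
    size_t i = 0;
    while (i < fmt.length)
    {
        size_t j = i + 1;
        if (isAlpha(fmt[i]))
        {
            immutable letter = toLower(fmt[i]);
            // Maximal munch: keep consuming while the same letter repeats.
            while (j < fmt.length && toLower(fmt[j]) == letter)
                ++j;
        }
        tokens ~= fmt[i .. j];
        i = j;
    }
    return tokens;
}

unittest
{
    assert(splitSpecifiers("yyyy-mm-dd") == ["yyyy", "-", "mm", "-", "dd"]);
    // Only runs of one letter are grouped, so "MMMonth" is not ambiguous.
    assert(splitSpecifiers("MMMonth") == ["MMM", "o", "n", "t", "h"]);
}
```

Because a specifier can only be a run of one letter, there is never a choice between a short flag and a longer flag that starts with the same characters.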
 It's an interesting approach, but it isn't as
 flexible as it could be because of its use of maximal munch instead of %
 flags.


It's harder to have modifiers for flags without the %. For instance, what I'm doing with filler characters would be much more difficult with your scheme. With % delineating the flags, it becomes easier to handle that sort of thing.

What are you doing with filler characters? It appears to me, just allowing space or 0 as a filler character in some flags. In my system, alignment fields provide a more powerful way of doing the same thing. Stewart.
Dec 23 2011
parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
On 23/12/2011 11:23, Jonathan M Davis wrote:
<snip>
 So, your flags are even more restrictive than I understood them to be. You
 can't have multiple-character flags unless they're the same letter. I thought
 that you could. That _does_ avoid the problem that I was describing, but the
 result is too limiting IMHO. It certainly makes it hard to add flags for
 specific formats similar to %ctime or %mpeg7.

What is the use case for including a full date in some externally defined format within a longer formatted date string? ISTM the way to do this is to define a function that just generates this format straight off. In my library, toShortFormatString and toLongFormatString are already examples of this.
 It also means that you can't reuse letters.

At the moment only 12 letters are used. I can't see the whole alphabet being used up any time in the foreseeable future. <snip>
 It's possible that your scheme is somewhat easier to use for the most basic
 cases, but as soon as more specific and/or complicated schemes are needed (e.g.
 mpeg-7 or any of the ISO schemes), I don't think it works as well in general.

I think I could expand my scheme to include ISO signed-year notation easily enough. In the mpeg-7 standard, does the denominator of the fractional second have to be the smallest possible power of 10, or is F20/1000 or F0/1000 allowed just as well? Stewart.
Dec 23 2011
parent Stewart Gordon <smjg_1998 yahoo.com> writes:
On 23/12/2011 21:24, Jonathan M Davis wrote:
<snip>
 It's not that I want to put %ctime and %mpeg7 in the same string. It's that I
 don't want to have to go define a function for every single one of them. It's a
 much smaller hit to the API to have flags for them.

So this is the principle of API design you go by - aim for one function that does everything? :) OK, so if there's room for it in a given format string scheme, it doesn't really do any harm to have it. But there's no point convoluting a scheme just to make room for named full-timestamp formats.
 Another alternative is to supply an enum of format strings for a variety of
 formats, though that also expands the API a bit (not as badly though).

This seems to me a good idea. It avoids implementation bloat, and these symbolic constants can be used just as they are or concatenated into the format string if you want to put something else there as well. <snip>
 I don't want to shell out money for the spec if I don't have to. Even if I
 were willing to buy it, it's sold in 12 parts, and I'd probably end up buying
 several parts before I found the one with the definition of its time format,
 wasting that much more money. So, I'll have to go digging online again.

I've just discovered it on the ISO website. It does seem an extortionate price for a spec. Any idea where the money goes? Stewart.
Dec 24 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, December 22, 2011 12:01:31 Somedude wrote:
 On 22/12/2011 11:40, Jacob Carlborg wrote:
 Yeah, I don't get this. Most modules in Phobos are too large, in my
 opinion.

It largely is a matter of taste, I think. There are advantages in minimizing the size of files but there are also advantages in minimizing the number of files. But for datetime.d, it has largely gone beyond my own point of acceptability (which is about 5,000 lines, if that means anything).

Well, a large portion of the file is documentation and unit tests, and the number of lines that the unit tests take up should go down as I refactor them (which I've done some of, but I've still got a long way to go), but it's never going to be anywhere near as small as 5,000 lines. SysTime alone is over 5,000 lines (though again, much of that is documentation and unit tests). But ultimately, I think that whether a module is too large or not is a function of its API rather than the amount of source code. It's a question of how digestible the documentation is. And by that count, std.datetime is still quite large, but it's a very different measurement. - Jonathan M Davis
Dec 22 2011
prev sibling next sibling parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
On 22/12/2011 03:41, Jonathan M Davis wrote:
<snip>
 http://jmdavis.github.com/d-programming-language.org/std_datetime.html

Have I got all this right?
- a flag goes from % up to another %, a {, a whitespace or punctuation character
- flags with [...] portions are exceptions to this rule, extending to the closing ]
- %{ or %} is a literal { or }
- all literal characters that aren't classed as whitespace or punctuation must be enclosed in {...}
- {%} is just a literal %
- in %C2* and %Cn* flags, C must be either _ (space) or 0

To look at the detail:
- In %nY, does "up to n" mean that it truncates years longer than that to that many characters? Does it strip any leading 0s that result from this truncation? (Under %3Y, does 2011 become 11 or 011?)
- But the approach to formatting the year is nicely systematic. (I've thought of possibly adding to my scheme a means of formatting dates to arbitrary length with sign, in order to support the ISO format for years that may be outside the 1BC-9999 range.)
- If you're going to have the ISO week number in the system, it seems to me you should also have the week-numbering year. (I've thought about possibly adding these to my scheme.)
- In %F, what does "as many digits as necessary" mean? In particular, why in the example code does it give 12/100000 and not 3/25000 despite the latter being shorter?
- Can only literal stuff be included within %BC[...] and %AD[...]?
- %cond and %func - I'm made to wonder to what extent this would be used and to what extent it would just be simpler to use if / ?: / ~. Anyway, I assume that if there's more than one, they will reference the template arguments in order. Can %cond contain other %*[...] flags? Indeed, can %cond's be nested to arbitrary depth?
- Does %localeDate give the short or the long date format? (This and %localeTime are handled by separate functions in my library - I didn't think there was any real use case for including such a thing within a longer formatted date/time string.)

On the whole, it seems a powerful system. The format strings can get quite obfuscated, but at least the flags that are likely to be commonly used aren't too bad. It seems strange to require explicit notations both for flags and for alphanumeric literals. But I suppose it makes the literals easier to read than having to follow where all those % signs are.

One thing I noticed doesn't appear in your scheme is ordinal suffix. OK, so arbitrary alignment fields and collapsible portions aren't to be seen either, but those are quite rare in format string schemes anyway. Stewart.
Dec 22 2011
parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
On 23/12/2011 03:55, Jonathan M Davis wrote:
<snip>
 - If you're going to have the ISO week number in the system, it seems to me
 you should also have the week-numbering year.  (I've thought about possibly
 adding these to my scheme.)

%isoweek and %C2isoweek

What are you saying? That the documentation is wrong, and %isoweek emits a year, not a week?
 - In %F, What does "as many digits as necessary" mean?  In particular, why
 in the example code does it give 12/100000 and not 3/25000 despite the
 latter being shorter?

Obviously, that needs to be clearer. The denominator is always a multiple of 10. It's what the mpeg-7 standard uses, which is why it's there.

What's that to do with it? 25000 _is_ a multiple of 10. And 3/25000 contains 6 digits, compared to 12/100000's 8. So the spec reads to the effect that %F should generate 3/25000 in that example. Stewart.
Dec 23 2011
parent Stewart Gordon <smjg_1998 yahoo.com> writes:
On 23/12/2011 11:04, Jonathan M Davis wrote:
 On Friday, December 23, 2011 10:17:00 Stewart Gordon wrote:
 On 23/12/2011 03:55, Jonathan M Davis wrote:
 <snip>

 - If you're going to have the ISO week number in the system, it seems
 to me you should also have the week-numbering year.  (I've thought
 about possibly adding these to my scheme.)

%isoweek and %C2isoweek

What are you saying? That the documentation is wrong, and %isoweek emits a year, not a week?

%isoweek emits a week. If it's not the week that you mean, then what are you talking about? I've obviously misunderstood.

I said "week-numbering year". How can that phrase mean a kind of week? The week-numbering year is the year to which the ISO week belongs. Most of the time this corresponds to the actual year, but around year boundaries it sometimes differs by 1. For example, week 52 of 2011 goes from 2011-12-26 to 2012-01-01 inclusive. The last of these dates isn't in the calendar year 2011, but it is still in the ISO week 2011-W52, so the week-numbering year is 2011. Stewart.
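A minimal D sketch of the week-numbering year described above. It assumes the usual ISO 8601 rule that a week belongs to the year containing its Thursday; `isoWeekNumberingYear` is a hypothetical helper name, not something taken from std.datetime, while `Date.isoWeek` and `Date.dayOfWeek` are existing properties.

```d
import std.datetime : Date, DayOfWeek;
import core.time : days;

/// Hypothetical helper: the ISO 8601 week-numbering year is the calendar
/// year containing the Thursday of the date's ISO week (Monday to Sunday).
int isoWeekNumberingYear(Date d)
{
    // Map D's DayOfWeek (sun = 0 .. sat = 6) to ISO numbering (Mon = 1 .. Sun = 7).
    immutable int isoDay = d.dayOfWeek == DayOfWeek.sun ? 7 : cast(int) d.dayOfWeek;
    // Move to the Thursday of the same ISO week and read its calendar year.
    immutable Date thursday = d + days(4 - isoDay);
    return thursday.year;
}

void main()
{
    // 2012-01-01 is a Sunday, the last day of ISO week 2011-W52.
    auto d = Date(2012, 1, 1);
    assert(d.year == 2012);
    assert(d.isoWeek == 52);
    assert(isoWeekNumberingYear(d) == 2011);

    // 2011-12-26 (a Monday) opens the same week.
    assert(isoWeekNumberingYear(Date(2011, 12, 26)) == 2011);
}
```

The point of the example is that the week number alone is ambiguous around year boundaries: week 52 paired with the calendar year 2012 would name a different week from 2011-W52.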
Dec 23 2011
prev sibling next sibling parent "Kagamin" <spam here.lot> writes:
 If the compiler and/or linker don't strip unused symbols, then 
 how on earth is
 importing the module _not_ going to pull in everything in it 
 save for
 uninitialized templates?

There is data that needs to be there but is never symbolically referenced, for example, the exception handler tables. The compiler does the best it can, like emitting one object file per function, but it must always behave conservatively.

AFAIK, dmd used to remove unused modules but then bearophile complained that unittests are not run for imported but otherwise unused modules, so it looks like frontend's responsibility, not backend's.
Dec 22 2011
prev sibling next sibling parent "Martin Nowak" <dawg dawgfoto.de> writes:
On Thu, 22 Dec 2011 12:03:13 +0100, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 On Thursday, December 22, 2011 02:12:31 Walter Bright wrote:
 Timezone information is not necessary to measure elapsed time or  
 relative
 time, for example.

The type requires it though. No, comparison doesn't require the time zone, but many (most?) of the other operations do. And the type can't be separated from the time zone. That's part of the whole point of how SysTime is designed. It holds its time internally in UTC and then uses the time zone to adjust the time whenever a property or other function is used which requires the time in that time zone. That way, you avoid all of the issues and bugs that result from converting the time. The cost of that is that you can't not have a time zone and use SysTime. So, if someone cares about saving that little bit of extra size in their executable by not using the time zone, they're going to have to use the C functions or design their own time code.
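A small sketch of the design described above, using only std.datetime calls that exist today (`SysTime`, `UTC`, `toLocalTime`, `stdTime`). The specific date is arbitrary; the point is that changing the time zone changes how the value is presented, not the value that is stored.

```d
import std.datetime : DateTime, SysTime, UTC;

void main()
{
    // One instant, held internally as a UTC-based count of hnsecs.
    auto atUtc = SysTime(DateTime(2011, 12, 22, 12, 0, 0), UTC());

    // The same instant viewed through the machine's local time zone.
    auto atLocal = atUtc.toLocalTime();

    // The stored value is untouched by the change of time zone...
    assert(atLocal.stdTime == atUtc.stdTime);
    // ...so comparison never needs to consult the zone.
    assert(atLocal == atUtc);

    // Calendar properties such as hour are computed on demand from the
    // zone, so atLocal.hour may differ depending on the local UTC offset.
    assert(atUtc.hour == 12);
}
```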
 Can it be added with PIMPL? PIMPL is good for more than just information
 hiding, it can also be a fine way to avoid pulling in things unless they
 are actually dynamically used (as opposed to pulling them in if they are
 statically referenced).

I don't follow you. You mean use PIMPL for the time zone? I haven't a clue how you're going to do PIMPL without .di files, and Phobos doesn't use .di files (and arguably _can't_, because it would destroy inlining and CTFE). Not to mention, PIMPL would make SysTime less efficient, because in order to avoid needing to know what the functions are on a TimeZone, you'd have to hide the bodies of a number of SysTime's functions, which would disallow inlining and CTFE (not that CTFE is terribly likely to be used with SysTime, but inlining could be very important). You'd be losing efficiency of execution just to save a few KB in the executable. Rearranging stuff to save some size in the executable without costing efficiency is one thing, but if it's going to cost efficiency, then I'm generally going to be against it.

https://github.com/dawgfoto/druntime/commit/4626c20b66a647a497b9086b104e10ea89cfef02

The compiler is not really your friend here, e.g. it requires you to compile the implementation separately. But if your type is already used by reference, then this is a good option to reduce coupling and improve ABI stability. Are you sure that inlining is a performance issue?
 Please check and see if additional symbols are pulled in or not. I've  
 seen a
 lot of template code where the author of that code never checked and was
 surprised at the instantiations that were happening that he overlooked.

Nothing would pull in toCustomString or fromCustomString unless the user decided to call them, because they're templated and no other functions in Phobos are going to use them at this point (if ever). What exactly will be pulled in when they _are_ called, I don't know, because they're not completed yet. It probably wouldn't be much though, since toCustomString is basically a fancy toString. But there's a good chance that it's stuff that's already being pulled in for the standard string functions (e.g. toISOExtString). - Jonathan M Davis

Dec 22 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, December 22, 2011 15:44:48 Piotr Szturmaj wrote:
 Walter Bright wrote:
 My first thought is that std.datetime is already very large. Few will
 need a custom date formatter, so it should be in a separate module to:
 
 1. reduce cognitive load on the programmer
 
 2. reduce the overhead pulled in for every program that may want to use
 an std.datetime function, but not need custom formatting

Why not just extract unittest code to separate module?

Because it's harder to maintain that way. The interval types are templated, so their unit test blocks can't be next to the functions that they're testing, and it's a royal pain to manage those tests in comparison to the others. It's _really_ nice to have the tests right next to the functions that they're testing. - Jonathan M Davis
Dec 22 2011
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, December 22, 2011 10:59:54 Andrei Alexandrescu wrote:
 This. YES. A liability of the current std.datetime is that it assumes
 that unittest code is exempt from the rules that apply to regular code.

I still don't agree with this at all, but there's no point in rehashing those arguments. I agreed to refactor the unit tests, and I've refactored some of them based on those complaints. I just haven't finished the job yet.
 I am increasingly worried about that module. It has been argued that its
 sheer size is not a problem, but somehow the task of accounting for that
 has taken a life of its own - e.g. we can't test std.datetime like
 everything else in Phobos, it needs its own version.

Any time that breaking it up has come up, it's been voted down. I don't personally find the size to be a maintainability issue at all, but it _is_ an issue as far as the documentation goes, since it makes the module harder to understand in spite of the fact that the basic design is fairly simple. Also, it's caused issues for Don when reducing compiler bugs, because he needs to strip out the unit tests, and there are a lot (and there still will be even if they're refactored, just because there's a lot to test). That's the main reason that the version identifier for the tests is still there. I'm not against breaking up std.datetime as long as it's broken into an actual package. Pieces of it can be broken out without that (e.g. the interval and range stuff), but I'm not sure that that would really reduce the size of the module enough to really fix the complaints on that front. - Jonathan M Davis
Dec 22 2011
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, December 22, 2011 18:33:46 Jacob Carlborg wrote:
 That doesn't sound right. If std.datetime can't be tested like the rest
 of Phobos there's something quite seriously wrong with it.

It can be, and it is. Previously, there were issues compiling it on Windows which caused the compiler to run out of memory. So, I added a version identifier for the unit tests which was disabled on Windows. Those issues have been fixed, so the version identifier is now always enabled. I left it in, because it made Don's life easier when he was trying to reduce compiler bugs (since, without the unit tests, less gets pulled in, and less of std.datetime gets instantiated). The version identifier could be removed, but the ease of disabling the unit tests if necessary merited leaving it in. - Jonathan M Davis
Dec 22 2011
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, December 22, 2011 13:27:47 Andrei Alexandrescu wrote:
 I think there's a lot of mileage in the 1:1 correspondence between files
 and modules, and between directories and packages. We should keep it
 that way.

Yes. And if we want to split up modules, publicly importing allows you to have a single place to import them all, which gives you more or less the same effect as if modules could be split into multiple files except that there's still a 1:1 correspondence between modules and files. So, I think that D's features do a fine job in that regard. - Jonathan M Davis
Dec 22 2011
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, December 22, 2011 10:26:27 Walter Bright wrote:
 PIMPL means you have an opaque pointer to something. It can be a function 
 pointer, or a data pointer. It gets filled in at runtime. It has nothing to
 do with .di files.

Well, I have no idea how you'd do that in this case without hiding SysTime's implementation, since it has to call TimeZone's functions and therefore needs to know that they exist. They're polymorphic, so the exact type of the TimeZone could be unknown and unseen by SysTime, but it has to know about TimeZone. And if you hid the function bodies in an effort to make TimeZone's usage opaque, then you couldn't inline those functions anymore, and I would consider the efficiency of the functions to be more important than trying to avoid pulling in the TimeZone class just to avoid a few KB in the executable.

On Thursday, December 22, 2011 10:28:52 Walter Bright wrote:
 The time zone info can be lazily initialized only by those operations that
 need a time zone.

I don't think that that would really buy you anything. SysTime is default-initialized to use LocalTime, which is a singleton, so it's not like you're allocating a new TimeZone every time that you create or use a SysTime. Currently, the singleton is initialized by a static constructor, but that's going to be changed to be lazily initialized (which should get rid of the static constructors and their cost). So, there _is_ still some cost on the _first_ SysTime that gets created in the program, but after that, there isn't really. And doing a lazy initialization of the TimeZone within the SysTime in the case where the programmer does not specify a TimeZone would just increase the cost of most of SysTime's functions, since most of them would have to be checking whether the TimeZone had been initialized or not. With the singleton, such a check only occurs when the SysTime is created. And at some point, the singleton will probably be changed to use emplace, which will allow it to completely avoid the GC heap, which will make the singleton cost that much less. So, the cost of the time zone from the perspective of execution speed is minimal. It sounds like it's just the fact that using a class increases the symbols in the executable that's the problem. - Jonathan M Davis
Dec 22 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, December 22, 2011 13:38:51 Andrei Alexandrescu wrote:
 On 12/22/11 1:32 PM, Jonathan M Davis wrote:
 [snip]
 
 Now that we got to talk about std.datetime, here are three things that I
 think we could do to make it more manageable.
 
 1. Put files in data. I find it a tad awkward that we have time zone
 information in hardcoded strings inside the code. That means any such
 change would have us redistribute Phobos. I'm thinking a small data
 file would be more appropriate. Better yet, hook into OSs timezone
 information and let the OS worry about keeping that timely.

The only reason there are any hard-coded time zone names is that they're required to convert between the names used by Posix and those used by Windows. So, you _can't_ hook into the OS information and get them. Now, conceivably, you could move that information into a file and then parse the file when it's needed. That would obviously be less efficient, but creating a WindowsTimeZone or PosixTimeZone (which is what they'd most frequently be needed for) isn't exactly terribly efficient to begin with, since you have to read in the time zone information from the disk or from the registry (which is probably on disk). So, that's not unreasonable.
 2. datetime == time + date. We could reduce std.datetime to "public
 import std.time, std.date;" and define:
 
 (a) std.time -> everything having to do with sheer time information, no
 date-related oddities. That means the largest formalized interval would
 be the week.
 
 (b) std.date -> all of the bizarre calendar stuff, dealing with months
 and more. Naturally std.date would use std.time.

Well, the only time point type which only deals with time and not dates is TimeOfDay. Date, DateTime, and SysTime all deal with dates. You can't really get away from dealing with dates once your type holds more than 24 hours worth of time unless it's a duration as opposed to a time point. So, I really don't think that trying to split std.datetime into std.date and std.time makes much sense.

A better division would be to put SysTime in a module and TimeOfDay, Date, and DateTime in another. SysTime deals with the system time, has a time zone, and is intended for use with stuff which isn't calendar-based (timestamps and file times being good examples - anything where you need the absolute time when it occurred), whereas the others don't have a time zone and therefore _are_ calendar-based. However, they all share common code, so they'd either need to duplicate that code or any modules that they're split up into need to be in the same package. They _could_ both be sitting in std directly, but that would give package access to completely unrelated functions. It's also possible that we'll have more date and/or time related modules in the future (for instance, having one for handling date-recurrence patterns would be nice), and if that occurs, it makes that much more sense to use a sub-package rather than std.

If we're splitting it up, there's also the question of how far we want to split it up. In the extreme case, we could put every struct and class in its own module, though that's going too far IMHO. But we're probably going to want to go farther than just splitting it in two. In addition to the benchmarking functionality, I'd like to see the time interval and range functionality in a separate module, and there's the question of whether the time zone stuff should be in its own module - though there's not much point to the time zones without SysTime, so I'm not sure whether that's really valuable.

In any case, if we keep std.datetime and have it publicly import the other modules, we can split it up more or less however we like, but having a sub-package would make the most sense IMHO. My original std.datetime proposal was that way, but it was split badly, and we didn't really have any sub-package stuff in Phobos beyond the C stuff and Windows stuff at the time, but that has been slowly changing. - Jonathan M Davis
Dec 22 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
Okay. Assuming that I'm going to try and make TimeZone opaque within SysTime, 
does that require a pointer rather than a reference? And I assume then that 
the time zone stuff would need to be in a separate module than SysTime. That 
being the case, how would SysTime be able to use the time zone without 
importing that module? Does the C++ solution of forward declaring it like

class TimeZone;

work in D?

- Jonathan M Davis
Dec 22 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, December 22, 2011 10:29:59 Michel Fortin wrote:
 I'd tend to say that for general purpose time representation not
 involving local time, SysTime is suboptimal because it forces you to
 carry around a pointer to a time zone. Imagine an array of SysTime all
 in UTC and the space wasted with all those pointers referencing the UTC
 time zone object.
 
 It should be very easy to make a separate type, let's say UTCTime, and
 allow SysTime to be constructed from it and to be implicitly converted
 to it (with alias this). Then put UTCTime in a different module from
 SysTime and you can deal with time in UTC without having to ever import
 the module with SysTime and the time zone class it wants.
 
 Then redefine all APIs not dealing with local time so they work with
 UTCTime instead of SysTime.

That could certainly be done, but it complicates things that much more. The idea, at least, of SysTime was that it would deal with all of the time zone stuff correctly without you having to worry about it unless you wanted to deal with the time zone stuff, in which case it would give you those capabilities. That requires it to carry the time zone around. Something like UTCTime would allow you to carry the time around without the time zone, but then anyone who wants to be able to do stuff like convert it to a string or get its year or anything like that is almost certainly going to want it in a particular time zone (probably local time) rather than UTC, so it increases the burden on the programmer to deal with something like UTCTime. If you're dealing with anything beyond comparing times or adding durations to them, you're going to need the time zone, and in most cases, UTC is not the one that people are going to want. So, it harms usability IMHO to use something like UTCTime instead of SysTime, and just to save yourself the cost of the reference for the time zone? If you're _that_ worried about the space, you can always use a SysTime's stdTime property or toUnixTime and get an integral value to store. Granted, that's not as safe as something like UTCTime, since it's a naked number, but I really don't think that the cost of that reference is generally an issue. - Jonathan M Davis
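A brief sketch of the space-saving option mentioned at the end of this reply: keep only the integral `stdTime` value and rebuild a `SysTime` when calendar operations are needed. The calls used (`stdTime`, the `SysTime(long, TimeZone)` constructor, `toUnixTime`) are existing std.datetime API; the array and the chosen date are just for illustration.

```d
import std.datetime : DateTime, SysTime, UTC;

void main()
{
    auto original = SysTime(DateTime(2011, 12, 22, 12, 0, 0), UTC());

    // Store only the 8-byte integral value (hnsecs since midnight,
    // January 1st, 1 A.D. UTC) instead of a full SysTime.
    long[] compact = [original.stdTime];

    // Rebuild a full SysTime when calendar operations are needed.
    auto restored = SysTime(compact[0], UTC());
    assert(restored == original);
    assert(restored.year == 2011);

    // toUnixTime is coarser: whole seconds since 1970-01-01 UTC.
    assert(original.toUnixTime() == 1_324_555_200);
}
```

As the reply notes, the trade-off is type safety: nothing stops a bare `long` from being mistaken for some other quantity.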
Dec 22 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, December 22, 2011 12:59:15 Stewart Gordon wrote:
 On 22/12/2011 03:41, Jonathan M Davis wrote:
 <snip>
 
 http://jmdavis.github.com/d-programming-language.org/std_datetime.html

Have I got all this right?
- a flag goes from % up to another %, a {, a whitespace or punctuation character
- flags with [...] portions are exceptions to this rule, extending to the closing ]
- %{ or %} is a literal { or }
- all literal characters that aren't classed as whitespace or punctuation must be enclosed in {...}
- {%} is just a literal %
- in %C2* and %Cn* flags, C must be either _ (space) or 0

Yes. I believe that that's correct.
 To look at the detail:
 
 - In %nY, does "up to n" mean that it truncates years longer than that to
 that many characters?  Does it strip any leading 0s that result from this
 truncation?  (Under %3Y, does 2011 become 11 or 011?)

It becomes 11. There's no filler character. For instance, the example section gives

    assert(SysTime(Date(8, 7, 4)).toCustomString!"%2Y"() == "8");

But I guess that I need to find a clearer way to state the flag's definition.
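One way to read the answer above is "keep at most the last n digits, then print the result as a plain number". The sketch below models that reading only; `yearUpTo` is a hypothetical helper, not part of the proposal or of std.datetime, and it assumes a non-negative year and a small n.

```d
import std.conv : to;

/// Hypothetical helper modelling the "%nY" behaviour as described in this
/// reply: keep at most the last n digits of the year, no filler character.
string yearUpTo(int year, int n)
{
    int modulus = 1;
    foreach (_; 0 .. n)
        modulus *= 10;
    // Taking the remainder drops the leading digits; converting the
    // resulting number to a string drops any leading zeros as well.
    return (year % modulus).to!string;
}

void main()
{
    assert(yearUpTo(2011, 3) == "11"); // "011" loses its leading zero
    assert(yearUpTo(8, 2) == "8");     // agrees with the %2Y example quoted above
    assert(yearUpTo(2011, 2) == "11");
}
```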
 - But the approach to formatting the year is nicely systematic.  (I've
 thought of possibly adding to my scheme a means of formatting dates to
 arbitrary length with sign, in order to support the ISO format for years
 that may be outside the 1BC-9999 range.)

I tried to be extremely systematic about all of the flags. As a result, I believe that the system is very consistent with itself.
 - If you're going to have the ISO week number in the system, it seems to me
 you should also have the week-numbering year.  (I've thought about possibly
 adding these to my scheme.)

%isoweek and %C2isoweek
 - In %F, What does "as many digits as necessary" mean?  In particular, why
 in the example code does it give 12/100000 and not 3/25000 despite the
 latter being shorter?

Obviously, that needs to be clearer. The denominator is always a multiple of 10. It's what the mpeg-7 standard uses, which is why it's there.
 - Can only literal stuff be included within %BC[...] and %AD[...]?

That was the idea. I don't think that it needs to be any fancier than that. You could just use %cond or %func otherwise. That should probably be clearer though.
 - %cond and %func - I'm made to wonder to what extent this would be used and
 to what extent it would just be simpler to use if / ?: / ~.  Anyway, I
 assume that if there's more than one, they will reference the template
 arguments in order.  Can %cond contain other %*[...] flags?  Indeed, can
 %cond's be nested to arbitrary depth?

In theory, %cond is supposed to be arbitrarily nestable, though I question that it would generally be a good idea to do so. And yes, as the text underneath that section of flags mentions, functions are associated with flags the same way that arguments to format or writefln would be.
 - Does %localeDate give the short or the long date format?  (This and
 %localeTime are handled by separate functions in my library - I didn't
 think there was any real use case for including such a thing within a
 longer formatted date/time string.)

I have no idea. It's strftime which has that functionality (%x and %X). The idea is that std.datetime would parse that to determine the correct format, but I don't know how well it will work, since I haven't written it yet. They're flags that are only there because I was trying to make it possible to use toCustomString to do everything that strftime can do. It does occur to me though that I don't have a way to do %c (which would be both the date and time in the preferred format). I guess that that would be %localeDateTime. But these _are_ flags that might have to get the axe if I can't do what I think that I'm going to be able to do to figure out what strftime is doing and use the same format.
 On the whole, it seems a powerful system.  The format strings can get quite
 obfuscated, but at least the flags that are likely to be commonly used
 aren't too bad.

The idea at least is that the common use cases are fairly easy but that it's also powerful enough to do almost any format without much difficulty. Some of the standard formats _do_ get a bit convoluted however.
 It seems strange to require explicit notations both for flags and for
 alphanumeric literals.  But I suppose it makes the literals easier to read
 than having to follow where all those % signs are.

It makes the parsing easier (particularly with multi-character flags), and makes it somewhat easier to read IMHO, since it does clearly separate out the portions that _aren't_ flags. I'm not sure that I would have thought of that on my own however. It seemed like a good idea from your scheme.
 One thing I noticed doesn't appear in your scheme is ordinal suffix.  OK, so
 arbitrary alignment fields and collapsible portions aren't to be seen
 either, but those are quite rare in format string schemes anyway.

I was debating that. A flag for it could certainly be added. I don't have a way to deal with it in a locale-specific manner though, unfortunately. It's also a bit nasty, since it has to be tied to a specific number. I think that it only really makes sense with the days, however. So, at that point, you either create alternate versions of some of the day flags that have a th suffix or something similar, or you create a separate day flag which only does the ordinal suffix (similar to %yplus). I may add such a flag. Regardless, some of the specific flags are definitely up for debate (e.g. several of them are only there because strftime has something similar), and if there are flags that are likely to be generally useful which I'm missing, they may be worth adding. - Jonathan M Davis
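For reference, an English-only version of the ordinal suffix logic under discussion is short. This is an illustrative sketch, not a proposed flag implementation; `ordinalSuffix` is a hypothetical name, and as noted above it does nothing for other locales.

```d
/// English-only ordinal suffix for a day of the month (1 through 31).
string ordinalSuffix(int day)
{
    // 11th, 12th and 13th are exceptions to the last-digit rule.
    if (day % 100 >= 11 && day % 100 <= 13)
        return "th";
    switch (day % 10)
    {
        case 1:  return "st";
        case 2:  return "nd";
        case 3:  return "rd";
        default: return "th";
    }
}

void main()
{
    assert(ordinalSuffix(1) == "st");
    assert(ordinalSuffix(2) == "nd");
    assert(ordinalSuffix(3) == "rd");
    assert(ordinalSuffix(11) == "th");
    assert(ordinalSuffix(22) == "nd");
    assert(ordinalSuffix(31) == "st");
}
```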
Dec 22 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, December 22, 2011 11:20:11 Stewart Gordon wrote:
 On 22/12/2011 03:41, Jonathan M Davis wrote:
 <snip>
 
 Stewart Gordon has a library that takes a different approach (
 http://pr.stewartsplace.org.uk/d/sutil/datetime_format.html ). It does
 away with % flags and uses maximal munch with each of the flags being
 named such that they don't overlap in a way that would make certain
 combinations of flags impossible.

If you mean such things as writing a datum twice consecutively in two different formats, it can be done using an empty literal. For example, "Mmm''m" would today generate "Dec12". Not that I can see any use for such a format, just showing that it can be done.

I mean that you have to be way more careful about how you name the flags. For instance, if you have MMM and Month you have issues with stuff like MMMonth. It can definitely work, but the more flags that you have, the more problematic it becomes. It's also easier to separate out consecutive flags when reading them if you have %.
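The naming clash can be seen with a toy maximal-munch splitter. This is only a sketch: the flag names MM, MMM and Month are hypothetical, picked to mirror the MMMonth example above, and the function is not taken from either library.

```d
import std.algorithm : startsWith;

/// Splits a format string by maximal munch: at each position take the
/// longest flag that matches; anything else becomes a one-character token.
string[] munch(string fmt, string[] flags)
{
    string[] tokens;
    size_t pos = 0;
    while (pos < fmt.length)
    {
        // Find the length of the longest flag matching at this position.
        size_t best = 0;
        foreach (flag; flags)
            if (flag.length > best && fmt[pos .. $].startsWith(flag))
                best = flag.length;
        // No flag matched: consume a single character instead.
        immutable len = best ? best : 1;
        tokens ~= fmt[pos .. pos + len];
        pos += len;
    }
    return tokens;
}

void main()
{
    // The intent might be "MM" followed by "Month", but the longest
    // match at position 0 is "MMM", which leaves "onth" unmatched.
    assert(munch("MMMonth", ["MM", "MMM", "Month"]) == ["MMM", "o", "n", "t", "h"]);
}
```

With an explicit % in front of every flag, the boundary between two consecutive flags is stated by the format string itself rather than inferred from the set of flag names.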
 It's an interesting approach, but it isn't as
 flexible as it could be because of its use of maximal munch instead of %
 flags.


It's harder to have modifiers for flags without the %. For instance, what I'm doing with filler characters would be much more difficult with your scheme. With % delineating the flags, it becomes easier to handle that sort of thing.
 Looks complicated compared to mine at first sight.  Maybe I just need to
 spend a bit of time looking at it in more detail.

I don't think that it's all that complicated ultimately, but it definitely _looks_ complicated to begin with. I _tried_ to do the documentation in a way that made it less daunting, but I'm not sure that I succeeded. Actually, it's a bit like std.datetime in general in that sense. It really isn't all that complicated to use, but there's a lot of it, so it _looks_ complicated and therefore is more daunting than it should be. I probably shouldn't use the standard formats as the first examples. I was trying to show examples which corresponded with known formats so that you could see how they're done, but those formats have to be very precise in how they're laid out, so they have a complicated set of flags. Doing simpler stuff like "%M/%D/%Y" as the primary examples would probably be better. - Jonathan M Davis
Dec 22 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, December 22, 2011 21:32:17 Walter Bright wrote:
 On 12/22/2011 7:24 PM, Jonathan M Davis wrote:
 So, it harms usability IMHO to use something like UTCTime instead of
 SysTime, and just to save yourself the cost of the reference for the
 time
 zone?

No, it's not the cost of the reference. It's the cost of pulling in all the code to deal with that reference.

Well, I still dispute that that's a big deal, but regardless, if the issue can be solved with PIMPL (as ugly as it may be to do so), then at least the difference should be hidden from the programmer instead of affecting the API. - Jonathan M Davis
Dec 22 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, December 22, 2011 21:30:46 Walter Bright wrote:
 On 12/22/2011 7:13 PM, Jonathan M Davis wrote:
 Okay. Assuming that I'm going to try and make TimeZone opaque within
 SysTime, does that require a pointer rather than a reference? And I
 assume then that the time zone stuff would need to be in a separate
 module than SysTime. That being the case, how would SysTime be able to
 use the time zone without importing that module? Does the C++ solution
 of forward declaring it like
 
 class TimeZone;
 
 work in D?

It'll still put a reference to TimeZone in the ModuleInfo.

Will that still happen if the TimeZone is used in templated functions? SysTime has several functions that use TimeZone explicitly - e.g. the timezone property. It needs to be able to take and return a TimeZone. However, it _could_ be templatized with an empty template parameter list. Would that avoid pulling in the information on TimeZone if those functions aren't instantiated? Or would it still pull it in? - Jonathan M Davis
Dec 22 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday, December 23, 2011 01:41:43 Walter Bright wrote:
 Templates, after instantiation, are exactly like their non-templated
 equivalents. Before instantiation, they are not even semantically analyzed.

That's more or less what I figured, but then again, I never would have expected having a class would be such a big deal in the first place. With the judicious use of templates, it may be possible to get the PIMPL bit to work. The problem that I see at the moment is that while a templated function may not be instantiated, if it takes a particular type as a parameter (e.g. TimeZone), the type still needs to be imported. If it were just internal to the function, then an import statement could be put inside of the function, but I'm not sure how you could have a localized import like that for a parameter, since essentially what is needed is for the module with the type to be imported when the programmer tries to instantiate the templated function but not be imported otherwise. But doing that gets rather convoluted. I _think_ that it could be done if the function used a template constraint which used an eponymous template from another module which imported the module with TimeZone in it and checked the type. But that might still pull in the class, because the module with the eponymous template imported it. I really don't understand what exactly results in the class' info being pulled into the executable. It's also a bit ugly to template a parameter which can only be one type, but that's not the end of the world if it works. So, I'm not sure that it's actually possible to get PIMPL to work here, since several functions take a TimeZone argument, and even templatizing them, I'm not sure how you avoid having to always import the module with TimeZone in it to have those functions compile properly when they're instantiated. - Jonathan M Davis
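The two language behaviours this reply relies on can be shown in a few lines: an uninstantiated template body is not semantically analysed, and an import can be scoped to a function body. Both function names below are hypothetical, and the sketch covers only the "type used internally" case; it does not address the harder case of a TimeZone appearing in a parameter list.

```d
// Never instantiated: the body is parsed but not semantically analysed,
// so the undefined symbol below does not cause a compile error.
void neverInstantiated()()
{
    symbolThatDoesNotExist();
}

// The import is local to the template body, so it is only processed
// if this function is actually instantiated.
auto currentUtcYear()()
{
    import std.datetime : Clock, UTC;
    return Clock.currTime(UTC()).year;
}

void main()
{
    // Calling the second template instantiates it and triggers its import.
    assert(currentUtcYear() >= 2011);
}
```

Whether this keeps the class's symbols out of the final executable is a separate question about the compiler and linker that the thread leaves open.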
Dec 23 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday, December 23, 2011 10:17:00 Stewart Gordon wrote:
 On 23/12/2011 03:55, Jonathan M Davis wrote:
 <snip>
 
 - If you're going to have the ISO week number in the system, it seems
 to me you should also have the week-numbering year.  (I've thought
 about possibly adding these to my scheme.)

%isoweek and %C2isoweek

What are you saying? That the documentation is wrong, and %isoweek emits a year, not a week?

%isoweek emits a week. If it's not the week that you mean, then what are you talking about? I've obviously misunderstood.
 Obviously, that needs to be clearer. The denominator is always a
 multiple of 10. It's what the mpeg-7 standard uses, which is why it's
 there.

What's that to do with it? 25000 _is_ a multiple of 10. And 3/25000 contains 6 digits, compared to 12/100000's 8. So the spec reads to the effect that %F should generate 3/25000 in that example.

Sorry. I meant power, not multiple. Wrong word. It's always a 1 followed by some number of 0s. - Jonathan M Davis
Dec 23 2011
prev sibling next sibling parent "Regan Heath" <regan netmail.co.nz> writes:
On Fri, 23 Dec 2011 02:21:37 -0000, Walter Bright  
<newshound2 digitalmars.com> wrote:

 On 12/22/2011 11:25 AM, Piotr Szturmaj wrote:
 I wish D could support partial modules - partial as analogy to C#'s  
 partial
 classes.

 module std.datetime-unit1;
 import std.datetime-unit2;
 // dash allowed only in submodules with the same module name
 ...

 module std.datetime-unit2;
 import std.datetime-unit1;
 ...

 // then

 module whatever;
 import std.datetime; // as usual

I have no idea why anyone would want this. (Is it because the file is too big to fit on a floppy disk? <g>)

It's of most benefit (IMO) for the Visual Studio IDE/GUI designer code. The automated code generation goes into one source file, in a partial class. The user defined code into another source file/partial class. It makes life easier for both the developer and the GUI designer code itself. R
Dec 23 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday, December 23, 2011 10:26:05 Stewart Gordon wrote:
 To quote directly from the spec of my scheme:
 "Each specifier is a letter, or two or more of the same letter consecutively
 (picked out by maximal munch before lookup in the following table)."
 So this is a non-issue.

So, your flags are even more restrictive than I understood them to be. You can't have multiple-character flags unless they're the same letter. I thought that you could. That _does_ avoid the problem that I was describing, but the result is too limiting IMHO. It certainly makes it hard to add flags for specific formats similar to %ctime or %mpeg7. It also means that you can't reuse letters.
 It can definitely work, but the more
 flags that you have, the more problematic it becomes. It's also easier
 to
 separate out consecutive flags when reading them if you have %.

Here's the little bit of code in my library that finds the end of a flag:

    char letter = cast(char) std.ctype.tolower(*charPos);
    CPtr!(char) beginSpec = charPos;
    do {
        ++charPos;
    } while (std.ctype.tolower(*charPos) == letter);

Seems to me pretty straightforward.

The situation would be very different if you allowed multi-character flags of varying letters like I thought you did.
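To make the "same letter repeated" rule concrete, here is a rough, self-contained sketch of that kind of maximal munch in current D. The function name splitSpecifiers and the sample format letters are assumptions for illustration only and are not taken from Stewart's library. Unlike the pointer-based snippet quoted above, it bounds-checks against the end of the string.

```d
import std.ascii : isAlpha, toLower;

// Hypothetical sketch: split a format string into runs of one repeated
// letter (compared case-insensitively) and single non-letter characters.
string[] splitSpecifiers(string fmt)
{
    string[] tokens;
    size_t i = 0;
    while (i < fmt.length)
    {
        immutable start = i;
        if (isAlpha(fmt[i]))
        {
            immutable letter = toLower(fmt[i]);
            // Maximal munch: consume as long as the same letter repeats.
            do
            {
                ++i;
            } while (i < fmt.length && toLower(fmt[i]) == letter);
        }
        else
        {
            ++i; // any non-letter character is its own token
        }
        tokens ~= fmt[start .. i];
    }
    return tokens;
}

unittest
{
    assert(splitSpecifiers("yyyy-mm-dd") == ["yyyy", "-", "mm", "-", "dd"]);
    assert(splitSpecifiers("HHmm") == ["HH", "mm"]);
}
```

Because a run ends as soon as the letter changes, adjacent specifiers such as "HHmm" split cleanly, but a multi-letter name like "isoweek" could never be a single flag under this rule.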
 It's an interesting approach, but it isn't as
 flexible as it could be because of its use of maximal munch instead
 of % flags.

How do you mean?

It's harder to have modifiers for flags without the %. For instance, what I'm doing with filler characters would be much more difficult with your scheme. With % delineating the flags, it becomes easier to handle that sort of thing.

What are you doing with filler characters? It appears to me, just allowing space or 0 as a filler character in some flags. In my system, alignment fields provide a more powerful way of doing the same thing.

There's also the question of indicating the number of digits in the flag. In mine, it's quite easy to indicate either the exact number of digits that the result of the flag must be or to indicate that it must be _at least_ that many. And it's very consistent across flags. In yours, you have to do it with the casing of the letters, since the letters are all you have to work with. You have no modifiers. It may be that your alignment fields do a better job than the filler characters that I'm using, but I'd have to read over your spec again. I do remember thinking that the alignment fields didn't really add enough over using filler characters to be worth the extra complication. It's possible that your scheme is somewhat easier to use for the most basic cases, but as soon as more specific and/or complicated schemes are needed (e.g. mpeg-7 or any of the ISO schemes), I don't think it works as well in general. I was trying to create a scheme that was flexible and powerful enough to do those more complicated schemes and yet still be easily usable for basic stuff. And I don't think that I'm all that far off from that, but it's quite possible that that extra power is making mine harder to use than yours for the most basic cases. I don't know. - Jonathan M Davis
Dec 23 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday, December 23, 2011 11:21:48 Stewart Gordon wrote:
 I said "week-numbering year".  How can that phrase mean a kind of week?

Because I've never heard of it, and I misunderstood it.
 The week-numbering year is the year to which the ISO week belongs.  Most of
 the time this corresponds to the actual year, but around year boundaries it
 sometimes differs by 1.  For example, week 52 of 2011 goes from 2011-12-26
 to 2012-01-01 inclusive.  The last of these dates isn't in the calendar
 year 2011, but it is still in the ISO week 2011-W52, so the week-numbering
 year is 2011.

A flag for that could be added easily enough. - Jonathan M Davis
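As a rough illustration of what such a flag would have to compute, the week-numbering year can be derived from Date.isoWeek, which std.datetime already provides. The helper isoWeekYear below is hypothetical and only a sketch of the rule described in the quoted text.

```d
import std.datetime : Date, Month;

// Hypothetical helper: the year to which a date's ISO week belongs.
int isoWeekYear(Date d)
{
    immutable week = d.isoWeek;
    // Early January can still belong to the last week of the previous year.
    if (d.month == Month.jan && week >= 52)
        return d.year - 1;
    // Late December can already belong to week 1 of the next year.
    if (d.month == Month.dec && week == 1)
        return d.year + 1;
    return d.year;
}

unittest
{
    // The example from the thread: 2012-01-01 is still in 2011-W52.
    assert(Date(2012, 1, 1).isoWeek == 52);
    assert(isoWeekYear(Date(2012, 1, 1)) == 2011);
    assert(isoWeekYear(Date(2012, 1, 2)) == 2012);
}
```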
Dec 23 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday, December 23, 2011 12:02:26 Stewart Gordon wrote:
 What is the use case for including a full date in some externally defined
 format within a longer formatted date string?  ISTM the way to do this is
 to define a function that just generates this format straight off.  In my
 library, toShortFormatString and toLongFormatString are already examples of
 this.

 It also means that you can't reuse letters.

At the moment only 12 letters are used. I can't see the whole alphabet being used up any time in the foreseeable future.

It's not that I want to put %ctime and %mpeg7 in the same string. It's that I don't want to have to go define a function for every single one of them. It's a much smaller hit to the API to have flags for them. Another alternative is to supply an enum of format strings for a variety of formats, though that also expands the API a bit (though not as badly). That may be a better approach for known formats. That doesn't work as well though in cases where the format strings to toCustomString and fromCustomString aren't identical (which happens in the case of mpeg-7, since it uses %cond). So, I don't know. Regardless, I think that the ability to have more or less arbitrary strings used for flags is valuable.
 I think I could expand my scheme to include ISO signed-year notation easily
 enough.  In the mpeg-7 standard, does the denominator of the fractional
 second have to be the smallest possible power of 10, or is F20/1000 or
 F0/1000 allowed just as well?

If it's 0, there is no fraction, just like there's no decimal if it's 0 for ISO. I don't _think_ that it has to be the smallest possible power of 10, but I don't remember at the moment. I'm going to have to track down the info on the spec again. I'd figured it out for work previously, but unfortunately I lost whatever bookmarks I had with the info. I don't want to shell out money for the spec if I don't have to. Even if I were willing to buy it, it's sold in 12 parts, and I'd probably end up buying several parts before I found the one with the definition of its time format, wasting that much more money. So, I'll have to go digging online again. - Jonathan M Davis
Dec 23 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday, December 23, 2011 09:36:14 Michel Fortin wrote:
 Well, what I'm getting at is that most of the time you don't care which
 time zone the time was recorded in, so you don't need to attach a time
 zone to it, you only need to take the time zone into consideration when
 formatting as a string, and then you mostly always use local time.
 
 The real issue remains that you can't use SysTime without including all
 the code for all the time zones. Think about this: if you don't need to
 carry around the time zones but instead only ask for a time zone when
 formatting as a string, you much less need time zones to be
 polymorphic. The time zone could be a template argument to the
 formatting functions for instance.
 
 On the other hand, if you need to carry the associated time zone along
 with the time, then things gets more complicated and a polymorphic time
 zone type tend to solve that problem well. But how many of us need to
 carry a time zone with a time value?
 
 So in my opinion associating a time zone with SysTime was a mistake.
 Not just because it forces you to carry around an extra pointer, but
 mostly because it forces time zones to be polymorphic which brings all
 the drawbacks of a class: less inlining and worse performance due to
 virtual functions, and all the virtual functions need to be included in
 the binary even if you don't use them. It does benefit the use case
 where you need to tag a time with a specific time zone, but that sounds
 rather specialized to me.

The core issue with not carrying around a time zone is that you get conversion problems. People do dumb stuff like convert time_t values from UTC to local time and back again, which causes all kinds of bugs. For most time zones, you cannot convert them back to UTC correctly in the general case. The hours when DST switches occur screw that up. So, in order to handle time correctly, you need to always keep it in UTC until you need to present it (e.g. converting it to a string) or need to know something about its date (such as what year or month it's in). If you're actually storing a converted time value, you're doing it wrong and will almost inevitably have bugs. I did std.datetime in the first place, because I was sick of having to fix time-related bugs at work and wanted D to get it right. That requires a type which holds the time in UTC and knows its time zone.

Now, "knowing its time zone" can be dealt with in several ways. One way is to integrate it into the type. Your proposed UTCTime would do that. It's always in UTC. You could also have another type which was called something like LocalTime which held the time in UTC internally but any functions on it which needed time zone conversions used the appropriate calculations for the local time zone instead of UTC. There are three problems with that approach though.

1. It doesn't scale.
2. You have to care about the time zone of the time object when you pass it to functions.
3. Code duplication.

#1 isn't really a big deal if all you care about is UTC and local time. Dealing with other time zones can be very useful, but is definitely more of a niche issue. It's not all that hard to end up in situations where it matters though when you're getting data from another computer. There are lots of applications out there which have to deal with that, but I don't know how common it ultimately is. Certainly, where I work, we have to do it all the time. And I know that Adam Ruppe has to deal with similar issues with his web software based on questions he asked about std.datetime previously.

#2 can be dealt with with templates to some extent, but functions still have to return a specific time type. So, unless you do something like make it so that you pass the type you want returned as a template argument, you have to deal with a specific time type and potentially convert it to whichever one you actually want. This isn't necessarily a big deal, but it does complicate things. With a solution like SysTime, you don't generally have to care what the time zone of a particular SysTime actually is.

#3 can be dealt with in several ways - including using a bunch of free functions instead of having the functions on the types themselves, but that's not OO at all, and highly ugly IMHO. Mixins would be a better approach, but regardless, you have to deal with the fact that you're having to effectively define a bunch of functions multiple times.

So, all around, a solution where the time zone is treated as a property of the type which can be changed seems like a much cleaner solution to me. It's also much more flexible and powerful, because you _can_ then define more time zones than just two. The current solution handles time correctly without generally requiring the programmer to care about what time zone the time is in and yet still allows for more advanced time zone usage if the programmer wants it. The only real downside that I see here is that those who are trying to keep their executables as absolutely small as possible and don't care about anything beyond comparing file times end up with a larger executable due to features that they don't care about. My greatest fear here is that we would be pushed into an inferior solution which would increase the likelihood of bugs just because some folks want a slightly smaller executable. - Jonathan M Davis
Dec 23 2011
prev sibling next sibling parent "Kagamin" <spam here.lot> writes:
 The core issue with not carrying around a time zone is that you 
 get conversion problems. People do dumb stuff like convert 
 time_t values from UTC to local time and back again, which 
 causes all kinds of bugs.

Can you elaborate?
Dec 27 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Tuesday, December 27, 2011 11:19:03 Kagamin wrote:
 The core issue with not carrying around a time zone is that you
 get conversion problems. People do dumb stuff like convert
 time_t values from UTC to local time and back again, which
 causes all kinds of bugs.

Can you elaborate?

1. Any time you do a conversion of any kind, you risk screwing it up. The less converting you do, the better.

2. With regards to time, DST is a killer. The reason for this is the fact that in the spring, one hour gets skipped, and in the fall one hour happens twice. So, for example, let's say that in a particular time zone, at 2 am on March 3rd, the time goes to 3 am to apply DST. That means that the times of 2 am up to (but not including) 3 am do not exist. So, if you have the time 2:30 am in local time, what time is that in UTC? It's likely that the time code will assume that it's really 3:30 am, but the time that you're trying to convert doesn't technically exist, so there is not actually a right answer.

On the other hand, let's say that at 2 am on October 30th, the time falls back to 1:00 am, taking that time zone out of DST. That means that the times of 1 am up to (but not including) 2 am happen twice. So, if you have the time 1:30 am in local time, what time is it in UTC? You can't know. It could be either. The time code is going to have to pick one or the other, but since 1:30 am is non-unique, it's not necessarily going to have the behavior which is correct for your program.

However, UTC can _always_ be correctly converted to other time zones whether they have DST or not. This is because UTC does not have DST and therefore does not have hours that don't exist or hours that happen twice. All of its times are unique. So, if you want to deal with time accurately and reliably, you need to always keep the time in UTC until you _need_ to convert it to local time (typically for display purposes but also for things like if you need to know something along the lines of what year that time is in in the local time zone).

Code which converts time back and forth between UTC and local time is asking for trouble. Even if it gets all of the conversions correct (or at least as correct as possible), it's going to have issues whenever a DST change occurs. That's one of the reasons why non-Windows OSes typically want to put the system clock in UTC and why it's so horrible that Windows normally puts the system clock in local time (another is the fact that applying DST can make it so that files from a few minutes ago are suddenly in the future if the file timestamps are in local time).

Where I work, we ended up with a bug where when you rewound video and you were east of UTC, it would rewind one hour too far for each hour east of UTC that you were (so 2 hours east of UTC would rewind 2 hours and 1 second instead of 1 second). It turns out that the code had been converting a time value to UTC from local time when it was already in UTC (or it might have been the other way around - I don't recall exactly which at the moment). And that meant that it was subtracting the UTC offset from the time, causing it to go back too far. The only reason that we hadn't seen it in the US was because west of UTC, the time would have been in the future, which it couldn't do, so it ended up rewinding properly. Had that code completely avoided all of the time conversions that it was doing, it wouldn't have had any issues like that. And the fact that the bug only manifested in certain time zones (and _not_ the time zones that the code was being developed in) made it that much worse.

Times should be kept in UTC as much as possible and converted as little as possible. Anything else is asking for trouble. That's also one reason why it's good to have times be objects rather than naked numbers. It reduces the risk of people converting them incorrectly. SysTime mostly avoids the whole issue by encapsulating the time (in UTC) with a time zone, making it so that it generally "just works." - Jonathan M Davis
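The skipped and repeated hours described above can be reproduced with a deliberately simplified model. Everything in this sketch is an assumption made for illustration: a toy zone at UTC-5 (UTC-4 during DST), the year 2011, and the two switch dates borrowed from the example. It does not use a real time zone database, and the function names are hypothetical.

```d
import core.time : Duration, dur;
import std.datetime : DateTime;

// Toy rule (hypothetical): UTC-5 normally, UTC-4 while DST is in effect.
Duration offsetAt(DateTime utc)
{
    auto dstStart = DateTime(2011, 3, 3, 7, 0, 0);   // 02:00 local standard time
    auto dstEnd   = DateTime(2011, 10, 30, 6, 0, 0); // 02:00 local daylight time
    return (utc >= dstStart && utc < dstEnd) ? dur!"hours"(-4) : dur!"hours"(-5);
}

// UTC to local is always well defined: exactly one answer.
DateTime utcToLocal(DateTime utc)
{
    return utc + offsetAt(utc);
}

// Local to UTC may have zero, one, or two answers.
DateTime[] localToUtc(DateTime local)
{
    DateTime[] candidates;
    foreach (offset; [dur!"hours"(-4), dur!"hours"(-5)])
    {
        auto utc = local - offset;
        if (utcToLocal(utc) == local) // keep only candidates that round-trip
            candidates ~= utc;
    }
    return candidates;
}

unittest
{
    assert(localToUtc(DateTime(2011, 3, 3, 2, 30, 0)).length == 0);   // skipped hour
    assert(localToUtc(DateTime(2011, 10, 30, 1, 30, 0)).length == 2); // repeated hour
    assert(localToUtc(DateTime(2011, 7, 1, 12, 0, 0)).length == 1);   // ordinary time
}
```

In this model a local time cannot be mapped back to a single UTC value in either of the two switch hours, while every UTC value maps to exactly one local time.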
Dec 27 2011
prev sibling next sibling parent "Kagamin" <spam here.lot> writes:
On Tuesday, 27 December 2011 at 10:51:25 UTC, Jonathan M Davis 
wrote:
 On the other hand, let's say that on 2 am on October 30th, the 
 time falls back to 1:00 am taking that time zone out of DST. 
 That means that the times of 1 am up to (but not including) 2 
 am happen twice. So, if you have the time 1:30 am in local 
 time, what time is it in UTC? You can't know. It could be 
 either. The time code is going to have to pick one or the 
 other, but since 1:30 am is non-unique, it's not necessarily
 going to have the behavior which 
 is correct for your program.

How does std.datetime work in this case? You need a UTC value for SysTime.
 So, if you want to deal with time accurately and reliably, you 
 need to always keep the time in UTC until you _need_ to convert 
 it to local time (typically for display purposes but also for 
 things like if you need to know something along the lines of 
 what year that time is in in the local time zone).

True. That's why UTCtime proposed earlier makes perfect sense (I'd call it just `Time`).
 Code which converts time back and forth between UTC and local 
 time is asking for trouble. Even if it gets all of the 
 conversions correct (or at least as correct as possible), it's 
 going to have issues whenever a DST change occurs. That's one 
 of the reasons why non-Windows OSes typically want to put the 
 system clock in UTC and why it's so horrible that Windows 
 normally puts the system clock in local time

Are you sure? http://msdn.microsoft.com/en-us/library/windows/desktop/ms724390%28v=vs.85%29.aspx
 Where I work, we ended up with a bug where when you rewound 
 video and you were east of UTC, it would rewind one hour too 
 far for each hour east of UTC that you were (so 2 hours east of 
 UTC would rewind 2 hours and 1 second instead of 1 second). It 
 turns out that the code had been converting a time value to UTC 
 from local time when it was already in UTC (or it might have 
 been the other way around - I don't recall exactly which at the 
 moment).

As far as I understand, this is a problem only for time values that bear no time zone information and not a problem for, say, UTCtime: when you convert UTCtime to UTC, you get the same value.
 Times should be kept in UTC as much as possible and converted 
 as little as possible. Anything else is asking for trouble. 
 That's also one reason why it's good to have times be objects 
 rather than naked numbers. It reduces the risk of people 
 converting them incorrectly. SysTime mostly avoids the whole 
 issue by encapsulating the time (in UTC) with a time zone, 
 making it so that it generally "just works."

One can convert std.datetime.SysTime value to local time just by adding a corresponding offset. That should be easy.
Dec 27 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Tuesday, December 27, 2011 15:41:59 Kagamin wrote:
 On Tuesday, 27 December 2011 at 10:51:25 UTC, Jonathan M Davis
 
 wrote:
 On the other hand, let's say that on 2 am on October 30th, the
 time falls back to 1:00 am taking that time zone out of DST.
 That means that the times of 1 am up to (but not including) 2
 am happen twice. So, if you have the time 1:30 am in local
 time, what time is it in UTC? You can't know. It could be
 either. The time code is going to have to pick one or the
 other, but since 1:30 am is non-unique, it's not necessarily
 going to have the behavior which
 is correct for your program.

How does std.datetime work in this case? You need a UTC value for SysTime.

SysTime holds the time in UTC. It's only ever an issue if you try and create a SysTime from a Date or DateTime which fall on a DST switch. And unfortunately, that problem is unavoidable, since you sometimes need to be able to specify a date and time in the local time zone and convert it to UTC. But creating a SysTime from a Date or DateTime shouldn't be something that programs are doing frequently. Usually, you get the time from the system clock, in which case, it's in UTC, and there are no conversion issues.
 So, if you want to deal with time accurately and reliably, you
 need to always keep the time in UTC until you _need_ to convert
 it to local time (typically for display purposes but also for
 things like if you need to know something along the lines of
 what year that time is in in the local time zone).

True. That's why UTCtime proposed earlier makes perfect sense (I'd call it just `Time`).

SysTime already keeps the time in UTC. It only converts anything to a specific time zone when you ask for a value which requires it (such as getting its month or getting it as a string). Adding something like UTCTime adds nothing beyond removing the TimeZone object in an effort to reduce the number of symbols in the executable for those who just want to compare times, and that makes it so that you have to create other types to convert UTCTime to in order to get the time in other time zones (e.g. local time). By having SysTime, you don't need extra time types.
 Code which converts time back and forth between UTC and local
 time is asking for trouble. Even if it gets all of the
 conversions correct (or at least as correct as possible), it's
 going to have issues whenever a DST change occurs. That's one
 of the reasons why non-Windows OSes typically want to put the
 system clock in UTC and why it's so horrible that Windows
 normally puts the system clock in local time

Are you sure? http://msdn.microsoft.com/en-us/library/windows/desktop/ms724390%28v=vs.85%29.aspx

Yes. That's why you have to tell Linux to have the system clock in local time when you dual boot with Windows. Windows puts the system clock in local time. Apparently, they did finally make it possible to have it in UTC with either Vista or 7 (I don't remember which) if you set a registry setting accordingly, but it's not that way by default.
 Where I work, we ended up with a bug where when you rewound
 video and you were east of UTC, it would rewind one hour too
 far for each hour east of UTC that you were (so 2 hours east of
 UTC would rewind 2 hours and 1 second instead of 1 second). It
 turns out that the code had been converting a time value to UTC
 from local time when it was already in UTC (or it might have
 been the other way around - I don't recall exactly which at the
 moment).

As far as I understand, this is a problem only for time values that bear no time zone information and not a problem for, say, UTCtime: when you convert UTCtime to UTC, you get the same value.

It's a problem even for times with time zone information if the time isn't in UTC. Using something like UTCTime or SysTime fixes the problem, because the time would always be in UTC. The problem is that the code in that program _was_ converting time back and forth between local time and UTC.
 Times should be kept in UTC as much as possible and converted
 as little as possible. Anything else is asking for trouble.
 That's also one reason why it's good to have times be objects
 rather than naked numbers. It reduces the risk of people
 converting them incorrectly. SysTime mostly avoids the whole
 issue by encapsulating the time (in UTC) with a time zone,
 making it so that it generally "just works."

One can convert std.datetime.SysTime value to local time just by adding a corresponding offset. That should be easy.

A SysTime is in UTC and holds a TimeZone which converts the time to that time zone when it needs to (e.g. getting the day of that time or converting it to a string). The programmer doesn't need to do any conversions with SysTime. If they want to change the time zone of the SysTime, they change the SysTime's timezone property and then any function which needs to adjust for the time zone will use the new TimeZone object. The time itself is always in UTC, so there are no conversion problems when changing time zones. - Jonathan M Davis
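A small sketch of that behavior, assuming a current Phobos (where SimpleTimeZone is constructed from a Duration). The helper inZone is hypothetical and just wraps the timezone property setter.

```d
import core.time : dur;
import std.datetime : DateTime, SimpleTimeZone, SysTime, TimeZone, UTC;

// Hypothetical helper: same instant, viewed through a different time zone.
SysTime inZone(SysTime t, immutable TimeZone tz)
{
    t.timezone = tz; // only the attached zone changes; the UTC value stays put
    return t;
}

unittest
{
    auto utcTime = SysTime(DateTime(2011, 12, 27, 10, 51, 25), UTC());
    auto plusTwo = inZone(utcTime, new immutable SimpleTimeZone(dur!"hours"(2)));

    assert(plusTwo.stdTime == utcTime.stdTime); // internal UTC time is unchanged
    assert(plusTwo.hour == 12);                 // properties reflect the new zone
    assert(plusTwo == utcTime);                 // still the same point in time
}
```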
Dec 27 2011
prev sibling next sibling parent "Kagamin" <spam here.lot> writes:
On Wednesday, 28 December 2011 at 00:15:00 UTC, Jonathan M Davis 
wrote:
 By having SysTime, you don't need extra time types.

Hmm... if you don't have extra time types, how do you format a SysTime? To convert a SysTime to a string you usually need year, month and day. Calculating a year takes some time: leap years, possibly time zone adjustment; when you need month, you have to recalculate year, since SysTime doesn't hold it. That's how it works?
Dec 28 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, December 28, 2011 11:39:16 Kagamin wrote:
 On Wednesday, 28 December 2011 at 00:15:00 UTC, Jonathan M Davis
 
 wrote:
 By having SysTime, you don't need extra time types.

Hmm... if you don't have extra time types, how do you format a SysTime? To convert a SysTime to a string you usually need year, month and day. Calculating a year takes some time: leap years, possibly time zone adjustment; when you need month, you have to recalculate year, since SysTime doesn't hold it. That's how it works?

SysTime's time is held internally as a long holding the number of hecto-nanoseconds (100 ns) from midnight, January 1st, 1 A.D. It also holds a TimeZone which is used to convert that time to the correct time zone when any function requires that the time be in a specific time zone (e.g. getting the year of the SysTime or converting it to a string). SysTime does all of the appropriate calculations for that. And if you want to change the time zone of a SysTime, you set its timezone property to the TimeZone that you want. Its internal time is always in UTC.

And yes, if you access any function or property of SysTime which needs the time in any format other than hnsecs from midnight January 1st, 1 A.D., it has to do the appropriate calculations, even if it's done them before. That's why it can be more efficient to convert a SysTime to a DateTime if you need to access its properties a bunch of times in a row. But as long as you don't create a SysTime from that DateTime, you shouldn't have conversion problems.

You _do_ have potential conversion issues if you set a property on a SysTime (e.g. the year), since it has to convert it to its time zone and back again to do that, but there's no way around that unfortunately. However, adding and subtracting durations from a SysTime have no problems at all, because you don't have to do any conversions. It's just setting a property that must be in the SysTime's time zone (e.g. the year or month) which could have conversion issues. - Jonathan M Davis
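A short sketch of those points, assuming a current Phobos: stdTime exposes the internal hnsecs count, adding a Duration is plain integer arithmetic on it, and a cast to DateTime does the calendar calculation once. The helper snapshot is hypothetical.

```d
import core.time : dur;
import std.datetime : DateTime, SysTime, UTC;

// Hypothetical helper: compute the calendar fields once, then read them
// as often as needed without repeating the calculation.
DateTime snapshot(SysTime st)
{
    return cast(DateTime) st;
}

unittest
{
    // stdTime counts hnsecs (100 ns units) from midnight, January 1st, 1 A.D., UTC.
    auto epoch = SysTime(DateTime(1, 1, 1, 0, 0, 0), UTC());
    assert(epoch.stdTime == 0);
    assert((epoch + dur!"seconds"(1)).stdTime == 10_000_000);

    auto st = SysTime(DateTime(2011, 12, 28, 3, 3, 30), UTC());
    auto dt = snapshot(st);
    assert(dt.year == 2011 && dt.month == 12 && dt.day == 28);
    assert(dt.hour == 3 && dt.minute == 3 && dt.second == 30);
}
```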
Dec 28 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, December 28, 2011 03:03:30 Jonathan M Davis wrote:
 On Wednesday, December 28, 2011 11:39:16 Kagamin wrote:
 On Wednesday, 28 December 2011 at 00:15:00 UTC, Jonathan M Davis
 
 wrote:
 By having SysTime, you don't need extra time types.

Hmm... if you don't have extra time types, how do you format a SysTime? To convert a SysTime to a string you usually need year, month and day. Calculating a year takes some time: leap years, possibly time zone adjustment; when you need month, you have to recalculate year, since SysTime doesn't hold it. That's how it works?

SysTime's time is held internally as a long holding the number of hecto-nanoseconds (100 ns) from midnight, January 1st, 1 A.D. It also holds a TimeZone which is used to convert that time to the correct time zone when any function requires that the time be in a specific time zone (e.g. getting the year of the SysTime or converting it to a string). SysTime does all of the appropriate calculations for that. And if you want to change the time zone of a SysTime, you set its timezone property to the TimeZone that you want. Its internal time is always in UTC.

And actually, this would be the same with UTCTime if we were to create it. It would hold its time internally as a long in hnsecs, and if it had any functions which required the time in any other format, it would have to convert. It would effectively be a SysTime with UTC as its time zone, except that the time zone would be hard-coded instead of using a TimeZone and its time zone being settable. - Jonathan M Davis
Dec 28 2011
prev sibling next sibling parent "Iain S" <staffell gmail.com> writes:
Could I jump back to the original proposal...

Two years after this thread started, I find there is no way to 
print a datetime string in the format I desire..  D doesn't have 
the functionality of strftime(), but does have a hundred or so 
functions that do related, but not quite right, things..

Could this change?
Mar 06 2013
prev sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, March 07, 2013 00:39:06 Iain S wrote:
 Could I jump back to the original proposal...
 
 Two years after this thread started, I find there is no way to
 print a datetime string in the format I desire..  D doesn't have
 the functionality of strftime(), but does have a hundred or so
 functions that do related, but not quite right, things..
 
 Could this change?

Of course it could change, and it will, but a design must be agreed upon, and I've been very busy of late, so I haven't been doing a lot with D, and this has definitely fallen by the wayside. I need to get back to it, and I will. I just haven't yet. And coming up with a design that's both easy to use and very flexible is difficult. But some variation of what's been discussed will likely end up getting into Phobos at some point. - Jonathan M Davis
Mar 06 2013