
digitalmars.D.bugs - [Issue 24446] New: ticksPerSecond is incorrect when posix clock resolution is 1 microsecond or more

https://issues.dlang.org/show_bug.cgi?id=24446

          Issue ID: 24446
           Summary: ticksPerSecond is incorrect when posix clock
                    resolution is 1 microsecond or more
           Product: D
           Version: D2
          Hardware: x86
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P1
         Component: druntime
         Assignee: nobody@puremagic.com
         Reporter: forestix@nom.one

When core/time.d queries the POSIX API for clock resolution, it assumes that
any value of 1 microsecond or greater is wrong, and substitutes a hard-coded
granularity of 1 nanosecond instead:

https://github.com/dlang/dmd/blob/26a4e395e8853de8f83c7c56341066f98e6a8d4f/druntime/src/core/time.d#L2580
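
The check is roughly as follows (a paraphrase for illustration, not the
verbatim druntime source; ticksPerSecondFor() and its clockId parameter are
names I made up for the sketch):

import core.sys.linux.time; // clockid_t, clock_getres, timespec

// Paraphrase of druntime's init logic: any reported resolution of
// 1000 nsecs (1 usec) or more is discarded in favor of a hard-coded
// 1 nsec granularity.
long ticksPerSecondFor(clockid_t clockId)
{
    timespec ts;
    if (clock_getres(clockId, &ts) != 0)
        assert(0, "clock_getres failed");

    // The >= 1000 test is the one at issue; the <= 0 test just guards
    // the division below against nonsense values.
    return ts.tv_nsec <= 0 || ts.tv_nsec >= 1000
           ? 1_000_000_000L                 // pretend 1 nsec granularity
           : 1_000_000_000L / ts.tv_nsec;   // trust the reported value
}

On the system measured below, calling this with CLOCK_MONOTONIC_COARSE returns
1_000_000_000 even though the real granularity is 4 msecs.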

The comment on that code claims it's to tame systems that report resolutions of
1 millisecond or worse while updating the clock more frequently than that.
However:

- It activates on reported resolutions >= 1us, a thousand times finer than the
claimed 1ms.
- It does no test at all to validate its assumption that the reported
resolution is wrong.
- It gives no indication to calling code that the returned value is a lie.
- It is most likely to activate on coarse clocks, with a hard-coded value
representing a very fine resolution, making the result not merely a lie, but an
egregious lie.

I wrote a program that prints MonoTimeImpl!(ClockType.coarse).currTime.ticks()
at regular intervals, along with the clock resolutions reported by both POSIX
clock_getres() and D's ticksPerSecond().
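
A sketch of that program (a reconstruction for illustration; the sampling
loop, intervals, and output formatting are my approximations):

import core.sys.linux.time; // clock_getres, timespec, CLOCK_MONOTONIC_COARSE
import core.thread : Thread;
import core.time : ClockType, MonoTimeImpl, msecs;
import std.stdio : writefln, writeln;

void main()
{
    alias Coarse = MonoTimeImpl!(ClockType.coarse);

    // Resolution according to the POSIX API.
    timespec ts;
    clock_getres(CLOCK_MONOTONIC_COARSE, &ts);
    immutable libcNsecs = ts.tv_sec * 1_000_000_000L + ts.tv_nsec;

    // Resolution according to D.
    immutable dlibTicks = Coarse.ticksPerSecond;

    writeln("          ticks/sec     nsecs/tick");
    writefln("Libc: %13s %14s", 1_000_000_000L / libcNsecs, libcNsecs);
    writefln("Dlib: %13s %14s", dlibTicks, 1_000_000_000L / dlibTicks);

    // Sample the clock every millisecond and note when it changes.
    writeln("Sampling clock to see how often it actually changes...");
    immutable start = Coarse.currTime.ticks;
    long prev = 0;
    foreach (i; 0 .. 18)
    {
        immutable elapsed = Coarse.currTime.ticks - start;
        writefln("%9s nsecs: %10s%s", i * 1_000_000L, elapsed,
                 elapsed != prev ? "  (changed)" : "");
        prev = elapsed;
        Thread.sleep(1.msecs);
    }
}

Here is the output on a Ryzen 7000-series CPU running Linux 6.7.9: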

          ticks/sec     nsecs/tick
Libc:           250        4000000
Dlib:    1000000000              1
Sampling clock to see how often it actually changes...
        0 nsecs:          0           
  1000000 nsecs:          0           
  2000000 nsecs:    3999934  (changed)
  3000000 nsecs:    3999934           
  4000000 nsecs:    3999934           
  5000000 nsecs:    3999934           
  6000000 nsecs:    7999869  (changed)
  7000000 nsecs:    7999869           
  8000000 nsecs:    7999869           
  9000000 nsecs:    7999869           
 10000000 nsecs:   11999804  (changed)
 11000000 nsecs:   11999804           
 12000000 nsecs:   11999804           
 13000000 nsecs:   11999804           
 14000000 nsecs:   15999737  (changed)
 15000000 nsecs:   15999737           
 16000000 nsecs:   15999737           
 17000000 nsecs:   15999737

As we can see, the clock updates exactly as often as the POSIX call claims,
yet D's clock init code ignores that and reports an arbitrary resolution
instead.

I wonder:

What circumstance led the author of that init code to second-guess the system?
I don't see a problematic OS, libc, or architecture mentioned in the comments.
Can that circumstance be reproduced anywhere today?

Why is it applied to all POSIX systems, instead of being a special case for
known-bad systems?

Was the init code's test for (ts.tv_nsec >= 1000) meant to be (ts.tv_nsec >=
1000000) as stated in the comment above it?

Is having the D runtime silently replace values reported by the system really a
good idea? After all, if the OS or its core libs are misbehaving, the bug will
presumably be fixed in a future version, and until then, application code
(which knows its own timing needs) is better prepared to decide how to handle
it.

--
Mar 20