digitalmars.D - Thread-local storage and Performance

dsimcha (6/6) Oct 26 2009 Has D's builtin TLS been optimized in the past 6 months to year? I had

Pelle Månsson (3/9) Oct 26 2009 I was under the impression that TLS should be faster due to absence of

Denis Koroskin (8/19) Oct 26 2009 __gshared variables don't have any locks/barriers associated with them.
dsimcha (8/17) Oct 26 2009 __gshared == old-skool cowboy sharing, i.e. plain old unsynchronized glo...

Walter Bright (4/23) Oct 26 2009 Nothing has changed. What I would do is to look at the assembler output

dsimcha <dsimcha yahoo.com> writes:

Has D's builtin TLS been optimized in the past 6 months to year?  I had
benchmarked it awhile back when optimizing some code that I wrote and
discovered it was significantly slower than regular globals (the kind that are
now __gshared).  Now, at least on Windows, it seems that there is no
discernible difference and if anything, TLS is slightly faster than __gshared.
 What's changed?
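
A comparison of this kind can be sketched roughly as follows. This is only an illustration, not the benchmark referred to in the post: the names `tlsAcc`, `plainAcc` and `timeIncrements` are made up here, and it assumes a D2 compiler with a Phobos that provides `std.datetime.stopwatch`.

```d
import std.datetime.stopwatch : StopWatch, AutoStart;
import core.time : Duration;

int tlsAcc;              // module-level variables are thread-local by default in D2
__gshared int plainAcc;  // plain unsynchronized global

// Times `iterations` increments of each variable on the calling thread.
// Returns [time for the TLS variable, time for the __gshared variable].
Duration[2] timeIncrements(uint iterations)
{
    tlsAcc = 0;
    plainAcc = 0;

    auto sw = StopWatch(AutoStart.yes);
    foreach (i; 0 .. iterations)
        tlsAcc += i & 1;          // read-modify-write of the TLS variable
    Duration tlsTime = sw.peek();

    sw.reset();                   // restart timing; the stopwatch keeps running
    foreach (i; 0 .. iterations)
        plainAcc += i & 1;        // same work on the plain global
    Duration plainTime = sw.peek();

    Duration[2] result = [tlsTime, plainTime];
    return result;
}
```

An optimizing build may keep either accumulator in a register for the whole loop, in which case both timings measure the same thing, so numbers from a loop like this are only meaningful once the generated code has been inspected.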

Oct 26 2009

Pelle Månsson <pelle.mansson gmail.com> writes:

dsimcha wrote:
 Has D's builtin TLS been optimized in the past 6 months to year?  I had
 benchmarked it awhile back when optimizing some code that I wrote and
 discovered it was significantly slower than regular globals (the kind that are
 now __gshared).  Now, at least on Windows, it seems that there is no
 discernible difference and if anything, TLS is slightly faster than __gshared.
  What's changed?

I was under the impression that TLS should be faster due to absence of 
synchronization.

Oct 26 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Mon, 26 Oct 2009 18:26:02 +0300, Pelle Månsson <pelle.mansson gmail.com> wrote:

 dsimcha wrote:
 Has D's builtin TLS been optimized in the past 6 months to year?  I had
 benchmarked it awhile back when optimizing some code that I wrote and
 discovered it was significantly slower than regular globals (the kind that are
 now __gshared).  Now, at least on Windows, it seems that there is no
 discernible difference and if anything, TLS is slightly faster than __gshared.
  What's changed?

 I was under the impression that TLS should be faster due to absence of
 synchronization.

__gshared variables don't have any locks/barriers associated with them.
TLS should be slightly slower due to an additional indirection, but I
don't think it would be noticeable.
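
The difference between the two storage classes can be illustrated with a small sketch. The names `tlsCounter`, `sharedCounter` and `bumpInWorker` are invented for this example; it assumes D2 semantics, where a module-level variable is thread-local unless marked otherwise.

```d
import core.thread : Thread;

int tlsCounter = 0;               // one copy per thread (TLS)
__gshared int sharedCounter = 0;  // one copy per process, no locks or barriers

// Starts a worker thread that increments both counters, waits for it,
// and returns what the calling thread sees afterwards.
int[2] bumpInWorker()
{
    auto t = new Thread({
        tlsCounter += 1;     // modifies the worker's own copy only
        sharedCounter += 1;  // modifies the single process-wide variable
    });
    t.start();
    t.join();                // the worker is finished before the values are read

    int[2] seen = [tlsCounter, sharedCounter];
    return seen;
}
```

On the first call from a thread that has not touched either counter, the result is [0, 1]: the worker's increment of `tlsCounter` is invisible to the caller, while the increment of `sharedCounter` is not.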

Oct 26 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Pelle Månsson (pelle.mansson gmail.com)'s article
 dsimcha wrote:
 Has D's builtin TLS been optimized in the past 6 months to year?  I had
 benchmarked it awhile back when optimizing some code that I wrote and
 discovered it was significantly slower than regular globals (the kind that are
 now __gshared).  Now, at least on Windows, it seems that there is no
 discernible difference and if anything, TLS is slightly faster than __gshared.
  What's changed?

 I was under the impression that TLS should be faster due to absence of
 synchronization.

__gshared == old-skool cowboy sharing, i.e. plain old unsynchronized globals.

Without getting into the details of my specific case, the reason I'm interested
in this is that I have some code that I want to be as fast as possible in both
single- and multithreaded environments.  Right now, it has a hack that checks
thread_needLock() and uses plain old globals for everything as long as the
program is single-threaded because that seemed faster than TLS lookups a while
ago.  However, running the same benchmark again shows otherwise.
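
The shape of such a hack might look something like the sketch below. It is not the actual code being discussed: `multiThreaded` is a hypothetical flag standing in for the thread_needLock() check so that the example does not depend on a particular runtime version, and `globalHits`, `tlsHits` and `hits` are illustrative names.

```d
// Hypothetical stand-in for thread_needLock(): set to true before the
// first extra thread is started and never reset.
__gshared bool multiThreaded = false;

__gshared int globalHits;  // used while the program is single-threaded
int tlsHits;               // per-thread copy, used once threads exist

// Returns a reference to the counter that is appropriate for the current mode.
ref int hits()
{
    if (!multiThreaded)
        return globalHits;  // plain global, no TLS lookup
    return tlsHits;         // thread-local copy
}
```

Callers write `hits() += 1;` and never name either variable directly. The branch on every access has a cost of its own, which is one reason the trick only pays off if TLS lookups are measurably slower than plain globals.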

Oct 26 2009

Walter Bright <newshound1 digitalmars.com> writes:

dsimcha wrote:
 == Quote from Pelle Månsson (pelle.mansson gmail.com)'s article
 dsimcha wrote:
 Has D's builtin TLS been optimized in the past 6 months to year?  I had
 benchmarked it awhile back when optimizing some code that I wrote and
 discovered it was significantly slower than regular globals (the kind that are
 now __gshared).  Now, at least on Windows, it seems that there is no
 discernible difference and if anything, TLS is slightly faster than __gshared.
  What's changed?

 I was under the impression that TLS should be faster due to absence of
 synchronization.

 
 __gshared == old-skool cowboy sharing, i.e. plain old unsynchronized globals.
 
 Without getting into the details of my specific case, the reason I'm interested
 in this is that I have some code that I want to be as fast as possible in both
 single- and multithreaded environments.  Right now, it has a hack that checks
 thread_needLock() and uses plain old globals for everything as long as the
 program is single-threaded because that seemed faster than TLS lookups a while
 ago.  However, running the same benchmark again shows otherwise.

Nothing has changed. What I would do is to look at the assembler output 
and verify that the TLS globals really are TLS, and the ones that are 
not are really not.
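
One way to do that check is to compile a file containing nothing but two trivial accessors and compare their disassembly. The sketch below uses invented names (`tlsValue`, `plainValue`, `readTls`, `readPlain`), and the commands in the comments are only examples of tools that can produce a listing; which one applies depends on the compiler and platform.

```d
// tlscheck.d: two tiny accessors whose generated code can be compared.
//
// Possible ways to get a listing (pick whatever the toolchain offers):
//   dmd -c -O tlscheck.d   then disassemble the object file,
//                          e.g. with obj2asm or objdump -d
//   ldc2 -O -output-s tlscheck.d
//   gdc -O2 -S tlscheck.d

int tlsValue;              // should be thread-local in D2
__gshared int plainValue;  // should be an ordinary global

int readTls()   { return tlsValue; }    // expect a TLS access sequence here
int readPlain() { return plainValue; }  // expect a direct load from a fixed address
```

If `readTls` turns out to be a plain load from a fixed address just like `readPlain`, the variable is not actually thread-local in that build. A genuine TLS access typically goes through a segment register such as FS or GS, or through a call to a runtime helper, depending on the platform.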

Oct 26 2009
