digitalmars.D - Coding for solid state drives

Walter Bright <newshound2 digitalmars.com> writes:

http://codecapsule.com/2014/02/12/coding-for-ssds-part-6-a-summary-what-every-programmer-should-know-about-solid-state-drives/

An interesting article. Anyone want to see if there are any modifications we 
should make to std.stdio to work better with SSDs? (Such as changing the buffer 
sizes.)

Apr 24 2015

"weaselcat" <weaselcat gmail.com> writes:

On Friday, 24 April 2015 at 08:27:06 UTC, Walter Bright wrote:
 http://codecapsule.com/2014/02/12/coding-for-ssds-part-6-a-summary-what-every-programmer-should-know-about-solid-state-drives/

 An interesting article. Anyone want to see if there are any 
 modifications we should make to std.stdio to work better with 
 SSDs? (Such as changing the buffer sizes.)

Part 3 specifically covers read/write optimizations, for anyone 
interested in reading.

Apr 24 2015

"tcak" <tcak gmail.com> writes:

On Friday, 24 April 2015 at 08:27:06 UTC, Walter Bright wrote:
 http://codecapsule.com/2014/02/12/coding-for-ssds-part-6-a-summary-what-every-programmer-should-know-about-solid-state-drives/

 An interesting article. Anyone want to see if there are any 
 modifications we should make to std.stdio to work better with 
 SSDs? (Such as changing the buffer sizes.)


This article is not about "coding", but information about SSDs.

Treating spinning drives and SSDs separately means creating two 
separate configurations for the software. So you either:

1. Provide two separate builds, one written for spinning drives 
and another for SSDs. This way the performance can be kept high.

2. Make the configuration changeable at run time, so there will be 
only one executable. But the values won't be constants, so there is 
no compile-time determination of values (buffer size etc.)


For most end users, the 2nd is suitable.
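
A minimal sketch of the two options above, in D. The version identifier `TuneForSSD`, the `IoConfig` struct, the `configFor` function and both buffer sizes are hypothetical placeholders chosen for illustration; none of them come from Phobos or from measurements.

```d
import std.stdio;

// Option 1: the value is fixed at compile time. Building with
// -version=TuneForSSD (a hypothetical identifier) selects the SSD value.
// Both sizes are placeholders, not measured optima.
version (TuneForSSD)
    enum size_t staticBufferSize = 16 * 1024;
else
    enum size_t staticBufferSize = 64 * 1024;

// A compile-time constant can size a static array.
ubyte[staticBufferSize] staticBuffer;

// Option 2: one executable, with the value chosen at run time.
struct IoConfig
{
    size_t bufferSize = 64 * 1024;
}

IoConfig configFor(bool onSsd)
{
    IoConfig cfg;
    if (onSsd)
        cfg.bufferSize = 16 * 1024;
    return cfg;
}

void main()
{
    writeln("compile-time buffer: ", staticBuffer.length, " bytes");

    // The run-time value can only size a dynamically allocated buffer.
    auto cfg = configFor(true);
    auto dynamicBuffer = new ubyte[](cfg.bufferSize);
    writeln("run-time buffer: ", dynamicBuffer.length, " bytes");
}
```

The trade-off shows up in the buffer declarations: only the compile-time constant can be used as a static array length.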

Apr 24 2015

"Kagamin" <spam here.lot> writes:

Shouldn't system cache apply appropriate sync policy?

Apr 24 2015

Walter Bright <newshound2 digitalmars.com> writes:

On 4/24/2015 2:18 AM, tcak wrote:
 On Friday, 24 April 2015 at 08:27:06 UTC, Walter Bright wrote:
 http://codecapsule.com/2014/02/12/coding-for-ssds-part-6-a-summary-what-every-programmer-should-know-about-solid-state-drives/


 An interesting article. Anyone want to see if there are any modifications we
 should make to std.stdio to work better with SSDs? (Such as changing the
 buffer sizes.)


 This article is not about "coding", but information about SSDs.

The section I linked to is definitely about coding for SSDs.


 Treating spinning drives and SSDs separately means creating two separate
 configurations for the software. So you either:

 1. Provide two separate builds, one written for spinning drives and another
 for SSDs. This way the performance can be kept high.

 2. Make the configuration changeable at run time, so there will be only one
 executable. But the values won't be constants, so there is no compile-time
 determination of values (buffer size etc.)


 For most end users, the 2nd is suitable.

Things are configurable in std.stdio. But most people will just use the default 
settings. The default settings should be optimized for SSDs, not spinning drives.

Apr 24 2015

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Friday, 24 April 2015 at 19:35:08 UTC, Walter Bright wrote:
 Things are configurable in std.stdio. But most people will just 
 use the default settings. The default settings should be 
 optimized for SSDs, not spinning drives.

That would be unwise - as HDDs are much slower (and still much 
more common), optimizing for SSDs at the expense of HDD 
performance will cause overall performance to be much worse until 
HDDs become rare.

I mean, assuming that such optimizations aren't just theoretical.

Apr 24 2015

Walter Bright <newshound2 digitalmars.com> writes:

On 4/24/2015 10:26 PM, Vladimir Panteleev wrote:
 On Friday, 24 April 2015 at 19:35:08 UTC, Walter Bright wrote:
 Things are configurable in std.stdio. But most people will just use the
 default settings. The default settings should be optimized for SSDs, not
 spinning drives.

 That would be unwise - as HDDs are much slower (and still much more common),
 optimizing for SSDs at the expense of HDD performance will cause overall
 performance to be much worse until HDDs become rare.

 I mean, assuming that such optimizations aren't just theoretical.

Hard disks are dead today for anyone who cares about performance.

I still use them, but only for secondary storage.

Apr 25 2015

"Xinok" <xinok live.com> writes:

On Saturday, 25 April 2015 at 20:12:55 UTC, Walter Bright wrote:
 Hard disks are dead today for anyone who cares about 
 performance.

 I still use them, but only for secondary storage.

For anybody who wants to buy 4TB of storage for $100, hard drives 
are still very much alive. Not to mention USB flash drives and SD 
cards which don't have the performance characteristics of SSDs.

Let's not be so hasty. Until SSDs truly replace all other forms 
of storage, it's best that we don't optimize D and Phobos for one 
type of storage only.

Apr 25 2015

Walter Bright <newshound2 digitalmars.com> writes:

On 4/25/2015 1:42 PM, Xinok wrote:
 On Saturday, 25 April 2015 at 20:12:55 UTC, Walter Bright wrote:
 Hard disks are dead today for anyone who cares about performance.

 I still use them, but only for secondary storage.

 For anybody who wants to buy 4TB of storage for $100, hard drives are still
 very much alive.

I presume what sensible people wanting speed do is what I do - I have a 256GB 
SSD for my primary drive, and a 4TB drive as secondary.


 Not to mention USB flash drives and SD cards which don't have the
 performance characteristics of SSDs.

They wouldn't behave like spinning disks do, either.


 Let's not be so hasty. Until SSDs truly replace all other forms of storage,
 it's best that we don't optimize D and Phobos for one type of storage only.

Um, it's currently optimized for HDs. But those aren't what people who want
fast IO use.

Apr 25 2015

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Friday, 24 April 2015 at 08:27:06 UTC, Walter Bright wrote:
 http://codecapsule.com/2014/02/12/coding-for-ssds-part-6-a-summary-what-every-programmer-should-know-about-solid-state-drives/

 An interesting article. Anyone want to see if there are any 
 modifications we should make to std.stdio to work better with 
 SSDs? (Such as changing the buffer sizes.)

This article seems to target operating system authors more than 
application programmers, as OS caches will invalidate most 
application-side changes.

The HN comments are also mostly dismissive of this article:
https://news.ycombinator.com/item?id=9431571

Apr 24 2015

Walter Bright <newshound2 digitalmars.com> writes:

On 4/24/2015 10:24 PM, Vladimir Panteleev wrote:
 On Friday, 24 April 2015 at 08:27:06 UTC, Walter Bright wrote:
 http://codecapsule.com/2014/02/12/coding-for-ssds-part-6-a-summary-what-every-programmer-should-know-about-solid-state-drives/


 An interesting article. Anyone want to see if there are any modifications we
 should make to std.stdio to work better with SSDs? (Such as changing the
 buffer sizes.)

 This article seems to target operating system authors more than application
 programmers, as OS caches will invalidate most application-side changes.

 The HN comments are also mostly dismissive of this article:
 https://news.ycombinator.com/item?id=9431571



"The high-level optimizations are important: * Choose a good SSD * Read and 
Write in "page" multiples and "page" aligned * Use lots of parallel IOs (high 
queue depth) * Do not put unrelated data in the same "page"

A page used to be 4KB, SSDs are now switching to 8KB and will switch to 16KB 
later on. Just pick a reasonable size around that (16KB if you can do it will 
last you a while). Don't sweat the page multiples too much, the SSDs will most 
likely have to handle 4KB pages for a long while due to databases and such so 
they will keep some optimization around that size anyhow, it will make it easier 
for them if you use a larger size.

I wouldn't heed any of the advice on single-threading, the biggest performance 
boost comes from parallelism and writes are anyway buffered by the SSD (a good 
SSD has a super-cap to have a good sized write cache)."
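
As a rough illustration of the "reasonable size" advice in the quoted comment, here is a sketch that reads a file through std.stdio in 16KB chunks. The function name `countBytes` is made up for this example, and 16KB is simply the figure from the quote. This only controls the size of the requests the program issues; it does not guarantee that they are page-aligned on the device, and the OS cache may reshape them anyway.

```d
import std.stdio;

// 16KB, the size suggested in the quoted comment (not a measured optimum).
enum size_t chunkSize = 16 * 1024;

/// Counts the bytes in a file by reading it in fixed-size chunks.
ulong countBytes(string path)
{
    auto f = File(path, "rb");

    // Ask the C runtime for a fully buffered stream with a buffer of
    // chunkSize bytes (full buffering is File.setvbuf's default mode).
    f.setvbuf(chunkSize);

    ulong total = 0;
    foreach (ubyte[] chunk; f.byChunk(chunkSize))
        total += chunk.length;
    return total;
}

void main(string[] args)
{
    if (args.length > 1)
        writeln(countBytes(args[1]), " bytes");
}
```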

Apr 25 2015

ketmar <ketmar ketmar.no-ip.org> writes:

On Fri, 24 Apr 2015 01:27:15 -0700, Walter Bright wrote:

 if there are any
 modifications we should make to std.stdio to work better with SSDs?
 (Such as changing the buffer sizes.)

yes: don't do anything. it's OS task to cope with that.

Apr 25 2015

"Laeeth Isharc" <nospamlaeeth nospam.laeeth.com> writes:

On Saturday, 25 April 2015 at 11:34:22 UTC, ketmar wrote:
 On Fri, 24 Apr 2015 01:27:15 -0700, Walter Bright wrote:

 if there are any
 modifications we should make to std.stdio to work better with 
 SSDs?
 (Such as changing the buffer sizes.)

 yes: don't do anything. it's OS task to cope with that.

well beyond the area I know, but it seems like given the relative 
structure of costs for random seeks for SSDs you often want to 
process files in parallel, whereas the opposite is true for 
spinning platters.  The OS can't help you here.

perhaps not for the standard library, but maybe it would be nice 
to have a function to detect whether a path is on an SSD or not.

I am not sure if there is a standard way to detect this.  There 
is a hacky way here:
https://stackoverflow.com/questions/908188/is-there-any-way-of-detecting-if-a-drive-is-a-ssd

and some others check the output of smartmontools.

But surely, it would be a start to make it easy for the user to 
know so she can shape her approach accordingly.
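
For what such a helper might look like on Linux only, here is a sketch based on the sysfs file `/sys/block/<device>/queue/rotational`, which the kernel sets to 0 for non-rotational devices and 1 for rotational ones. The names `Rotational`, `parseRotationalFlag` and `queryRotational` are hypothetical. The sketch takes a block device name such as "sda" and does not attempt the harder step of mapping a file path to its device, and the flag can be misleading for RAID arrays, virtual disks and similar setups.

```d
import std.file : exists, readText;
import std.stdio;
import std.string : strip;

enum Rotational { unknown, no, yes }

/// Interprets the contents of a sysfs "rotational" file.
Rotational parseRotationalFlag(string contents)
{
    switch (contents.strip)
    {
        case "0": return Rotational.no;   // e.g. an SSD
        case "1": return Rotational.yes;  // e.g. a spinning disk
        default:  return Rotational.unknown;
    }
}

/// Queries a block device by name (for example "sda"). Linux only;
/// every other platform, and every failure, reports `unknown`.
Rotational queryRotational(string device)
{
    version (linux)
    {
        immutable path = "/sys/block/" ~ device ~ "/queue/rotational";
        try
        {
            if (path.exists)
                return parseRotationalFlag(readText(path));
        }
        catch (Exception)
        {
            // Fall through and report unknown.
        }
    }
    return Rotational.unknown;
}

void main(string[] args)
{
    if (args.length > 1)
        writeln(args[1], ": ", queryRotational(args[1]));
}
```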

Apr 25 2015

ketmar <ketmar ketmar.no-ip.org> writes:

On Sat, 25 Apr 2015 14:19:30 +0000, Laeeth Isharc wrote:

 But surely, it would be a start to make it easy for the user to know so
 she can shape her approach accordingly.

i believe that this must be controlled with `version` or cli arg, and it 
belongs to application logic, not standard library.

Apr 25 2015

"Laeeth Isharc" <laeeth nospamlaeeth.com> writes:

On Saturday, 25 April 2015 at 16:10:11 UTC, ketmar wrote:
 On Sat, 25 Apr 2015 14:19:30 +0000, Laeeth Isharc wrote:

 But surely, it would be a start to make it easy for the user 
 to know so
 she can shape her approach accordingly.

 i believe that this must be controlled with `version` or cli 
 arg, and it
 belongs to application logic, not standard library.


I defer to your greater expertise.

But I should have thought that if csv parsing belongs in a 
standard library (something that is easy for a user to write 
himself) then detecting whether a path is on an SSD might perhaps 
too.  (Bearing in mind it's more of a system thing not so easy 
for every user to write himself in a platform independent way).


Laeeth.

Apr 25 2015

ketmar <ketmar ketmar.no-ip.org> writes:

On Sat, 25 Apr 2015 16:40:51 +0000, Laeeth Isharc wrote:

 On Saturday, 25 April 2015 at 16:10:11 UTC, ketmar wrote:
 On Sat, 25 Apr 2015 14:19:30 +0000, Laeeth Isharc wrote:

 But surely, it would be a start to make it easy for the user to know
 so she can shape her approach accordingly.

 i believe that this must be controlled with `version` or cli arg, and
 it belongs to application logic, not standard library.

 I defer to your greater expertise.

 But I should have thought that if csv parsing belongs in a standard
 library (something that is easy for a user to write himself) then
 detecting whether a path is on an SSD might perhaps too.  (Bearing in
 mind it's more of a system thing not so easy for every user to write
 himself in a platform independent way).

that wasn't me who put csv parser in. along with json and xml parsers, 
which people are happily replacing anyway. and you want even more crap in 
standard library.

Apr 25 2015

ketmar <ketmar ketmar.no-ip.org> writes:

On Sat, 25 Apr 2015 16:40:51 +0000, Laeeth Isharc wrote:

 On Saturday, 25 April 2015 at 16:10:11 UTC, ketmar wrote:
 On Sat, 25 Apr 2015 14:19:30 +0000, Laeeth Isharc wrote:

 But surely, it would be a start to make it easy for the user to know
 so she can shape her approach accordingly.

 i believe that this must be controlled with `version` or cli arg, and
 it belongs to application logic, not standard library.

 I defer to your greater expertise.

 But I should have thought that if csv parsing belongs in a standard
 library (something that is easy for a user to write himself) then
 detecting whether a path is on an SSD might perhaps too.  (Bearing in
 mind it's more of a system thing not so easy for every user to write
 himself in a platform independent way).

and now something more serious: trying to detect what storage a program 
is using is completely unreliable. you can't optimise for all cases, and you 
can't even detect all cases. big raid which can be faster than SSD with 
"SSD pattern"? ah, ok, nobody cares, we detected it as HDD. virtual 
drive, which can be anything at all? fuse mount point? i can think of 
a lot of those.

that's why operational mode should be controlled by cli switch. if user 
*really* cares about performance, he *will* know what HW he has and how 
to make program fully utilize it. and in other cases let OS i/o scheduler 
do its work without trying to needlessly "help" it.
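
A sketch of what such a cli switch could look like with std.getopt. The option name `--storage`, the `StorageMode` enum and the `parseOptions` function are invented for this example; the default leaves everything to the OS.

```d
import std.getopt;
import std.stdio;

// Hypothetical operating modes; osDefault means "do nothing special".
enum StorageMode { osDefault, hdd, ssd }

struct Options
{
    StorageMode mode = StorageMode.osDefault;
}

/// Parses --storage=osDefault|hdd|ssd from the command line.
/// std.getopt converts the argument to the enum member by name.
Options parseOptions(ref string[] args)
{
    Options opts;
    getopt(args, "storage", &opts.mode);
    return opts;
}

void main(string[] args)
{
    auto opts = parseOptions(args);
    writeln("storage mode: ", opts.mode);
}
```

Running the program with `--storage=ssd` selects the SSD mode; without the switch it stays on `osDefault`.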

Apr 25 2015

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Saturday, 25 April 2015 at 14:19:31 UTC, Laeeth Isharc wrote:
 On Saturday, 25 April 2015 at 11:34:22 UTC, ketmar wrote:
 On Fri, 24 Apr 2015 01:27:15 -0700, Walter Bright wrote:

 if there are any
 modifications we should make to std.stdio to work better with 
 SSDs?
 (Such as changing the buffer sizes.)

 yes: don't do anything. it's OS task to cope with that.

 well beyond the area I know, but it seems like given the 
 relative structure of costs for random seeks for SSDs you often 
 want to process files in parallel, whereas the opposite is true 
 for spinning platters.  The OS can't help you here.

Well, actually, it should. In theory, all you need to do is to 
queue as many reads/writes as you can - using threads, fibers, 
async I/O calls, etc. This is not the same as sequentially 
reading/writing random blocks. The OS I/O scheduler should 
reorder the operations so that the accessed blocks are in order 
and physically close to each other.
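
One way to queue several reads at once is std.parallelism, sketched below. The function name `readSizes` is made up for this example, and whether this helps at all depends on the device, the OS scheduler and the cache, so treat it as something to benchmark rather than a recommendation.

```d
import std.file : read;
import std.parallelism : parallel;
import std.stdio;

/// Reads every file in `paths` and returns the sizes, issuing the
/// reads from the default task pool so that several are in flight.
size_t[] readSizes(string[] paths)
{
    auto sizes = new size_t[](paths.length);
    if (paths.length == 0)
        return sizes;

    // Work unit size 1: each file is its own task. Every iteration
    // writes to a distinct index, so no locking is needed.
    foreach (i, path; parallel(paths, 1))
        sizes[i] = read(path).length;
    return sizes;
}

void main(string[] args)
{
    auto paths = args[1 .. $];
    foreach (i, n; readSizes(paths))
        writeln(paths[i], ": ", n, " bytes");
}
```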

Apr 25 2015
