digitalmars.D.learn - No tempFile() in std.file

reply Nordlöw <per.nordlow gmail.com> writes:
Why isn't there a function, say `tempFile()`, in

https://dlang.org/phobos/std_file.html

that creates a temporary _file_ when we already have

https://dlang.org/phobos/std_file.html#tempDir

?
May 15 2017
next sibling parent reply Jonathan M Davis via Digitalmars-d-learn writes:
On Monday, May 15, 2017 21:52:27 Nordlöw via Digitalmars-d-learn wrote:
 Why isn't there a function, say `tempFile()`, in

 https://dlang.org/phobos/std_file.html

 that creates a temporary _file_ when we already have

 https://dlang.org/phobos/std_file.html#tempDir
We have std.stdio.File.tmpfile, which is kind of terrible, because you can't get at its name, and it gets deleted as soon as the File is destroyed. And we briefly had std.stdio.File.scratchFile (IIRC someone didn't like tempFile for one reason or another, which is what I had named it originally), but it pulled in enough dependencies (IIRC, because it used tempDir, which is in std.file, std.datetime got pulled in, because other stuff in std.file uses SysTime, and std.datetime pulled in yet more...) that hello world grew considerably in size (since it uses std.stdio and File for writeln), and there was enough screaming about that that the function was removed. I don't remember what the proposal was for what we needed to do to fix having a bunch of Phobos pulled in to make scratchFile work, but until that's been sorted out, we're kind of stuck, as stupid as that is:

https://issues.dlang.org/show_bug.cgi?id=14599

Personally, I think that it would be very much worth making hello world larger, since hello world really doesn't matter, but because there are plenty of folks checking out D who write hello world and then look at the executable size, it was considered unacceptable for it to get much larger.

Now, we _could_ add a tempFile to std.file which returned a supposedly unique filename in tempDir, but simply using it would result in a race condition, because (as unlikely as it may be) someone else could create the file between the time that you get the file name from tempFile and the time when you try to open it to write to it - that's why a solution that involves opening the file is needed.

I suppose that we could add a tempFile that did what std.stdio.File.scratchFile did but created an empty file and returned its path rather than returning a File, though that would be a bit annoying, since you'd then have to open it to operate on it instead of just writing to it. Maybe it would be worth doing though given the stupidity blocking std.stdio.File.scratchFile.

- Jonathan M Davis
May 15 2017
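To make the limitation of std.stdio.File.tmpfile concrete, here is a minimal sketch. The helper name roundTripThroughTmpfile is made up for illustration; only File.tmpfile, write, rewind and rawRead are Phobos calls. The data is only reachable through the File handle, and there is no path you could hand to another function or program.

```d
import std.stdio : File;

/// Hypothetical helper: writes `text` to an anonymous temporary file
/// and reads it back through the same handle.
string roundTripThroughTmpfile(string text)
{
    if (text.length == 0)
        return "";                 // rawRead rejects an empty buffer

    auto f = File.tmpfile();       // anonymous scratch file, no usable path
    f.write(text);
    f.rewind();                    // back to the start before reading

    auto buf = new char[text.length];
    auto got = f.rawRead(buf);     // slice of buf that was actually filled
    return got.idup;
}   // f goes out of scope here and the underlying file is deleted
```

Calling roundTripThroughTmpfile("scratch data") should give the same string back, but once the function returns nothing is left on disk, which is exactly why this API does not help when a file name is needed.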
next sibling parent reply Era Scarecrow <rtcvb32 yahoo.com> writes:
On Monday, 15 May 2017 at 22:38:15 UTC, Jonathan M Davis wrote:
 Personally, I think that it would be very much worth making 
 hello world larger, since hello world really doesn't matter, 
 but because there are plenty of folks checking out D who write 
 hello world and then look at the executable size, it was 
 considered unacceptable for it to get much larger.
I'm reminded of doing the same thing with C++ using streams and saw the size explode from 60k or so to something like 400k, for seemingly no good reason at all. Hmmm while we're on the subject of size, is there a tool to strip out functions that are never used from the final executable?
May 15 2017
parent reply Anonymouse <asdf asdf.net> writes:
On Tuesday, 16 May 2017 at 05:09:12 UTC, Era Scarecrow wrote:
 On Monday, 15 May 2017 at 22:38:15 UTC, Jonathan M Davis wrote:
 Personally, I think that it would be very much worth making 
 hello world larger, since hello world really doesn't matter, 
 but because there are plenty of folks checking out D who write 
 hello world and then look at the executable size, it was 
 considered unacceptable for it to get much larger.
I'm reminded of doing the same thing with C++ using streams and saw the size explode from 60k or so to something like 400k, for seemingly no good reason at all. Hmmm while we're on the subject of size, is there a tool to strip out functions that are never used from the final executable?
Linker --gc-sections, though in my experience it doesn't cull much. Add --print-gc-sections to see what it does remove.
May 16 2017
parent Jacob Carlborg <doob me.com> writes:
On 2017-05-16 09:39, Anonymouse wrote:

 Linker --gc-sections
IIRC that only works with LDC. With DMD it's possible that it removes sections that are used but not directly referenced.

--
/Jacob Carlborg
May 16 2017
prev sibling parent reply bachmeier <no spam.net> writes:
On Monday, 15 May 2017 at 22:38:15 UTC, Jonathan M Davis wrote:
 I suppose that we could add a tempFile that did what 
 std.stdio.File.scratchFile did but create an empty file and 
 return its path rather than returning a File, though that would 
 be a bit annoying, since you'd then have to open it to operate 
 on it instead of just writing to it. Maybe it would be worth 
 doing though given the stupidity blocking 
 std.stdio.File.scratchFile.
That seems perfectly reasonable to me. Couldn't the function return both the path and the file in a struct? This is something that really should be in Phobos. It's one of those little things that makes D a lot less pleasurable to work with, at least for anyone needing that functionality.
May 16 2017
next sibling parent reply Jonathan M Davis via Digitalmars-d-learn writes:
On Tuesday, May 16, 2017 11:19:14 bachmeier via Digitalmars-d-learn wrote:
 On Monday, 15 May 2017 at 22:38:15 UTC, Jonathan M Davis wrote:
 I suppose that we could add a tempFile that did what
 std.stdio.File.scratchFile did but create an empty file and
 return its path rather than returning a File, though that would
 be a bit annoying, since you'd then have to open it to operate
 on it instead of just writing to it. Maybe it would be worth
 doing though given the stupidity blocking
 std.stdio.File.scratchFile.
That seems perfectly reasonable to me. Couldn't the function return both the path and the file in a struct? This is something that really should be in Phobos. It's one of those little things that makes D a lot less pleasurable to work with, at least for anyone needing that functionality.
std.file doesn't have anything to do with File. It only operates on entire files at a time, so it wouldn't make sense for a function in std.file to return a std.stdio.File. At most what would make sense to me would be to have a function in std.file which created the file as empty and closed it and then returned the file name for the program to then open or do whatever else it wants with - which would actually be perfectly fine if you then wanted to use std.file.write or similar to the file. It's just more annoying if you want a File, because then you end up effectively opening the file twice. - Jonathan M Davis
May 16 2017
next sibling parent bachmeier <no spam.net> writes:
On Tuesday, 16 May 2017 at 13:56:57 UTC, Jonathan M Davis wrote:

 std.file doesn't have anything to do with File. It only 
 operates on entire files at a time, so it wouldn't make sense 
 for a function in std.file to return a std.stdio.File. At most 
 what would make sense to me would be to have a function in 
 std.file which created the file as empty and closed it and then 
 returned the file name for the program to then open or do 
 whatever else it wants with - which would actually be perfectly 
 fine if you then wanted to use std.file.write or similar to the 
 file. It's just more annoying if you want a File, because then 
 you end up effectively opening the file twice.

 - Jonathan M Davis
Okay, now I see your point. Your proposal is still a lot better than doing nothing.
May 16 2017
prev sibling parent reply Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Tuesday, 16 May 2017 at 13:56:57 UTC, Jonathan M Davis wrote:
 On Tuesday, May 16, 2017 11:19:14 bachmeier via 
 Digitalmars-d-learn wrote:
 On Monday, 15 May 2017 at 22:38:15 UTC, Jonathan M Davis wrote:
 [...]
That seems perfectly reasonable to me. Couldn't the function return both the path and the file in a struct? This is something that really should be in Phobos. It's one of those little things that makes D a lot less pleasurable to work with, at least for anyone needing that functionality.
std.file doesn't have anything to do with File. It only operates on entire files at a time, so it wouldn't make sense for a function in std.file to return a std.stdio.File. At most what would make sense to me would be to have a function in std.file which created the file as empty and closed it and then returned the file name for the program to then open or do whatever else it wants with - which would actually be perfectly fine if you then wanted to use std.file.write or similar to the file. It's just more annoying if you want a File, because then you end up effectively opening the file twice. - Jonathan M Davis
As your solution doesn't inherently solve the race condition associated with temporary files, you could still generate the name with a wrapper around tempnam() or tmpnam() (on Posix; for Windows I don't know). This would avoid the double open() of the scenario above.
May 16 2017
parent reply Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Wednesday, 17 May 2017 at 05:30:40 UTC, Patrick Schluter wrote:
 On Tuesday, 16 May 2017 at 13:56:57 UTC, Jonathan M Davis wrote:
 [...]
As your solution doesn't inherently solve the race condition associated with temporary files, you could still generate the name with a wrapper around tempnam() or tmpnam() (on Posix; for Windows I don't know). This would avoid the double open() of the scenario above.
But as Jonathan said above, this is not a good solution in any case. On Posix, the use of the mks*temp() family of functions is standard now.
May 16 2017
parent Jonathan M Davis via Digitalmars-d-learn writes:
On Wednesday, May 17, 2017 05:34:50 Patrick Schluter via Digitalmars-d-learn 
wrote:
 On Wednesday, 17 May 2017 at 05:30:40 UTC, Patrick Schluter wrote:
 On Tuesday, 16 May 2017 at 13:56:57 UTC, Jonathan M Davis wrote:
 [...]
As your solution doesn't inherently solve the race condition associated with temporary files, you could still generate the name with a wrapper around tempnam() or tmpnam() (on Posix; for Windows I don't know). This would avoid the double open() of the scenario above.
But as Jonathan said above, this is not a good solution in any case. On Posix, the use of the mks*temp() family of functions is standard now.
As I recall, the main problem with mks*temp() is that some platforms only support a stupidly short list of random values (something like 26 IIRC). Regardless, it's trivial to write that functionality yourself. The key is that you open the file with the flag that indicates that the file must not exist when you open it. Worst case, you have to try a few times with different file names, but if you generate something like a UUID, then the odds of a collision are so low that that's not realistically an issue. And AFAIK, for mks*temp() to guarantee that the file was created by the call, it has to do basically the same thing internally. But in any case, even if we wanted to use mks*temp() in a D program, we'd want to wrap it in a portable D function using D's strings and not char*. So, whether mks*temp() is used is an implementation detail.

For the use case where you actually want a file name rather than just a file handle to play around with, you're almost certainly going to be closing and reopening the file or closing it and handing its path off to another program, in which case, the only real downside to having a function that securely creates an empty temporary file and returns its path is that you then have to open it a second time to put actual data in it, whereas a function like scratchFile lets you start writing to it without opening it again.

Regardless, as long as a program can get the path to the file, and has the appropriate permissions, as I understand it, there's a risk of them screwing with it even if you have it open (at least on POSIX systems which don't normally use file locks and where - as I understand it - obeying the file locks is completely optional, unlike Windows which likes to lock everything).

So, while having scratchFile would definitely be nice, maybe a good solution would be to add a function to std.file such as writeToTempFile which acted like std.file.write except that it created a temporary file which was guaranteed to not exist prior to the call and returned its path after it wrote to it. If you want to use std.stdio.File with it, you're still forced to write an empty file and then open it again with File, but for the case where you simply want to write the data to a file and then pass the path to something else to use it (be it something in your program or another program), it should work well. Certainly, it would be far better than what we have now (which would be nothing).

So, I should probably dig up my scratchFile implementation and adapt it for std.file as something like writeToTempFile or createTempFile.

- Jonathan M Davis
May 17 2017
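A rough sketch of what the writeToTempFile idea from the post above could look like on POSIX. This is not the removed scratchFile implementation and not anything that exists in Phobos; the signature, the "tmp-" prefix, the maxAttempts parameter and the 0600 permissions are all assumptions made for illustration. The part taken from the post is the mechanism: pick a UUID-based name, open it with the flag that fails if the file already exists (O_EXCL), and retry with a new name on a collision.

```d
// POSIX-only sketch; a Windows version would need a different system call.
import core.stdc.errno : errno, EEXIST;
import core.sys.posix.fcntl : open, O_CREAT, O_EXCL, O_WRONLY;
import core.sys.posix.unistd : close, write;
import std.conv : octal;
import std.exception : ErrnoException;
import std.file : tempDir;
import std.path : buildPath;
import std.string : toStringz;
import std.uuid : randomUUID;

/// Hypothetical helper: creates a new file in tempDir, writes `data`
/// to it and returns its path. The caller is responsible for removing it.
string writeToTempFile(const(void)[] data, int maxAttempts = 8)
{
    foreach (attempt; 0 .. maxAttempts)
    {
        // A random UUID makes a name collision extremely unlikely.
        string path = buildPath(tempDir(), "tmp-" ~ randomUUID().toString());

        // O_CREAT | O_EXCL: open() fails with EEXIST if the path already
        // exists, so success means this call created the file.
        immutable int fd = open(path.toStringz,
                                O_WRONLY | O_CREAT | O_EXCL, octal!600);
        if (fd == -1)
        {
            if (errno == EEXIST)
                continue;          // name already taken, try another one
            throw new ErrnoException("Could not create " ~ path);
        }
        scope (exit) close(fd);

        // write() may accept fewer bytes than requested, so loop.
        auto rest = data;
        while (rest.length)
        {
            immutable n = write(fd, rest.ptr, rest.length);
            if (n < 0)
                throw new ErrnoException("Could not write to " ~ path);
            rest = rest[cast(size_t) n .. $];
        }
        return path;
    }
    throw new Exception("No unused temporary file name found");
}
```

Usage would be along the lines of `auto path = writeToTempFile("some data");`, after which the path can be passed to std.file.readText, opened with std.stdio.File, or handed to another program. Because the data goes through the descriptor from the exclusive open, the file is only opened once for the plain "write and hand off the path" case.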
prev sibling next sibling parent "H. S. Teoh via Digitalmars-d-learn" <digitalmars-d-learn puremagic.com> writes:
On Tue, May 16, 2017 at 06:56:57AM -0700, Jonathan M Davis via
Digitalmars-d-learn wrote:
 On Tuesday, May 16, 2017 11:19:14 bachmeier via Digitalmars-d-learn wrote:
 On Monday, 15 May 2017 at 22:38:15 UTC, Jonathan M Davis wrote:
 I suppose that we could add a tempFile that did what
 std.stdio.File.scratchFile did but create an empty file and return
 its path rather than returning a File, though that would be a bit
 annoying, since you'd then have to open it to operate on it
 instead of just writing to it. Maybe it would be worth doing
 though given the stupidity blocking std.stdio.File.scratchFile.
[...] Don't forget that there are security concerns related to this. Most modern OS APIs tend to prefer a temp file creation call that atomically (1) generates a unique filename and (2) creates a file with said filename with permissions set such that it can only be exclusively used by the calling process.

The reason for this is that there is a race condition between the generation of the filename and the creation of the file. On Posix, for example, an attacker could guess the generated filename and preemptively create the temp file with unexpected content that may cause the program to malfunction (usually to trigger another security flaw and thereby cause arbitrary code execution or privilege escalation, etc.), or substitute a symlink that points to a file the program shouldn't be writing to / reading from (e.g., /etc/passwd). Actual exploits have been carried out using this route, hence it's something application programmers need to be aware of.

(Note that even if your program doesn't do anything directly security related, it may still be an issue; e.g., it could be a music player but an attacker could potentially leverage it to read sensitive files by redirecting file operations or cause a malfunction that gives the attacker escalation of privileges, e.g., a remote attacker gaining shell access on a local user account, from which further attacks can be launched. In this day and age of advanced exploits being widely disseminated over the 'net, these scenarios are far more likely than we'd like to think.)

T

--
Without effort, you won't even pull a fish out of the pond. (Russian proverb)
May 16 2017
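For reference, the POSIX call that does the atomic "generate a name and create the file" step described above is mkstemp(), which also creates the file with owner-only permissions. Below is a minimal sketch of wrapping it so that callers deal with D strings rather than char*. The function name makeTempFilePath and the "tmp-" prefix are invented for this example; it is POSIX-only and makes no claim about how Phobos would or should implement this.

```d
// POSIX-only sketch built on the C library's mkstemp().
import core.sys.posix.stdlib : mkstemp;
import core.sys.posix.unistd : close;
import std.exception : errnoEnforce;
import std.file : tempDir;
import std.path : buildPath;

/// Hypothetical wrapper: atomically creates an empty, uniquely named
/// file in tempDir and returns its path as a D string.
string makeTempFilePath(string prefix = "tmp-")
{
    // mkstemp() replaces the trailing "XXXXXX" in place, so it needs a
    // mutable, zero-terminated buffer instead of an immutable D string.
    char[] templ = (buildPath(tempDir(), prefix ~ "XXXXXX") ~ '\0').dup;

    immutable int fd = mkstemp(templ.ptr);
    errnoEnforce(fd != -1, "mkstemp failed");
    close(fd);                     // only the path is wanted here

    return templ[0 .. $ - 1].idup; // drop the terminating '\0'
}
```

The returned path names a file that already exists and is empty, so the name cannot be claimed by someone else between generating it and creating the file. Anything done with the path afterwards (reopening it, passing it to another process) is still subject to ordinary file permissions.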
prev sibling parent "H. S. Teoh via Digitalmars-d-learn" <digitalmars-d-learn puremagic.com> writes:
On Tue, May 16, 2017 at 08:06:13AM -0700, H. S. Teoh via Digitalmars-d-learn
wrote:
 On Tue, May 16, 2017 at 06:56:57AM -0700, Jonathan M Davis via
Digitalmars-d-learn wrote:
 On Tuesday, May 16, 2017 11:19:14 bachmeier via Digitalmars-d-learn wrote:
 On Monday, 15 May 2017 at 22:38:15 UTC, Jonathan M Davis wrote:
 I suppose that we could add a tempFile that did what
 std.stdio.File.scratchFile did but create an empty file and return
 its path rather than returning a File, though that would be a bit
 annoying, since you'd then have to open it to operate on it
 instead of just writing to it. Maybe it would be worth doing
 though given the stupidity blocking std.stdio.File.scratchFile.
[...] Don't forget that there are security concerns related to this.
[...] Cf.:

https://cwe.mitre.org/data/definitions/377.html
https://cwe.mitre.org/data/definitions/378.html
https://security.stackexchange.com/questions/34397/how-can-an-attacker-use-a-fake-temp-file-to-compromise-a-program
https://security.openstack.org/guidelines/dg_using-temporary-files-securely.html

T

--
A bend in the road is not the end of the road unless you fail to make the turn. -- Brian White
May 16 2017
prev sibling parent Jonathan M Davis via Digitalmars-d-learn writes:
On Tuesday, May 16, 2017 08:06:13 H. S. Teoh via Digitalmars-d-learn wrote:
 On Tue, May 16, 2017 at 06:56:57AM -0700, Jonathan M Davis via 
Digitalmars-d-learn wrote:
 On Tuesday, May 16, 2017 11:19:14 bachmeier via Digitalmars-d-learn 
wrote:
 On Monday, 15 May 2017 at 22:38:15 UTC, Jonathan M Davis wrote:
 I suppose that we could add a tempFile that did what
 std.stdio.File.scratchFile did but create an empty file and return
 its path rather than returning a File, though that would be a bit
 annoying, since you'd then have to open it to operate on it
 instead of just writing to it. Maybe it would be worth doing
 though given the stupidity blocking std.stdio.File.scratchFile.
[...] Don't forget that there are security concerns related to this. Most modern OS APIs tend to prefer a temp file creation call that atomically (1) generates a unique filename and (2) creates a file with said filename with permissions set such that it can only be exclusively used by the calling process. The reason for this is that there is a race condition between the generation of the filename and the creation of the file.
[...] Yes, which is why you use the system call that fails if the file already exists when you open it. std.stdio.File.scratchFile dealt with all of that (and AFAIK, did so correctly, though I could have missed something). And we'd have it in Phobos still if it weren't for the complaints about hello world's executable size. But all of the subtleties around that mess are why we don't have a std.file.tempFile which simply returns a suggested file name. You'd _think_ that it would be a simple issue, but unfortunately, it's not.

- Jonathan M Davis
May 16 2017