digitalmars.D.bugs - [Issue 11365] New: Allow D source file names to have no extension

d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=11365

           Summary: Allow D source file names to have no extension
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: DMD
        AssignedTo: nobody puremagic.com
        ReportedBy: bugzilla digitalmars.com


--- Comment #0 from Walter Bright <bugzilla digitalmars.com> 2013-10-26
14:10:21 PDT ---
---- eles writes ----
This forces scripts to bear the .d extension. For example, if you write a
script on Linux named "git-test" and you put at the top:

#!rdmd

rdmd will pass its name to dmd, and dmd will try to compile... "git-test.d",
which does not exist.

Now you have to either rename "git-test" to "git-test.d" or create a hardlink
named "git-test.d" that points to "git-test", so that dmd's hunger for ".d" is
finally satisfied.

The hardlink solution carries the well-known burden of redundancy, not to
mention that it is silly and makes backups a mess.

OTOH, renaming the original script to "git-test.d" has an undesirable effect
with respect to git.

git has a nice convention that lets you extend its command list by writing your
own "git-command1", "git-command2" scripts, which git invokes automatically
when you type:

"git command1" (this will invoke "git-command1") etc.

The problem with being forced to rename "git-command1" to "git-command1.d" is
that, afterwards, you have to type the following command for git:

"git command1.d" (in order to have "git-command1.d" invoked, as "git-command1"
simply does not exist or, if it did exist, dmd would be blind to it).

So you cannot type "git command1" and have a "git-command1" script invoked,
because git won't search for "git-command1.d", while dmd won't compile
"git-command1".

So you need both "git-command1" and "git-command1.d" doing the same thing, just
to be able to type "git command1" (not to mention that this also lets you
invoke "git command1.d", which is ugly and undesired redundancy).

Now, imagine yourself having to type:

"git checkout.d ."
"git commit.d"
"git log.d"

instead of

"git checkout ."
"git commit"
"git log"

and tell me that ".d" is not an issue.
----------------------

To that end, I propose that for:

    dmd foo

dmd will treat 'foo' as the source file if it does not find foo.d or
foo.di.
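
The proposed lookup rule can be sketched as follows. This is only an
illustration in Python, not dmd's actual code: `resolve_source` and its
`exists` parameter are hypothetical names, and the relative order of foo.d and
foo.di is an assumption, since the proposal does not state it.

```python
import os


def resolve_source(name, exists=os.path.exists):
    """Pick the file to compile for `dmd name` under the proposed rule.

    Hypothetical helper: prefer name.d, then name.di (order assumed),
    and fall back to the bare name only if neither exists.
    """
    for candidate in (name + ".d", name + ".di"):
        if exists(candidate):
            return candidate
    return name if exists(name) else None


# Simulated directory listings stand in for a real file system.
script_only = {"git-test"}
both = {"foo", "foo.d"}

print(resolve_source("git-test", exists=script_only.__contains__))  # git-test
print(resolve_source("foo", exists=both.__contains__))              # foo.d
```

Note that under this rule a file with the .d extension still wins over an
extensionless file of the same name.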

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Oct 26 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=11365


Vladimir Panteleev <thecybershadow gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |thecybershadow gmail.com


--- Comment #1 from Vladimir Panteleev <thecybershadow gmail.com> 2013-10-27
01:04:00 EEST ---
I should note that "auto-correcting" file names has security implications.

Let's suppose that there exists an upload script file, written in D, called
"upload", in the root of a web server's public directory. The upload script
goes like this:

#!rdmd
(code follows)

The upload script allows users to upload files with any name to the same
directory. Naturally, for security reasons, none of the uploaded files can be
executable, and it's not possible to overwrite the upload script by uploading a
file with the same name.

Now, what happens if someone uploads a file called "upload.d"?

The webserver runs "upload", which runs "rdmd upload", which runs "dmd upload",
which compiles the file "upload.d", and not "upload". The uploader has
successfully got their code running on the server.

Possible solutions:
1) deprecate then remove all name auto-correction features from dmd and rdmd
2) forbid compilation if an ambiguity exists due to name auto-correction
(although this turns an RCE vulnerability into a DoS vulnerability)
3) remove auto-correction features from rdmd; make rdmd pass a flag to dmd that
disables name auto-correction
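
Option 2 could look roughly like the sketch below. It is written in Python
purely for illustration and is not code from dmd or rdmd; `resolve_strict`,
`AmbiguousSourceError` and the candidate list (bare name, .d, .di) are
assumptions made for the example.

```python
class AmbiguousSourceError(Exception):
    """Raised when more than one file could be meant by the given name."""


def resolve_strict(name, exists):
    """Resolve `name` but refuse to guess when several candidates exist.

    Hypothetical helper illustrating option 2; the candidate list
    (bare name, .d, .di) is an assumption.
    """
    candidates = [c for c in (name, name + ".d", name + ".di") if exists(c)]
    if len(candidates) > 1:
        raise AmbiguousSourceError(
            "%r is ambiguous: %s" % (name, ", ".join(candidates))
        )
    return candidates[0] if candidates else None


# The upload scenario: an attacker has placed "upload.d" next to "upload".
web_root = {"upload", "upload.d"}
try:
    resolve_strict("upload", web_root.__contains__)
except AmbiguousSourceError as err:
    print("refusing to compile:", err)
```

The uploaded file is never compiled, but the legitimate script stops working
for as long as "upload.d" exists, which is the DoS trade-off noted in option 2.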

---------------------------------------------------

Another problem with this suggestion:

echo 'void main(){}' > foo.d
dmd foo
rm foo.d
dmd foo

dmd will now try to parse the compiled binary file "foo" as D source code.

Oct 26 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=11365



--- Comment #2 from Vladimir Panteleev <thecybershadow gmail.com> 2013-10-27
01:06:08 EEST ---
One thing I forgot to mention regarding name auto-correction: perhaps the most
famous security problem caused by such a misfeature is the "MultiViews"
feature in the Apache web server. When enabled, a request for foo.php could
execute foo.php.txt if foo.php was not found. This allowed bypassing upload
script validation checks. Search the web for "MultiViews vulnerability" for
more details.

Oct 26 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=11365


Jacob Carlborg <doob me.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |doob me.com


--- Comment #3 from Jacob Carlborg <doob me.com> 2013-10-27 03:12:32 PDT ---
(In reply to comment #1)
 3) remove auto-correction features from rdmd; make rdmd pass a flag to dmd that
 disables name auto-correction

That won't fix the problem if one is using "dmd -run".

Oct 27 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=11365


Leandro Lucarella <leandro.lucarella sociomantic.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |leandro.lucarella sociomant
                   |                            |ic.com
            Summary|Allow D source file names   |Allow D source file names
                   |to have no extension        |to have no extension (or an
                   |                            |arbitrary extension)


--- Comment #4 from Leandro Lucarella <leandro.lucarella sociomantic.com>
2013-10-27 05:05:52 PDT ---
I just updated the title of the issue: arbitrary extension names should be
allowed for the same reason. I also agree with Vladimir: the compiler shouldn't
add any extension when the file is not found.

Oct 27 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=11365


Andrej Mitrovic <andrej.mitrovich gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andrej.mitrovich gmail.com


--- Comment #5 from Andrej Mitrovic <andrej.mitrovich gmail.com> 2013-10-30
08:34:33 PDT ---
Btw, no extensions might be fine, but I'm totally against D sources having
arbitrary extensions. People will start doing the same thing C++ programmers do
and start inventing 20 different extensions for D sources, so you end up with
extensions like:

.cpp
.cxx
.cp
.cc
.c++

See also:
http://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/Overall-Options.html

This would just make creating software that deals with D files harder, with no
benefits.

Oct 30 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=11365



--- Comment #6 from Andrej Mitrovic <andrej.mitrovich gmail.com> 2013-10-30
08:36:07 PDT ---
(In reply to comment #4)
 I just updated the title of the issue: arbitrary extension names should be
 allowed for the same reason.

So you're the one adding this. What benefit do you see with arbitrary
extensions?

Oct 30 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=11365



--- Comment #7 from Leandro Lucarella <leandro.lucarella sociomantic.com>
2013-10-30 08:41:42 PDT ---
(In reply to comment #6)
  (In reply to comment #4)
  I just updated the title of the issue: arbitrary extension names should be
  allowed for the same reason.
 So you're the one adding this. What benefit do you see with arbitrary
 extensions?

First, it is worth mentioning that the extension problem happened to C++ only
for historical reasons. There is no reason to think it will happen to D, as it
doesn't happen in any other language that is flexible in terms of naming files.

The reason for allowing an arbitrary file NAME (the extension is just an
artificial separation of a file name) is the same one mentioned in the issue
description. The compiler has no reason to limit how I can name files. What if
I want to create a script that's called "dlang.org"? For some reason I might
have a system that fetches stuff from websites and names the scripts after the
host name.

The moment D tries to pretend it can be used for scripting is the moment D
loses its right to place limitations on file naming. It is that simple.

Oct 30 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=11365


bearophile_hugs eml.cc changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bearophile_hugs eml.cc


--- Comment #8 from bearophile_hugs eml.cc 2013-10-30 08:53:19 PDT ---
(In reply to comment #5)
 Btw, no extensions might be fine, but I'm totally against D sources having
 arbitrary extensions. People will start doing the same thing C++ programmers do
 and start inventing 20 different extensions for D sources, so you end up with
 extensions like:
 
 .cpp
 .cxx
 .cp
 .cc
 .c++

+1. If you offer programmers some freedom, someone will inevitably use it, often
with chaotic/confusing results (I see it every day in D.learn). Freedom should
be given only where there is a large advantage in doing so.

Oct 30 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=11365



--- Comment #9 from bearophile_hugs eml.cc 2013-10-31 07:42:41 PDT ---
Having a standard extension for D code is useful for programs like "cloc" that
count lines of code, for editors that open .d files with correct D
colorization, and for my scripts that select files with the .d suffix to test
incompatibilities across different compiler versions. I have testing scripts
that look in directories and test .d files differently from .py files. And
it's not just a matter of my own code; it also matters for D libraries from
other people.
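
The kind of suffix-based selection described above can be sketched like this.
The snippet is a hypothetical Python illustration, not one of the scripts
mentioned in the comment; `group_by_suffix` and the file names are made up for
the example.

```python
from pathlib import PurePath


def group_by_suffix(paths):
    """Group path strings by file-name suffix ('' when there is none)."""
    groups = {}
    for path in paths:
        groups.setdefault(PurePath(path).suffix, []).append(path)
    return groups


# Hypothetical project listing; "git-test" is a D script without extension.
listing = ["src/app.d", "tools/check.py", "bin/git-test"]
groups = group_by_suffix(listing)
print(groups.get(".d", []))  # ['src/app.d']
print(groups.get("", []))    # ['bin/git-test']
```

A tool that selects only the ".d" group would not see the extensionless D
script, which is the cost being weighed in this discussion.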

Oct 31 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=11365


Dicebot <public dicebot.lv> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |public dicebot.lv


--- Comment #10 from Dicebot <public dicebot.lv> 2013-10-31 07:48:57 PDT ---
As I have already mentioned in the NG, the very idea that a file's extension
should have any relation to its content is just plain wrong and needs to be
discouraged, as do any arbitrary limitations it may impose.

Oct 31 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=11365



--- Comment #11 from Dicebot <public dicebot.lv> 2013-10-31 07:50:47 PDT ---
In other words, it is not so much a problem of the DMD codebase that it uses
".c" for C++ code; it is a problem of IDEs/tools that assume it is C code
without providing any convenient way to override that assumption.

Oct 31 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=11365


Mathias LANG <pro.mathias.lang gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pro.mathias.lang gmail.com


--- Comment #12 from Mathias LANG <pro.mathias.lang gmail.com> 2013-10-31
10:08:55 PDT ---
Why should we enforce this? We enforce things to prevent obvious mistakes. The
D language plays well in this field. It ensures what it is sure needs to be
ensured, and gives you the tools to build your own rules, with the least
burden.

It's not a mistake to have a source file with an arbitrary extension, or no
extension at all. DMD will still know its argument is a source file, whatever
its name is. And there are some valid use cases where you would not want a
.d[i] extension, as eles noted in the comment quoted above, and in the NG.

(In reply to comment #9)
 Having a standard extension for D code is useful for programs like "cloc" that
 count lines of code, for editors that open .d files with correct D
 colorization, and for my scripts that select files with the .d suffix to test
 incompatibilities across different compiler versions. I have testing scripts
 that look in directories and test .d files differently from .py files. And
 it's not just a matter of my own code; it also matters for D libraries from
 other people.

As you point out, there are also some use cases where a tool requires a
specific extension. But that's none of our business; the tool should ensure it,
not DMD. The real problem for those tools is to know what the file holds. We
don't have such problems with DMD.

For the record, good editors solve the problem easily, like vim or emacs:
# vim: syntax=d ts=4 sw=4 sts=4 sr noet
# -*- d-mode -*-

Oct 31 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=11365



--- Comment #13 from Leandro Lucarella <leandro.lucarella sociomantic.com>
2013-10-31 10:50:57 PDT ---
I quickly tried to implement this by disabling the extension checks only when
the `-run` option is passed, but I failed miserably: the automatic extension
appending sits so deep in the module code that object.d is no longer found,
because the .d isn't added to the module name.

Oct 31 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=11365



--- Comment #14 from Leandro Lucarella <leandro.lucarella sociomantic.com>
2013-10-31 11:38:36 PDT ---
https://github.com/D-Programming-Language/dmd/pull/2700

Oct 31 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=11365



--- Comment #15 from Andrej Mitrovic <andrej.mitrovich gmail.com> 2013-10-31
12:08:58 PDT ---
(In reply to comment #12)
 Why should we enforce this? We enforce things to prevent obvious mistakes. The
 D language plays well in this field. It ensures what it is sure needs to be
 ensured, and gives you the tools to build your own rules, with the least
 burden.

 It's not a mistake to have a source file with an arbitrary extension, or no
 extension at all. DMD will still know its argument is a source file, whatever
 its name is.

That's not true. It can't have a .lib extension, or an .obj/.o extension.

Arbitrary extensions mean import switches will not work: the compiler won't
know which files it has to inspect to find D code. So this feature will be
useful for scripts and in cases where you're explicitly passing all files to
DMD.

(In reply to comment #10)
 As I have already mentioned in the NG, the very idea that a file's extension
 should have any relation to its content is just plain wrong and needs to be
 discouraged, as do any arbitrary limitations it may impose.

That's exactly what happens when you allow arbitrary extensions. Tools end up
inventing their own semantics *based on* the extension:

http://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/Overall-Options.html:

file.c
    C source code that must be preprocessed.
file.i
    C source code that should not be preprocessed.
file.ii
    C++ source code that should not be preprocessed.
file.tcc
    C++ header file to be turned into a precompiled header or Ada spec.

See what I mean? If all we allow is .d/.di and, for [1], extensionless files,
then we keep everything simple.

 In other words, it is not so much a problem of the DMD codebase that it uses
 ".c" for C++ code; it is a problem of IDEs/tools that assume it is C code
 without providing any convenient way to override that assumption.

So now every tool in existence has to do heuristics on text files? The benefit
of having known and defined extensions is to make it easier to figure out what
a file is without having to open it, to make it easier to filter through a
directory of files and organize them based on their extension.

Using .c for C++ files is Walter's fault and nobody else's. There are no
excuses here.

Oct 31 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=11365



--- Comment #16 from Andrej Mitrovic <andrej.mitrovich gmail.com> 2013-10-31
12:14:39 PDT ---
(In reply to comment #12)
 For the record, good editors solve the problem easily, like vim or emacs:
 # vim: syntax=d ts=4 sw=4 sts=4 sr noet
 # -*- d-mode -*-

You call that a solution? Arbitrary tools adding an arbitrary amount of HEADER
information they've invented? So then other tools have to be able to interpret
these lines too. This doesn't scale. It's not a solution.

Oct 31 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=11365



--- Comment #17 from Dicebot <public dicebot.lv> 2013-10-31 13:27:13 PDT ---
(In reply to comment #15)
 That's not true. It can't have a .lib extension, or an .obj/.o extension.

This is purely a problem of how the DMD argument list is designed, not a
meaningful limitation. And yet another example of what apps shouldn't do.

 Arbitrary extensions mean import switches will not work: the compiler won't
 know which files it has to inspect to find D code. So this feature will be
 useful for scripts and in cases where you're explicitly passing all files to
 DMD.

Exactly. And someone who wants to use arbitrary extensions will be aware that
he is stepping aside from the common naming convention and thus losing some
convenience offered by the compiler. That is perfectly expected.

  (In reply to comment #10)
  As I have already mentioned in the NG, the very idea that a file's extension
  should have any relation to its content is just plain wrong and needs to be
  discouraged, as do any arbitrary limitations it may impose.
 That's exactly what happens when you allow arbitrary extensions. Tools end up
 inventing their own semantics *based on* the extension: ... See what I mean?

It is exactly what happens when _someone_ (compiler, tools, whatever) decides to
strictly couple some behavior exclusively to the extension. See what I mean? :)

  In other words, it is not so much a problem of the DMD codebase that it uses
  ".c" for C++ code; it is a problem of IDEs/tools that assume it is C code
  without providing any convenient way to override that assumption.
 So now every tool in existence has to do heuristics on text files?

Yes, if it is important (there are standard tools for that, like the famous
"file" command). In most cases, though, a tool should just try to interpret the
input as if it were a legal file and fail in the process if it is garbage,
similar to how it will fail if you put garbage into a .d file. And the context
of interpretation should be defined by compiler switches, configuration files
or some other external thing. Using a default interpretation defined by a
convention like the file extension is also fine if it can be overridden with a
manual option.

 The benefit of having known and defined extensions is to make it easier to
 figure out what a file is without having to open it, to make it easier to
 filter through a directory of files and organize them based on their
 extension.

As I have said, crazy DOS legacy. Luckily, most Linux file managers don't do
this and actually explore file metadata.

 Using .c for C++ files is Walter's fault and nobody else's. There are no
 excuses here.

There are no excuses, but there is also no disaster. It is bad to break common
practice, but any sane IDE will let you trivially configure a mapping of .c
files to C++ semantics. Just as they should.
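
The idea in this comment of a default derived from the extension that a manual
option can override might be sketched as below. This is a hypothetical Python
illustration, not the behavior of any existing tool; `detect_language`, its
`override` parameter and the mapping table are assumptions.

```python
import os

# Assumed default mapping; a real tool would make this configurable.
DEFAULT_BY_EXTENSION = {".d": "d", ".di": "d", ".c": "c", ".py": "python"}


def detect_language(path, override=None):
    """Return the language to assume for `path`.

    An explicit override always wins; otherwise fall back to the
    extension convention, and to None when the extension is unknown.
    """
    if override is not None:
        return override
    extension = os.path.splitext(path)[1]
    return DEFAULT_BY_EXTENSION.get(extension)


print(detect_language("mars.c"))                  # c
print(detect_language("mars.c", override="c++"))  # c++
print(detect_language("git-test", override="d"))  # d
print(detect_language("git-test"))                # None
```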
Oct 31 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=11365



--- Comment #18 from Leandro Lucarella <leandro.lucarella sociomantic.com>
2013-11-01 07:56:44 PDT ---
(In reply to comment #16)
  (In reply to comment #12)
  For the record, good editors solve the problem easily, like vim or emacs:
  # vim: syntax=d ts=4 sw=4 sts=4 sr noet
  # -*- d-mode -*-
 You call that a solution? Arbitrary tools adding an arbitrary amount of HEADER
 information they've invented? So then other tools have to be able to interpret
 these lines too. This doesn't scale. It's not a solution.

Just a comment about this, even though it is irrelevant to my proposed solution:
you only need to add extra information when you depart from the default. It is
like D itself. Do you need to write all your code in ASM? No, but when you need
it, it is there. It will be painful and you won't get lots of features, but you
can do it. You are a grown-up and know what's best for you.
Nov 01 2013