www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Emplace vs closures

reply Lodovico Giaretta <lodovico giaretart.net> writes:
As we all should know, std.conv.emplace does not play well with 
closures:

void main()
{
     int x = 42;
     struct S
     {
         auto getX() { return x; }
     }

     S s;
     assert(s.getX == x);

     auto ps = someBuffer.emplace!S();
     assert(s.getX == x);                // SEGFAULT!
}

As this is not fixable (we cannot pass closures around in D), we 
should IMHO forbid such usages of emplace (i.e. static 
assert(!isNested!S))

I was already working on this, when I stumbled upon this unittest 
in std.conv (simplified):

unittest
{
     int i = 0;
     struct S1
     {
         void foo(){++i;}
     }
     S1 sa = void;
     S1 sb;
     emplace(&sa, sb);   // automagically works
     sa.foo();
     assert(i == 1);
}

Of course there's no way to distinguish between this 
(legitimate?) use case and the former one, so preventing those 
segfaults will also prohibit this kind of usage.

What I'd like to know: is this usage widespread? Should we forbid 
it for the sake of security?
Sep 19 2016
next sibling parent reply ag0aep6g <anonymous example.com> writes:
On 09/19/2016 01:27 PM, Lodovico Giaretta wrote:
 As we all should know, std.conv.emplace does not play well with closures:

 void main()
 {
     int x = 42;
     struct S
     {
         auto getX() { return x; }
     }

     S s;
     assert(s.getX == x);

     auto ps = someBuffer.emplace!S();
     assert(s.getX == x);                // SEGFAULT!
 }
Should probably be `assert(ps.getX == x);`. Note that it would also segfault with `auto ps = S.init;`, and for the same reason: missing context pointer.
 As this is not fixable (we cannot pass closures around in D), we should
 IMHO forbid such usages of emplace (i.e. static assert(!isNested!S))

 I was already working on this, when I stumbled upon this unittest in
 std.conv (simplified):

 unittest
 {
     int i = 0;
     struct S1
     {
         void foo(){++i;}
     }
     S1 sa = void;
     S1 sb;
     emplace(&sa, sb);   // automagically works
     sa.foo();
     assert(i == 1);
 }

 Of course there's no way to distinguish between this (legitimate?) use
 case and the former one, so preventing those segfaults will also
 prohibit this kind of usage.
There is a difference, though: You're copying an existing object here, including the context pointer. So maybe we could disallow the variant above that writes the .init value, and still allow the copying variant.
Sep 19 2016
parent reply Lodovico Giaretta <lodovico giaretart.net> writes:
On Monday, 19 September 2016 at 11:45:25 UTC, ag0aep6g wrote:
 Note that it would also segfault with `auto ps = S.init;`, and 
 for the same reason: missing context pointer.
Oh. I didn't thought about that. This means that in the following example, the initialization of `s` is more than a simple call to `S.init`. I was under the impression that in D `S.init` should represent a valid state for whatever type `S`. void main() { int x = 42; struct S { auto getX() { return x; } } S s; // this line does not simply call S.init, // but also creates a closure and puts it inside s. }
 There is a difference, though: You're copying an existing 
 object here, including the context pointer. So maybe we could 
 disallow the variant above that writes the .init value, and 
 still allow the copying variant.
Yeah, I will try that.
Sep 19 2016
parent reply ag0aep6g <anonymous example.com> writes:
On 09/19/2016 02:24 PM, Lodovico Giaretta wrote:
 Oh. I didn't thought about that. This means that in the following
 example, the initialization of `s` is more than a simple call to
 `S.init`. I was under the impression that in D `S.init` should represent
 a valid state for whatever type `S`.
Yeah, .init and nested structs don't really fit together. On the one hand .init is supposed to be a valid value. On the other hand it must be a static value. Can't have both with nested structs. Maybe they shouldn't have .init at all.
Sep 19 2016
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 19.09.2016 14:35, ag0aep6g wrote:
 On 09/19/2016 02:24 PM, Lodovico Giaretta wrote:
 Oh. I didn't thought about that. This means that in the following
 example, the initialization of `s` is more than a simple call to
 `S.init`. I was under the impression that in D `S.init` should represent
 a valid state for whatever type `S`.
Yeah, .init and nested structs don't really fit together. On the one hand .init is supposed to be a valid value. On the other hand it must be a static value. Can't have both with nested structs. Maybe they shouldn't have .init at all.
This works: import std.stdio; void main(){ int x=3; enum l=()=>x; writeln(x); } I.e. l has the correct runtime context even though it is a static value.
Sep 19 2016
parent reply ag0aep6g <anonymous example.com> writes:
On 09/19/2016 02:55 PM, Timon Gehr wrote:
 This works:
 import std.stdio;
 void main(){
     int x=3;
     enum l=()=>x;
     writeln(x);
 }

 I.e. l has the correct runtime context even though it is a static value.
Enums are a special kind of static value, though. Can't do that with `static` instead of `enum`. Enums (often?) don't manifest until they're used. Could .init work that way?
Sep 19 2016
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 19.09.2016 15:52, ag0aep6g wrote:
 On 09/19/2016 02:55 PM, Timon Gehr wrote:
 This works:
 import std.stdio;
 void main(){
     int x=3;
     enum l=()=>x;
     writeln(x);
 }

 I.e. l has the correct runtime context even though it is a static value.
Enums are a special kind of static value, though. Can't do that with `static` instead of `enum`. Enums (often?) don't manifest until they're used. Could .init work that way?
Maybe -- and local template instantiation should work for all arguments that require it. Then we would not have this issue. But fixing .init for local structs breaks code.
Sep 20 2016
prev sibling next sibling parent Lodovico Giaretta <lodovico giaretart.net> writes:
On Monday, 19 September 2016 at 11:27:03 UTC, Lodovico Giaretta 
wrote:
 What I'd like to know: is this usage widespread? Should we 
 forbid it for the sake of security?
A big issue I'm finding is that inside most Phobos unittests classes and structs are not marked static even when they could, so putting any restriction on emplace will break most of them. This of course is not a problem for Phobos (we can easily put static everywhere it's needed), but most DUB packages will probably have the same poor usage of static without the maintainance effort of Phobos (so will remain broken forever). I already found many (direct or indirect) uses of emplace for non-static nested structs and classes in std.conv, std.typecons and std.experimental.allocator.
Sep 19 2016
prev sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 9/19/16 7:27 AM, Lodovico Giaretta wrote:

 What I'd like to know: is this usage widespread? Should we forbid it for
 the sake of security?
No. There is no security concern here. You are dereferencing a null pointer, which is perfectly safe. -Steve
Sep 19 2016
next sibling parent reply Lodovico Giaretta <lodovico giaretart.net> writes:
On Monday, 19 September 2016 at 14:22:16 UTC, Steven 
Schveighoffer wrote:
 On 9/19/16 7:27 AM, Lodovico Giaretta wrote:

 What I'd like to know: is this usage widespread? Should we 
 forbid it for
 the sake of security?
No. There is no security concern here. You are dereferencing a null pointer, which is perfectly safe. -Steve
Ok, wrong wording. I meant "should we forbid it to avoid long hours of debugging and unexpected behaviours? One uses emplace expecting that it Just Works(TM), which is not true for nested things."
Sep 19 2016
parent Steven Schveighoffer <schveiguy yahoo.com> writes:
On 9/19/16 10:26 AM, Lodovico Giaretta wrote:
 On Monday, 19 September 2016 at 14:22:16 UTC, Steven Schveighoffer wrote:
 On 9/19/16 7:27 AM, Lodovico Giaretta wrote:

 What I'd like to know: is this usage widespread? Should we forbid it for
 the sake of security?
No. There is no security concern here. You are dereferencing a null pointer, which is perfectly safe.
Ok, wrong wording. I meant "should we forbid it to avoid long hours of debugging and unexpected behaviours? One uses emplace expecting that it Just Works(TM), which is not true for nested things."
Maybe we can disable the emplace that will certainly cause a Null pointer segfault when used, and allow the copying version. But there may still be code that uses the struct without needing the context pointer, that would then be broken. My opinion is just not to worry about it. We don't get much traffic here complaining of this issue, you may be the one hardest hit by it (and you likely won't have it happen any more). For instance, we have way way more complaints of people using classes without allocating them, and that is a similar problem. However, a sufficient warning in the docs is certainly in order, at least as long as we aren't going to forbid it. -Steve
Sep 19 2016
prev sibling parent reply cym13 <cpicard openmailbox.org> writes:
On Monday, 19 September 2016 at 14:22:16 UTC, Steven 
Schveighoffer wrote:
 On 9/19/16 7:27 AM, Lodovico Giaretta wrote:

 What I'd like to know: is this usage widespread? Should we 
 forbid it for
 the sake of security?
No. There is no security concern here. You are dereferencing a null pointer, which is perfectly safe. -Steve
I beg to defer, null pointer dereference is certainly not safe in the general case. In many cases it lead to code execution or privilege escalation. See for example CVE-2008-568 [1] for an example in kernel space or CVE-2009-0385 [2] in user space. The idea is that you are really trying to call a function in a part of memory that is not mapped, but if you are able to map the zero page and control what function pointer is present there then it is exploitable. I'd like people to get away from the idea that null pointer dereference is safe, it's not. In most cases it's not exploitable but that's definitely not a safe spot. That being said I don't think it should be the burden of the library or language to deal with this for the reasons you exposed. [1] http://www.trapkit.de/advisories/TKADV2008-015.txt [2] http://www.trapkit.de/advisories/TKADV2009-004.txt
Sep 20 2016
next sibling parent reply John Colvin <john.loughran.colvin gmail.com> writes:
On Tuesday, 20 September 2016 at 08:08:16 UTC, cym13 wrote:
 On Monday, 19 September 2016 at 14:22:16 UTC, Steven 
 Schveighoffer wrote:
 On 9/19/16 7:27 AM, Lodovico Giaretta wrote:

 What I'd like to know: is this usage widespread? Should we 
 forbid it for
 the sake of security?
No. There is no security concern here. You are dereferencing a null pointer, which is perfectly safe. -Steve
I beg to defer,
You mean differ, right?
 null pointer dereference is certainly not safe in the general 
 case. In many cases it lead to code execution or privilege 
 escalation. See for example CVE-2008-568 [1] for an example in 
 kernel space or CVE-2009-0385 [2] in user space.

 The idea is that you are really trying to call a function in a 
 part of memory that is not mapped, but if you are able to map 
 the zero page and control what function pointer is present 
 there then it is exploitable. I'd like people to get away from 
 the idea that null pointer dereference is safe, it's not. In 
 most cases it's not exploitable but that's definitely not a 
 safe spot.

 That being said I don't think it should be the burden of the 
 library or language to deal with this for the reasons you 
 exposed.

 [1] http://www.trapkit.de/advisories/TKADV2008-015.txt
 [2] http://www.trapkit.de/advisories/TKADV2009-004.txt
Interesting, hadn't seen this stuff before. There is also the matter of large offsets taking you to accessible memory, such as you might get with a null pointer to a very large struct.
Sep 20 2016
parent cym13 <cpicard openmailbox.org> writes:
On Tuesday, 20 September 2016 at 08:23:04 UTC, John Colvin wrote:
 On Tuesday, 20 September 2016 at 08:08:16 UTC, cym13 wrote:
 On Monday, 19 September 2016 at 14:22:16 UTC, Steven 
 Schveighoffer wrote:
 On 9/19/16 7:27 AM, Lodovico Giaretta wrote:

 What I'd like to know: is this usage widespread? Should we 
 forbid it for
 the sake of security?
No. There is no security concern here. You are dereferencing a null pointer, which is perfectly safe. -Steve
I beg to defer,
You mean differ, right?
Hmm... yes, sorry.
 null pointer dereference is certainly not safe in the general 
 case. In many cases it lead to code execution or privilege 
 escalation. See for example CVE-2008-568 [1] for an example in 
 kernel space or CVE-2009-0385 [2] in user space.

 The idea is that you are really trying to call a function in a 
 part of memory that is not mapped, but if you are able to map 
 the zero page and control what function pointer is present 
 there then it is exploitable. I'd like people to get away from 
 the idea that null pointer dereference is safe, it's not. In 
 most cases it's not exploitable but that's definitely not a 
 safe spot.

 That being said I don't think it should be the burden of the 
 library or language to deal with this for the reasons you 
 exposed.

 [1] http://www.trapkit.de/advisories/TKADV2008-015.txt
 [2] http://www.trapkit.de/advisories/TKADV2009-004.txt
Interesting, hadn't seen this stuff before. There is also the matter of large offsets taking you to accessible memory, such as you might get with a null pointer to a very large struct.
Another interesting case is the Firefox debug offset. On x86 the address space was scarce so nothing could be mapped in userspace above 0xe0000000 IIRC. Firefox devs decided to use the address 0xefefefef to cause a clear segfault easy to hook in order to ease debugging. When Firefox was ported to x86_64 this range became available, so an attack was to setup that address and cause an error leading to a debug segfault and code execution. I think the fix was to used 0xfefefefe instead but I'm not sure. Anyway segfaults aren't safe, they should be avoided and controlled when possible. It's always better to manage the error.
Sep 20 2016
prev sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 9/20/16 4:08 AM, cym13 wrote:
 On Monday, 19 September 2016 at 14:22:16 UTC, Steven Schveighoffer wrote:
 On 9/19/16 7:27 AM, Lodovico Giaretta wrote:

 What I'd like to know: is this usage widespread? Should we forbid it for
 the sake of security?
No. There is no security concern here. You are dereferencing a null pointer, which is perfectly safe.
I beg to defer, null pointer dereference is certainly not safe in the general case. In many cases it lead to code execution or privilege escalation. See for example CVE-2008-568 [1] for an example in kernel space or CVE-2009-0385 [2] in user space.
In kernel space, yes. So don't do this in your D kernel :) In user space, the chance is very unlikely. It requires a function context to be larger than the reserved page space, and accessing a function context variable outside that space. Not impossible, but very very unlikely. And beyond the control of the exploiter.
 The idea is that you are really trying to call a function in a part of
 memory that is not mapped, but if you are able to map the zero page and
 control what function pointer is present there then it is exploitable.
 I'd like people to get away from the idea that null pointer dereference
 is safe, it's not. In most cases it's not exploitable but that's
 definitely not a safe spot.
Dereferencing a null pointer is perfectly safe in user space (where you can't map the zero page). Indexing a null pointer is not. In this case, we are indexing a null pointer, so there is the potential for abuse, but very very small. I'm still not sure that emplace on an inner struct is a thing we need to allow, especially when it's known that the context pointer will be invalid. Maybe we should only allow if called via a different name, to prevent unwitting uses. -Steve
Sep 20 2016
parent reply cym13 <cpicard openmailbox.org> writes:
On Tuesday, 20 September 2016 at 14:00:00 UTC, Steven 
Schveighoffer wrote:
 On 9/20/16 4:08 AM, cym13 wrote:
 On Monday, 19 September 2016 at 14:22:16 UTC, Steven 
 Schveighoffer wrote:
[...]
I beg to defer, null pointer dereference is certainly not safe in the general case. In many cases it lead to code execution or privilege escalation. See for example CVE-2008-568 [1] for an example in kernel space or CVE-2009-0385 [2] in user space.
In kernel space, yes. So don't do this in your D kernel :) In user space, the chance is very unlikely. It requires a function context to be larger than the reserved page space, and accessing a function context variable outside that space. Not impossible, but very very unlikely. And beyond the control of the exploiter.
 The idea is that you are really trying to call a function in a 
 part of
 memory that is not mapped, but if you are able to map the zero 
 page and
 control what function pointer is present there then it is 
 exploitable.
 I'd like people to get away from the idea that null pointer 
 dereference
 is safe, it's not. In most cases it's not exploitable but 
 that's
 definitely not a safe spot.
Dereferencing a null pointer is perfectly safe in user space (where you can't map the zero page). Indexing a null pointer is not. In this case, we are indexing a null pointer, so there is the potential for abuse, but very very small.
Note that not all operating systems disallow mapping the zero page (although they admitedly should). And yes the potential for abuse is small, I've just getting annoyed at dangerous generalities. In that specific case I don't think it should be treated as a security issue in the general case although it could become one if used in an unfavorable environment.
 I'm still not sure that emplace on an inner struct is a thing 
 we need to allow, especially when it's known that the context 
 pointer will be invalid. Maybe we should only allow if called 
 via a different name, to prevent unwitting uses.

 -Steve
Sep 20 2016
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 9/20/16 12:45 PM, cym13 wrote:
 On Tuesday, 20 September 2016 at 14:00:00 UTC, Steven Schveighoffer wrote:
 Dereferencing a null pointer is perfectly safe in user space (where
 you can't map the zero page). Indexing a null pointer is not. In this
 case, we are indexing a null pointer, so there is the potential for
 abuse, but very very small.
Note that not all operating systems disallow mapping the zero page (although they admitedly should). And yes the potential for abuse is small, I've just getting annoyed at dangerous generalities.
We can simply avoid supporting such OSes, I don't think we lose much! But thanks for pointing this out, I didn't know that.
 In that specific case I don't think it should be treated as a security
 issue in the general case although it could become one if used in an
 unfavorable environment.
I had an interesting thought, and this is actually a safe issue (I'll file an issue request). If you have a potentially large enough struct, then accessing items in that struct has the potential to go beyond the unmapped memory. However, in MOST cases, the compiler knows the offset being used (especially if accessing specific members). In some cases it doesn't. But one thing the compiler can do (at least in safe code) is: 1. If the index is known at compile time to go beyond one page (or beyond the known OS limit for unmapped pages), do a null pointer check first, and abort with segfault if possible. As simple as loading the byte/word at the front of the struct. You also only have to do this once if the struct pointer does not move. 2. If the index is not known until runtime (for instance, indexing a static array in a struct), then check for null pointer before dereferencing conservatively. This would make code that is very common (i.e. small struct size, or accessing front members of the struct) just use the hardware features to prevent access. Code that is rare (accessing members beyond one page of size) can be instrumented to ensure no exploits are possible. -Steve
Sep 22 2016
parent cym13 <cpicard openmailbox.org> writes:
On Thursday, 22 September 2016 at 12:08:19 UTC, Steven 
Schveighoffer wrote:
 On 9/20/16 12:45 PM, cym13 wrote:
 On Tuesday, 20 September 2016 at 14:00:00 UTC, Steven 
 Schveighoffer wrote:
 Dereferencing a null pointer is perfectly safe in user space 
 (where
 you can't map the zero page). Indexing a null pointer is not. 
 In this
 case, we are indexing a null pointer, so there is the 
 potential for
 abuse, but very very small.
Note that not all operating systems disallow mapping the zero page (although they admitedly should). And yes the potential for abuse is small, I've just getting annoyed at dangerous generalities.
We can simply avoid supporting such OSes, I don't think we lose much! But thanks for pointing this out, I didn't know that.
We lose at least all linux kernels before 2.6.23... although it sounds reasonnable it's a choice that's better made knowingly ;)
 In that specific case I don't think it should be treated as a 
 security
 issue in the general case although it could become one if used 
 in an
 unfavorable environment.
I had an interesting thought, and this is actually a safe issue (I'll file an issue request). If you have a potentially large enough struct, then accessing items in that struct has the potential to go beyond the unmapped memory. However, in MOST cases, the compiler knows the offset being used (especially if accessing specific members). In some cases it doesn't. But one thing the compiler can do (at least in safe code) is: 1. If the index is known at compile time to go beyond one page (or beyond the known OS limit for unmapped pages), do a null pointer check first, and abort with segfault if possible. As simple as loading the byte/word at the front of the struct. You also only have to do this once if the struct pointer does not move. 2. If the index is not known until runtime (for instance, indexing a static array in a struct), then check for null pointer before dereferencing conservatively. This would make code that is very common (i.e. small struct size, or accessing front members of the struct) just use the hardware features to prevent access. Code that is rare (accessing members beyond one page of size) can be instrumented to ensure no exploits are possible. -Steve
This is very interesting, I'm sure there are ways to expand on this idea further (but even just that would be a real improvement). One thing is that although I don't see people willingly creating such a big struct not all D code is written by humans and with some mixin or template help it's a possibility. I'd love to see some protection where the compiler can work it out.
Sep 22 2016