digitalmars.D - Turning a SIGSEGV into a regular function call under Linux, allowing
- FeepingCreature (25/25) Mar 13 2012 Note: I worked out this method for my own language, Neat, but the basic ...
- deadalnix (4/29) Mar 13 2012 And is this Exception recoverable in a safe way ?
- FeepingCreature (4/44) Mar 13 2012 I'm not familiar with recovering. Note that you can _not_ safely return ...
- Vladimir Panteleev (18/20) Mar 13 2012 Very nice!
- FeepingCreature (4/18) Mar 14 2012 Sweet. Yeah, I think you need to use naked and reconstruct the stackfram...
- Vladimir Panteleev (7/10) Mar 14 2012 I think it might be safe to just reconstruct the stack frame in
- deadalnix (2/10) Mar 14 2012 Especially if the signal is sent because of stack overflow !
- Vladimir Panteleev (4/21) Mar 14 2012 Not sure if sarcasm..?
- deadalnix (3/20) Mar 14 2012 You can page protect the last segment of the stack, and unprotect it
- deadalnix (2/22) Mar 14 2012 You are losing EAX in the process.
- FeepingCreature (2/7) Mar 14 2012 It's somewhat unavoidable. One way or another, you need to find _some_ t...
- deadalnix (3/10) Mar 14 2012 Thread local storage is a very easy thing in D. Can't we just use a
- Vladimir Panteleev (2/3) Mar 14 2012 When would this matter? EAX is a scratch register per ABIs, no?
- deadalnix (3/6) Mar 14 2012 You may want to return from the function the standard way and resume
- Vladimir Panteleev (3/11) Mar 14 2012 This doesn't have anything to do with turning signals into
- deadalnix (3/13) Mar 14 2012 No but this does, make sense to catch segfault and act according to it
- Vladimir Panteleev (2/21) Mar 14 2012 You can't resume D exceptions.
- deadalnix (3/20) Mar 14 2012 I'm not talking about Exception anymore. In case of Exception, this
- Vladimir Panteleev (8/36) Mar 14 2012 I don't understand how any of your posts are related to this
- deadalnix (9/38) Mar 14 2012 The topic is *Turning a SIGSEGV into a regular function call under
- Vladimir Panteleev (6/14) Mar 14 2012 OK. But (to me, at least) you sounded like you were criticizing
- deadalnix (7/22) Mar 14 2012 I'm not criticizing at all ! I think this is awesome ! I'm just trying
- H. S. Teoh (9/18) Mar 14 2012 I believe the original purpose of this was to catch SIGSEGV and turn it
- H. S. Teoh (15/61) Mar 13 2012 Nice!! So basically you allow the signal handler to return cleanly so
- Don Clugston (23/48) Mar 14 2012 I didn't realize that was possible. Very interesting.
- Steven Schveighoffer (4/6) Mar 14 2012 SEGFAULT inside a SEGV signal handler aborts the program (no way to turn...
- Don Clugston (2/8) Mar 14 2012 But you're not inside the signal handler when it happens. You returned.
- Steven Schveighoffer (9/20) Mar 14 2012 Then how does the signal handler do anything? I mean, doesn't it need a...
- deadalnix (5/26) Mar 14 2012 The address of the instruction being executed is hijacked, so, instead
- Steven Schveighoffer (25/57) Mar 14 2012 te:
- H. S. Teoh (19/25) Mar 14 2012 That's a good idea. So the signal handler reads the top of the stack
- FeepingCreature (2/21) Mar 14 2012 I think that case is sufficiently rare that it'd have to count somewhere...
- Sean Kelly (6/8) Mar 14 2012 somewhere between "act of god" and "outright developer malice". The =
- deadalnix (2/6) Mar 14 2012 And as a stack overflow is likely to create a SEGFAULT too, we are doome...
- Don Clugston (22/26) Mar 14 2012 void foo()
- deadalnix (181/181) Mar 15 2012 Here is a proof of concept of how we can recover from segfault.
- Kagamin (3/3) Mar 15 2012 Does it recover from
- FeepingCreature (2/6) Mar 15 2012 Not as currently written, no. It should be possible to detect this case ...
- deadalnix (3/9) Mar 17 2012 It is supported as written in my sample code. I have to do another one
Note: I worked out this method for my own language, Neat, but the basic approach should be portable to D's exceptions as well.

I've seen it argued a lot over the years (even argued it myself) that it's impossible to throw from Linux signal handlers. This is basically correct, because they constitute an interruption in the stack that breaks exceptions' ability to unwind properly. However, there is a method to turn a signal handler into a regular function call that you can throw from.

Basically, what we need to do is similar to a stack buffer overflow exploit. Under Linux, the extended signal handler that is set with sigaction is called with three arguments: the signal number, a siginfo_t* and a ucontext_t*. The third parameter is what we're interested in. Deep inside the ucontext_t struct is uc_mcontext.gregs[REG_EIP], the address of the instruction that caused the segfault. This is the location that execution returns to when the signal handler returns. By overwriting this location, we can turn a return into a function call.

First,

    gregs[REG_EAX] = gregs[REG_EIP];

We can safely assume that the function that caused the segfault doesn't really need its EAX anymore, so we can reuse it to reconstruct a proper stackframe to throw from later.

Second,

    gregs[REG_EIP] = cast(void*) &sigsegv_userspace_handler;

Note that the naked attribute was not used. If used, it can make this code slightly easier.

    extern(C) void sigsegv_userspace_handler() {
      // done implicitly
      // asm { push ebp; }
      // asm { mov ebp, esp; }
      asm { mov ebx, [esp]; } // backup the pushed ebp
      asm { mov [esp], eax; } // replace it with the correct return address
                              // which was originally left out due to the
                              // irregular way we entered this function (via a ret).
      asm { push ebx; }       // recreate the pushed ebp
      asm { mov ebp, esp; }   // complete stackframe.
      // originally, our stackframe (because we entered this function via a ret)
      // was [ebp]. Now, it's [return address][ebp], as is proper for cdecl.

      // at this point, we can safely throw
      // (or invoke any other non-handler-safe function).
      throw new SignalException("SIGSEGV");
    }
Mar 13 2012
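The first half of the post (installing the three-argument handler and locating the saved instruction pointer in the ucontext_t) can be sketched in D roughly as follows. This is only a minimal sketch, assuming Linux with glibc and druntime's core.sys.posix bindings; the names onSegv, installHandler and savedIp are hypothetical and not from the post. It deliberately only reads the saved instruction pointer and uses raise() instead of a real fault, so the handler can return normally; the redirect itself is left as a comment.

```d
import core.stdc.signal : raise, SIGSEGV;
import core.stdc.stdio : printf;
import core.sys.posix.signal;
import core.sys.posix.ucontext;

// Hypothetical global: where the handler records the saved instruction pointer.
__gshared size_t savedIp;

// Extended (SA_SIGINFO) handler: signal number, siginfo_t*, and the context,
// which arrives as void* and is really a ucontext_t*.
extern (C) void onSegv(int sig, siginfo_t* info, void* ctx) nothrow @nogc
{
    auto uc = cast(ucontext_t*) ctx;
    version (X86)
        savedIp = cast(size_t) uc.uc_mcontext.gregs[REG_EIP];
    else version (X86_64)
        savedIp = cast(size_t) uc.uc_mcontext.gregs[REG_RIP];
    else
        static assert(0, "sketch only covers x86 and x86_64");
    // The technique from the post would now copy this value into a scratch
    // register slot and overwrite the instruction-pointer slot with the
    // address of a userspace handler. That step is omitted here.
}

// Registers onSegv for SIGSEGV; returns true on success.
bool installHandler()
{
    sigaction_t sa;
    sa.sa_sigaction = &onSegv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, null) == 0;
}

void main()
{
    if (!installHandler())
        return;
    raise(SIGSEGV); // sent by raise(), not a real fault, so returning is fine
    printf("handler saw instruction pointer 0x%zx\n", savedIp);
}
```

With a real fault the handler must not simply return, since the faulting instruction would run again; that is exactly the case the redirect in the post is meant to handle.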
On 13/03/2012 11:09, FeepingCreature wrote:
> [...]
> // at this point, we can safely throw
> // (or invoke any other non-handler-safe function).
> throw new SignalException("SIGSEGV");

And is this Exception recoverable in a safe way?

The ucontext_t struct is system dependent. So this is tricky.

The Exception should be an Error to comply with the nothrow spec.
Mar 13 2012
On 03/13/12 11:23, deadalnix wrote:
> And is this Exception recoverable in a safe way?

I'm not familiar with recovering. Note that you can _not_ safely return from the userspace handler, because we overwrote EAX to make space for our EIP backup. You'd need to find somewhere else to stick that backup, like a TLS global variable or some known part of the stack.

> The ucontext_t struct is system dependent. So this is tricky.

Yeah, this is Linux only.
Mar 13 2012
On Tuesday, 13 March 2012 at 10:09:55 UTC, FeepingCreature wrote:
> However, there is a method to turn a signal handler into a regular function call that you can throw from.

Very nice!

The only similarity with a buffer overflow exploit is that we're overriding the continuation address. There is no execution of data, so it's closer to a "return-to-libc" attack.

This is a very clean (and Neat) solution. Here's a D implementation without inline assembler. It's DMD-specific due to a weirdness of its codegen.

http://dump.thecybershadow.net/20f792fa05c020e561137cfaf3d65d7a/sigthrow_32.d

The 64-bit version is a hack, in that it clobbers the last word on the stack. If the exception was thrown right after a stack frame was created, things might go ugly. The same trick as in my 32-bit implementation (creating a new stack frame with an extern(C) helper) won't work here, and I don't know enough about x64 exception handling to know how to fix it.

http://dump.thecybershadow.net/121efc460a01fb4597926ec76352a674/sigthrow_64.d

I think something like this needs to end up in Druntime, at least for Linux x86 and x64.
Mar 13 2012
On 03/13/12 23:24, Vladimir Panteleev wrote:
> The only similarity with a buffer overflow exploit is that we're overriding the continuation address. There is no execution of data, so it's closer to a "return-to-libc" attack.

Argh. Yeah, that's the one I was thinking of.

> The 64-bit version is a hack, in that it clobbers the last word on the stack. [...] I don't know enough about x64 exception handling to know how to fix it.

Sweet. Yeah, I think you need to use naked and reconstruct the stackframe. Not sure how it'd look; I'm not familiar with the x86_64 ABI.

> I think something like this needs to end up in Druntime, at least for Linux x86 and x64.

Would be nice. I mean, Windows already has segfault-as-exception, doesn't it? It's only fair :)
Mar 14 2012
On Wednesday, 14 March 2012 at 07:35:50 UTC, FeepingCreature wrote:
> Sweet. Yeah, I think you need to use naked and reconstruct the stackframe. Not sure how it'd look; I'm not familiar with the x86_64 ABI.

I think it might be safe to just reconstruct the stack frame in the signal handler, and set gregs[REG_EIP] to &_d_throw directly.

It should also use a pre-allocated exception object (like how it's done with OutOfMemoryError and InvalidMemoryOperationError), in case there's data corruption in the GC.
Mar 14 2012
On 14/03/2012 17:34, Vladimir Panteleev wrote:
> It should also use a pre-allocated exception object (like how it's done with OutOfMemoryError and InvalidMemoryOperationError), in case there's data corruption in the GC.

Especially if the signal is sent because of a stack overflow!
Mar 14 2012
On Wednesday, 14 March 2012 at 16:39:29 UTC, deadalnix wrote:
> Especially if the signal is sent because of a stack overflow!

Not sure if sarcasm..? In case of a stack overflow, you can't call _d_throwc (or use the "throw" statement) anyway.
Mar 14 2012
On 14/03/2012 18:01, Vladimir Panteleev wrote:
> Not sure if sarcasm..? In case of a stack overflow, you can't call _d_throwc (or use the "throw" statement) anyway.

You can page-protect the last segment of the stack, and unprotect it before throwing.
Mar 14 2012
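A related, commonly used way to keep a SIGSEGV handler usable when the normal stack is exhausted is to run it on an alternate stack via sigaltstack and SA_ONSTACK. This is a different mechanism from the page-protection idea above, shown here only as a minimal sketch assuming Linux and druntime's core.sys.posix.signal bindings. The names onSegvAlt, installOnAltStack, altBase, altSize and ranOnAltStack are hypothetical, and the 64 KiB size is an arbitrary choice.

```d
import core.stdc.signal : raise, SIGSEGV;
import core.stdc.stdio : printf;
import core.stdc.stdlib : malloc;
import core.sys.posix.signal;

// Hypothetical globals describing the alternate stack.
__gshared void* altBase;
__gshared size_t altSize;
__gshared bool ranOnAltStack;

extern (C) void onSegvAlt(int sig, siginfo_t* info, void* ctx) nothrow @nogc
{
    // The address of a local tells us which stack the handler is running on.
    int marker;
    auto p = cast(size_t) &marker;
    auto lo = cast(size_t) altBase;
    ranOnAltStack = p >= lo && p < lo + altSize;
}

// Allocates an alternate stack and registers the handler to run on it.
bool installOnAltStack()
{
    altSize = 64 * 1024;
    altBase = malloc(altSize);
    if (altBase is null)
        return false;

    stack_t ss;
    ss.ss_sp = altBase;
    ss.ss_size = altSize;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, null) != 0)
        return false;

    sigaction_t sa;
    sa.sa_sigaction = &onSegvAlt;
    sa.sa_flags = SA_SIGINFO | SA_ONSTACK; // SA_ONSTACK selects the alternate stack
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, null) == 0;
}

void main()
{
    if (!installOnAltStack())
        return;
    raise(SIGSEGV); // delivered by raise(), so the handler may simply return
    printf("handler ran on alternate stack: %d\n", cast(int) ranOnAltStack);
}
```

This only covers the handler itself. Once execution is redirected back to the overflowed stack in order to throw, the space problem raised in this sub-thread still applies.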
On 13/03/2012 23:24, Vladimir Panteleev wrote:
> [...]
> I think something like this needs to end up in Druntime, at least for Linux x86 and x64.

You are losing EAX in the process.
Mar 14 2012
On 03/14/12 12:13, deadalnix wrote:
> You are losing EAX in the process.

It's somewhat unavoidable. One way or another, you need to find _some_ threadlocal spot to stick those extra size_t.sizeof bytes, since you mustn't lose data, but the hack works by _overwriting_ the return address.
Mar 14 2012
On 14/03/2012 14:43, FeepingCreature wrote:
> It's somewhat unavoidable. One way or another, you need to find _some_ threadlocal spot to stick those extra size_t.sizeof bytes, since you mustn't lose data, but the hack works by _overwriting_ the return address.

Thread-local storage is a very easy thing in D. Can't we just use a static variable and set it from within the signal handler?
Mar 14 2012
On Wednesday, 14 March 2012 at 11:11:54 UTC, deadalnix wrote:
> You are losing EAX in the process.

When would this matter? EAX is a scratch register per ABIs, no?
Mar 14 2012
On 14/03/2012 17:08, Vladimir Panteleev wrote:
> When would this matter? EAX is a scratch register per ABIs, no?

You may want to return from the function the standard way and resume operations, for example to implement a moving GC using page protection.
Mar 14 2012
On Wednesday, 14 March 2012 at 16:37:45 UTC, deadalnix wrote:
> You may want to return from the function the standard way and resume operations, for example to implement a moving GC using page protection.

This doesn't have anything to do with turning signals into exceptions.
Mar 14 2012
On 14/03/2012 18:00, Vladimir Panteleev wrote:
> This doesn't have anything to do with turning signals into exceptions.

No, but it does make sense to catch a segfault and act on it to implement such functionality. This is a very closely related problem.
Mar 14 2012
On Wednesday, 14 March 2012 at 17:18:06 UTC, deadalnix wrote:
> No, but it does make sense to catch a segfault and act on it to implement such functionality. This is a very closely related problem.

You can't resume D exceptions.
Mar 14 2012
On 14/03/2012 18:28, Vladimir Panteleev wrote:
> You can't resume D exceptions.

I'm not talking about Exception anymore. In the Exception case this isn't a problem, but in the case of a regular return it is.
Mar 14 2012
On Wednesday, 14 March 2012 at 19:48:28 UTC, deadalnix wrote:
> I'm not talking about Exception anymore. In the Exception case this isn't a problem, but in the case of a regular return it is.

I don't understand how any of your posts are related to this thread at all. This thread is about turning SIGSEGV into an exception that 1) you can catch 2) will print a stack trace when uncaught. You've brought in stack overflows, moving garbage collectors, etc. I assure you, we are well aware of the problems when using this exact code for other purposes.
Mar 14 2012
On 14/03/2012 21:07, Vladimir Panteleev wrote:
> I don't understand how any of your posts are related to this thread at all. This thread is about turning SIGSEGV into an exception that 1) you can catch 2) will print a stack trace when uncaught. You've brought in stack overflows, moving garbage collectors, etc. I assure you, we are well aware of the problems when using this exact code for other purposes.

The topic is *Turning a SIGSEGV into a regular function call under Linux, allowing throw*, not only Exception. I don't understand what the problem is here. Can't we talk about how we could keep the scratch register intact in case we don't throw (this doesn't make much sense if we throw anyway)?

What you are mentioning here is already done. Nothing to discuss about that. This is why I'm trying to move on to the next topic: how can we do more than just throwing?
Mar 14 2012
On Wednesday, 14 March 2012 at 20:20:05 UTC, deadalnix wrote:
> The topic is *Turning a SIGSEGV into a regular function call under Linux, allowing throw*, not only Exception. [...] This is why I'm trying to move on to the next topic: how can we do more than just throwing?

OK. But (to me, at least) you sounded like you were criticizing the implementation for solving that specific task, so it would help if you were clearer about your intentions.

For example, losing the contents of EAX is relatively harmless, but the contents of EBP, EGS etc. can be very important.
Mar 14 2012
On 14/03/2012 21:28, Vladimir Panteleev wrote:
> OK. But (to me, at least) you sounded like you were criticizing the implementation for solving that specific task, so it would help if you were clearer about your intentions.

I'm not criticizing at all! I think this is awesome! I'm just trying to discuss ways we can get to the next step. Losing EAX is harmless in the throwing case, but it is a problem for other tasks.

I didn't mention this in the thread, but I'm very enthusiastic about this!
Mar 14 2012
On Wed, Mar 14, 2012 at 05:39:38PM +0100, deadalnix wrote:
> On 14/03/2012 17:08, Vladimir Panteleev wrote:
> > When would this matter? EAX is a scratch register per ABIs, no?
>
> You may want to return from the function the standard way and resume operations, for example to implement a moving GC using page protection.

I believe the original purpose of this was to catch SIGSEGV and turn it into a thrown Error. So we don't care whether EAX is overwritten since we're never going to return to the code that caused the SEGV; we're just reconstructing the stack frame so that stack unwinding will work correctly when we throw the Error.

T

-- 
People tell me that I'm skeptical, but I don't believe it.
Mar 14 2012
On Tue, Mar 13, 2012 at 11:09:54AM +0100, FeepingCreature wrote:
[...]
> However, there is a method to turn a signal handler into a regular function call that you can throw from.
[...]

Nice!! So basically you allow the signal handler to return cleanly so that we're out of signal-handling context, but overwrite the return address so that instead of returning to where the signal happened, it gets diverted to a special handler that reconstructs a stack frame and then throws. Cool beans!

The only drawback is that this only works on x86 Linux. I think it should be possible to make it work on non-x86 Linux by writing machine-specific code along the same principles. But I'm pretty sure it won't work for other unixen. They'll probably need their own system-specific hacks.

T

-- 
If you compete with slaves, you become a slave. -- Norbert Wiener
Mar 13 2012
On 13/03/12 11:09, FeepingCreature wrote:
> [...]
> // at this point, we can safely throw
> // (or invoke any other non-handler-safe function).
> throw new SignalException("SIGSEGV");

I didn't realize that was possible. Very interesting.

As it stands, though, that's got some pretty serious issues. You are on the stack of the function that was called, but you don't know for sure that it is a valid stack.

    asm {
        push EBX;
        mov EBX, ESP;
        mov ESP, 0;            // Look ma, no stack!
        mov int ptr [ESP], 0;  // segfault -- null pointer exception
        mov ESP, EBX;
        pop EBX;
    }

Now, your user space handler will cause another segfault when it does the mov [ESP], 0. I think that gives you an infinite loop.

I think the idea would work, if you had some guarantee that the stack pointer was valid. Then, call a separate handler if it is not. The primary 'trick' in Windows SEH is that it goes to great lengths to verify that the stack is valid. I'm not sure that in Linux user space you have enough information to verify it. But maybe you do. At least, you should be able to check that it's in memory which is owned by your process.

Would be awesome if it is possible.
Mar 14 2012
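One way to approach the check Don mentions (is the stack pointer inside memory the process actually owns?) is to look the address up in /proc/self/maps. What follows is only a sketch under assumptions: `isMappedWritable` is a hypothetical helper name, not something from this thread or from druntime, and it uses Phobos file I/O, so it is not async-signal-safe. A real handler would need a raw open/read version of the same idea. It also only shows that the address is mapped read/write, not that it points at a well-formed stack.

```d
import std.conv : parse;
import std.stdio : File;

/// Hypothetical helper: returns true if `addr` lies inside a mapping of the
/// current process that is both readable and writable, according to
/// /proc/self/maps. Linux only.
bool isMappedWritable(size_t addr)
{
    foreach (line; File("/proc/self/maps").byLine)
    {
        // Each line starts like: "08048000-08056000 rw-p 00000000 ..."
        char[] rest = line;
        immutable lo = parse!size_t(rest, 16); // start address (hex)
        rest = rest[1 .. $];                   // skip the '-'
        immutable hi = parse!size_t(rest, 16); // end address (exclusive)
        // rest now begins with " rwxp", the permission field.
        if (addr >= lo && addr < hi)
            return rest.length >= 3 && rest[1] == 'r' && rest[2] == 'w';
    }
    return false; // address is not mapped at all
}

unittest
{
    int local;
    assert(isMappedWritable(cast(size_t) &local)); // our own stack is mapped rw
    assert(!isMappedWritable(0));                  // the null page is not
}
```

In the scheme discussed here, the value to test would be the saved ESP from the ucontext_t, and the check would have to happen before EIP is redirected to the user-space handler.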
On Wed, 14 Mar 2012 16:08:29 -0400, Don Clugston <dac nospam.com> wrote:
> Now, your user space handler will cause another segfault when it does the mov [ESP], 0. I think that gives you an infinite loop.

SEGFAULT inside a SEGV signal handler aborts the program (no way to turn this off IIRC).

-Steve
Mar 14 2012
On 14/03/12 21:31, Steven Schveighoffer wrote:
> On Wed, 14 Mar 2012 16:08:29 -0400, Don Clugston <dac nospam.com> wrote:
>> Now, your user space handler will cause another segfault when it does the mov [ESP], 0. I think that gives you an infinite loop.
>
> SEGFAULT inside a SEGV signal handler aborts the program (no way to turn this off IIRC).
>
> -Steve

But you're not inside the signal handler when it happens. You returned.
Mar 14 2012
On Wed, 14 Mar 2012 16:45:49 -0400, Don Clugston <dac nospam.com> wrote:
> On 14/03/12 21:31, Steven Schveighoffer wrote:
>> On Wed, 14 Mar 2012 16:08:29 -0400, Don Clugston <dac nospam.com> wrote:
>>> Now, your user space handler will cause another segfault when it does the mov [ESP], 0. I think that gives you an infinite loop.
>>
>> SEGFAULT inside a SEGV signal handler aborts the program (no way to turn this off IIRC).
>>
>> -Steve
>
> But you're not inside the signal handler when it happens. You returned.

Then how does the signal handler do anything? I mean, doesn't it need a stack? Or does it just affect register variables? Most signal handlers are normal functions, and isn't there some usage of the stack to save registers?

It seems there should be a way to turn off the signal handler during the time when you are suspicious of the stack being the culprit, then re-engage the signal handler before throwing the error.

-Steve
Mar 14 2012
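On the question of which stack the handler itself runs on: by default it runs on the interrupted thread's stack, but POSIX lets a process register an alternate one with sigaltstack and request it with SA_ONSTACK. The sketch below only shows that registration. `installOnAltStack` and `onSegv` are hypothetical names, the 64 KiB size is an arbitrary assumption, and the handler body is left empty because what it should do is exactly what this thread is debating.

```d
import core.stdc.stdlib : malloc;
import core.sys.posix.signal;

/// Placeholder handler. A real one would inspect or rewrite the ucontext_t.
extern (C) void onSegv(int signum, siginfo_t* info, void* context)
{
}

/// Hypothetical helper: registers an alternate signal stack for the calling
/// thread and installs a SIGSEGV handler that runs on it.
bool installOnAltStack(size_t size = 64 * 1024)
{
    stack_t ss;
    ss.ss_sp = malloc(size);          // memory the handler will use as its stack
    if (ss.ss_sp is null)
        return false;
    ss.ss_size = size;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, null) != 0)
        return false;

    sigaction_t action;
    action.sa_sigaction = &onSegv;
    action.sa_flags = SA_SIGINFO | SA_ONSTACK; // SA_ONSTACK: use the alternate stack
    sigemptyset(&action.sa_mask);
    return sigaction(SIGSEGV, &action, null) == 0;
}

unittest
{
    assert(installOnAltStack());
    stack_t current;
    assert(sigaltstack(null, &current) == 0);
    assert(current.ss_size == 64 * 1024);
}
```

This keeps the handler alive even when ESP is garbage or the stack has overflowed, but it does not help the user-space function that runs after the handler returns, since that one is back on the original stack.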
On 14/03/2012 21:53, Steven Schveighoffer wrote:
> On Wed, 14 Mar 2012 16:45:49 -0400, Don Clugston <dac nospam.com> wrote:
>> On 14/03/12 21:31, Steven Schveighoffer wrote:
>>> On Wed, 14 Mar 2012 16:08:29 -0400, Don Clugston <dac nospam.com> wrote:
>>>> Now, your user space handler will cause another segfault when it does the mov [ESP], 0. I think that gives you an infinite loop.
>>>
>>> SEGFAULT inside a SEGV signal handler aborts the program (no way to turn this off IIRC).
>>>
>>> -Steve
>>
>> But you're not inside the signal handler when it happens. You returned.
>
> Then how does the signal handler do anything? I mean, doesn't it need a stack? Or does it just affect register variables? Most signal handlers are normal functions, and isn't there some usage of the stack to save registers?
>
> It seems there should be a way to turn off the signal handler during the time when you are suspicious of the stack being the culprit, then re-engage the signal handler before throwing the error.
>
> -Steve

The address of the instruction being executed is hijacked, so instead of resuming normal operation after the signal handler exits, execution goes into the throwing handler.

This is a very nice trick!
Mar 14 2012
On Wed, 14 Mar 2012 17:25:28 -0400, deadalnix <deadalnix gmail.com> wrote:
> On 14/03/2012 21:53, Steven Schveighoffer wrote:
>> On Wed, 14 Mar 2012 16:45:49 -0400, Don Clugston <dac nospam.com> wrote:
>>> On 14/03/12 21:31, Steven Schveighoffer wrote:
>>>> On Wed, 14 Mar 2012 16:08:29 -0400, Don Clugston <dac nospam.com> wrote:
>>>>> Now, your user space handler will cause another segfault when it does the mov [ESP], 0. I think that gives you an infinite loop.
>>>>
>>>> SEGFAULT inside a SEGV signal handler aborts the program (no way to turn this off IIRC).
>>>>
>>>> -Steve
>>>
>>> But you're not inside the signal handler when it happens. You returned.
>>
>> Then how does the signal handler do anything? I mean, doesn't it need a stack? Or does it just affect register variables? Most signal handlers are normal functions, and isn't there some usage of the stack to save registers?
>>
>> It seems there should be a way to turn off the signal handler during the time when you are suspicious of the stack being the culprit, then re-engage the signal handler before throwing the error.
>>
>> -Steve
>
> The address of the instruction being executed is hijacked, so instead of resuming normal operation after the signal handler exits, execution goes into the throwing handler.
>
> This is a very nice trick!

I get that. What I was saying is, I thought even the signal handler uses the stack (thereby it would abort if invalid). And even if it doesn't, simply accessing the stack by loading it into a register should be sufficient to "test" and see if the stack is valid to use (i.e. cause another SEGV inside the signal handler forcing an abort so we don't have an infinite loop).

I honestly don't know enough to really be discussing, but it seems like a really neat idea, and I grasp how it works. I just don't know all the particulars of signal calling conventions.

-Steve
Mar 14 2012
On Wed, Mar 14, 2012 at 05:35:04PM -0400, Steven Schveighoffer wrote:
[...]
> I get that. What I was saying is, I thought even the signal handler uses the stack (thereby it would abort if invalid). And even if it doesn't, simply accessing the stack by loading it into a register should be sufficient to "test" and see if the stack is valid to use (i.e. cause another SEGV inside the signal handler forcing an abort so we don't have an infinite loop).

That's a good idea. So the signal handler reads the top of the stack into EAX (since we're already overwriting EAX anyway), and if the stack is invalid, that will segfault and abort the program. If that doesn't abort, then assume the stack is valid and proceed with the hack to divert the return address to the throwing handler.

However, this still assumes that ESP is either valid or null. If the segfault was caused by, say, an exploit attempt, then ESP may be non-null but not pointing to a valid stack either. It's conceivable that someone might try to exploit a D program by crafting a bad stack and pointing ESP at it, then triggering a segfault intentionally. (The bad stack could, for example, contain strange stack frames that cause the stack unwinder to do something unintended, like execute arbitrary code.) I don't know how to solve this, though.

T

-- 
There is no gravity. The earth sucks.
Mar 14 2012
On 03/14/12 21:08, Don Clugston wrote:
> I didn't realize that was possible. Very interesting.
>
> As it stands, though, that's got some pretty serious issues. You are on the stack of the function that was called, but you don't know for sure that it is a valid stack.
>
>     asm {
>         push EBX;
>         mov EBX, ESP;
>         mov ESP, 0;            // Look ma, no stack!
>         mov int ptr [ESP], 0;  // segfault -- null pointer exception
>         mov ESP, EBX;
>         pop EBX;
>     }
>
> Now, your user space handler will cause another segfault when it does the mov [ESP], 0. I think that gives you an infinite loop.

I think that case is sufficiently rare that it'd have to count somewhere between "act of god" and "outright developer malice". The assumption that the stack frame is valid is, I'd say, safe to make in the vast majority of cases. You pretty much have to actively try to break it, for no clearly discernible reason.
Mar 14 2012
On Mar 14, 2012, at 1:54 PM, FeepingCreature wrote:
> I think that case is sufficiently rare that it'd have to count somewhere between "act of god" and "outright developer malice". The assumption that the stack frame is valid is, I'd say, safe to make in the vast majority of cases. You pretty much have to actively try to break it, for no clearly discernible reason.

The prevalence of buffer overflow attacks might suggest otherwise.
Mar 14 2012
On 14/03/2012 21:59, Sean Kelly wrote:
> On Mar 14, 2012, at 1:54 PM, FeepingCreature wrote:
>> I think that case is sufficiently rare that it'd have to count somewhere between "act of god" and "outright developer malice". The assumption that the stack frame is valid is, I'd say, safe to make in the vast majority of cases. You pretty much have to actively try to break it, for no clearly discernible reason.
>
> The prevalence of buffer overflow attacks might suggest otherwise.

And as a stack overflow is likely to create a SEGFAULT too, we are doomed!
Mar 14 2012
On 14/03/12 21:59, Sean Kelly wrote:
> On Mar 14, 2012, at 1:54 PM, FeepingCreature wrote:
>> I think that case is sufficiently rare that it'd have to count somewhere between "act of god" and "outright developer malice". The assumption that the stack frame is valid is, I'd say, safe to make in the vast majority of cases. You pretty much have to actively try to break it, for no clearly discernible reason.
>
> The prevalence of buffer overflow attacks might suggest otherwise.

    void foo() {
        bar();
    }

    void bar() {
        int y;
        int *p = &y;
        p[1] = 0;
    }

The assignment to p[1]=0 clobbers the location where EBP was pushed. Then:

    mov ESP, EBP; // ESP is OK
    pop EBP;      // EBP is now 0
    ret;

now return to foo, where we get:

       call bar;
    -> mov ESP, EBP; // ESP is now 0
       pop EBP;      // segfault
       ret

Unfortunately it's not difficult to corrupt ESP.
Mar 14 2012
Here is a proof of concept of how we can recover from a segfault.

This isn't perfect, as it doesn't protect everything (like floating point registers). This is mostly because I can't find precise documentation about what must be saved and what need not be.

The handler calls a naked function that sets up a stack simulating a standard call, and then calls a D function. This function receives as its parameter the memory address that caused the segfault.

We can do whatever we want in the D function; at this point we have a clean stack. Then, if the function returns the standard way, things are set back and the code that triggered the segfault runs again.

In the example below, I use memory protection to trigger the segfault. In the handler, I remove the memory protection, so the program can continue its execution.

The code is based on Vladimir Panteleev's prototype.

import core.sys.posix.signal;
import core.sys.posix.ucontext;
import std.stdio;

// Missing details from Druntime

version(X86_64) {
    enum {
        REG_R8 = 0, REG_R9, REG_R10, REG_R11, REG_R12, REG_R13, REG_R14, REG_R15,
        REG_RDI, REG_RSI, REG_RBP, REG_RBX, REG_RDX, REG_RAX, REG_RCX, REG_RSP,
        REG_RIP, REG_EFL,
        REG_CSGSFS, /* Actually short cs, gs, fs, __pad0. */
        REG_ERR, REG_TRAPNO, REG_OLDMASK, REG_CR2
    }
} else version (X86) {
    enum {
        REG_GS = 0, REG_FS, REG_ES, REG_DS,
        REG_EDI, REG_ESI, REG_EBP, REG_ESP,
        REG_EBX, REG_EDX, REG_ECX, REG_EAX,
        REG_TRAPNO, REG_ERR, REG_EIP, REG_CS,
        REG_EFL, REG_UESP, REG_SS
    }
}

// Init

shared static this() {
    sigaction_t action;
    action.sa_sigaction = &handleSignal;
    action.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &action, null);
}

// Sighandler space

alias typeof({ucontext_t uc; return uc.uc_mcontext.gregs[0];}()) REG_TYPE;

static REG_TYPE saved_EAX, saved_EDX;

extern(C) void handleSignal(int signum, siginfo_t* info, void* contextPtr) {
    auto context = cast(ucontext_t*)contextPtr;

    // Save registers into global thread local, to allow recovery.
    saved_EAX = context.uc_mcontext.gregs[REG_EAX];
    saved_EDX = context.uc_mcontext.gregs[REG_EDX];

    // Hijack current context so we call our handler.
    context.uc_mcontext.gregs[REG_EAX] = cast(REG_TYPE) info._sifields._sigfault.si_addr;
    context.uc_mcontext.gregs[REG_EDX] = context.uc_mcontext.gregs[REG_EIP];
    context.uc_mcontext.gregs[REG_EIP] = cast(REG_TYPE) &sigsegv_userspace_handler;
}

// User space

// This function must be called with faulting address in EAX and original EIP in EDX.
void sigsegv_userspace_handler() {
    asm {
        naked;
        push EDX;     // return address (original EIP).
        push EBP;     // old ebp
        mov EBP, ESP;
        push ECX;     // ECX is a trash register and must be preserved as local variable.

        // Parameter address is already set as EAX.
        call sigsegv_userspace_process;

        // Restore register values and return.
        call restore_registers;
        pop ECX;

        // Return
        pop EBP;
        ret;
    }
}

// The return value is stored in EAX and EDX, so this function restores the correct values for these registers.
REG_TYPE[2] restore_registers() {
    return [saved_EAX, saved_EDX];
}

// User space handler

class SignalError : Error {
    this(string msg) {
        super(msg);
    }
}

extern(C) int mprotect(void*, size_t, int);

void sigsegv_userspace_process(void* address) {
    import std.stdio;
    writeln("Handler starting.");
    writeln("SEGFAULT triggered at address : ", address);

    // Dirty trick to get stack trace, for debug purpose.
    try {
        throw new SignalError("SIGSEGV");
    } catch(SignalError se) {
        writeln(se.toString());
    }

    // Allow write access to memory. So when we return the operation causing SEGFAULT will succeed.
    import core.sys.posix.sys.mman;
    mprotect(address, 4096, PROT_READ|PROT_WRITE);

    writeln("Handler ending.");
    // throw new SignalError("SIGSEGV");
}

// Demonstration

void foo(void* x) {
    *(cast(int*) x) = 1;
}

void main() {
    import core.sys.posix.sys.mman;
    import std.stdio;

    void* x = mmap(cast(void*) 0x12340000, 4096, PROT_NONE, MAP_PRIVATE|MAP_ANON, -1, 0);
    if(x == cast(void*) 0x12340000) {
        writeln("Try to write at ", x);
        foo(x);
    } else {
        write("Can't mmap :(");
    }

    assert(*(cast(int*) x) == 1);
    writeln("Value successfully written ! SIGSEGV recovered !");
}
Mar 15 2012
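For comparison, the retry-after-unprotect behaviour can also be observed without touching EIP or any register enums, which makes it independent of x86 vs x86_64. This is a different and much more limited technique than the one posted above: the fix-up runs inside the signal handler itself, so nothing can be thrown from it. It is a sketch under assumptions: `unprotectOnFault` and `writeToProtectedPage` are hypothetical names, a 4096-byte page size is assumed, a current druntime that declares mprotect is assumed, and mprotect is not on POSIX's list of async-signal-safe functions even though it is a plain system call on Linux.

```d
import core.stdc.stdlib : abort;
import core.sys.posix.signal;
import core.sys.posix.sys.mman;

enum size_t pageSize = 4096; // assumption: 4 KiB pages

/// Runs in signal context: makes the faulting page writable, then returns so
/// that the kernel re-executes the instruction that faulted.
extern (C) void unprotectOnFault(int signum, siginfo_t* info, void* context)
{
    auto addr = cast(size_t) info._sifields._sigfault.si_addr;
    auto page = cast(void*)(addr & ~(pageSize - 1)); // round down to page start
    if (mprotect(page, pageSize, PROT_READ | PROT_WRITE) != 0)
        abort(); // not a page we can fix; avoid faulting forever
}

/// Maps an inaccessible page, writes to it, and returns the value read back
/// (or -1 if the mapping could not be created).
int writeToProtectedPage()
{
    sigaction_t action;
    action.sa_sigaction = &unprotectOnFault;
    action.sa_flags = SA_SIGINFO;
    sigemptyset(&action.sa_mask);
    if (sigaction(SIGSEGV, &action, null) != 0)
        return -1;

    void* mem = mmap(null, pageSize, PROT_NONE, MAP_PRIVATE | MAP_ANON, -1, 0);
    if (mem is MAP_FAILED)
        return -1;

    auto p = cast(int*) mem;
    *p = 1;     // faults once; the handler unprotects the page and the store is retried
    return *p;
}

unittest
{
    assert(writeToProtectedPage() == 1);
}
```

The EIP-hijacking version remains the interesting one for D, since only code running outside the handler can unwind the stack or throw.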
On 03/15/12 16:16, Kagamin wrote:
> Does it recover from
>
>     void function() f=null;
>     f();

Not as currently written, no. It should be possible to detect this case and get a proper stackframe back, though.
Mar 15 2012
On 15/03/2012 21:20, FeepingCreature wrote:
> On 03/15/12 16:16, Kagamin wrote:
>> Does it recover from
>>
>>     void function() f=null;
>>     f();
>
> Not as currently written, no. It should be possible to detect this case and get a proper stackframe back, though.

It is supported as written in my sample code. I have to do another one for x86_64.
Mar 17 2012