digitalmars.D - array.reverse segfaults

reply Moritz Warning <moritzwarning web.de> writes:
Hi,

The following code segfaults on Debian Linux (with dmd 1.035). Can someone tell me why?

char[] get(char[] str)
{
    return new char[](4);
}

void main(char[][] args)
{
    char[] str = get("abc");
    char[] reversed = str.reverse; // <-- access violation
}
Oct 22 2008
next sibling parent "Bill Baxter" <wbaxter gmail.com> writes:
On Wed, Oct 22, 2008 at 7:16 PM, Moritz Warning <moritzwarning web.de> wrote:
> Hi,
>
> This piece of code segfaults on Debian Linux (with dmd 1.035):
> Can someone tell me why?
>
> char[] get(char[] str)
> {
>     return new char[](4);
> }
>
> void main(char[][] args)
> {
>     char[] str = get("abc");
>     char[] reversed = str.reverse; // <-- access violation
> }

Does str.reverse actually return anything? I think you need to do that as:

    str.reverse;
    char[] reversed = str;

However, if it's not supposed to return anything, then it's a bug that it compiles. If it's supposed to return something, then it's a bug that it crashes. Does it work doing it as two lines?

--bb
Oct 22 2008
prev sibling next sibling parent Moritz Warning <moritzwarning web.de> writes:
On Wed, 22 Oct 2008 19:28:08 +0900, Bill Baxter wrote:

> Does str.reverse actually return anything? I think you need to do that as:
>
>     str.reverse;
>     char[] reversed = str;
>
> However, if it's not supposed to return anything, then it's a bug that it compiles. If it's supposed to return something, then it's a bug that it crashes. Does it work doing it as two lines?
>
> --bb

From the specs:

    .reverse    Reverses in place the order of the elements in the array. Returns the array.

Removing the assignment to "reversed" doesn't change anything.
Oct 22 2008
prev sibling parent reply Tomas Lindquist Olsen <tomas famolsen.dk> writes:
Moritz Warning wrote:
> Hi,
>
> This piece of code segfaults on Debian Linux (with dmd 1.035):
> Can someone tell me why?
>
> char[] get(char[] str)
> {
>     return new char[](4);
> }
>
> void main(char[][] args)
> {
>     char[] str = get("abc");
>     char[] reversed = str.reverse; // <-- access violation
> }

Simpler version:

    void main()
    {
        char[4] str;
        str.reverse;
    }

Crashes in _adReverseChar when trying to memmove (3 - 255) bytes ;)

My best guess is that it just doesn't handle char.init values properly!
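
The reduced case rests on two facts that a short sketch can make concrete: D initializes every char to 0xFF, and a byte count of 3 - 255 is negative, so it wraps to an enormous value once it is treated as an unsigned size. This is only an illustration written for a current D compiler, not the runtime's code; the variable names are assumptions.

```d
void main()
{
    // char.init is 0xFF, a byte that never appears in valid UTF-8.
    assert(char.init == 0xFF);

    // A freshly allocated char array is filled with char.init.
    char[] str = new char[](4);
    foreach (c; str)
        assert(c == 0xFF);

    // Illustrative arithmetic only: 3 - 255 is negative as a signed int ...
    int length = 3 - 255;
    assert(length == -252);

    // ... and wraps to a huge byte count when converted to size_t,
    // which is the type a memmove-style length ends up as.
    size_t wrapped = cast(size_t) length;
    assert(wrapped == size_t.max - 251);
}
```

A copy of that many bytes runs far past the four-byte array, which is consistent with the access violation reported above.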
Oct 22 2008
next sibling parent Tomas Lindquist Olsen <tomas famolsen.dk> writes:
Tomas Lindquist Olsen wrote:
> Simpler version:
>
>     void main()
>     {
>         char[4] str;
>         str.reverse;
>     }
>
> Crashes in _adReverseChar when trying to memmove (3 - 255) bytes ;)
>
> My best guess is that it just doesn't handle char.init values properly!

When it tries to get the lower stride, it gets 0xFF from the table, but it doesn't check if this value is usable. Probably just ignoring these invalid bytes would make it work.

But I think the real question is, what should _adReverseChar really do on invalid UTF-8 input?
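
To make the missing check concrete, here is a minimal sketch of a stride lookup that reports unusable lead bytes instead of handing back a raw table value. It is not the runtime's table or the code of _adReverseChar; utf8Stride and strideIsUsable are hypothetical names, and the use of 0 as the "invalid" marker is an assumption of this sketch.

```d
// Hypothetical helper: length in bytes of the UTF-8 sequence introduced
// by `lead`, or 0 if `lead` cannot start a sequence at all.
size_t utf8Stride(char lead)
{
    if (lead < 0x80) return 1; // ASCII
    if (lead < 0xC0) return 0; // continuation byte, not a lead byte
    if (lead < 0xE0) return 2;
    if (lead < 0xF0) return 3;
    if (lead < 0xF8) return 4;
    return 0;                  // 0xF8..0xFF, including char.init (0xFF)
}

// True if the stride at `index` is usable: non-zero and inside the array.
bool strideIsUsable(const(char)[] s, size_t index)
{
    immutable n = utf8Stride(s[index]);
    return n != 0 && n <= s.length - index;
}

void main()
{
    char[4] str; // all four bytes are char.init, i.e. 0xFF
    assert(utf8Stride(str[0]) == 0);
    assert(!strideIsUsable(str[], 0));

    assert(strideIsUsable("abc", 0));

    // A lead byte announcing two bytes, with nothing after it.
    char[] truncated = [cast(char) 0xC3];
    assert(utf8Stride(truncated[0]) == 2);
    assert(!strideIsUsable(truncated, 0));
}
```

What the caller does once the check fails (skip the byte, assert, or throw) is exactly the open question raised above.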
Oct 22 2008
prev sibling next sibling parent Moritz Warning <moritzwarning web.de> writes:
On Wed, 22 Oct 2008 13:10:20 +0200, Tomas Lindquist Olsen wrote:

> When it tries to get the lower stride, it gets 0xFF from the table, but it doesn't check if this value is usable. Probably just ignoring these invalid bytes would make it work.
>
> But I think the real question is, what should _adReverseChar really do on invalid UTF-8 input?

I think it should do the same as on an invalid pointer: result in undefined behavior (=> segfault).
Oct 22 2008
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Wed, 22 Oct 2008 15:21:03 +0400, Moritz Warning <moritzwarning web.de> wrote:

> I think it should do the same as on an invalid pointer: result in undefined behavior (=> segfault).

It should not pass the assert(isValidUtf8String(str)) prior to the in-place reverse, thus throwing an exception in debug mode. Release behaviour is subject to debate, but I think it should be more robust. Given wrong input it may produce whatever wrong output, but a segfault? That's too bold.
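
As a rough sketch of that idea, a validity predicate can be asserted before the reversal starts; asserts are compiled out with -release, so the check costs nothing in release builds. isValidUtf8String is the name used in the post, but its body here is an assumption built on Phobos' std.utf.validate in current D, and debugCheckedReverse is a hypothetical wrapper, not the runtime's function.

```d
import std.algorithm.mutation : reverse;
import std.utf : UTFException, validate;

// Assumed implementation of the predicate named in the post:
// true if `s` is well-formed UTF-8.
bool isValidUtf8String(const(char)[] s)
{
    try
    {
        validate(s); // throws UTFException on malformed input
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

// Hypothetical wrapper: the assert fires in non-release builds only.
void debugCheckedReverse(char[] s)
{
    assert(isValidUtf8String(s), "invalid UTF-8 passed to reverse");
    reverse(s); // in-place reversal from std.algorithm
}

void main()
{
    assert(isValidUtf8String("abc"));

    char[4] bad; // four char.init bytes (0xFF)
    assert(!isValidUtf8String(bad[]));

    char[] ok = "abc".dup;
    debugCheckedReverse(ok);
    assert(ok == "cba");
}
```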
Oct 22 2008
prev sibling next sibling parent "Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:
On Wed, Oct 22, 2008 at 9:46 AM, Denis Koroskin <2korden gmail.com> wrote:
> It should not pass the assert(isValidUtf8String(str)) prior to the in-place reverse, thus throwing an exception in debug mode. Release behaviour is subject to debate, but I think it should be more robust. Given wrong input it may produce whatever wrong output, but a segfault? That's too bold.

I'd expect it to work like every other piece of code in the runtime that deals with Unicode and throw a UtfException or whatever it is.
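
For comparison, this is roughly what the throwing behaviour looks like when written against current Phobos, where the exception type is std.utf.UTFException. checkedReverse is a hypothetical wrapper for illustration; it is not what the dmd 1.035 runtime does.

```d
import std.algorithm.mutation : reverse;
import std.utf : UTFException, validate;

// Hypothetical wrapper: validate first, so malformed input surfaces as a
// UTFException instead of a crash inside the reversal.
char[] checkedReverse(char[] s)
{
    validate(s); // throws UTFException on malformed UTF-8
    reverse(s);  // reverses in place
    return s;
}

void main()
{
    char[] ok = "abc".dup;
    assert(checkedReverse(ok) == "cba");

    char[] bad = new char[](4); // four 0xFF bytes, as in the original report
    bool thrown = false;
    try
    {
        checkedReverse(bad);
    }
    catch (UTFException e)
    {
        thrown = true;
    }
    assert(thrown);
}
```

Unlike an assert, this check stays active in release builds, at the price of one extra pass over the string.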
Oct 22 2008
prev sibling parent Moritz Warning <moritzwarning web.de> writes:
On Wed, 22 Oct 2008 17:46:26 +0400, Denis Koroskin wrote:

> It should not pass the assert(isValidUtf8String(str)) prior to the in-place reverse, thus throwing an exception in debug mode. Release behaviour is subject to debate, but I think it should be more robust. Given wrong input it may produce whatever wrong output, but a segfault? That's too bold.

I was only referring to release builds. IMHO, if additional robustness doesn't result in a speed hit, then throwing an exception would be better.
Oct 22 2008