
digitalmars.D - obliterate

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Hello,


I will soon get to work on typed allocators; I figured there will be 
some issues percolating to untyped allocators that will require design 
changes (hopefully minor).

For starters, I want to define a function that "obliterates" an object, 
i.e. makes it almost surely unusable and not obeying its own invariants. 
At the same time, that state should be entirely reproducible and 
memory-safe.

Here's what I'm thinking. First, obliterate calls the destructor if 
present and then writes the fields as follows:

* unsigned integers: t.max / 2

* signed integers: t.min / 2

* characters: ?

* Pointers and class references: size_t.max - 65_535, i.e. 64K below the 
upper memory limit. On all systems I know it can be safely assumed that 
that area will cause GPF when accessed.

* Arrays: some weird length (like 17), and also starting at size_t.max 
minus the memory occupied by the array.

* floating point numbers: NaN, or some ridiculous value like F.max / 2?
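A rough sketch of what such a function might look like in D for a few built-in field types (illustrative only; the trait-based dispatch and the helper name are hypothetical, and the poison values are just the list above transcribed):

```d
import std.traits : isFloatingPoint, isIntegral, isPointer, isSigned,
    isUnsigned;

// Illustrative sketch: overwrite a field with a value that is
// reproducible, memory-safe to store, and almost surely invalid to use.
void obliterate(T)(ref T t)
{
    static if (isIntegral!T && isUnsigned!T)
        t = T.max / 2;                      // unsigned integers
    else static if (isIntegral!T && isSigned!T)
        t = T.min / 2;                      // signed integers
    else static if (isFloatingPoint!T)
        t = T.nan;                          // viral and reproducible
    else static if (isPointer!T)
        t = cast(T) (size_t.max - 65_535);  // ~64K below the top of memory
    else
        static assert(0, "not sketched for " ~ T.stringof);
}

unittest
{
    uint u; obliterate(u); assert(u == uint.max / 2);
    int  i; obliterate(i); assert(i == int.min / 2);
    double d = 1.0; obliterate(d); assert(d != d);  // NaN
    int* p; obliterate(p); assert(cast(size_t) p == size_t.max - 65_535);
}
```

Characters, class references, arrays, and aggregates are left out of the sketch; they are the cases discussed below.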



Andrei
Nov 12 2013
next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Tuesday, November 12, 2013 16:33:17 Andrei Alexandrescu wrote:
 Hello,
 
 
 I will soon get to work on typed allocators; I figured there will be
 some issues percolating to untyped allocators that will require design
 changes (hopefully minor).
 
 For starters, I want to define a function that "obliterates" an object,
 i.e. makes it almost surely unusable and not obeying its own invariants.
 At the same time, that state should be entirely reproducible and
 memory-safe.
 
 Here's what I'm thinking. First, obliterate calls the destructor if
 present and then writes the fields as follows:
 
 * unsigned integers: t.max / 2
 
 * signed integers: t.min / 2
 
 * characters: ?
 
 * Pointers and class references: size_t.max - 65_535, i.e. 64K below the
 upper memory limit. On all systems I know it can be safely assumed that
 that area will cause GPF when accessed.
 
 * Arrays: some weird length (like 17), and also starting at size_t.max
 minus the memory occupied by the array.
 
 * floating point numbers: NaN, or some ridiculous value like F.max / 2?
1. How is this different from destroy aside from the fact that it's specifically choosing values which aren't T.init? 2. What is the purpose of not choosing T.init? - Jonathan M Davis
Nov 12 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/12/13 4:59 PM, Jonathan M Davis wrote:
 On Tuesday, November 12, 2013 16:33:17 Andrei Alexandrescu wrote:
 [snip]
1. How is this different from destroy aside from the fact that it's specifically choosing values which aren't T.init? 2. What is the purpose of not choosing T.init?
Consider a memory-safe allocator (oddly enough, they exist: in brief, think of a non-intrusive unbounded per-type freelist). That would allow access after deallocation, but would fail in a reproducible way. The idea is that it should fail, so T.init is not good.

Andrei
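A toy illustration of the kind of allocator meant here (a hypothetical, non-thread-safe sketch; the names are made up): a per-type freelist that never hands blocks back to the underlying allocator, so even a dangling pointer still refers to a T-sized, T-aligned block, and obliterating on free makes any use-after-free fail reproducibly.

```d
// Hypothetical sketch of a memory-safe per-type freelist.
// Freed blocks are destroyed, poisoned, and kept on the list forever,
// so a use-after-free sees poison values instead of unrelated data.
struct SafeFreeList(T)
{
    private T*[] avail;  // blocks available for reuse

    T* allocate()
    {
        if (avail.length)
        {
            auto p = avail[$ - 1];
            avail.length -= 1;
            *p = T.init;     // fresh state for the new owner
            return p;
        }
        return new T;        // grow: get a fresh block from the GC
    }

    void deallocate(T* p)
    {
        destroy(*p);         // run the destructor, if any
        // ... obliterate the fields of *p here ...
        avail ~= p;          // never returned to the underlying allocator
    }
}

unittest
{
    SafeFreeList!int list;
    auto a = list.allocate();
    list.deallocate(a);
    auto b = list.allocate();
    assert(a is b);  // the block was recycled, not freed
}
```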
Nov 12 2013
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Tuesday, November 12, 2013 17:19:04 Andrei Alexandrescu wrote:
 On 11/12/13 4:59 PM, Jonathan M Davis wrote:
 On Tuesday, November 12, 2013 16:33:17 Andrei Alexandrescu wrote:
 [snip]
1. How is this different from destroy aside from the fact that it's specifically choosing values which aren't T.init? 2. What is the purpose of not choosing T.init?
Consider a memory-safe allocator (oddly enough they exist: in brief think non-intrusive unbounded per-type freelist). That would allow access after deallocation but would fail in a reproducible way. The idea is that it should fail, so T.init is not good.
Except that most of your examples don't seem like they would fail any more than T.init would. int.max / 2 is no more valid or invalid than 0. The only difference I see is that by setting pointers/references/arrays to a weird value rather than null, they'll be treated as if they have a value and then blow up, rather than blowing up on a null value. For all the built-in types, T.init is essentially supposed to be as invalid as the type can get without pointing off into memory that it shouldn't be addressing.

So, the only thing I see that this suggestion does over using T.init is that with pointers/references/arrays, you won't end up with code that checks for null and avoids blowing up. Code that checks for null would then blow up just as much as code that assumes that the pointer/reference/array was non-null. But that's the only difference I see, since none of the other types end up with values that are any more invalid than T.init.

- Jonathan M Davis
Nov 12 2013
parent reply "monarch_dodra" <monarchdodra gmail.com> writes:
On Wednesday, 13 November 2013 at 01:37:15 UTC, Jonathan M Davis 
wrote:
 Except that most of your examples don't seem like they 
 would fail any
 more than T.init would. int.max / 2 is no more valid or 
 invalid than 0. The
 only difference I see would be that by setting 
 pointers/references/arrays to a
 weird value rather than null, they'll be treated as if they 
 have a value and
 then blow up rather than blowing up on a null value. For all 
 the built-in
 types, T.init is essentially supposed to be as invalid as the 
 type can get
 without pointing off into memory that it shouldn't be 
 addressing.

 So, the only thing I see that this suggestion does over using 
 T.init is that
 on pointers/references/arrays, you won't end up with code that 
 checks for null
 and avoids blowing up. Code that checks for null would then 
 blow up just as
 much as code that assumes that the pointer/reference/array was 
 non-null. But
 that's the only difference I see, since none of the other types 
 end up with
 values that are any more invalid than T.init.

 - Jonathan M Davis
+1. Essentially, I don't see how any of this is better than .init.

What about user-defined types? How do you want to deal with those? Recursively search all basic types and set them to the above-mentioned values?
Nov 14 2013
parent Brad Roberts <braddr puremagic.com> writes:
On 11/14/13 1:36 AM, monarch_dodra wrote:
 On Wednesday, 13 November 2013 at 01:37:15 UTC, Jonathan M Davis wrote:
 [snip]
+1. Essentially, I don't see how any of this is better than .init. What about user defined types? How do you want to deal with those? Recursively search all basic types and set them to the above mentioned values?
It can be nice to distinguish between initial state and final state.
Nov 14 2013
prev sibling next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Wednesday, 13 November 2013 at 00:33:17 UTC, Andrei 
Alexandrescu wrote:
 For starters, I want to define a function that "obliterates" an 
 object, i.e. makes it almost surely unusable and not obeying 
 its own invariants. At the same time, that state should be 
 entirely reproducible and memory-safe.
What's this for? When will it be used? How will it behave in release mode? No-op, or same as non-release?
 Here's what I'm thinking. First, obliterate calls the 
 destructor if present and then writes the fields as follows:

 * unsigned integers: t.max / 2

 * signed integers: t.min / 2

 * characters: ?
Why not 0xFF? (char.init, invalid UTF-8 code unit)
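For reference, D's character init values already have this property, which is verifiable at compile time:

```d
// char.init and friends are deliberately invalid values.
static assert(char.init  == 0xFF);    // not a valid UTF-8 code unit
static assert(wchar.init == 0xFFFF);  // a Unicode noncharacter
static assert(dchar.init == 0xFFFF);  // likewise a noncharacter
```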
 * Pointers and class references: size_t.max - 65_535, i.e. 64K 
 below the upper memory limit. On all systems I know it can be 
 safely assumed that that area will cause GPF when accessed.
Make that value odd. That will also guarantee a GPF on systems where unaligned pointer access is forbidden.
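Combining the two properties might look like this (a sketch; the constant and its name are arbitrary):

```d
// Odd and near the top of the address space: faults on unmapped
// access, and also on targets that trap unaligned pointer loads.
enum size_t pointerPoison = (size_t.max - 65_535) | 1;

static assert(pointerPoison % 2 == 1);               // misaligned
static assert(pointerPoison > size_t.max - 65_536);  // near the top
```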
 * Arrays: some weird length (like 17), and also starting at 
 size_t.max minus the memory occupied by the array.
I guess the non-zero length is for code which is going to check it? Because otherwise, leaving length as just 0 will, in debug mode, cause a range error. In release mode, array index access will not check the length anyway.
 * floating point numbers: NaN, or some ridiculous value like 
 F.max / 2?
NaNs are viral, so there's that.
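That virality is easy to demonstrate:

```d
import std.math : isNaN;

void main()
{
    double d = double.nan;     // the obliterated value
    double r = d * 2.0 + 1.0;  // any arithmetic involving NaN stays NaN
    assert(isNaN(r));
    assert(r != r);            // NaN is the only value not equal to itself
}
```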
Nov 12 2013
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/12/13 5:10 PM, Vladimir Panteleev wrote:
 On Wednesday, 13 November 2013 at 00:33:17 UTC, Andrei Alexandrescu wrote:
 For starters, I want to define a function that "obliterates" an
 object, i.e. makes it almost surely unusable and not obeying its own
 invariants. At the same time, that state should be entirely
 reproducible and memory-safe.
What's this for? When will it be used?
Safe allocators.
 How will it behave in release mode? No-op, or same as non-release?
A safe allocator would obliterate in release mode if it wants to stay safe.
 Here's what I'm thinking. First, obliterate calls the destructor if
 present and then writes the fields as follows:

 * unsigned integers: t.max / 2

 * signed integers: t.min / 2

 * characters: ?
Why not 0xFF? (char.init, invalid UTF-8 code unit)
 * Pointers and class references: size_t.max - 65_535, i.e. 64K below
 the upper memory limit. On all systems I know it can be safely assumed
 that that area will cause GPF when accessed.
Make that value odd. That will also guarantee a GPF on systems where unaligned pointer access is forbidden.
 * Arrays: some weird length (like 17), and also starting at size_t.max
 minus the memory occupied by the array.
I guess the non-zero length is for code which is going to check it? Because otherwise, leaving length as just 0 will, in debug mode, cause a range error. In release mode, array index access will not check the length anyway.
I'm thinking along the lines of: empty arrays are common in sane objects.
 * floating point numbers: NaN, or some ridiculous value like F.max / 2?
NaNs are viral, so there's that.
Cool.

Andrei
Nov 12 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/12/13 5:10 PM, Vladimir Panteleev wrote:
 * Pointers and class references: size_t.max - 65_535, i.e. 64K below
 the upper memory limit. On all systems I know it can be safely assumed
 that that area will cause GPF when accessed.
Make that value odd. That will also guarantee a GPF on systems where unaligned pointer access is forbidden.
Ha, I missed that. Nice!

Andrei
Nov 12 2013
parent reply KlausO <oberhofer users.sf.net> writes:
On 13.11.2013 02:21, Andrei Alexandrescu wrote:
 On 11/12/13 5:10 PM, Vladimir Panteleev wrote:
 * Pointers and class references: size_t.max - 65_535, i.e. 64K below
 the upper memory limit. On all systems I know it can be safely assumed
 that that area will cause GPF when accessed.
Make that value odd. That will also guarantee a GPF on systems where unaligned pointer access is forbidden.
Ha, I missed that. Nice! Andrei
The classics:

0xFFFFDEAD  dead
0xFFFFD1ED  died

Not so obvious:

0xFFFFFA1D  failed
0xFFFFACED  faced (the result makes you dumb)
0xFFFFFEED  feed (don't feed me with this)
0xFFFFDF0F  don't follow (this pointer) or (you are) f*****
Nov 12 2013
next sibling parent =?UTF-8?B?U2ltZW4gS2rDpnLDpXM=?= <simen.kjaras gmail.com> writes:
On 13.11.2013 08:16, KlausO wrote:
 [snip]
I thought I recalled some system initializing its data to 0xF001, but apparently it was Algol 68-R using the character string "FOOLFOOLFOOL..." [0]. Still, I guess a case could be made for 0xFFFFF001.

[0]: http://www.catb.org/jargon/html/F/fool.html

-- Simen
Nov 13 2013
prev sibling parent "Andrea Fontana" <nospam example.com> writes:
On Wednesday, 13 November 2013 at 07:16:40 UTC, KlausO wrote:
 [snip]
You missed the best classic: 0xDEADBEEF
Nov 14 2013
prev sibling next sibling parent "Brad Anderson" <eco gnuk.net> writes:
On Wednesday, 13 November 2013 at 00:33:17 UTC, Andrei 
Alexandrescu wrote:
 [snip]
Perhaps these are of interest: http://stackoverflow.com/a/127404/216300
Nov 12 2013
prev sibling next sibling parent "Joseph Cassman" <jc7919 outlook.com> writes:
On Wednesday, 13 November 2013 at 00:33:17 UTC, Andrei 
Alexandrescu wrote:
 * Arrays: some weird length (like 17), and also starting at 
 size_t.max minus the memory occupied by the array.
This question probably reflects the fact that I do not know how arrays are implemented by the D runtime, but wouldn't shrinking an array make its memory eligible to be reclaimed/reapportioned? At least, I expect that to happen in normal D code any time I change an array's length to make it smaller; I figure the GC might kick in at any point after that and reclaim the formerly used memory. Of course, inside an allocator such functionality could be short-circuited so that the GC does not touch it, and shrinking an array would simply be a notational adjustment.

The idea behind a reproducible way to fail all seems good. I am just concerned that this part might have a performance impact. No?

Joseph
Nov 13 2013
prev sibling next sibling parent "qznc" <qznc web.de> writes:
On Wednesday, 13 November 2013 at 00:33:17 UTC, Andrei 
Alexandrescu wrote:
 [snip]
 * floating point numbers: NaN, or some ridiculous value like 
 F.max / 2?
NaN would be semantically the right thing, and there is even the concept of "signaling NaNs". I do not know how it works out in practice, though.

http://en.wikipedia.org/wiki/NaN#Signaling_NaN
Nov 13 2013
prev sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 13-Nov-2013 04:33, Andrei Alexandrescu wrote:
 Hello,
[snip]
 Here's what I'm thinking. First, obliterate calls the destructor if
 present and then writes the fields as follows:
 * unsigned integers: t.max / 2

 * signed integers: t.min / 2

 * characters: ?
As Vladimir said:

0xFF for char
0xFFFF for wchar
0x10_FFFF for dchar

Alternatives for w/dchar: anything in the range 0xFDD0-0xFDEF, or
0xFFFE and 0xFFFF,
0x1FFFE and 0x1FFFF,
0x2FFFE and 0x2FFFF,
and so on, up to
0x10FFFE and 0x10FFFF.

Relevant passage from the Unicode standard: "Noncharacters are code points that are permanently reserved in the Unicode Standard for internal use. They are forbidden for use in open interchange of Unicode text data."

-- Dmitry Olshansky
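A small predicate covering all 66 noncharacters from the ranges above (a sketch; the function name is made up):

```d
// The 66 Unicode noncharacters: U+FDD0..U+FDEF, plus the last two
// code points of each of the 17 planes (U+xFFFE and U+xFFFF).
bool isNoncharacter(dchar c)
{
    return (c >= 0xFDD0 && c <= 0xFDEF)
        || (c <= 0x10FFFF && (c & 0xFFFE) == 0xFFFE);
}

unittest
{
    assert(isNoncharacter(0xFFFF));   // dchar.init, incidentally
    assert(isNoncharacter(0x10FFFE));
    assert(isNoncharacter(0xFDD0));
    assert(!isNoncharacter('A'));
    assert(!isNoncharacter(0xFFFD));  // the replacement character is valid
}
```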
Nov 13 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/13/13 10:18 AM, Dmitry Olshansky wrote:
 As Vladimir said:
 0xFF for char
 0xFFFF for wchar
 0x10_FFFF for dchar

 Alternatives for w/dchar :
 all in range of 0xFDD0-0xFDEF
 0xFFFE and 0xFFFF
 0x1FFFE and 0x1FFFF
 0x2FFFE and 0x2FFFF
 and so on, up to
 0x10FFFE and 0x10FFFF

 Relevant passage from the Unicode standard:
 Noncharacters are code points that are permanently reserved in the
 Unicode Standard for internal use. They are forbidden for use in open
 interchange of Unicode text data.
Great, thanks folks.

Andrei
Nov 13 2013