digitalmars.D.learn - Understanding SIGSEGV issues

reply Russel Winder <russel winder.org.uk> writes:
So I have a D program that used to work. I come back to it, recompile it, and:

|> dub run -- ~/lib/DigitalTelevision/DVBv5/uk-CrystalPalace__RW
Performing "debug" build using /usr/bin/ldc2 for x86_64.
libdvbv5_d 0.1.1: target for configuration "library" is up to date.
dvb-tune ~master: target for configuration "application" is up to date.
To force a rebuild of up-to-date targets, run again with --force.
Running ./bin/dvb-tune /home/users/russel/lib/DigitalTelevision/DVBv5/uk-CrystalPalace__RW
Device: Silicon Labs Si2168, adapter  0, frontend  0
Program exited with code -11

(gdb) r ~/lib/DigitalTelevision/DVBv5/uk-CrystalPalace__RW
Starting program: /home/users/russel/Repositories/Git/Masters/Public/DVBTune/bin/dvb-tune ~/lib/DigitalTelevision/DVBv5/uk-CrystalPalace__RW
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Device: Silicon Labs Si2168, adapter  0, frontend  0

Program received signal SIGSEGV, Segmentation fault.
__GI___libc_free (mem=0xa) at malloc.c:3093
3093	malloc.c: No such file or directory.
(gdb)

Can anyone give me any hints as to where to start even getting a glimmer of an understanding of WTF is going on?

-- 
Russel.
===========================================
Dr Russel Winder      t: +44 20 7585 2200
41 Buckmaster Road    m: +44 7770 465 077
London SW11 1EN, UK   w: www.russel.org.uk
Jan 02 2019
parent reply Nicholas Wilson <iamthewilsonator hotmail.com> writes:
On Thursday, 3 January 2019 at 06:25:46 UTC, Russel Winder wrote:
 So I have a D program that used to work. I come back to it, 
 recompile it, and:

 [...]
 __GI___libc_free (mem=0xa) at malloc.c:3093
You've tried to free a pointer that, while not null, was derived from a pointer that was, i.e. an offset to a field of a struct. A backtrace would help a lot, otherwise it really is just guessing.
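To make the "offset from a null pointer" idea concrete, here is a minimal sketch. The `Record` struct is hypothetical (it is not taken from libdvbv5); its layout is chosen only so that one field sits at offset 10, matching the `mem=0xa` in the gdb output. The function computes the address a callee would see if it were handed the address of that field through a null struct pointer.

```d
// Hypothetical C-style record; the layout is chosen so that the
// `payload` field sits at offset 10 (0xa). Not taken from libdvbv5.
struct Record
{
    ubyte[10] tag;
    ubyte payload;
}

// The address a callee would receive if it were handed `&r.payload`:
// the base address of `r` plus the field offset. No memory is read.
size_t addressOfPayloadVia(const(Record)* r)
{
    return cast(size_t) r + Record.payload.offsetof;
}
```

Calling `addressOfPayloadVia(null)` yields 10, i.e. 0xa: not null, so a plain null check would not catch it, but still nowhere near valid memory. In gdb, `bt` at the point of the SIGSEGV shows which caller produced such a pointer.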
Jan 02 2019
parent reply Russel Winder <russel winder.org.uk> writes:
On Thu, 2019-01-03 at 07:52 +0000, Nicholas Wilson via Digitalmars-d-learn
wrote:
 On Thursday, 3 January 2019 at 06:25:46 UTC, Russel Winder wrote:
 So I have a D program that used to work. I come back to it,
 recompile it, and:

 [...]
 __GI___libc_free (mem=0xa) at malloc.c:3093
 You've tried to free a pointer that, while not null, was derived
 from a pointer that was, i.e. an offset to a field of a struct.
 A backtrace would help a lot, otherwise it really is just
 guessing.
Sorry about that, fairly obvious that the backtrace is needed in hindsight. :-)

[…] (dvb_file=0x5555555a1320) at dvb_file.d:282
[…] types.d:83
[…] channels.TransmitterData.__fieldDtor() (this=<error reading variable: Cannot access memory at address 0xa>) at channels.d:144
[…] (this=...) at channels.d:144

Which indicates that the destructor is being called before the instance has been constructed. Which is a real WTF.

-- 
Russel.
Jan 03 2019
parent reply Nicholas Wilson <iamthewilsonator hotmail.com> writes:
On Thursday, 3 January 2019 at 08:35:17 UTC, Russel Winder wrote:
 Sorry about that, fairly obvious that the backtrace is needed 
 in hindsight. :- )



 (dvb_file=0x5555555a1320) at dvb_file.d:282

 types.d:83

 channels.TransmitterData.__fieldDtor() (this=<error reading 
 variable: Cannot access memory at address 0xa>) at 
 channels.d:144

 (this=...) at channels.d:144


 Which indicates that the destructor is being called before the 
 instance has been constructed. Which is a real WTF.
Not quite. This occurs as a TransmitterData object goes out of scope at the end of main (stick an fflush'ed printf there to see).

TransmitterData is a struct that has no destructor defined but has a field of type File_Ptr that does. The compiler generates a destructor, __aggrDtor, which calls the destructors of the fields that have them, __fieldDtor (e.g. for the File_Ptr), which in turn calls that field's own destructor, File_Ptr.~this().

The solution to this is to either ensure it is initialised in the constructor of TransmitterData, or account for it possibly being null by defining a destructor for TransmitterData.
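A minimal sketch of the mechanism described above, under stated assumptions: `Handle` and `Owner` are hypothetical stand-ins for File_Ptr and TransmitterData, and `malloc`/`free` stand in for whatever the real wrapper acquires and releases. `Owner` declares no destructor, yet destroying it still runs `Handle.~this()`, so that destructor has to cope with a field that was never initialised.

```d
import core.stdc.stdlib : free, malloc;

int released; // counts real releases, for illustration only

// Hypothetical stand-in for File_Ptr: owns one C allocation.
struct Handle
{
    void* ptr;

    @disable this(this); // no copies, as in the thread

    ~this()
    {
        if (ptr is null)
            return; // never initialised: nothing to release
        free(ptr);
        ptr = null;
        ++released;
    }
}

// Hypothetical stand-in for TransmitterData. It declares no destructor,
// so the compiler generates one that destroys `handle`.
struct Owner
{
    Handle handle;
}

void demo()
{
    {
        Owner empty; // handle.ptr is null; the generated destructor still runs
    }
    {
        Owner full;
        full.handle.ptr = malloc(16);
    } // generated destructor calls Handle.~this(), which frees the block
}
```

After `demo()` the counter is 1: the first `Owner` goes through the same destructor path but the null check turns it into a no-op.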
Jan 03 2019
parent reply Russel Winder <russel winder.org.uk> writes:
On Thu, 2019-01-03 at 11:23 +0000, Nicholas Wilson via Digitalmars-d-learn
wrote:
 On Thursday, 3 January 2019 at 08:35:17 UTC, Russel Winder wrote:
 Sorry about that, fairly obvious that the backtrace is needed
 in hindsight. :- )

 (dvb_file=0x5555555a1320) at dvb_file.d:282

 types.d:83

 channels.TransmitterData.__fieldDtor() (this=<error reading
 variable: Cannot access memory at address 0xa>) at
 channels.d:144

 (this=...) at channels.d:144

 Which indicates that the destructor is being called before the
 instance has been constructed. Which is a real WTF.
 Not quite, this occurs as a TransmitterData object goes out of
 scope at the end of main(stick a fflush'ed printf there to see):
I am not sure this analysis is correct. The code never reaches the end of main.
 TransmitterData is a struct that has no destructor defined but
 has a field of type File_Ptr that does. The compiler generates a
 destructor, __aggrDtor, which calls the fields that have
 destructors, __fieldDtor (e.g. the File_Ptr) which in turn calls
 its destructor, File_Ptr.~this().
TransmitterData has a destructor defined but with no code in it. This used to work fine – but I cannot be certain which version of LDC that was.

The problem does seem to be in the construction of the TransmitterData object because a destructor is being called on the File_Ptr field as part of the TransmitterData constructor.

 The solution to this is to either ensure it is initialised in the
 constructor of TransmitterData, or account for it possibly being
 null by defining a destructor for TransmitterData.
For some reason it seems File_Ptr.~this() is being called before File_Ptr.this() in the TransmitterData.this(). This is totally weird. Having added some writeln statements:

(gdb) bt
[…] dvb_file.d:276
[…] types.d:83
[…] _D3std6format__T14formattedWriteTSQBg5stdio4File17LockingTextWriterTaTS5types8File_PtrZQCtFKQChxAaQBcZk (w=..., fmt=..., _param_2=...) at /usr/lib/ldc/x86_64-linux-gnu/include/d/std/format.d:472
[…] ZQBeMFQBcQBbaZv (this=..., _param_0=..., _param_1=..., _param_2=10 '\n') at channels.d:1586
[…] QzQxZv (_param_0=..., _param_1=...) at channels.d:3917
[…] libdvbv5_d8dvb_file16dvb_file_formatsZSQDiQDc (this=..., path=..., delsys=0, format=libdvbv5_d.dvb_file.dvb_file_formats.FILE_DVBV5) at channels.d:143

-- 
Russel.
Jan 04 2019
parent reply Nicholas Wilson <iamthewilsonator hotmail.com> writes:
On Saturday, 5 January 2019 at 07:34:17 UTC, Russel Winder wrote:
 TransmitterData has a destructor defined but with no code in 
 it. This used to work fine – but I cannot be certain which 
 version of LDC that was.

 The problem does seem to be in the construction of the 
 TransmitterData object because a destructor is being called on 
 the File_Ptr field as part of the transmitterData constructor.


 The solution to this is to either ensure it is initialised in 
 the constructor of TransmitterData, or account for it possibly 
 being null by defining a destructor for TransmitterData.
 For some reason it seems File_Ptr.~this() is being called before
 File_Ptr.this() in the TransmitterData.this(). This is totally weird.
 Having added some writeln statements:

 (gdb) bt
 […] dvb_file.d:276
 […] types.d:83
 […] _D3std6format__T14formattedWriteTSQBg5stdio4File17LockingTextWriterTaTS5types8File_PtrZQCtFKQChxAaQBcZk (w=..., fmt=..., _param_2=...) at /usr/lib/ldc/x86_64-linux-gnu/include/d/std/format.d:472
Maybe it is a problem with copying a File_Ptr (e.g. missing an increase of the reference count)? Like, `auto a = File_Ptr(); { auto b = a; }` and b calls the destructor on scope exit. That would be consistent with having problems copying the object to pass to writeln.
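The `auto a = ...; { auto b = a; }` scenario can be sketched as follows. `NaivePtr` is a hypothetical wrapper, not the real File_Ptr: it has a destructor but neither a postblit nor a reference count, so a copy is just a bitwise copy of the pointer. A counter replaces the real release call so the sketch is safe to run.

```d
int destructorRuns;

// Hypothetical wrapper with a destructor but no postblit or reference
// count, so a copy is a plain bitwise copy of the pointer.
struct NaivePtr
{
    int* target;

    ~this()
    {
        if (target !is null)
            ++destructorRuns; // a real wrapper would free `target` here
    }
}

void demo()
{
    int resource;
    auto a = NaivePtr(&resource);
    {
        auto b = a; // b shares a.target
    } // b is destroyed: first "release"
} // a is destroyed: second "release" of the same pointer
```

After `demo()` the counter is 2 for a single resource. With a real `free` in the destructor that would be a double free, which is why such wrappers usually either disable copying or count references.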
Jan 05 2019
next sibling parent reply Russel Winder <russel winder.org.uk> writes:
On Sat, 2019-01-05 at 10:31 +0000, Nicholas Wilson via Digitalmars-d-learn
wrote:
[…]
 Maybe it is a problem with copying a File_Ptr (e.g. missing an
 increase of the reference count)? Like, `auto a = File_Ptr(); {
 auto b = a; }` and b calls the destructor on scope exit.
 That would be consistent with having problems copying the object
 to pass to writeln.
I found the problem and then two minutes later read your email and bingo we have found the problem.

Previously I had used File_Ptr* and on this occasion I was using File_Ptr and there was no copy constructor because I have @disable this(this). Except that clearly copying a value is not copying a value in this case. Clearly this situation is what is causing the destructor to be called on an unconstructed value. But I have no idea why.

The question now, of course, is should I have been using File_Ptr instead of File_Ptr* in the first place. I am beginning to think I should have been. More thinking needed.

-- 
Russel.
Jan 05 2019
parent reply Nicholas Wilson <iamthewilsonator hotmail.com> writes:
On Saturday, 5 January 2019 at 10:52:48 UTC, Russel Winder wrote:
 I found the problem and then two minutes later read your email 
 and bingo we have found the problem.
Well done.
 Previously I had used File_Ptr* and on this occasion I was 
 using File_Ptr and there was no copy constructor because I have 
 @disable this(this). Except that clearly copying a value is not 
 copying a value in this case. Clearly this situation is what is 
 causing the destructor to be called on an unconstructed value. 
 But I have no idea why.
Could you post a minimised example? It's a bit hard to guess without one.
 The question now, of course, is should I have been using 
 File_Ptr instead of File_Ptr* in the first place. I am 
 beginning to think I should have been. More thinking needed.
From the name, File_Ptr sounds like it is wrapping a reference to a resource. So compare with C's FILE/ D's File which is a reference counted wrapper of a FILE*. Would you ever use a File* (or a FILE**)? Probably not, I never have.
Jan 05 2019
parent reply Russel Winder <russel winder.org.uk> writes:
On Sat, 2019-01-05 at 11:30 +0000, Nicholas Wilson via Digitalmars-d-learn
wrote:
[…]
 Could you post a minimised example? Its a bit hard to guess
 without one.
Indeed. I should do that to see if I can reproduce the problem to submit a proper bug report.

[…]
 From the name, File_Ptr sounds like it is wrapping a reference to
 a resource. So compare with C's FILE/ D's File which is a
 reference counted wrapper of a FILE*. Would you ever use a File*
 (or a FILE**)? Probably not, I never have.
File_Ptr is wrapping a dvb_file * from libdvbv5 to try and make things a bit for D and to ensure RAII. libdvbv5 is a C API with classic C approach to handling objects and data structures.

My DStep/with manual binding is at https://github.com/russel/libdvbv5_d and the application using it (which is causing the problems) is at https://github.com/russel/DVBTune

I have a feeling that I am really not doing things in a D idiomatic way.

-- 
Russel.
Jan 05 2019
parent reply Nicholas Wilson <iamthewilsonator hotmail.com> writes:
On Saturday, 5 January 2019 at 12:14:15 UTC, Russel Winder wrote:
 Indeed. I should do that to see if I can reproduce the problem 
 to submit a proper bug report.

 File_Ptr is wrapping a dvb_file * from libdvbv5 to try and make 
 things a bit for D and to ensure RAII. libdvbv5 is a C API with 
 classic C approach to handling objects and data structures.

 My DStep/with manual binding is at 
 https://github.com/russel/libdvbv5_d and the application using 
 it (which is causing the problems) is at 
 https://github.com/russel/DVBTune
Your problem possibly (probably?) stems from

	auto channelsData = TransmitterData(args[1]).scan(frontendId);

The temporary TransmitterData(args[1]) is, well, temporary and its destructor runs after that expression is done. As the returned object from scan references data from the temporary, you have a stale pointer.
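The lifetime of such a temporary can be observed with a small sketch. `Scanner` is a hypothetical type (it is not TransmitterData) that records when its method and destructor run; the point is only the ordering.

```d
string[] events; // records the order in which things happen

// Hypothetical type; stands in for any struct with a destructor.
struct Scanner
{
    int id;

    ~this() { events ~= "destroyed"; }

    int scan()
    {
        events ~= "scan";
        return id * 2;
    }
}

int demo()
{
    auto result = Scanner(21).scan(); // temporary lives only for this statement
    events ~= "next statement";
    return result;
}
```

After `demo()`, `events` holds "scan", "destroyed", "next statement" in that order: the temporary is gone before the next statement runs. Returning a plain value, as here, is fine; anything returned that still points into the temporary would be stale from that point on.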
 I have a feeling that I am really not doing things in a D 
 idiomatic way.
Some driveby style comments then:
	bool opEquals()(const FrontendId other) const {
		if (this is other) return true;
		if (other is null) return false;
		return this.adapter_number == other.adapter_number &&
			this.frontend_number == other.frontend_number;
	}
The compiler generated default opEquals will do basically the same thing. Ditto for the other types. You usually want to take a const ref for opEquals since there is no point copying it.
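As a quick check of the default behaviour, here is a sketch of the struct with the hand-written opEquals simply removed. The field names follow the snippet above; the `uint` field types and the `sameFrontend` helper are assumptions for illustration.

```d
// Field names follow the thread; the `uint` types are an assumption.
struct FrontendId
{
    uint adapter_number;
    uint frontend_number;
    // No opEquals here: `==` falls back to a field-by-field comparison.
}

// Hypothetical helper, only there to exercise `==`.
bool sameFrontend(FrontendId a, FrontendId b)
{
    return a == b;
}
```

`sameFrontend(FrontendId(0, 1), FrontendId(0, 1))` is true and `sameFrontend(FrontendId(0, 1), FrontendId(1, 1))` is false, with no user-written comparison code.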
 if (other is null)
I'm surprised the compiler doesn't warn or error on that as the only way that could make sense would be if it had an alias this to a pointer type.

You should consider reference counting your pointer wrapper types, FrontendParameters_Ptr/File_Ptr/ScanHandler_Ptr.

You seem to like const, good! You don't need to take `const int`s as parameters, you're getting a copy anyway. You have a bunch of redundant casts as well.

I'll have another look tomorrow when I'm a bit more awake.
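On reference counting the wrapper types: one possible route, which may or may not suit these particular wrappers, is Phobos's `std.typecons.RefCounted`. The sketch below uses a hypothetical `Payload` with an `int` standing in for the C handle; nothing here is taken from the libdvbv5 binding.

```d
import std.typecons : RefCounted;

// Hypothetical payload standing in for a C handle.
struct Payload
{
    int fd = -1;

    this(int fd) { this.fd = fd; }

    ~this() { fd = -1; } // a real wrapper would release the handle here
}

alias SharedHandle = RefCounted!Payload;

// Returns the reference count observed while two handles are alive.
size_t countWhileCopied()
{
    auto a = SharedHandle(3);
    auto b = a; // bumps the count instead of duplicating the payload
    return a.refCountedStore.refCount;
}
```

`countWhileCopied()` returns 2: both variables share one payload, and the payload's destructor is left to run when the last of them goes away.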
Jan 05 2019
parent reply Russel Winder <russel winder.org.uk> writes:
On Sat, 2019-01-05 at 13:14 +0000, Nicholas Wilson via Digitalmars-d-learn
wrote:
[…]
 Your problem possibly (probably?) stems from

 auto channelsData = TransmitterData(args[1]).scan(frontendId);

 The temporary TransmitterData(args[1]) is, well, temporary and
 its destructor runs after that expression is done. As the
 returned object from scan references data from the temporary, you
 have a stale pointer.
Actually that is not a worry since the TransmitterData instance is only needed to call the scan function which creates a ChannelsData instance that holds no references to the TransmitterData instance.

It turns out that whilst the code used to run, and now doesn't, all the things we have been talking of are nothing to do with the core problem. It turns out that the function call to initialise the File_Ptr object is returning a valid object with invalid data. Thus the unknown change is likely in the libdvbv5 library either due to the C API doing different things or the adapter created using DStep doing the wrong thing.
 I have a feeling that I am really not doing things in a D
 idiomatic way.
 Some driveby style comments then:
 	bool opEquals()(const FrontendId other) const {
 		if (this is other) return true;
 		if (other is null) return false;
 		return this.adapter_number == other.adapter_number &&
 			this.frontend_number == other.frontend_number;
 	}
 The compiler generated default opEquals will do basically the
 same thing. Ditto for the other types. You usually want to take a
 const ref for opEquals since there is no point copying it.
I deleted them, added a test or three (should have done this ages ago) and the tests pass. So the generated methods do the needful. Thanks for that "heads up".
 if (other is null)
 I'm surprised the compiler doesn't warn or error on that as the
 only way that could make sense would be if it had an alias this
 to a pointer type.

 You should consider reference counting your pointer wrapper
 types, FrontendParameters_Ptr/File_Ptr/ScanHandler_Ptr
Very true. For now I have forbidden copying, which is wrong for this sort of thing. If D had the equivalent of C++ std::shared_ptr or Rust std::rc::Rc or std::rc::Arc, that would be the way forward. But I guess having explicit reference counting is not too hard.
 You seem to like const, good! You don't need to take `const int`s
 as parameters, you're getting a copy anyway. You have a bunch of
 redundant casts as well.
I am a person who always makes Java variables final, and loves that Rust variables are immutable by default!

-- 
Russel.
Jan 08 2019
parent reply Nicholas Wilson <iamthewilsonator hotmail.com> writes:
On Tuesday, 8 January 2019 at 10:23:30 UTC, Russel Winder wrote:
 Actually that is not a worry since the TransmitterData instance 
 is only needed to call the scan function which creates a 
 ChannelsData instance that holds no references to the 
 TransmitterData instance.

 It turns out that whilst the code used to run, and now doesn't,
 all the things we have been talking of are nothing to do with
 the core problem. It turns out that the function call to
 initialise the File_Ptr object is returning a valid object with
 invalid data. Thus the unknown change is likely in the libdvbv5
 library either due to the C API doing different things or the
 adapter created using DStep doing the wrong thing.
Ahh. Good that you've found that, I can't help you much more with that then.
 The compiler generated default opEquals will do basically the 
 same thing. Ditto for the other types. You usually want to 
 take a const ref for opEquals since there is no point copying 
 it.
 I deleted them, added a test or three (should have done this ages
 ago) and the tests pass. So the generated methods do the needful.
 Thanks for that "heads up".
No problems, less code is good code.
 Very true. For now I have forbidden copying, which is wrong for 
 this sort of thing. If D had the equivalent of C++ 
 std::shared_ptr or Rust std::rc::Rc or std::rc::Arc, that would 
 be the way forward But I guess having explicit reference 
 counting is not too hard.
I believe Andrei and Razvan are working on that, part of that being the Copy Constructor DIP. Hopefully it will arrive soon.
 You seem to like const, good! You don't need to take `const 
 int`s as parameters, you're getting a copy anyway. You have a 
 bunch of redundant casts as well.
 I am a person who always makes Java variables final, and loves
 that Rust variables are immutable by default!
Indeed, less to think about is always nice.

Good luck figuring out why your data is dud.

Nic
Jan 08 2019
parent reply Russel Winder <russel winder.org.uk> writes:
On Tue, 2019-01-08 at 11:51 +0000, Nicholas Wilson via Digitalmars-d-learn
wrote:
[…]
 Ahh. Good that you've found that, I can't help you much more with
 that then.
Indeed. :-)

Your help to get to this point though has been invaluable. Thank you for putting in the time and effort.

[…]
 Good luck figuring out why your data is dud.
It really is totally weird. My new Rust binding to libdvbv5 and associated version of the same application works fine. So libdvbv5 itself is not the culprit. This has to mean it is something about the D compilers that has changed the way the D binding to libdvbv5 behaves.

If only the D plugin to CLion were much further down the road this would be much easier to fix. I had an issue in the Rust and it was fixed in a couple of minutes because of the way CLion drives GDB for you. Using GDB manually is […] of giving up on the D version.

In an ideal world JetBrains would take over the D plugin, but that isn't going to happen – unlike what happened for Go and Rust. What the D plugin needs is some full time workers: the great work by the current volunteers is slow progress by nature of it being volunteer effort by a few people.

-- 
Russel.
Jan 09 2019
next sibling parent reply Johannes Loher <johannes.loher fg4f.de> writes:
On Wednesday, 9 January 2019 at 16:48:47 UTC, Russel Winder wrote:
 On Tue, 2019-01-08 at 11:51 +0000, Nicholas Wilson via 
 Digitalmars-d-learn wrote:
 [...]
[…]
 [...]
[...]
If debugger integration is that important to you, you might want to try out visual studio code with the corresponding plugins (you need a separate plugin for debugger support). I found it to work quite decently.
Jan 09 2019
parent Russel Winder <russel winder.org.uk> writes:
On Wed, 2019-01-09 at 20:03 +0000, Johannes Loher via Digitalmars-d-learn
wrote:
[…]
 If debugger integration is that important to you, you might want
 to try out visual studio code with the corresponding plugins (you
 need a separate plugin for debugger support). I found it to work
 quite decently.
Or I could stop doing D and Rust programming and switch to Kotlin and Java and get stuck into helping the IntelliJ-DLanguage team as I said I would a couple of years ago.

-- 
Russel.
Jan 09 2019
prev sibling parent reply Nicholas Wilson <iamthewilsonator hotmail.com> writes:
On Wednesday, 9 January 2019 at 16:48:47 UTC, Russel Winder wrote:
 It really is totally weird. My new Rust binding to libdvbv5 and 
 associated version of the same application works fine. So 
 libdvbv5 itself is not the culprit. This has to mean it is 
 something about the D compilers that has changed the way the D 
 binding to libdvbv5 behaves.
Hmm, if you think the binding could be the problem you could try using app as an alternative, see if it makes any difference.
Jan 09 2019
parent Russel Winder <russel winder.org.uk> writes:
On Thu, 2019-01-10 at 07:36 +0000, Nicholas Wilson via Digitalmars-d-learn
wrote:
[…]
 Hmm, if you think the binding could be the problem you could try
 using app as an alternative, see if it makes any difference.
I did a proper update of the generated files of the binding, and magically everything works again. I do not recollect any change being significant, but clearly something was.

So the problem is that I had failed to notice the update of libdvbv5 with a breaking change and update the binding accordingly.

Of course I got lots of other good stuff done with the code by having this problem and people such as yourself commenting. Big win all round. :-)

Thanks again for helping on this.

-- 
Russel.
Jan 10 2019
prev sibling parent reply Russel Winder <russel winder.org.uk> writes:
On Sat, 2019-01-05 at 10:52 +0000, Russel Winder wrote:
 On Sat, 2019-01-05 at 10:31 +0000, Nicholas Wilson via Digitalmars-d-learn
 wrote:
 […]
 Maybe it is a problem with copying a File_Ptr (e.g. missing an
 increase of the reference count)? Like, `auto a = File_Ptr(); {
 auto b = a; }` and b calls the destructor on scope exit.
 That would be consistent with having problems copying the object
 to pass to writeln.
 I found the problem and then two minutes later read your email and bingo we
 have found the problem.

 Previously I had used File_Ptr* and on this occasion I was using File_Ptr
 and there was no copy constructor because I have @disable this(this). Except
 that clearly copying a value is not copying a value in this case. Clearly this
 situation is what is causing the destructor to be called on an unconstructed
 value. But I have no idea why.

 The question now, of course, is should I have been using File_Ptr instead of
 File_Ptr* in the first place. I am beginning to think I should have been.
 More thinking needed.
Switching to using File_Ptr* I now get the SIGSEGV at the end of main as you […]

This code used to work. :-(

-- 
Russel.
Jan 05 2019
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 1/5/19 6:33 AM, Russel Winder wrote:
 On Sat, 2019-01-05 at 10:52 +0000, Russel Winder wrote:
 On Sat, 2019-01-05 at 10:31 +0000, Nicholas Wilson via Digitalmars-d-learn
 wrote:
 […]
 Maybe it is a problem with copying a File_Ptr (e.g. missing an
 increase of the reference count)? Like, `auto a = File_Ptr(); {
 auto b = a; }` and b calls the destructor on scope exit.
 That would be consistent with having problems copying the object
 to pass to writeln.
 I found the problem and then two minutes later read your email and
 bingo we have found the problem.

 Previously I had used File_Ptr* and on this occasion I was using
 File_Ptr and there was no copy constructor because I have
 @disable this(this). Except that clearly copying a value is not
 copying a value in this case. Clearly this situation is what is
 causing the destructor to be called on an unconstructed value.
 But I have no idea why.

 The question now, of course, is should I have been using File_Ptr
 instead of File_Ptr* in the first place. I am beginning to think I
 should have been. More thinking needed.
 Switching to using File_Ptr* I now get the SIGSEGV at the end of main as you
 This code used to work. :-(
Russel, make sure your destructor both checks whether the underlying resource is set, and clears it to invalid when freeing it.

Even types that can't be copied can be moved, or temporarily created as rvalues. When they are moved the shell they get moved out of is still destructed! So it has to have a state where it can be destroyed, even though there is no resource.

Maybe some inspiration here: https://github.com/MartinNowak/io/blob/master/src/std/io/file.d#L189-L196

-Steve
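The "check, then clear" destructor and the moved-from shell can be sketched like this. `Unique` is a hypothetical non-copyable wrapper (it is not the code behind the link above), `resource` stands in for the C handle, and a counter replaces the real release call. `std.algorithm.mutation.move` is used to force a move.

```d
import std.algorithm.mutation : move;

int released; // counts real releases, for illustration only

// Hypothetical non-copyable wrapper; `resource` stands in for a C handle.
struct Unique
{
    int* resource;

    @disable this(this);

    ~this()
    {
        if (resource is null)
            return; // moved-from or never set: nothing to do
        ++released; // a real wrapper would call the C release function here
        resource = null; // clear so a second destruction is harmless
    }
}

void demo()
{
    int handle;
    auto a = Unique(&handle);
    auto b = move(a); // a is reset to Unique.init, b now owns the handle
    assert(a.resource is null);
} // both a and b are destroyed here; only b releases anything
```

After `demo()` the counter is 1 even though two destructors ran. Without the null check the destructor of the emptied shell `a` would hand a null (or garbage) handle to the C library.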
Jan 08 2019
parent reply Russel Winder <russel winder.org.uk> writes:
On Tue, 2019-01-08 at 09:59 -0500, Steven Schveighoffer via Digitalmars-d-
learn wrote:
[…]
 Russel, make sure your destructor both checks whether the underlying
 resource is set, and clears it to invalid when freeing it.

 Even types that can't be copied can be moved, or temporarily created as
 rvalues. When they are moved the shell they get moved out of is still
 destructed! So it has to have a state where it can be destroyed, even
 though there is no resource.
I have added tests in the destructor but given the constructor should throw an exception on a failure to initialise the internal state correctly, it really ought to be unnecessary. But I guess it can't hurt being there!

As I noted to Nicholas it seems the application is getting a valid data structure returned with invalid data and that is where the SIGSEGV is. This is really weird as I have just finished a Rust version of the same application and it works fine. And this D version used to work fine. It is a real mystery why there is a problem now.

Sadly the D plugin to CLion doesn't as yet have the same functionality as the Rust plugin. Debugging these sorts of thing is just so much better in CLion than trying to work GDB manually.
 Maybe some inspiration here:
 https://github.com/MartinNowak/io/blob/master/src/std/io/file.d#L189-L196
I will check that out, thanks for the pointer.

-- 
Russel.
Jan 09 2019
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 1/9/19 11:39 AM, Russel Winder wrote:
 On Tue, 2019-01-08 at 09:59 -0500, Steven Schveighoffer via Digitalmars-d-
 learn wrote:

 […]
 Russel, make sure your destructor both checks whether the underlying
 resource is set, and clears it to invalid when freeing it.

 Even types that can't be copied can be moved, or temporarily created as
 rvalues. When they are moved the shell they get moved out of is still
 destructed! So it has to have a state where it can be destroyed, even
 though there is no resource.
 I have added tests in the destructor but given the constructor should
 throw an exception on a failure to initialise the internal state
 correctly, it really ought to be unnecessary. but I guess it cant hurt
 being there!
The point is that some libraries are not robust enough to handle freeing data multiple times. And with the way postblit/dtors work, you have to handle this properly in D.
 As I noted to Nicholas it seems the application is getting a valid data
 structure returned with invalid data and that is where the SIGSEGV is. This is
 really weird as I have just finished a Rust version of the same application
 and it works fine. And this D version used to work fine. It is a real mystery
 why there is a problem now.
Hm... your description of having the problem happen at the end of main seems to suggest it has something to do with destruction.

-Steve
Jan 10 2019
parent reply Russel Winder <russel winder.org.uk> writes:
On Thu, 2019-01-10 at 10:00 -0500, Steven Schveighoffer via Digitalmars-d-
learn wrote:
[…]
 Hm... your description of having the problem happen at the end of main
 seems to suggest it has something to do with destruction.
It seems that there was a change in one file of libdvbv5 1.14.x → 1.16.y that introduced a breaking change wrt the D binding. I did a regeneration using DStep, didn't notice anything significant, and yet everything now works again. So it was very significant.

The underlying problem here was that I had failed to notice the upgrade of libdvbv5!

-- 
Russel.
Jan 10 2019
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 1/10/19 12:30 PM, Russel Winder wrote:
 On Thu, 2019-01-10 at 10:00 -0500, Steven Schveighoffer via Digitalmars-d-
 learn wrote:
 […]
 Hm... your description of having the problem happen at the end of main
 seems to suggest it has something to do with destruction.
 It seems that there was a change in one file of libdvbv5 1.14.x → 1.16.y
 that introduced a breaking change wrt the D binding. I did a regeneration
 using DStep, didn't notice anything significant, and yet everything now
 works again. So it was very significant.

 The underlying problem here was that I had failed to notice the upgrade
 of libdvbv5!
That is one problem with linking against C or C++ code -- changes to certain things (e.g. struct layout) don't change the mangling.

You may want to consider using dpp instead if possible.

-Steve
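Since the linker will not notice a layout change, one cheap guard is a compile-time check inside the binding. This is only a sketch: `c_record` and its fields are invented (they are not libdvbv5's), and the expected numbers are assumed to have been obtained from the C side, for example from a tiny C program printing `sizeof` and `offsetof` against the installed header.

```d
// Hypothetical binding declaration; the fields are invented, not libdvbv5's.
extern (C) struct c_record
{
    uint id;
    uint flags;
    ushort kind;
}

// Expected values would be copied from the C side (assumed here), e.g.
// printed by a small C program using sizeof/offsetof on the real header.
enum expectedSize = 12;
enum expectedKindOffset = 8;

static assert(c_record.sizeof == expectedSize,
    "c_record size no longer matches the C header");
static assert(c_record.kind.offsetof == expectedKindOffset,
    "c_record.kind is no longer where the C header puts it");
```

On its own this only catches drift in the D declaration. It catches an upstream change as well only if the expected numbers are regenerated from the header whenever the library is updated.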
Jan 10 2019
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Jan 10, 2019 at 01:09:22PM -0500, Steven Schveighoffer via
Digitalmars-d-learn wrote:
 On 1/10/19 12:30 PM, Russel Winder wrote:
 On Thu, 2019-01-10 at 10:00 -0500, Steven Schveighoffer via Digitalmars-d-
 learn wrote:
 […]
 Hm... your description of having the problem happen at the end of
 main seems to suggest it has something to do with destruction.
 
It seems that there was a change in one file of libdvbv5 1.14.x → 1.16.y that introduced a breaking change wrt the D binding. I did a regeneration using DStep, didn't notice anything significant, and yet everything now works again. So it was very significant. The underlying problem here was that I had failed to notice the upgrade of libdvbv5!
That is one problem with linking against C or C++ code -- changes to certain things (e.g. struct layout) don't change the mangling.
Yeah, this is the same problem with shared library soname versioning on Posix. Technically every time the ABI changes the version must be bumped, but since this is not automated, it's prone to human error, or rather, negligence.

It makes one wonder if there should somehow be a way of encapsulating the changes to the ABI in a way that can be automatically checked. (It has to be automatic, otherwise it would be too onerous and nobody would do it in practice.) The most obvious way is to mangle the field types of the struct as part of the struct's mangled name, though this does introduce a lot of symbol bloat (and may need another round of ridiculously-long symbol names that need some manner of compression to keep under control). Barring that, perhaps some kind of low-collision hash of the struct contents where the kind of small changes that tend to happen in code will be highly unlikely to collide, so any such changes will be easily detected. If one were paranoid, one could use cryptographic hashes for pretty much guaranteeing uniqueness, but that'd be total overkill.

OTOH, perhaps the more pertinent issue here is that the bindings were generated *manually as a separate step* outside of the build system. Ideally, you'd automate the generation of bindings as part of your build, so that they will *always* be up-to-date. I'm a big fan of automation, because this is the kind of tedious housekeeping that humans are really, really good at forgetting and/or screwing up.

(Side-note: and this is why I insist that my build systems must support generic dependency building. All these sorts of tasks *need* to be part of the build rather than done by hand, precisely to prevent these sorts of time-wasting, head-scratching mishaps.)
 You may want to consider using dpp instead if possible.
[...] Or this. Which is essentially equivalent to automatically generating bindings. T -- Little children, little troubles.
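The "low-collision hash of the struct contents" idea can be sketched in D with compile-time reflection. This is only an illustration under stated assumptions: `layoutOf`, `fnv1a`, `ParamsOld` and `ParamsNew` are hypothetical names, FNV-1a is simply one convenient non-cryptographic hash, and the sketch covers only the D side. A real check would also need the C side to publish a matching fingerprint, which nothing in this thread says libdvbv5 does.

```d
import std.conv : to;
import std.stdio : writefln;

// Builds a textual description of a struct's layout at compile time:
// total size, then name, type and byte offset of every field.
string layoutOf(T)()
{
    string s = to!string(T.sizeof);
    static foreach (i; 0 .. T.tupleof.length)
    {
        s ~= ";" ~ __traits(identifier, T.tupleof[i])
           ~ ":" ~ typeof(T.tupleof[i]).stringof
           ~ "@" ~ to!string(T.tupleof[i].offsetof);
    }
    return s;
}

// 64-bit FNV-1a over the bytes of the description string.
ulong fnv1a(string s)
{
    ulong h = 0xcbf29ce484222325UL; // FNV offset basis
    foreach (char c; s)
    {
        h ^= c;
        h *= 0x100000001b3UL;       // FNV prime, wraps modulo 2^64
    }
    return h;
}

// Hypothetical old and new layouts of the same C struct.
struct ParamsOld { int frequency; void* buffer; }
struct ParamsNew { int frequency; int bandwidth; long flags; void* buffer; }

// The descriptions differ, so a stale binding can be caught at build time.
static assert(layoutOf!ParamsOld != layoutOf!ParamsNew);

void main()
{
    // Fingerprints are evaluated at compile time via `enum`.
    enum oldHash = fnv1a(layoutOf!ParamsOld);
    enum newHash = fnv1a(layoutOf!ParamsNew);
    writefln("old: %s -> %016x", layoutOf!ParamsOld, oldHash);
    writefln("new: %s -> %016x", layoutOf!ParamsNew, newHash);
}
```

Comparing the description strings directly is exact. The hash matters only if the fingerprint has to be compact, for example when embedded in a symbol name or a version constant.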
Jan 10 2019
prev sibling parent reply Russel Winder <russel winder.org.uk> writes:
On Thu, 2019-01-10 at 13:09 -0500, Steven Schveighoffer via Digitalmars-d-learn wrote:
[…]
 That is one problem with linking against C or C++ code -- changes to
 certain things (e.g. struct layout) don't change the mangling.
I am having nightmares trying to decide what to do with the Rust version based around generate on demand or on version change. With bindgen in Rust though, there is no need for manual tweaking so automated is possible. Except that it puts a massive dependency burden on any project using it.
DStep generated bindings tend to need some manual tweaking that cannot be automated, which is surprising given that bindgen can do things without manual intervention for Rust.
 You may want to consider using dpp instead if possible.
DPP cannot build "out of the box" on Debian Sid, so I have not actually tried it.
There are three audiences here:
1. People building libraries for their own use on their own machines.
2. People building for OS distributions.
3. People building to distribute to others who will not be building for themselves.
Categories 1 and 2 could probably cope with automated generation despite the huge dependency burden, assuming there are no version conflicts – this seems to be a massive unsolved problem with Rust/Cargo/crates.io :-( . Category 3 needs as light a weight and speed of build as possible, and the ability to dynamically adapt to the APIs and ABIs of execution at run time.
On the other hand, I suspect I may be the sole user of Rust and D bindings to libdvbv5!
-- Russel.
Jan 10 2019
parent reply Jacob Carlborg <doob me.com> writes:
On 2019-01-11 06:31, Russel Winder wrote:

 DStep generated bindings tend to need some manual tweaking that cannot be
 automated, which is surprising given that bindgen can do things without manual
 intervention for Rust.
It's not surprising at all. Different tools, different approaches, different amount of man power, different output languages and so on. Please file any bugs and/or enhancement requests that you find. -- /Jacob Carlborg
Jan 13 2019
parent reply Russel Winder <russel winder.org.uk> writes:
On Sun, 2019-01-13 at 21:56 +0100, Jacob Carlborg via Digitalmars-d-learn wrote:
 On 2019-01-11 06:31, Russel Winder wrote:
 DStep generated bindings tend to need some manual tweaking that cannot be
 automated, which is surprising given that bindgen can do things without
 manual intervention for Rust.
 It's not surprising at all. Different tools, different approaches,
 different amount of man power, different output languages and so on.
Indeed. Apologies for any aggressiveness that I might have appeared to set out, it was not intended. It's really just frustration.
To balance things out a bit: DStep does some stuff that Bindgen just doesn't even tackle. I am finding I am having to do some manual stuff with Bindgen.
 Please file any bugs and/or enhancement requests that you find.
Wilco. But I'll try and create test cases rather than just point at the full project.
-- Russel.
Jan 14 2019
parent Jacob Carlborg <doob me.com> writes:
On 2019-01-14 15:07, Russel Winder wrote:

 Wilco. But I'll try and create test cases rather than just point at the full
 project.
Reduced test cases are definitely appreciated. Unfortunately I don't use DStep enough to find all bugs and enhancements. Hopefully that will change. -- /Jacob Carlborg
Jan 14 2019