digitalmars.D - ESA's Schiaparelli Mars probe crashed because of integer overflow
- qznc (5/16) Nov 24 2016 That is why we need CheckedInt, folks. Reminder End. ;)
- Timon Gehr (5/18) Nov 24 2016 I don't think overflow is what happened. Rather, the statistical model
- Patrick Schluter (4/38) Nov 24 2016 Hey, sounds suspiciously similar to Ariane 5 explosion. Does ESA
- Alix Pexton (4/32) Nov 25 2016 I thought Ariane was caused by errorcodes from one module being sent on
- Patrick Schluter (10/58) Nov 25 2016 Nope it was an overflowing down cast
- Kagamin (4/10) Nov 28 2016 The mistake was that hardware was upgraded, but software and
- Timon Gehr (5/14) Nov 25 2016 I don't think we have enough information to judge, but remember that
- Claude (12/15) Nov 25 2016 Well, from the little information we have, I suppose we can only
- Walter Bright (14/21) Nov 25 2016 I'd like to know what really happened with the code.
- deadalnix (3/10) Nov 26 2016 You got a great teacher right there !
- Walter Bright (14/24) Nov 26 2016 It was actually institute policy, not an individual teacher's. Another p...
- Shachar Shemesh (29/35) Nov 26 2016 My experience is slightly different. More accurately, I think your
- deadalnix (3/3) Nov 26 2016 I can confirm. i know some people in the car industry and that
- lobo (18/62) Nov 27 2016 My real world experience differs from yours but probably it comes
- Era Scarecrow (5/12) Nov 27 2016 With them pushing self-driving cars, if that gets off the ground
- Walter Bright (4/7) Nov 27 2016 Frankly, Google needs to hire some engineers from the aviation industry,...
Although the article [0] does not say so literally, it sounds like an integer overflow:

> After trawling through mountains of data, the European Space Agency said Wednesday that while much of the mission went according to plan, a computer that measured the rotation of the lander hit a maximum reading, knocking other calculations off track.
>
> That led the navigation system to think the lander was much lower than it was, causing its parachute and braking thrusters to be deployed prematurely.
>
> "The erroneous information generated an estimated altitude that was negative - that is, below ground level," the ESA said in a statement.

That is why we need CheckedInt, folks. Reminder End. ;)

[0] http://phys.org/news/2016-11-glitch-blamed-european-mars-lander.html
Nov 24 2016
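To make the CheckedInt point above concrete, here is a minimal Python sketch of the difference between wrapping and checked arithmetic. It is not D's CheckedInt and not anything from the lander; the 16-bit width and the function names `wrapping_add_i16` and `checked_add_i16` are assumptions chosen purely for illustration.

```python
# Hypothetical 16-bit signed range, chosen only for illustration.
INT16_MIN, INT16_MAX = -2**15, 2**15 - 1


def wrapping_add_i16(a: int, b: int) -> int:
    """Add the way unchecked 16-bit two's-complement arithmetic would."""
    # Shift into 0..65535, wrap with modulo, shift back to the signed range.
    return (a + b - INT16_MIN) % 2**16 + INT16_MIN


def checked_add_i16(a: int, b: int) -> int:
    """Add, but raise instead of silently wrapping around."""
    result = a + b
    if not INT16_MIN <= result <= INT16_MAX:
        raise OverflowError(f"{a} + {b} does not fit in 16 bits")
    return result


if __name__ == "__main__":
    print(wrapping_add_i16(INT16_MAX, 1))  # wraps to the most negative value
    try:
        checked_add_i16(INT16_MAX, 1)
    except OverflowError as exc:
        print("caught:", exc)
```

The checked variant turns a silently wrong number into an error the caller has to deal with, which is the whole argument for a checked integer type.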
On 24.11.2016 20:49, qznc wrote:
> Although, the article [0] does not say that literally, it sounds like an integer overflow:

I don't think overflow is what happened. Rather, the statistical model they used to filter the sensor data didn't match reality. It put too much trust into a malfunctioning sensor -- I assume the sensor readings were extremely implausible.
Nov 24 2016
On Thursday, 24 November 2016 at 20:22:00 UTC, Timon Gehr wrote:
> I don't think overflow is what happened. Rather, the statistical model they used to filter the sensor data didn't match reality. It put too much trust into a malfunctioning sensor -- I assume the sensor readings were extremely implausible.

Hey, sounds suspiciously similar to the Ariane 5 explosion. Does ESA not learn from its errors, or am I only reading too much into it (probably)?
Nov 24 2016
On 25/11/2016 07:14, Patrick Schluter wrote:
> Hey, sounds suspiciously similar to the Ariane 5 explosion. Does ESA not learn from its errors, or am I only reading too much into it (probably)?

I thought Ariane was caused by errorcodes from one module being sent on the same bus as telemetry and interpreted as instructions by another module? A...
Nov 25 2016
On Friday, 25 November 2016 at 09:19:26 UTC, Alix Pexton wrote:
> I thought Ariane was caused by errorcodes from one module being sent on the same bus as telemetry and interpreted as instructions by another module? A...

Nope, it was an overflowing down cast:
https://around.com/ariane.html

The irony was that the specific module that had made the wrong calculation had even been formally proved to be correct.

This accident also gave Bertrand Meyer (Eiffel) a lot of wind for his sails about design by contract:
https://archive.eiffel.com/doc/manuals/technology/contract/ariane/

In that context it might even be interesting for the D language, as it is one of the few languages that have (inbuilt) contracts.
Nov 25 2016
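For readers unfamiliar with the term, an "overflowing down cast" is a narrowing conversion whose source value does not fit the target type. The Python sketch below shows the generic hazard, converting a float to a 16-bit signed integer with and without a range check. It is not the Ariane flight code, and the function names and sample values are hypothetical.

```python
def unchecked_to_i16(x: float) -> int:
    """Truncate toward zero, then keep only the low 16 bits (silent wrap)."""
    low_bits = int(x) & 0xFFFF
    # Reinterpret the 16-bit pattern as a signed two's-complement value.
    return low_bits - 0x10000 if low_bits >= 0x8000 else low_bits


def checked_to_i16(x: float) -> int:
    """Truncate toward zero, but refuse values outside the 16-bit range."""
    truncated = int(x)
    if not -32768 <= truncated <= 32767:
        raise OverflowError(f"{x} does not fit in a 16-bit signed integer")
    return truncated


if __name__ == "__main__":
    sample = 40000.0  # hypothetical reading, larger than a 16-bit int can hold
    print(unchecked_to_i16(sample))  # a plausible-looking but wrong number
    try:
        checked_to_i16(sample)
    except OverflowError as exc:
        print("caught:", exc)
```

A range check only helps if something sensible happens when it fires; an error that nobody handles is not much of an improvement over a wrong value.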
On Friday, 25 November 2016 at 17:06:14 UTC, Patrick Schluter wrote:
> This accident also gave Bertrand Meyer (Eiffel) a lot of wind for his sails about design by contract https://archive.eiffel.com/doc/manuals/technology/contract/ariane/ in that context it might be even interesting for the D language, as it is one of the few languages that have (inbuilt) contracts.

The mistake was that the hardware was upgraded, but the software and tests weren't. Contracts wouldn't help unless it was SPARK.
Nov 28 2016
On 25.11.2016 08:14, Patrick Schluter wrote:
> On Thursday, 24 November 2016 at 20:22:00 UTC, Timon Gehr wrote:
>> ...
>> I don't think overflow is what happened. Rather, the statistical model they used to filter the sensor data didn't match reality. It put too much trust into a malfunctioning sensor -- I assume the sensor readings were extremely implausible.
>
> Hey, sounds suspiciously similar to the Ariane 5 explosion. Does ESA not learn from its errors, or am I only reading too much into it (probably)?

I don't think we have enough information to judge, but remember that writing correct software is hard. This is no less true if it should automatically land a spacecraft on the surface of Mars using real time data from possibly malfunctioning sensors. :)
Nov 25 2016
On Friday, 25 November 2016 at 07:14:45 UTC, Patrick Schluter wrote:
> Hey, sounds suspiciously similar to the Ariane 5 explosion. Does ESA not learn from its errors, or am I only reading too much into it (probably)?

Well, from the little information we have, I suppose we can only be reading too much into it.

So, I too like to think it's just due to an integer overflow. Not from a software engineer's perspective, but more from a Marxist approach.

One misses a simple test on an integer, and you make a rocket-ship worth billions of good money (that could be used in education, medical care or whatever) explode in tiny cold little pieces, 54 million km from here.

What an ironic and subversive bug; the engineer who did that should be immensely proud of himself. :)
Nov 25 2016
On 11/25/2016 4:22 AM, Claude wrote:
> So, I like too to think it's just due to an integer overflow. But not from a software engineer perspective, but more from a Marxist approach. One misses a simple test over an integer, and you make a rocket-ship worth billions of good money (that could be used in education, medical care or whatever) explode in tiny cold little pieces, 54 millions km from here. What an ironic and subversive bug, the engineer who did that should be immensely proud of himself. :)

I'd like to know what really happened with the code. But as someone who has worked on flight critical systems for airliners, the designs are required to account for any single failure of anything. That means all inputs must be validated for "reasonableness", and the same for outputs. If any of this is outside reasonable bounds, there must be failover to a backup method.

A negative altitude is not reasonable.

-----

It reminds me of college, where we were told that if we worked a problem and came up with unreasonable answers, such as negative energy, we were expected to note: "I know this answer is unreasonable, but I cannot find the mistake." and the worst you'd get is a 0. Unreasonable answers, and no note, meant you'd get a negative score!
Nov 25 2016
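The "reasonableness" check with failover described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `plausible_altitude` and `select_altitude`, the 150 km upper bound, and the sample readings are all hypothetical, and nothing here reflects how the lander or any certified avionics system is actually built.

```python
from typing import Callable, Optional

# Hypothetical upper bound on a believable altitude, for illustration only.
MAX_PLAUSIBLE_ALTITUDE_M = 150_000.0


def plausible_altitude(value_m: float) -> bool:
    """Accept only altitudes at or above ground level and below the bound."""
    # NaN fails both comparisons, so it is rejected as well.
    return 0.0 <= value_m <= MAX_PLAUSIBLE_ALTITUDE_M


def select_altitude(
    primary: Callable[[], float],
    backup: Callable[[], float],
) -> Optional[float]:
    """Return the first plausible reading, trying the primary source first."""
    for source in (primary, backup):
        reading = source()
        if plausible_altitude(reading):
            return reading
    return None  # no trustworthy value; the caller must handle this case


if __name__ == "__main__":
    # Primary estimate is below ground level, so the backup value is used.
    print(select_altitude(lambda: -2000.0, lambda: 3700.0))
```

Returning `None` when every source fails keeps the "no reasonable answer" case explicit rather than passing a bad number downstream, much like the note on the exam paper.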
On Saturday, 26 November 2016 at 05:50:19 UTC, Walter Bright wrote:
> It reminds me of college, where we were told that if we worked a problem and came up with unreasonable answers, such as negative energy, we were expected to note: "I know this answer is unreasonable, but I cannot find the mistake." and the worst you'd get is a 0. Unreasonable answers, and no note, meant you'd get a negative score!

You got a great teacher right there!
Nov 26 2016
On 11/26/2016 3:16 AM, deadalnix wrote:
> On Saturday, 26 November 2016 at 05:50:19 UTC, Walter Bright wrote:
>> It reminds me of college, where we were told that if we worked a problem and came up with unreasonable answers, such as negative energy, we were expected to note: "I know this answer is unreasonable, but I cannot find the mistake." and the worst you'd get is a 0. Unreasonable answers, and no note, meant you'd get a negative score!
>
> You got a great teacher right there!

It was actually institute policy, not an individual teacher's. Another policy was that no grades could be based on attendance (unless it was P.E.). A third was that if you could pass the finals, you could opt out of any class and still receive full credit for it. A fourth was that grades would not be on a curve - you either met the standard or you didn't. There's more.

Oh, one more you'll recognize. You'd get a 0 on any computation where you prematurely rounded the results :-) The algebra had to be worked out to its final form before plugging in numbers. (Lots of times intermediate terms would algebraically cancel out, so calculating intermediate values would result in spurious rounding errors.)

I thought it was a fairly enlightened system of grading, quite a step up from what I was used to.
Nov 26 2016
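The premature rounding point is easy to demonstrate. In the Python sketch below, the expression (a + b) - a simplifies algebraically to b, but rounding the intermediate sum to two decimal places first gives a result more than double the true value. The function names and the sample numbers are made up for illustration.

```python
from decimal import Decimal


def rounded_early(a: Decimal, b: Decimal) -> Decimal:
    """Evaluate (a + b) - a, rounding the intermediate sum to 2 decimals."""
    intermediate = (a + b).quantize(Decimal("0.01"))
    return intermediate - a


def simplified_first(a: Decimal, b: Decimal) -> Decimal:
    """Work the algebra out first: (a + b) - a cancels down to b."""
    return b


if __name__ == "__main__":
    a = Decimal("1234.567")
    b = Decimal("0.001234")
    print(rounded_early(a, b))     # error introduced by the early rounding
    print(simplified_first(a, b))  # exact, since the large term cancelled
```

The large term `a` cancels on paper, so it never needed to be computed at all; plugging in numbers early lets its rounding error swamp the small term that actually matters.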
On 26/11/16 07:50, Walter Bright wrote:
> I'd like to know what really happened with the code. But as someone who has worked on flight critical systems for airliners, the designs are required to account for any single failure of anything. That means all inputs must be validated for "reasonableness", and the same for outputs. If any of this is outside reasonable bounds, there must be failover to a backup method.

My experience is slightly different. More accurately, I think your experience is too narrow.

Yes, civilian aviation code gets a very high level of scrutiny. Numbers I've heard range from a 1:9 to a 1:18 ratio between resources spent writing the code and resources spent testing it. Code is written to extremely high standards that relate to the level of dependency flight safety has on the code. So: code actually flying the aircraft > code used to display flight critical information to the pilot > code used to display information the pilot may depend on > code used to display generic information. That last category, BTW, may run Windows and off the shelf applications.

So that part corroborates Walter's story, BUT THIS ONLY APPLIES TO CIVILIAN AIRCRAFT.

This level of standard does not apply to:
* Military aircraft
* Spaceships
* Auto car industry
* Medical equipment

I'm sure there's more.

Even drones, until fairly recently (around 2008), were completely unregulated. I'm talking about huge unmanned flying platforms, some as big as four seat airplanes.

In some of those fields, things aren't as bad as that. The car industry is slowly getting better. High financial stakes in the space field cause caution. The military aviation field is done by many of the same players as civilian aviation, and thus some care is carried over.

As far as regulations go, however, we're screwed.

Shachar
Nov 26 2016
I can confirm. I know some people in the car industry, and that software falls into the same bucket as law and sausage: you don't want to know how it's done.
Nov 26 2016
On Sunday, 27 November 2016 at 05:43:11 UTC, Shachar Shemesh wrote:
> My experience is slightly different. More accurately, I think your experience is too narrow.
> ...
> This level of standard does not apply to:
> * Military aircraft
> * Spaceships
> * Auto car industry
> * Medical equipment

My real world experience differs from yours, but it probably comes down to the organisation you're with and, for larger companies, even which group.

I've worked in military aviation, commercial (not military) drones for mining and exploration, and medical devices, and it was all heavily regulated software. I haven't come across too many cowboy outfits. I cannot speak for the other industries you mention, such as automotive.

The problem we face today in medical is not the lack of scrutiny and regulation but that regulations have not caught up with the security issues. The latest FDA guidelines address this somewhat for pre and post market devices, but there are many devices out there running a full Linux with nothing more than SSH disabled. The majority will still have a root user account and probably even enable root over serial console.

bye,
lobo
Nov 27 2016
On Sunday, 27 November 2016 at 05:43:11 UTC, Shachar Shemesh wrote:
> THIS ONLY APPLIES TO CIVILIAN AIRCRAFT
>
> This level of standard does not apply to:
> * Military aircraft
> * Spaceships
> * Auto car industry
> * Medical equipment
>
> I'm sure there's more

With them pushing self-driving cars, if that gets off the ground we will be having a lot of accidents. Some will inevitably be due to overflows or misinformation from Google servers.
Nov 27 2016
On 11/27/2016 1:21 PM, Era Scarecrow wrote:
> With them pushing self-driving cars, if that gets off the ground we will be having a lot of accidents. Some will inevitably be due to overflows, misinformation from Google servers.

Frankly, Google needs to hire some engineers from the aviation industry, who know how to do these sorts of things. From the accounts of how the Toyota car computers were set up, they have no idea how to do it.
Nov 27 2016