In the article by Nancy Leveson in RISKS 3.72, she mentioned that a task based on the Viking Landing software was going to be used as the basis for some experiments on concerning fault tolerant software. There have in fact been a number of carefully cntrolled experiments aimed at assessing the possible cost-effectiveness of fault tolerance in software. I then read the Tony Hoare quote in RISKS 3.74, bemoaning the fact that formal verification had not been used in any safety-critical software. Offhand, I do not know of any similar controlled experiments being performed on the cost-effectiveness of formal verification. Indeed, it strikes me that it would be very interesting if the planned experiments that Nancy refers to were to cover various verification and testing, as well as, fault tolerance experiments. Ideally risks should be quantified, so claimed remedies should be the subject of experimental evaluation, as well as eloquent pleading. Brian Randell - Computing Laboratory, University of Newcastle upon Tyne ARPA : firstname.lastname@example.org UUCP : <UK>!ukc!cheviot!brian JANET : email@example.com
There has been some discussion on Risks lately about the application of hardware reliability models to software. The purpose of such models is to make predictions. The accuracy of a prediction based upon a mathematical model depends on whether the assumptions of the underlying model fit the situation in which it is being applied. Hardware reliability models make such assumptions as: 1) component failures will occur independently in independent replicated units 2) the behavior of a physical component can be predicted from data gathered from observations of other components that are assumed to be similar. 3) the design of the system is free from faults. None of these seem to apply to software. Attempting to come up with some strange meaning of "wear out" so that the models can be applied to software is begging the question. We know that "wear out" AS IS MEANT IN THESE MODELS does not apply to software. Therefore, the results of applying the models to software may be inaccurate. The burden of proof is in showing that the assumptions apply as originally conceived in the models. As an example, trying to fit software to "bathtub curve" models (which were built by observing hardware) would seem to be a less fruitful line of endeavor than attempting to build models from what we observe about software. Nancy Leveson Info. & Computer Science University of California, Irvine
City buses on this continent generally have rear or center exit doors interlocked with the brakes, so that once the exit doors are opened, the bus cannot be moved until they are closed. An excellent safety feature, yes? You'd never move a bus while someone might be getting off... Well, one day a few years ago, on the Toronto transit system, the exit doors of a bus popped open spontaneously due to a malfunction in their control system, and stayed open. The bus was on a level crossing, and was full, and a few seconds later the barriers started lowering as a train approached. The collision was frightful. In the investigation it turned out that the buses were fitted with a control to override the interlock, but it was in a concealed location (for maintenance access only) and drivers were not trained in its use. Needless to say this was promptly changed. On the other hand, I could also cite several instances in the well- documented history of British railway accident investigations where both drivers and signalmen* were provided with overrides to be used only in case of equipment malfunction, and did not believe their equipment, and used the overrides to cause accidents. *They WERE all men in those days. I don't know what the modern word is. The moral seems to be that overrides are indeed a good thing to have, but you have to be very sure that the user knows when to use them. And if the engineer or programmer isn't the one training the users, this can be rather difficult. By the way, those reading this somewhere else than on Usenet may be interested to know that people who use an interface called Pnews to post Usenet articles are asked: This program posts news to many hundreds of machines throughout the world. Are you absolutely sure that you want to do this? [ny] This comes up before the message is entered; afterwards, the question Send, abort, edit, or list? is asked, so the initial question is not the only chance to abort. In effect, the extra initial confirmation asks users to override a safety feature on every normal invocation of the program. Is this useful? [ANSWER TO MARK PLEASE, NOT RISKS ON THIS QUESTION.] Mark Brader, utzoo!dciem!msb
In Risks 3.74, Jerry Leichter writes: >Accidents in general are fairly low-probability events. As such, they have to >be reasoned about carefully - our intuitions on such events are usually based >on too little data to be worth much. Also, since we have little direct expe- >rience, we are more likely to let emotional factors color our thinking. The >thought of being trapped in a burning or sinking car is very disturbing to >most people, so they weight such accidents much more heavily than their actual >probability of occurrence merits. An interesting article on this topic (perception of risk) appeared in Scientific American a few years back. To summarize, small non-zero risks have much more emotional weight than they "deserve" (statistically, that is). Large variations in the middle of the scale have less effect than they deserve. Memory fails me on how risk at the other end of the scale (near certainty) is perceived. Personally, I find the thought of being sent through the windshield at least as disturbing as (and much more likely than) being trapped in a burning car.
In the 1973 book "Beyond Stonehenge", Gerald S. Hawkins is telling about the digitization of the layout of Stonehenge from new aerial photos ... Back at the laboratory, two pictures, red and green, are projected. The operator looks through special glasses. A miniature Stonehenge sits there in the machine, three- dimensional, vividly real. A small white spot moves in the machine, controlled by hand dials. It can be moved along the ground; up ... down ... The machine reads height of stone or height of ground above datum. The method is accurate, absolute, unambiguous, mechanically final. The details are safely left with the engineer. When I saw the first photogrammetic plan I was puzzled. The number of stones was wrong. There was an extra stone mapped in the bluestone horseshoe. I raised the question with Mr. Herschel. The engineers put the film back in the infallible machine and redid the mea- surements. Apologies! The object was not a stone. It was human. The error was excusable and quite understandable. There was a gentleman, a sightseer (bald-headed), who happened to stand in a gap in the line of bluestones at the instant of the click-click of the passing plane. His shadow was like that of a stone; his head top looked like polished dolerite. "Vertical object, height 5 ft 10 ins", recorded the machine. Mark Brader
I understand and appreciate your comments in the mod.risks about nth party/ hearsay stuff. But, from the examples you gave, in case you are really looking for some aviation accidents partially due to obedience to the "book", here are two - both commercial accidents at Toronto International (Now Pearson International). Both from MOT (then DOT) accident investigations: About 15 years ago a Air Canada DC-8 was coming in for a landing. At 60 feet above the runway, the pilot asked the co-pilot to "arm" the spoilers. The co-pilot goofed and fired them. The aircraft dropped abruptly onto the runway, pulling about 4 G's on impact. At which point one of the engine/pylon assembly tore away from the wing - this was an aircraft defect because the engines were supposed to withstand this impact - a 6 G impact is supposed to shear the mounting pins. Not aware of this fact, the pilot performed what the book told him to do - go around for another try. He only made it halfway around - the pylon had tore away a portion of the fuel tank and the aircraft caught fire and crashed in a farmer's field killing all aboard. In retrospect, the pilot should have stayed on the ground, contrary to the book. Many would have survived the fire on the ground. However, it was difficult to see how the flight crew could have realized that the aircraft was damaged as it was in the short time that they had to decide. The spoiler arming system was altered to make this more unlikely. The second incident was about 8 years ago - on a Air Canada DC-9 taking off. During take off one of the tires blew throwing rubber fragments through one of the engines. One of these fragments damaged a temperature sensor in the engine, causing an "engine fire" indication to come on in the cockpit. The pilot did what the book said, "abort takeoff", even though he was beyond the safe stopping point. The aircraft slid off the end of the runway and into the now infamous 50 foot deep gully between the runway and the 401 highway. The fuselage broke in 2 places, causing one death and several broken bones and minor back injuries. In retrospect, if the pilot had not aborted takeoff, he would have been able to take off successfully and come around for reasonably safe landing, saving the aircraft and preventing any injuries. However, there was absolutely no way that they could have determined that the engine was not on fire. Results: - in spite of the findings, I seem to remember that the pilot was suspended for some time. - Recommendations: - filling in the gully - not done - cutting grooves in the runways for improved braking - not done yet, but the media is still pushing the MOT. (I'm neutral on this one, the MOT has some good reasons for not doing it) - cleaning the tarmac of burned rubber - only done once if I recall correctly. As a counter example, I offer you another: It had become common practise for twin-otter pilots to place the props in full reverse pitch while landing, instants before actually touching down. This had the effect of shortening the landing run considerably over the already short runs (twin-otter is STOL). However, due to a number of accidents being traced to pilots doing this too soon - eg: 50 feet up, the aircraft manufacturer then modified the aircraft so as to prevent reverse pitch unless the aircraft was actually on the ground. (The above, however, is from a newspaper, and would bear closer research).
Please report problems with the web pages to the maintainer