The Risks Digest

The RISKS Digest

Forum on Risks to the Public in Computers and Related Systems

ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator

Volume 3 Issue 77

Wednesday, 8 October 1986

Contents

o Evaluating software risks
Brian Randell
o Misapplication of hardware reliability models
Nancy Leveson
o Deliberate overrides?
Mark Brader
Ephraim
o Trusting-infallible-machines Stonehenge anecdote
Mark Brader
o [More Aviation Hearsay?]
C Lewis
o Info on RISKS (comp.risks)

Evaluating software risks

Brian Randell <brian%cheviot.newcastle.ac.uk@Cs.Ucl.AC.UK>
Mon, 6 Oct 86 18:59:47 gmt
In the article by Nancy Leveson in RISKS 3.72, she mentioned that a
task based on the Viking Landing software was going to be used as the
basis for some experiments on concerning fault tolerant software.
There have in fact been a number of carefully cntrolled experiments
aimed at assessing the possible cost-effectiveness of fault tolerance
in software.
I then read the Tony Hoare quote in RISKS 3.74, bemoaning the fact
that formal verification had not been used in any safety-critical
software. Offhand, I do not know of any similar controlled experiments
being performed on the cost-effectiveness of formal verification.
Indeed, it strikes me that it would be very interesting if the planned
experiments that Nancy refers to were to cover various verification and
testing, as well as, fault tolerance experiments. Ideally risks should
be quantified, so claimed remedies should be the subject of experimental
evaluation, as well as eloquent pleading.

Brian Randell - Computing Laboratory, University of Newcastle upon Tyne

  ARPA  : brian%cheviot.newcastle@ucl-cs.arpa
  UUCP  : <UK>!ukc!cheviot!brian
  JANET : brian@uk.ac.newcastle.cheviot


Misapplication of hardware reliability models

<leveson@sei.cmu.edu>
Mon, 6 Oct 86 11:39:34 edt


There has been some discussion on Risks lately about the application of
hardware reliability models to software.  The purpose of
such models is to make predictions.  The accuracy of a prediction based upon
a mathematical model depends on whether the assumptions of the underlying
model fit the situation in which it is being applied.  Hardware reliability
models make such assumptions as:
   1) component failures will occur independently in independent
      replicated units
   2) the behavior of a physical component can be predicted from data
      gathered from observations of other components that are assumed
      to be similar.
   3) the design of the system is free from faults.


None of these seem to apply to software.  Attempting to come up with some
strange meaning of "wear out" so that the models can be applied to software
is begging the question.  We know that "wear out" AS IS MEANT IN THESE
MODELS does not apply to software.  Therefore, the results of applying
the models to software may be inaccurate.  The burden of proof is in
showing that the assumptions apply as originally conceived in the models.
As an example, trying to fit software to "bathtub curve" models (which 
were built by observing hardware) would seem to be a less fruitful 
line of endeavor than attempting to build models from what we observe 
about software.

    Nancy Leveson
    Info. & Computer Science
    University of California, Irvine


Deliberate overrides?

<decvax!utzoo!dciem!msb@ucbvax.Berkeley.EDU>
Wed, 8 Oct 86 04:32:03 edt
City buses on this continent generally have rear or center exit doors
interlocked with the brakes, so that once the exit doors are opened,
the bus cannot be moved until they are closed.  An excellent safety
feature, yes?  You'd never move a bus while someone might be getting off...

Well, one day a few years ago, on the Toronto transit system, the exit
doors of a bus popped open spontaneously due to a malfunction in their
control system, and stayed open.  The bus was on a level crossing, and
was full, and a few seconds later the barriers started lowering as a
train approached.  The collision was frightful.

In the investigation it turned out that the buses were fitted with a
control to override the interlock, but it was in a concealed location
(for maintenance access only) and drivers were not trained in its use.
Needless to say this was promptly changed.

On the other hand, I could also cite several instances in the well-
documented history of British railway accident investigations where
both drivers and signalmen* were provided with overrides to be used
only in case of equipment malfunction, and did not believe their
equipment, and used the overrides to cause accidents.

*They WERE all men in those days.  I don't know what the modern word is.

The moral seems to be that overrides are indeed a good thing to have,
but you have to be very sure that the user knows when to use them.
And if the engineer or programmer isn't the one training the users,
this can be rather difficult.

By the way, those reading this somewhere else than on Usenet may be
interested to know that people who use an interface called Pnews to
post Usenet articles are asked:

   This program posts news to many hundreds of machines throughout the world.
   Are you absolutely sure that you want to do this? [ny]

This comes up before the message is entered; afterwards, the question

   Send, abort, edit, or list?

is asked, so the initial question is not the only chance to abort.
In effect, the extra initial confirmation asks users to override a
safety feature on every normal invocation of the program.  Is this useful?
    [ANSWER TO MARK PLEASE, NOT RISKS ON THIS QUESTION.]
Mark Brader, utzoo!dciem!msb


Re: Overrides and tradeoffs (Risks 3.74)

<decvax!wanginst!wang!ephraim@ucbvax.Berkeley.EDU>
Tue, 7 Oct 86 20:20:56 edt
In Risks 3.74, Jerry Leichter writes:
>Accidents in general are fairly low-probability events.  As such, they have to
>be reasoned about carefully - our intuitions on such events are usually based
>on too little data to be worth much.  Also, since we have little direct expe-
>rience, we are more likely to let emotional factors color our thinking.  The
>thought of being trapped in a burning or sinking car is very disturbing to
>most people, so they weight such accidents much more heavily than their actual
>probability of occurrence merits.

An interesting article on this topic (perception of risk) appeared in
Scientific American a few years back.  To summarize, small non-zero risks
have much more emotional weight than they "deserve" (statistically, that
is).  Large variations in the middle of the scale have less effect than
they deserve.  Memory fails me on how risk at the other end of the scale
(near certainty) is perceived.

Personally, I find the thought of being sent through the windshield at
least as disturbing as (and much more likely than) being trapped in a burning
car.


Trusting-infallible-machines Stonehenge anecdote

<decvax!utzoo!dciem!msb@ucbvax.Berkeley.EDU>
Wed, 8 Oct 86 05:14:25 edt
In the 1973 book "Beyond Stonehenge", Gerald S. Hawkins is telling about
the digitization of the layout of Stonehenge from new aerial photos ...

    Back at the laboratory, two pictures, red and green, are
    projected.  The operator looks through special glasses.
    A miniature Stonehenge sits there in the machine, three-
    dimensional, vividly real.  A small white spot moves in
    the machine, controlled by hand dials.  It can be moved
    along the ground; up ... down ...   The machine reads height
    of stone or height of ground above datum.  The method is
    accurate, absolute, unambiguous, mechanically final.  The
    details are safely left with the engineer.

    When I saw the first photogrammetic plan I was puzzled.
    The number of stones was wrong.  There was an extra stone
    mapped in the bluestone horseshoe.

    I raised the question with Mr. Herschel.  The engineers put
    the film back in the infallible machine and redid the mea-
    surements.

    Apologies!

    The object was not a stone.  It was human.

    The error was excusable and quite understandable.  There was
    a gentleman, a sightseer (bald-headed), who happened to
    stand in a gap in the line of bluestones at the instant of
    the click-click of the passing plane.  His shadow was like
    that of a stone; his head top looked like polished dolerite.
    "Vertical object, height 5 ft 10 ins", recorded the machine.

Mark Brader


[More Aviation Hearsay?]

<mnetor!spectrix!clewis@seismo.CSS.GOV>
Wed Oct 8 12:04:57 1986
I understand and appreciate your comments in the mod.risks about nth party/
hearsay stuff.  But, from the examples you gave, in case you are really
looking for some aviation accidents partially due to obedience to the
"book", here are two - both commercial accidents at Toronto International
(Now Pearson International).  Both from MOT (then DOT) accident 
investigations:

About 15 years ago a Air Canada DC-8 was coming in for a landing.  At
60 feet above the runway, the pilot asked the co-pilot to "arm" the spoilers.
The co-pilot goofed and fired them.  The aircraft dropped abruptly onto
the runway, pulling about 4 G's on impact.  At which point one of the
engine/pylon assembly tore away from the wing - this was an aircraft 
defect because the engines were supposed to withstand this impact - a 
6 G impact is supposed to shear the mounting pins.  Not aware of this 
fact, the pilot performed what the book told him to do - go around for 
another try.  He only made it halfway around - the pylon had tore away 
a portion of the fuel tank and the aircraft caught fire and crashed in
a farmer's field killing all aboard.

In retrospect, the pilot should have stayed on the ground, contrary
to the book.  Many would have survived the fire on the ground.  However, 
it was difficult to see how the flight crew could have realized that
the aircraft was damaged as it was in the short time that they had to
decide.  The spoiler arming system was altered to make this more unlikely.

The second incident was about 8 years ago - on a Air Canada DC-9 taking
off.  During take off one of the tires blew throwing rubber fragments
through one of the engines.  One of these fragments damaged a temperature
sensor in the engine, causing an "engine fire" indication to come on in
the cockpit.  The pilot did what the book said, "abort takeoff", even
though he was beyond the safe stopping point.  The aircraft slid off the
end of the runway and into the now infamous 50 foot deep gully between
the runway and the 401 highway.  The fuselage broke in 2 places, causing
one death and several broken bones and minor back injuries.

In retrospect, if the pilot had not aborted takeoff, he would have been
able to take off successfully and come around for reasonably safe landing,
saving the aircraft and preventing any injuries.  However, there was 
absolutely no way that they could have determined that the engine was not 
on fire.  

Results: 
    - in spite of the findings, I seem to remember that the pilot was 
      suspended for some time.
    - Recommendations:
        - filling in the gully - not done
    - cutting grooves in the runways for improved braking - not done yet,
      but the media is still pushing the MOT.  (I'm neutral on this one,
      the MOT has some good reasons for not doing it)
    - cleaning the tarmac of burned rubber - only done once if I recall
      correctly.

As a counter example, I offer you another:

It had become common practise for twin-otter pilots to place the props
in full reverse pitch while landing, instants before actually touching down.
This had the effect of shortening the landing run considerably over the
already short runs (twin-otter is STOL).  However, due to a number of
accidents being traced to pilots doing this too soon - eg: 50 feet up,
the aircraft manufacturer then modified the aircraft so as to prevent
reverse pitch unless the aircraft was actually on the ground.

(The above, however, is from a newspaper, and would bear closer research).

Please report problems with the web pages to the maintainer

Top