The RISKS Digest

Forum on Risks to the Public in Computers and Related Systems

ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator

Volume 3 Issue 74

Friday, 3 October 1986

Contents

o Opinions vs. Facts in RISKS Reports (re Aviation Accidents)
Danny Cohen
o Mathematical checking of programs (quoting Tony Hoare)
Niall Mansfield
o Risks of maintaining computer timestamps revisited [RISKS-3.57]
Ian Davis
o Keyword indexing in automated catalogs
Betsy Hanes Perry
o Re: Viking Landers -- correction
Scott Preece
o Re: Confidence in software via fault expectations
Scott Preece
o Overrides and tradeoffs
Jerry Leichter
o Re: Deliberate overrides
Brint Cooper
o Re: idiot-proof cars (risks-3.68)
Col. G. L. Sicherman
o Info on RISKS (comp.risks)

Opinions vs. Facts in RISKS Reports (re Aviation Accidents)

<COHEN@B.ISI.EDU>
3 Oct 1986 09:18:27 PDT
    Opinions vs. Facts in RISKS Reports (re Aviation Accidents)
    -----------------------------------------------------------

Everyone is entitled to opinions and to facts.  Keeping the two
distinctly separated is the basis of good reporting -- including
the reports/contributions to RISKS.

The RISKS readers are best served by being able to tell one from
the other, and to tell what is based on opinions/rumors and what on
facts.  Two examples follow.


In RISKS-3.27 Stephen Little reported about "one major accident in which
the pilot followed the drill for a specific failure, as practiced on the
simulator, only to crash because a critical common-mode feature of the
system was neither understood, or incorporated in the simulation."

Since this is very important evidence of a "major accident" (with
possible/probable loss of hundreds of lives), I tried to follow up on it
and offered to pursue this report.

The best way to verify such a report is by a reference to the official
NTSB (National Transportation Safety Board) accident investigation
report.  Therefore, I have volunteered to pursue this reference myself
if anyone could give me details like the date (approximately), place
(country, for example), or the make and type of the aircraft.

My plea for this information appeared in RISKS-3.34, on 8/9/1986.

In response, one RISKS reader provided me with a pointer to what he
vaguely remembered to be such a case.  After pursuing the original
report we both found that the pilot (Capt. John Perkins, of United
Airlines) claimed that [computer based] simulator training helped him
and his crew to survive a windshear encounter (not the kind of story
the RISKS community finds to be of interest).

       (The long discussion about the F-16 does not relate to this
    topic since it was concentrated on what the simulator software
    should do and what the aircraft software should do, rather than
    on the fidelity of the simulator and on its training value).

If the original report about that computer-induced major accident is
based on facts -- let's find them; we tried but did not succeed.
If it is based on rumors -- let's say so explicitly.


A more recent RISKS (3.72) has another report, this time by a pilot,
Peter Ladkin, who also provides the place and the make and type of the
aircraft (just as I asked for).  His report says:

      "    An example of a deliberate override that led to disaster:
    An Eastern Airlines 727 crashed in Pennsylvania with considerable
    loss of life, when the pilots were completing an approach in
    instrument conditions (ground fog), 1000 feet lower than they
    should have been at that stage.
    They overrode the altitude alert system when it gave warning.  "

I found it very interesting.  The mention of the aircraft type and the
location are helpful hints for pursuing such accidents.

However, I failed to locate any information about that "Eastern
Airlines 727 [which] crashed in Pennsylvania".

I (and Eastern Airlines, too) know of only two losses of Eastern
Airlines 727's -- neither in Pennsylvania.  One at JFK (windshear)
and one in La Paz, Bolivia (flying into a mountain, in IFR conditions).

However, I know of the 9/11/1974 Eastern Airlines crash of a DC-9 in
Charlotte, North Carolina -- which, I guess, is what Peter Ladkin's
report is about.  This guess may be wrong.

  I APOLOGIZE TO PETER LADKIN IF I DID NOT GUESS THE RIGHT ACCIDENT.

According to the NTSB accident report (NTSB-AAR-75-9) about the DC-9 in
Charlotte: "The probable cause of the accident was the flightcrew's lack
of altitude awareness at critical points during the approach due to poor
cockpit discipline in that the crew did not follow prescribed
procedure."  [They were too low, and too fast.]

The report also mentions that "The flightcrew was engaged in
conversations not pertinent to the operation of the aircraft.  These
conversations covered a number of subjects, from politics to used cars,
and both crew members expressed strong views and mild aggravation
concerning the subjects discussed.  The Safety Board believes that these
conversations were distractive and reflected a casual mood and a lax
cockpit atmosphere, which continued throughout the remainder of the
approach and which contributed to the accident."

What also contributed to the accident is that "the captain did not make
the required callout at the FAF [Final Approach Fix], which should have
included the altitude (above field elevation)".  They also did not make
other mandatory callouts.

Another possible contributing factor was a confusion between QNH and QFE
altitudes (the former is above sea level, and the latter above the field
elevation).  [This may be the 1,000' confusion mentioned in Peter
Ladkin's report.]

"The terrain warning alert sounded at 1,000 feet above the ground but
was not heeded by the flightcrew" (which is typical of many airline
pilots, who regard this signal as more of a nuisance than a warning).

Question: What did Ladkin mean by "An example of a deliberate override
          that led to disaster: ..... They overrode the altitude alert 
          system when it gave warning" ?

According to the NTSB they just did not pay attention to it.  According
to the Ladkin report they DELIBERATELY OVERRODE it, which implies
explicitly taking some positive action to override it.  It is hard to
substantiate this suggestion.

Not paying attention is not a "deliberate override" as promised in the
first line of the Ladkin report, just as flying under VFR conditions
into the ground is not "a deliberate override of the visual cues" -- it
is a poor practice.  (The only thing DELIBERATE in that cockpit was the
discussion of used cars!)

Does this example contribute to the RISKS discussion about "deliberate
override"?


In summary: Starting from wrong "facts" based on third-hand vague
            recollections is not always the best way to develop theories.

Again, the RISKS readers are best served by more accurate reporting.
They deserve it.

                            Danny Cohen.


Mathematical checking of programs (quoting Tony Hoare)

Niall Mansfield <MANSFIEL%DHDEMBL5.BITNET@WISCVM.WISC.EDU>
Thu 2 Oct 86 11:53:55 N
In "New Scientist", 18-Sep-86, C.A.R. Hoare discusses mathematical
techniques for improving the reliability of programs, especially
life-critical ones.  The following somewhat arbitrary excerpts (quoted
without permission) include some interesting ideas:

  But computers are beginning to play an increasing role in "life-critical
  applications", situations where the correction of errors on discovery is not
  an acceptable option - for example, in control of industrial processes,
  nuclear reactors, weapons systems, oil rigs, aero engines and railway
  signalling.  The engineers in charge of such projects are naturally worried
  about the correctness of the programs performing these tasks, and they have
  suggested several expedients for tackling the problem.  Let me give some
  examples of four proposed methods.

  The first method is the simplest.  I illustrate it with a story.  When
  Brunel's ship the SS Great Britain was launched into the River Thames, it
  made such a splash that several spectators on the opposite bank were
  drowned.  Nowadays, engineers reduce the force of entry into the water by
  rope tethers which are designed to break at carefully calculated intervals.

  When the first computer came into operation in the Mathematisch Centrum in
  Amsterdam, one of the first tasks was to calculate the appropriate intervals
  and breaking strains of these tethers.  In order to ensure the correctness
  of the program which did the calculations, the programmers were invited to
  watch the launching from the first row of the ceremonial viewing stand set
  up on the opposite bank.  They accepted and they survived.

  ... [1.5 pages omitted]

  I therefore suggest that we should explore an additional method, which
  promises to increase the reliability of programs.  The same method has
  assisted the reliability of designs in other branches of engineering, namely
  the use of mathematics to calculate the parameters and check the soundness
  of a design before passing it for construction and installation.

  Alan Turing first made this suggestion some 40 years ago; it was put into
  practice, on occasion, by the other great pioneer of computing, John von
  Neumann.  Shigeru Igarashi and Bob Floyd revived the idea some 20 years ago,
  providing the groundwork for a wide and deep research movement aimed at
  developing the relevant mathematical techniques.  Wirth, Dijkstra, Jones,
  Gries and many others (including me) have made significant contributions.
  Yet, as far as I know, no one has ever checked a single safety-critical
  program using the available mathematical methods.  What is more, I have met
  several programmers and managers at various levels of a safety-critical
  project who have never even heard of the possibility that you can establish
  the total correctness of computer programs by the normal mathematical
  techniques of modelling, calculation and proof.

  Such total ignorance would seem willful, and perhaps it is.  People working
  on safety-critical projects carry a heavy responsibility.  If they ever get
  to hear of a method which might lead to an improvement in reliability, they
  are obliged to investigate it in depth.  This would give them no time to
  complete their current projects on schedule and within budget.  I think that
  this is the reason why no industry and no profession has ever voluntarily
  and spontaneously developed or adopted an effective and relevant code of
  safe practice.  Even voluntary codes are established only in the face of
  some kind of external pressure or threat, arising from public disquiet,
  fostered by journals and newspapers and taken up by politicians.

  A mathematical proof is, technically, a completely reliable method of
  ensuring the correctness of programs, but this method could never be
  effective in practice unless it is accompanied by the appropriate attitudes
  and managerial techniques.  These techniques are in fact based on the same
  ideas that have been used effectively in the past.

  It is not practical or desirable to punish errors in programming by instant
  death.  Nevertheless, programmers must stop regarding error as an inevitable
  feature of their daily lives.  Like surgeons or airline pilots, they must
  feel a personal commitment to adopt techniques that eliminate error and to
  feel the appropriate shame and resolution to improve when they fail.  In a
  safety-critical project, every failure should be investigated by an
  impartial enquiry, with powers to name the programmer responsible, and
  forbid that person any further employment on safety-critical work.  In cases
  of proven negligence, criminal sanctions should not be ruled out.  In other
  engineering disciplines, these measures have led to marked improvement in
  personal and professional responsibility, and in public safety.  There is
  no reason why programmers should be granted further immunity...

  ... [1 page, to end of article, omitted]


Risks of maintaining computer timestamps revisited [RISKS-3.57]

Ian Davis <ijdavis%watdaisy.waterloo.edu@CSNET-RELAY.ARPA>
Wed, 1 Oct 86 17:47:29 edt
CP-6 has a further problem when first loaded that was encountered recently
at Wilfrid Laurier University.  A check is made to ensure that front end
processors (FEP's) are up and running, but not that they contain the correct
software... the consequence in W.L.U's case was that after loading version
C01 for testing and then rebooting C00 software they left C01 software in
the FEP's.  Unfortunately, this resulted (for whatever reason) in disk
record writes being interpreted as disk record deletes.  The problem became
apparent when using the editor which performs direct disk updates... but its
severity was not at first appreciated... the system was brought down very
rapidly when it was....  Ian Davis.
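
The missing check described above can be sketched roughly as follows.  This
is only an illustration of the idea (verify the loaded software version, not
just that the processor answers); FrontEnd, is_running, software_version and
check_feps are hypothetical names invented for the sketch and are not CP-6
interfaces.

```python
from dataclasses import dataclass


@dataclass
class FrontEnd:
    # Hypothetical model of a front end processor; not a CP-6 structure.
    name: str
    is_running: bool
    software_version: str


def check_feps(feps, host_version):
    """Return a list of problems; an empty list means every FEP is usable."""
    problems = []
    for fep in feps:
        if not fep.is_running:
            # The liveness check the posting says was already performed.
            problems.append(f"{fep.name}: not running")
        elif fep.software_version != host_version:
            # The check the posting says was missing.
            problems.append(
                f"{fep.name}: has {fep.software_version}, "
                f"host expects {host_version}"
            )
    return problems


# A host rebooted with C00 while one FEP still holds C01.
feps = [FrontEnd("fep0", True, "C01"), FrontEnd("fep1", True, "C00")]
for problem in check_feps(feps, "C00"):
    print(problem)
```

Under this sketch the boot would refuse to continue while the returned list
is non-empty, rather than proceeding because every FEP responded.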


Keyword indexing in automated catalogs

Betsy Hanes Perry <betsy%dartmouth.edu@CSNET-RELAY.ARPA>
Wed, 1 Oct 86 10:40:39 edt
The recent notice about title-indexing (article titles must include all
important article keywords in their first five words) struck a real chord in
me.  My current job is maintaining and updating Dartmouth College's
automated card catalog.

We have a database of over 800,000 records, all completely free-text
searchable (EVERY WORD in every record is indexed).  We are beginning to
suffer storage limitations, and are exploring our options.  However, if we
tried to suggest anything so restrictive as "five keywords per title", we'd
have a revolution on our hands.
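
To make the contrast concrete, here is a minimal sketch of the two indexing
policies: indexing every word of a record versus indexing only its first five
words.  It is an assumption-laden toy, not a description of the Dartmouth
catalog; build_index, max_words and the sample titles are invented for
illustration.

```python
import re
from collections import defaultdict


def build_index(records, max_words=None):
    """Map each word to the set of record ids containing it.

    If max_words is given, only the first max_words words of each
    record are indexed (the restrictive policy).
    """
    index = defaultdict(set)
    for rec_id, text in records.items():
        words = re.findall(r"[a-z0-9]+", text.lower())
        if max_words is not None:
            words = words[:max_words]
        for word in words:
            index[word].add(rec_id)
    return dict(index)


# Invented sample titles.
records = {
    1: "A survey of the risks of maintaining computer timestamps",
    2: "Keyword indexing in automated catalogs",
}

full = build_index(records)                   # every word indexed
first_five = build_index(records, max_words=5)

print(sorted(full.get("timestamps", set())))        # found in record 1
print(sorted(first_five.get("timestamps", set())))  # not found at all
```

The restricted index is smaller, but a search for a word that falls outside
the first five words of a title silently finds nothing, which is the cost the
notice pushes onto authors and searchers.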

The instance cited seems to me to be a clear example of shaping the
task to suit the tools at hand.  Somebody out there ought to be ashamed
of him/herself.  At the very least, the notice explaining why articles'
titles must be rewritten should have

  1.  Been extremely apologetic    and
  2.  Given a time by which this temporary limitation
      would no longer apply.

As it stands, the system sounds as if it is going to be less useful
than some of the available conventional journal indexes -- what 
incentive does this give for using it?

Tsk, tsk.


Re: Viking Landers -- correction

"Scott E. Preece" <preece%ccvaxa@GSWD-VMS.ARPA>
Thu, 2 Oct 86 09:33:19 cdt
> From: leveson@sei.cmu.edu
> Small, straightforward problems with very little complexity in the
> logic (e.g., just a series of mathematical equations) may not say much
> about the reliability of large, complex systems.

And there, of course, lies the heart of the structured programming
movement.  You improve reliability by reducing the complexity of
program logic.  You turn a large, complex system into a small,
straightforward system by building it in layers, each of which
makes use of primitives defined in the layer below.

The reason it may not be as effective as many have hoped is
that even simple, straightforward programs often turn out to
have bugs...

scott preece, gould/csd - urbana, uucp: ihnp4!uiucdcs!ccvaxa!preece


Re: Confidence in software via fault expectations

"Scott E. Preece" <preece%ccvaxa@GSWD-VMS.ARPA>
Thu, 2 Oct 86 09:25:04 cdt
> From: hplabs!sdcrdcf!darrelj@ucbvax.Berkeley.EDU (Darrel VanBuer)

> The thing is software DOES wear out in the sense that it loses its
> ability to function because the world continues to change around it...
----------
That's like saying "People do live forever in the sense that some of their
atoms linger."  The sense you depend on is not in the words you use.

"Becoming obsolete" is NOT the same thing as "wearing out."  The word "wear"
is in there for a reason.  Software does not suffer wear (though storage
media do).  The only exception I can think of would be demonstration
packages that self-destruct after a set number of uses.

Words are important; if you smear their meaning, you lose the ability to say
exactly what you mean.  This is a risk the computing profession has
contributed to disproportionately.

scott preece


Overrides and tradeoffs

<LEICHTER-JERRY@YALE.ARPA>
3 OCT 1986 13:26:54 EST
The recent discussions on manual overrides for airplane landing gear and car
brakes have all been ignoring a fundamental issue:  To compute the expected
cost/risk of having/not having an automated system, you need more than just a
few gedanken experiments; you need some estimates of the probabilities of
various situations, and, in each of those situations, the expected costs of
using or not using the automatic systems.

Here's a simple, well-known example:  Some people claim they don't wear seat
belts because, in an accident, they might be trapped in a burning car, or one
sinking into a lake.  Is this a valid objection?  Certainly; it COULD happen.
But the reality is that such accidents are extremely rare, while accidents in
which seat belts contribute positively are quite common.  So, on balance, the
best you can do is wear seat belts.  Of course, if you are in some very special
situation - doing a stunt that involves driving a car slowly across a
narrow, swaying bridge over a lake, for example - the general statistics fail
and you might properly come to a different conclusion.
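
The seat-belt comparison can be written out as an expected-cost calculation.
The sketch below is only an illustration of the arithmetic: every probability
and cost in it is a made-up placeholder, not a statistic, and expected_cost
and the scenario names are invented for the example.

```python
def expected_cost(scenarios, choice):
    """Sum probability * cost over all scenarios for one choice."""
    return sum(s["p"] * s["cost"][choice] for s in scenarios)


# Placeholder numbers only.  "p" is a per-trip probability; costs are in
# arbitrary units, with and without a seat belt.
ordinary_driving = [
    {"name": "ordinary collision", "p": 1e-4,
     "cost": {"belt": 10.0, "no_belt": 100.0}},
    {"name": "fire or submersion", "p": 1e-7,
     "cost": {"belt": 200.0, "no_belt": 100.0}},
]

# The stunt on the swaying bridge: same costs, very different odds.
bridge_stunt = [
    {"name": "ordinary collision", "p": 1e-6,
     "cost": {"belt": 10.0, "no_belt": 100.0}},
    {"name": "fire or submersion", "p": 1e-1,
     "cost": {"belt": 200.0, "no_belt": 100.0}},
]

for label, scenarios in [("ordinary", ordinary_driving),
                         ("stunt", bridge_stunt)]:
    belt = expected_cost(scenarios, "belt")
    no_belt = expected_cost(scenarios, "no_belt")
    better = "belt" if belt < no_belt else "no belt"
    print(f"{label}: belt={belt:.5f} no_belt={no_belt:.5f} -> {better}")
```

With these placeholder figures the belt wins under ordinary driving and loses
in the stunt case.  The conclusion depends entirely on the probabilities fed
in, which is why estimates are needed rather than thought experiments alone.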

In the United States, how many people regularly drive on gravel roads?  Perhaps
for those relatively few who do, an override for the automatic brake
system, or even a car WITHOUT such a system might make sense.  Perhaps the
costs for all those people who almost never drive on gravel roads can be shown
to be trivial.  There certainly ARE costs; every additional part adds cost,
weight, something that can break; plus, there's another decision the driver
might not want to be burdened with.  And there are "external" costs:  An
uncontrolled, skidding car could easily injure someone besides the driver who
chose to override the ABS.

Accidents in general are fairly low-probability events.  As such, they have to
be reasoned about carefully - our intuitions on such events are usually based
on too little data to be worth much.  Also, since we have little direct
experience, we are more likely to let emotional factors color our thinking.  The
thought of being trapped in a burning or sinking car is very disturbing to
most people, so they weight such accidents much more heavily than their actual
probability of occurrence merits.

It's also worth remembering another interesting statistic (I wish I knew a
reference):  When asked, something like 80% of American male drivers assert
that their driving abilities are "above average".  Given such a population
of users, there are risks in providing overrides of safety systems.

                            -- Jerry


Re: Deliberate overrides

Brint Cooper <abc@BRL.ARPA>
Fri, 3 Oct 86 13:53:54 EDT
> .....  Yet, perhaps such vehicles should have a switch to disable
> anti-lock and allow conventional braking.  Imagine trying to stop quickly
> with anti-lock brakes on a gravel road...

But the whole point of anti-lock brakes is to avoid skidding when traction
is lost.  If the vehicle skids, it'll hit the cow.  Overrides, as has been
said before, allow incompetent operators to substitute their opinions for
facts.
                                        Brint


Re: idiot-proof cars (risks-3.68)

"Col. G. L. Sicherman" <colonel%buffalo.csnet@CSNET-RELAY.ARPA>
Mon, 29 Sep 86 09:15:13 EDT
Chuck Fry's argument for override provisions in automated controls on cars
makes a lot of sense.  Frankly, though, I'd rather see as few new automatic
controls as we can manage with.  I live in the Buffalo area--heavy industry
with cobwebs on it--and people here are driving cars that ought to have been
junked last year.

Airplanes get first-class maintenance, or at least second-class.  With cars
it's different; when something breaks, many people just can't afford to have
it fixed.  The simpler a car's design, the longer a poor man can keep it
running safely.

Maybe I'm being cynical, but I believe that so simple an improvement as
putting brake lights on rear windshields will prevent far more accidents
than any amount of intermediary computerization.

     [Since deregulation, you might be surprised that the airlines like
      everyone else believe in cutting expenses to the bone.  Maintenance
      may or may not be what it was.  I have seen several reports that it
      is not, although it is certainly nowhere near so bad as with autos.  PGN]
