The Risks Digest

The RISKS Digest

Forum on Risks to the Public in Computers and Related Systems

ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator

Volume 3 Issue 72

Wednesday, 1 October 1986

Contents

o Viking Lander
Nancy Leveson
o Deliberate override
George Adams
o Overriding overrides
Peter Ladkin
o A propos landing gear
Peter Ladkin
o Paths in Testing
Mark S. Day
o Confidence in software via fault expectations (RISKS-3.69)
Darrel VanBuer
o Info on RISKS (comp.risks)

Viking Lander

<leveson@sei.cmu.edu>
Wed, 1 Oct 86 11:44:21 edt
I have spoken to a man who was involved in the building of the Viking
Lander software.  He told me that the on-board software consisted of
three basic functions:  terminal descent software, some transmission
of data, and a sequence of operations while the lander was on the ground.
The Honeywell computer had a memory of about 18,000 24-bit words.  It
was programmed in assembly language.  There was an interrupt-driven
executive program.  What is interesting is his claim that there were
several bugs in the software.  There just were not any bugs that caused
an abort of the mission or other serious consequences.  The programs
were, in fact, overloaded on the way to Mars because of discovery of
some problems.  This man claims that there was a great unwillingness 
to admit that there were bugs in the software and so some remain 
undocumented or difficult to find in the documentation.  But the software
was NOT bug-free.

There was also ground system software which was responsible for
auxiliary computations such as pointing the antenna.  

One of the reasons I have come into contact with this software is that
I have been asked to participate in a fault tolerance experiment which
will use the Viking lander as an example.  In examining the software
in the light of using it for such an experiment, Janet Dunham at RTI
has found it not to be adequate in its present form because it is too
small and straightforward.  In fact, she is presently writing a 
specification of the terminal descent portion which adds complexity to
the problem to make it more interesting and complex for the experiment
(she is afraid that there just will not be enough errors made given the
original problem).  She says that the navigation portion of the 
terminal descent software (which is the most complex part) is only 
about 100 lines of Ada code.  

The point to remember is that not all software is alike.
Small, straightforward problems with very little complexity in the logic
(e.g., just a series of mathematical equations) may not say much about
the reliability of large, complex systems.  We know that scaling-up
is a very serious problem in software engineering.  In fact, it has been
suggested that small vs. large software projects have very few similarities.
It should also be noted that the avionics part of this relatively small 
and relatively simple software system cost $18,000,000 to build.  Although
the 18,000 words of memory were overloaded a few times, the amount of 
money spent per line of code was extremely high.  

I worry when anecdotal evidence about one software project is used as
"proof" about what will happen with general software projects.  There 
just are too many independent variables and factors to do this with 
confidence.  And, in fact, we do not even know for sure what the important
variables are.

     Nancy Leveson
     Info. & Computer Science Dept.
     University of California, Irvine

     (arpanet address, for angry replies, is nancy@ics.uci.edu despite 
      evidence above to the contrary)


Deliberate override

George Adams <gba@riacs.edu>
Wed, 1 Oct 86 10:18:43 pdt
   Even automobiles might appropriately have overrides of automated
controls, and even of automated safety systems.  I have only read about
the following automobile item and wish I had the opportunity to verify
it, but it seems reasonable.
   Consider the anti-lock braking systems now becoming more widely
available in automobiles.  The driver can apply a constant input to the
brake pedal, but modulated braking forces are applied at the wheels so
that the wheels do not lock.  Many have probably seen the ad on tele-
vision in which the car with anti-lock brakes sucessfully negotiates the
turn on wet pavement while coming to a rapid stop without skidding out of
control.  Yet, perhaps such vehicles should have a switch to disable
anti-lock and allow conventional braking.  Imaging trying to stop quickly
with anti-lock brakes on a gravel road.
   Even if an incompetent driver forgot to enable the system on hard
pavement, performance would be no worse than now common.  Without the
switch a competent driver might hit that cow instead of stopping in time.
   Regarding aircraft, a report on the midair collision involving the
Aero Mexico flight said that the flight crew applied thrust reversers
after the collision.  This seems like a creative response, and one that
might easily be disallowed in a more automated aircraft in which a
check for weight on extended landing gear was a prerequisite for thrust
reversal.  While thrust reversal had no benefit for the Aero Mexico
flight itself, perhaps it reduced impact speed and consequently reduced
the extent of damage on the ground.
   A vehicle and its operator are a system.  By automating vehicle systems
we can adapt operator workload to better match the capabilities of human
beings and make it possible for an operator to do a better job.  We can
also automate to limit operator options for coping with non-routine
situations and impede rapid operator override, thereby making a more
expert system and also a less generally capable one.


Overriding overrides

Peter Ladkin <ladkin@kestrel.ARPA>
Wed, 1 Oct 86 17:08:22 pdt
An example of a deliberate override that led to disaster:
An Eastern Airlines 727 crashed in Pennsylvania with considerable
loss of life, when the pilots were completing an approach in
instrument conditions (ground fog), 1000 feet lower than they
should have been at that stage.
They overrode the altitude alert system when it gave warning.

Peter Ladkin


A propos landing gear

Peter Ladkin <ladkin@kestrel.ARPA>
Wed, 1 Oct 86 16:55:14 pdt
Alan Marcum's comment on gear overrides in the Arrow reminded me
of a recent incident in my flying club (and his, too).
The Beech Duchess, a light training twin, has an override that
maintains the landing gear *down* while there is weight on the
wheels, ostensibly to prevent the pilot from retracting the
gear while on the ground (this is a problem that Beech has in
some of its airplanes, since they chose to use a non-standard
location for gear and flap switches, encouraging a pilot to
mistake one for the other).
Pilots can get into the habit of *retracting* the gear before
takeoff, secure in the knowledge that it will remain down
until weight is lifted off the wheels, whence it will commence
retracting. This has the major advantages that it's one less
thing to do during takeoff, allowing more concentration on
flying, and the gear is retracted at the earliest possible
moment, allowing maximum climb performance, which is important
in case an engine fails at this critical stage.
Can anyone guess the disadvantage of this procedure yet?

Our club pilot, on his ATP check ride, with an FAA inspector
aboard, suffered nosewheel collapse on take-off, and dinged
the nose, and both props, necessitating an expensive repair
and engine rebuild. Thankfully, all walked away unharmed.
It was a windy day.

It is popularly supposed that the premature retraction technique
was used, and a gust of wind near rotation speed caused the weight
to be lifted off the nosewheel. When the plane settled, the 
retraction had activated, and the lock had disengaged, 
allowing the weight to collapse the nosewheel.

Both pilots assure that the gear switch was in the down position,
contrary to the popular supposition.

All gear systems in the aircraft were functioning normally when
tested after the accident.

The relevance to Risks? The system is simple, and understood in
its entirety by all competent users. The technique of
premature retraction has advantages. It's not clear that a
gedankenexperiment could predict the disadvantage.

Peter Ladkin


Paths in Testing

Mark S. Day <MDAY@XX.LCS.MIT.EDU>
Wed 1 Oct 86 11:57:32-EDT
"Paths" as used in the discussion of "path coverage" are probably
intended to be what are called "basis paths."  A piece of code with
loops can indeed have an infinite number of paths, but every path is a
linear combination of a much smaller set of paths.  Testing that
covers every basis path and also tests each loop using "engineer's
induction" ("zero, one, two, three... good enough for me") is
significantly better than random testing to "see what breaks" and much
more feasible than trying to test all the combinations of basis paths.
The McCabe or cyclomatic complexity metric defines the number of basis
paths through a piece of code; see

   T.J. McCabe, "A Complexity Measure", IEEE Transactions on Software
   Engineering SE-2, 4 (Dec 1976) pp. 308-320.

A quick approximation of McCabe complexity is that straight-line code
has a complexity of 1 (obviously, I guess) and most control statements
(if-then, if-then-else, while, repeat, for...) each add 1 to the
complexity.  An n-way case statement adds n-1 paths to the "straight"
path, so it adds n-1 to the complexity.  This approximation only applies
to code with no gotos.

The IEEE Computer Society puts out a tutorial volume called
"Structured Testing" that includes the previously-cited paper and a
number of other related articles, including a heuristic for using the
McCabe complexity to select test paths.

--Mark


Re: Confidence in software via fault expectations (RISKS-3.69)

Darrel VanBuer <hplabs!sdcrdcf!darrelj@ucbvax.Berkeley.EDU>
Tue, 30 Sep 86 18:37:27 pdt
  >From: Dave Benson <benson%wsu.csnet@CSNET-RELAY.ARPA>
  >...  Software ordinarily has no manufacturing defects and the
  >usual way ordinary backups are done insures that most software does not
  >wear out before it becomes obsolete. 

The thing is software DOES wear out in the sense that it loses its ability
to function because the world continues to change around it (maybe a bit
because the pattern of bits does NOT wear out): e.g. operating systems which
have gone psychotic because the number of bits used to represent a date
because compatible hardware has continued to run far longer than designers
of software and hardware anticipate (how many IBM-360 programs will correctly
handle the fact that year 2100 is NOT a leap year, but still be running inside
some emulation/automatic retranslation) or financial software unable to deal
with 1000 fold inflation because all the numbers overflow...

Darrel J. Van Buer, PhD,  System Development Corp.,  2525 Colorado Ave
Santa Monica, CA 90406  (213)820-4111 x5449             /     !sdcrdcf!darrelj
...{allegra,burdvax,cbosgd,hplabs,ihnp4,orstcs,sdcsvax,ucla-cs,akgua}

       [This one is getting to be like the "YES, VIRGINIA, THERE IS A
        SANTA CLAUS" letter that used to appear each year in the Herald
        Tribune.  But I keep reprinting the recurrences because some of 
        you still don't believe it.  There is no sanity clause.  PGN]

Please report problems with the web pages to the maintainer

Top