The RISKS Digest

Forum on Risks to the Public in Computers and Related Systems

ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator

Volume 4 Issue 61

Tuesday, 10 March 1987

Contents

o More on human errors
Brian Randell
o Re: Teflon flywheels and safe software
Brian Randell
o Re: Computers in the Arts
Alan Wexelblat
Jeffrey R Kell
o Local telephone service problems
Jonathan Thornburg
o Computer Failure Delays Flights at Atlanta Airport
PGN
o Ozone hole a false alarm?
Henry Spencer
o More on Requiring Mode C transponders
John Allred
Ken Calvert
o Info on RISKS (comp.risks)

More on human errors

Brian Randell <brian%kelpie.newcastle.ac.uk@Cs.Ucl.AC.UK>
Mon, 9 Mar 87 15:53:22 gmt
  Since my RISKS contribution on the BBC-TV documentary on human error, I
have had a chance to find out somewhat more about some of the studies which
have been made of human errors, and on proposals for improved system
interfaces which would reduce the chance of human operator error. Some
sample references which other RISKS readers who share my ignorance of this
topic might find interesting:

  J. Reason. Recurrent Errors in Process Environments: Some Implications for
  the Design of Intelligent Decision Support Systems. In "Intelligent Decision
  Support in Process Environments" (ed. E. Hollnagel et al) NATO ASI Series,
  Vol F21, Springer Verlag (1986) 255-270.

  D. A. Norman. Design Rules Based on Analyses of Human Error. Comm ACM 26,4
  (April 1983) 254-258.

  D. A. Norman. Steps Towards a Cognitive Engineering. Proc. Conf. on Human
  Factors in Computer Systems, Mar 15-17, 1982, Gaithersburg, MD.

Doubtless there are many other references, but I have not found any dealing
with the analysis of faults within computer systems arising from errors made
by their designer, which was my original hope. I checked with Jim Reason, who 
indicated to me that he does not know offhand of any such work. Incidentally,
his paper listed above, and one entitled "The Cognitive Bases of Predictable
Human Error" (Proc. Ann. Conf. of the Ergonomics Society, Swansea, UK, April
1987) used what I found a very interesting computer-like model of human
cognition, in order to "explain" observed patterns of human error.

   [Note that the entire April 1983 issue of CACM (which includes the first
   of the above Norman papers) is devoted to the humanization of computer
   system interfaces.  PGN]


Re: Teflon flywheels and safe software (RISKS 4.56)

Brian Randell <brian%kelpie.newcastle.ac.uk@Cs.Ucl.AC.UK>
Mon, 9 Mar 87 10:12:43 gmt
The problem of building time-deterministic devices out of non-deterministic
components was debated at length during a recent IFIP WG10.4 meeting, during
a seminar on Hard Real-Time Systems, organised by Gerard Le Lann and Herman
Kopetz. The view of a number of us was that "time determinacy" was an
abstract concept, alright for abstract algorithms, but in general
inappropriate for real systems - certainly not systems and devices whose
components and/or design cannot be assumed to be absolutely faultless. Thus
we preferred to regard all performance figures, even so-called "guarantees",
as being probabilistic. Thus we viewed the task of the system designer as
that of satisfying him/herself (and others!) of the probability of some
given time constraint being exceeded remaining within some acceptable
figure, rather than that of "obtaining determinism from non-determinism".
Needless to say, even after the problem is recast in these terms, it will
usually still be formidable!
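
As a rough illustration of the probabilistic view above, here is a minimal
sketch in Python. It assumes we have a list of measured response times for
some operation; the function names and the sample data are hypothetical and
are not taken from the WG10.4 discussion. It estimates how often a deadline
is exceeded and compares that with an acceptable figure. For the case where
no misses were observed at all, it also gives the 95% upper bound on the miss
probability that n independent trials can support.

```python
def miss_probability(response_times, deadline):
    """Empirical fraction of observed response times that exceed the deadline."""
    if not response_times:
        raise ValueError("need at least one observation")
    misses = sum(1 for t in response_times if t > deadline)
    return misses / len(response_times)


def acceptable(response_times, deadline, max_miss_probability):
    """True if the observed miss rate stays within the acceptable figure."""
    return miss_probability(response_times, deadline) <= max_miss_probability


def upper_bound_if_no_misses(trials, confidence=0.95):
    """Upper bound on the true miss probability after `trials` independent
    runs with zero observed misses: solve (1 - p) ** trials = 1 - confidence."""
    if trials < 1:
        raise ValueError("need at least one trial")
    return 1.0 - (1.0 - confidence) ** (1.0 / trials)


# Hypothetical measurements in milliseconds, with a 5 ms deadline.
samples = [1.0, 2.0, 3.0, 10.0]
print(miss_probability(samples, deadline=5.0))
print(acceptable(samples, deadline=5.0, max_miss_probability=0.3))
print(upper_bound_if_no_misses(100))
```

The observed frequency is only an estimate. Small acceptable figures need
correspondingly many observations, and the bound assumes independent trials,
which real systems may not provide.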

Brian Randell - Computing Laboratory, University of Newcastle upon Tyne

  ARPA  : brian%cheviot.newcastle.ac.uk@cs.ucl.ac.uk
  UUCP  : <UK>!ukc!cheviot!brian   JANET : brian@uk.ac.newcastle.cheviot


Re: Computers in the Arts (or The Show Must Go On ...)

Alan Wexelblat <wex@MCC.COM>
Tue, 10 Mar 87 10:52:08 CST
My wife is currently taking her MFA in lighting design, and I've run some of
the computerized boards -- so this is pretty much experience-based.

There are two kinds of "computerized" boards available today.  One kind is
designed with the computer built in; the other kind provides software and a
black box (usually just a bunch of d-to-a converters) to connect some sort of
PC (usually an Apple or IBM) to a pre-existing board.

In the first kind, you program light cues in advance.  Each cue is a set of
numbers (0-100) fed to dimmers.  The numbers represent the percentage of
maximum power fed through that dimmer.  Then you load up the program and by
pushing a single button, you move from cue to cue.  Once a cue is in place
("hot"), you can change it by hand, but changing hot cues is usually only
done in emergencies (like when a lamp blows out).  The board software
usually allows you to have a preset (the lights that are on while the
audience is entering the theatre) and to preview the next cue in a sequence.

The PC-based software usually does the same thing but often has more
sophisticated capability.  Some programs will allow you to construct cue
sequences on the fly by selecting out of a library of predesigned cues;
others allow modification of the cue just before it becomes hot, etc.
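
The cue mechanism described above can be sketched roughly as follows. This is
only an illustration in Python under stated assumptions: the class and method
names (Cue, Board, go, preview, override) are hypothetical and do not
correspond to any particular board's software. A cue maps dimmer numbers to
levels from 0 to 100, a single "go" action makes the next cue hot, and the hot
cue can be changed by hand without altering the stored cue.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Cue:
    """One lighting cue: dimmer number -> percentage of maximum power."""
    number: int
    levels: Dict[int, int] = field(default_factory=dict)

    def __post_init__(self) -> None:
        for dimmer, level in self.levels.items():
            if not 0 <= level <= 100:
                raise ValueError(f"dimmer {dimmer}: level {level} outside 0-100")


class Board:
    """Hypothetical memory board: a preset plus an ordered list of cues."""

    def __init__(self, preset: Cue, cues: List[Cue]) -> None:
        self.cues = cues
        self.position = -1              # -1 means the preset is still hot
        self.hot = dict(preset.levels)  # levels currently sent to the dimmers

    def preview(self) -> Optional[Cue]:
        """Return the next cue in the sequence without making it hot."""
        nxt = self.position + 1
        return self.cues[nxt] if nxt < len(self.cues) else None

    def go(self) -> None:
        """The single button: make the next cue hot."""
        cue = self.preview()
        if cue is None:
            raise IndexError("no more cues")
        self.position += 1
        self.hot = dict(cue.levels)

    def override(self, dimmer: int, level: int) -> None:
        """Change the hot cue by hand, e.g. when a lamp blows out."""
        if not 0 <= level <= 100:
            raise ValueError("level outside 0-100")
        self.hot[dimmer] = level


# Hypothetical show: a preset and two cues on two dimmers.
board = Board(Cue(0, {1: 30}), [Cue(1, {1: 100, 2: 50}), Cue(2, {1: 0, 2: 75})])
board.go()
board.override(2, 0)
print(board.hot)
```

Note that the whole sequence lives only in the board's memory in this sketch,
so losing the computer loses the cues along with it.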

Now, to answer your question:  When the computer went down, the people in
the booth could not access the sequence of cues (and probably didn't have a
presetter on hand anyway).  What probably happened was that the board
operator set what's called a "standard wash" and left it.  Basically, he (or
she) didn't know what lights were supposed to come on when (stage managers
call the cues by number, which doesn't tell you what lights are used for
what cues).

This is an example of the RISK of not having paper copies of information
that's on-line and of the RISK of not having personnel available to do the
computer's job if it fails.

Alan Wexelblat
    UUCP: {seismo, harvard, gatech, pyramid, &c.}!sally!im4u!milano!wex


Re: Computers in the Arts (or The Show Must Go On ...)

Jeffrey R Kell <JEFF%UTCVM.BITNET@wiscvm.wisc.edu>
Tue, 10 Mar 87 13:23:02 EST
In my spare time, I play synthesizer with a local band in a large showroom,
and one of our techs here also works as stage hand in productions at the
fine arts center.  I have had exposure to the computer lighting systems...
and seen one fail.  The case you explained sounded a great deal like
problems other than simply the computer.  In all systems I have seen, all
lighting controls can still be done manually (perhaps not as quickly, but
you can use all the available lighting instruments).  The computer simply
digitizes the dimmer settings on the panel as it is programmed, and later
replays them in real time.  In either case, a real analog low-voltage signal
goes down a real wire to the dimmer (power) packs at the stage area which
control the lights.  It would seem better to leave the settings digital and
multiplex them on a single cable to the stage, but I have yet to see this
principle used (although it may in fact exist in a very large system).

The most RISKy component is the control cable leading to the dimmers
themselves.  If the computer goes bad, you still have manual control.
If the cable goes out, gets broken, etc., you really have trouble.  It
is possible to turn lights off/on manually at the packs, but that would
not be very feasible as a backup.

The new digital synthesizers are RISKy as well, speaking first-hand.
I own a Korg DSS-1 Sampling keyboard (not an advertisement) which has
512K RAM and a 3.5 inch double-sided floppy for sample/program storage.
When trying to change voices between songs during a stage black-out, I
inadvertently pressed 'system save' rather than 'system load' and,
being in a hurry anyway (it takes 45 seconds to load), pressed the
verification without looking.  Out of personal stupidity, I had left
the disk write-enabled.  Realizing this, I attempted to abort the save,
only to corrupt the directory of the disk (when it rains, it pours).
Fortunately there was a backup disk (whew) but there was a considerable
delay (several minutes is a lot of dead air during a concert).

I rather miss the days when your worst nightmare was having a note out
of tune, or something relatively minor :-)
                                                  <Jeff>
 Jeffrey R Kell, Dir Tech Services |  Bell:  (615)-755-4551           
 Admin Computing, 117 Hunter Hall  |Bitnet:  JEFF@UTCVM.BITNET        
 Univ of Tennessee at Chattanooga  |Internet address below:           
 Chattanooga, TN  37403            |JEFF%UTCVM.BITNET@WISCVM.WISC.EDU 


Local telephone service problems, serendipity, and synchronicity

<Jonathan_Thornburg%UBC.MAILNET@MIT-MULTICS.ARPA>
Tue, 10 Mar 87 00:42:26 PST
Around 11:30pm local time (PST) on 9 mar 87, our local phone system (area
code 604, 736 exchange) died.  Calls in progress were disconnected.  After a
minute or two of no dial tone, it came back up again.  When the same thing
happened again about 3 minutes later, I called the operator and was told
"we've had a flurry of calls in the last couple of minutes".  All seems to
be ok now (an hour later).

The phone system in this exchange is an all-new digital one, installed
with considerable fanfare about a year or so ago.  I haven't heard any
reports of other problems with it, and line quality for modem work has
been excellent.

By a truly remarkable coincidence, at the time of the crash, I was scanning
our on-line Risks forum archive file via a modem over the phone, and had
read a couple of phone failure items only a few 10s of minutes previously.

It's interesting to consider what the odds are of having this sort of
accident happen while you're reading about the chances and/or hazards
of this sort of thing.  As it happens, they have been calculated by Luis
Alvarez for a somewhat similar situation, reading about someone's death
just after thinking of that person for the first time in a long time.  He
estimates of the order of 3000 such occurrences per year in the US (this is
an order of magnitude estimate only, with a *large* error margin).  See
        Science 148, 1541 (1965)
for details.  Two followup items (in the context of the significance of
such occurrences to "parapsychology") are
        Science 149, 910 (1965)
        Science 150, 436 (1965)
                                             - Jonathan Thornburg

       [I recall seeing "China Syndrome" THE NIGHT BEFORE Three Mile Island.
       That certainly made an impression on me.  However, seeing "WarGames"
       the night before reading about a big computer security scam was much 
       less surprising.  Considering that telephone system outages do occur
       (but don't tend to get national news coverage), I guess your tale is
       not all THAT surprising.  The same goes for air traffic control outages
       -- see the next item.  But there have been numerous reports of 
       seemingly paranormal communications from people who have just died.
       (Who said RISKS is not eclectic?)  PGN]


Computer Failure Delays Flights at Atlanta Airport

Peter G. Neumann <Neumann@CSL.SRI.COM>
Tue 10 Mar 87 14:13:37-PST
An ATC computer crashed yesterday, resulting in long delays in arriving
and departing flights at Hartsfield Atlanta Intern'l Airport on 9 Mar 87.
The main computer was down from 9:50 a.m. until 12:55 p.m.  The backup
system worked properly.  However, it does not handle flight-plan information,
which had to be done manually (and thus contributed to the delays).
(FAA spokesman Jack Barker said, "Safety was never a problem.")

   [There have been enough reports on the ATC system in RISKS lately that I am
   by no means including everything I find.  But I wouldn't want you to think
   everything was perfect.  And safety is never a problem unless it is a
   problem.  Yogi Berra might have said that.  PGN]


Ozone hole a false alarm?

<pyramid!utzoo!henry@hplabs.HP.COM>
Mon, 9 Mar 87 18:51:17 pst
A side note on the matter of skeptical software hiding the existence of the
Antarctic ozone hole:  the Jan 12 issue of Aviation Week notes that some
doubts have been raised about whether the hole is real.  The problem is that
there was a lot of volcanic activity early in the decade, and the dust from
it has been much more persistent at high altitude than anyone expected.  The
satellite instruments are not good at distinguishing dust effects from
changes in gas composition.

                Henry Spencer @ U of Toronto Zoology
                {allegra,ihnp4,decvax,pyramid}!utzoo!henry


More on Requiring Mode C transponders

John Allred <jallred@labs-b.bbn.com>
Tue, 10 Mar 87 10:19:41 EST
  > ... The simple fact is that your actions put others (like me) under 
  > involuntary risk, and preventing this sort of thing is the fundamental
  > reason why laws and governments exist.  Phil

Wrong, Phil.  Pilots' actions place you at risk *only* if the pilots break 
the rules.  In the case of several midairs between commercial and private
aircraft, the "busting" of the Terminal Control Area by the private aircraft
(intentionally or unintentionally) played a major role.

  > I don't care whether 5% or 50% or 100% of small planes lack electrical 
  > systems; if they can't be flown without hazard to other planes, then 
  > they shouldn't be flown at all.

I could use this argument to justify any safety item *at any cost*.  How about
$30k for an active collision avoidance radar for a $6k aircraft?  It would make
things safer, wouldn't it?  Clearly, there is a point of diminishing returns.
The $1500 (or whatever) cost per aircraft for a mode C transponder could be
better spent on training and enforcement.
                                             John Allred, BBN Labs, Inc.


More on Requiring Mode C transponders

Ken Calvert <calvert@sally.utexas.edu>
Tue, 10 Mar 87 14:50:23 CST
>(karn@faline.bellcore.com:)
> The simple fact is that your actions put others (like me) under involuntary
> risk, ...

I have several problems with this, but the main thing is to point out
that it is clearly not the case that "small planes" without electrical
systems and/or transponders can't be flown without threatening innocent
airline passengers.  I expect that most such planes virtually never enter
the busy airspaces (i.e., Terminal Control Areas) where midairs tend to
occur.  One reason is that regulations ALREADY require radios and
transponders for aircraft operating in TCAs, as well as permission from the
controlling authority.

       [These last two sentences reach an apparently false conclusion.  
       (For example, Los Angeles and Chicago routinely report many such 
       incursions each day.)  There is a huge difference between regulations 
       and actualities -- which in general is often a problem system 
       designers tend to ignore!  PGN]

Airplanes will occasionally collide, as will cars and trains.  We should
indeed be working to reduce the RISKs, but in many cases (and I think this
is one of them) we should be focusing on the hard problem of making people
better pilots (and drivers, and programmers), instead of throwing money and
technology at the problem in order to appear to be doing something.
Especially when (again, as in this case) there are probably also technical
difficulties with the proposed "technological" solution (e.g., capacity of
the ATC system).

    [OK. Perhaps we have done enough on this for now.  People are most often
    the weak link in ATC, but technology can help.  However, if the people
    rely too much (or blindly) on the technology, then the existence of the
    technology may be debilitating.  PGN]
