The RISKS Digest
Volume 11 Issue 79

Tuesday, 4th June 1991

Forum on Risks to the Public in Computers and Related Systems

ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator


Contents

FYA: CREATORS ADMIT UNIX, C HOAX
Mike Taylor of The Vogon News Service
via Jim Horning
Software Short (at) Circuit City: Senior Citizen spurned
Peter Amstein
Old RISK of misconfigured printer
Pete Kaiser
Lauda Air Boeing 767 crash
Steven Philipson
W.A.Simon
David Lesher
Re: AFTI/F-16
A. Padgett Peterson
Info on RISKS (comp.risks)

CREATORS ADMIT UNIX, C HOAX

Jim Horning <horning@Pa.dec.com>
Tue, 4 Jun 91 11:41:50 PDT
>>>>>>>>>>>>>>>>  T h e   V O G O N   N e w s   S e r v i c e  >>>>>>>>>>>>>>>>

 Edition : 2336              Tuesday  4-Jun-1991            Circulation :  8466

VNS TECHNOLOGY WATCH:                          [Mike Taylor, VNS Correspondent]
=====================                          [Littleton, MA, USA            ]

COMPUTERWORLD 1 April

                     CREATORS ADMIT UNIX, C HOAX

    In an announcement that has stunned the computer industry, Ken Thompson,
    Dennis Ritchie and Brian Kernighan admitted that the Unix operating
    system and C programming language created by them is an elaborate April
    Fools prank kept alive for over 20 years.  Speaking at the recent
    UnixWorld Software Development Forum, Thompson revealed the following:

    "In 1969, AT&T had just terminated their work with the GE/Honeywell/AT&T
    Multics project. Brian and I had just started working with an early
release of Pascal from Professor Niklaus Wirth's ETH labs in
    Switzerland and we were impressed with its elegant simplicity and
    power. Dennis had just finished reading `Bored of the Rings', a
    hilarious National Lampoon parody of the great Tolkien `Lord of the
    Rings' trilogy. As a lark, we decided to do parodies of the Multics
    environment and Pascal. Dennis and I were responsible for the operating
    environment. We looked at Multics and designed the new system to be as
    complex and cryptic as possible to maximize casual users' frustration
    levels, calling it Unix as a parody of Multics, as well as other more
    risque allusions. Then Dennis and Brian worked on a truly warped
    version of Pascal, called `A'. When we found others were actually
    trying to create real programs with A, we quickly added additional
    cryptic features and evolved into B, BCPL and finally C. We stopped
    when we got a clean compile on the following syntax:

    for(;P("\n"),R--;P("|"))for(e=C;e--;P("_"+(*u++/8)%2))P("| "+(*u/4)%2);

    To think that modern programmers would try to use a language that
    allowed such a statement was beyond our comprehension!  We actually
    thought of selling this to the Soviets to set their computer science
    progress back 20 or more years. Imagine our surprise when AT&T and
    other US corporations actually began trying to use Unix and C!  It has
    taken them 20 years to develop enough expertise to generate even
    marginally useful applications using this 1960's technological parody,
    but we are impressed with the tenacity (if not common sense) of the
    general Unix and C programmer.  In any event, Brian, Dennis and I have
    been working exclusively in Pascal on the Apple Macintosh for the past
    few years and feel really guilty about the chaos, confusion and truly
    bad programming that have resulted from our silly prank so long ago."

    Major Unix and C vendors and customers, including AT&T, Microsoft,
    Hewlett-Packard, GTE, NCR, and DEC have refused comment at this time.
    Borland International, a leading vendor of Pascal and C tools,
    including the popular Turbo Pascal, Turbo C and Turbo C++, stated they
    had suspected this for a number of years and would continue to enhance
    their Pascal products and halt further efforts to develop C.  An IBM
    spokesman broke into uncontrolled laughter and had to postpone a
    hastily convened news conference concerning the fate of the RS-6000,
    merely stating `VM will be available Real Soon Now'.  In a cryptic
    statement, Professor Wirth of the ETH institute and father of the
    Pascal, Modula 2 and Oberon structured languages, merely stated that P.
    T. Barnum was correct.

    In a related late-breaking story, usually reliable sources are stating
    that a similar confession may be forthcoming from William Gates
    concerning the MS-DOS and Windows operating environments.  And IBM
    spokesmen have begun denying that the Virtual Machine (VM) product is
    an internal prank gone awry.
    {COMPUTERWORLD 1 April}
    {contributed by Bernard L. Hayes}

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
        Please send subscription and backissue requests to CASEE::VNS

    Permission to copy material from this VNS is granted (per DIGITAL PP&P)
    provided that the message header for the issue and credit lines for the
    VNS correspondent and original source are retained in the copy.

>>>>>>>>>>>>>>>>   VNS Edition : 2336     Tuesday  4-Jun-1991  >>>>>>>>>>>>>>>>


Software Short (at) Circuit City: Senior Citizen spurned.

Peter Amstein <amstein@condor.Metaphor.COM>
Tue, 4 Jun 91 10:30:29 PDT
From Herb Caen's column in the San Francisco Chronicle, June 4, 1991:
"Ron Lemmen took his old friend Nellie White to Circuit City [a local
discount electronics chain - PA] to buy a TV set and after she wrote a
check with more than adequate ID, the computer turned her down.  It
hadn't been programmed for somebody born before 1900 (Nellie's 92)..."


Old RISK of misconfigured printer

E/ACT Open Systems 04-Jun-1991 1446 <kaiser@heron.enet.dec.com>
Tue, 4 Jun 91 05:48:52 PDT
In 1968 I was a system programmer at the Columbia University Computer Center,
which had a tremendous complex of IBM mainframes: a 360/91 tightly coupled to a
360/75, with a total of several megabytes of memory between them.  Each
computer had a disk farm; there was a 2321 Data Cell, a 2540 card reader, and a
[number?] card reader/punch.  And of course, several 1200-lpm 1403-N1 printers.

Usually the computer room was a reassuring hum of noises ... off in the far
corner, the very faint pulse of the PCU ("Plumbing Control Unit": the 91 was
cooled by a closed-loop distilled water system!); the snicker-click of the
1315s and 2315s as the heads moved back and forth; the ka-CHUNK of the data
cell as the selectors grabbed, turned, and released the magnetic cards in the
cells; the susurration of the card readers and occasionally the clunk of the
punch; and of course, the humming percussion of the printers.

One day I was in the room when suddenly it lapsed into near-total silence,
except for the whiz of paper slewing out of one of the printers at high speed.
The disks, card devices, data cell, and second printer had all stopped dead.
And what was worse, very few of the 50 gazillion lights on the consoles of the
computers were winking.  Guessing immediately that something to do with the
printer had brought the whole huge system to a halt, I ran over to it and hit
(usually I don't like the word "hit" to mean "press", but this time I really
did HIT) the STOP button.  The printer stopped, but unfortunately nothing else
came to life.  Eerie dead silence, except for the water pump in the far corner.
I set to work to find the problem.

We did figure it out: the listing on that printer had specified carriage
control meaning "skip to the next punch in channel 12 of the carriage tape" --
but there was nothing punched in channel 12 of the carriage tape.  And the
system, it seems, had at that instant serviced all its interrupts except the
one it was expecting from the printer -- which never came, because the only
thing that could cause the interrupt (a punch in channel 12 of ...) never
happened.  So the entire multiprocessor system was at a dead stop waiting for
the interrupt.  What to do, what to do?
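
For readers who never fought a carriage tape, a minimal sketch of the failure
mode follows; the C is mine and purely illustrative (the tape length and data
layout are assumptions, not 1403 internals).

    /* A "skip to channel 12" completes only when the carriage tape presents
       a punch in channel 12; with no punch anywhere on the loop, the
       completion interrupt can never arrive. */
    #include <stdio.h>

    #define TAPE_LEN 66              /* lines in one carriage-tape loop (assumed) */
    static int channel12[TAPE_LEN];  /* nonzero = punch in channel 12 at that line */

    /* Returns 1 if a skip-to-channel-12 can ever complete, 0 if it hangs. */
    static int skip_can_complete(void)
    {
        for (int line = 0; line < TAPE_LEN; line++)
            if (channel12[line])
                return 1;            /* punch found: the interrupt will fire */
        return 0;                    /* no punch: paper slews, no interrupt, ever */
    }

    int main(void)
    {
        /* channel12[] is all zeros: the misconfigured tape */
        if (!skip_can_complete())
            printf("skip to channel 12 never completes; system waits forever\n");
        return 0;
    }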

I removed the carriage tape and punched a hole in channel 12, then replaced it
in the printer.  I hit the START button, the paper slewed and stopped, the
interrupt arrived, and the whole system came to life.  Right?

Nope.  Nothing happened.  No sweat, I'm a system programmer, entitled to wander
into the machine room and do anything I think I can get away with; so fine,
I'll just warm-start the suckers and off we'll go.  We're talking about
$6,000,000 worth of computers, the finest IBM had built, servicing all the
needs both academic and administrative (I mean, my PAYCHECK was calculated and
printed on those machines!) of a prestigious university.  One of whose
trustees, by the way, was Thomas J. Watson, Jr.  So I warm-started the system.

Unfortunately, that didn't work either.  Nothing worked.  Nor could IBM make it
work.  We had to reinitialize the whole system from scratch, the only time in
my memory that that was necessary.

Now what to do about this terrible bug that could cause the whole machine to
grind to an irreparable halt?  A committee met several times to try to figure
it out.  (I wasn't invited.  Too junior.  Too technical.)  The committee never
came to a decision; they couldn't figure out whether to press IBM to solve the
problem, or to try to find and fix it ourselves, or what.

I did my part.  With no one's permission, I checked all the carriage tapes to
make sure that they all had at least one punch in every channel, and I
instructed the operators that whenever they made a new tape, they should make
sure that every channel had a punch somewhere.  End of problem.
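
In modern terms the check is trivial; here is a sketch (hypothetical tooling,
nothing like it existed in 1968) of the invariant I enforced: every channel of
every carriage tape must be punched somewhere.

    #include <stdio.h>

    #define CHANNELS 12
    #define TAPE_LEN 66       /* lines in one loop (assumed) */

    /* punches[line] is a bit mask; bit c set = punch in channel c+1 */
    static int tape_is_safe(const unsigned short punches[], int len)
    {
        unsigned short seen = 0;
        for (int line = 0; line < len; line++)
            seen |= punches[line];
        return seen == (1u << CHANNELS) - 1;   /* all channels punched somewhere */
    }

    int main(void)
    {
        unsigned short tape[TAPE_LEN] = { 0x0FFF };  /* line 0: punch in all 12 */
        printf("%s\n", tape_is_safe(tape, TAPE_LEN) ? "safe" : "will hang");
        return 0;
    }
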
                                                                  ---Pete
kaiser@heron.enet.dec.com   +33 92.95.62.97


Boeing 767 crash

Steven Philipson <stevenp@kodak.pa.dec.com>
Mon, 3 Jun 91 23:41:16 -0700
I've received additional data that is pertinent to the discussion of the 767
crash.  Boeing tested the 767 during certification for thrust reverser
activation in flight.  Not only does in-flight deployment not cause damage, but
the aircraft can remain in flight in this condition.  This is because the
thrust reversers are not as efficient as the non-reversed engines.  There is
still enough total thrust produced such that the aircraft can maintain flight
with one engine at full throttle and the other at full throttle with the
reverser deployed.
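
As a back-of-the-envelope check (my numbers, not Boeing's): if a deployed
reverser redirects only a fraction e < 1 of an engine's gross thrust, the
two-engine total stays positive.

    #include <stdio.h>

    int main(void)
    {
        double T = 1.0;   /* full-throttle thrust per engine, normalized (assumed) */
        double e = 0.4;   /* reverser effectiveness as a fraction of T (assumed) */

        /* good engine pushes forward, reversed engine pulls back */
        double net = T - e * T;
        printf("net thrust = %.2f engines' worth (positive)\n", net);
        return 0;
    }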

   I realize that this information has no computer risk relevance, but
speculation on the cause of this "computer controlled" aircraft's crash has run
high, even in the RISKS group.  The above info answers some of the concerns that
people had about operation of this aircraft.  It also sheds some light on the
fault tolerant nature of the Boeing hardware design, which should be
instructive to those of us who design software.
                            Steve


Re: Lauda Air Crash

W.A.Simon <alain%elevia.UUCP@Larry.McRCIM.McGill.EDU>
Mon, 3 Jun 91 16:31:45 EDT
> New evidence that the crash of a Lauda Air Boeing 767-300 in Thailand had been
> caused by in-flight reversal of the thrust on one engine has stunned the
> aviation industry.  Niki Lauda, owner of Lauda Air, made the claim in Vienna

This may have happened, but what about the ground witnesses who said the plane
flared up like a firecracker?  What about the small size of the pieces of the
wreckage, and the area over which they were dispersed?  All of these are
inconsistent with a simple loss of control ending in impact with the ground.
There seems to have been a mid-air explosion.

Alain          (514) 934 6320           UUCP: alain@elevia.UUCP


Re: Lauda Air Crash

David Lesher <wb8foz@mthvax.cs.miami.edu>
Mon, 3 Jun 91 18:16:15 -0400
>Herr Lauda said the
>flight data recorder was damaged and could not be used to analyse the crash.

RISK of improvement, if true.

The old FDRs used metal scribes on stainless ribbon.  They've been replaced by
digital data recorders using magnetic tape.  These offer more reliability; less
error (in at least one famous case, the NTSB put a recorder in a centrifuge to
explore the effects of high g-loads on the mechanical slop in the pen
linkages!); a larger number of channels, hence many more data points; perhaps
automatic reuse (as the voice recorder does); and likely lower cost.

But, according to friends in the airline industry, they also are far less
indestructible. Makes sense - no matter what you do, ferric oxide on a plastic
base melts at a lower temperature than stainless steel.  And that does not even
consider the Curie point of the oxide.

wb8foz@mthvax.cs.miami.edu


Re: AFTI/F-16 (John Rushby, RISKS-11.78)

A. Padgett Peterson <padgett%tccslr.dnet@uvs1.orl.mmc.com>
Mon, 3 Jun 91 16:34:16 -0400
Before launching into a discussion of John's posting, some background is
necessary.  Between 1979 and 1982, one of my assignments was on the AFTI-F16
program.  My prime task was to augment the "user interface" between the
designers and the FLCCs (FLight Control Computers): Bendix BDX-930, AMD
2901-based systems with custom microcode, a 4 MHz cycle, and 450 ns access
UVPROM memory.  Those specs seem prehistoric today, but they were what we had
to work with.

During the flight test phase, as John points out, we had some "glitches",
primarily from complex interactions that seem simple when explained but
had us covering hallways with brush recorder outputs trying to figure out
what happened. The telemetry and flight record capability available to
us was typically (as I recall) limited to about 16 channels for each FLCC.
Picking the correct software data points to monitor from in excess of 5000
locations was sometimes difficult.

Readers must remember that the AFTI-F16 was a technology demonstrator, and for
the era we were "pushing the envelope" on a very steep learning curve, not only
for the digital flight controls themselves but for the entire process of
designing digital flight controls.  "What if" discussions abounded, right down
to philosophical discussions of what the pilot should be permitted to do.  (At
the time the thinking seemed centered on a lowest-common-denominator approach
that (IMHO) resulted in the Iranian debacle of 1980.)  There were political as
well as practical problems to be solved.

I vividly remember one discussion concerning PLA (power lever angle: throttle)
authority. The P&W F100 engine had design limits (much like the red-line in a
car) and the thinking at the time said to design the controls so that this
point could not be exceeded. A few of us were of the opinion that this was a
combat aircraft and that a fighter pilot should have the authority to exceed
"design" limits if necessary to complete his mission. Warn him, but give him
the option even at the risk of destroying his own aircraft. In combat, the
rules MUST be different.

Today, it seems incredible that the opposing viewpoint existed, but it did and
was quite pervasive in some governmental circles. Then, we were the mavericks.

>It seems that redundancy management became the primary source of unreliability
>in the AFTI-F16 DFCS.

Cost constraints and paper studies decreed that we would try a triplex design
with hydromechanical back-up; in production, lessons learned on AFTI resulted
in a quadruplex system.  Trying to develop a dual-fail-operational
flight-critical system was not easy.
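
For readers unfamiliar with redundancy management, here is one classic
building block, a triplex mid-value select; this is a generic illustration of
the technique, not the actual AFTI voter.

    #include <stdio.h>

    /* Return the middle of three channel values: one hard-over failure is
       outvoted without having to identify which channel failed. */
    static double mid_value(double a, double b, double c)
    {
        if ((a >= b && a <= c) || (a <= b && a >= c)) return a;
        if ((b >= a && b <= c) || (b <= a && b >= c)) return b;
        return c;
    }

    int main(void)
    {
        /* channel 2 has failed hard-over; a sane value still passes through */
        printf("%g\n", mid_value(1.02, 57.0, 0.98));   /* prints 1.02 */
        return 0;
    }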

>...the unsynchronized individual computers may sample sensors
>at slightly different times, they can obtain readings that differ quite
>appreciably from one another...

Remember that I said "dual-fail operational". Synchronous operation would
have eliminated such latency, but a "first-fail" could include loss of
synchronization. Therefore asynchronous design was a level-1 decision.

Today, processing and sensor speeds have increased to the point where this
approach would not be a problem, but at 500 KIPS, cycle rates were under 100
per second, and at speeds above Mach 1.3 a lot can happen in a couple of
milliseconds.
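
A sketch with made-up numbers (not AFTI telemetry) shows how two healthy,
unsynchronized channels sampling the same fast-changing quantity a few
milliseconds apart can legitimately disagree by a wide margin:

    #include <stdio.h>

    int main(void)
    {
        double slew = 500.0;  /* rate of change of sensed value, units/s (assumed) */
        double skew = 0.005;  /* 5 ms sampling offset between channels (assumed) */

        /* if the cross-channel agreement threshold is tighter than this,
           two good channels can vote each other out */
        printf("healthy channels disagree by %.1f units\n", slew * skew);
        return 0;
    }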

>     An even more serious shortcoming of asynchronous systems arises when the
>control laws contain decision points.  Here, sensor noise and sampling skew may
>cause independent channels to take different paths at the decision points and
>to produce widely divergent outputs.  This occurred on Flight 44 of the
>AFTI-F16 flight tests [4, p. 44].  Each channel declared the others failed; the
>analog back-up was not selected because the simultaneous failure of two
>channels had not been anticipated and the aircraft was flown home on a single
>digital channel.

The pilot had a switch that let him select which computer(s) to use, and he
could override this digital decision if necessary.  Note that the aircraft
still had the hydromechanical back-up with "get home" capability.  This
condition HAD been anticipated.
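
The Flight 44 mechanism quoted above can be sketched in a few lines
(illustrative only, not flight code): noise and skew put two healthy channels
on opposite sides of a control-law decision point, their outputs diverge, and
each channel looks failed to the other.

    #include <stdio.h>

    #define DECISION_POINT 10.0   /* threshold in the control law (assumed) */
    #define AGREEMENT       5.0   /* max tolerated output difference (assumed) */

    static double control_law(double sensed)
    {
        /* the two branches command very different surface positions */
        return (sensed > DECISION_POINT) ? 25.0 : -25.0;
    }

    int main(void)
    {
        double a = control_law(9.9);    /* channel A: just below the threshold */
        double b = control_law(10.1);   /* channel B: noise/skew pushes it over */
        if (a - b > AGREEMENT || b - a > AGREEMENT)
            printf("outputs %g vs %g: each declares the other failed\n", a, b);
        return 0;
    }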

>     Another illustration is provided by a 3-second "departure" on Flight 36 of
>the AFTI-F16 flight tests, during which sideslip exceeded 20deg, normal
>acceleration exceeded first -4g, then +7g, angle of attack went to -10deg, then
>+20deg, the aircraft rolled 360deg, the vertical tail exceeded design load, all
>control surfaces were operating at rate limits, and failure indications were
>received from the hydraulics and canard actuators.

I do not have the records here, but I suspect this was one of the "find a
long hallway" ones.  It was probably the case where extremely high AOA in a
near-stall condition caused the envelope to be exceeded on the back side,
i.e., the plane was no longer flying and the control surfaces had little
effect.  My memory may be going, but I seem to recall one set of readings
that indicated near-zero airspeed with an AOA > 80 degrees.

>     The AFTI-F16 flight tests revealed numerous other problems of a similar
>nature.  Summarizing, Mackall [4, pp. 40-41] writes:

>"The criticality and number of anomalies discovered in flight and ground tests
>owing to design oversights are more significant than those anomalies caused by
>actual hardware failures or software errors..."

Easy words to say.  Remember, this was a full-authority, multiple-redundant
flight control system containing five modes of flight, with a computer that
could only address 32K of memory (the upper bit of the 16-bit address was used
to indicate an indirect operation).
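
The addressing trick works out like this (my reconstruction of the general
scheme, not BDX-930 microcode): with the top bit of a 16-bit address word
flagging an indirect operation, only 15 bits -- 32K words -- remain for the
address itself.

    #include <stdio.h>

    #define INDIRECT_BIT 0x8000u

    int main(void)
    {
        unsigned short word = 0x9234;                 /* example address word */
        int indirect      = (word & INDIRECT_BIT) != 0;
        unsigned int addr = word & 0x7FFFu;           /* 0..32767: the 32K limit */
        printf("addr=%u indirect=%d max=%u\n", addr, indirect, 0x7FFFu);
        return 0;
    }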

>"...qualification of such a complex system as this, to some given level of
>reliability, is difficult ...[because] the number of test conditions becomes so
>large that conventional testing methods would require a decade for completion.

In other words, the only real way to test it and learn where the mistakes were
was to strap in a pilot and wish him luck.  (Of course, thousands of hours in a
flight simulator connected to production hardware helped.)

>The fault-tolerant design can also affect overall system reliability by being
>made too complex and by adding characteristics which are random in nature,
>creating an untestable design.

Huh?  Nothing in a digital system is random, period.  Interactions may be
unanticipated, but not random.  Things were a bit more difficult before PCs,
though.

>2: However, the greater the benefit provided by DFCS, the less plausible it
>becomes to provide adequate back-up systems employing different technologies.
>For example, the DFCS of an experimental version of the F16 fighter (the
>"Advanced Fighter Technology Integration" or AFTI-F16) provides control in
>flight regimes beyond the capability of the simpler analog back-up system.
>Extending the capability of the back-up system to the full flight envelope of
>the DFCS would add considerably to its complexity--and it is the very
>simplicity of that analog system that is its chief source of credibility as a
>back-up system [2].

Doubletalk.  Sure, an analog system is going to have trouble with a Mach 1.3,
50-ft terrain-following mode (so are the pilot's kidneys).  What we found out
was that you can make a plane do things with a DFCS (digital flight control
system) that are impossible with an analog system.  In TF, if you have a
failure, the back-up does not try to maintain that condition; instead a fly-up
is initiated and the aircraft returns to a maintainable mode.

>     The danger of wide sensor selection thresholds is dramatically illustrated
>by a problem discovered in the X29A. ... It was subsequently discovered that
>if the nose probe failed to zero at low speed, it would still be within the
>threshold of correct readings...

At least we did not have this problem: on AFTI valid sensor ranges were
confined so that any sensor reading zero or full scale was automatically
declared failed.
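
That rule is simple enough to sketch directly (the range below is
illustrative, not an actual AFTI sensor spec):

    #include <stdio.h>

    /* A valid reading must lie strictly inside the confined range; a sensor
       pegged at zero or full scale is automatically declared failed. */
    static int sensor_ok(double reading, double lo, double hi)
    {
        return reading > lo && reading < hi;
    }

    int main(void)
    {
        double lo = 0.0, hi = 90.0;        /* assumed full-scale sensor range */
        printf("%d %d %d\n", sensor_ok(12.5, lo, hi),   /* 1: in range */
                             sensor_ok( 0.0, lo, hi),   /* 0: zero -> failed */
                             sensor_ok(90.0, lo, hi));  /* 0: full scale -> failed */
        return 0;
    }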

All in all, I thought that AFTI was pretty successful and led to the PDFCS
(Production Digital Flight Control System) program.  We made mistakes and
learned from them.  If anything, most thresholds were set too conservatively,
so failures were declared that did not need to be, but we were kind of
cautious in those days.  Probably the riskiest thing was to bet that
technology would allow us to replace that 450 ns memory with 250 ns units for
a needed throughput improvement -- three manufacturers had announced them, but
no one had shipped any when we froze the design.

In any event, first flight was almost exactly ten years ago, and the most
significant event was that the chase planes had to double their normal
clear-space distances, since the AFTI-F16 could translate horizontally and
vertically without any warning; it was just suddenly someplace else.
                                             Padgett
