The Risks Digest

The RISKS Digest

Forum on Risks to the Public in Computers and Related Systems

ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator

Volume 7 Issue 7

Friday 10 June 1988

Contents

o Accidental breach of software security
Martin Minow
o "Sewage flows into river; Computer Failure Blamed"
Randal L. Schwartz
o Canadian Public Service warned against SINing
John Coughlin
o Betting network crash in Australia
George Michaelson
o John Pershing on ATMs
David Thomasson
o A typo in "UK Poly; another root typo"
Matt Bishop
o Re: The Challenger and visionary software architects
Eugene Miya
o COMPASS '88 CONTACT
Frank Houston
o Info on RISKS (comp.risks)

Accidental breach of software security

Martin Minow THUNDR::MINOW ML3-5/U26 223-9922 <minow%thundr.DEC@decwrl.dec.com>
9 Jun 88 16:08
From Electronic Engineering Times, June 6, 1988, condensed and abstracted.

    Unauthorized access to software cited
           Shuttle security lapse
         by Richard Doherty

An accidental breach of software security at Rockwell International Corp.
gave unauthorized engineers and programmers access to raw code being prepared
for the space shuttle Discovery's return to space, now slated for Aug. 22.
The six-month lapse in security resulted from "an unintentional keyboard
error," Last November, the most recent embodiment of "Operational Increment"
shuttle software (revision OI8-A) was somehow stripped of its normal RCAF
[Resource Access Control Facility] protection... [IBM] Engineers discovered
on Nov. 22, 1987, that they were able to modify the code while in the
supposedly restrictive "browse" mode.  An IBM supervisor then immediately
notified NASA, and by afternoon's end, a weekend-long bit-by-bit comparison
was initiated.

NASA says that in all, 16 changes were made.  The agency still does not know by
whom or when.  Because multiple copies of the OI8-A software existed at that
time, NASA said there is no chance the security breach could endanger the
mission.

NASA conducted a six-month long inquiry into how the event happened.
As a result, the agency said it now knows the normal protection was
somehow stripped off during a change in editing modes (between integer
and block) and who the operator was at the time.

But NASA said it still doesn't know how many accesses were made and by
whom over a six-month period.  Nor has it discovered why it took six
months to notice the code was open to editing.

Shuttlegate?

The acknowledgement by NASA and Rockwell engineers came in the wake of
a $5.2 million lawsuit field by shuttle engineers who were dismissed
after expressing concern about overall OI8-A software vulnerability,
among many other problems, to Rockwell management last fall.

The engineers, Sylvia Robins and Ria Solomon, first took their concerns
to the company's ombudsman, then to the NASA Inspector General and,
eventually, to the White House.

Finally, seeing no action taken on their concerns regarding accelerated
software verification audits and a steady lack of secure data access,
Robins and Solomon took action by filing a lawsuit here [Houston] against
Rockwell and its prime software contractor, Unisys, last fall.  That action
triggered widespread denial by Rockwell of the lawsuit's many safety concerns,
including the charge of lax shuttle programming security that was finally
acknowledged last month.

[There's more, but this is getting long.  Any idea what "between integer
and block" modes mean?]                           Martin


"Sewage flows into river; Computer Failure Blamed"

Randal L. Schwartz <mipos3!merlyn%intelob.Berkeley.EDU@ucbvax.Berkeley.EDU>
Tue, 07 Jun 88 10:34:13 PDT
Page one (below the fold) of the 6/7/88 morning edition of The
Oregonian (Portland, Oregon) reports:

[BEGINNING OF ARTICLE]
"Sewage flows into river; computer failure blamed"

* The five-hour spill from the Sullivan Pump Station poured about 5.4
million gallons into the Willamette River downtown

A computer failure caused about 5.4 million gallons of raw sewage to
spill into the Willamette River in downtown Portland early Monday,
prompting state officials to warn against recreational use of the river
through Tuesday morning.

"That's a major spill," said Shirley Kengla, spokeswoman for the Oregon
Department of Environmental Quality.

[What a quote!  We Oregonians now know what a major spill is...
sheesh.]

The spill began about 3 a.m. when a computer failure at the Sullivan
Pump station caused sewage to flow directly into the river [...]. The
spill was stopped by 8 a.m. [...].

Peterson [operations director] said the computer failure was caused by
a loss of electrical power, shutting down the pumps at the pump
station.

He said the computer system was designed so that the operators could
override computer commands at the station, but they discovered Monday
they could only do that when the computer had electrical power [!!!].
Monday's loss of power prevented that.

[... more stuff about volume and repair efforts ...]

Kengla said the DEQ is investigating the spill.

"Because it was a computer failure, it was an accident [!!!] and for
that reason, we're just looking toward making sure it doesn't happen
again," she said.

[... more stuff about location of spill ...]

Contact with the water could cause sickness with severe flu-like
symptoms.

In June 1985, another computer failure caused the dumping of more
than 3 million gallons of raw sewage into the Willamette from the
same pump station.  Kengla said that state and city officials had
been working to prevent a recurrence of the problem.

"We thought we had done enough, but obviously we hadn't," she said.
[Brilliant quotes!]

[... stuff about another tiny spill last year, not computer-related...]
[END OF ARTICLE]

(1) Why did they have a computer-controlled system that depended on
    electrical power to operate with no backups?

(2) Given that they already had a failure before, how could they have
    failed to design a failsafe system?

(3) The statements made by the spokeswoman, like "because it was a computer
    failure, it was an accident..." make me wonder.  Why should an accident
    be so obviously required to happen after a computer failure?  Why
    should we not *plan* for the system to break?  These guys obviously
    didn't.

Sheesh.  And during Rose Festival too!

-- Randal L. Schwartz, Stonehenge Consulting Services (503)626-6907
on contract to Intel Technical Publications: merlyn@omepd.intel.com (for now)

      [Moderator's note: This article was also reported by  
      Andrew Klossner <andrew%frip.gwd.tek.com@RELAY.CS.NET> and by
      Richared {?} England, Mentor Graphics Corp., Customer Services, 
        Technical Support, Beaverton, Oregon, pdx.MENTOR!rengland@uiucdcs.
      The abridged version from Randal was chosen for brevity, but the
      following comments from R. England are also worth including:

          [This case] points out, once again, the need for total design
          consideration.  A truly fail-safe system would include manual
          overrides instead of requiring all control signals to originate
          from the automatic control system.  Conversation with the writer
          lead me to believe that there was a wide-spread power loss which
          may have precluded operation of the pumps as well, in which case
          nothing would have helped.  Note, however, that the computer gets
          the blame.   [R. England]


Canadian Public Service warned against SINing

John Coughlin <JC%CARLETON.BITNET@CORNELLC.CCS.CORNELL.EDU>
10 Jun 88 10:25:00 EDT
   The  following article appeared in the Ottawa Citizen,  Friday June  10,
and is reproduced without permission.

===========================================================================

PS warned against SINing

By Iain Hunter
Citizen staff writer

   Treasury Board President Pat Carney has ordered bureaucrats who run hundreds
of federal programs to SIN no more.  And Justice Minister Ray Hnatyshyn has
warned that the government may bring in laws to regulate the collection and use
of the social insurance number if other levels of government and the private
sector don't follow the federal example.

   Carney announced Thursday that the use of the individual identity number
will be phased out over five years in several institutions to reduce the risks
of invasion of [...] privacy.  Carney said the use of the SIN will be
restricted mainly to the administration of tax, pension and social benefit
programs.  Any new collection or use of the SIN beyond those covered under the
new policy will have to be approved by Parliament.

   A Treasury Board spokesman said "hundreds" of programs in which the number
is now used will be affected by the new restrictions.  The cost of phasing out
the use of SIN is estimated at $16 million.

   Privacy Commissioner John Grace called Carney's announcement heartwarming
and said it puts the federal government "in a much stronger moral position to
preach controls on the use of SIN to other governments and the private sector."


Betting network crash in Australia

George Michaelson <munnari!ditmela.oz.au!G.Michaelson@uunet.UU.NET>
Thu, 09 Jun 88 09:40:23 +1000
Earlier this week (Monday) the T.A.B suffered a major loss of a
computerised betting control system which caused much anguish to punters
across the state of Victora.  I phoned their computer manager & got a
loose description of the problem, he was understandably reticent since
the network manages many millions of dollars of bets daily, and there
are public relations issues aside from the security ones.

    Brief Description of TAB & Gambling:
    ++++++++++++++++++++++++++++++++++++

Gambling, the 4th most ancient vice after programming sex, religion and
politics is endemic in Australia.  It is also very tightly controlled by
the state, concentrating in several HUGE casinos with banks of "pokeys"
or one-armed bandits, and of course the ponies. [remember phar lap anyone?]

Apart from pre-computed instant win cards and lotterys the only [legal]
form of betting off the racetrack in Australia is run by the T.A.B.  or
Totalizer Agency Board.

They run a "totalizer" scheme, where the pool of cash placed in any
given race is split into the dividend, so that the winnings are more
"socially" adjusted, and also not simply calculated from starting price
("SP" bookmakers thrive in the better drinking establishments and police
stations, according to popular myth & corruption tribunals) however
there is an element of feedback, in that offered "odds" are obviously
adjusted according to the poolsize as well as traditional form, much as
in normal betting.  the "divi" depends on the size of the pool and is
post-close-of-betting calculated, although a running total could be
available (I don't know for sure, I'm not a gambling man you
understand...)

Each state has a transaction processing network which underpins this
exercise, so that the pool/dividend holds across the entire state for
each race.

    The Problem:
    ++++++++++++

The Victorian T.A.B.  uses a tightly coupled "cluster" of 5 nodes
handling the incoming transactions, and a central node that acts as a
controller and calculates the divi.  Some form of shared memory is used
to do all this.

At some stage in the day the central node noticed an inconsistency in 2
separate pools, across two of the nodes.  They were disconnected,
possibly automatically.  Afterwards, the entire network was taken down
and the betting calculated manually to prevent any data inconsistency
being propagated out into the real world.  -From the "feedback"-y nature
of the tote algorithm it's plausible that had this not been done, many
many thousands of punters would have been rooked of their winnings..
actually perhaps they would have made too much money as well!

-The problem was verbally ascribed to software, but no details are
available.  According to radio reports over $1M was lost due to the much
less efficient manual methods.  Punters were furious.


I wanted to end with some pun about shutting the sTABle door after the
horse has bolted, but I can't think of a good punchline.

        George Michaelson, CSIRO Division of Information Technology

ACSnet: G.Michaelson@ditmela.oz                      Phone: +61 3 347 8644
Postal: CSIRO, 55 Barry St, Carlton, Vic 3053 Oz       Fax: +61 3 347 8987


John Pershing on ATMs

David Thomasson <ST401405%BROWNVM.BITNET@MITVMA.MIT.EDU>
Thu, 09 Jun 88 09:38:27 EDT
A tip of the hat to John Pershing (pershing@ibm.com) for his reply
concerning risks of ATM's (RISKS 7.6). This piece is truly a standout in
that it is (a) not anecdotal and (b) not hysterical in tone. I commend it to
others as a model for crafting their arguments and replies.


A typo in "UK Poly; another root typo" (Re: RISKS-7.6)

Matt Bishop <bishop@bear.Dartmouth.EDU>
Thu, 9 Jun 88 09:01:57 EDT
   Error in my last letter: he typed "halt", not "exit"; "halt" halts the CPU
and "exit" takes you out of superuser mode.  Sorry for the boo-boo.    Matt


Re: The Challenger and visionary software architects

Eugene Miya <eugene@amelia.nas.nasa.gov>
Thu, 09 Jun 88 16:16:29 PDT
>From: stork@humu.nosc.mil (Kent Stork)
>The May issue of Defense Science validates something that many computer
>scientists have probably suspected: ultimately, the failure of the Challenger
>and the death of the astronauts was due to a control loop software design
>oversight - just another bug.

I have grabbed the May issue of Defense Science (an oxymoron):

%A Yale Jay Lubkin
%T The Challenger Disaster
%J Defense Science
%V 7
%N 5
%D May 1988
%P 13-18 (actually only 3 pages)

This article neither validates nor mentions software, bugs, or computers.
There are two sets of problems 1) Kent's comment, and 2) the article itself.
The article is basically a summary of different causal hypotheses: the
cold joint and the AW&ST puncture (Abu-Taha) hypotheses.  There is a good
tone of conspiracy in the paper, but I am not in a position to side in
defense of Washington bureaucrats or Lubkin.

The closest comment to "control loop software design" is:
    "It was not the leak that killed the astronauts.  It was the
    attempt to correct the sidethrust, which sent the Challenger
    into violent oscillations.
There are probably some difficulties in asserting "cause."  Further:
    "If the Challenger had been permited to go off course,
    without attempting the major correction, the side booster would
    not have broken out, the booster would have burnt out with the
    Callenger still intact, and the crew could have ejected, off
    course but aslive."
This latter is probably false, but we will not really know.  I don't usually
trust words like "ultimately."

Researching this article did have a positive side benefit while in the library,
I found a good 5 page obit to Richard Feynman in Engineering and Science
[Caltech's Alumni magazine].

There is a very "hardware must fly" orientation in NASA, and you should
understand software is not well understood in the Agency.  
This was documented by internal reports [Sagan et al. 1980].
Any disaster will have underlying PHYSICAL causes sought.

>To what extent must software architects be visionaries? Certainly the
>requirements are proportional to the power of the system the software controls.
>Do such visionaries exist to design our Challengers and our other more
>aggresive weapons?

I take some offense at you calling the Challenger a weapon, and I know
several thousand others who would, too.  I would not call any of the software
people working on the Shuttle "visionaries," there is a degree of
salesmanship, however.  Your codes have to withstand many walk throughs
(or walk overs depending on your prespective 8-). Don't think the control
problem scales linearly, it doesn't, typically O(n^2) and greater.

--eugene miya
  NASA Ames (not speaking directly for the Agency)


COMPASS '88 CONTACT

Frank Houston <houston@nrl-csr.arpa>
Fri, 10 Jun 88 16:51:28 edt
Regarding the COMPASS '88 advance program and registration info.

Some people have had trouble contacting me by email and I, in turn, have
had some difficulty getting the nrl-csr mailer to recognize return addresses.
I recommend including a surface mail address with requests for registration
info.  For those who could not reach me by net, my telephone number is
(301)443-5020.  Registration is still open.

Frank Houston (I really exist @nrl-csr.arpa)


    [Note, not on the foregoing, but on the following: 
    I have had a request to add the issue number on the standard trailer,
    as you see I have done.  That seems like a fine idea for people who tend
    to string RISKS issues together.  However, this may defeat the 
    UNDIGESTIFIER programs if they look for the precise trailer.  Please
    let me know if this causes any troubles.  PGN]

Please report problems with the web pages to the maintainer

Top