The RISKS Digest

Forum on Risks to the Public in Computers and Related Systems

ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator

Volume 14 Issue 22

Monday 4 January 1993

Contents

o Things that cannot possibly go wrong
Pete Mellor
o DISA yaks to FCC on PCS
Paul Robinson
o Re: Dutch chemical plant explodes
Nancy Leveson
Meine van der Meulen
o Re: Antiviral company target of legal action
Aryeh Goretsky
o Microprocessor design faults
Brian A Wichmann
o Call for Papers, 1993 National Computer Security Conference
Jack Holleran
o Info on RISKS (comp.risks)

Things that cannot possibly go wrong

Pete Mellor <pm@cs.city.ac.uk>
Mon, 4 Jan 93 13:50:17 GMT
The following extract from Douglas Adams' latest book* contains a lesson
for designers of complex systems, particularly computerised ones (e.g.,
fly-by-wire):

     ... all mechanical or electrical or quantum-mechanical or hydraulic or
     even wind, steam or piston-driven devices, are now required to
     have a certain legend emblazoned on them somewhere. It doesn't matter
     how small the object is, the designers of the object have got to find
     a way of squeezing the legend in somewhere, because it is their attention
     which is being drawn to it rather than necessarily that of the user's.

     The legend is this:

     `The major difference between a thing that might go wrong and a thing
      that cannot possibly go wrong is that when a thing that cannot possibly
      go wrong goes wrong it usually turns out to be impossible to get at or
      repair.'

* "Mostly Harmless" (The fifth book in the increasingly inaccurately named
  "Hitch Hiker's Guide to the Galaxy" trilogy) by Douglas Adams, Heinemann,
  London, 1992, ISBN 0434 00926 1

Peter Mellor, Centre for Software Reliability, City University, Northampton
Sq., London EC1V 0HB, Tel: +44(0)71-477-8422, JANET: p.mellor@city.ac.uk


[TDR] DISA yaks to FCC on PCS

"Paul Robinson, Contractor" <FZC@CU.NIH.GOV>
Mon, 04 Jan 1993 18:41:26 EST
"DISA yaks to FCC on PCS"

Article Summary
Government Computer News, January 4, 1993, Page 38

This is a summary of an article about a technology you've probably never seen,
complained about by an agency you've probably never heard of.

In an article titled "Defense agency wants PCS voice services in public
domain", author S. A. Marud tells how the Defense Information Systems Agency
(DISA) has jumped into the Federal Communications Commission (FCC) inquiry
into the standards to be set on the operation of the startup Personal
Communications Services (PCS) industry.

PCS is a wireless digital technology which operates at 2 gigahertz.  Cellular
is analog.  Also, one advantage of the service is that a number can be
assigned to a person, not to a telephone.

Two groups in DISA, the Federal Wireless Services User Forum (FWSUF) and the
Interagency Cellular Radio Working Group (ICRWG) were the impetus for filing
comments.  They want to be certain that PCS supports at least Group 3 / Group
4 Fax, paging, images, and voice and data encrypted with an STU-III device;
i.e., a Group 3 fax modem should work the same whether it's plugged into a
wall jack or a PCS phone.  PCS should also support dialing "0" for Operator
and 911 for Emergency.  ICRWG wants there to be two nationwide carriers for
PCS, or in the alternative, at least one frequency block reserved nationally
to one carrier and the rest awarded to local carriers.

DISA's concerns on National Security and Emergency Preparedness make it want
certain basic services (such as area code 710?) to be part of the new system,
and at least voice services to be available through the public switched
(read local telco, AT&T, FTS-2000, MCI etc.) network.  The systems should be
made to be interoperable (meaning the phone you use in Dallas should also work
in Kansas City, Chicago, New York and Los Angeles), either from the start or
soon after some industry standards can be developed.  DISA would also prefer
that PCS licenses be issued for large areas if no nationwide carrier(s) are
authorized.

DISA is worried that PCS providers may be declared to be "private carriers",
which means that the government cannot mandate that they be part of the
Telecommunications Service Priority (TSP) system, which allows the government
to seize telephone lines.
TSP was invoked by the federal government for more than 4000 circuits and
services during Hurricane Andrew.

Certain industry groups are watching the rulemaking process on PCS, including
the Wireless Information Network Forum (which represents computer and
communications companies including Apple, AT&T & IBM) and the Cellular
Telecommunications Industry Association (CTIA) (guess who they represent).
CTIA is worried that the FCC might decide that a PCS license won't be issued
to a cellular operator in the same area.

A decision on how the PCS industry is to be structured is expected from the
FCC sometime in Fall 1993.

Paul Robinson -- TDARCOS@MCIMAIL.COM  These opinions are mine alone.


Re: Dutch chemical plant explodes (RISKS-14.20)

Nancy Leveson <nancy@murphy.ICS.UCI.EDU>
Thu, 31 Dec 92 13:28:39 -0800
It is this type of oversimplified explanation that encourages
misunderstandings about accidents and how to prevent them and maybe leads to
more and unnecessary accidents in the future.

     The Dutch news said that the responsible person has been found and he
     will be charged with negligible conduct causing death.

I certainly hope it is not the poor schnook that was typing and made a
perfectly predictable and inevitable error.  We should not be blaming
accidents on people who do human things and make errors that are inevitable.
Who are they going to put in jail?  The programmer who wrote code that allowed
such a predictable input error to cause a dangerous output?  The person who
wrote the requirements and neglected this?  The chemical engineers who built a
plant that could blow up with one error like that?  The managers who allowed
all this to happen?  What about the regulatory authorities who gave a license
for such a dangerous design?

Actually, everything I've said above could be COMPLETELY WRONG!  Without a
complete investigation, nobody should be talking about the "cause" of an
accident.  Were there interlocks?  If so, why didn't they work?  If not, why
weren't there any?  etc. etc.  There are hundreds of factors that we know
nothing about that could have been the real causes of the accident.

The Bhopal explosion was blamed by Union Carbide on a maintenance error.  But
a complete investigation turned up hundreds of factors that were involved,
most of which go back to poor management.  The maintenance error was probably
the least important (if it actually ever happened, nobody knows, it's just the
explanation that Union Carbide management put forth).  An accident was
inevitable at Bhopal because of all the other factors.  Something else just
would have been the proximate cause, and it sounds like an accident was
inevitable in this case too.  The chemical industry can ignore all the lessons
from Bhopal because it was a "maintenance error."  We are probably going to
hear for years now about the chemical plant that blew up because of a typing
error or computer error (when it may not have had anything very significant to
do with the accident).

The law likes to simplify causes down to one simple event.  But we, as
scientists and engineers, should require more than this.

Nancy
         [Similar messages were received from many others, including
         PINE_RIDGE@ORVB.SAIC.COM (Brad Dolan) and horning@src.dec.com
         (Jim Horning), both of whom gagged on NEGLIGIBLE/NEGLIGENT,
         ergo@netcom.com (Isaac Rabinovitch), who wondered why the designer
         was not held responsible, desj@ccr-p.ida.org (David desJardins),
         who distinguished blame and fault and discussed the range of
         implications...  The following message gives some details.  PGN]


large accident at CINDU plant in The Netherlands (additional info)

<MEULEN@tno.nl>
Mon, 4 Jan 93 16:21 MET
In RISKS-14.20 a contribution on the accident at the chemical factory Cindu
appeared. The information provided was sparse and slightly erroneous. As we
were involved in the accident examination we have more detailed information
which I give here:

FACTS, Database for Industrial Safety Acc.#: 11057 Extended abstract
Country: NL    Date : 1992 0708

At a chemical factory a heavy explosion occurred which caused the death of 3
firemen of the works fire brigade and injured 11 workers, including 4 firemen
of the works fire brigade.  The damage was estimated at several tens of
millions of NL guilders. There was severe material damage.  Fragments were
found at a distance of 1 km.

The accident started with a typing error in a prescription by a laboratory
worker.  Instead of tank 632 he typed tank 634.  Tank 632 held resin feed
classic (UN-1268), which is normally used in the batch process.  Tank 634
held DCPD (dicyclopentadiene).  The operator, who had to check that the tank
contents matched the prescription, filled the reactor with the wrong
chemicals.  The batch process started with steam heating via the coil in the
reactor. When the temperature rose, the operator first tried to cool the
reactor with more water from the water mains, and later the works fire
brigade was alerted to cool the reactor.
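
The mix-up described above (tank 634 typed where tank 632 was meant) is the
kind of input error that an automatic cross-check between the prescription
and a tank inventory could flag before a batch is started. The following is
a minimal Python sketch under stated assumptions: the inventory table, the
Prescription record and the function name are all hypothetical, and nothing
here describes the system actually in use at the plant.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical inventory: tank number -> material currently stored.
# The two tank numbers and materials come from the abstract; the
# data structure itself is an assumption made for illustration.
TANK_CONTENTS: Dict[int, str] = {
    632: "resin feed classic (UN-1268)",
    634: "dicyclopentadiene",
}


@dataclass(frozen=True)
class Prescription:
    batch_id: str           # hypothetical identifier
    tank: int               # tank number as typed by the laboratory worker
    expected_material: str  # material the recipe calls for


def check_prescription(p: Prescription, inventory: Dict[int, str]) -> List[str]:
    """Return a list of problems; an empty list means the check passed."""
    problems: List[str] = []
    actual = inventory.get(p.tank)
    if actual is None:
        problems.append(f"tank {p.tank} is not in the inventory")
    elif actual != p.expected_material:
        problems.append(
            f"tank {p.tank} holds {actual!r}, "
            f"but the recipe calls for {p.expected_material!r}"
        )
    return problems
```

A prescription naming tank 634 while the recipe calls for resin feed would
come back with one problem instead of being passed to the operator. Such a
check is only one barrier among several; it does not replace interlocks or
an independent review.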

An administrator, who checked the prescription every morning, found the error
and tried to contact the operator, but it was too late.  Because the works
fire brigade expected that the contents of the reactor would be released via
the safety valve and the bursting disc, they were connecting deluge guns to
prevent spreading of the expected fire. The firemen did not wear the
prescribed personal safety equipment, such as gloves and breathing
apparatus, because they expected to do a relatively easy job.

After chemicals were released via the safety valve and the bursting disc,
the reactor ruptured several seconds later, its contents were released, and
an explosion followed.

The local fire brigade was alerted and, together with the works fire brigade,
tried to prevent the fire from spreading to the other installations, such as
cylinders filled with boron trifluoride.  To prevent enormous damage to the
environment due to polluted fire fighting water, it was decided to let the
fire burn out by itself.

[This information is compiled by TNO with the greatest care from qualified
source documents. TNO cannot accept responsibility for any inaccuracy.
Meine van der Meulen (meulen@tno.nl), The Netherlands Organization for
Applied Scientific Research TNO, Department of Industrial Safety,
Apeldoorn, The Netherlands Phone: +31 55 493493]


Re: Antiviral company target of legal action

McAfee Associates <mcafee@netcom.com>
Sun, 3 Jan 93 23:45:49 -0800
RISKS Vol. 14, Issue 20 paraphrased a Washington Post article that appeared in
the San Francisco Chronicle about the temporary restraining order Imageline,
Inc. of Richmond, VA has been granted against McAfee Associates.

McAfee Associates believes the suit to be without merit and will vigorously
defend ourselves against it.  Allow me to share some pertinent facts with you:

1.      The temporary restraining order [TRO] applies to our retail
        product, PRO-SCAN, and then only to a particular version of
        it.  This particular version has not been shipped since October
        1991, and has less than 100 registered users.

2.      PRO-SCAN accounts for approximately 6% of our sales and less
        than 2% of our licensing activity.  It is essentially provided
        as a convenience for people who do not wish to use a modem to
        download our shareware products.

3.      The particular version of PRO-SCAN employs different
        technology than our shareware VIRUSCAN series, i.e., they
        are separate programs.  The TRO in no way affects the main
        product line, which is distributed electronically.

In summary, while we intend to vigorously defend ourselves, this litigation is
unlikely to have any impact on our overall business.

Aryeh Goretsky, Manager, Technical Support Department, McAfee Associates, Inc.
3350 Scott Blvd, Bldg 14, Santa Clara, CA 95054-3107  1-408-988-3832
FAX 1-408-970-9727  mcafee@netcom.COM  CompuServe ID: 76702,1714


Microprocessor design faults

Brian A Wichmann <baw@seg.npl.co.uk>
Wed, 16 Dec 92 17:07:59 GMT
Microprocessor design faults
B A Wichmann
National Physical Laboratory, Teddington, Middlesex, TW11 0LW, UK

Introduction

Modern microprocessors are very reliable. Generally, we take this for
granted and we are therefore not concerned about the possibility of
a chip having a design fault. However, in some applications, an error
could have serious consequences, so that all reasonable precautions must
be taken against such potential errors.

This note makes some proposals which would allow users of critical
applications to protect themselves against such problems in a reasonable
manner.

The problem

Modern microprocessor chips are getting very complex indeed. The current gate
count can exceed 2.5 million. One must therefore expect that new versions of
such chips will contain logical bugs. A common form of bug is in the
microcode, but since the distinction between a microcode fault and another
form of design bug is difficult to define, the distinction is not made here.
We are *not* concerned with fabrication faults.

The price/performance improvements have also been dramatic, which has been
encouraged in a highly competitive market. Of the three attributes,
performance, price and reliability, the issue of reliability comes third for
most users. Hence the commercial market is not in the business of producing
chips without design faults.

Several research projects have been undertaken or proposed to produce a design
which can be formally verified mathematically \cite{viper,veri-micro}.
Unfortunately, it is very difficult for industry to use chips other than those
to commercial designs, due to the investment in compilers and other tools.
Hence it is much more advantageous to provide commercially designed chips with
as high a reliability as is feasible.

Chip suppliers are naturally concerned about the use of their products in
applications which are critical for fear that any error could result in claims
for damages. Also, open reporting of bugs is not welcome, since it could be
potentially damaging to their market share unless it was required of all
suppliers. In consequence, suppliers do not freely provide information on bugs,
or even allow the user to decode the external marking on the chip to discover
the mask version used. Attempts to report bugs openly have not been successful
\cite{micro-rpt}.

A consequence of the above is that it is very difficult for users undertaking a
critical application to protect themselves against a potential design bug. One
approach that has been tried with one project is to use identical chips from
the same mask so that rig and development testing will extrapolate to the
final system. In some cases, the suppliers have provided information under a
non-disclosure agreement, but this seems to be restricted to major projects.

In contrast, quite a few software vendors have an open bug reporting scheme
--- and almost all provide a version number to the user. Hence it appears
that in this area, software is in `advance' of hardware.

Some information

Over a period of three years, I have collected examples of design errors
in chips from several different sources. In August 1992, I posted a message
on Comp.risks (a bulletin board moderated by Peter Neumann), requesting
other examples. Unfortunately, there are problems publishing this information
in its entirety as follows:

 * Some of the information comes from sources which have probably signed
non-disclosure agreements and hence they have asked for the information not
to be published;

 * It would be difficult (and expensive) for me to trace all my sources
to ask permission to publish;

 * Much of the information does not contain some details which could
result in it being misleading --- perhaps the bug only applies to very
early releases of a chip;

 * It is clear that the information I have is not comprehensive.

Hence I have decided to extract from this information some useful points
rather than attempt to publish it as fully as possible.

The key issues extracted are as follows:

 * Early chips are unreliable:

There have been some dramatic errors in very early releases of chips.

 * Rarely used instructions are unreliable:

One report sent to me stated that some instructions not generated by
the `C' compiler were completely wrong. Another report noted that
special instructions for 64-bit integers did not work, and when this was
reported, the supplier merely removed them from the documentation!

 * Undocumented instructions are unreliable:

Obviously, such instructions must be regarded with suspicion.

 * Exceptional case handling is unreliable:

A classic instance of this problem is an error which has been reported
to me several times of the jump instructions on the 6502. When such an
instruction straddled a page boundary, it did not work correctly. This
issue potentially gives the user most cause for concern, since it may
be very difficult to avoid the issue. For instance, with machine
generated code from a compiler, the above problem with the 6502 would
be impossible to avoid.
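
The reports summarised above do not identify the exact fault. One widely
documented behaviour of the original NMOS 6502 that fits the description is
in the indirect jump, JMP (addr): when the two-byte pointer starts at the
last byte of a page, the high byte of the target is fetched from the start
of the same page rather than from the next page. The Python sketch below
models only that address calculation; the function names and the example
memory contents are illustrative assumptions, not taken from any report.

```python
def read_jump_vector(memory: bytearray, pointer: int, nmos_bug: bool = True) -> int:
    """Return the 16-bit target that JMP (pointer) would load."""
    lo = memory[pointer]
    if nmos_bug:
        # Only the low byte of the pointer is incremented, so the high byte
        # of the target comes from the start of the same 256-byte page.
        hi_addr = (pointer & 0xFF00) | ((pointer + 1) & 0x00FF)
    else:
        # Behaviour a programmer would expect: a plain 16-bit increment.
        hi_addr = (pointer + 1) & 0xFFFF
    hi = memory[hi_addr]
    return (hi << 8) | lo


def vector_straddles_page(pointer: int) -> bool:
    """True if a two-byte vector at `pointer` crosses a page boundary."""
    return (pointer & 0x00FF) == 0xFF


def example_memory() -> bytearray:
    """64 KiB of zeroed memory with a vector placed at $30FF/$3100."""
    memory = bytearray(0x10000)
    memory[0x30FF] = 0x80  # low byte of the intended target $5080
    memory[0x3100] = 0x50  # high byte of the intended target
    memory[0x3000] = 0x40  # byte that the faulty fetch picks up instead
    return memory
```

With this example memory, read_jump_vector(example_memory(), 0x30FF) gives
0x4080, while passing nmos_bug=False gives the intended 0x5080. A check such
as vector_straddles_page can be run over a linker map to flag vectors placed
at $xxFF, which is one way to work round the fault when the code generator
cannot be changed.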

Hence it would appear that the reliability growth models which have been
applied to large software systems apply equally to complex chips. This
appears to imply that the chips on the market represent the best that the
supplier thinks that the market requires, rather than one which has either
every known bug removed or one which has been shown correct by formal or
informal reasoning.

Conservative system design should therefore use `well-established' chips
and avoid rarely used or undocumented instructions. Much of this is
conventional wisdom.
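
One way to enforce the advice on rarely used or undocumented instructions is
to scan the assembler listing of the final build against an allow-list. The
following is a minimal Python sketch, assuming a simple listing format of
one "MNEMONIC operand ; comment" per line; the allow-list contents and the
function name are hypothetical, and a real project would derive its list
from its own compiler and test evidence.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical allow-list: mnemonics the project has decided to trust,
# for example because the compiler emits them routinely.
TRUSTED_MNEMONICS: Set[str] = {
    "LDA", "STA", "ADC", "SBC", "JMP", "JSR", "RTS", "BNE", "BEQ",
}


def find_untrusted(
    listing: Iterable[str], trusted: Set[str] = TRUSTED_MNEMONICS
) -> List[Tuple[int, str]]:
    """Return (line_number, mnemonic) for each instruction not in `trusted`.

    Blank lines and comment-only lines (starting with ';') are skipped.
    """
    findings: List[Tuple[int, str]] = []
    for number, line in enumerate(listing, start=1):
        code = line.split(";", 1)[0].strip()  # drop any trailing comment
        if not code:
            continue
        mnemonic = code.split()[0].upper()
        if mnemonic not in trusted:
            findings.append((number, mnemonic))
    return findings
```

Any finding is a prompt for review rather than proof of a fault: the
instruction may be perfectly sound, but it falls outside the set that has
been exercised heavily.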

The key issue is the extent to which chips which pass the above criteria
could be expected to be fault-free (in operation). Just one example
reported to me shows that we cannot expect too much. A compiler vendor
had a bug reported which the supplier of the software had some difficulty
in tracing. Eventually, it was found that the chip in question microcoded
the integer divide instruction by making it interruptible. Unfortunately,
the status of the registers was not preserved correctly after the interrupt.
Clearly, a bug of that type could go undetected for years and yet cause
the system to fail tomorrow.
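
A fault of this kind cannot be designed out by the user, but its effect can
sometimes be detected at run time by checking the result of the suspect
operation. The following Python sketch shows a conventional result check for
integer division. It is not proposed in this note and is given only as an
illustration; the names are hypothetical, and the check itself relies on the
multiply and compare operations being correct, so it reduces the exposure
rather than removing it.

```python
from typing import Callable, Tuple


class DivisionCheckError(ArithmeticError):
    """Raised when a quotient/remainder pair fails the identity check."""


def checked_divmod(
    n: int,
    d: int,
    divide: Callable[[int, int], Tuple[int, int]] = divmod,
) -> Tuple[int, int]:
    """Divide n by d, then verify n == q*d + r and |r| < |d|.

    `divide` stands in for the operation under suspicion; it defaults to
    the built-in divmod and can be replaced to exercise the failure path.
    """
    q, r = divide(n, d)
    if q * d + r != n or abs(r) >= abs(d):
        raise DivisionCheckError(
            f"divide check failed for {n} / {d}: q={q}, r={r}"
        )
    return q, r
```

In a real system the check would be written in the implementation language
and placed around the hardware divide; the structure is the same.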

The above has clear implications for those producing systems requiring
very high reliability. Even formal proof that the machine-code implements
the mathematical specification of the system is insufficient. Unfortunately,
no figure can be provided as an upper limit on the reliability of a single
processor system (without design diversity).

A proposal

It is currently very difficult for a designer of a high reliability system
to minimise the risks from design faults in chips for the reasons given
above. Of course, the risks are *small*, but for very critical systems,
all reasonable steps must be taken to reduce the risk to ALARP (As Low As
Reasonably Practicable).

Further improvements would be possible if there was greater visibility
of the design process for the chips by the supplier to the users
developing critical systems. My proposal for this is as follows:

 1. The actual version of the device is determinable from the
external marking;

 2. The supplier is registered to ISO 9000;

 3. The supplier's quality assurance procedures require that all user
reported bugs are recorded, and that the list for any specific version of
the device is available to any user who might reasonably require it.
(Obviously, suppliers should be able to charge for this, perhaps also
in the chip costs as well, and perhaps it might only be applied to
the `older' chips);

 4. Government procurement should request conformance to this scheme.
(Government and its agencies are responsible for many of the most
critical systems, and such a requirement would ensure the availability
of chips following this proposal.)
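
Items 1 and 3 of the proposal amount to a register of bug reports keyed by
device and version. The following Python sketch shows one possible shape for
such a register; the class names, field names and the use of a (device, mask
version) pair as the key are assumptions made for illustration, not part of
the proposal.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# (device name, mask version): the pair that item 1 asks to be
# determinable from the external marking.
DeviceKey = Tuple[str, str]


@dataclass(frozen=True)
class BugReport:
    report_id: str  # hypothetical supplier reference
    summary: str


@dataclass
class ErrataRegister:
    _reports: Dict[DeviceKey, List[BugReport]] = field(default_factory=dict)

    def record(self, device: str, mask_version: str, report: BugReport) -> None:
        """Record a user-reported bug against one specific version."""
        self._reports.setdefault((device, mask_version), []).append(report)

    def known_bugs(self, device: str, mask_version: str) -> List[BugReport]:
        """Return a copy of the list of reports for one specific version."""
        return list(self._reports.get((device, mask_version), []))
```

A user who can read the mask version from the package marking could then ask
for known_bugs(device, mask_version) and receive only the reports that apply
to the part actually fitted.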

References

\bibitem{micro-rpt}
 Microprocessor Report. MicroDesign Resources Inc. ISSN 0899-9341.

\bibitem{veri-micro}
 W A Hunt. FM8502: A verified microprocessor. Institute for Computer Science,
 University of Texas, Technical Report 47. 1986.

\bibitem{viper}
 J Kershaw. Safety Control Systems and the VIPER Microprocessor.
 RSRE Memorandum No 3805. Malvern. Worcs. 1985.


Appendix

Document Details

 * Status: This is a working document.

 * Project: None.

 * File: Stored on the Sun in file baw/misc/chip2.tex.

 * History: First written, 16th December 1992.

 * Actions: BAW to copy to DTI (SQU and IMT2) and also the BCS Task Force
   and Specialist Group Committee.


Call for Papers - 1993 National Computer Security Conference

Jack Holleran <Holleran@DOCKMASTER.NCSC.MIL>
Fri, 1 Jan 93 20:37 EST
                        CALL  FOR  PAPERS
          16th NATIONAL  COMPUTER  SECURITY  CONFERENCE
     Sponsored by the National Computer Security Center and
       the National Institute of Standards and Technology

                        SEPTEMBER 20-23, 1993
                            BALTIMORE, MD

 The National Computer Security Conference audience represents a broad range
of interests drawn from government, industry, and academic communities.  Their
interests include technical research topics, security applications, and
management issues.  Papers may be addressed toward the entry level or skilled
practitioner.  Special emphasis will be placed on papers addressing the
special needs of users and creating better security for user information
technology resources.

We are pleased to invite academic Professors to recommend Student papers in
the application of Computer Security methodology.  Three student submissions
will be selected by the Technical Committee for publication in the Conference
Proceedings.  To be considered, the submission must be authored by an
individual student with the assistance of their academic Professor and be
recommended by their academic Professor.  Only one copy for student submission
is required.


BY FEBRUARY 8, 1993:  Send eight copies of your draft paper* or panel
                      suggestions to the following address.  See author
                      instructions for your submission format.

 *   Government employees or those under Government sponsorship
     must so identify their papers.

Mailing Information
National Computer Security Conference
ATTN:  NCS Conference Secretary,  AS 11
National Computer Security Center
Fort George G. Meade, MD 20755-6000

BY MAY 15, 1993:  Speakers selected to participate in the conference
                  will be notified when their camera-ready paper is
                  due to the Conference Committee.  All referee comments
                  will be forwarded to the primary author at this time.

For additional information on submissions, please call (410) 850-0272.

Preparation Instructions for the Authors
  To assist the Technical Review Committee, the following is required
for all submissions:

   Page 1:  Type of submission (paper, panel, tutorial)
            Title of submission
            Keywords
            Abstract (not to exceed 250 words)
            Author(s)
            Organization(s)
            Phone number(s)
            Net address(es), if available
            Point of Contact

  Submissions having U.S.  Government sponsorship must also provide
the following information:
     U.S. Government Program Sponsor or Procuring Element
     Contract number (if applicable)
     U.S. Government Publication Release Authority
   Note:  Responsibility for U.S. Government pre-publication review lies
          with the author(s).

  The submission (pages 2-9, these are the 8 pages of your submission):
     Title of submission - do not include author(s), address(es)
                            or organization(s)
     Abstract (with keywords)
     The paper
          (Suggested Length: 8 pages, including figures and diagrams;
                      pitch:  no smaller than 8 point; 1 inch
                      margins on top, bottom and sides.)

A Technical Review Committee, composed of Government and Industry Computer
Security experts, will referee submissions only for technical merit for
publication and presentation at the National Computer Security (NCS)
Conference.  No classified submissions will be accepted for review.

The Conference Committee provides for a double "blind" refereeing.  Please
place your names and organizations ONLY on page 1 of your submission, as
defined above.  Failure to COMPLY with the above instructions may result in
non-selection BEFORE the referee process.  Papers in excess of 8 pages may
also result in non-selection BEFORE the referee process.

Papers drafted as part of the author's official U.S.  Government duties may
not be subject to copyright.  Papers submitted that are subject to copyright
must be accompanied by a written assignment to the NCS Conference Committee or
written authorization to publish and release the paper at the Committee's
discretion.  Papers selected for presentation at the NCS Conference requiring
U.S.  Government pre-publication review must include, with the submission of
the final paper to the committee, a written release from the U.S.  Government
Department or Agency responsible for pre-publication review.  The release is
required no later than July 1, 1993.  Failure to comply may result in
rescinding selection for publication and for presentation at the 16th NCS
Conference.

Technical questions can be addressed to the NCS Conference Committee by mail
(see Mailing Information) or by phone, (410) 850-0CSC [0272].  For other
information about the conference, please call (301) 975-2775.
