The RISKS Digest, Volume 11 Issue 8

Peter G. Neumann

Wednesday, 13th February 1991

Forum on Risks to the Public in Computers and Related Systems

ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator

News of His Death Much Exaggerated: Jeff Johnson
Prison terms for airline computer ticketing fraud: Rodney Hoffman
PWR system "abandoned owing to technical problems": Martyn Thomas
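Risks of having a sister: Robyn A Grunberg
Re: Risks of having a sister: Charles Meo
Re: Study links leukemia to power lines, TV's: Steve Bellovin
Re: Predicting System Reliability...: Jay Elinsky
Re: Building very reliable systems: Tanner Andrews
Re: building very reliable systems: Martyn Thomas
Re: Predicting System Reliability...: Jay Elinsky
Re: Building very reliable systems (Jerry Leichter, RISKS-11.07): Paul Ammann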
Info on RISKS (comp.risks)

News of His Death Much Exaggerated

Jeff Johnson <jjohnson@hpljaj.hpl.hp.com>

Tue, 12 Feb 91 14:15:18 PST

The San Francisco Chronicle (11 Feb 91) has on the second page a photo of a man
pointing to the Vietnam War Memorial wall in Washington, D.C.  The caption
reads:

  "Vietnam veteran Eugene J. Toni of suburban Virginia pointed to his name
  on the Vietnam Memorial in Washington yesterday.  Toni, a 41-year-old
  former Army sergeant, is one of 14 Americans who can find their own
  names carved in black granite among the 58,175 dead and missing in the
  war.  Toni was listed because a wrong number was typed into a computer."

JJ, HP Labs

Prison terms for airline computer ticketing fraud

Rodney Hoffman <Hoffman.El_Segundo@Xerox.com>

Wed, 13 Feb 1991 09:50:06 PST

In RISKS 7.72, I summarized a 'Wall Street Journal' article about a travel
agency employee charged with breaking into American Airlines' computer
reservations system for fraud.

I believe this recent item is the conclusion of that case:

'Los Angeles Times', 11 Feb. '91: TRAVEL AGENTS SENTENCED: Their federal terms
ranged from nearly two years to four years in prison for running a scheme to
defraud American Airlines of frequent-flier tickets totaling $1.3 million
between 1986 and 1987.  Through a computer reservation terminal at North Ranch
Travel Agency in Woodland Hills (CA), the three men changed American Airlines'
records on frequent fliers, crediting fictitious accounts with miles flown by
legitimate passengers not enrolled in the frequent-flier program.  The
defendants then used the miles to apply for free flights, sold them for profit
or gave them to friends and family.  They were convicted after a trial last
year.  (Case No. 90-409.  Sentencing Feb. 5)

PWR system "abandoned owing to technical problems"

Martyn Thomas <mct@praxis.co.uk>

Wed, 13 Feb 91 13:25:04 GMT

The following story is from Nucleonics Week (pubs: McGraw-Hill) Vol 32 No 1
(Jan 3 1991) and No 2 (Jan 10 1991).

Electricite de France (EDF) has decided in principle to abandon the Controbloc
P20 decentralised plant supervisory computer system developed by Cegelec for
EDF's new N4 Pressurised Water Reactor (PWR) series, because of major
difficulties in perfecting the new product, according to EDF officials.

EDF does not yet know [as of Jan 3rd] what it can use in place of the P20 to
control the N4 reactors, the first of which is nearly completed. [They were
meeting to decide the way forward on January 25th. Options include trying to
salvage parts of the P20, or reverting to the N20 system used to control the
earlier P4 series of reactors {the numbering seems maximally confusing}.
Unfortunately, the P20 data acquisition and control uses dual LANs, called
Controbus, whereas the N20 uses cables. If they fall back to the N20, they will
have to design miles of cables into the reactor to replace the LANs.]

A Cegelec official described the P20 as "the most ambitious system you could
imagine". It has distributed control and monitoring, programmable logic
controllers, and 32-bit microprocessors. The N20 used 8-bit microprocessors.

Cegelec blame EDF reorganisations for the cancellation, but EDF's engineering
and construction division say that the problems were strictly technical.
According to Pierre Bacher, the division's president, the failure to achieve
sufficient capacity to process the mass of acquired reactor data with the
original P20 architecture had led to "increasingly complex software programs"
with "increasingly numerous interactions between subsystems". The complexity
apparently grew to the point where modification became difficult and there was
fear that the system could never be qualified [which I take to mean certified
for use].

According to the report, "Ontario Hydro faced a similar situation at its
Darlington station, in which proving the safety effectiveness of a
sophisticated computerized shutdown system delayed startup of the first unit
through much of 1989. Last year, faced with regulatory complaints that the
software was too difficult to adapt to operating changes, Hydro decided to
replace it altogether". [I hope that Dave Parnas or Nancy Leveson can fill in
the details here.]

Of particular interest to UK RISKS readers is the fact that the P20 system is
on order for the Sizewell B PWR (due to load fuel in November 1993, and the
only remaining scheduled PWR in the UK nuclear power programme).  The P20 "is
to be applied less to safety systems at Sizewell than was planned on the N4",
the report says. [Sizewell has a separate shutdown system, although there are
rumours that all is not well with it.]

There is a fully computerised N4 control room designed to go with the P20
system. If the P20 cannot be salvaged, presumably this will be abandoned too.

[There is more detail in the two reports, which I recommend interested readers
acquire].

Martyn Thomas, Praxis plc, 20 Manvers Street, Bath BA1 1PX UK.
Tel:    +44-225-444700.   Email:   mct@praxis.co.uk

Risks of having a sister

Robyn A Grunberg <rag@yarra.oz.au>

12 Feb 91 02:32:26 GMT

On Thursday 7th February, I arrived home from an interstate trip to find a
letter in the mail stating that my driver's license had been cancelled for 6
months.  The cancellation took effect from January 15th, 1991 and was to
continue until July 15th 1991.  The cancellation was due to my driving a car
while exceeding the state limit of .05% alcohol in my bloodstream.  This
interested me greatly as I had not been breathalysed nor blood tested on (or
even near) the day in question, stated on the notice as December 17th 1990.

The following day I approached VICROADS, who had sent me the notice.  I
explained to them that I was not the offender, and the clerk called up the
details of the charge.  My name was listed, as well as my license number;
however, the registration number of the car involved in the incident was
not the registration number of my car.  The clerk suggested I fill out a
Statutory Declaration and file that (along with the notice) with them so that
the department could place the matter under investigation.

I then went to the Police Station where I obtained a Statutory Declaration and
had it witnessed.  I also asked if the officer could check and see whose car
was involved, as it wasn't mine.  The officer checked out the records and
returned to tell me that the car belonged to my sister, who is unlicensed.  He
also explained that I was able to drive as long as I carried the Stat Dec with
me at all times.

Unfortunately, my license expired *that day*, so I then had to approach
VICROADS and try to get them to reissue my license.  The clerk would not
reissue my license as it was currently under cancellation.  I showed him the
Stat Dec, which was no use to him (or me) at all: he could not reissue the
license until the matter was resolved.  He suggested I continue driving with
the Stat Dec.  I would not accept this statement from him and asked that he put
in writing the fact that I had attempted to renew my license and he had refused
to reissue it.  He would not put it in writing, and suggested I speak to his
supervisor.

So here I am without a license, and waiting for the matter to be heard.  It
would appear that my sister was breathalysed and gave my details when asked who
she was.  The car, she explained, she had borrowed from her sister Michealle,
which the police accepted in good faith.  As far as the police are concerned,
all you need do is state your name, address and birthdate (which she did) and
the police will accept this and demand that you show your license at a later
date.  Unfortunately, they also went ahead and cancelled my license without any
proof that she was who she claimed to be, as she hasn't produced the license at
any stage.

Re: Risks of having a sister

Charles Meo <cm@yarra.oz.au>

13 Feb 91 04:17:20 GMT

For non-Australians it is worth pointing out that under unique (and
unsuccessfully opposed) legislation, the burden of proof has been reversed and
police are empowered to record a conviction _without_ any judicial process
whatever and the driver is then obliged to prove his or her _innocence_ in the
matter.

This has enabled local police to generate enormous government revenues as many
traffic infringements are now handled in this way.

I do not know of any other civilised country that would allow this (the spirits
of the old prison governors are alive and well in our seats of government!) and
of course, when this law is translated into computer systems with _no_
safeguards the situation Robyn has described can happen easily.

C. Meo #6512441/L (Turn to the right!)

Re: Study links leukemia to power lines, TV's

<smb@ulysses.att.com>

Sat, 09 Feb 91 23:46:22 EST

The original AP story was considerably longer, and included many more
qualifiers.  As noted in the excerpt your paper ran, this study will receive
very careful scrutiny because it was sponsored by an industry group.

The methodology has been described as somewhat suspect by the Electric Power
Research Institute, though the University describes the findings as
significant.  The parents of children being treated for leukemia were quizzed
about their child's activities; their responses were compared with those of a
control group.  (The article did not say how the control group was selected.)
Unfortunately, memory plays funny tricks; in an era where newspapers often seem
to feature the carcinogen of the week, parents of such children may — and I
stress the word ``may'' — be more likely to recall suspect behavior patterns.
(For example, the article noted a correlation to home use of pesticides, and to
paternal exposure to spray paint at work during the pregnancy.)

More troubling, the objective measurements taken don't seem to agree with the
incidence of disease.  For example, bedroom electric field strength
measurements did not differ between the two groups, though since the
measurements used were 24 hour averages, there may have been differing peaks.
Similarly, there is no particularly obvious reason to suspect black-and-white
TVs; according to the article, the researchers ``speculated'' that such sets
might be older, and hence might not meet current standards.  (If we're
guessing, I'd guess that such TVs are smaller, and hence would be watched from
a closer distance.)
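
A minimal sketch of why a 24-hour average can hide differing peaks.  The
readings below are hypothetical values in arbitrary units, invented purely for
illustration; they are not data from the study.

```python
from statistics import mean

# Hypothetical hourly field-strength readings (arbitrary units), 24 per day.
# Neither series comes from the study; both are made up for illustration.
steady = [2.0] * 24                  # constant exposure all day
spiky = [1.0] * 22 + [13.0, 13.0]    # low background plus two high hours

for label, readings in (("steady", steady), ("spiky", spiky)):
    # Same 24-hour average, very different peak value.
    print(f"{label}: 24h average = {mean(readings):.2f}, peak = {max(readings):.2f}")
```

A comparison based only on the average would treat these two bedrooms as
identical, which is the limitation noted above.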

No statistically significant correlation was found with use of electric
blankets or hair curlers; the former, at least, would (as I recall) contradict
other studies.

The study itself has not been released, and will not be, pending peer review
and publication in a refereed journal.  But a precis was released by the
university and by the sponsors.

        --Steve Bellovin

Re: Predicting System Reliability...

"Jay Elinsky" <ELINSKY@IBM.COM>

Sun, 10 Feb 91 00:16:56 EST

Re Brad Knowles' statement that disk drives are tested in parallel, and Bruce
Hamilton's rejoinder that mechanical systems can't be tested in parallel: Just
as the airframes in Bruce's example are presumably pressurized and
depressurized at a much higher rate than occurs in actual takeoff-landing
cycles, disk drives can presumably be tested at a higher duty cycle than they
see in actual use.  That is, the manufacturer can keep the heads thrashing
around continually, unlike the typical drive on a desktop computer.  I don't
know how one would accelerate a test on a mainframe disk drive that perhaps
does thrash around 24 hours a day, nor do I know if it's possible to accelerate
the testing of the platter bearings, which are spinning 24 hours a day even on
powered-up but otherwise idle machines.

So, I assume (and I'm no M.E. either) that parallel testing is combined with
tests that tend to accelerate wear of components where possible.

Jay Elinsky, IBM T.J. Watson Research Center, Yorktown Heights, NY

Re: Building very reliable systems

Dr. Tanner Andrews <tanner@ki4pv.compu.com>

Sun, 10 Feb 91 9:51:25 EST

) The theory here is that running 100 units for 100 hours gives you
) the same information as running one unit for 10000 hours.
The theory is crocked.  It builds heat slowly. The actual behavior:
    100 hours:  a little warm
    200 hours:  case is softening
    250 hours:  case melts
    257 hours:  catches fire
The times and failure modes will vary, depending on the type of
device in question.
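
A minimal sketch of when the pooling theory does and does not hold, assuming a
Weibull life model (an assumption made here for illustration; the posting does
not name a model).  With shape 1 the failure rate is constant, and 100 units
run for 100 hours are exactly as likely to show no failure as one unit run for
10000 hours.  With a wear-out shape, the short parallel test almost never sees
the failure that the long test cannot miss.  All scale and shape values are
hypothetical.

```python
import math

def survival(hours, scale, shape):
    """Probability that one unit is still working after `hours` (Weibull model)."""
    return math.exp(-((hours / scale) ** shape))

def no_failure_parallel(units, hours, scale, shape):
    """Probability that `units` independent units all survive `hours` each."""
    return survival(hours, scale, shape) ** units

# (label, scale in hours, shape) -- hypothetical parameters for illustration.
cases = [
    ("constant failure rate", 10_000.0, 1.0),
    ("wear-out near 300 hours", 300.0, 8.0),
]

for label, scale, shape in cases:
    parallel = no_failure_parallel(100, 100, scale, shape)
    serial = survival(10_000, scale, shape)
    print(f"{label}: P(no failure, 100 units x 100 h) = {parallel:.4g}; "
          f"P(no failure, 1 unit x 10000 h) = {serial:.4g}")
```

The two tests are interchangeable only in the constant-rate case; any failure
mode that depends on accumulated running time breaks the equivalence.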

...!{bikini.cis.ufl.edu allegra uunet!cdin-1}!ki4pv!tanner

Re: building very reliable systems

Martyn Thomas <mct@praxis.co.uk>

Mon, 11 Feb 91 11:47:59 GMT

Jerry Leichter <leichter@lrw.com> writes:
.....
:   2.  Functional decomposition of the system into a number of modules
:       such that failure can occur only when ALL the modules fail.

: Now, the criticism of technique 2 is that the multiplication of failure
: probabilities is only valid when the failures of the different modules are
: known to be uncorrelated.
.....
: So, is technique 2 worthless?  By no means:  It's just often misapplied.  To
: use it, you need to establish (by formal techniques, testing, experience in
: the field) not just the failure rates for the individual modules, but an
: upper bound on failure correlation among modules.  This is by no means
: impossible to accomplish.
...........
: That's not to say that some still-unknown variant of n-version programming
: can't be made to work.  In fact, I'd guess that it can be, though it won't
: be easy - and I certainly wouldn't want to propose a mechanism.  If so, then
: software systems to which we can reasonably ascribe "1 in 10^9" failure
: probabilities should be quite buildable.

[I have extracted the elements from Jerry's article that I want to disagree
with. I thought the article as a whole was a very valuable contribution to
the discussion. I apologise in advance if I have distorted his argument by
selective quotation.]

How can we have confidence that the means by which we have combined the
n-versions (for example, the voting logic) has a failure probability below 1 in
10^9?
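
A minimal sketch of why the combining logic matters, assuming a 2-out-of-3
majority voter, fully independent version failures, and a voter whose own
failure is independent of the versions.  These are optimistic assumptions
chosen only for illustration, and every probability below is hypothetical.

```python
def majority_failure(p_version):
    """P(at least 2 of 3 independent versions fail on a given demand)."""
    return 3 * p_version**2 * (1 - p_version) + p_version**3

def system_failure(p_version, p_voter):
    """P(system fails): the voter fails, or the majority of versions fails."""
    p_majority = majority_failure(p_version)
    # Union of two independent events, written to avoid cancellation error.
    return p_voter + p_majority - p_voter * p_majority

P_VERSION = 1e-4  # hypothetical per-demand failure probability of one version

for p_voter in (0.0, 1e-9, 1e-6):
    print(f"voter failure {p_voter:g}: system failure "
          f"{system_failure(P_VERSION, p_voter):.3g}")
```

Even under these favourable assumptions the system can be no better than its
voter, so the voter itself would need a failure probability well below the
overall target.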

How can we be sure that our analysis of the upper bound on failure correlation
among modules is accurate? How accurate does it need to be - does it need to
have a probability of less than 1 in 10^9 that it is grossly wrong? (By
"grossly wrong" I mean wrong enough to invalidate the calculation that the
overall system meets the "1 in 10^9" figure).  This would seem impossible.
Consider, for example, the probability that the common specification is wrong.

I also have a question for statisticians: if we are attempting to build a
system "to which we can reasonably ascribe a 1 in 10^9 failure probability",
what *confidence level* should we aim for, if we are using statistical
methods? Does it make sense to be satisfied with 99% confidence of 1 in
10^9? Or should we aim for 99.9999999%? (I hope the answer isn't simply "it
depends what you mean by 'reasonably'". I am looking for guidance on how the
failure probability and the confidence levels interact in practical use).
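
A minimal sketch of one way the two quantities interact, assuming independent,
identically distributed demands and a test campaign in which no failure at all
is observed.  Under those assumptions, n failure-free demands support the claim
"failure probability at most p" with confidence 1 - alpha when
(1 - p)^n <= alpha.  The function name and parameters are illustrative, not
taken from the discussion.

```python
import math

def failure_free_trials_needed(p_max, alpha):
    """Failure-free demands n needed so that (1 - p_max)**n <= alpha.

    p_max : the failure probability bound to be demonstrated
    alpha : 1 minus the desired confidence level
    The result is subject to ordinary floating-point rounding.
    """
    # log1p keeps precision when p_max is tiny (log(1 - 1e-9) done naively loses digits).
    return math.ceil(math.log(alpha) / math.log1p(-p_max))

# alpha = 0.01 is 99% confidence; alpha = 1e-9 is 99.9999999% confidence.
for alpha in (0.01, 1e-9):
    n = failure_free_trials_needed(1e-9, alpha)
    print(f"alpha = {alpha:g}: about {n:.3e} failure-free demands")
```

Under this simple model the required number of demands grows with log(1/alpha),
so raising the confidence level costs far less than lowering the failure
probability bound, which scales the requirement roughly as 1/p.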

(I suspect that I am missing some contributions to this discussion. I would
be grateful if anyone following-up would also copy me by email).

Martyn Thomas, Praxis plc, 20 Manvers Street, Bath BA1 1PX UK.
Tel:    +44-225-444700.   Email:   mct@praxis.co.uk

Re: Predicting System Reliability...

"Jay Elinsky" <ELINSKY@YKTVMZ.BITNET>

Sun, 10 Feb 91 00:11:17 EST

Re Brad Knowles' statement that disk drives are tested in parallel, and Bruce
Hamilton's rejoinder that mechanical systems can't be tested in parallel: Just
as the airframes in Bruce's example are presumably pressurized and
depressurized at a much higher rate than occurs in actual takeoff-landing
cycles, disk drives can presumably be tested at a higher duty cycle than they
see in actual use.  That is, the manufacturer can keep the heads thrashing
around continually, unlike the typical drive on a desktop computer.  I don't
know how one would accelerate a test on a mainframe disk drive that perhaps
does thrash around 24 hours a day, nor do I know if it's possible to accelerate
the testing of the platter bearings, which are spinning 24 hours a day even on
powered-up but otherwise idle machines.

So, I assume (and I'm no M.E. either) that parallel testing is combined with
tests that tend to accelerate wear of components where possible.

Jay Elinsky, IBM T.J. Watson Research Center, Yorktown Heights, NY

Re: Building very reliable systems (Jerry Leichter, RISKS-11.07)

Paul Ammann <pammann@gmuvax2.gmu.edu>

Mon, 11 Feb 91 13:22:03 -0500

>    1.  Testing (whether by explicit test in a lab or by actual use in
>        the field) of very large numbers of copies of the system
>    2.  Functional decomposition of the system into a number of modules
>        such that failure can occur only when ALL the modules fail.

The first technique assesses performance directly, and can be applied to any
system, regardless of its construction.  As Jerry points out, various
assumptions must be made about the environment in which the testing takes
place.  The second technique estimates performance from a predictive model.

> [...]                                                                  To
>use [NVP], you need to establish (by formal techniques, testing, experience in
>the field) not just the failure rates for the individual modules, but an
>upper bound on failure correlation among modules.

The Eckhardt and Lee model (TSE Dec 1985) makes it clear that performance
prediction is much more difficult.  To evaluate a particular type of system,
one must know what fraction of the components are expected to fail over
the entire distribution of inputs.  The exact data is, from a practical
point of view, impossible to collect.  Unfortunately, minor variations in
the data result in radically different estimates of performance.  For a
specific system, it is not clear (to me, anyway) what an appropriate
"upper bound of failure correlation among modules" would be, let alone
how one would obtain it.
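
A toy calculation in the spirit of the model described above.  It assumes that
for each input x there is a probability theta(x) that an independently
developed version fails on x, so two versions fail together with probability
E[theta^2], which is never less than the independence figure (E[theta])^2.  The
input classes and numbers are hypothetical and chosen only to show how
sensitive the result is.

```python
def mean_failure_prob(profile):
    """Average failure probability of one version: E[theta]."""
    return sum(weight * theta for weight, theta in profile)

def coincident_failure_prob(profile):
    """Probability that two independently developed versions both fail: E[theta^2]."""
    return sum(weight * theta**2 for weight, theta in profile)

# Each profile is a list of (fraction of inputs, theta on those inputs).
# Both give a single version the same 1e-4 average failure probability.
profiles = {
    "hard inputs fail 10% of versions": [(0.999, 0.0), (0.001, 0.1)],
    "hard inputs fail every version": [(0.9999, 0.0), (0.0001, 1.0)],
}

for label, profile in profiles.items():
    single = mean_failure_prob(profile)
    both = coincident_failure_prob(profile)
    print(f"{label}: single = {single:.3g}, independence predicts "
          f"{single**2:.3g}, model gives {both:.3g}")
```

Two profiles that look identical when measured one version at a time differ by
a factor of ten in coincident failures, and both are orders of magnitude worse
than the independence estimate.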

>In fact, techniques 1 and 2 are fundamentally the same thing:  One cuts the
>world "vertically" between many complete copies of the same system; the other
>cuts the system itself "horizontally" across its components.  The same two
>issues - reliability of the individual slices; independence of failure modes -
>occurs in both cases.

I am uncomfortable with merging the issues of direct measurement with those
of indirect estimation.  The difficulties in 1 are primarily system issues;
details of the various components are by and large irrelevant. In technique 2
the major issue is the failure relationship between components.

>                       Either technique can be used to get believable failure
>estimates in the 1 in 10^8 (or even better) range.  Such estimates are never
>easy to obtain - but they ARE possible.  Rejecting them out of hand is as much
>a risk as accepting them at face value.

I am unaware of any application of NVP in which it has been (believably)
demonstrated that components of modest failure probability (say 1 in 10^4)
can be used to generate a system with a very low failure probability
(say 1 in 10^8).  The relatively scant empirical evidence indicates that NVP
might be good for an order of magnitude or so (which may be great, depending
upon the system).  However, there are no guarantees; in certain
circumstances, NVP may well be worse than the use of a single component.
The real issue is economic: could better systems be built by applying
development resources to other technique(s)?  There are strong views on
both sides of the question.

(As a final aside, there are random algorithms that, for certain well
behaved problems, *can* justifiably employ an independence model to obtain
very low system failure probabilities.  However, these techniques are not
in the domain of NVP).

-- Paul Ammann: pammann@gmuvax2.gmu.edu (703) 764-4664
-- George Mason University, Fairfax VA
