The RISKS Digest, Volume 24 Issue 41

Peter G. Neumann

Tuesday, 5th September 2006

Forum on Risks to the Public in Computers and Related Systems

ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator

UK 141M-pound benefits computer system shelved: Martyn Thomas
Taxiway altered before KY crash: PGN
The Case of the Patriot System in the Gulf War: Diego Latella
High-tech Product Sabotage: Peter Mellor
British MP falls foul of wiki-d pranksters: M. Hackett
Swedish Atomic Power Plant Shutdown: Debora Weber-Wulff
Another power outage: Kurt Fredriksson
Re: LA power outages: Michael Bacon
Merlyn Kline
Re: Your Cable Company ...: Robert de Bath
More on the Sony lithium-ion laptop battery fire issue: Curt Sampson
Spread sheets weak point of Security: Al Macintyre
Re: LA power outages: Rex Black
Brave New Ballot, Avi Rubin: PGN
Info on RISKS (comp.risks)

UK 141M-pound benefits computer system shelved

<"Martyn Thomas" <martyn@thomas-associates.co.uk>>

Tue, 5 Sep 2006 14:45:13 +0100


"A new computer system used to process benefits payments has been scrapped
at a cost to the taxpayer of (UK) 141M pounds, the BBC has learned.  The IT
project, key to streamlining payments by the UK Department for Work and
Pensions (DWP), was quietly axed at an internal meeting last month. ...  It
is the latest in a long series of computer problems for the government."
[Source: BBC News, 5 Sept 2006]
  http://news.bbc.co.uk/1/hi/uk_politics/5315280.stm

  [Philip Hammond, the Conservatives' shadow work and pensions secretary,
  is quoted: "It is pretty disgraceful that after two and a half years of
  spending public money on this project, the government has walked away from
  it.  We never hear of somebody actually losing their job because they
  have failed to implement a project they were responsible for."  PGN-ed]

Taxiway altered before KY crash

<"Peter G. Neumann" <neumann@csl.sri.com>>

Mon, 28 Aug 2006 16:09:44 PDT


The taxi route for commercial jets at Blue Grass Airport was altered a week
before Comair Flight 5191 took the wrong runway and crashed, killing all but
one of the 50 people aboard.  Both the old and new taxiways to reach the
main commercial runway cross over the shorter general aviation runway, where
the commuter jet tried to take off on 27 Aug 2006.  [Source: Crash Probe
Focuses on Use of Shorter Runway, Richard Fausset and Alan C. Miller, *Los
Angeles Times*, 28 Aug 2006; PGN-ed; more details in subsequent reports]
http://www.latimes.com/news/nationworld/nation/la-082806plane,0,242799.story?coll=la-home-headlines

The Case of the Patriot System in the Gulf War

Mon, 04 Sep 2006 14:18:00 +0200


Diego Latella, The Case of Patriots in the Gulf War. [in Italian]
MAGAZINE: SAPERE - Ed. Dedalo srl - www.edizionidedalo.it
Directors: C. Bernardini and F. Lenci

This paper addresses the controversy over the performance of the Patriot
ATBM system during the 1991 Gulf War.  The controversy was initiated by
the seminal work of Prof. T.A. Postol and his colleagues at MIT, which
analysed several aspects of Patriot performance and questioned statements
made by Army officials as well as by the press.

The paper is aimed at the general (although motivated) public more than at
specialists.  It starts by giving a brief introduction to the technical
features of the system and its development history. The dramatic scarcity of
data concerning the events in the Gulf War is then addressed, and reasons
for understanding it are discussed.  The most significant failures of the
system during the Gulf War are presented, in contrast to the initially
overly positive assessments of the success of the system during the war.
The discussion is broadened to include issues from the debate involving
Postol's group, GAO, the Army, Raytheon researchers, and the Panel On Public
Affairs of the American Physical Society (which, incidentally, judged the
work of the MIT group very positively).  Some personal closing remarks are
presented on the use of computers in war, noting with regret that not much
has changed since SDI in the expectations that many people, including
researchers, still place on computers, despite the lessons we should have
learned about their practical as well as conceptual limitations. The paper
includes a rich bibliography with more than sixty references.

Dott. Diego Latella, Ist. di Scienza e Tecnologie dell'Informazione A. Faedo
I56124, Pisa, ITALY +39 0503152982 http://www.isti.cnr.it/People/D.Latella

  [The translation into English is Diego's, although it has been PGN-ed.
  Even with my limited ability to read Italian, the original article appears
  to be very well researched.  PGN]

High-tech Product Sabotage

Fri, 25 Aug 2006 20:25:51 EDT


The following is an extract from an article based on the series "Trust me
I'm an economist", BBC2. (Second episode 7pm on 25th August 2006.)  The
author and presenter is Tim Harford, a *Financial Times* columnist and
author of "The Undercover Economist".

  Supermarkets package their cheapest products to look more like famine
  relief than something you'd want to pay for.  It's not because they can't
  afford sexy packaging even for their cheapest foods - it's because they
  want to persuade richer customers to buy something more expensive instead.

  Economists call this "product sabotage" and it can reach extreme levels.

  In the hi-tech world it is common to produce a high-specification product,
  sold at a premium price, and then sell the same product more cheaply with
  some of the functions disabled.

  Intel did this with its 486 computer chip in the early 1990s, and IBM did
  it with a printer: the economy version for home users was simply the
  top-of-the-range model with a chip in it to slow it down.

  These tactics might seem sneaky or unethical, and they certainly don't go
  down well with customers.

  Yet frustrating as it is, product sabotage is often the cheapest way to
  produce two different versions of a product. For the hi-tech industry the
  alternative is to design the whole product twice.

  And two different versions are what you need if you want to reach
  price-sensitive customers.

The full article is on:
http://newsvote.bbc.co.uk/mpapps/pagetools/print/news.bbc.co.uk/2/hi/business/5274352.stm

During the 1970s when working for ICL, I was told by customer support
engineers that the 'conversion' of a 1902 mini-mainframe to the faster 1902A
model was to snip one connecting wire on the back-plane.  Plus ca change ...

Peter Mellor;   Mobile: 07914 045072;   e-mail: MellorPeter@aol.com

British MP falls foul of wiki-d pranksters

<"M. Hackett" <dist23@juno.com>>

Sat, 2 Sep 2006 01:47:56 -0700


Any fool could tell you that [mixing] Wikis and policy making could only
result in this kind of mess.

Currently I am considering the possibility of bulk e-mailing UK
parliamentarians (or UK PMO / Royal Palace) for an undisclosed client. In my
case the matter concerns an international broadcasting linkage between
Canada and the UK.  So few political wikis exist to get your MP or MLA's
attention that bulk e-mail and bulk faxing have become the only solid and
workable alternative.

Max Power, CEO, Power Broadcasting, http://HireMe.geek.nz/
British MP falls foul of wiki-d pranksters

A British Government Minister may have thought he was keeping up with
modern trends when he put a draft policy on the Internet on Friday, but
he was soon left red-faced when hundreds of pranksters defaced it.

Weblogging, techno-savvy Environment Secretary David Miliband, tipped as a
bright young spark in Prime Minister Tony Blair's administration, had put a
draft "environment contract" on his department's website, setting out social
responsibilities for people, government and businesses.

But embarrassed administrators were forced to haul it down after more than
170 cyber-jokers trashed the document by adding in bizarre paragraphs for
fun.

The page used "wiki" editing techniques, which allow readers to alter the
content.

A heading of "Who are the parties to the environmental contract?" became,
"Where is the party for the environmental contract? Can I come? Will there
be cake? Hooray!"

Another asked: "What would an environmental contract for energy look like?
Will it look like my face? My beautiful face?"

The tricky question of "what tools can be used to deliver the environmental
contract?" received the answer: "Spade, Organic Yoghurt Stirrer, Old washing
up liquid bottle, Sticky Back Plastic."

Meanwhile, a list of tools that "create the right incentive frameworks" was
doctored to include "Big stick" and "Owl magnet".

Some of the Internet pranksters put the boot into the Government when
monkeying around with the text.

Under a list of things citizens should do, one wag added: "Pay a higher
proportion of their income to the government, and see little tangible
improvement in their standard of living".

One passage said everyone had the capacity to tackle environmental problems,
but that people were too often dissuaded by "doubts about whether our
actions will make any difference".

One joker swiftly tagged on: "Besides which we just can't help but meddle,
interfere, impose our views on others, and generally use taxpayers'
resources in ways that are wasteful except in our own self-aggrandisement".

Word about the document spread like wildfire across several Internet
weblogs.

Administrators were forced into action and left a message of their own:
"Please note - the Wiki has been 'locked' for the time being to prevent
editing.

"Thanks to everyone for their interest so far - do visit again and continue
the discussion. In the meantime, you're welcome to read the comments and
materials submitted."

A spokeswoman for the red-faced Department for Environment Food and Rural
Affairs said the page was an experiment.

"It's unfortunate that these things do happen. We are currently looking at
security on the site," she said.

Swedish Atomic Power Plant Shutdown

Sun, 03 Sep 2006 00:22:49 +0200


[This has not been widely reported in the mainstream press, which, given
the gravity of what happened, shocks me rather more than what happened - dww]

On August 17, 2006, a class two incident occurred at the Swedish atomic
reactor Forsmark. A short circuit in the electricity network caused a
problem inside the reactor and it needed to be shut down immediately, using
emergency backup electricity. However, in two of the four generators, which
run on AC, the AC/DC converters died. They disconnected, leaving the
operators without knowledge of the current state of the system for approx.
20 minutes. A meltdown could have occurred, such as we had in Chernobyl.

In Sweden, the government immediately shut down all reactors that were built
similarly until the problem could be cleared up. In Germany, people were
insisting that Brunsbüttel was built similarly, but the company operating
the reactor (in both cases, Forsmark and Brunsbüttel: Vattenfall) insisted
it was not the same. When it was discovered that Brunsbüttel was indeed the
same, the German environmental minister, Sigmar Gabriel, threatened to shut
it down right away. But he has been pacified and the reactor is still
running.

This seems to be a very similar problem to the LA situation - the emergency
systems had not been tested with the grid electricity going off.
Additionally, it appears that Brunsbüttel has had three incidents in 2002
pertaining to the emergency electricity system.

According to the taz of August 31, 2006, there exists a list of 260
security problems with Brunsbüttel which the ministry in Kiel is keeping
under wraps. The ministry says that a list does exist, but refuses to
publish it, while at the same time telling Vattenfall that it is not
communicating its problems properly.

The risks involved here materialize very seldom but are very lethal - a core
meltdown is no joke.

An extremely technical report can be found here:
http://www.neimagazine.com/story.asp?sectionCode=132&storyCode=2038313

Prof. Dr. Debora Weber-Wulff, FHTW Berlin, FB 4, Treskowallee 8, 10313
Berlin GERMANY +49-30-5019-2320 http://www.f4.fhtw-berlin.de/people/weberwu/

Another power outage

<"Kurt" <kurt.fredriksson@ieee.org>>

Wed, 30 Aug 2006 15:55:45 +0200


This happened many years ago now, but might be of interest, as an example of
failing to look at the whole system.

We had a computer center with a large battery backup. The 3-phase AC power
was converted to DC, which kept the batteries fully charged. The power for
the servers was converted from DC to 3-phase AC.

One day the latter converter broke, so we only got 2 phases of power.
(There was no interruption in the mains.) We then, to our horror, discovered
that there were no switches installed to bypass the faulty converter, and we
thus had to close down all servers.

The obvious remedy was to install those switches.

Nobody had thought of the possibility that the converters might fail.  Not
surprisingly, as these converters are used in millions of telephone exchanges
all over the world, and had an impressive MTBF.

Kurt Fredriksson

Re: LA power outages (RISKS-24.37,38,39,40)

Wed, 30 Aug 2006 02:02:03 -0700 (PDT)


A data centre in London, UK suffered a total power outage on Sunday, 23 Jul
2006 when the incoming supply and the in-house stand-by generation failed.
Other events contributed to a loss of service lasting 11 hours and four
minutes.

The events ran as follows:

At 10:56 a public supply 132kV to 11kV transformer failed with consequential
failure in the 40kV to 132kV transformer feeding the area of London that
feeds the data centre.  Incoming power failed on all six cables.

Generators 1 and 3 started OK, but generator 2 did not — owing to
insufficient air pressure to engage the starter motors.  A stand-by
diesel-powered air-compressor was tried, but could not maintain pressure in
the air-starter piping.

Generator 1 began to overheat and shut down and generator 3 shut down
shortly afterwards due to a high load condition.

While the generators were running, the supply was repeatedly in and out of
tolerance. When out of tolerance, the UPS batteries provided the primary
power source and this drained the power from the batteries even when the
generators were running.

As a result, and just before generators 1 and 3 failed, UPS3 shut down safely
and went into bypass mode.  No fuses were blown.

When the generated power was finally lost, UPS3 reverted to a non-bypass
configuration. The remaining UPSs then went into normal battery back-up and
carried the load for 22 minutes at which time the UPS modules began to
discharge their batteries.

UPS1 then shut down on a "battery low" alarm for one module.  When the
second module reached "battery low", there was no power available on the
module and it shut down without going into bypass.  This cascaded until the
last module was taking the entire load and, as this discharged, the voltage
and frequency dropped resulting in the inverter thyristors failing to switch
off. This caused both inverter and battery fuses to blow as well as one of
the inverter thyristors themselves.

UPS 2 exhibited a similar failure.

All UPSs were then off and auto bypass was not enabled.

When the mains were restored, raw mains was not provided directly to the
PDUs. As a consequence, the input breaker shunt-tripped. This could not be
reset until the UPS power had been restored at the PDU level and then each
input breaker had to be manually re-set.

While this was going on, Building Management System (BMS) connectivity was
lost and the fire alarm went off.  The Fire Brigade arrived and evacuated
the building.

An hour later, the Fire Brigade had confirmed the safety of the building,
the fire alarm system had been taken off-line and staff had re-entered the
building.

Commercial power was restored at 13:42.

All DC and HVAC systems normalised without intervention, but the UPS systems
had to be restored manually.

About one hour after this, UPSs 1 and 3 were brought back online.

A further hour later UPS 2 was brought back online.

Half an hour later, data centre personnel began to restore various UPS PDUs
to operation.

Some 90 minutes after this, UPS power was briefly restored to the BMS and
the Integrated Management and Monitoring System (IMMS) began to report
sporadic alarms.

By 20:00 all UPSs were back online.

Two hours later, service had been restored, albeit that some customers were
suffering odd failures the following day - possibly owing to the loss of
power to equipment that had been previously running continuously for
thousands of hours.

Some years before, an international bank in London suffered a power outage
while having its UPS serviced.

They had two levels of battery UPS with stand-by generation through
diesel-powered alternators.  Level One UPS was scheduled for maintenance.
It was taken down and (Sod's Law at work!) within minutes the incoming
supply failed (as a result of "JCB-fade").  "Never mind!" the engineers
cried, "There is always the Level Two UPS".  Sadly, this did not take up the
load, owing to the previously undetected failure of an "AND" circuit that
noted the absence of incoming mains power and a low voltage condition on
UPS1.  "Never mind!" the engineers cried again.  "The stand-by generators
will start and carry the load."  They didn't!  The reason?  The stand-by
generators required a signal from UPS1 ... which was down for maintenance.
Attempts to manually start the SBGs failed, firstly because the batteries
were flat (!) and secondly (after replacing the batteries) because the
manual-start process still required a signal from UPS1 to release an
interlock.  Some five hours of blackout was experienced before the incoming
supply was restored.
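
The chain of hidden dependencies in the bank incident can be made explicit
with a small model.  The following is only a sketch: the component names and
the REQUIRES table are assumptions reconstructed from the account above, not
taken from any real site documentation.

```python
# Hypothetical model of the bank's power sources, reconstructed from the
# account above.  Each source lists the components that must all be working
# for that source to carry the load.
REQUIRES = {
    "mains": {"mains"},
    "ups1": {"ups1"},
    "ups2": {"ups2", "and_circuit"},
    # The stand-by generators need a start signal from UPS1.
    "generators": {"generators", "starter_batteries", "ups1"},
}


def available_sources(down):
    """Return the sources that can still carry the load when `down` are out."""
    return sorted(s for s, needs in REQUIRES.items() if not (needs & down))


# Planned maintenance only: UPS1 is taken down, all else believed healthy.
maintenance = available_sources({"ups1"})

# The day of the incident: mains lost, AND circuit failed, batteries flat.
incident = available_sources(
    {"ups1", "mains", "and_circuit", "starter_batteries"}
)

# Replacing the flat batteries alone does not help: the interlock still
# needs UPS1.
after_new_batteries = available_sources({"ups1", "mains", "and_circuit"})

print(maintenance, incident, after_new_batteries)
```

Even in the maintenance-only case the generators drop out of the list, which
is the kind of finding a desk check could surface before UPS1 is taken down.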

Michael "Streaky" Bacon

Re: LA power outages (Fairfax, RISKS-24.40)

<"Merlyn Kline" <merlyn@zynet.net>>

Wed, 30 Aug 2006 10:30:02 +0100


[...]

> Redundancy isn't hard.  Engineering is hard.

But not impossible. Stephen Fairfax was able to think of a wide range of
possibly untested failure modes for this system while dashing off an email.
If suitably paid, I've no doubt he (and many others) could think of a
comprehensive, even exhaustive, list and design and implement a suitable
test programme. But who would be willing to pay? The results of losing a
data centre rarely compare to the results of losing a Boeing or a skyscraper
or a power plant. Of course occasionally they do. I wonder how well tested
*those* data centres are?

Merlyn Kline

Re: Your Cable Company ... (RISKS-24.37)

Wed, 30 Aug 2006 07:04:24 +0100 (BST)


On Sat, 12 Aug 2006, Lauren Weinstein wrote:

> I found it rather amusing, in a "sad commentary" sort of way.

I disagree with the "sad commentary" part of this.  To explain:

A UPS is for risk management, not risk elimination.  You decide two numbers:

(a) How long do you expect to need internal power.
(b) How long does it take to do a controlled shutdown.

Add them together, add some more for growth and luck. The numbers you come
up with then reduce the risk to an "acceptable" level.

But there's a twist to the first number. For example in a hospital computer
system you only need batteries for long enough to get the diesel generator
started. You only need enough diesel to keep you going until you are sure
you can get a refill. Exactly the same reasoning has been applied to that
telecom node (possibly after the fact, of course).
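
The sizing rule above amounts to simple arithmetic.  A minimal sketch
follows; the function name, the 25% margin and the example figures are
illustrative assumptions, not recommendations.

```python
def required_runtime_minutes(expected_outage_min, shutdown_min,
                             margin_fraction=0.25):
    """Battery runtime to provision: (a) + (b), plus a margin for growth
    and luck.  The default margin is an arbitrary illustrative value."""
    if expected_outage_min < 0 or shutdown_min < 0 or margin_fraction < 0:
        raise ValueError("inputs must be non-negative")
    return (expected_outage_min + shutdown_min) * (1 + margin_fraction)


# Stand-alone site: ride out a 30-minute outage, then shut down in 10 minutes.
standalone = required_runtime_minutes(30, 10)

# Site with a diesel generator: (a) shrinks to the generator start-up time,
# assumed here to be 2 minutes.
with_generator = required_runtime_minutes(2, 10)

print(standalone, with_generator)
```

The second call shows the twist: once a generator is available, the battery
only has to bridge the start-up gap, and the question of endurance moves to
the fuel supply.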

Robert de Bath <robert$ @ debath.co.uk> <http://www.debath.co.uk/>

More on the Sony lithium-ion laptop battery fire issue

Wed, 6 Sep 2006 00:18:01 +0900 (JST)


Here's a link to an interesting analysis that's rather more extensive than
others I've seen, perhaps in part because it relies on subscription-only
sources to Japanese news sites:

  http://japaninc.typepad.com/terries_take/2006/08/index.html#top

The particularly good bit:

  The technical causes of the batteries overheating were well explained in a
  recent Nikkei interview of a professor at Kyoto University, who is an
  expert on battery technology. He points out that there are two possible,
  complementary reasons for the Dell notebook fires — one of which offsets
  some of the blame from Sony.

  Firstly, there was the well publicized manufacturing failure, consisting
  of metal particles that were introduced into the battery electrolyte and
  which can eventually lead to internal short circuits and thus
  overheating. Sony takes full responsibility for this.

  The second reason, however, is probably not so well known, but allows Sony
  to share the blame with Dell and Apple.

  Apparently some PC designs by both companies push the Lithium-ion battery
  technology past its safe point by virtue of the fast recharging cycle the
  makers have implemented. According to the professor, when Lithium cells
  are exposed to rapid charging, they can form metal fragments through
  chemical reaction between the electrodes and a high concentration of
  Lithium atoms.

  Once formed, these conductive metallic fragments can penetrate the plastic
  separator between the positive and negative electrodes, causing major
  short-circuits and thus catastrophic over-heating.  This failure in
  circuit design is probably why Sony investors are betting that the company
  won't have to cover the entire cost of the recall.

Curt Sampson <cjs@cynic.net> +81 90 7737 2974

Spread sheets weak point of Security

Sun, 03 Sep 2006 15:53:39 -0500


*Computerworld* rediscovers something that is well known in business: the
official data may be well secured in databases, but sensitive data goes to
trusted employees who reorganize it in spreadsheets and BI tools that are
not as well secured against data breaches.
http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=9002950&source=rss_topic17

Story found thanks to http://socrates.berkeley.edu:7077/it-security/

Re: LA power outages (Borg, RISKS-24.40)

Fri, 25 Aug 2006 22:18:09 -0500


Finding and fixing this particular defect actually comes at almost zero
cost.  Here's what happens:

1. Professional tester designs high-fidelity test, including a step that
   involves powering down of the fuel pump.

2. During a test case review with the system engineers, someone says, "Hey,
   we can't do that, if the diesel engine is run dry, it'll break."

3. Professional tester says, "Yes, okay, but let's just suppose we *did* run
   the test.  Guess what I learned while designing the test?  The fuel pump
   is connected to utility power with no fail-over to generator power.  So,
   when utility power fails, the pump stops, which means the diesel soon is
   running without fuel, which means that not only does the diesel engine
   become damaged, but we don't get our backup power."

4. System engineers say, "Ohhhhh."  System engineers leave test case review,
   go off and solve problem.

5. Professional tester, without running a single test, saves the
   organization thousands, potentially millions of dollars.

I have seen scenarios like this happen dozens of times in my career as a
professional tester and manager of professional testers.  Amateur
testers--i.e., people who do not make a study and profession of the field of
testing--will usually miss situations like this.
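
The discovery in step 3 is essentially a walk over a dependency map.  A
minimal sketch of such a walk follows; the component names and the
DEPENDS_ON table are hypothetical and stand in for whatever the test
designer would extract from the real system drawings.

```python
# Hypothetical dependency map: each component lists what it needs in order
# to keep running.
DEPENDS_ON = {
    "backup_power": ["diesel_generator"],
    "diesel_generator": ["fuel_pump"],
    "fuel_pump": ["utility_power"],  # the defect: no fail-over to generator
}


def transitive_dependencies(component, depends_on):
    """Return everything `component` relies on, directly or indirectly."""
    seen = set()
    stack = [component]
    while stack:
        for dep in depends_on.get(stack.pop(), []):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


deps = transitive_dependencies("backup_power", DEPENDS_ON)
if "utility_power" in deps:
    print("backup power depends on the utility supply it is meant to replace")
```

Nothing is executed against the real plant here; as in the scenario above,
the defect shows up during test design, before any test is run.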

Now, I will grant you that there are plenty of instances where a truly
high-fidelity test *is* judged by management to be cost-prohibitive.  For
example, some people do not performance test in completely accurate test
environments, which casts a lot of doubt on their performance test results.
(By the way, please note that we are talking about *software* testing here,
so this is a case where the combinatorial explosion is not actually what
gets you into trouble; in fact, the combinatorial explosion is not that
difficult to deal with.)  The explanation is usually, "It'll cost too much
to replicate the production environment."  Of course, that was exactly the
reason why NASA didn't test the effect of foam strikes on shuttle wings,
which were going on for years before the US lost one very expensive shuttle
and seven very expensive--indeed, to their families, priceless--astronauts.
I bet that foam strike test that was proposed to be run at Southwest
Research Institute--and cancelled due to cost considerations--looks like a
bargain to those same NASA managers now....

Rex Black Consulting Services, Inc.  31520 Beck Road, Bulverde, TX 78163 USA
CTO, Pure Testing, Pvt Ltd  +1 (830) 438-4830 www.rexblackconsulting.com

Brave New Ballot, Avi Rubin