The Risks Digest

The RISKS Digest

Forum on Risks to the Public in Computers and Related Systems

ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator

Volume 15 Issue 81

Friday 29 April 1994

Contents

o Boot Prom commits Denial of Service Attack
Dave Wortman
o Cyrix 486 CPU Bug
Dave Methvin
o Call Identifier (tm) forgets list of received calls
Robert Chesler
o Re: Unwanted FAX received via voicemail
Declan A. Rieb
o Re: Stress Analysis of a Software Project
Tom Davis via Joan Eslinger
A. Padgett Peterson
o Inspecting Critical Software
David Parnas via Jan Arsenault
o Info on RISKS (comp.risks)

Boot Prom commits Denial of Service Attack

Dave Wortman <dw@pdp1.sys.toronto.edu>
Fri, 29 Apr 1994 12:52:08 -0400
A major power outage here on Tuesday demonstrated the risks of
excessive automation and administrative convenience.

Our computing environment consists of a heterogeneous network of
Sun, Dec and IBM workstations and related fileservers.
When a Sun workstation boots up, a hardware prom issues a rarp request
to establish the workstations network address and to identify a server
that can provide it with a bootprogram and then the Unix kernel.
The boot prom uses the trivial file transfer protocol (tftp)
to request the boot program.  It initially issues a tftp request
to the server it has identified, but if that tftp request times
out then it broadcasts a tftp request on its local network
looking for any server that can provide it with a bootprogram.
It keeps repeating this process until it receives a boot program.
One the Suns the prom has no builtin knowledge of its network address
or the network address of the server.
There are some good reasons for keeping the boot prom ignorant of
its network environment and using a broadcast protocol, including
the administrative convenience of not having to do anything to
workstations when the server changes and providing a degree of robustness
in a multi-server environment.

In recent years there have been security problems related to the
tftp protocol so in our environment the Dec workstations run security
monitoring software that keeps a log of failed tftp attempts to help
detect potential intruders.  The security software writes a log file
of failed tftp requests and also puts a message on the affected
machines console.

What got us into trouble after the power outage was that the Sun workstations
came back online but the corresponding Sun servers came up in a wedged
state in which they responded to the initial rarp request but then failed
to respond to any workstations tftp request for a boot program.
After the initial tftp request to the Sun server timed out, our network
was flooded with tftp requests from many Sun workstations all trying to
find any server that could boot them.

In the meantime the Dec workstations on the network had rebooted
successfully and were being used by a number of professors and students.
However, these machines soon became unusable due to the effort required
to deal with the flood of tftp requests.  The security monitoring
software contributed to this problem by writing messages to each
machines console window (ignorable but consumptive of resources) and
by almost filling up a critical file system with its log files.
If this filesystem had filled up, the machines would have been totally
unusable.  Even if we hadn't been running the security monitoring
software, usability of these workstations would have been impaired
by the handling of the tftp requests.

There are several things that could have been done better:
- the question of whether falling back to a broadcast protocol for
  booting is the right approach should be reexamined.  On most systems
  the set of servers that could successfully respond to a boot request is
  a) small, b) well-known, and c) changes very slowly over time.
- the boot proms should use some form of backoff strategy when
  the tftp  requests consistently fail to avoid overloading the
  network.
- our security logging software needs to be more robust in dealing
  with its log files.  Waiting until a log file write fails due
  to a full filesystem is too late if the full file system will
  cause other processes to crash.  This is tricky since we
  don't want a introduce a mechanism that would allow an intruder
  to overwhelm the security software with failed attempts and then
  proceed to do dirty work once logging has been suspended due to log
  file overflow.

A curious legal question comes to mind:  could the manufacturer
or the proprietor of the workstation containing boot prom be held
guilty of a "denial of service attack" on our Dec workstations?
If an individual had issued all of those tftp requests we certainly
would be considering the question.


Cyrix 486 CPU Bug

Dave Methvin <0003122224@mcimail.com>
Fri, 29 Apr 94 07:40 EST
I'm an editor at Windows Magazine.  In our May issue I wrote a news
story reporting a bug in the Cyrix Cx486DX CPU.  The Cyrix Cx486DX was
designed to be completely software-compatible with Intel's i486DX
processor.  However, Ed Curry of Lone Star Evaluation Labs (LSEL) found
a bug relating to floating-point operations while doing some in-depth
compatibility testing.  Cyrix shipped thousands of chips with this bug
before April 1994, but has now fixed the problem.

The bug occurs when a register load instruction (such as MOV reg,mem)
is followed by an instruction that clears the floating-point status
register (FCLEX).  If the memory location being referenced is in the
CPU's internal cache, the MOV instruction works fine.  If, however, the
MOV requires an external bus cycle, executing the FCLEX instruction
aborts the cycle.  As a result, the register is not loaded properly.

The risk here is that someone may run software on the Cx486DX
that generates incorrect results where an i486DX would work fine.
The Cyrix position is that this is a minor bug and that we (Windows
Magazine and LSEL) are making too much of it.  However, LSEL has seen
the bug in their test code compiled under OS/2 and Windows NT.  The
test code performs typical engineering and scientific calculations, so
it's not contrived or artificial.  We have not found the problem in
any shrink-wrapped application.  Most MS-DOS and Microsoft Windows
insert a FWAIT instruction before any floating-point instruction, so
they generally won't exhibit the problem.

What does the Risks readership think? Are we making too much of this?
Is anyone out there using PC with a Cx486DX?


Call Identifier (tm) forgets list of received calls

Robert Chesler <rob@chesler.absol.com>
Fri, 29 Apr 94 13:28:10 -0400
I accepted a no-installation-cost trial of Caller ID and found
it somewhat useful for correlating call times with answering machine
messages, but found 90% of my received calls were out of my area
and thus had no number actually displayed, only the date and time.

Last night I noticed that the box had cleared out its memory.
No call had been received on that line between the time I had last
checked it and the time I noticed it with an empty list.

The risk here is that if some message was sent to the box through
the phone line to clear its list, then the box would be less useful
for someone using the box to catch a crank caller or even log
when important calls or messages were received.  If the caller
ID protocol includes such a message, then such a message could
undoubtedly be faked if someone got physical access to a residence's
network interface or telephone company signalling.

I'm sure that boxes more advanced than the promotional box that
was given to me might have precautions or a printed log, but I would
imagine that the promotional boxes are widely used.

--Robert


Unwanted FAX received via voicemail

"Declan A. Rieb" <darieb@sandia.gov>
Tue, 26 Apr 1994 15:09:57 -0600 (MDT)
The voicemail system I use allows incoming FAXs to be saved and handled as
messages. Upon receipt, the system notifies the user that there is an
incoming fax message, and you can even query for the number of pages.
When a message exists in the "voice mailbox", one can have the system
forward it to a real FAX machine (either a preselected "primary" FAX or
any other phone number.) Requesting such a forward places the FAX message
into a queue, meaning it may actually be sent at some future time.

Last week I received a 5-page FAX message. It did not come from a local
caller (one on the same telephone switch.) All I knew was that it was
five pages. I sent it off to my primary printer, and an hour or so later
went to pick it up. No FAX for me there. I tried again. No FAX for me.
FAX machine broken? After a day of this, I sent the FAX to a machine and
promptly went to watch. Out came a list of imported tequila prices, and
several blank pages! I recalled seeing several such lists at the other
FAX machines...But none were addressed to me! Surely they weren't
mine...but a closer inspection showed that the FAX phone number listed
was indeed mine (perhaps a missing area code?)

Whoa! that kind of business is illegal here! And I've been spreading the
things around the area. At least I didn't have my name on them, but the
phone number was mine!

Welcome to the wonderful world of hi-tech. It used to be that FAX
machines were relatively rare, and "dialing" a wrong number would mean
the FAX doesn't get sent. Now, EVERY phone here can receive a FAX, and we
can send multiple copies out without knowing what it is we sent! Yes,
I'll be a bit more careful in the future.

   [A surprising number of readers chided me for NOT having
   appended a "You mean a FAX PAS?  PGN" appendum.  THANKS!  PGN]


Re: Stress Analysis of a Software Project (Davis/Leichter R-15.80)

Joan Eslinger <wombat@kilimanjaro.engr.sgi.com>
Fri, 29 Apr 1994 19:35:15 GMT
The memo Jerry Leichter posted was an actual Silicon Graphics memo.
However, life for Silicon Graphics and Tom Davis is not quite so bleak
as some might think. Tom Davis wrote the original memo to point out
problems and ask everyone to help fix them.

It was very effective. I installed a beta version of the new 5.2 release on my
Indy in January, and only rebooted the machine a couple of weeks ago because I
was moving to another building. Sure, I had to add another swap file
on-the-fly about once a month because my emacs processes grew so large :-),
but the system did not crash. And performance is quite snappy. "Watch the
skies."

Since the memo has been popping up all over the net, Tom has written a reply
to it, included below. There isn't really a RISKS tie-in, unless you count the
risk of having only the "bad" half of a story get wide distribution.

Joan Eslinger / Silicon Graphics / wombat@sgi.com

-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-

I am the author of the original memo below, which was intended for
internal Silicon Graphics use only, and was not for anyone outside the
company.  But since it has been leaked to the net, and is beginning to
be used by competitor's sales people, I feel a response is required.

I don't believe that these problems are unique to Silicon Graphics.
>From discussions with friends who are insiders in many different
companies, I am certain that similar memos could be written about the
software of each of our competitors.

What I like about working for Silicon Graphics is that at least here,
something is being done about it -- I worked for companies in the past
where the response would have been to stick our heads in the sand in
hopes that the problems would just fix themselves.  If I hadn't thought
that the memo would catalyze some change here, I wouldn't have written
it.

The details appear as comments to my original article below.  Luckily,
the article is 6 months old, and we have had a chance to make some
significant progress.

Typically, what happens is that each faster generation of hardware is
followed by software that more than compensates for the increased
speed, but as a result of this memo, Silicon Graphics has been able to
skip one of the slowing software cycles, making, instead, a performance
and quality based release.  The next release is going to be similar,
and in the meantime, we get an extra hardware boost from the faster
R4600 processors.

  -- Tom Davis
     Silicon Graphics


General comments:

As a fairly direct result of this memo, SGI decided not to continue
"business as usual" in software development.  The approach we took to
the problem was the following:

With the 4.0.5abcdefghi... fiasco, and the fact that the 5.* releases
were still for specific machines, our developers were desperate for an
all-platforms release.  We decided to make such a release relatively
soon -- and 5.2 actually MRed in February.  The 5.2 release had two
goals -- primarily, all-platform, and given that it went out in
February, do as much performance-tuning and bug-fixing as time allowed.

In that period, the performance on 16MB systems was essentially doubled,
which improved performance on larger systems as well, but to a lesser
degree.  Significant numbers of bugs were fixed as well.

Some people hoped that a few quick fixes would bring back all the
performance in 5.2, but a little investigation indicated that was a
list of things to be done, and that another quality release would be
required.

The 5.3 release, not officially scheduled, but which should be MRed
around October or November is that quality (performance and bug-fix)
release.  We'll add a few new features, but they will be the exception
rather than the rule.  The longer time before the 5.3 release should
give us time to do a thorough job of solving our problems.

For 5.3, there's also time to set up solid performance and bug-fixing
goals, and these are already being discussed.

And most important -- the worst problems were with 16 MB systems that
paged their brains out.  They are better now, but not great.  But we
don't sell them.  One of the 5.3 goals is to improve performance (or
reduce sizes enough) that it will be acceptable on a 16 MB machine.

The kernel memory leaks are all fixed, and many of the important
programs have been reduced in size.  For 5.2, 5 or 6 of our most
heavily-used programs were subjected to close scrutiny to find out where
the performance went, and many were significantly improved.

A lot more work is planned for 5.3 to reduce the sizes of the
executables.

Work is continuing on the DSOs to split them up properly so that they
don't all have to be loaded, and to improve their performance and
start-up time.  We're working to make "quick-starting" happen more
automatically.

>     PERFORMANCE UPDATE

I don't think it's unusual to do benchmarks with non-standard compiler
settings.  Both we and our competition have done so for a long time.  We
do ship all the libraries, et cetera, necessary to duplicate these
results so customers for whom speed is the only objective can pay the
cost of larger executables in exchange for the added speed.

Unfortunately, I can't re-run some of these tests, but 5.2 is definitely
better than 5.1.


I think the 5.1 fiasco has caused a lot of our management to see the
light, and in conversations with people at all levels, it's clear that
nobody wants to see anything like it happen again.  The 5.2 and future
5.3 releases seem to be steps in the right direction.

But there's still a lot of work to do, and we in engineering can use
every minute between now and the 5.3 release to improve things.

The 5.3 release is being planned with reasonable beta-cycles, and with
enough time between now and "code freeze" to make significant
improvements.

> Management Issues:

I think this sort of disconnect is not too unusual -- there is always
enormous pressure to announce a very low entry price-point, and the
16MB system provided that.  Everybody does this with the full knowledge
that on a minimum system, you won't be able to run many interesting
applications, and almost everyone will have to purchase a bit more
memory.  It's just that in the case of Indy, there were so many new
features that the proposed minimal system was embarrassingly slow.

The "fix" is simply not to ship the 16MB systems which will insure that
everyone will get a very usable machine.  All we really lose is our low
entry price point, and the gain is that we won't have to deal with the
few irate customers who bought a minimal system.

Although some of our performance loss is due to more complicated
features, the vast majority is due to the fact that more memory is
required, and without it, the systems page with a consequent dramatic
reduction in performance.  The 4.0.X -> 5.X change on our large machines
was measurable, but not nearly so noticeable as on the smaller ones.


We're still not completely there (as far as I can tell) with respect to
better software management.  The good thing is that many of our
higher-level managers are acutely aware of the problem now -- Forest
Baskett and Tom Jermoluk are extremely concerned, for example.

It's too bad it took a shock like 5.1 to make everybody take notice, but
at least they did, and we're doing the right sorts of things to correct it.

   [Moderator deleted the entire interstiated message from RISKS-15.80.  PGN]


Re: Stress Analysis of a Software Project (Davis/Leichter R-15.80)

A. Padgett Peterson <padgett@tccslr.dnet.mmc.com>
Fri, 29 Apr 94 08:22:09 -0400
For years, people have been postulating projects that are too complicated to
comprehend and we have seen several examples of what happens when this occurs.
IMHO the only solution is to separate functions into stand-alones that utilize
a common and understandable foundation and which are understandable.

Where many have felt that a single integrated system is best, I have often
been called in to "put out fires" and the first thing I do is to separate the
problem into "atoms", the least divisible pieces. It is astounding how often
problems that cannot be seen when tightly wrapped in a package becomes obvious
when viewed by itself. Sometimes you just can't see the tree for the forest.

> Some people claim that we need new software debugging tools to look at
> the problem, and that may be true, but it's not a short-term solution,
> and it runs the risk of causing us to spend all our time designing
> performance measurement tools, rather than fixing performance.

This is disturbing. Unless you have the tools to properly examine a system you
cannot tell what is really going on and the reccuring theme of the memo seems
to be that no-one knows. Without the proper tools, the job will never be
completed.

Again I can only speak from personal experience but cannot count the times
when called in to fix a problem that supervisors have gotten very antsy
waiting for something to happen while the envelope is being defined. Have
found that unless the system is understood, it *can't* be fixed (see "little
silver hammer" syndrome).

The problem with the engineers also appears symptomatic. Engineers are
supremely good at taking a concept and making it work. They are not generally
good at determining that a concept is flawed in the first place, instead often
they will continue to work as if the concept were correct and they were just
lacking in skill. This leads to precisely the morale problems described.

The major problem with engineers is that they accomplish the impossible
so often that the marketeers come to expect it from them.

The real problem seems to be simply "no-one in charge" and is all too common
in large organizations. History is rife with examples of companies, states,
countries that became too concentrated at the top and fell victim to the
huns/vandals/Standard Oil as they rose to power. "Think of it as Evolution in
Action" - Jerry Pournelle

Padgett


Inspecting Critical Software, a course by David Parnas

<arsenau@mcmail.cis.mcmaster.ca>
Tue, 26 Apr 94 20:43:34 EDT
  Inspecting Critical Software: An Intensive 3-day Course offered by
  The Faculty of Engineering, McMaster University, Hamilton, Ontario, Canada

  Taught by Prof. David Lorge Parnas, with the support of TRIO

  June 7, 8, 9, 1994

1.  Background

    Software is critical to the operation of modern companies and is
frequently a key component of modern products.  Some pieces of software
are particularly critical; if they are not correct, the system will have
serious failures.  Standard methods of software inspection are not
systematic. This course teaches a procedure for software inspection that
is based on a sound mathematical model and can be carried out
systematically by large groups.
    The software inspection procedure combines methods used at IBM, work
originally done at the U.S. Naval Reserve Laboratory for the A-7E
aircraft, and procedures applied to the inspection of software at the
Darlington Nuclear Power Generating Station.  The method has been refined
and enhanced by the Software Engineering Research Group at McMaster
University's Communication Research Laboratory.  It can be applied to
software written in any imperative programming language.

2.  What Will Participants Learn?

    Participants in the course should return to their workplace with an
understanding of the way that mathematics can be used to document and
analyze programs.  They will also return with documentation of a piece of
their employer's code that can be used to explain the work to others.

3.  Programme

Day 1  Predicate Logic and Program-Functions/Relations

1)  Overview and Case Study
    A discussion of previous applications of the method.
2)  Predicate Logic
    The inspection method is based on predicate logic, which will be
    reviewed in this section.
3)  Tabular Expressions
    This session will be devoted to the writing of readable
    predicates using two-dimensional notations rather than classical
    one-dimensional expressions.  There will be numerous examples.
    Participants will be taught to read and write tabular expressions.
4)  Describing Program Function
    This session will be devoted to writing program descriptions
    using predicates and tables.

Day 2  Inspection of Dijkstra's Dutch National Flag Program

    Participants will be given a copy of E.W. Dijkstra's explanation of a
program along with several sample programs.  They will be asked to apply
the inspection method and approve or reject each program.  The instructor
and some assistants will be available as consultants during this process.

Day 3  Morning:  Inspection of a "Real" Program

    Working in small groups, the participants will take a section of a
program from their company and inspect it using the method learned so
far, producing documentation as they go.  Day 3 Afternoon: Report on the
Inspection Results, Discussion of Testing
    The first part of the afternoon will be devoted to a series of
reports by the participants on the results of their efforts in the
morning.  The remainder of the afternoon will be devoted to a discussion
of the interaction between testing and inspection.  We treat testing, not
as an alternative to inspection, but as complementary to inspection.  We
discuss the way that the documentation produced in the inspection process
can be used in the testing process.

4.  Learning By Doing

    The course is language-independent.  In fact, on the third day,
participants will inspect code written in any language that they use in
the workplace.  This course presents an approach to active design reviews
that has the reviewers writing precise documentation about the program
and explaining their documentation to an audience of other reviewers.  A
significant part of each day will be spent using the ideas that have been
presented to determine whether or not programs do what they are supposed
to do.  On the last day, participants will inspect a small program that
they brought with them from their company.  Participants should leave the
course with improved ability to inspect software.

5.  Who Should Attend?

    Participants should be experienced programmers and not afraid of
learning a little mathematics.  The mathematical basis for the method is
classical and takes up only a few hours in the course.  However, it is
fundamental to understanding the method.  It is expected that the
participants will be used to reading code written by others and it will
be helpful if they can read Pascal.

6.  What Should You Bring With You?

    For the exercise on the third day, each participant should bring a
small program, perhaps 50 lines that are critical to some project.  It
need not be "mature" code, but it should compile and have survived some
testing or use.  If there are several participants from the same company,
they may work in small groups on slightly larger programs.  You may want
to bring a reference manual and some conventional documentation about the
program with you.  It will help if one of the participants is familiar
with the program.

7.  The Instructor

    The course will be taught by Prof. David L. Parnas, an
internationally recognized expert on Software Engineering.  Dr.  Parnas
initiated and led the U.S. Navy's Software Cost Reduction Project, where
the tabular notation was first used, advised the AECB on the use of these
methods at Darlington, worked with IBM's Federal Systems Division, leads
the Software Engineering Research Group at McMaster University and is a
Project Leader for the Telecommunications Research Institute of Ontario.

Information about costs, registration, etc. can be obtained from:
Jan Arsenault, Faculty of Engineering,
JHE-201A, McMaster University,
1280 Main Street West,
Hamilton, ON, Canada, L8S 4L7.

Telephone: 905 525 9140 x 24910
email: arsenau@mcmail.cis.mcmaster.ca

Please report problems with the web pages to the maintainer