1957 Bob Ashenhurst hoax on Rick Gould's PhD Thesis]

"Lost Squirrel Causes Troublesome Power Surge"
Providence Journal, Thursday, October 30, 1986

An electrical power surge caused computers to go on the blink in Providence brokerage houses, banks, and office buildings yesterday. A Narragansett Electric Co. spokesman said a squirrel caused a short-circuit in a transformer.

Charles Moran, the spokesman, said the squirrel got into a transformer at Narragansett Electric's Dyer Street substation at 11:10am. Moran said a backup transformer took over automatically and prevented a power failure in downtown Providence. But "there was a slight power surge," he said.

Computers in the money-market divisions of the Fleet and Old Stone Banks were down for half an hour after the power surge, but banking services were not disrupted, spokesmen said.

Dean Witter Reynolds Inc., a brokerage firm, had trouble getting quotes on stock prices, according to Sharon Tallman, who said some of the firm's Quotron machines went down.

At Superior Court, the computer was down for two hours, but it didn't affect court scheduling, a spokesman said.

"The mainframe on our IBM computer was down for over an hour," said Robert Perreira of the Providence Journal Co.'s computer services unit. Perreira said 14 systems went down and "three of them did not come up immediately."

A Journal Co. electrician said the power surge caused "our lightning control panel to behave like a runaway monster." It caused a computer to activate a program designed to save energy on weekends by shutting off the lights in part of the building. "The computer thinks it's Sunday," the electrician said.

[A similar squirrelcide happened at SRI a while back. The side-effects were quite prolonged and unanticipated. On occasional Saturdays for several months all of SRI was powerless while repairs were repeatedly attempted but not quite completely accomplished. PGN]
A few months ago, Sixty Minutes ran an episode about the fact that the FAA had rejected Honeywell's collision avoidance system in favor of its own (untested, uncompleted) system. I think the episode aired shortly after the Aeromexico collision in California. One of the people Sixty Minutes interviewed had been an FAA official (executive?) until he became too vocal about the fact that the FAA was ignoring a workable system. It was his opinion that *many* collisions and near-misses would never have happened if the Honeywell system had been adopted when it was first introduced.

The Honeywell system resides in the aircraft and projects an envelope ahead of the plane that can be detected by another Honeywell system. The system communicates with the pilot by issuing a warning when an intersection with another plane's envelope is detected, and it gives a direction in which to turn to avoid collision. The FAA system is tied into the ground-control system and seems to rely on tracking aircraft from radar on the ground. I was not too clear on this.

The advantage of the Honeywell system is that it is small, cheap, and does not require the pilot to rely on any outside assistance. The drawback is that *all* planes need to be equipped with the system. But since it is small and cheap, that would not be a great problem. I can't remember all the pros and cons of the FAA system, but the cons had a clear majority. The system is much more complicated, involves ground-control personnel notifying pilots about impending collisions, and is expensive.

Charlie Hurd
After graduating about ten years ago, I entered the Air Force as a Satellite Systems Engineer. I was assigned to a unit operating a particular NORAD satellite system...no names, no mission statements, please. A buddy DID almost start World War III one night, though.

My job was real-time and non-real-time analysis of mission data from the spacecraft; the end result of my analysis was to advise the NORAD Senior Director of the validity of the data. A lot of factors had to be incorporated in my analysis...in "N" seconds, I had to take into account which spacecraft had reported, its health and status, DEFCON level, and "numerous other mission critical elements." Nudge, nudge... Anyway, the job was highly dependent upon the experience of the analyst, as well as his intuition...we had to have a FEEL for what was right.

Three years after I joined the squadron, the unit was reassigned from the Aerospace Defense Command (ADCOM) to the Strategic Air Command (SAC). Now, SAC is the largest producer of automatic humans in the free world. In a word, SAC is checklist crazy...every task is broken down to the largest number of subtasks. SAC treats its checklists as a way to eliminate the human element. Training two people to work as a team is unnecessary...all they have to be able to do is call off the proper steps from the checklist.

SAC uses simulators to allow its people to practice every step, and to handle every contingency. For instance, a missile launch officer has gone through the launch procedure in the simulator dozens of times before he is placed in an actual control room. The opening sequence in WAR GAMES is an example of what SAC is trying to avoid: The crew must automatically perform its tasks, spending no time thinking about what the consequences are. The crew must not bring their emotions into play, nor even any additional knowledge they may have. Every action must be governed by a checklist step.
You can see what our problem was...how do you place "intuition" and "gut feel" onto a checklist? Our job could not be performed by an automaton; we had to call on experience and a deep understanding of system operation in order to provide our assessment. We argued, to no avail. We had to have a checklist. So we thought and thought, and broke the analysis task into as many subelements as we could. The last subelement was OPERATOR INTUITION.

Did SAC complain? Nahhhhh...they never read the thing. Occasionally they'd show up for Operational Readiness Inspections. During the simulation, their checklist called for them to verify that we had our EVENT ASSESSMENT checklist open. Their checklist didn't call for them to actually read our checklists...
[Dave Benson said that we should assume that an overloaded system will fail to handle any load at all. I said an overloaded system could fail by handling no load, by handling its ceiling load and no more, or by handling its ceiling load and some decreasing part of additional traffic, and that we had no grounds for making that decision until a design, designers, and implementors existed. Dave Benson said history tells us no system works without extensive realistic testing.]

If that summary sounds as if I thought Dave's remarks didn't address what I said, that's correct. I know of systems (not military systems, with which I have no experience) which demonstrate each of those overload behaviors; I'm sure he does, too. Overload behavior is something that certainly can be stated explicitly as part of the design, and it's generally a pretty easy thing to simulate, compared with the problem of simulating all possible inputs.

Note that I am talking ONLY about response to overload, which is where the discussion started. I have plenty of doubts about many parts of the SDI program and I don't for a minute expect that they will come up with a design or an implementation that I will be willing to trust. But Dave's original statement that "We should assume that a system capable of handling N targets/sec will, when presented with 2N targets, fail to handle any at all." is without basis, and his further statements referring to 30 years of software development history offer nothing to support it.

Systems fail in many ways and there is no reason to assume a particular failure mode without looking at the design and implementation. Worst-case assumptions are often useful, but in this case they are unenlightening; we all know that in the worst case nothing works, all the missiles fall through, and c'est ca. I'm a lot more interested in the probability of that worst case than in the fact that that IS the worst case.
Dave did not say anything to convince me that an arbitrary system's most likely response to overload is total failure; in my own experience (admittedly only 20 years) more systems respond to overload with degraded or limited performance than with total failure. scott preece gould/csd - urbana uucp: ihnp4!uiucdcs!ccvaxa!preece
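The three overload behaviors named in the summary above can be written down in a few lines, which is roughly what "easy to simulate" means here. The following is only a sketch under stated assumptions: the function name handled_load, the mode labels, and especially the formula used for the "decreasing part of additional traffic" case are hypothetical choices made for illustration, not anything taken from a real design.

```python
def handled_load(offered, capacity, mode):
    """Return the load actually handled under one hypothetical overload model.

    offered and capacity are in the same units (for example targets/sec).
    mode selects one of the three behaviors discussed in the text.
    """
    if offered <= capacity:
        return float(offered)  # no overload: everything offered is handled

    excess = offered - capacity
    if mode == "collapse":
        # Total failure: an overloaded system handles nothing at all.
        return 0.0
    if mode == "ceiling":
        # Handles its ceiling load and no more.
        return float(capacity)
    if mode == "degrade":
        # Handles the ceiling load plus a shrinking fraction of the excess.
        # The fraction capacity / (capacity + excess) is an assumed model.
        return capacity + excess * capacity / (capacity + excess)
    raise ValueError(f"unknown overload mode: {mode!r}")


if __name__ == "__main__":
    N = 100  # assumed design capacity
    for offered in (50, 100, 200, 400):
        row = {m: handled_load(offered, N, m)
               for m in ("collapse", "ceiling", "degrade")}
        print(offered, row)
```

Which of the three curves a real system follows is exactly the question that cannot be answered without looking at its design and implementation; the sketch only shows that each behavior can be stated explicitly and compared.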
The latest issue of DATAMATION has an excellent article on computerized vote counting. I recommend it to all. It addresses problems with punch card voting, but doesn't address the problems with computerized voting booths. The three biggest problems with computerized voting booths are secrecy of internal operation, lack of recount capability, and inability for the voters to ensure that the computer votes as instructed. Some of the people whose names are in the article were at BU in August for the Symposium on Security and Reliability of Computers in the Electoral Process. These people are doing great work, especially considering the fact that they are generally financing it on their own. I am presently compiling some poll watching guidelines for computerized elections. I can send a copy to anyone who will be a poll watcher on Tuesday.
[Remembering that the RISKS Forum is aimed at fostering better systems in the future as well as exposing limitations with existing systems, it is appropriate to include the following item. PGN]

CALL FOR PAPERS

FTCS17
THE SEVENTEENTH INTERNATIONAL SYMPOSIUM ON FAULT-TOLERANT COMPUTING
sponsored by IEEE Computer Society's Technical Committee on Fault-Tolerant Computing
Pittsburgh, PA, July 6-8, 1987
**** NOTE NEW DATES ****

The Fault-Tolerant Computing Symposium has, since 1971, become the most important forum for discussion of the state-of-the-art in fault-tolerant computing. It addresses all aspects of specifying, designing, modeling, implementing, testing, diagnosing and evaluating dependable and fault-tolerant computing systems and their components. A special theme of the conference will be the practical application of fault-tolerance to the design of safety critical systems, real-time systems, switching systems and transaction systems.

Papers relating to the following areas are invited:

a) design methods, algorithms for distributed fault-tolerant software systems,
b) specification, design, testing, verification of reliable software,
c) specification, design, testing, verification, diagnosis of reliable hardware,
d) fault-tolerant hardware system design and architecture,
e) reliability, availability, safety modeling and measurements,
f) fault-tolerant computing systems for safe process control, digital switching, manufacturing automation, and on-line transaction processing.

Authors should submit 6 copies of papers before the submission deadline December 5, 1986 to the program co-chairmen: Flaviu Cristian, IBM Research K55/801, 650 Harry Rd., San Jose, Ca 95120-6099, USA, and Jack Goldberg, SRI International, 333 Ravenswood Ave., Menlo Park, Ca 94025. Papers in areas a, b, and f should be sent to F. Cristian, and papers in areas c, d, and e to J. Goldberg.
Papers should be no longer than 5000 words, should include a clear description of the problem being discussed, comparisons with extant work, and a section on major original contributions. The front page should include a contact author's complete mailing address, telephone number and net address (if available), and should clearly indicate the paper's word count and the area to which the paper is submitted. Submissions arriving late or departing from these guidelines risk rejection without consideration of their merits. The Symposium chair and vice-chair are John Shen and Dan Siewiorek, both from Carnegie Mellon University, USA. The program co-chairmen are: Flaviu Cristian, IBM Research, USA, and Jack Goldberg, SRI International, USA. Publicity chairman is Bella Bose, Oregon State Univ., USA. [Program Committee omitted here.]