There are still a lot of unknowns, several days later. A current theory on the 14 Aug 2003 Northeast blackout attributes the start of the cascade to a human failure to respond properly to an alarm denoting the failure of transmission lines (including a tree in contact with a power line) near Cleveland at 3:06pm -- over an hour before the massive nine-second propagation. According to the front page of *The New York Times*, 17 Aug 2003, "It is not clear whether the problem with the alarm delayed action by the utility, FirstEnergy Corporation, or the consortium that controls the regional grid, the Midwest Independent System Operator." The same article notes that the newly released timeline from the North American Electric Reliability Council (which is investigating the blackout) does not answer how a local failure "could have spread catastrophically to other regions, overwhelming mechanisms designed to halt such a spread." Note the similarity with the massive *West* Coast grid blackout on 2 Jul 1996, in which a tree touched a power line and the operator who had detected an anomaly could not find the phone number required for the manual alert. And then of course there were numerous claims that such a massive outage could not happen again -- until the 8-state collapse on 10 Aug 1996! (See RISKS-18.27 to 29 and especially RISKS-22.32, on "why it couldn't happen again"!) Rather than harp on the lessons that need to be learned, let me suggest that you read Pat Lincoln's thoughtful piece, which follows.
One lesson that can be drawn from incidents like the recent massive power outage is that decreasing margins in all our infrastructures place critical societal functions at greater and greater risk of significant disruption from rare accidental and malicious acts. Redefining acceptable levels of risk and protection as the world changes is hard work, but it needs to be done.

Cost pressures and tight engineering under benign assumptions lead to thin margins. Optimized engineering leaves most events of small consequence (we've engineered systems to tolerate them), but some rare events can cause massive disruption. It would be 'bad engineering' to overdesign a system to tolerate very rare events if that tolerance costs more than the failures it would prevent (in expected-value-to-customer terms). Fragility to extremely rare events can thus be seen as good business. It would be surprising if there weren't rare disruptions (like massive power outages) in highly optimized infrastructures. But the invisible hand of economics and good engineering leaves systems designed and optimized under assumptions of relatively benign environments at great risk if new or unexpected threats arise.

Computer systems change very rapidly, and new threats arise with disturbing speed. The current hardware-manufacturing, software-development, and personnel practices of our cyber infrastructure are obviously subject to the same economic motivations described above. So they are already (and will become even more) fragile to rare or unexpected accidental or malicious events. That's 'good business' paving the road to vulnerabilities.

Post 9/11, we can point out how previously almost unthinkable scenarios are more thinkable now, and thus engineered defenses against potential attacks are more strongly motivated. Government procurement practices, corporate and individual liability, government mandates, and other mechanisms could have a profound impact on the reliability and cost of cyber infrastructure, but also on large-scale economic concerns, so it may be imprudent to act without defining the threats. Defining and quantifying cyber threats and their impact, particularly in combination with coordinated physical and psychological attacks and effects, requires deep (read: expensive) contemplative research, development, large-scale experimentation, etc. Once new threats and defenses are defined, the costs associated with deploying those mechanisms can be at least partially quantified, and then well-reasoned decisions can be made about appropriate levels of protection against various risks.

The pace of technology change and societal reliance on these systems amplify the uncertainty, urgency, and magnitude of the risk here. It is almost unthinkable that western societies would not put very large resources against a problem of this grave potential.
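A purely illustrative sketch of the expected-value arithmetic above (all figures are invented, not drawn from any utility's books): suppose carrying extra transmission margin costs $50M a year, and the rare event it would prevent is expected once in 20 years with a $500M impact. The expected annual loss avoided is then only $25M, so a narrowly optimized operator will treat the margin as 'bad engineering' and skip it.

  # Hypothetical expected-value comparison (Python); every number is invented
  margin_cost_per_year = 50e6          # annual cost of carrying extra capacity
  event_probability_per_year = 1 / 20  # assumed frequency of the rare event
  event_impact = 500e6                 # assumed cost of one massive outage

  expected_loss_avoided = event_probability_per_year * event_impact   # 25e6
  worth_it = expected_loss_avoided > margin_cost_per_year             # False
  print(expected_loss_avoided, worth_it)

The catch, as the piece argues, is that the probability term comes from a benign threat model; change the threat and the same arithmetic flips.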
Many of the reports on today's blackout have expressed the view that it comes as a complete surprise. The reality of course is that such a blackout was entirely expected by those who follow the power industry, as I discuss in the new short audio (mp3) Fact Squad Radio feature, "Blackouts and Bush's Buddies." It's playable via: http://www.factsquad.org/radio Lauren Weinstein, email@example.com http://www.pfir.org/lauren Moderator, PRIVACY Forum - http://www.vortex.com Tel: +1 (818) 225-2800 [Also, see System's Crash Was Predicted: http://www.washingtonpost.com/wp-dyn/articles/A61117-2003Aug15.html PGN]
During yesterday's blackout in northeast U.S. states and several major Canadian cities, wireless networks and Internet connections allowed people to keep communicating. The chief business officer of Equinix, which operates Internet Business Exchange centers that serve more than 90% of the world's Internet routes, explains: "We lost all utility power out there, but we immediately went to battery power for a few seconds, at which point all of our major generators kicked in" to allow normal operations that were "totally seamless to customers." Internet customers therefore suffered "no disruptions whatsoever" to their Internet service resulting from the electrical system failures. [AP/*San Jose Mercury News*, 15 Aug 2003; NewsScan Daily, 15 August 2003] http://www.siliconvalley.com/mld/siliconvalley/6540489.htm
Quoting [with permission] a colleague in a hallway conversation:

Review of yesterday's lesson:

Q. What is the ONE critical Infrastructure? [upon which all the others depend]
A. Electricity

Q. What is its most salient feature?
A. Nobody knows how it works. [Or perhaps more correctly, how it DOESN'T work.]

Declan A Rieb, <firstname.lastname@example.org>, 505 845-8515
Sandia National Laboratories MS1202, Albuquerque NM 87185-1202
> All this supposedly happened in nine seconds, and yet the cause is still
> unclear!

The very fact that it happened so fast is one reason that expert speculation on the cause has been slow to come. (Political speculation, of course, occurred almost as fast as the outage.) Most large outages, including the ones in the northeast US in 1965 and 1977, propagated over a period of many minutes. They involved overloads that did not immediately trip protection. The symptoms were such that the (human) system operators were expected to see them and react, and the failure of the operators to shed load quickly was a major factor in the extent of those outages. By contrast, this outage spread so fast that only automatic controls had any chance of stopping it once it began. (Whether recognizably dangerous conditions existed before the first failure remains to be seen. Analysis of contingencies is a major part of online control systems, but choosing the proper actions to minimize risk is an extremely complex problem.)

In those outages decades ago, the system was gradually pulled over the brink. In this outage, it was tossed over the edge like a finger flicking a match stick. The Niagara area saw a flow change of 3GW, the output of three nukes, in under a second.

We're already hearing "we will put changes in place so that this will not happen again". But a system operator who has spent eight hours a day for the past 25 years keeping a system up -- successfully -- needs more than a few seconds to shift mindset and do the almost unthinkable -- shed load -- to protect the system, even when the signs are clear. Problems of this nature are so rare that we do not, and cannot, trust either the humans or the computers. Perhaps the best action would be to provide effective simulators so that operators can spend a few hours a week reminding themselves of what a real emergency feels like. But most likely we will see proposals that leave the humans out of the loop.

Of course, certain technical measures would help. So far, the newspaper analyses of the outage correctly point out limited transmission capacity as a problem. Deeper problems are the anti-regulatory environment, the fact that safety doesn't sell, and the failure to invest in conservation. Building "excess" transmission capacity has no market incentive. Excess capacity is essential to safety, but safety doesn't sell. The market calls it excess capacity; people call it a safety net. When a critical line fails, parallel lines must have "excess" capacity to take over the flow, and this safety net must remain intact when lines are out of service for maintenance. Safety nets are not cheap. Conservation is far more cost-effective than new construction at ensuring continuous availability of electricity. But it is not a market-savvy investment, so until we accept that we need non-market investments in conservation, we will continue to waste our most effective resource.
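To make the contingency analysis mentioned above a bit more concrete, here is a toy N-1 screen (the corridor, flows, and ratings are invented, and a real energy-management system re-solves the full power flow rather than splitting flow in proportion to ratings): remove each line in turn, redistribute its flow onto the surviving parallel lines, and flag anything pushed past its rating -- exactly the case where "excess" parallel capacity is the safety net.

  # Toy N-1 contingency screen (illustrative only; all data are invented).
  # A corridor is a set of parallel lines carrying a shared flow.
  corridor = {"total_flow_mw": 1800,
              "lines": {"A": 1000, "B": 1000, "C": 600}}   # name: rating in MW

  def n_minus_1(c):
      """For each single-line outage, split the flow over the survivors in
      proportion to their ratings and report any resulting overloads."""
      violations = []
      for outaged in c["lines"]:
          survivors = {k: v for k, v in c["lines"].items() if k != outaged}
          total_rating = sum(survivors.values())
          for name, rating in survivors.items():
              share = c["total_flow_mw"] * rating / total_rating
              if share > rating:
                  violations.append((outaged, name, round(share)))
      return violations

  for outaged, overloaded, mw in n_minus_1(corridor):
      print(f"losing {outaged} loads {overloaded} to {mw} MW, above its rating")

The hard part, as noted above, is not the screen itself but deciding -- within seconds -- what to shed when a violation actually appears.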
> A grid overload just after 4pm EDT knocked out power in NY City, Boston,
> Cleveland, Detroit, Toronto, and Ottawa, among many other cities,

For the record, Boston did not lose power. According to the *Boston Herald*, the only cities in Massachusetts that lost power were Pittsfield and Springfield. I don't know first-hand whether that information is accurate, but I do know that the greater Boston area, or at least the portions of it in which I and my coworkers traveled yesterday, never lost power.
Possible connection? Wild guess? I'm not competent to evaluate this: http://www.heise.de/newsticker/data/ju-15.08.03-001/ [in German]

[The cited article is written by Juergen Schmidt, senior editor of heise, which publishes c't, which we have quoted in RISKS before. (See http://www.heise.de/ct/impress.shtml ; tel +49 511 53 52 300.) Basically, this article notes that National Grid is a "reference client" of Northern Dynamics, and that OPC uses COM/DCOM, which is precisely the technology that the Blaster worm trashes. It does not *claim* that OPC was used for any of the SCADA applications that might have triggered the propagation, but merely raises the question of whether this might have been the case. The possibility is not too far-fetched, especially if the common flaw existed in multiple distributed computerized control systems. ADDED NOTE: *The International Herald Tribune* has a story this weekend on MS shutting down www.windowsupdate.com, saying that "Security experts say they have found no evidence that the blackout ... was related" to Blaster. But then so much else is unclear, so who knows? Thanks to Peter Ladkin for providing background on this. PGN]
A program Microsoft instructed customers to use to fix a hole in its Windows software, which is vulnerable to attack by the Blaster/Lovsan worm that infected computers this week, may itself be flawed. A glitch in the Microsoft Windows Update patch-management system used to download Windows software fixes has tricked some customers into thinking their systems were patched to prevent Lovsan, when they really were not, said Russ Cooper, moderator of a mailing list with 30,000 subscribers that tracks Microsoft's software weaknesses. ... [Source: CBS MarketWatch, 15 Aug 2003] http://www.chron.com/cs/CDA/ssistory.mpl/business/2049216
I recently received an e-mail from Microsoft, with the title: "Actions for the Blaster Worm - Special Edition, Microsoft Australia News and Events". It contained (mostly useful) advice on dealing with the Blaster worm, but included this:

> Your computer is not vulnerable to the Blaster worm if
> either of these conditions apply to you:
>
>  * If you are using Microsoft Windows 95; Windows 98;
>    Windows 98 Second Edition (SE); or Windows Millennium (Me).
>  * If you downloaded and installed security update MS03-026
>    prior to 11 August 2003, the date the worm was discovered.

The second of these would be valid if we knew for sure that the worm was not in the wild before it was discovered, but I don't see how we can be confident of that. I would expect the rate of spread to be approximately exponential, until the net begins to become saturated. The worm might have been around for days or even weeks before it was formally "discovered".

Michael Smith, Aurema Pty Limited, PO Box 305, Strawberry Hills 2012, Australia
79 Myrtle Street, Chippendale 2008, Australia +61 2 9698 2322 www.aurema.com
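Michael Smith's point about roughly exponential spread until saturation can be illustrated with a toy logistic model (the host count, seed size, and contact rate below are invented, not measurements of Blaster): the infection stays almost invisible for days and then explodes, so the true release date can easily precede the "discovery" date.

  # Toy logistic model of worm spread (illustrative; all parameters invented)
  vulnerable_hosts = 1_000_000
  infected = 10.0          # assumed initial seed
  contact_rate = 1.5       # new infections per infected host per day (assumed)

  for day in range(1, 16):
      infected += contact_rate * infected * (1 - infected / vulnerable_hosts)
      infected = min(infected, vulnerable_hosts)
      print(f"day {day:2d}: ~{int(infected):,} hosts infected")

In a run like this the worm spends most of its early life below anyone's detection threshold, which is why "patched before the discovery date" is weak evidence that a machine was never exposed.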
[Ah, for the good old days of analog phones with dials.] [Unfortunately, neither news report gives many details about the real cause of the problem or why only some payphones are having problems.]

In New Zealand, about half of Telecom's 5000 public payphones were out of order due to a software bug, and the situation has been slow to improve. Manual reset of each phone may be necessary. [Sources: Bug Downs Pay Phones, Today In New Zealand News, 10 Aug 2003, IRN, and Payphone glitch toll known today, Philip English, 12 Aug 2003, *New Zealand Herald*; PGN-ed]
http://xtramsn.co.nz/news/0,,3882-2576232,00.html
http://www.nzherald.co.nz/storydisplay.cfm?storyID=3517597
[Erroneous title corrected in archive copy]
GNU Servers Hacked, Linux Software May Be Compromised, *Techweb News*
http://www.internetwk.com/breakingNews/showArticle.jhtml?articleID=13100280

In mid-March 2003, someone hacked the primary file servers hosted by the GNU Project, the group that supports the development of many of the components in the Linux operating system, the group acknowledged Wednesday. It warned that the attacker may have inserted malicious code into the free software available for download, including Linux, and posted a set of hashes that users can check against to determine whether what they retrieved is clean. The CERT Coordination Center noted in an advisory posted on 13 Aug 2003 that "because this system serves as a centralized archive of popular software, the insertion of malicious code into the distributed software is a serious threat." At the same time, it reported that there isn't any evidence that the source code posted on the FTP servers was, in fact, compromised.

The Free Software Foundation (FSF), which oversees the GNU Project, has posted a series of checksums -- validation values generated from source code known not to have been compromised -- which users can use to verify what they've downloaded. The attack took place in March but was discovered only in late July. It used an exploit that was revealed on March 17, for which a patch wasn't immediately available. It was during that week-long span of vulnerability that the servers were compromised, the FSF said in a statement. A Trojan horse was placed on the system at that time, possibly for password collection and to use the machine for additional attacks, according to the FSF.

[See also http://zdnet.com.com/2100-1105-5063658.html -- which prompted Keith Rhodes to note the following:
 * The bad news: "The project urged those who have downloaded software from the server since March to check that the source code has not been tampered with."
 * The good news: You actually have source you can check.
PGN]
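For anyone who downloaded from the affected servers, checking a file against the published checksums is straightforward. A minimal sketch follows (the filename and digest are placeholders, not the FSF's actual values, and MD5 is used here only as an illustrative digest algorithm):

  # Minimal download-verification sketch (Python); filename and expected
  # digest below are placeholders -- use the values the project published.
  import hashlib

  def file_digest(path, algorithm="md5", chunk_size=65536):
      """Hash a file in chunks so large tarballs need not fit in memory."""
      h = hashlib.new(algorithm)
      with open(path, "rb") as f:
          for chunk in iter(lambda: f.read(chunk_size), b""):
              h.update(chunk)
      return h.hexdigest()

  expected = "d41d8cd98f00b204e9800998ecf8427e"   # placeholder digest
  actual = file_digest("some-gnu-package.tar.gz") # placeholder filename
  print("OK" if actual == expected else "MISMATCH -- do not build from this file")

Of course this only pushes the trust question back one level: the checksum list itself has to come from a source you believe was not compromised.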
A computer glitch at Nasdaq evidently caused the network to report a false and exceedingly low trade price for Rentrak Inc. at the end of trading on 13 Aug 2003. For a short time it was reported that common shares in Rentrak had closed at 15 cents, down nearly 98 percent from the previous close of $6.65. The false price of 15 cents was in the data continuously supplied by Nasdaq to communication channels, such as wire services and web portals. A short time later the closing price was changed to $6.28, down 5.6 percent from the previous day. [Source: Computer error sends Rentrak's reported stock price on roller coaster ride, Robert Goldfield, 13 Aug 2003, American City Business Journals Inc.; PGN-ed] http://portland.bizjournals.com/portland/stories/2003/08/11/daily33.html
I received an e-mail asking me to join something called the American Consumer Panel (http://www.americanconsumerpanel.com); as a "perk" for joining, I would be sent an Amazon gift certificate. On the website, they claim to be a service of Forester Research (the site even links to Forester's site and shows a copyright, for whatever that is worth), yet a search on Forester finds no mention of them. Something besides all of that made me suspicious (maybe how the URL got redirected to https://netpanel.gmi-mr.com/portals/gpms_cp/5000585/), so I checked out the terms and conditions of membership, and buried in the middle of the terms was this gem:

"5. Third-Party Accounts
By participating in the Service, you authorize ACP to access your spending and savings in your personal accounts, including but not limited to your credit card and bank accounts, using ACP's secure, computerized system, [and authorize your third-party account providers to provide us with such information.] Where applicable, you also authorize ACP to record your Web-surfing behavior. You agree that ACP assumes no responsibility and shall incur no liability with respect to the acts, omissions, or determinations of any such third-party account providers."

Maybe I'm over-reacting, but even ignoring the Web-surfing monitoring, it seems a stretch for a research company to need access to my personal credit cards and bank accounts. Even if this is legitimate (I sent an e-mail to Forester and have not received a response), "access" is a very vague term. If I have access to something, what kind of permissions do I have? Can I remove money or transfer it to another account? Additionally, some banks charge for third-party access, so you could get whacked with all kinds of bank fees. Regardless, burying this so deep in the terms and conditions makes the whole site very suspicious. The risks seem obvious enough...

M@ Anderson, Sr. Enterprise Architect, email@example.com
easynet.nl runs a SPAM blacklist based solely on source IP address and, as far as I can tell, uses a highly indiscriminate process for adding addresses that can be summarized as "One accusation and you're convicted" combined with "Guilty until proven innocent". Unfortunately, they are also one of the most widely used blacklists, and their popularity is threatening to seriously affect the ability to communicate by e-mail. My hosting provider recently had to change its upstream provider and get new IP addresses because easynet had its entire class B netblock on the list to "punish" the owner of that netblock for perceived unwillingness or inability to police SPAM. The new addresses come from class A block 69/8, which until fairly recently was unallocated. Somehow, the NEW address for my provider's SMTP server is also on easynet's list, so we're back where we started. Easynet won't communicate with anyone about their decisions, and getting removed is nearly impossible. How long will it take before ISPs using easynet realize they're hurting their own subscribers as much as the spammers? This threatens to fragment the Internet into isolated islands where large groups of users are unable to communicate with each other.
> I can't be the first to point this out, but: having a character that is
> visually indistinguishable from the absence of a character is in itself a
> risk. Perhaps it would be useful for URL-display and similar outputs to use
> a visible character to indicate spaces, as can be done with word-processors?

The visible character might be misunderstood as representing itself. Better to use shading, or colour, to indicate non-character regions, IMHO. Typically, text is black on white; for this, use whatever 20% black + 80% white works out as, or something similar. My site's URL would then display as http://www.merlyn.demon.co.uk/######... where ######... represents light but unmistakable shading.

John Stockton, Surrey, UK. <URL:http://www.merlyn.demon.co.uk/>
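As a rough sketch of the shading idea (hypothetical code, not a feature of any existing browser or mail client): a display routine could render each space in a URL in reverse video so it cannot be mistaken for an ordinary gap, e.g. with ANSI escape codes on a terminal; a GUI would use a grey background instead.

  # Hypothetical sketch: render spaces in a displayed URL as shaded blocks
  def highlight_spaces(url):
      REVERSE, RESET = "\x1b[7m", "\x1b[0m"   # ANSI reverse video on/off
      return url.replace(" ", REVERSE + " " + RESET)

  # hypothetical URL containing a deceptive embedded space
  print(highlight_spaces("http://www.example.com/readme txt"))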
> [I somewhat reluctantly fixed a typo above: "bardcode" sounded > appropriately Shakespearean for a library system. PGN] Actually, Bardcode is a very cute Web site that presents the entire works of Shakespeare in barcode form. http://artcontext.net/bardcode/ It seems to be down at the moment, but the Wayback Machine has it: http://web.archive.org/web/20020211011705/http://artcontext.net/bardcode/