The long-awaited report on the deadly 23 Jul 2011 high-speed train crash in Wenzhou, China, attributes it to a string of blunders, including serious design flaws in crucial equipment used to signal and control the trains that was purchased, evaluated, and used improperly. Two top former officials of the Railway Ministry were singled out for blame. Public outrage died down only after government authorities muzzled the domestic media. The intense public reaction to the accident and the bungled rescue effort that followed are considered major reasons why the Chinese government is now instituting tighter controls of Internet message boards known as microblogs [and presumably the censorship of this issue of RISKS?]. However, the report is lacking in details on what actually went wrong technically, although it mentions a failure to notice that lightning strikes had affected the equipment. The *NYT* article is well worth reading in full. [Source: Sharon Lafraniere, 28 Dec 2011, *The New York Times*; PGN-ed] http://www.nytimes.com/2011/12/29/world/asia/design-flaws-cited-in-china-train-crash.html?hp
[This is reproduced with permission from a list devoted to election integrity. PGN]

I recently ran across Richard Feynman's appendix to the Rogers Commission Report on the Space Shuttle Challenger Accident (published June 6, 1986), and one passage (quoted below) about the software in the space shuttle struck me. He describes what it takes to check and test for the correctness and reliability of the software. (NASA does not even attempt to deal with the software's security against attackers, presumably because it was judged that software in a closed system like the shuttle is not very vulnerable.) I suggest reading this with voting system software in mind. Notice in the 3rd paragraph his point about management's temptation to curtail the amount of checking and testing even in the face of "perpetual" requests for software changes, and the need to resist that temptation. The shuttle's software was at that time about 250,000 lines of code, on the same order as that in a voting system (e.g. a DRE).

Quoted from http://history.nasa.gov/rogersrep/v2appf.htm

Because of the enormous effort required to replace the software for such an elaborate system, and for checking a new system out, no change has been made to the hardware since the system began about fifteen years ago. The actual hardware is obsolete; for example, the memories are of the old ferrite core type. It is becoming more difficult to find manufacturers to supply such old-fashioned computers reliably and of high quality. Modern computers are very much more reliable, can run much faster, simplifying circuits, and allowing more to be done, and would not require so much loading of memory, for the memories are much larger. The software is checked very carefully in a bottom-up fashion. First, each new line of code is checked, then sections of code or modules with special functions are verified. The scope is increased step by step until the new changes are incorporated into a complete system and checked.
This complete output is considered the final product, newly released. But completely independently there is an independent verification group, that takes an adversary attitude to the software development group, and tests and verifies the software as if it were a customer of the delivered product. There is additional verification in using the new programs in simulators, etc. A discovery of an error during verification testing is considered very serious, and its origin studied very carefully to avoid such mistakes in the future. Such unexpected errors have been found only about six times in all the programming and program changing (for new or altered payloads) that has been done. The principle that is followed is that all the verification is not an aspect of program safety, it is merely a test of that safety, in a non-catastrophic verification. Flight safety is to be judged solely on how well the programs do in the verification tests. A failure here generates considerable concern. To summarize then, the computer software checking system and attitude is of the highest quality. There appears to be no process of gradually fooling oneself while degrading standards so characteristic of the Solid Rocket Booster or Space Shuttle Main Engine safety systems. To be sure, there have been recent suggestions by management to curtail such elaborate and expensive tests as being unnecessary at this late date in Shuttle history. This must be resisted for it does not appreciate the mutual subtle influences, and sources of error generated by even small changes of one part of a program on another. There are perpetual requests for changes as new payloads and new demands and modifications are suggested by the users. Changes are expensive because they require extensive testing. The proper way to save money is to curtail the number of requested changes, not the quality of testing for each.
[This is reproduced with permission from a list devoted to election integrity. PGN] I just listened to a very interesting 15-minute podcast discussion of risk in aviation control systems. The bottom line is that in some cases, the control systems make mistakes and people (pilots) correct for them, but it's actually more frequent for people to make mistakes because they don't understand what's going on. The interviewee argues that perhaps we should trust software and recognize that it *will* make mistakes that will kill some people, but fewer than would die without the software. The podcast concludes with an explanation that 100 years ago, one of the railroads advertised that due to technological advancements only one person was being killed each day in train accidents, rather than 10 per day as had been the case! Podcast is at http://spectrum.ieee.org/podcast/aerospace/aviation/the-benefits-of-risk/ I am NOT arguing that voting is the same, and it's important to recognize that they're talking about reliability (not security) - the key difference being that in reliability you're concerned about ACCIDENTAL errors causing failures, while in security you're concerned with INTENTIONAL errors causing failures. Also, the failure calculations assume a static environment, but with constant software changes and constant changes to the systems that the software is part of, it's anything but a static environment. But thinking of a voting system as a complete system - including the people, equipment, processes, etc. - it's interesting to consider how the accidental failure rate compares for an electronic system to a traditional system. Said another way, consider three cases: (1) The current environment, comparing an optical scan system to a DRE-based system, recognizing the risks of accidental bugs in the DRE software vs. accidental loss of optical scan ballots, accidental misprogramming of both, accidental loss or erasure of memory cards, etc.
(2) Comparing the current environment (with either optical scan or DRE) to an Internet voting environment, IGNORING all security concerns for the Internet environment - potentially reducing the risks of accidental errors by pollworkers or election officials (but ignoring intentional insider attacks by either pollworkers or election officials). (3) Comparing the current environment to an Internet voting environment, again ignoring security concerns for the Internet environment, but this time including intentional insider attacks by pollworkers and election officials. Of course quantifying any of these is very hard, but we know the risk is non-zero for all of the failure cases. I don't have any answers, but wonder if eliciting the questions might help the public (and policymakers) understand the tradeoffs somewhat better, and help answer the question "if I can bank online and shop online, why can't I vote online", but also "if we can rely on software to fly our planes, why can't we rely on software to run our elections".
Paul Marks, Dot-dash-diss: The gentleman hacker's 1903 lulz, New Scientist, 27 Dec 2011 http://j.mp/upslUK [via NNSquad] "A century ago, one of the world's first hackers used Morse code insults to disrupt a public demo of Marconi's wireless telegraph."
The New York Times said it accidentally sent e-mails on Wednesday to more than eight million people who had shared their information with the company, erroneously informing them they had canceled home delivery of the newspaper. The Times Company, which initially mischaracterized the mishap as spam, apologized for sending the e-mails. The 8.6 million readers who received the e-mails represent a wide cross-section of readers who had given their e-mails to the newspaper in the past, said a Times Company spokeswoman, Eileen Murphy. ... [Source: Amy Chozick, *The New York Times*, Media Decoder blogs, 28 Dec 2011] http://mediadecoder.blogs.nytimes.com/2011/12/28/times-readers-inundated-by-false-e-mail-on-subscriptions/
Newark, NJ - Not quite the "War Of The Worlds" broadcast of a Martian invasion in New Jersey, a Verizon "emergency" alert Monday that the company texted to its wireless customers still jangled some nerves and triggered hundreds of calls from concerned residents to local and state offices. The company sent the alert to customers in Middlesex, Monmouth and Ocean counties, warning of a "civil emergency" and telling people to "take shelter now." Trouble was, the message was meant to be a test but it wasn't labeled as such, Verizon later admitted. [AP item] rest: http://www.cbsnews.com/8301-201_162-57341882/mistaken-verizon-emergency-alert-scares-n.j/ or: http://goo.gl/ihsy5
Francis Moran, Giving a fair shake to the eyes in the sky http://blogs.itbusiness.ca/2011/12/giving-a-fair-shake-to-the-eyes-in-the-sky/ This article discusses testing for colour-blindness, but the first paragraph deals with a risk sneaking through the cracks: In July 2002, a FedEx Boeing 727 carrying cargo crashed on its approach for a night-time landing in Tallahassee, Fl. A U.S. National Transportation Safety Board investigation identified the first officer's colour vision deficiency as a factor in the crash and recommended that all existing colour vision testing protocols employed by the U.S. Federal Aviation Administration (FAA) be reviewed. Four years later, this case, and the issues which it raised about colour blindness testing in the commercial aviation industry, was the subject of a panel at an international workshop hosted by Saudi Arabian Airlines.
John Biggs, *The New York Times*, 25 Dec 2011 How much is a tweet worth? And how much does a Twitter follower cost? In base economic terms, the value of individual Twitter updates seems to be negligible; after all, what is a Twitter post but a few bits of data sent caroming through the Internet? But in a world where social media's influence can mean the difference between a lucrative sale and another fruitless cold call, social media accounts at companies have taken on added significance. The question is: Can a company cash in on, and claim ownership of, an employee's social media account, and if so, what does that mean for workers who are increasingly posting to Twitter, Facebook and Google Plus during work hours? A lawsuit filed in July could provide some answers. ... http://www.nytimes.com/2011/12/26/technology/lawsuit-may-determine-who-owns-a-twitter-account.html
I'm really surprised that this conclusion of test failure has not been vocally challenged here. If I do a penetration test on an untested network and am able to widely penetrate the network do you all declare my penetration test to be a failure? This failure conclusion mistakes failure of the Emergency Alert System local systems with failure of the test. In the Emergency Response community, just like in the network security community, a test which exposes numerous system failures is considered a success because it identifies problems which need to be fixed. A test of a nation-wide system which has never had end-to-end testing is not a failure when it finds problems, it is a BIG success. The systems failed; the test succeeded. Hopefully we will see even more robust end-to-end tests of the Emergency Alert System in the future, and hopefully they will also be a success by finding problems so they can be fixed until the whole system works as planned. There was a failure which was pointed out, but the wrong failure was highlighted. The FEMA website had a notice for at least two weeks prior to this test that many cable system customers would not see the alert banners they were used to seeing during local broadcast system tests because the method used for the nationwide test would not trigger those banners. The failure was that the method FEMA used to communicate this expectation did not effectively disseminate the information to test observers. I would have never predicted the RISK that the experts here would fail to challenge confusion of system failure with test failure. David E. Price SRO, CHMM, Senior Consequence Analyst for Special Projects, CBRNE (Chem, Bio, Rad, Nuc, and Explosives Accident/Safety Analyses)
[From Dave Farber's IP distribution. PGN]

> Why did they not encrypt their credit card info? Djf

It may be far more than just a blunder. News reports indicate that card numbers were obtained, which is precisely what PCI-DSS 2.0 was supposed to prevent. From https://www.pcisecuritystandards.org/documents/pci_dss_v2.pdf

3.4 Render PAN unreadable anywhere it is stored (including on portable digital media, backup media, and in logs) by using any of the following approaches:
- One-way hashes based on strong cryptography (hash must be of the entire PAN)
- Truncation (hashing cannot be used to replace the truncated segment of PAN)
- Index tokens and pads (pads must be securely stored)
- Strong cryptography with associated key-management processes and procedures

Note: It is a relatively trivial effort for a malicious individual to reconstruct original PAN data if they have access to both the truncated and hashed version of a PAN. Where hashed and truncated versions of the same PAN are present in an entity's environment, additional controls should be in place to ensure that the hashed and truncated versions cannot be correlated to reconstruct the original PAN.

PA-DSS covers application security and may also be relevant: https://www.pcisecuritystandards.org/documents/pa-dss_v2.pdf

As a side note, PA-DSS 2.0 has made it pretty much impossible to create and certify open source card processing software.
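
The PCI DSS note quoted above, on correlating truncated and hashed versions of the same PAN, can be illustrated with a short sketch. It rests on assumptions that are not in the standard's text: a 16-digit PAN, truncation that keeps the first six and last four digits, and an unsalted SHA-256 hash of the full PAN. The function names are hypothetical, and the number used in the usage note is a commonly used test card number, not a real account. Under those assumptions only six digits are unknown, so an attacker has at most one million candidates to hash, and the Luhn check digit rules out nine in ten of them before any hashing.

```python
import hashlib
from typing import Optional


def luhn_valid(pan: str) -> bool:
    """Return True if the digit string passes the Luhn check."""
    total = 0
    for i, ch in enumerate(reversed(pan)):
        d = int(ch)
        if i % 2 == 1:          # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def unsalted_hash(pan: str) -> str:
    """Assumed storage scheme: plain SHA-256 of the full PAN, no salt or key."""
    return hashlib.sha256(pan.encode("ascii")).hexdigest()


def recover_pan(first6: str, last4: str, target_hash: str,
                length: int = 16) -> Optional[str]:
    """Brute-force the digits hidden by truncation until the hash matches."""
    missing = length - len(first6) - len(last4)
    for n in range(10 ** missing):
        candidate = f"{first6}{n:0{missing}d}{last4}"
        # Cheap Luhn filter first, then the hash comparison.
        if luhn_valid(candidate) and unsalted_hash(candidate) == target_hash:
            return candidate
    return None
```

For example, `recover_pan("411111", "1111", unsalted_hash("4111111111111111"))` searches the six missing digits and returns the full test number. A per-record secret salt or a keyed hash would defeat this particular sketch, which is one form the "additional controls" in the note could take.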
The Menlo Report is an effort from DHS S&T to establish guidelines for ethical network security research involving human subjects, much as the Belmont Report in the 1970s established guidelines for medical research. (See http://en.wikipedia.org/wiki/Belmont_report) The Menlo Report is now out on the Federal Register for comments. Details on how to download the report and submit comments are at http://www.federalregister.gov/articles/2011/12/28/2011-33231/submission-for-review-and-comment-the-menlo-
The meeting "Decoupling Civil Timekeeping from Earth Rotation" was held in Exton, Pennsylvania on October 5-6, 2011. The meeting was announced on the Risks Digest: http://catless.ncl.ac.uk/Risks/26.50.html#subj12 And preprints of the proceedings are now available from: http://futureofutc.org/preprints The slides presented and the resulting group discussions are also available. This was an excellent meeting that has produced insightful papers and intriguing discussions on an obscure topic. If the International Telecommunication Union votes to redefine UTC in January, the topic (and the related risks) won't remain obscure. Rob Seaman, National Optical Astronomy Observatory
First STAMP/STPA Workshop
MIT
April 17-19, 2012

STAMP/STPA is a new systems thinking approach to engineering safer systems described in Nancy Leveson's new book "Engineering a Safer World" (MIT Press, January 2012). While relatively new, it is already being used in space, aviation, medical, defense, nuclear, automotive, food, and other applications. This informal workshop will bring together those interested in improving their approaches to safety engineering and those who are already trying this new approach in order to share their experiences. The first day will be a tutorial on STPA, the new hazard analysis technique built on the STAMP accident causality model. The tutorial will be taught by Prof. Leveson and her graduate students, who have been using STPA on many different types of projects. The next two days will involve informal presentations by attendees and small group meetings for specific industries and applications. The workshop and tutorial will be free. If you are interested in attending, please send an e-mail (for planning purposes) to email@example.com with the following information:

Name:
E-mail address or contact information:
Organization/job title:
Industry:
Interested in presenting? If so, what would you like to present?:

Further information will be provided in January to those who respond to this preliminary announcement. The workshop is sponsored by the MIT Engineering Systems Division, the Aeronautics and Astronautics Dept., and the MIT Industrial Liaison Program.

Dr. Nancy G. Leveson, Professor of Aeronautics and Astronautics and Professor of Engineering Systems, Director, Complex Systems Research Lab (CSRL), MIT Room 33-334, 77 Massachusetts Ave., Cambridge, MA 02139-4307 Tel: 617-258-0505 firstname.lastname@example.org URL: http://sunnyday.mit.edu