Therac-25 Mistreatment


Therac-25 was a computerised medical radiation therapy machine produced by Atomic Energy of Canada Limited (AECL) and CGR. It treated patients with two types of radiation (electrons for shallow tissue, and X-ray photons for deep tissue) at various energy levels up to 25 MeV.

The first commercial version of Therac-25 was released in late 1982, and 11 machines were installed in total, 5 in the USA and 6 in Canada. Six known accidents involving radiation overdoses occurred from June 1985 through January 1987.

Therac-25 was preceded by two other machines in the same series: Therac-6 (a 6 MeV machine producing X-rays only), and Therac-20 (a 20 MeV machine that could produce electrons and X-rays). Both of the earlier machines could be run without the software, and they used standard safety features and interlocks. Therac-25 was not designed to be run without its software, and the protective circuits used to monitor the electron beam scanning, along with the mechanical interlocks, were not carried over from Therac-20 to Therac-25. However, some of the software in the three models was related or shared.

The safety analysis performed on Therac-25 had several failings. A Fault Tree Analysis (FTA) was carried out which ignored software. The event "Computer selects wrong energy" was assigned a probability of 10^-11, and the event "Computer selects wrong mode" was assigned a probability of 4 x 10^-9, both without any justification.
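To see why such small unjustified figures matter, the sketch below combines the two software event probabilities (taken as 10^-11 and 4 x 10^-9) with a single hardware fault through an OR gate. The hardware probability, the tree structure and the function name are assumptions made purely for illustration; they are not taken from AECL's analysis.

```python
from math import prod

def or_gate(probabilities):
    """Probability that at least one of several independent events occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)

# Software event probabilities assigned in the Therac-25 fault tree.
P_WRONG_ENERGY = 1e-11
P_WRONG_MODE = 4e-9

# HYPOTHETICAL hardware fault probability, chosen only to show the effect.
P_HARDWARE_FAULT = 1e-4

p_software = or_gate([P_WRONG_ENERGY, P_WRONG_MODE])
p_top = or_gate([P_WRONG_ENERGY, P_WRONG_MODE, P_HARDWARE_FAULT])

# Fraction of the top-event probability attributable to software.
software_share = p_software / p_top
print(f"software share of top event: {software_share:.1e}")
```

With figures of this size the software contributes almost nothing to the computed top-event probability, so the analysis treats the computer as effectively infallible whatever the real failure rate is.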

Feedback from the accidents included patients reporting burning sensations. In one case a patient moved from the table after feeling a "hot coffee" sensation, and then received a second dose to the wrong body part as a result of having moved.

Physicists reported symptoms of radiation burns (including striped skin due to the nature of the grating through which the radiation was emitted).

The estimated levels of exposure were 15,000 to 20,000, 13,000 to 17,000 and 16,500 to 25,000 rads. The vulnerability of the human body to radiation can be gauged from the fact that a full-body exposure of approximately 500 rads is expected to be fatal in 50% of cases.
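The scale of these figures is easier to see as multiples of the 500 rad reference level. The short calculation below is only a rough gauge, since the reference figure is for full-body exposure; the function and variable names are illustrative.

```python
# Reference level quoted above: roughly 50% fatality for full-body exposure.
REFERENCE_RADS = 500

# Estimated exposure ranges quoted above, in rads.
estimated_doses_rads = [(15000, 20000), (13000, 17000), (16500, 25000)]

def multiples_of_reference(dose_range, reference=REFERENCE_RADS):
    """Express a (low, high) dose range as multiples of the reference level."""
    low, high = dose_range
    return low / reference, high / reference

for dose_range in estimated_doses_rads:
    low, high = multiples_of_reference(dose_range)
    print(f"{dose_range[0]} to {dose_range[1]} rads: "
          f"{low:.0f}x to {high:.0f}x the reference level")
```

Each estimate works out to between 26 and 50 times the reference level.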

Operator feedback included anecdotal evidence that the operators became used to frequent malfunctions, and that the repeated "no dose" messages had led to operators automatically proceeding with another attempt at the same dose. Problems with the operator interface included data entry that allowed existing data to be re-entered, and cryptic error messages such as "Malfunction 54", which meant that either too high or too low a dose had been given.

Although the series of incidents became known to AECL and the regulators, no enquiry occurred. The physicists were told that it was impossible for the machine to operate in electron mode without scanning to spread the beam, and reportedly also told that no other incidents had occurred.

AECL also failed to investigate immediately despite lawsuits being brought against them. When they did investigate, they looked at the failure of a microswitch that determined the position of the turntable, and ended up recommending a procedural mitigation to the users.

A recommendation to remove the option for the operator to proceed was ignored, but the number of retries allowed was reduced from five to three. The request for independent interlocks on the turntable position was also ignored.
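The logic of an independent turntable interlock can be sketched as follows. This is a minimal illustration, not AECL's design: the position names, the mapping and the function are all hypothetical, and on a real machine such a check would be implemented in hardware that does not rely on the control software.

```python
from enum import Enum

class Mode(Enum):
    ELECTRON = "electron"
    XRAY = "xray"

class TurntablePosition(Enum):
    # HYPOTHETICAL position names, for illustration only.
    ELECTRON_SCAN = "electron_scan"
    XRAY_TARGET = "xray_target"
    UNKNOWN = "unknown"

# Turntable position required for each treatment mode (assumed mapping).
REQUIRED_POSITION = {
    Mode.ELECTRON: TurntablePosition.ELECTRON_SCAN,
    Mode.XRAY: TurntablePosition.XRAY_TARGET,
}

def beam_permitted(requested_mode, sensed_position):
    """Allow the beam only if the independently sensed turntable position
    matches the one required by the requested mode."""
    return REQUIRED_POSITION.get(requested_mode) is sensed_position

print(beam_permitted(Mode.XRAY, TurntablePosition.XRAY_TARGET))    # True
print(beam_permitted(Mode.XRAY, TurntablePosition.ELECTRON_SCAN))  # False
```

The point of the design is that the check uses a position reading obtained independently of the software's own record of where the turntable is, so a software error cannot make both agree.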

As a result of the mitigations put in place, AECL claimed an improvement of "at least five orders of magnitude" despite admitting that the cause of the accident was still unknown.

The mechanism by which the accidents occurred was worked out by users of the machines. A physicist and an operator found that one factor appeared to be the speed with which the operator entered the information. They reached the stage at which they were able to produce the message "Malfunction 54" at will. The dose given in these trials was determined to be over 4000 rads.
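How entry speed can matter is illustrated by the toy simulation below. It assumes a setup phase that reads the prescription once at the start and takes a fixed number of ticks, so that an edit made before setup finishes is shown on screen but never reaches the hardware. The timing, the names and the behaviour are assumptions for illustration, not a reconstruction of the Therac-25 software.

```python
def simulate_edit(initial_mode, edited_mode, edit_tick, setup_ticks=8):
    """Return (mode shown on screen, mode the hardware was set up for).

    HYPOTHETICAL model: setup snapshots the mode at tick 0 and completes
    at tick `setup_ticks`; only edits made after completion are noticed.
    """
    snapshot = initial_mode   # setup reads the prescription once, at tick 0
    screen = initial_mode
    hardware = None           # not yet configured
    for tick in range(max(edit_tick, setup_ticks) + 1):
        if tick == edit_tick:
            screen = edited_mode          # operator corrects the entry
            if hardware is not None:
                hardware = edited_mode    # setup had finished: edit is noticed
        if tick == setup_ticks and hardware is None:
            hardware = snapshot           # setup completes from the stale snapshot
    return screen, hardware

fast = simulate_edit("xray", "electron", edit_tick=3)
slow = simulate_edit("xray", "electron", edit_tick=10)
print("fast edit:", fast)   # screen and hardware disagree
print("slow edit:", slow)   # screen and hardware agree
```

In this model a slow operator never sees the fault, which is consistent with a problem that only appears once operators become quick at data entry.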

The link to the Therac-20 machines was also discovered by users, although Therac-20 users had been informally notified of the Therac-25 incidents. Students using a machine at a university had repeatedly caused blown fuses. These were attributed to the creative editing used by new users, which triggered a software problem common to both machines. On the Therac-20, however, a hardware interlock prevented the beam from being turned on.

The machines were eventually recalled in 1987.

Fix?

The monitoring of a safety-critical system should not be limited to those involved in its operation.

Fix?

The harm caused by Therac-25 might have been lessened by an incident reporting mechanism that forced action to be taken before additional incidents occurred.

Similar?

The Theratron mistreatment.

Resource:

"An Investigation of the Therac-25 Accidents" by Nancy Leveson and Clark S. Turner, IEEE Computer, July 1993.