Can we teach a computer to predict when we'll die? - by Siddhartha Mukherjee

Avati and his team identified about 200,000 patients who could be studied. The patients had all sorts of illnesses – cancer, neurological diseases, heart and kidney failure. The team’s key insight was to use the hospital’s medical records as a proxy time machine. Say a man died in January 2017. What if you scrolled time back to the “sweet spot” of palliative care – the window between January and October 2016, when care would have been most effective? But to find that spot for a given patient, Avati knew, you’d presumably need to collect and analyze medical information before that window. Could you gather information about this man during this prewindow period that would enable a doctor to predict a demise in that three-to-12-month window? And what kinds of inputs might teach such an algorithm to make such a prediction?
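The “proxy time machine” amounts to a labeling rule: pick a date in a patient’s record and ask whether death followed within three to 12 months. A minimal sketch of that rule in Python, using hypothetical field names and ignoring the real-world complications of incomplete records:

```python
from datetime import date
from typing import Optional

def label_patient(prediction_date: date, date_of_death: Optional[date]) -> int:
    """Label 1 if death occurred 3 to 12 months after the prediction date, else 0.

    Illustrative labeling rule for the 'proxy time machine' idea; the study's
    actual criteria are not spelled out in the article.
    """
    if date_of_death is None:
        return 0  # patient still alive (or lost to follow-up): negative example
    days_until_death = (date_of_death - prediction_date).days
    return int(90 <= days_until_death <= 365)

# Example: a man who died in January 2017, "scrolled back" to early 2016.
print(label_patient(date(2016, 3, 1), date(2017, 1, 15)))  # 1 -> inside the window
```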
Avati drew on medical information that had already been coded by doctors in the hospital: a patient’s diagnosis, the number of scans ordered, the number of days spent in the hospital, the kinds of procedures done, the medical prescriptions written. The information was admittedly limited – no questionnaires, no conversations, no sniffing of chemicals – but it was objective, and standardized across patients.
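Coded inputs like these lend themselves to a simple fixed-length feature vector per patient – roughly, counts of what the record contains. A sketch, with made-up field names standing in for whatever encoding the Stanford team actually used:

```python
# Hypothetical encoding of a coded hospital record into numeric features.
# The field names and diagnosis vocabulary are illustrative, not the study's.
DIAGNOSIS_VOCAB = ["bladder_cancer", "prostate_cancer", "heart_failure", "kidney_failure"]

def encode_record(record: dict) -> list:
    features = [
        float(record.get("num_scans", 0)),          # number of scans ordered
        float(record.get("days_in_hospital", 0)),   # days spent in the hospital
        float(record.get("num_procedures", 0)),     # procedures done
        float(record.get("num_prescriptions", 0)),  # prescriptions written
    ]
    # One-hot flags for each diagnosis in the (hypothetical) vocabulary.
    diagnoses = set(record.get("diagnoses", []))
    features += [1.0 if d in diagnoses else 0.0 for d in DIAGNOSIS_VOCAB]
    return features

example = {"num_scans": 21, "days_in_hospital": 60, "num_procedures": 5,
           "num_prescriptions": 12, "diagnoses": ["bladder_cancer", "prostate_cancer"]}
print(encode_record(example))
```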
These inputs were fed into a so-called deep neural network – a kind of software architecture thus named because it’s thought to loosely mimic the way the brain’s neurons are organized. The task of the algorithm was to adjust the weights and strengths of each piece of information in order to generate a probability score that a given patient would die within three to 12 months.
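The article doesn’t describe the network itself, but the general shape – feed the encoded record through a few layers whose weights are tuned until the output is a probability of death within the window – can be sketched with a small feed-forward model. The layer sizes and the use of PyTorch here are assumptions, not the team’s actual setup:

```python
import torch
import torch.nn as nn

# A small feed-forward network: an illustrative stand-in, not the study's model.
class DyingRiskNet(nn.Module):
    def __init__(self, num_features: int):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(num_features, 64), nn.ReLU(),
            nn.Linear(64, 32), nn.ReLU(),
            nn.Linear(32, 1),  # single logit
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Sigmoid squashes the logit into a 0-1 probability of death in 3-12 months.
        return torch.sigmoid(self.layers(x))

model = DyingRiskNet(num_features=8)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCELoss()

# One hypothetical training step: nudge the weights toward the observed labels.
x = torch.randn(256, 8)                    # a batch of encoded records
y = torch.randint(0, 2, (256, 1)).float()  # 1 = died within the 3-12 month window
loss = loss_fn(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```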
The “dying algorithm,” as we might call it, digested and absorbed information from nearly 160,000 patients to train itself. Once it had ingested all the data, Avati’s team tested it on the remaining 40,000 patients. The algorithm performed surprisingly well. The false-alarm rate was low: Nine out of 10 patients predicted to die within three to 12 months did die within that window. And 95 percent of patients assigned low probabilities by the program survived longer than 12 months. (The data used by this algorithm can be vastly refined in the future. Lab values, scan results, a doctor’s note or a patient’s own assessment can be added to the mix, enhancing the predictive power.)
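The reported numbers correspond to two simple checks on the held-out 40,000 patients: among those flagged as likely to die, how many actually did (about nine in ten), and among those given low scores, how many survived past a year (about 95 percent). A sketch of that evaluation, with probability thresholds chosen here purely for illustration:

```python
import numpy as np

def evaluate(probs: np.ndarray, died_in_window: np.ndarray,
             high: float = 0.9, low: float = 0.1):
    """Return (precision among high-risk patients, survival rate among low-risk).

    `probs` are model scores on held-out patients; `died_in_window` is 1 if the
    patient died within 3-12 months. Thresholds here are illustrative guesses.
    """
    flagged = probs >= high
    precision = died_in_window[flagged].mean() if flagged.any() else float("nan")
    low_risk = probs <= low
    survival = (1 - died_in_window[low_risk]).mean() if low_risk.any() else float("nan")
    return float(precision), float(survival)
```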
So what, exactly, did the algorithm “learn” about the process of dying? And what, in turn, can it teach oncologists? Here is the strange rub of such a deep learning system: It learns, but it cannot tell us why it has learned; it assigns probabilities, but it cannot easily express the reasoning behind the assignment. Like a child who learns to ride a bicycle by trial and error and, asked to articulate the rules that enable bicycle riding, simply shrugs her shoulders and sails away, the algorithm looks vacantly at us when we ask, “Why?” It is, like death, another black box.
Still, when you pry the box open to look at individual cases, you see expected and unexpected patterns. One man assigned a score of 0.946 died within a few months, as predicted. He had had bladder and prostate cancer, had undergone 21 scans, had been hospitalized for 60 days – all of which had been picked up by the algorithm as signs of impending death. But a surprising amount of weight was seemingly put on the fact that scans were made of his spine and that a catheter had been used in his spinal cord – features that I and my colleagues might not have recognized as predictors of dying (an M.R.I. of the spinal cord, I later realized, was most likely signaling cancer in the nervous system – a deadly site for metastasis).
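One crude way to pry the box open for a single patient is an ablation check: zero out one input at a time and watch how much the predicted probability drops. The article doesn’t say how the team inspected their scores, so this is only an illustration of how a spine scan or a spinal catheter could surface as a heavyweight feature:

```python
import torch

def feature_contributions(model, x: torch.Tensor) -> list:
    """Ablation scores for one patient: the drop in predicted risk when each
    feature is zeroed out. A rough, illustrative probe, not the study's method."""
    model.eval()
    with torch.no_grad():
        baseline = model(x.unsqueeze(0)).item()
        drops = []
        for i in range(x.numel()):
            ablated = x.clone()
            ablated[i] = 0.0
            drops.append(baseline - model(ablated.unsqueeze(0)).item())
    return drops
```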
It’s hard for me to read about the “dying algorithm” without thinking about my patient S. If a more sophisticated version of such an algorithm had been available, would I have used it in his case? Absolutely. Might that have enabled the end-of-life conversation S. never had with his family? Yes. But I cannot shake some inherent discomfort with the thought that an algorithm might understand patterns of mortality better than most humans. And why, I kept asking myself, would such a program seem so much more acceptable if it had come wrapped in a black-and-white fur box that, rather than emitting probabilistic outputs, curled up next to us with retracted claws?