It also obscures the origin of certain data sets, which can mean that researchers miss important features that skew the training of their models. Many unwittingly used a data set containing chest scans of children who did not have covid as their examples of non-covid cases. As a result, the AIs learned to identify children, not covid.
Driggs’ group trained its own model using a data set that contained a mix of scans taken while patients were lying down and while they were standing up. Because patients scanned lying down were more likely to be seriously ill, the AI wrongly learned to predict serious covid risk from a person’s position.
In other cases, some AIs were found to be picking up on the text font that certain hospitals used to label their scans. As a result, fonts from hospitals with more serious caseloads became predictors of covid risk.
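One rough way to spot shortcuts like these before training is to ask how well a piece of scan metadata, such as patient posture or the originating hospital, predicts the label on its own. The sketch below is illustrative only: the function names, the field names and the toy records are all invented for this example and do not come from any of the studies described here.

```python
from collections import Counter, defaultdict

def majority_baseline(records, label="label"):
    """Accuracy of always guessing the most common label."""
    counts = Counter(r[label] for r in records)
    return counts.most_common(1)[0][1] / len(records)

def shortcut_accuracy(records, attribute, label="label"):
    """Accuracy of guessing the label from one metadata attribute alone,
    using the most common label seen for each value of that attribute."""
    counts = defaultdict(Counter)
    for r in records:
        counts[r[attribute]][r[label]] += 1
    correct = sum(c.most_common(1)[0][1] for c in counts.values())
    return correct / len(records)

# Invented toy records (assumption): most lying-down patients are severe cases.
records = (
    [{"position": "lying", "label": "severe"}] * 8
    + [{"position": "lying", "label": "mild"}] * 2
    + [{"position": "standing", "label": "severe"}] * 1
    + [{"position": "standing", "label": "mild"}] * 9
)

print(majority_baseline(records))              # 0.55
print(shortcut_accuracy(records, "position"))  # 0.85
```

If a single metadata field beats the majority baseline by a wide margin, as it does in this toy data, a model trained on the scans may be able to reach a similar score without learning anything about the disease itself.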
In hindsight, mistakes like these seem obvious. They can also be fixed by adjusting the models, if researchers are aware of them. It is possible to acknowledge the shortcomings and release a less accurate but less misleading model. However, many tools were developed either by AI researchers who lacked the medical expertise to spot flaws in the data, or by medical researchers who lacked the mathematical skills to compensate for those flaws.
A more subtle issue highlighted by Driggs is incorporation bias, or bias introduced at the point a data set is labeled. For example, many medical scans were labeled according to whether the radiologists who created them said they showed covid. But this embeds, or incorporates, any biases of that particular doctor into the ground truth of the data set. Driggs said it would be much better to label medical scans with the results of PCR tests rather than the opinion of a single doctor. But in a busy hospital, there is not always time for statistical niceties.
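Where both kinds of label exist for the same scans, a simple tally shows how often a radiologist’s reading departs from the test result. This is a minimal sketch under stated assumptions: the `compare_labels` function, the field names and the six toy scans are hypothetical, and the PCR result is treated as the reference purely for illustration.

```python
def compare_labels(scans):
    """Tally agreement between radiologist labels and PCR results."""
    tally = {"agree": 0, "radiologist_only": 0, "pcr_only": 0}
    for scan in scans:
        if scan["radiologist"] == scan["pcr"]:
            tally["agree"] += 1
        elif scan["radiologist"]:
            # Radiologist said covid, PCR test was negative.
            tally["radiologist_only"] += 1
        else:
            # PCR test was positive, radiologist said no covid.
            tally["pcr_only"] += 1
    return tally

# Invented toy data (assumption): True means "labeled as covid".
scans = [
    {"radiologist": True, "pcr": True},
    {"radiologist": True, "pcr": False},
    {"radiologist": False, "pcr": False},
    {"radiologist": True, "pcr": True},
    {"radiologist": False, "pcr": True},
    {"radiologist": True, "pcr": False},
]

tally = compare_labels(scans)
print(tally)  # {'agree': 3, 'radiologist_only': 2, 'pcr_only': 1}
```

A lopsided count of disagreements in one direction would suggest that the labels carry one reader’s habits into the data set rather than random noise.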
This has not stopped some of these tools from being rushed into clinical practice. Wynants said it is not yet clear which ones are being used, or how. Hospitals sometimes say that they use a tool only for research purposes, which makes it difficult to assess how much doctors rely on it. “There are many secrets,” she said.
Wynants asked a company marketing deep learning algorithms to share information about its methods, but received no response. She later discovered several published models from researchers associated with the company, all of which carry a high risk of bias. “We don’t actually know what the company implemented,” she said.
According to Wynants, some hospitals have even signed confidentiality agreements with medical AI providers. When she asked doctors which algorithms or software they were using, they sometimes told her they were not allowed to say.
How to fix it
What is the solution? Better data would help, but in times of crisis that is a big ask. It is more important to make the most of the data sets we have. Driggs said the simplest move would be for AI teams to collaborate more with clinicians. Researchers also need to share their models and disclose how they were trained so that others can test them and build on them. “These are two things we can do today,” he said. “They may solve 50% of the problems we find.”
Bilal Mateen, a doctor who leads clinical technology research at the Wellcome Trust, a global health research charity based in London, said that data would also be easier to obtain if formats were standardized.