Forms of artificial intelligence (AI), including machine learning-based systems, are rapidly making their way into healthcare. The recent rise of machine learning, especially the development of deep neural networks, has rapidly improved AI's ability to identify patterns: machine learning systems appear to outperform older AI systems, including in diagnosis and screening. The last two years have seen developments including: the production of an algorithm more accurate than standard approaches in predicting cardiovascular event risk; algorithms granted EU and US regulatory approval for use in screening mammography; and the launch of direct-to-consumer, device-based AIs (e.g. to detect atrial fibrillation). Much of this is occurring in the private sector, producing proprietary, for-profit products that claim excellent performance but are not explainable to users.

The data-driven approaches underpinning these systems rely on the availability of existing diagnostic data, and problems in those data are reflected in the new systems. For example, data sets can contain duplicates or missing values, and may lack the features needed to drive AI performance. For some forms of AI, existing diagnostic data must be manually labelled by clinicians to provide a 'ground truth' for machines to learn from. The quality of this 'ground truth' is an important potential source of bias, arising from the existing biases of clinicians, regional practice variation, or even systematic prejudice or discrimination in practice: without deliberate intervention, machine learning algorithms will diagnose future cases in ways that encode these existing biases. Neural network approaches to machine learning also have a 'black box' problem: it is difficult for humans to determine exactly how they make decisions. This introduces particular risks in relying solely on neural network algorithms.
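The mechanism by which biased 'ground truth' labels are encoded can be sketched with a toy example (all data, group names and thresholds here are hypothetical, not drawn from any study discussed above): when clinicians systematically under-label disease in one patient group, a learner fitted to those labels recovers the biased labelling rule rather than the true disease boundary.

```python
# Toy sketch (hypothetical data): biased 'ground truth' labels propagate
# into a learned model. Suppose true disease is present when a marker
# x > 5, but clinicians systematically under-label group "B" (x > 7).

def clinician_label(x, group):
    # Systematic labelling bias: a higher threshold is applied to group B
    return x > (7 if group == "B" else 5)

def learn_thresholds(cases):
    # Trivial learner: for each group, pick the threshold that best
    # reproduces the clinician labels (minimum disagreement)
    thresholds = {}
    for g in ("A", "B"):
        pts = [(x, y) for x, grp, y in cases if grp == g]
        thresholds[g] = min(
            range(0, 11),
            key=lambda t: sum((x > t) != y for x, y in pts),
        )
    return thresholds

# Simulated training set: (marker value, group, clinician label)
cases = [(x, g, clinician_label(x, g))
         for x in range(11) for g in ("A", "B")]

model = learn_thresholds(cases)
# The model faithfully reproduces the labelling bias for group B
# (threshold 7) instead of the true disease boundary (threshold 5).
print(model)  # {'A': 5, 'B': 7}
```

Without deliberate intervention, the fitted model's disagreement with the clinician labels is zero, so by the usual training criteria it looks perfect, even though it under-diagnoses group B relative to the true disease state.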
Academic, government and non-government organisations, and to some extent the technology sector, have recently begun asking hard questions about the ethical, legal and social implications of these technologies. Surprisingly little of this activity has paid close attention to machine learning in healthcare. Our research group is beginning work in this area. We will review developments in machine learning for screening and diagnosis, and discuss these in light of key ethical, legal and social questions being raised in the general AI ethics literature. These include: the likely effect of diagnostic and screening AI on the relationship between providers and patients and on professional roles; attribution of responsibility for diagnoses made using AI; how diagnostic and screening AI should be regulated; addressing bias and conflicts of interest in proprietary algorithms; and understanding the impact of AI on health inequities and public trust in healthcare. A key theme cutting across these issues is the apparent potential of AI both to increase and to reduce overdiagnosis, and even to reconceptualise disease itself. The overdiagnosis community is well placed to draw on lessons from past missteps in the evaluation of diagnostic and screening tests and programs, and to guide the development of diagnostic and screening AI. Attending carefully to these issues now, before widespread implementation, is a necessary step to maximise the benefits and minimise the harms of these technologies.