AI with Swarm Intelligence Detects COVID-19 in Data Stored in Decentralized Fashion
By LabMedica International staff writers
Posted on 27 May 2021
Using a principle called “swarm learning”, an international research team has trained artificial intelligence (AI) algorithms to detect blood cancer, lung diseases and COVID-19 in data stored in a decentralized fashion.
This approach by experts from the German Center for Neurodegenerative Diseases (DZNE; Bonn, Germany), the University of Bonn (Bonn, Germany) and Hewlett Packard Enterprise (HPE; Houston, TX, USA) has advantages over conventional methods since it inherently provides privacy preservation technologies, which facilitates cross-site analysis of scientific data. Swarm learning could thus significantly promote and accelerate collaboration and information exchange in research, especially in the field of medicine.
Science and medicine are becoming increasingly digital. Analyzing the resulting volumes of information - known as “big data” - is considered a key to better treatment options. However, the exchange of medical research data across different locations or even between countries is subject to data protection and data sovereignty regulations. In practice, these requirements can usually only be implemented with significant effort. In addition, there are technical barriers: For example, when huge amounts of data have to be transferred digitally, data lines can quickly reach their performance limits. In view of these conditions, many medical studies are locally confined and cannot utilize data that is available elsewhere.
In light of this, an international research collaboration tested a novel approach for evaluating research data stored in a decentralized fashion. The basis for this was the still young “Swarm Learning” technology developed by HPE. In addition to the IT company, numerous research institutions from Greece, the Netherlands and Germany - including members of the “German COVID-19 OMICS Initiative” (DeCOI) - participated in this study.
Swarm Learning combines a special kind of information exchange across different nodes of a network with methods from the toolbox of “machine learning”, a branch of AI. Machine learning hinges on algorithms that are trained on data to detect patterns in it - and that consequently acquire the ability to recognize the learned patterns in other data as well. With Swarm Learning, all research data remains on site. Only algorithms and parameters are shared - in a sense, lessons learned. Unlike “federated learning”, in which the data also remains local, there is no centralized command center. Thus, the AI algorithms learn locally, namely on the basis of the data available at each network node. The learning outcomes of each node are collected as parameters through the blockchain and smartly processed by the system. The outcome, i.e. the optimized parameters, is passed on to all parties. This process is repeated multiple times, gradually improving the algorithms’ ability to recognize patterns at each node of the network.
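The train-locally, merge, redistribute cycle described above can be illustrated with a minimal sketch. This is not HPE's Swarm Learning implementation: the blockchain coordination is replaced by a randomly chosen node that performs the merge in each round, the merge step is assumed to be a sample-weighted average of parameters, and the model is a small logistic regression on synthetic data. All function names and settings are hypothetical.

```python
import math
import random

def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, features):
    return sigmoid(sum(w * x for w, x in zip(weights, features)))

def local_update(weights, samples, lr=0.5, epochs=5):
    """Train a copy of the shared weights on one node's private samples."""
    w = list(weights)
    for _ in range(epochs):
        for features, label in samples:
            err = predict(w, features) - label
            w = [wi - lr * err * xi for wi, xi in zip(w, features)]
    return w

def merge(param_sets, sizes):
    """Average the parameter vectors, weighted by each node's sample count (assumed merge rule)."""
    total = sum(sizes)
    dim = len(param_sets[0])
    return [sum(p[i] * n for p, n in zip(param_sets, sizes)) / total
            for i in range(dim)]

def swarm_round(weights, node_data, rng):
    """One round: every node trains locally, then one node merges the parameters."""
    updates = [local_update(weights, samples) for samples in node_data]
    # Stand-in for decentralized coordination: any node may act as the merger.
    leader = rng.randrange(len(node_data))
    merged = merge(updates, [len(s) for s in node_data])
    return merged, leader

def make_node(rng, n):
    """Synthetic samples: label is 1 when x0 + x1 > 0; the last feature is a bias term."""
    samples = []
    for _ in range(n):
        x0, x1 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        samples.append(([x0, x1, 1.0], 1 if x0 + x1 > 0 else 0))
    return samples

def accuracy(weights, samples):
    hits = sum((predict(weights, f) >= 0.5) == (label == 1) for f, label in samples)
    return hits / len(samples)

rng = random.Random(0)
nodes = [make_node(rng, 40) for _ in range(3)]   # raw data never leaves a node
test_set = make_node(rng, 200)

weights = [0.0, 0.0, 0.0]
for _ in range(10):                              # repeated rounds refine the shared model
    weights, _leader = swarm_round(weights, nodes, rng)

swarm_accuracy = accuracy(weights, test_set)
print(f"accuracy after 10 rounds: {swarm_accuracy:.2f}")
```

Only the parameter lists returned by `local_update` cross node boundaries in this sketch; the sample lists stay where they were created, which mirrors the principle that research data remains on site.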
The researchers are now providing practical proof of this approach through the analysis of X-ray images of the lungs and of transcriptomes: The latter are data on the gene activity of cells. In the current study, the focus was specifically on immune cells circulating in the blood - in other words, white blood cells. The research team addressed a total of four infectious and non-infectious diseases: two variants of blood cancer (acute myeloid leukemia and acute lymphoblastic leukemia), as well as tuberculosis and COVID-19. The data included a total of more than 16,000 transcriptomes. The swarm learning network over which the data were distributed typically consisted of at least three and up to 32 nodes. Independently of the transcriptomes, the researchers analyzed about 100,000 chest X-ray images. These were from patients with fluid accumulation in the lung or other pathological findings as well as from individuals without anomalies. These data were distributed across three different nodes.
The analysis of both the transcriptomes and the X-ray images followed the same principle: First, the researchers fed their algorithms with subsets of the respective data set. This included information about which of the samples came from patients and which from individuals without findings. The learned pattern recognition for “sick” or “healthy” was then used to classify further data, in other words it was used to sort the data into samples with or without disease. The accuracy, i.e. the ability of the algorithms to distinguish between healthy and diseased individuals, was around 90% on average for the transcriptomes (each of the four diseases was evaluated separately); in the case of the X-ray data, it ranged from 76% to 86%. The study also found that Swarm Learning yielded significantly better results than when the nodes in the network learned separately.
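As a brief illustration of the accuracy figure mentioned above, the following sketch computes the share of correctly classified samples. The labels and predictions are invented for illustration and are not data from the study; the function name is hypothetical.

```python
def classification_accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted class matches the known class."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

# Illustrative values only: 1 = sample from a patient, 0 = no findings.
known = [1, 1, 0, 0, 1, 0, 1, 0, 1, 0]
predicted = [1, 1, 0, 0, 1, 0, 0, 0, 1, 1]

example_accuracy = classification_accuracy(known, predicted)
print(example_accuracy)  # 8 of 10 match, so 0.8
```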
“The methodology worked best in leukemia. In this disease, the signature of gene activity is particularly striking and thus easiest for artificial intelligence to detect. Infectious diseases are more variable. Nevertheless, the accuracy was also very high for tuberculosis and COVID-19. For X-ray data, the rate was somewhat lower, which is due to the lower data or image quality,” said Joachim Schultze, Director of Systems Medicine at the DZNE and professor at the Life & Medical Sciences Institute (LIMES) at the University of Bonn. “Our study thus proves that Swarm Learning can be successfully applied to very different data. In principle, this applies to any type of information for which pattern recognition by means of artificial intelligence is useful. Be it genome data, X-ray images, data from brain imaging or other complex data.”