Introduction
Methods
Identifying the research question
Identifying relevant studies
Identifying medical and health-related publications
Database | Query |
---|---|
PubMed | (((“Diagnosis”[MeSH] OR “diagnostic”[TIAB] OR “diagnostics”[TIAB] OR “diagnosis”[TIAB] OR “diagnoses”[TIAB]) AND (“rare diseases”[MeSH] OR “Genetic Diseases, Inborn/genetics”[MeSH] OR ((“rare”[TIAB] OR “genetic”[TIAB] OR “orphan”[TIAB]) AND (“diseases”[TIAB] OR “disease”[TIAB])))) OR (“Rare Diseases/diagnosis”[MeSH] OR “Genetic Diseases, Inborn/diagnosis”[MeSH])) AND (“Decision Support Systems, Clinical”[MeSH] OR “Decision Support Techniques”[MeSH] OR “decision support”[TIAB] OR “artificial intelligence”[MeSH] OR “artificial intelligence”[TIAB] OR “Medical Informatics Computing”[MeSH] OR “Big data”[MeSH] OR “Data Mining/methods”[MeSH] OR “expert system”[TIAB] OR “information retrieval”[TIAB] OR “search engine”[MeSH] OR “Software Design”[MeSH] OR “Software Validation”[MeSH]) |
Web of Science | (ALL = (“rare disease*” OR “genetic disease*” OR “orphan disease*”) OR (TI = (“rare” OR “genetic” OR “orphan”) AND TI = “disease*”)) AND ALL = (diagnosis OR diagnostic* OR diagnoses) AND ALL = (“decision support*” OR “expert system*” OR “artificial intelligence” OR “information retrieval” OR “search engine*” OR “medical informatics computing” OR “software design” OR “software validation” OR “big data” OR “data mining”) |
Identifying methodological publications
Study selection
- Aimed at assessing disease severity, survival, prognosis or risk for recurrence but not diagnosis (publications identifying primary risk for a disease were kept)
- Aimed at assessing the risk for a disease using only environmental factors
- Aimed at identifying the best treatment option based on individual variability
- Aimed at classifying diseases without performing a more precise diagnosis/subtyping (e.g., for cystic fibrosis, assessing the thickness of airways)
- Aimed at improving disease knowledge (e.g., aiming at identifying gene signatures) instead of generating a diagnosis tool or algorithm
- Focusing on diseases that are neither rare nor genetic (e.g., Alzheimer, Parkinson)
Charting the relevant studies
Collating, summarizing and reporting the results
Results
Metadata
Publication scope
Publication target
Material
Population | Number of studies | Number of studies with datasets | Number of datasets | Number of patients: median | Number of patients: mean [min, max] |
---|---|---|---|---|---|
Patients, Group 1 | 29 studies | 27 studies | 29 datasets | 50 | 291 [7, 5,050] |
Patients, Group 2 | 15 studies | 14 studies | 20 datasets | 98 | 730 [5, 10,593] |
Patients, Group 3 | 17 studies | 8 studies | 10 datasets | 161 | 6,929 [40, 39,000] |
Controls, Group 1 | 29 studies | 27 studies | 29 datasets | 70 | 105,491 [10, 2,966,363] |
Algorithm and model
Preprocessing
Developed models
Material | Knowledge | Machine learning | Articles |
---|---|---|---|
Phenotype concepts (22 studies) | Knowledge-based (14 studies) | No | |
 | Hybrid (7 studies) | Yes | |
 | | No | [25] |
 | Data driven (1 study) | Yes | [72] |
Fluids (12 studies) | Hybrid (2 studies) | Yes | |
 | Data driven (10 studies) | Yes | |
Images (16 studies) | Hybrid (2 studies) | Yes | |
 | Data driven (14 studies) | Yes | |
Questionnaires (3 studies) | Data driven (3 studies) | Yes | |
Family history and combined material (8 studies) | Knowledge-based (5 studies) | No | |
 | Hybrid (2 studies) | Yes | |
 | Data driven (1 study) | Yes | [40] |
Evaluation and validation
- The performance metrics;
- The comparison of results to other references;
- The use of external validation;
- The inclusion of a process to deal with the imbalance issues.
Evaluation | Data driven | Knowledge-based | Hybrid |
---|---|---|---|
Comparison to other methods | 15 studies | 9 studies | 4 studies |
Comparison to other tools | 1 study | 8 studies | 3 studies |
Comparison to experts | 3 studies | 1 study | 1 study |
External validation | 8 studies | 8 studies | 2 studies |
Method for imbalance issue | 2 studies | 0 studies | 0 studies |
Total | 29 studies | 19 studies | 13 studies |
- For data-driven systems, we assessed whether the algorithm was validated on an external dataset. A dataset can carry its own biases, and a method overfitted to one dataset may fail on others, so a validation step on an external dataset is required.
- For knowledge-based studies, we assessed whether models were evaluated on real patients.
- For hybrid models, both validation processes were considered.
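The external validation step described above can be illustrated with a minimal sketch. Everything in it is a hypothetical assumption rather than material from the reviewed studies: the single-biomarker threshold rule, the function names `accuracy` and `fit_threshold`, and the toy measurements. It only shows why a model tuned on one dataset should be scored again on an independent one.

```python
def accuracy(threshold, samples):
    """Share of samples correctly labelled by the rule 'value >= threshold means affected'."""
    correct = sum((value >= threshold) == affected for value, affected in samples)
    return correct / len(samples)


def fit_threshold(samples):
    """Pick the observed value that maximises accuracy on the development set."""
    candidates = sorted({value for value, _ in samples})
    return max(candidates, key=lambda t: accuracy(t, samples))


# Hypothetical (biomarker value, affected?) pairs; not taken from any study.
development = [(1.0, False), (1.2, False), (1.4, False),
               (2.1, True), (2.3, True), (2.6, True)]
# Hypothetical independent cohort with a slightly different distribution.
external = [(1.1, False), (1.9, False), (2.2, False),
            (2.4, True), (2.8, True), (1.8, True)]

threshold = fit_threshold(development)
internal_acc = accuracy(threshold, development)  # optimistic: same data used for tuning
external_acc = accuracy(threshold, external)     # closer to real-world performance

print(f"threshold={threshold}, internal={internal_acc:.2f}, external={external_acc:.2f}")
```

In this toy setting the rule separates the development set perfectly but misclassifies two of the six external cases, which is the kind of gap an external validation is meant to expose.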
Tool implementation
- Online tools
- Advanced tools/algorithms
- Innovative prototypes
General information
Online tools
Tool name | Date | Data sources | Performance (top 10 ranking) | Related articles | URL |
---|---|---|---|---|---|
Phenomizer | 2009 | Phenotype concepts | NA | [63] | http://compbio.charite.de/phenomizer |
BOQA | 2012 | Phenotype concepts | NA | [64] | http://compbio.charite.de/boqa/ |
Phenotips | 2013 | Phenotype concepts | NA | [65] | http://phenotips.org |
FindZebra | 2013 | Phenotype concepts | 63% | [66] | http://www.findzebra.com/ |
PhenIX | 2014 | Phenotype concepts/genes | ~ 99% | [67] | http://compbio.charite.de/PhenIX/ |
Phenolyzer | 2015 | Phenotype concepts/genes | ~ 85% | [69] | http://phenolyzer.usc.edu |
RDD | 2016, 2017 | Phenotype concepts | 38% | | http://diseasediscovery.udl.cat/ |
IEMbase | 2018 | Phenotype concepts | 90% | [54] | http://www.iembase.org/app |
PubCaseFinder | 2018 | Phenotype concepts | 57% | [71] | https://pubcasefinder.dbcls.jp/ |
RDAD | 2018 | Phenotype concepts/genes | 95% | [73] | |
GDDP | 2019 | Phenotype concepts | ~ 32% | [77] | https://gddp.research.cchmc.org/ |
Xrare | 2019 | Phenotype concepts/genes | ~ 95% | [78] | https://web.stanford.edu/~xm24/Xrare/ |
CC-Cruiser | 2017 | Images | NA | [44] | https://www.cc-cruiser.com/ |
DeepGestalt | 2019 | Images | NA | [62] | https://www.face2gene.com/ |
Advanced tools and algorithms
Innovative prototypes
Discussion
Overview
Technical significance
Clinical significance
Perspectives
- To use standardized metrics to facilitate evaluation and comparison. For studies using the top K ranking of possible diagnoses, we recommend reporting at least the top 10 ranking and the mean rank of the correct disease over all patients.
- To use standardized terminologies to enhance interoperability and dissemination of the tools. For systems based on phenotype concepts, we recommend using the HPO, provided that this terminology keeps being enriched and is available in several languages.
- To combine expert and data knowledge to enhance explicability.
- To provide robust methods dealing with the imbalance and data volume issues.
- To make training sets accessible.
- To validate the findings on external datasets and real patient cases.
- To measure the impact on patient diagnosis and outcomes.
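The two ranking metrics recommended in the first item above can be computed as in the following sketch. The function names, the disease labels and the handling of missing diagnoses are illustrative assumptions: when the correct disease is absent from a ranked list, it is assigned the rank just after the last returned candidate, which is only one of several possible conventions.

```python
from statistics import mean


def rank_of_correct(ranked_diseases, correct):
    """Return the 1-based rank of the correct disease, or None if it is absent."""
    for position, disease in enumerate(ranked_diseases, start=1):
        if disease == correct:
            return position
    return None


def top_k_accuracy(cases, k=10):
    """Fraction of patients whose correct diagnosis appears among the first k candidates."""
    hits = 0
    for ranked, correct in cases:
        rank = rank_of_correct(ranked, correct)
        if rank is not None and rank <= k:
            hits += 1
    return hits / len(cases)


def mean_rank(cases):
    """Mean rank of the correct disease over all patients.

    Assumption: a missing diagnosis counts as rank len(ranked) + 1.
    """
    ranks = []
    for ranked, correct in cases:
        rank = rank_of_correct(ranked, correct)
        ranks.append(rank if rank is not None else len(ranked) + 1)
    return mean(ranks)


# Hypothetical output of a diagnosis support system: (ranked candidates, true diagnosis).
cases = [
    (["disease_A", "disease_B", "disease_C"], "disease_A"),  # correct at rank 1
    (["disease_B", "disease_C", "disease_A"], "disease_A"),  # correct at rank 3
    (["disease_C", "disease_B"], "disease_A"),               # correct disease missing
]

print(top_k_accuracy(cases, k=2), mean_rank(cases))
```

Reporting both values together is useful because the top K accuracy hides how far down the list the missed diagnoses fall, while the mean rank is sensitive to it.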