Wilson’s disease (WD) is an autosomal recessive disorder caused by impaired excretion of copper from mammalian cells. This review discusses several issues in detail: effective characterization of the ATP7B gene, the scope of gene network topology in genetic analysis, pattern recognition using different computing approaches, and the possibilities for fusing imaging and genetic datasets. We organize the study into three major sections: (A) WD genetics, (B) diagnosis guidelines and (C) treatment possibilities. We address the scope of advanced mathematical modelling paradigms for understanding common genetic sequences and the dominant WD imaging biomarkers. We also discuss current state-of-the-art software models for genetic sequencing. Further, we hypothesize that machine and deep learning techniques, applied to WD genetics and image processing, can support precise classification of WD. These computing procedures highlight the changing roles of various data transformation techniques with respect to supervised and unsupervised learning models.
Metals are indispensable micronutrients for the growth of all organisms (1). They are an integral part of various enzymes and take part in several diverse redox reactions. These enzymes play a central role in metabolic and biological processes inside cells, including free radical detoxification, neurotransmitter synthesis and oxidative metabolism (2). Despite the crucial role of metals within cells, an imbalance in metal homeostasis disrupts normal metabolic processing. This imbalance leads to hereditary disorders such as Wilson’s disease (WD) and Menkes disease (MND). WD is a very rare disorder, initially described by Wilson (3) as a lenticular degeneration syndrome caused by copper overload in mammalian cells. In this study, we describe WD as a syndrome of neurological symptoms and liver cirrhosis.
The epidemiology of WD shows that neither the carrier frequency nor the mutation spectrum is uniform across geographical regions. The estimated incidence is one in 55000 births (4). A recent molecular sequencing study showed a higher prevalence in Japan due to consanguinity (5). The authors also described differences in the occurrence of WD between Asia and other regions, for example in the Japanese population (33-68 per million) and the European population (12-29 per million). This points to our primary issue: the inconsistency of WD prevalence and its relationship with variation in genetic mutations. To resolve this issue, several studies have sought a specific pattern in the annotations of genetic coding variants. These coding variants, or mutations, carry strong information about the epidemiology of an autosomal recessive disorder. Among the major issues, the lack of a precise understanding of mitochondrial genomics that would reveal a unique mutation pattern in all WD patients remains an open question for genetic experts. A mutation pattern is a permanent alteration of nucleotide sequences, and for WD this pattern differs between geographical regions. An essential step is therefore to analyze the complete gene sequence environment and its gene interactions.
Another issue concerns the precise detection of WD using hepatic and neurological imaging of WD patients. Conventionally, liver and neurological images are used in WD diagnosis to analyze variation in structural features. Very few effective hypotheses have been proposed for features that distinguish WD liver cirrhosis from alcoholic liver cirrhosis (6, 7). A method for finding a specific WD imaging pattern is therefore unknown: our search found no hypothesis that can be converted into a mathematical model and that provides effective imaging classification between WD liver cirrhosis and alcoholic liver cirrhosis.
In our hypothesis, liver imaging and neurological symptoms, combined with genetic biomarkers (regions of genomic DNA and gene encoding patterns) and applied in a machine or deep learning framework, provide a powerful paradigm for WD classification and risk stratification. To collect these features, practitioners must frequently perform abdominal imaging, motor activity analysis and identification of Cis-region status. In further processing, a set of these biomarkers is used in the feature selection step of a machine or deep learning model. This study presents a review that integrates information about WD from different sources such as genetics, clinical trials for diagnosis, and treatment studies. Further, we draw more meaningful inferences based on a deep learning based supervised paradigm. Figure 1 shows an improved version of the WD diagnosis process in which imaging, deep learning and genetics play a significant role in providing precise detection of WD and effective treatment.
Improved version of WD diagnosis.
The layout of the WD discussion is as follows. Section 2 discusses the copper carrier gene and the corresponding ATPase family, together with the issue of correlation between different WD mutations and epidemiological variations. To resolve this issue, we discuss a deep learning based supervised paradigm known as Deep learning for identifying Cis-Regulatory Elements and other applications (DECRES) and a Cis-regulatory theory-based hypothesis to couple WD mutations and epidemiological variations. We also discuss the use of a deep learning model, the convolutional neural network (CNN), to obtain a precise genomic and imaging pattern, which may enhance the genetic study of inherited diseases. Section 3 discusses WD diagnosis in terms of liver and neurological imaging and the application of ML and DL approaches to show prominent characteristics of biomarkers. Section 4 contains a detailed discussion of data fusion techniques and the classification measurement indexes used to compare the performance of competing approaches. Section 5 summarizes the important points of the study and, finally, section 6 concludes with the significant outcomes of the entire work.
WD is an autosomal recessive disorder characterized by improper copper excretion from the body. Clinical studies indicate abnormal copper accumulation primarily in the liver, brain and retina. Genetic regulation of cellular copper metabolism is performed by copper-transporting P-type ATPases and is altered by mutations in the ATP7B gene (Figure 2A). The ATP7B protein is located in the trans-Golgi network (Figure 2B) of hepatocytes and the brain, and it maintains the balance of copper levels by excreting excess copper into bile and plasma.
(A) Structure of the copper-transporting ATP7B gene, (B) Localization of ATP7B in the trans-Golgi network in the liver.
Forbes (8) analyzed the structure of the ATP7B protein and its functional traits by introducing site-directed mutations. The authors showed that the WD protein can be divided into 5 domains (Figure 3): (1) the phosphatase domain (TGEA motif, Thr-Gly-Glu-Ala), (2) the phosphorylation domain (DKTGT motif, Asp-Lys-Thr-Gly-Thr), (3) the ATP binding domain (TGDN motif), (4) the metal binding domain (six copper binding motifs at the N-terminus in the cytosol) and (5) eight transmembrane segments. Functionally, ATP7B receives copper ions from Antioxidant 1 copper chaperone (ATOX1), a cytosolic protein, and transports them directly into hepatocytes. In WD, mutations in ATOX1 block the copper pathway from the cytosol to the copper binding domains, and proper copper transport is disrupted.
A systematic structural model of the ATP7B protein.
Apart from discussion of the ATP7B protein, several articles have been published on the WD mutation spectrum (9) and the epidemiology of WD (10). However, these articles do not correlate regional variations in genotype-phenotype traits with unique mutation findings. This is a possible reason for poor diagnostic confirmation of WD in random case studies. Further, even after mutation testing, practitioners cannot be sure of a precise confirmation of WD. To resolve this issue, we propose a hypothesis based on a combination of genome prediction theory with supervised machine and deep learning methods.
In our hypothesis, an effective correlation between genetic variations and unique mutation findings requires an explanation of the characteristics of the Cis-regulatory regions (CRRs) of the ATP7B gene. Genetic experts should pay more attention to the enhancer and promoter components of DNA in WD, since these two components play a crucial role in the control of gene expression. From a deep learning perspective, an advanced supervised model that can be used to classify mutation characteristics based on a CRR dataset is DECRES. This model was implemented by Li (11) and is available at https://github.com/yifeng-li/DECRES. It uses a multilayer perceptron for identifying regulatory regions. The practitioner can use the deep feature selection (DFS) algorithm for feature selection on a CRR dataset based on ATP7B DNA. This model classified GM12878 lymphoblastoid cells with an accuracy of 93.59% (11). Before DECRES is applied, WD gene expression data must be cleaned to handle all missing numeric values and replicate values from intracellular signaling pathways and downstream transcription factors of a sequence. To work with this framed sequence, the initial step is deep feature extraction with a new one-to-one sparse layer added to the conventional multi-layer perceptron (MLP) architecture. The model is regularized to prevent overfitting; one common regularization method is the elastic net (38). The inclusion of the sparse layer allows the model to learn non-linearities in the features that are ordinarily not captured by linear models. The backpropagation learning rule was applied to update the weights. The features extracted from the model were found to be discriminatory and enriched with regulatory elements, increasing the accuracy of the classifier. Alternatively, a time series model can also be effective for analyzing complex interactions between both across small time segments.
This analysis can help to maintain an index that represents the measured activity of genes. To implement a time series model, genetic experts require datasets based on CRR activity variation sequences (12). Irregular activity can be treated in the manner of spike sorting, where an abrupt change in signal and a high variation in the activity of a specific motif can be a significant measure for estimating the causes of WD. Similar time-dependent models such as autoregressive moving average (ARMA) and autoregressive integrated moving average (ARIMA) can also be effective for observing deviations in defective genetic mutations.
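As a minimal sketch of the time series idea above, the following fits a first-order autoregressive model to an activity sequence and flags time points whose residual is unusually large. The AR(1) form, the least-squares fit, the threshold of 2.5 standard deviations and the activity values are all illustrative assumptions, not a method taken from the cited studies.

```python
from statistics import mean, pstdev


def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1] + e[t]."""
    prev, nxt = series[:-1], series[1:]
    m_prev, m_next = mean(prev), mean(nxt)
    var = sum((a - m_prev) ** 2 for a in prev)
    cov = sum((a - m_prev) * (b - m_next) for a, b in zip(prev, nxt))
    phi = cov / var
    c = m_next - phi * m_prev
    return c, phi


def flag_spikes(series, z_threshold=2.5):
    """Return indices whose AR(1) residual exceeds z_threshold residual std devs."""
    c, phi = fit_ar1(series)
    residuals = [b - (c + phi * a) for a, b in zip(series[:-1], series[1:])]
    sigma = pstdev(residuals)
    if sigma == 0:
        return []
    # residuals[t] belongs to series[t + 1]
    return [t + 1 for t, r in enumerate(residuals) if abs(r) > z_threshold * sigma]


# Hypothetical motif activity values with one abrupt change at index 7.
activity = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 6.0, 1.0, 1.1, 0.9, 1.0]
spike_indices = flag_spikes(activity)
```

For real CRR activity data, an ARMA or ARIMA model from an established statistics package would normally replace this hand-written fit.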
Apart from genetics, WD diagnosis also faces the issue of precise knowledge discovery and its application in predicting the presence of WD. WD diagnosis is mainly based on characterization of lower abdominal imaging and detection of neurologic symptoms, but several clinical findings (13) also point to hepatic and neurological deficits together with a corneal syndrome. Several in vitro studies have aimed at precise diagnosis of WD symptoms but, to date, no strong diagnostic test is available to confirm pediatric WD.
WD imaging studies can be classified into studies of neurological disorders and studies of hepatic cirrhosis. In hepatic studies, symptoms of WD are confirmed by the visibility of focal liver lesions in hepatocytes. The visibility of focal liver lesions is categorized by (1) the diffuse intensity characteristics of nodules and (2) the size of nodules. This classification relates the major forms of liver cirrhosis to different imaging techniques. Several MR imaging-based studies (14) have been reported that give a clearer understanding of liver lesions. Table 1 shows how liver lesion stages vary in terms of the specification and characterization of liver nodules (15), and compares the structural features of normal liver tissue with those of liver affected by WD.
Nodules specifications | Normal case | WD case |
---|---|---|
Regenerative nodules | No hyper-enhancement is seen in pixel intensity of MR images. | Irregular liver shape is seen. Hypo-intense pixels are observed. |
Nodules in hepatic steatosis | Microvesicular | Macrovesicular |
Nodular fatty infiltration | - | Very rare changes are observed in hyperintense signals in T1-weighted MR images. |
Pseudo-mass | Normal tissue structure | Advanced cirrhosis in liver tissue. |
Honeycomb pattern | Normal nodule structure | Several hypo-intense nodules covered by hyper-intense septa are seen. |
Dysplastic nodules | - | Atypical liver cell structure found in WD. |
Malignant nodules | Do not exist in the normal case. | The malignant nodule type called HCC is found more frequently (95%) in WD. Another type of malignant nodule, intrahepatic cholangiocarcinoma, is very rare in WD. |
To show hepatic involvement in WD, Dohan (15) performed a set of liver imaging examinations and showed the degree of intensity enhancement in nodules (Figure 4). The authors considered focal liver lesion status and intensity variations in hepatocytes as important biomarkers for WD detection, but found that they do not confirm WD, because other factors such as alcohol consumption may also cause focal liver lesions.
A. Fat-suppressed MR image confirms a hypointense liver nodule (arrows). B. The nodule (arrows) is homogeneous and hypointense relative to the adjacent hepatic parenchyma. C. The nodule is markedly hypointense, indicating free diffusion. D. Fat-suppressed 3-D volumetric interpolated breath-hold examination image in which the nodule (arrows) shows a degree of enhancement similar to the adjacent hepatic parenchyma. E. The nodule (arrows) shows a lower degree of enhancement compared with the adjacent hepatic parenchyma. F. The nodule (arrow) is isointense to the adjacent hepatic parenchyma.
Neurological symptoms usually develop in the third decade of life and mainly involve dystonia, seizure, chorea, dysarthria, resting tremor and other psychiatric disturbances. One important observation is the presence of the Kayser–Fleischer ring (13) in almost all cases of neurological involvement in WD.
In our hypothesis, common imaging biomarkers such as muscle contraction status in the basal ganglia, dysfunction in the cerebellar region and pixel intensity patterns are combined with genetic biomarkers (class predictor gene sets) to form the observation set for classification. In this phase, statistical offline feature values are mapped together with statistical genetic motif data. A similar demonstration was performed by Kim (39, 40), where a CNN was trained on a small sample of financial data. Analysts can extend imaging features (41) by appending a new dimension of features related to genetic properties. One advantage of this demonstration is the availability of an online DCNN server to handle this type of data; the server automatically fixes the hyper-parameters of the applied CNN model. The approach can be framed as binary classification in which two classes, (1) WD and (2) non-WD, determine the state of the fused data (Figure 5). The detection of WD or non-WD depends on effective characterization of WD gene expression together with neurological and liver images. To achieve this, effective noise removal techniques and transformations can be applied. After both the genetic and imaging data are refined, an advanced feature selection procedure extracts the prominent features and removes redundant and less significant attributes. In genetic data, prominent genes can be obtained by filtering the higher-ranking gene set along with their risk stratification scores. A normalization step specifies the range of parameters for both datasets so that a single mapping procedure can access the statistical properties of the amalgamated dataset. Both individual datasets can then be placed in a single data storage file for classification. Class prediction can also serve as a strong genetic biomarker and has already been used in the molecular classification of cancer (16). It can be accessed through the RankGene software (17).
Recurrent neural networks (RNNs) with long short-term memory (LSTM) have previously been applied to genetic data; these are discussed later.
Fusion of WD data taken from neural images and microarray data.
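The normalization and merging steps described above can be sketched as follows. This is a minimal illustration that assumes min-max scaling and simple per-patient concatenation; the function names and feature values are hypothetical, and a real pipeline would add the noise removal and feature selection steps first.

```python
def min_max_scale(rows):
    """Scale every feature column to [0, 1]; constant columns map to 0.0."""
    scaled_columns = []
    for column in zip(*rows):
        low, high = min(column), max(column)
        span = high - low
        scaled_columns.append([(v - low) / span if span else 0.0 for v in column])
    return [list(row) for row in zip(*scaled_columns)]


def fuse(imaging_rows, genetic_rows):
    """Normalize each modality separately, then concatenate per patient."""
    if len(imaging_rows) != len(genetic_rows):
        raise ValueError("both modalities must describe the same patients")
    imaging = min_max_scale(imaging_rows)
    genetic = min_max_scale(genetic_rows)
    return [img + gen for img, gen in zip(imaging, genetic)]


# Hypothetical values: two imaging features and one gene score per patient.
imaging_features = [[10.0, 0.2], [20.0, 0.4], [30.0, 0.6]]
genetic_features = [[5.0], [5.0], [7.0]]
fused = fuse(imaging_features, genetic_features)
```

Scaling each modality before concatenation keeps one data source from dominating the fused feature vector purely because of its numeric range.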
Another method that can be used to classify infected regions in WD and non-WD cases is the CNN, an effective mathematical classification model (Figure 6). A CNN comprises three basic components: (A) input, (B) convolution layer and (C) pooling. To apply a CNN to medical image data, dense liver lesion images are used as input and passed to the convolution layer, where the image is characterized by applying the convolution operation between a kernel matrix and the original liver image matrix. This operation produces a feature map of the entire image with respect to the filter. The output of the convolution layer is passed to the pooling layer, which reduces the information related to the spatial locality of the evaluated features (18, 19). In the last decade, CNNs have been widely used by the research community for tissue characterization (20) and medical image segmentation (21). A brief description of DL techniques and their applications is given later.
A CNN classification model for WD prediction based on liver tissues or neuro-tissues variations and genetic biomarkers.
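The convolution and pooling operations described above can be illustrated with a small sketch. It assumes a single-channel image, a "valid" convolution without padding, and non-overlapping max pooling; the image and kernel values are made up for illustration, and a practical CNN would rely on a deep learning framework instead.

```python
def convolve2d(image, kernel):
    """Valid 2-D convolution as used in CNNs (no kernel flip, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling with a size x size window."""
    rows = range(0, len(feature_map) - size + 1, size)
    cols = range(0, len(feature_map[0]) - size + 1, size)
    return [
        [
            max(feature_map[r + i][c + j]
                for i in range(size) for j in range(size))
            for c in cols
        ]
        for r in rows
    ]


# Toy 5x5 "image" with a vertical edge, and a 2x2 edge-detecting kernel.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]
feature_map = convolve2d(image, kernel)
pooled = max_pool(feature_map)
```

The feature map responds only where the intensity changes, and pooling keeps that response while halving the spatial resolution.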
In our hypothesis (Figure 1), we discuss the integration of two medical domains for a strong diagnosis of WD. The primary domain is liver images from diagnosed WD cases and the secondary domain is gene profiling data in the form of a regulatory sequence code image. In the primary domain, several studies (22) have characterized liver images using CNNs and other models. In the secondary domain, chromatin shift, its association with disease variations and conserved segments (23) are merged with different genomic biomarkers and applied to a CNN. Both learning steps train on a huge amount of data. Implementing a strong fusion algorithm that integrates the two domains is a further challenge. This problem maps naturally onto Big Data, where data is voluminous and collected from several sources. In Big Data, such problems are resolved by fusion techniques (24) (discussed in section 5.3). In the medical domain, fusion has already been performed successfully on medical images (25, 26).
Wilson’s disease is a disorder of metal metabolism, so the objective of treatment must be to reduce the copper accumulated in the tissues. WD treatment can be divided into two stages: in the initial phase of therapy, the therapist tries to remove excess copper from the tissues, and in the later stage maintenance therapy is performed to control the copper level and prevent its reaccumulation. Table 2 summarizes the pros and cons of some popular chelating agents.
Chelating agents | Characterization | References |
---|---|---|
D-Penicillamine (copper excretion via urine) | Major cause of alteration in the dermal elastic tissue | (27) |
D-Penicillamine | It develops adverse effects and even worsening of neurological symptoms in WD patients. On discontinuation of DPA treatment, rapid clinical deterioration may take place, resulting in the need for liver transplantation or even the death of the patient | (28) |
D-Penicillamine | Neurologic deterioration | (29) |
Zinc and Trientine | Zinc therapy demonstrates poor efficacy in controlling liver disease in pre-symptomatic children with Wilson’s disease | |
Zinc and Trientine | Zinc is associated with gastrointestinal adverse effects in nearly 20% of children | (30) |
Zinc and Trientine | Zinc monotherapy is effective in Wilson’s disease patients with mild liver disease diagnosed in childhood | (31) |
Zinc and Trientine | Long-term zinc monotherapy may cause neurological deterioration. Sodium dimercaptopropanesulfonate (DMPS) combined with zinc produces better results in neurological WD than zinc monotherapy and D-penicillamine (DPA) | (32) |
Liver transplantation | Overall, neuropsychiatric symptoms improved after liver transplantation, substantiating arguments for widening of the indication for liver transplantation in symptomatic neurologic Wilson’s disease patients with stable liver function | (33) |
Liver transplantation | Recurrent hepatitis C virus (HCV) infection | (34) |
Liver transplantation | Anastomotic stenosis is another major issue associated with liver transplantation | (35) |
Several other chelating agents are also used for WD treatment, but no chelating agent is perfect. Based on recent studies (2014 to date), some new chelating agents may be better alternatives to the older ones. Zinc monotherapy is better than D-Penicillamine, but in some cases it fails and creates new issues such as nausea, vomiting, brain cell death and abdominal pain (36). Compared with older chelating agents, Methanobactin (MB) (37), a newly developed peptide, performed outstandingly in rats with respect to hepatic copper accumulation, liver damage and cirrhotic tissue impairment. DMPS with zinc is another possible alternative. If these alternatives do not perform well, hepatocyte transplantation, stem cell transplantation and gene therapy (34) can be used as alternatives to liver transplantation.
Linear classification problems were the first problems to be tackled intelligently by machines. These problems were handled successfully using K-Nearest Neighbors (KNN) (42), perceptrons (43) etc. KNN is a lazy learning algorithm that assigns a test instance the most frequent class among its K nearest neighbors. A perceptron is a two-layer hierarchical network of computing nodes. The nodes do the computing, while the weights of the network connections are updated according to the error on the linear classification problem. A basic difference between KNN and the perceptron is that KNN does not learn a model, whereas in a perceptron what is learned is stored in the network weights.
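The contrast between the two methods can be made concrete with a short sketch. KNN below simply stores the training points and votes at query time, while the perceptron adjusts its weights from classification errors. The data points, the class labels and the learning settings are illustrative assumptions.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Vote among the k training points closest to the query (no training step)."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def train_perceptron(xs, ys, epochs=20, lr=1.0):
    """Classic perceptron rule: update weights only on misclassified samples."""
    weights, bias = [0.0] * len(xs[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            predicted = perceptron_predict(weights, bias, x)
            error = y - predicted  # -1, 0 or +1
            weights = [w + lr * error * v for w, v in zip(weights, x)]
            bias += lr * error
    return weights, bias


def perceptron_predict(weights, bias, x):
    return 1 if sum(w * v for w, v in zip(weights, x)) + bias > 0 else 0


# Hypothetical two-feature samples; labels are placeholders for illustration.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["non-WD"] * 3 + ["WD"] * 3
knn_label = knn_predict(points, labels, (5.5, 5.5))

# Linearly separable toy problem (logical AND) for the perceptron.
and_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_targets = [0, 0, 0, 1]
weights, bias = train_perceptron(and_inputs, and_targets)
```

After training, the perceptron needs only `weights` and `bias` to classify new points, whereas KNN must keep the whole training set.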
However, with time the size and complexity of data have increased manifold. Bio-informatics data, such as those for WD, require gene sequencing over millions of samples. Traversing and making decisions on such data may take years of computation time and special hardware. Learning algorithms based on ML/DL techniques can play a crucial role in understanding the nonlinear relationships between instances within such datasets. In the next few subsections we cover some ML and DL techniques and their applications in bio-informatics and medical imaging.
Various ML techniques have been developed over the years for the characterization of data. An ML technique is a two-stage process: feature extraction and characterization. In feature extraction, mathematical tools based on domain knowledge are required to extract features from gene and image data. Optionally, feature selection methodologies can be applied to reduce the extracted features. In the second stage, characterization algorithms are applied for the classification task. A generalized model of the ML process is shown in Figure 7. Some of the most popular ML techniques are K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Artificial Neural Networks (ANN), the Naïve Bayes algorithm etc.
Generalized ML process model.
KNN has been discussed earlier. ANNs can have two layers, like the perceptron, or multiple layers, i.e., multi-layer perceptrons. There are many types of learning laws for ANNs, which can be categorized as iterative and non-iterative. Backpropagation (BP) (44) is the most popular iterative law for updating weights in ANNs. The Extreme Learning Machine (ELM) is a non-iterative single-hidden-layer feed-forward neural network that learns its weights in a single pass (45). It uses the Moore–Penrose generalized inverse (46) to give the least-squares solution. SVM uses two strategies for characterization. In the first, SVM tries to find the maximum-margin hyperplane dividing the two classes. In the case of non-linear classification, SVM applies the kernel trick: it transfers the problem from a low- to a higher-dimensional space where the classes are linearly separable. The Naïve Bayes algorithm uses a conditional probability model for classification. It computes the posterior probability of the class of a particular instance based on the prior probability, evidence and likelihood. There are several areas in genetics and medical imaging where ML strategies have been successfully applied. We discuss ML-based applications in genetics and medical imaging in the next few sub-sections.
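The Naïve Bayes computation mentioned above (posterior from prior, likelihood and evidence) can be sketched as follows. The sketch assumes continuous features with a Gaussian likelihood per feature, which is one common choice rather than the only one; the training values and class labels are hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(xs, ys):
    """Estimate a prior per class and a (mean, variance) pair per feature."""
    by_class = defaultdict(list)
    for x, y in zip(xs, ys):
        by_class[y].append(x)
    model = {}
    for label, rows in by_class.items():
        prior = len(rows) / len(xs)
        stats = []
        for column in zip(*rows):
            mu = sum(column) / len(column)
            # Small constant avoids a zero variance for constant features.
            var = sum((v - mu) ** 2 for v in column) / len(column) + 1e-9
            stats.append((mu, var))
        model[label] = (prior, stats)
    return model


def posterior(model, x):
    """P(class | x) = prior * likelihood / evidence, computed in log space."""
    log_joint = {}
    for label, (prior, stats) in model.items():
        total = math.log(prior)
        for value, (mu, var) in zip(x, stats):
            total += -0.5 * math.log(2 * math.pi * var) - (value - mu) ** 2 / (2 * var)
        log_joint[label] = total
    peak = max(log_joint.values())
    evidence = sum(math.exp(v - peak) for v in log_joint.values())
    return {label: math.exp(v - peak) / evidence for label, v in log_joint.items()}


# Hypothetical one-feature training set with two classes (0 and 1).
train_x = [[1.0], [1.2], [0.8], [3.0], [3.2], [2.8]]
train_y = [0, 0, 0, 1, 1, 1]
nb_model = fit_gaussian_nb(train_x, train_y)
probabilities = posterior(nb_model, [1.1])
```

Working in log space and subtracting the largest value before exponentiating keeps the evidence term from underflowing when many features are multiplied together.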
An effective decision support system in healthcare requires accurate characterization of gene expression data and its correspondence with the patient community. A generalized model for genome classification in the context of the training/testing paradigm is given in Figure 8. Each part of the model is described as follows:
ML-based genome classification in the training/testing paradigm.
The objective of genetic data preprocessing is to prepare data suitable for a given application. It improves the performance of the applied computing approach by conforming closely to data format standards. In a supervised learning environment, it is essential to pre-process data in adaptive mode (47, 48). Conventionally, adaptive pre-processing handles dynamic data in the context of structural changes in a gene sequence over time (49). Applying adaptive pre-processing to bulk data may improve performance on large and complex genetic datasets. A recently developed software framework, Micro-PreP (50), may be coupled with adaptive pre-processing in the following steps: (1) filtering of bad and empty spots and parsing of spot descriptions (PrePrep module), (2) normalization, scaling, data visualization and data exploration (Prep module) and (3) handling of replicate slides and intra-slide replicate spots, low signal filtering, outlier detection, slide quality indication and output of additional tables for Cyber-T, SAM, engene and ANOVA (PostPrep module). Other available frameworks for this purpose include GENALEX 6 (51), Genomic Analysis Toolkit (52), Arlequin (53, 54) and many more.
Although there is no rule of thumb for fixing the criterion for dataset division, some primary assumptions can standardize the validation of a proposed model. Here we discuss two competing cases: (1) when little training data is available and (2) when a large amount of training data is available. In the former case, statistical estimates show higher variance in their parameter values, so it makes sense to balance the variance between training and testing data. In the latter case, balancing variance is less important because it is normally distributed over the dataset, but the data also suffer from the curse of dimensionality, so the practitioner should apply training algorithms that converge well on large datasets.
To study pairwise correlation between genetic features, weighted gene co-expression network analysis (WGCNA) is a widely applied approach for handling high-dimensional genetic data. It creates a Pearson correlation matrix between the available variables for all samples and designs a topological network between expression modules based on correlation. Other prominent applications of WGCNA are data reduction, clustering, feature selection, genomic data integration and support for data meta-analysis techniques. WGCNA follows a five-step procedure: (1) correlation matrix formation, (2) co-expression selection, (3) identification of survival-related modules and their characteristics, (4) clustering of representative genes and (5) creation of a gene network prognostic score. A detailed analysis of these steps can be found in (55, 56, 57).
In this stage, a threshold parameter is defined for mapping all the prominent features selected in the previous step. This mapping function is used to measure the adjacency level of all the genes. The individual adjacency levels are stored as a topological overlap matrix (TOM), which helps to classify similar gene expressions according to adjacency level and its features. Some important features are gene modules, gene labels (if available), the dissimilarity index and so on. The network can be optimized for dimension minimization using graph reduction approaches. A compressed WGCNA graph with optimal features and adjacency index is then used for classification.
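The path from correlation to adjacency to topological overlap can be sketched in a few lines. The sketch assumes an unsigned network with a soft-threshold power `beta` (here 6, an illustrative value) and the standard unsigned topological overlap formula; the expression profiles are made up, and each profile is assumed to be non-constant so that the correlation is defined.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    sd_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    sd_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (sd_a * sd_b)


def adjacency_matrix(expression, beta=6):
    """Soft-thresholded adjacency |cor|^beta with a zero diagonal."""
    n = len(expression)
    return [
        [0.0 if i == j else abs(pearson(expression[i], expression[j])) ** beta
         for j in range(n)]
        for i in range(n)
    ]


def topological_overlap(adj):
    """TOM[i][j] = (shared neighbours + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    n = len(adj)
    connectivity = [sum(row) for row in adj]
    tom = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(n) if u != i and u != j)
            denom = min(connectivity[i], connectivity[j]) + 1.0 - adj[i][j]
            tom[i][j] = tom[j][i] = (shared + adj[i][j]) / denom
    return tom


# Hypothetical expression profiles: one row per gene, one column per sample.
expression = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],
    [5.0, 3.0, 4.0, 1.0, 2.0],
    [1.0, -1.0, 1.0, -1.0, 1.0],
]
adj = adjacency_matrix(expression)
tom = topological_overlap(adj)
```

A dissimilarity of `1 - tom[i][j]` is then a natural input to hierarchical clustering for module detection.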
In machine learning, several types of classification algorithms are available; for example, the performance of Decision Trees and SVM has been compared on structural gene data (58). The authors concluded that SVM outperforms the decision tree method in classifying larger training data. A Bayesian network was used to analyze genetic data, and multiple statistical measures were developed for model validation (59). Similarly, Newton et al. applied a Bayesian model to Escherichia coli genetic data and observed fluctuations in fluorescent intensities at each spot on the microarray. Several frameworks have been proposed to train the parameters of these algorithms. Some promising frameworks that can be used on genetic data are (1) genetic algorithms in neural network classification (60), (2) fuzzy rules for classification and parameter tuning (61, 62), (3) parameter tuning in SVM (63) and many more.
Several risk prediction models (64) can be used to assess the performance of the applied estimation approach. Popular measures include the Brier score (65) for the prediction performance of the applied framework, the C-statistic for the discriminative ability of a model (based on the ROC curve) (66), goodness-of-fit statistics for calibration (67, 68), the Net Reclassification Index (NRI) (69) and the Integrated Discrimination Index (IDI) (70, 71).
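Two of these measures are simple enough to sketch directly. The Brier score is the mean squared difference between predicted probability and observed outcome, and the C-statistic is the fraction of case/non-case pairs in which the case receives the higher predicted risk (ties count as one half). The predicted probabilities and outcomes below are hypothetical.

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between predicted risk and the 0/1 outcome."""
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)


def c_statistic(probabilities, outcomes):
    """Probability that a random case is ranked above a random non-case."""
    cases = [p for p, o in zip(probabilities, outcomes) if o == 1]
    controls = [p for p, o in zip(probabilities, outcomes) if o == 0]
    concordant = sum(
        1.0 if case > control else 0.5 if case == control else 0.0
        for case in cases
        for control in controls
    )
    return concordant / (len(cases) * len(controls))


# Hypothetical predicted WD risks and true labels (1 = WD, 0 = non-WD).
predicted = [0.9, 0.8, 0.3, 0.1]
observed = [1, 0, 1, 0]
brier = brier_score(predicted, observed)
auc = c_statistic(predicted, observed)
```

A lower Brier score indicates better overall prediction, while a C-statistic of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking.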
ML has been applied to identify single nucleotide polymorphisms (SNPs) in genetic data that are responsible for several diseases including cancer (72, 73). Ensemble ML techniques (74) have also been applied to gene expression data to characterize cancerous and non-cancerous tumor cells.
ML techniques are widely applied in medical imaging. The training/testing paradigm for ML is shown in Figure 9. As with gene data, images are pre-processed and divided into training and test data. Features are extracted using various properties of the image, e.g., shape, texture, curvature etc. ML algorithms are then applied to these features for characterization.
ML Training/testing paradigm for medical images.
Suri and his team applied ML techniques to chronic liver disease classification from ultrasound (US) images (75). Haralick (76), Fourier transform (77), Gabor texture (78), discrete cosine (79) and Gupta (80) features were extracted from the US images. A backpropagation neural network was applied to characterize liver images as normal/abnormal. Suri and his team also characterized fatty liver disease from US images (81, 82). In the first case, high-order spectra and discrete wavelet transform features were extracted and then decision tree (83) and fuzzy (84) classifiers were applied, with the fuzzy classifier obtaining higher accuracy. In the second case, Gabor, GLCM (85) and GRLM features (86) were extracted and ELM and SVM methods were applied, with ELM scoring higher accuracy. In the next subsection, we describe DL and its various applications in genetics and bio-medical images.
Over the past few years, DL techniques have become very popular in the machine learning community. This is owed to the decrease in computer hardware costs along with the emergence of Big Data (87). DL has many advantages over ML techniques. The biggest advantage is independence from feature extraction algorithms. DL algorithms apply multiple layers of abstraction to the input data to capture non-linear relationships within the data components, so the deep features obtained are of high quality. Feature extraction algorithms depend on domain knowledge, whereas DL does not require prior knowledge about the data. This makes DL an open-to-all platform for people who do not have any prior information about the data. This independence also allows DL to work on multiple combinations of data for better training without concern for the commonality or differences of the features. Further, DL has multi-tasking ability, i.e., the features extracted from a DL framework can be used for both classification and segmentation. DL models have also been shown to work on different kinds of data, e.g., CNNs have been applied to images (88), signal data (89), genetics (90) etc. Additionally, DL systems are capable of transfer learning (91). DL also has disadvantages: it is highly costly in terms of computing time and space. It may take several hours or even days to train a DL-based model, depending on the processing power of the graphics processing unit (GPU). Initial training results may also vary for the same data, and the model may take more time to converge to an optimal point. Many different kinds of DL models are available: CNN, autoencoders (92), deep belief network (DBN) (93), residual neural network (ResNet) (94), long short-term memory (LSTM) (95) etc. The CNN, as discussed earlier, uses convolution operations to extract features and pooling to downsample the features.
Additionally, a CNN applies a rectified linear unit (ReLU) as the activation function after each convolution operation. ReLU is used specifically to tackle the issue of vanishing gradients (96). DBNs are inspired by the Restricted Boltzmann Machine (RBM) (97). A DBN uses a stack of RBMs consisting of multiple hidden layers and a single visible layer, and it is used for unsupervised learning. Autoencoders are also used for unsupervised learning; in an autoencoder, the number of output nodes equals the number of input nodes. If the number of hidden nodes is smaller than the number of input/output nodes, then the network learns a compressed form of the input. Conversely, if the number of hidden nodes is greater, the network can capture interesting relationships within the input data. DL models become harder to train as depth increases, and the training error accumulates. To prevent this accumulation, the original mapping is recast as the sum of the input and a residual, i.e., the difference between the desired output mapping and the input, so that each block only has to learn the residual. This strategy is the principle of ResNets, and it allows layers to be added without increasing the training error. LSTMs are employed for time series data, speech control, natural language processing, etc., and can be employed for predicting genetic sequences. An LSTM unit consists of a memory cell and three gates that control the memory, the input and the output.
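The building blocks described above can be illustrated with a small NumPy sketch: a valid 2-D convolution, ReLU activation, max pooling and a residual block of the form y = ReLU(F(x) + x). It is a didactic, untrained example under our own assumptions (single channel, no padding, dense weights in the residual branch); the function names are hypothetical and are not taken from any cited framework.

```python
import numpy as np

def relu(x):
    """Rectified linear unit: keeps positive values, zeroes out the rest."""
    return np.maximum(x, 0.0)

def conv2d_valid(image, kernel):
    """Single-channel 2-D cross-correlation without padding (a CNN 'convolution')."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

def max_pool2d(x, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    h = (x.shape[0] // size) * size
    w = (x.shape[1] // size) * size
    blocks = x[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

def residual_block(x, w1, w2):
    """y = ReLU(F(x) + x), where the residual branch is F(x) = w2 @ ReLU(w1 @ x)."""
    return relu(w2 @ relu(w1 @ x) + x)

# Toy input: convolution -> ReLU -> pooling on a 6x6 "image".
image = np.arange(36, dtype=float).reshape(6, 6)
kernel = np.array([[-1.0, 0.0],
                   [0.0, 1.0]])          # responds to diagonal intensity changes
feature_map = max_pool2d(relu(conv2d_valid(image, kernel)))

# With zero weights the residual branch vanishes and the block passes its
# (non-negative) input through unchanged, which is why extra residual blocks
# need not make the training error worse.
x = np.array([1.0, 2.0, 3.0])
identity_out = residual_block(x, np.zeros((3, 3)), np.zeros((3, 3)))
```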
In genetics, DL has been employed to annotate and interpret the non-coding genome (98), to model nucleic acid binding interactions (90) and to classify cancer types based on gene expression data (99). Traditional computing procedures in gene computation face several challenges, such as (1) effective merging of next-generation sequencing technology with large genomic datasets, (2) proper characterization of genomic biomarkers such as exon locations, promoter and enhancer regions and their status, and nucleosome positions, (3) the high cost of pattern-mapping approaches across different sets of genomes, and (4) advanced knowledge management. To deal with these challenges, deep learning may draw on graphical methods such as genetic residual network design (100, 101), in which node-to-node links represent connections between different nucleotides and the weight of each link depends on the genetic map distance (102, 103, 104), together with efficient procedures for decomposing a gene regulatory network into simple functional paths and for network minimization (105, 106).
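As a rough illustration of the graphical ideas mentioned above, the sketch below builds links whose weights depend on genetic map distance and enumerates the simple paths of a toy regulatory network. The weighting rule 1 / (1 + d), the distance cut-off, the marker and gene names, and both function names are assumptions introduced for this example; the cited works may define these quantities differently.

```python
from itertools import combinations

def distance_weighted_edges(map_positions_cm, max_distance_cm=50.0):
    """Link pairs of loci and weight each link by genetic map distance.

    `map_positions_cm` maps a locus name to its map position in centimorgans.
    The weight 1 / (1 + d) is a hypothetical choice: closer loci get
    stronger links, and pairs beyond `max_distance_cm` are not linked.
    """
    edges = {}
    for a, b in combinations(sorted(map_positions_cm), 2):
        d = abs(map_positions_cm[a] - map_positions_cm[b])
        if d <= max_distance_cm:
            edges[(a, b)] = 1.0 / (1.0 + d)
    return edges

def simple_paths(graph, source, target, _path=None):
    """List every cycle-free path from `source` to `target` in a directed graph.

    `graph` maps each node to the list of nodes it regulates.
    """
    path = (_path or []) + [source]
    if source == target:
        return [path]
    paths = []
    for nxt in graph.get(source, []):
        if nxt not in path:              # skip nodes already visited on this path
            paths.extend(simple_paths(graph, nxt, target, path))
    return paths

# Hypothetical loci and regulatory network (not real WD data).
edges = distance_weighted_edges({"m1": 0.0, "m2": 1.0, "m3": 100.0})
network = {"geneA": ["geneB", "geneC"], "geneB": ["geneD"], "geneC": ["geneD"]}
paths = simple_paths(network, "geneA", "geneD")
```

Each returned path is one candidate "functional path" from a regulator to a target; a network minimization step could then rank or prune these paths, which is outside the scope of this sketch.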
In this sub-section, we propose a fused architecture that uses both the ML and DL paradigms for WD detection. The architecture has two pathways. In the first, an ML classifier takes genetic data and patient symptoms as input and provides a normal/abnormal result based on previously trained data. Logistic regression is then applied to grade the result into low- and high-risk zones. This information, along with the symptoms and the patient's liver US images, is fed into the second pathway, which consists of a DL-based classifier. Finally, based on the outputs, the patient is either diagnosed with WD or declared normal. A process model of the approach is given in Figure 10.
Figure 10. WD detection process model.
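The two-pathway flow of Figure 10 can be sketched in code as follows. This is a minimal illustration under stated assumptions, not an implementation of a trained system: the classifiers are passed in as placeholder callables, the logistic regression weights and the 0.5 risk threshold are hypothetical, and we assume that the final decision is the output of the DL-based classifier, since the text does not specify how the outputs are combined.

```python
import math
from typing import Callable, Dict, Sequence

def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def grade_risk(features: Sequence[float], weights: Sequence[float],
               bias: float, threshold: float = 0.5):
    """Logistic regression grading into a 'low' or 'high' risk zone.

    The weights, bias and threshold are hypothetical; in practice they
    would be fitted on labelled patient data.
    """
    z = bias + sum(w * x for w, x in zip(weights, features))
    probability = sigmoid(z)
    return ("high" if probability >= threshold else "low"), probability

def detect_wd(features: Sequence[float], us_image: object,
              ml_classifier: Callable, dl_classifier: Callable,
              weights: Sequence[float], bias: float) -> Dict[str, object]:
    """Run the two pathways in sequence and return the intermediate results."""
    # Pathway 1: ML classifier on genetic data and symptoms, then risk grading.
    abnormal = bool(ml_classifier(features))
    zone, probability = grade_risk(features, weights, bias)
    # Pathway 2: DL classifier on the US image plus the pathway-1 information.
    # Assumption: its output is taken as the final decision.
    wd_positive = bool(dl_classifier(us_image, features, abnormal, zone))
    return {
        "abnormal": abnormal,
        "risk_zone": zone,
        "risk_probability": probability,
        "diagnosis": "WD" if wd_positive else "normal",
    }

# Placeholder classifiers standing in for trained models (illustrative only).
toy_ml = lambda f: sum(f) > 1.0
toy_dl = lambda image, f, abnormal, zone: abnormal and zone == "high"

result = detect_wd([1.0, 2.0], us_image=None, ml_classifier=toy_ml,
                   dl_classifier=toy_dl, weights=[1.0, 1.0], bias=-1.0)
```

Passing the classifiers as callables keeps the sketch independent of any particular ML or DL library; trained models would simply replace `toy_ml` and `toy_dl`.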
The main objective of this review was to elaborate on the involvement of deep learning applications in the study of a rare disease (WD). Apart from theoretical considerations, we proposed hypotheses that may drive change in the discovery of rare diseases and their corresponding diagnostic criteria. In the WD genetics section, our hypotheses promoted the study of cis-regulatory regions with the DECRES package, which may be effective in finding biomarkers that correlate mutation variations with the epidemiology of WD. Another sub-aim was to study the characterization of WD phenotype-genotype traits by deep learning algorithms. In the diagnosis section, we presented a combined study of genetic and imaging biomarkers and their application to the classification of affected regions in the WD or alcoholic domain. We consider that strong WD prediction requires a combined study of biological factors and predictive models. Such models can be more effective in genetics declaration and cirrhosis classification.
In the WD treatment section, we described in detail the frequently used chelating agents, such as D-penicillamine, trientine and zinc, and their ability to bind copper ions and form a heavy complex. These chelating agents are not entirely safe and are associated with specific new symptoms such as fever, rash, degenerative changes in the skin, serous retinitis, anemia and hepatotoxicity. To address these symptoms, we discussed newly developed treatment alternatives and described how they can be suitable replacements for conventional treatment strategies.
We showed different possible cases associated with the study of WD in terms of the biomarkers seen in genetic models and in hepatic and neurological studies, in order to classify the different possibilities. We also proposed a hypothesis for the purpose of diagnosis. In future, various statistics-based enhancements could extend this article. These enhancements may include imaging-based algorithms in imaging studies, WD signal band spectrum findings and their mapping to new cases, and simulation-based classification algorithms, which may provide optimized and interesting results.
The authors did not receive financial support from any organization or institute for writing this article. The authors also declare that there is no conflict of interest. We would like to express our special thanks to the staff of the Neurological Research Division, AtheroPoint™, Roseville, CA, USA, for their support with graphics design.