Heart Diseases Diagnosis Expert System Based on Multichannel Adaptive Resonance Theory (MART)

Authors: Nasser Nafaa Khamiss and Amir Fared Partu

Abstract: This research presents an artificial intelligence system that helps a physician diagnose heart diseases early from an ECG database. It draws on the properties of common neural network methods and of Multichannel Adaptive Resonance Theory (MART), an Adaptive Resonance Theory (ART) based neural network for adaptive classification of multichannel signal patterns without prior supervised learning. The mechanical and electrical aspects, which include ECG wave generation, the sensor leads on the body and all types of noise, are examined in terms of their effects on ECG signal analysis. Premature Ventricular Contraction (PVC) disease is also addressed. The operation of MART was tested by diagnosing a set of real patterns (QRS intervals of PVC disease) taken from many patients in a Holter ECG database. An off-line method was used to train the MART system. A two-channel MART system is used to quantify the different, changing reliabilities of the individual signal channels; at the same time, the credibility parameter of the algorithm is determined and then used during PVC pattern classification. This reduces the creation of spurious or duplicate categories (a major problem for ART-based classification of noisy channels) and also reduces the processing time to just 0.124 sec. The accuracy and sensitivity of MART are improved to 93.4 and 95.5%, respectively, which makes MART well suited to biomedical signals.

How to cite this article:

Nasser Nafaa Khamiss and Amir Fared Partu, 2009. Heart Diseases Diagnosis Expert System Based on Multichannel Adaptive Resonance Theory (MART). Asian Journal of Information Technology, 8: 37-46.

URL: https://medwelljournals.com/abstract/?doi=ajit.2009.37.46

INTRODUCTION

The subject of heart disease detection and classification has attracted a lot of attention over the last decade from different research centers, and different techniques have been introduced and evaluated; some of these techniques are described in Ali (2005), Prasad and Sahamb (2003) and Acharya et al. (2004). At the same time, one of the main fields of application of artificial neural networks is pattern recognition. Neural networks designed for pattern recognition include networks based on Adaptive Resonance Theory (ART) (Karuiannis and Anastasios, 1997), Hopfield networks (Haykin, 1999), Self Organizing Maps (SOMs) (FA-Long et al., 1997) and Multilayer Perceptrons (MLPs) (Rao and Hayagriva, 1997). Most of these designs require that the patterns to be recognized be learned during an initial process of supervised training and have little capacity to adapt the learned templates to the input received during a particular classification session. SOMs do not require supervision during initial training, but they still depend on the existence of an appropriate training set and their capacity for post-training adaptation is at best very limited. To date, the only networks whose templates have a significant capacity to evolve in response to successive deformations of the input received during classification sessions are ART-based designs. The neural computational model ART was introduced by Carpenter and Grossberg in the 1980s and developed through various versions, such as ART1, ART2 and ART3 (Karuiannis and Anastasios, 1997). These networks contribute a number of valuable properties compared with other neural architectures, among them on-line and self-organizing learning. They also make it possible to resolve the dilemma between plasticity and stability, allowing both the updating of the classes already learned and the immediate learning of new classes without distorting the existing ones.

This property enables them to be used in problems in which the number of classes is not limited a priori or in which the classes evolve over time. These characteristics are shared by a great number of variants of the ART architecture, which are acquiring more operation and application possibilities.

ART-based networks require no preliminary training because they begin each classification session with null templates; the first pattern received becomes the template of pattern category 1. This template adapts on-line as successive, slightly variant specimens are assigned to category 1; if a specimen is received that differs sufficiently from the current category 1 template, it becomes the initial template of category 2 and so on. Thus, ART-based networks not only allow category templates to adapt to current circumstances, but also allow on-line creation of categories during a classification session. This twofold flexibility is important in many fields of application (notably biomedical signal classification) in which the patterns to be recognized differ from session to session and can gradually evolve within a session.

ECG waveform and the related interference noise: The ECG is the graphic record of the electrical activity of the human heart. This record comprises a series of waveforms. Six different waveforms are discernible and are labeled P, Q, R, S, T and U (together they make up the ECG wave). In a subject with a normal heart, a fixed pattern of waveforms emerges. A typical ECG waveform, with the time taken from start to finish, is shown in Fig. 1. Each of the ECG segments is described by Mchancer (1992) and Bement and Clyde (2007):

•	The S-T segment describes the period between completion of the depolarization and repolarization of the ventricular muscle

•	The Q-T interval on the ECG is equal to QRS complex duration plus the S-T interval

•	The T-P segment on the ECG represents the elapsed time from the completion of repolarization of ventricular muscle to the onset of next ECG cycle


Fig. 1:	Typical ECG signal waveform

Several kinds of noise interfere with the ECG. Those considered in this research, as described by Tam and Hakw (1977) and Mchancer (1992), are motion artifact, power line interference, Electromyography (EMG) noise, electrode and polarization voltage, leakage current, electrostatic induction, electromagnetic induction and thermal and shot noise.

Heart and premature ventricular contraction disease
Prolonged and bizarre patterns of the QRS complex: The QRS complex lasts as long as depolarization continues to spread through the ventricles, that is, as long as part of the ventricles is depolarized and part is still polarized. Therefore, the cause of a prolonged QRS complex is always prolonged conduction of the impulse through the ventricles. Such prolongation often occurs when one or both ventricles are hypertrophied or dilated, owing to the longer pathway that the impulse must then travel. The normal QRS complex lasts 0.06-0.08 sec, whereas in hypertrophy or dilatation of the left or right ventricle, the QRS complex may be prolonged to 0.09-0.12 sec (Guyton and John, 2000).

When the Purkinje fibers are blocked, the cardiac impulse must be conducted by the ventricular muscle instead of by way of the Purkinje system. This decreases the velocity of impulse conduction to about one-third to one-fourth of normal. Therefore, if complete block of one of the bundle branches occurs, the duration of the QRS complex is usually increased to 0.14 sec or greater. In general, a QRS complex is considered to be abnormally long when it lasts >0.09 sec; when it lasts >0.12 sec, the prolongation is almost certain to be caused by pathological block of the conduction system somewhere in the ventricles.

Premature Ventricular Contractions (PVC): The electrocardiogram of Fig. 2 shows a series of Premature Ventricular Contractions (PVCs) alternating with normal contractions.


Fig. 2:	Premature ventricular contraction demonstrated by the large abnormal QRS complexes

PVCs cause specific effects in the electrocardiogram, as follows (Guyton and John, 2000):

•	The QRS complex is usually considerably prolonged. The reason is that the impulse is conducted mainly through the slowly conducting muscle of the ventricle rather than through the Purkinje system

•	The QRS complex has a high voltage for the following reasons: when the normal impulse passes through the heart, it passes through both ventricles simultaneously; consequently, in the normal heart the depolarization waves of the two sides of the heart partially neutralize each other. When a PVC occurs, the impulse travels in only one direction, so that there is no such neutralization effect and one entire side of the heart is depolarized ahead of the other, while the other entire side is still polarized; this causes intense electrical potentials

•	After almost all PVCs, the T wave has a potential polarity opposite to that of the QRS complex because the slow conduction of the impulse through the cardiac muscle causes the area first depolarized also to repolarize first. As a result, the direction of current flow in the heart during repolarization is opposite to that during depolarization and the potential of the T wave is reversed relative to that of the QRS complex. This is not true of the normal T waves

Some PVCs are relatively benign in their origin and result from factors such as cigarettes, coffee, lack of sleep, various mild toxic states and even emotional irritability. On the other hand, many other PVCs result from stray impulses or re-entrant signals that originate around the borders of infarcted or ischemic areas of the heart.

Therefore, the presence of such PVCs is not to be taken lightly. Statistics show that people with significant numbers of PVCs have a much higher than normal chance of developing spontaneous lethal ventricular fibrillation, presumably initiated by one of the PVCs. This is especially true when the PVCs occur during the vulnerable period for causing fibrillation, just at the end of the T wave when the ventricles are coming out of refractoriness.

Multichannel Adaptive Resonance Theory (MART)
An overview of MART: The typical ART architecture, on which MART is based, is sketched in Fig. 3. Its input layer, F1, is itself a multilayer structure in which each component of the input vector is processed by a multi-node block performing noise attenuation, normalization and contrast enhancement. Each node of its output layer, F2, represents one of the categories it distinguishes.


Fig. 3:	Structure of ART2

In the MART proposed here for diagnosis, the number of categories can grow on-line; this capacity is provided by the orienting system, which runs in parallel with the direct F1-F2 connection. The main structural differences between the typical ART2 network and the MART used here, depicted in Fig. 4 and 5, are as follows:

•	The addition of a further layer, F3, for the integration of the information which is provided by multiple channels of the input

•	The separation of layers F1 and F2 into channel-specific blocks that communicate directly only with F3 and the orienting system, not with each other

•	The connection of the orienting system between layers F1 and F3 instead of F1 and F2

Classification of an input pattern by MART involves iterations of a two-stage cycle. The first stage is the upward flow (Fig. 5), in which the pattern to be classified is tentatively assigned to the category that, among the currently recognized categories that are not inhibited, has the template most closely resembling the input. The second stage is the downward flow (Fig. 5), in which the similarity between template and input is reassessed with the contribution of each channel weighted in accordance with the credibility it has accumulated during classification of previous patterns. If the pattern fails this test, the category in question is inhibited and the procedure returns to stage 1. This cycle is repeated until either a category is found for which the input pattern passes the second-stage test, in which case the pattern is assigned to this category (whose template may then be updated to take it into account), or the set of currently recognized categories is exhausted, in which case a previously inactive top-layer F3 node becomes active as the representative of a new category to which the input pattern is assigned (Fernandez and Barro, 1998; Fernandez et al., 2000).

The network that implements the above procedure consists of I blocks (I = 1-12), each block i consisting of two layers of fully interconnected nodes, F1_i and F2_i, together with a third node layer (F3) and an orienting system bridging the F1_i layers and F3. The F1_i layer of each block receives input from a single channel and has as many nodes as there are points in a single-channel input pattern (J). K is the number of nodes in F3 and in the F2_i layer of each block, so K is the maximum number of categories that can be distinguished. The orienting system consists of one node for each channel block plus a single global node. In each single-channel block i, each node in layer F1_i is connected to each node in F2_i by a two-way link.

The upward link between F1_ij (the jth F1 node in channel block i) and F2_ik (the kth F2 node in channel block i) is associated with a weight z_ijk that depends on the input supplied to F1_ij by previous patterns assigned to category k. The downward link is associated with a weight z'_ijk that is numerically the same as z_ijk but is distinguished notationally for later convenience. The vector z_ik = (z_i1k, …, z_iJk) constitutes the adaptive template for category k in channel i. Each node F2_ik is also connected by an unweighted two-way link to node F3_k, the global representative of category k. In the orienting system, the channel i node receives input from each of the nodes in F1_i and has a single output line to the global node R_e, which sends out a signal to the nodes in F3.

MART mathematical representations: Here, MART as applied for the purpose of diagnosis is described in detail, including the mechanisms that guarantee the stability of its categories and prevent category proliferation, together with the mathematics of signal flow as treated in this research.

Upward processing in each channel block: Figure 6 shows the inputs to and outputs from F1_ij. The bottom-up input, I_ij, is the jth point of the input pattern in channel i. The top-down input, sent down from layer F2_i of channel block i, will be discussed later. The ascending output of F1_ij is sent up to each node in F2_i.

Figure 7 shows the inputs to and outputs from F2_ik. From below, F2_ik receives input from the nodes in F1_i; from above, it receives input from node F3_k.


Fig. 4:	Block diagram of the MART


Fig. 5:	Complete structure of suggested system (MART)


Fig. 6:	Connections of node F1_ij


Fig. 7:	Connections of node F2_ik

Here, T_ik is a measure of the similarity between I_i and the current channel i template for category k, as follows:

$$T_{ik} = J - \sum_{j=1}^{J} \left| I_{ij} - z_{ijk} \right| \qquad (1)$$

T_ik is in fact J minus the ‘city-block’ distance between I_i and z_ik, so T_ik ∈ (0, J): I_i is normalized before input so that each I_ij ∈ (0,1), with max_j {I_ij} = 1 and min_j {I_ij} = 0, j = 1, …, J, and the same is true for z_ik.

Competition in F3: The inputs to and outputs from each of the K nodes in F3 are shown in Fig. 8. The primary inputs to F3_k are the T_ik values sent up from the I nodes F2_ik (i = 1, …, I). These are summed to give the global similarity between the current input pattern and the current template for category k:

$$P_k = \sum_{i=1}^{I} T_{ik} \qquad (2)$$


Fig. 8:	Connections of nodes F3_K

The input labeled r_ek in Fig. 8 is a binary signal from the global orienting node R_e. When it is zero it has no effect, but when set to unity it totally inhibits F3_k. The other inputs and outputs shown in Fig. 8 implement a winner-take-all competition among the nodes in F3 to find which has the largest P_k value, using self-reinforcing and inhibitory connections with weights α ≈ 1 and 0. The transient behavior is thus followed by a steady state in which the outputs u_k of all F3_k nodes are zero, except for the node with the largest P_k among those not inhibited by R_e, whose output u_k is 1.
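The upward stage described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' MATLAB implementation: the array layout (`templates[i, j, k]` holding z_ijk), the function name `upward_pass` and the use of a simple arg-max in place of the winner-take-all dynamics are all assumptions made here.

```python
import numpy as np

def upward_pass(patterns, templates, inhibited):
    """Tentatively pick the best-matching, uninhibited category.

    patterns  : (I, J) array, one normalized pattern per channel
    templates : (I, J, K) array, templates[i, j, k] plays the role of z_ijk
    inhibited : (K,) boolean array, True where R_e has inhibited F3_k
    """
    n_points = patterns.shape[1]
    # T_ik: J minus the city-block distance between I_i and z_ik
    T = n_points - np.abs(patterns[:, :, None] - templates).sum(axis=1)
    # P_k: global similarity, summed over channels
    P = T.sum(axis=0)
    if inhibited.all():
        return T, P, None  # no candidate category left
    # Arg-max over uninhibited nodes stands in for the steady state
    # of the winner-take-all competition.
    winner = int(np.argmax(np.where(inhibited, -np.inf, P)))
    return T, P, winner
```

Calling `upward_pass` again with the previous winner marked as inhibited returns the next-best category, which mirrors the repeated assignment cycle.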

Downward processing in each channel block:

(3)

Following the competition in Eq. 3, the output u_k sent down from F3_k to the nodes F2_ik (i = 1, …, I) is the Kronecker delta δ_kk*, where k* is the index of the winner of the internodal competition. The activity of F2_ik is s_ik = u_k and is sent down toward F1_i, so v_ij, the total input to F1_ij from layer F2_i (Fig. 6), is given by:

$$v_{ij} = \sum_{k=1}^{K} s_{ik}\, z'_{ijk} = z'_{ijk^{*}} \qquad (4)$$

Thus the vector of inputs from F2_i to F1_i, (v_i1, …, v_iJ), is (z'_i1k*, …, z'_iJk*), the channel i template of the uninhibited category best matching the input pattern. The downward output from F1_ij, d_ij, is the absolute difference between I_ij and the corresponding value of the template for category k*: d_ij = |I_ij − z'_ijk*| (Fig. 6). Hence d_i, the total signal sent to the channel i node of the orienting system, is given by:

$$d_i = \frac{1}{J} \sum_{j=1}^{J} d_{ij} \qquad (5)$$

where the factor 1/J keeps d_i ∈ (0,1).

The orienting system: The definitive measure of overall concordance between the current input pattern and the k*th template is not just the sum of the d_i, because the different credibilities of the various channels must be taken into account. Some channels may be much noisier than others and the information they provide accordingly less reliable. If the classifier does not minimize the contribution of a noisy channel or channels, it will create a large number of categories whose templates differ only because of noise. To minimize the influence of noisy channels and prevent the creation of spurious categories, MART’s definitive test of the tentative decision reached by the F3 layer consists of comparing a threshold (the global ‘vigilance’ parameter, ρ_g) with a weighted sum of the d_i:

$$d = \sum_{i=1}^{I} x_i\, d_i \qquad (6)$$

The weights x_i used to calculate d evolve in accordance with the credibility accumulated by channel i during classification of successive complexes: the less noisy channel i seems to be, the greater x_i is. The comparison between ρ_g and d is performed by the global orienting node R_e, which accepts the tentative assignment if d<ρ_g and rejects it if d≥ρ_g. In the latter case, F3_k* is inhibited by setting r_ek* to one and the assignment cycle is repeated.
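As a rough illustration of the downward stage and the vigilance test, the following Python sketch computes the per-channel differences d_i as the mean absolute difference between input and winning template, and the global difference d as their credibility-weighted sum. The function names and the example values in the usage note are hypothetical; only the acceptance rule d<ρ_g is taken from the text.

```python
import numpy as np

def global_difference(patterns, winner_templates, credibilities):
    """Return (d_i, d) for one tentative assignment.

    patterns         : (I, J) array of normalized input patterns
    winner_templates : (I, J) array, the template of category k* per channel
    credibilities    : (I,) array of channel weights x_i
    """
    # d_i: mean of |I_ij - z'_ijk*| over the J points of channel i
    d_i = np.abs(patterns - winner_templates).mean(axis=1)
    # d: weighted sum of the d_i, with the credibilities as weights
    d = float(np.dot(credibilities, d_i))
    return d_i, d

def passes_vigilance(d, rho_g):
    """Role of the global orienting node: accept if d < rho_g."""
    return d < rho_g
```

For example, with two channels of equal credibility 0.5 and per-channel differences of 0 and 0.1, d is 0.05, which passes a vigilance threshold of 0.15.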

Evolution of templates: The initial value of each z_ijk and z'_ijk is 1/(J+1). When each F3_k is first committed as the representative of a new category, the corresponding z_ijk and z'_ijk adopt the values I_ij of the current input pattern, as described above. Thereafter, the template for each category k is continually updated to take into account the information supplied by the patterns successively assigned to that category after passing the required test. In the interests of template stability, however, not all input patterns contribute to the template for their category: for a template to be updated to take a newly accepted pattern into account, the pattern must satisfy a stricter similarity criterion than is applied by the orienting system, namely d<ρ_a, where ρ_a<ρ_g. Furthermore, since global concordance (d<ρ_g) may have been achieved in spite of relatively large discrepancies between input pattern and template in certain channels, the template is only updated for those channels i for which d_i<ρ_g. If these conditions are satisfied, the template is updated in accordance with Eq. 7:

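$$z_{ijk^{*}}(t+1) = \frac{A_z\, z_{ijk^{*}}(t) + A_I\, I_{ij}(t) - m_{ik^{*}}}{M_{ik^{*}} - m_{ik^{*}}} \qquad (7)$$

Where:

$$m_{ik^{*}} = \min_{l} \{ A_z\, z_{ilk^{*}}(t) + A_I\, I_{il}(t) \}, \quad l = 1, \ldots, J \qquad (8)$$

$$M_{ik^{*}} = \max_{l} \{ A_z\, z_{ilk^{*}}(t) + A_I\, I_{il}(t) \}, \quad l = 1, \ldots, J \qquad (9)$$

Note: max_l {z_ilk*(t+1)} = 1 and min_l {z_ilk*(t+1)} = 0, l = 1, …, J.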
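One reading of the update that is consistent with the note above is a weighted blend of the old template and the new pattern, A_z z + A_I I, rescaled so that its minimum becomes 0 and its maximum 1. The Python sketch below implements that reading for a single channel. It is an assumption-laden illustration: the function name, the example weights in the usage note and the handling of a flat blend are not taken from the paper.

```python
import numpy as np

def update_template(z, pattern, a_z, a_i):
    """Blend template and pattern, then rescale to the range [0, 1].

    z, pattern : (J,) arrays for one channel of the winning category
    a_z, a_i   : weights playing the roles of A_z and A_I
    """
    blended = a_z * z + a_i * pattern
    lo, hi = blended.min(), blended.max()
    if hi == lo:
        # Assumption: a flat blend cannot be rescaled, so keep the old template.
        return z.copy()
    # Min-max rescaling enforces min = 0 and max = 1 after the update.
    return (blended - lo) / (hi - lo)
```

With hypothetical weights a_z = 0.8 and a_i = 0.2, a template (0, 0.5, 1) and a pattern (0, 1, 0.5) give a blend of (0, 0.6, 0.9), which rescales to roughly (0, 0.667, 1).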

Evolution of channel credibilities: The initial value of each channel credibility x_i is 1/I. Subsequently, the value of x_i increases for channels in which input patterns are consistently very similar to the template of the category to which they are first tentatively assigned by competition in F3 and decreases for channels for which the contrary is true. Updating of x_i occurs before the first tentative assignment of each input pattern is confirmed or rejected (and is excluded from any subsequent assignment cycles, so as to prevent repeated application of essentially the same information on channel reliability).

The x_i updating is performed by comparing d_i with three threshold parameters, δ₁, δ₂ and δ₃, where δ₁<δ₂<δ₃. If d_i<δ₁, the credibility x_i is increased. If δ₁≤d_i<δ₂, it is assumed that this moderate value of d_i merely reflects the variation to be expected among the patterns of any single category and x_i is left unchanged. If δ₂≤d_i<δ₃, it is assumed that the discrepancy between input pattern and template is due to noise and x_i is decreased. Finally, if d_i≥δ₃, it is assumed that such a large discrepancy might be due to the pattern not belonging to the category with whose template it is currently being compared and x_i is left unchanged.

When x_i changes, it does so by a fixed quantity (Δ_x = 0.0025) (Fernandez et al., 2000), provided only that the resulting value of x_i does not differ from 1/I by more than a quantity r_i (the credit radius of channel i) that is channel-specific and evolves in time. Thus,

(10)

Where:

(11)

(12)

The credit radius r_i evolves in accordance with Eq. 13:

(13)

Here, r_max is a fixed parameter and (d_i)(t) is a weighted average of the pattern-template discrepancies d_i that evolves from the initial value (d_i)(0) = d_i(0) in accordance with Eq. 14:

(14)

where B_i(s) are weighting values. The restriction of x_i to the interval (1/I − r_i, 1/I + r_i) is necessary to prevent large values of d being generated by large x_i, which could undesirably lead to a tentative pattern assignment being rejected merely because of slight variation among successive patterns of that category. The adaptive evolution of r_i tends to stabilize x_i in the neighborhood of values that properly represent the variability of the signals of each channel (Fernandez and Barro, 1998; Fernandez et al., 2000).
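The threshold rule for the credibilities can be illustrated with a short Python sketch. The step of 0.0025 and the constraint on the distance from 1/I come from the text; the threshold values used in the usage note and the treatment of the credit radius r_i as a given number (rather than an evolving quantity) are assumptions made purely for illustration.

```python
def update_credibility(x_i, d_i, n_channels, r_i,
                       delta1, delta2, delta3, step=0.0025):
    """Return the new credibility of one channel after one input pattern.

    x_i        : current credibility of the channel
    d_i        : channel difference between input and tentative template
    n_channels : number of channels I, so the neutral value is 1/I
    r_i        : credit radius, taken here as a fixed input
    """
    if d_i < delta1:
        candidate = x_i + step   # very good match: raise credibility
    elif delta2 <= d_i < delta3:
        candidate = x_i - step   # discrepancy attributed to noise: lower it
    else:
        return x_i               # ordinary variation or wrong category
    # A change is applied only if the result stays within r_i of 1/I.
    if abs(candidate - 1.0 / n_channels) > r_i:
        return x_i
    return candidate
```

With two channels, x_i = 0.5, r_i = 0.01 and hypothetical thresholds (0.05, 0.10, 0.20), a difference d_i = 0.02 raises x_i to 0.5025, d_i = 0.15 lowers it to 0.4975 and d_i = 0.07 or 0.30 leaves it unchanged.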

Application of MART to morphological classification of QRS complexes of PVC disease
ECG data learning: The off-line learning of MART was implemented in MATLAB 6.5. The data (ECG signals) are first taken from different patients with the cases to be diagnosed; these signals are then stored and used to determine the efficiency of the MART algorithm for PVC diagnosis.

Testing MART: The QRS complexes to which MART was applied were selected from two-channel traces of the ECG database. These traces were deemed particularly suitable for test purposes because they contain a relatively large number of ectopic ventricular beats with anomalous QRS morphology.

In this database, each beat has a label identifying its point of origin (e.g., Normal (N), Premature Ventricular Contraction (PVC) or any other case) and a numerical code (subtype) that represents the morphological class to which the beat belongs. The diagnosis in the present work deals with one category of disease, PVC. PVC has three classes (subtypes), each with a specific configuration, but in the testing procedure all of them belong to the PVC category. These classes arise from changes in the direction of wave propagation at the ventricular surface. Certain ‘morphological’ categories contain beats that obviously have very different morphologies, and beats of apparently very similar morphology have different subtypes. In view of this, the collaboration of an experienced cardiologist was requested in order to modify the morphology labeling of some beats, on the basis of the criteria commonly used in cardiology, so that the beats included in each category were all morphologically similar and different from those of other categories.

The frequency content of the ECG traces lies between 0.15 and 150 Hz, so, to obtain a more accurate ECG analysis, the traces were resampled at 500 Hz. The QRS complex of each beat in each channel was then framed in a 250 ms window and rescaled and shifted vertically to give, as input to MART, a pattern of 125 sample values lying in (0, 1).
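The framing and rescaling step can be sketched as follows in Python. A 250 ms window at 500 Hz contains 125 samples, which matches the pattern length given above. The sketch assumes the trace has already been resampled to 500 Hz and that the window is centered on a known QRS sample index; the centering rule and the function name `frame_qrs` are assumptions, since the text does not say how the window is positioned.

```python
import numpy as np

FS_HZ = 500                              # sampling rate after resampling
WINDOW_MS = 250                          # window length from the text
N_SAMPLES = FS_HZ * WINDOW_MS // 1000    # 125 samples per pattern

def frame_qrs(signal, center):
    """Cut a 125-sample window around `center` and rescale it to [0, 1]."""
    start = center - N_SAMPLES // 2
    stop = start + N_SAMPLES
    if start < 0 or stop > len(signal):
        raise ValueError("window falls outside the trace")
    window = np.asarray(signal[start:stop], dtype=float)
    lo, hi = window.min(), window.max()
    if hi == lo:
        raise ValueError("flat window cannot be rescaled")
    # Vertical shift and rescale so that min = 0 and max = 1
    return (window - lo) / (hi - lo)
```

Applying `frame_qrs` to each channel of a beat and stacking the results gives the two-channel input pattern used by the classifier.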


Fig. 9:	Upward processing of PVC disease testing

Figure 9 illustrates the passage of a preprocessed two-channel QRS complex through the first stage of MART's operation. The parameters (ρ_g, A_z, A_I, etc.) must be set for each particular application.

In the feed-forward stage, each two-channel QRS input at the F1 layer is compared simultaneously with all the templates stored in the F1-F2 connections to measure the similarities at the F2 layer; the maximum similarity is then selected at the F3 layer.

The feedback stage is then triggered to measure the difference (d) between the input and the template with maximum similarity. The result depends on the following conditions:

•	If d≤0.15 (the global vigilance parameter), the input matches this template (disease)

•	If d>0.15, the cycle (feed-forward and feedback) is repeated to find the maximum similarity among the remaining templates and so on

•	If the input matches none of the templates in the algorithm, the input QRS is added as a new template. This property is suitable for biomedical applications because biomedical signals vary from one session to another
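The three conditions above amount to a search loop over the stored templates. The Python sketch below is a simplified, sequential illustration of that loop, not the authors' implementation: templates are kept in a plain list, credibilities are passed in as fixed weights, and template and credibility updating are left out. The function name `classify` and its return convention are hypothetical.

```python
import numpy as np

def classify(pattern, templates, credibilities, rho_g=0.15):
    """Assign `pattern` to a stored template or create a new one.

    pattern       : (I, J) array, one normalized QRS window per channel
    templates     : list of (I, J) arrays; extended in place if no match
    credibilities : (I,) array of channel weights
    Returns (category index, d or None, created flag).
    """
    n_points = pattern.shape[1]
    inhibited = set()
    while len(inhibited) < len(templates):
        candidates = [k for k in range(len(templates)) if k not in inhibited]
        # Feed-forward: similarity summed over channels for each candidate
        similarity = {
            k: float((n_points - np.abs(pattern - templates[k]).sum(axis=1)).sum())
            for k in candidates
        }
        best = max(candidates, key=similarity.get)
        # Feedback: credibility-weighted difference to the best template
        d_i = np.abs(pattern - templates[best]).mean(axis=1)
        d = float(np.dot(credibilities, d_i))
        if d <= rho_g:
            return best, d, False
        inhibited.add(best)          # reject and try the next best template
    templates.append(pattern.copy())  # no match: store as a new template
    return len(templates) - 1, None, True
```

Starting from an empty template list, the first pattern always creates category 0; later patterns either match an existing template or extend the list.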

RESULTS AND DISCUSSION

For a preliminary quantitative evaluation of MART's accuracy, sensitivity and specificity in classifying the PVC category, two-channel ECG database traces from fifty patients were used. Most of the fifty patients have PVC diseases and the others have different diseases with anomalous QRS complexes, so these patients were used to test the MART algorithm. Specificity cannot be determined in the present work because the collected data do not contain patients without disease; only the database of cardiac patients with a previous history is available. The test is therefore carried out on abnormal cases. The three performance indexes normally used in clinical medicine are defined in the literature as follows (Haykin, 1999):

•	Accuracy is the ratio of the number of correct diagnoses to the total number of cases

•	Sensitivity is the ratio of the correct positive diagnoses to the total number of patients with the disease

•	Specificity is the ratio of the number of correct negative diagnoses to the total number of patients without the disease
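The three definitions above translate directly into ratios of confusion-matrix counts. The short Python sketch below shows the arithmetic; the function name is hypothetical and the counts in the usage note are invented for illustration, not taken from the study.

```python
def diagnostic_indexes(tp, tn, fp, fn):
    """Compute accuracy, sensitivity and specificity from counts.

    tp, fn : diseased cases diagnosed correctly / missed
    tn, fp : disease-free cases diagnosed correctly / wrongly flagged
    """
    total = tp + tn + fp + fn
    diseased = tp + fn
    healthy = tn + fp
    return {
        "accuracy": (tp + tn) / total,
        # Each ratio is undefined (None) when its denominator is empty.
        "sensitivity": tp / diseased if diseased else None,
        "specificity": tn / healthy if healthy else None,
    }
```

For instance, `diagnostic_indexes(tp=8, tn=0, fp=0, fn=2)` gives an accuracy and sensitivity of 0.8 and a specificity of `None`, which illustrates why specificity cannot be reported when the data set contains no disease-free cases.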


Fig. 10:	QRS template of the two channels of the three Classes 1, 2 and 3 (up-down)

The following results show some of the QRS signals used in training for many patients. Figure 10 shows the classes of PVC that were selected from the ECG database as initial templates for the algorithm. These templates were stored in the F1-F2 connections as weights, so any QRS input is compared with them to select the suitable case (disease). The width and height of the template window are 250 ms and (0-1) mV, respectively.

Figure 11 shows the matching between the selected QRS complex of patient No. 3 (dashed line) and the template belonging to class 1 (solid line).

This matching gives a dissimilarity d (the global difference according to Eq. 6) of 0.137. This value is less than the critical vigilance parameter (ρ_g = 0.15), so patient No. 3 has premature ventricular contraction disease, and greater than the template-updating vigilance parameter (ρ_a = 0.1), so the existing template remains unchanged (dotted line) according to Eq. 7 (Fig. 12).

Figure 13 shows the matching between the selected QRS complex of patient No. 12 (dashed line) and the template belonging to class 1 (solid line).


Fig. 11:	Training of patient No. 3


Fig. 12:	Training of patient No. 12


Fig. 13:	Anomalous signal

This matching gives a dissimilarity d (the global difference according to Eq. 6) of 0.073. This value is less than the critical vigilance parameter (ρ_g = 0.15), so patient No. 12 has premature ventricular contraction disease, and less than the template-updating vigilance parameter (ρ_a = 0.1), so the existing template is updated (dotted line) according to Eq. 7. The new template is therefore stored in the algorithm, replacing the previous template, as illustrated in Fig. 12 and 13 (Table 1).
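The two cases just described follow from comparing d with the two vigilance parameters. The Python sketch below reproduces that decision for the reported values (d = 0.137 for patient No. 3 and d = 0.073 for patient No. 12, with ρ_g = 0.15 and ρ_a = 0.1). The function name is hypothetical and the additional per-channel condition on d_i is omitted for brevity.

```python
def assignment_decision(d, rho_g=0.15, rho_a=0.10):
    """Return (accepted, template_updated) for a global difference d."""
    accepted = d < rho_g                 # pattern matches the template
    updated = accepted and d < rho_a     # stricter test for updating
    return accepted, updated

# Patient No. 3: matched, template left unchanged
print(assignment_decision(0.137))   # (True, False)
# Patient No. 12: matched, template updated
print(assignment_decision(0.073))   # (True, True)
```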

Table 1:	The results after training a series of QRS's

CONCLUSION

In this study, classification of multichannel patterns was achieved by means of adaptive resonance theory networks. In its current form, MART uses distance as the measure of dissimilarity between the multichannel input pattern and the adaptive templates of the categories it creates. The MART presented in this research offers the usual advantages of neural networks, including:

•	Reduced processing time (0.124 msec) due to massively parallel processing (all signal channels could be processed simultaneously and the input pattern in each channel could be compared simultaneously with the templates of all the available categories)

•	Fault tolerance due to the distribution of processing over numerous physical elements

The chief innovation incorporated in MART is the system of adaptive channel weights (credibilities), by means of which it learns the relative reliabilities of its several channels and uses this information to re-evaluate the overall fit between an input pattern and the template of the category to which it has provisionally been assigned. This system of credibilities allows the influence of noisy channels to be minimized and so avoids category proliferation and the consequent exhaustion of the capacity to learn new categories.
