Expert Systems with Applications 41 (2014) 1654–1662

Novel multiclass classification for home-based diagnosis of sleep apnea hypopnea syndrome

D. Sánchez-Morillo a,⇑, M.A. López-Gordo a, A. León b

a Biomedical Engineering and Telemedicine Lab, School of Engineering, University of Cádiz, C/Chile, 1, 11003 Cádiz, Spain
b Pulmonology and Allergy Unit, Puerta del Mar University Hospital, 11009 Cádiz, Spain

⇑ Corresponding author. Address: Escuela Superior de Ingeniería, Dpto. de Ingeniería Automática, C/Chile, 1, CP 11003 Cádiz, Spain. Tel.: +34 956015709; fax: +34 956015237. E-mail address: [email protected] (D. Sánchez-Morillo).

Article info

Keywords: SAHS; Overnight pulse-oximetry; Sleep apnea; Binary hierarchical classifier; SpO2; Multiclass; Multivariate

Abstract

Currently, Sleep Apnea-Hypopnea Syndrome (SAHS) is accurately diagnosed in Sleep Units. In the last decade, in order to reduce the burden for health systems and the consequent impact on patients, several home-located methods based on the binary classification of SAHS using overnight pulse oximetry (SpO2) have been proposed. Binary classifiers give rise to higher accuracies, but the cost of misclassifying leads to high penalizations in terms of either health care costs or risks to the patient's health. This study presents a novel hierarchical classification scheme for the four-class SAHS diagnosis using a set of features extracted from SpO2 and reported in the specialized literature. An accuracy of 82.6% was achieved in the assessment of the four-class classification. The proposed method could be useful in the diagnosis of SAHS in an ambulatory home-based setting and could alleviate the under-diagnosis rate and the waiting lists in sleep units.

© 2013 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.eswa.2013.08.062

1. Introduction

The Sleep Apnea-Hypopnea Syndrome (SAHS) is a sleep-related breathing disorder characterized by repetitive reductions or cessations of respiration. These episodes of reduction or absence of respiratory airflow induce characteristic modifications in the oxygen saturation profile.

The growing relevance of SAHS is a consequence of its rising prevalence, social repercussion and remarkable cardiovascular consequences (Leger, Bayon, Laaban, & Philip, 2012; Marin et al., 2012). The prevalence of SAHS in industrialized countries is 3–7% in adult men and 2–5% in adult women (Punjabi, 2008). Early diagnosis and treatment reduce the associated risk factors, such as excessive daytime sleepiness, loss of concentration and performance, and risk of cardiovascular diseases (Leger et al., 2012). However, it is assumed that more than 80% of women and 90% of men with moderate to severe obstructive sleep apnea may still be undiagnosed (Pang, Oto, Orl, & Terris, 2006).

Diagnosis and assessment of the severity of SAHS require studies beyond the clinical exploration. The American Sleep Disorders Association (ASDA) has established four levels for studies, according to their complexity: level I, standard polysomnography (PSG); level II, portable PSG; level III, respiratory polygraphy (PGR); and level IV, continuous monitoring of one or two parameters (Ferber et al., 1994). However, the gold standard for SAHS diagnosis is still overnight attended PSG and the related apnea-hypopnea index (AHI).

Among the limitations associated with PSG are the high costs, the need for dedicated medical personnel and the limited availability of diagnostic facilities (Flemons, Douglas, Kuna, Rodenstein, & Wheatley, 2004). Additionally, long waiting lists for diagnosis and patients' inhibitions about undertaking PSG must not be neglected.

Consequently, research focused on alternative diagnostic methods that overcome some of these limitations has remarkably increased. New approaches for simplified SAHS detection have commonly been based on the analysis of a reduced set of data. The overnight analysis of arterial blood oxygen saturation (SpO2) data recorded by non-invasive nocturnal pulse oximetry is widely used and accepted for SAHS screening (Netzer, Eliasson, Netzer, & Kristo, 2001). Pulse oximetry is applied in SAHS for the detection of the desaturations in SpO2 caused by apnea and hypopnea events. Signal processing techniques in the time domain (Lin, Yeh, Yen, Hsu, & Hang, 2009; Magalang et al., 2003) and frequency domain (Hua & Yu, 2007; Morillo, Gross, León, & Crespo, 2012; Zamarrón, Gude, Barcala, Rodriguez, & Romero, 2003), as well as non-linear approaches (Alvarez, Hornero, Abásolo, Del Campo, & Zamarrón, 2006; Hornero, Alvarez, Abásolo, Del Campo, & Zamarrón, 2007; Morillo, Rojas, Crespo, León, & Gross, 2009), have been proposed for SAHS screening purposes using SpO2 recordings.

In recent years, multivariate diagnostic models have demonstrated proficiency in SAHS screening (Alvarez, Hornero, Marcos, & Del Campo, 2010; Marcos et al., 2008; Morillo & Gross, 2013). Advanced pattern recognition techniques have been applied to discriminate between SAHS and non-SAHS conditions using only SpO2 data. In all cases, the authors dealt with a two-class classification task. Multivariate screening of SAHS using logistic regression (LR) (Alvarez et al., 2010; Marcos, Hornero, Alvarez, Del Campo, & Zamarrón, 2009), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), K-nearest neighbours (KNN) (Marcos et al., 2009) and probabilistic neural networks (PNN) (Morillo & Gross, 2013) has been addressed, resulting in high sensitivity and specificity pairs. Automatic estimation of the AHI through SpO2 analysis, which aims at a more detailed characterization of the patient's state, has barely been addressed and only with moderate success (Lin et al., 2009; Magalang et al., 2003; Marcos, Hornero, Álvarez, Aboy, & Del Campo, 2012; Morillo et al., 2012).

Assessment of the severity of SAHS is a four-class classification problem. Supervised multiclass classification algorithms aim at assigning a class label to each input sample. Since binary problems are simpler to solve than the original multicategory problem, many binarization strategies for multi-class classification have been proposed (Galar, Fernández, Barrenechea, Bustince, & Herrera, 2011; Lorena, de Carvalho, & Gama, 2008). The most common decomposition strategies include One-Versus-One (OVO) and One-Versus-All (OVA) (Duda, Hart, & Stork, 2001). However, the OVO approach is prohibitive for a large number of classes, and OVA results in imbalanced learning problems, where the number of training samples in the two classes is very different (Kumar, Ghosh, & Crawford, 2002). In the present study, we propose a binary hierarchical classification (BHC) scheme for multi-class diagnosis in order to improve the recognition performance (Casasent & Wang, 2005). Hierarchical classifiers have been successfully applied to different domains (Banos et al., 2012; Cancela, Ortega, Fernández, & Penedo, 2013; Cao, Pang, Liang, & Tian, 2013; Rodriguez-Lujan, Santa Cruz, & Huerta, 2012; Tsakonas & Gabrys, 2012).

Different feature sets and classifiers were evaluated at each node of the BHC through sequential backward selection (SBS) and leave-one-out cross-validation. The most relevant time-domain, non-linear and frequency-domain parameters reported in the literature for SAHS diagnosis were analyzed over a common dataset, and three different binary classifiers were finally chosen: a decision tree (DT), a probabilistic neural network (PNN) and a support vector machine (SVM) classifier.

In summary, the overall objective of this study was the evaluation of the proposed four-class SAHS diagnosis method to classify the degree of severity of the disease (non-SAHS, mild-SAHS, moderate-SAHS and severe-SAHS) from SpO2 data. The proposed method could provide practitioners with a simple and cost-efficient diagnostic tool, more useful and comprehensive than that provided by classical screening schemes.

2. Subjects and data

A total of 115 SpO2 recordings from subjects suspected of suffering from SAHS were available for the study. The subjects were referred to the Sleep Unit of the University Hospital Puerta del Mar in Cadiz (Spain). Inclusion criteria were medically stable status and ability to provide informed consent and follow instructions. The local Review Board on Human Studies approved the protocol, and informed consent was obtained from all participants.

Patients underwent standard in-laboratory overnight PSG. The signals recorded in the polysomnographic study were electroencephalogram (EEG), electrocardiogram (ECG), electro-oculogram (EOG), submental and tibial electromyogram (EMG), airflow (nasal thermistor), chest and abdominal piezo-sensors, body position and pulse oximetry (SpO2). Data were recorded using a Standard SleepLab Polysomnographic system (Erich Jaeger, UK). Oximetry signals were recorded by means of a Jaeger 70750A19 oximeter at a rate of 8 samples per second.

Sleep analysis was performed by an expert according to the criteria given by Rechtschaffen and Kales. An apnea event was defined as the absence of airflow for more than 10 s. A hypopnea episode was defined as a decrease in respiratory flow by at least 50%, accompanied by a decrease of more than 4% in the arterial oxygen saturation of haemoglobin. The average AHI was calculated as events per hour of sleep. The calculated AHI was used to assign each subject to one of the following categories: non-SAHS (AHI < 5 h⁻¹), mild-SAHS (5 ≤ AHI < 15 h⁻¹), moderate-SAHS (15 h⁻¹ ≤ AHI ≤ 30 h⁻¹) and severe-SAHS (AHI > 30 h⁻¹). These thresholds have been proposed by physicians and clinical experts as criteria for SAHS definition (Kushida et al., 2005). Table 1 summarizes the demographic and clinical data of the population under study.

3. Methods

Pattern recognition techniques were applied to model the SAHS diagnosis problem using nocturnal oximetry data. The methodology involved three stages: (1) feature extraction, (2) selection of features and (3) pattern classification.

3.1. Features extraction

In the feature extraction phase, a reduced set of features was extracted from the arterial oxygen saturation signals. A review of the state of the art on the influence of apnea events on SpO2 dynamics allowed a set of 28 features to be defined. These features were extracted by means of algorithms in the time domain, in the frequency domain and using non-linear techniques. Each feature was selected because of its high discrimination power reported in SAHS studies.

3.1.1. Time-domain analysis

The number of oxygen desaturation events per hour was computed following several well-known criteria. ODI440 and ODI430 were calculated for events with falls in SpO2 > 4% within time intervals of 40 s and 30 s, respectively (Rusch, Sankar, & Scharf, 1996). ODI340 and ODI330 were computed for events with falls in SpO2 > 3% within time intervals of 40 s and 30 s, respectively (Rauscher, Popp, & Zwick, 1991). Rapid restoration events, defined as increases in SpO2 > 3% (RES3) and 4% (RES4) within 10 s, were also calculated (Rauscher et al., 1991). In all cases, the baseline was set using a 60 s moving average.

The cumulative time spent below a threshold of 90% (CT90) (Netzer et al., 2001), the median value (Sat50), the second (variance) and fourth (kurtosis) statistical moments in the time domain (M2t and M4t, respectively), the range between the 25% and 75% percentiles (P25–75) and the range between the 2.5% and 97.5% percentiles (P2.5–97.5) were also used to characterize the dynamic behaviour in the time domain (Morillo & Gross, 2013). Because of its relevance in the scientific literature, the Delta Index (Δ Index), computed over 12 s intervals, was also calculated (Lévy, Pépin, Deschaux-Blanc, Paramelle, & Brambilla, 1996; Magalang et al., 2003).

3.1.2. Frequency-domain analysis

Spectral analysis was carried out to characterize the correlation between the power spectral density (PSD) in the desaturation frequency range and the number and intensity of desaturation events. Six bands of desaturation frequencies were analysed to extract six statistics. These six bands corresponded to apnea desaturation periods of 10–60 s (PSD10/60), 15–60 s (PSD15/60), 20–60 s (PSD20/60), 10–75 s (PSD10/75), 15–75 s (PSD15/75) and 20–75 s (PSD20/75). A window size of 512 samples was selected, and a 40th-order Yule-Walker autoregressive method was used to calculate the PSD of the SpO2 signal within each of these frequency ranges. The frequency desaturation index (FDI) was also estimated according to the procedure detailed in (Morillo et al., 2012).
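The six bands above are stated as desaturation periods, whereas a PSD is indexed by frequency. The following minimal sketch, which is not the authors' implementation, illustrates the conversion (f = 1/T) and a band-power computation as the trapezoidal area under an already estimated PSD. The function names and the flat test spectrum are illustrative assumptions, and the Yule-Walker estimation step itself is not reproduced here.

```python
# Hypothetical helper names; only the band limits come from the text.
PERIOD_BANDS_S = {
    "PSD10/60": (10, 60), "PSD15/60": (15, 60), "PSD20/60": (20, 60),
    "PSD10/75": (10, 75), "PSD15/75": (15, 75), "PSD20/75": (20, 75),
}


def period_band_to_hz(t_min_s, t_max_s):
    """Convert a desaturation-period band (seconds) to a frequency band (Hz)."""
    if not 0 < t_min_s < t_max_s:
        raise ValueError("expected 0 < t_min_s < t_max_s")
    # The longest period maps to the lowest frequency and vice versa.
    return 1.0 / t_max_s, 1.0 / t_min_s


def band_power(freqs, psd, f_lo, f_hi):
    """Trapezoidal area under the PSD over grid intervals lying inside [f_lo, f_hi]."""
    total = 0.0
    for i in range(len(freqs) - 1):
        if freqs[i] >= f_lo and freqs[i + 1] <= f_hi:
            total += 0.5 * (psd[i] + psd[i + 1]) * (freqs[i + 1] - freqs[i])
    return total


def band_features(freqs, psd):
    """Band power for each of the six period bands listed in the text."""
    return {
        name: band_power(freqs, psd, *period_band_to_hz(t_min, t_max))
        for name, (t_min, t_max) in PERIOD_BANDS_S.items()
    }


if __name__ == "__main__":
    # Synthetic flat spectrum on a 1 mHz grid up to 4 Hz (Nyquist for 8 Hz sampling).
    freqs = [i / 1000 for i in range(4001)]
    psd = [2.0] * len(freqs)
    for name, value in band_features(freqs, psd).items():
        print(name, round(value, 4))
```

Because only whole grid intervals inside the band are summed, the result slightly underestimates the area when the band edges fall between grid points; interpolating at the edges would be a natural refinement.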
Finally, four additional spectral features were included: total spectral power (PT), computed as the total area under the PSD function; peak amplitude of the most significant frequency component (PA) and mean value (PA30/70) of the SpO2 spectral content in the apnea frequency range from 0.014 to 0.033 Hz; and the relative power (PR30/70) in the apnea frequency band, defined as the ratio of the area enclosed under the PSD function in the apnea frequency band to the total signal power (Alvarez et al., 2010; Zamarrón et al., 2003).

3.1.3. Non-linear analysis

Non-linear analysis of the SpO2 signals by means of approximate entropy (ApEn), central tendency measure (CTM) (Alvarez, Hornero, García, Del Campo, & Zamarrón, 2007) and Lempel-Ziv complexity (LZC) (Alvarez et al., 2010) was performed to measure irregularity, variability and complexity, respectively.

ApEn estimates the irregularity of a time series. Higher values of ApEn have been reported in patients with SAHS. The recommendations of the original authors were followed to calculate ApEn (m = 1, r = 0.15 SD, N = 1000 s) (Pincus, 2001). LZC is related to the complexity of the signal (Ziv & Lempel, 1976). Series with high complexity provide high values of LZC, and higher values of LZC have been found in SAHS-positive subjects. LZC was computed by converting the SpO2 signals into binary sequences, using the median value of the signal samples as the threshold. CTM quantifies the variability of the signal (Cohen, Braver, & O'Reilly, 1996). A lower CTM value in patients with sleep apnea has been described (Alvarez et al., 2006). Finally, the variability of the oximetry signals was computed through the Poincaré SD1 descriptor, which quantifies the short-term variability of the SpO2 signal (Morillo et al., 2009).

Table 1
Demographic and clinical statistics of all subjects, training set and test set. Data are presented as mean ± standard deviation. n: number of subjects; BMI: body mass index; AHI: apnea/hypopnea index computed as events per hour.

Variable   Total            Non-SAHS         Mild-SAHS        Moderate-SAHS    Severe-SAHS
N          115              25               37               23               30
Females    32               10               8                6                8
Age        58.4 ± 13.4      51.3 ± 12.2      61.8 ± 12.9      61.5 ± 9.5       57.5 ± 15.6
Minutes    319.1 ± 143.3    406.0 ± 72.6     278.2 ± 155.1    307.5 ± 145.5    305.9 ± 146.6
Weight     88.5 ± 21.0      78.48 ± 19.6     86.3 ± 18.7      84.5 ± 13.5      102.9 ± 23.1
Height     163.0 ± 24.2     157.75 ± 35.4    161.2 ± 28.9     164.9 ± 10.1     168.3 ± 8.1
BMI        32.1 ± 7.2       28.8 ± 6.0       31.7 ± 7.6       31.1 ± 4.6       36.4 ± 7.8

3.2. Selection of features

Classification algorithms are unable to achieve high classification accuracy when a large number of weakly relevant and/or redundant features are managed. This problem is commonly known as the curse of dimensionality. As a consequence, feature selection mechanisms have to be implemented.

Optimal selection of features requires an exhaustive search that is practically infeasible. Instead, suboptimal search algorithms such as sequential search algorithms are generally used (Pudil, Novovicova, & Kittler, 1994). Sequential search algorithms build the candidate set in one of two ways. In sequential forward selection (SFS), features are sequentially added to an empty candidate set, and the process stops when the addition of new features does not improve a predefined criterion. In sequential backward selection (SBS), features are sequentially removed from a full candidate set, and the iterative process ends when the removal of further features does not improve the criterion. A common drawback of these strategies is that a feature cannot be discarded once it has been included (forward selection) or re-included once it has been removed (backward selection). Some alternative approaches have been presented to overcome this drawback; however, given that we are not working with hyperspectral data, these approaches were not considered. Instead, the search for the best set of features in this study was accomplished by using SBS in each of the nodes of the multiclass classifier, with the geometric mean of the sensitivity and specificity as the criterion. SBS was performed together with leave-one-out cross-validation: after removing a feature, the new dataset was used to cross-validate the classifier in order to ensure the generalization of the results.

3.3. Classification

A binary hierarchical classification (BHC) structure using different classifiers within the multi-class problem was chosen due to its computational and conceptual simplicity (Casasent & Wang, 2005). Fig. 1 shows the proposed architecture of a hierarchical classifier that requires only three pairwise classifiers arranged as a binary tree with four leaf nodes, one for each class, and three internal nodes, each with its own feature space.

The root node of the binary tree represents the original four-class problem. Each of the three internal nodes consists of a classifier between two meta-classes and a set of features specific to it. Coarse separation between classes occurs at the upper level of the hierarchy, and finer classification decisions are made at the lower levels (Wang & Casasent, 2009). The architecture has a balanced binary hierarchical structure, in which the two meta-classes at each node have the same number of classes.

As stated, three classifiers were trained as part of the BHC. In this study, decision trees (DT) (Breiman, Friedman, Olshen, & Stone, 1984), Linear Discriminant Analysis (LDA) (Friedman, 1989), Logistic Regression (LR) (Jobson, 1991), K-nearest neighbour (KNN) (Bishop, 1995), Generalized Likelihood Ratio Test (GLRT) (MacDonough, 1995), Support Vector Machines (SVM) (Cristianini & Shawe-Taylor, 2000), Relevance Vector Machines (RVM) (Tipping, 2001) and Probabilistic Neural Networks (PNN) classifiers were evaluated as candidates for each of the nodes.
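As a rough illustration of how such a hierarchy routes a sample, the sketch below wires three binary nodes into the four-class decision described above. It is a minimal sketch under stated assumptions, not the authors' implementation: the `Node` type, the `threshold` stand-in classifiers and every cut-off value are hypothetical placeholders, and each node uses a single feature for brevity, whereas in the study each node holds a trained classifier with its own selected feature subset.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence

# Class coding used in the paper: 0 = non-SAHS, 1 = mild, 2 = moderate, 3 = severe.
LABELS = {0: "non-SAHS", 1: "mild-SAHS", 2: "moderate-SAHS", 3: "severe-SAHS"}


@dataclass(frozen=True)
class Node:
    """An internal node: its own feature subset plus a binary classifier."""
    features: Sequence[str]
    classify: Callable[[Sequence[float]], int]  # 0 = lower meta-class, 1 = upper


def run_node(node: Node, sample: Dict[str, float]) -> int:
    """Evaluate one node on the features that belong to it."""
    return node.classify([sample[name] for name in node.features])


def bhc_predict(sample: Dict[str, float], root: Node, low: Node, high: Node) -> int:
    """Route a sample through the balanced binary hierarchy."""
    if run_node(root, sample) == 0:       # meta-class {0, 1}
        return run_node(low, sample)      # 0 or 1
    return 2 + run_node(high, sample)     # meta-class {2, 3}: 2 or 3


def threshold(cutoff: float) -> Callable[[Sequence[float]], int]:
    """Placeholder classifier: thresholds the first feature (hypothetical)."""
    return lambda values: int(values[0] >= cutoff)


# Illustrative nodes only; the cut-offs are invented and carry no clinical meaning.
ROOT = Node(("ODI440",), threshold(15.0))
LOW = Node(("ODI440",), threshold(5.0))
HIGH = Node(("ODI430",), threshold(30.0))

if __name__ == "__main__":
    sample = {"ODI440": 20.0, "ODI430": 20.0}
    print(LABELS[bhc_predict(sample, ROOT, LOW, HIGH)])
```

Keeping the feature subset inside each node mirrors the design choice in the text: every binary decision sees only the features selected for it, so the three classifiers can be trained and replaced independently.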
A short description of the three finally selected classifiers follows.

Decision Trees (DT): A DT is a logical model represented as a binary tree. Decision trees are an inductive approach to learning classification knowledge. They have a hierarchical structure, with nodes and leaves, used to classify samples based on a series of rules about the attributes of a qualitative target. Algorithms used for the induction of DT include ID3, C4.5 and CART (Breiman et al., 1984). Additional pruning algorithms are used to improve accuracy by removing tree branches that fit noise in the data. Further details about the theoretical framework of decision tree analysis can be found in (Breiman et al., 1984). In this study, the C4.5 algorithm was used for induction, and backward pruning to the optimal size was applied to avoid over-fitting.

Support vector machine classifier (SVM): SVM is a robust and effective method that constructs an N-dimensional hyperplane that optimally separates the classes. Unlike LDA, SVM does not assume that the feature space of a classification problem is linearly separable. SVM uses a kernel to represent the input data. The kernel introduces nonlinearity and can transform the data into a higher-dimensional space in which they may be separable by a hyperplane. Overfitting is prevented by specifying a maximum margin that separates the hyperplane from the classes. Samples that violate this margin are penalized using a parameter often referred to as the C parameter (Cristianini & Shawe-Taylor, 2000). In this work, we selected the Radial Basis Function (RBF) as the kernel, and the cost that maximized the classification performance was chosen.

Probabilistic Neural Networks (PNN): A PNN is an implementation of the kernel discriminant analysis statistical algorithm that is predominantly designed for classification through the use of Bayes' optimal decision rule. The operations are organized into a multilayered feedforward network with four fully interconnected layers (Specht, 1990). The Parzen window method is used to estimate the probability density function of the input vector at the neurons. In this research, the Gaussian function was used as the Parzen window. The optimal values of the spread parameter, which controls the size and shape of the Gaussian function, were estimated with the conjugate gradient algorithm and the Jackknife method. The output layer makes the decision based on the maximum probability or Bayes' rule. PNN are faster than other architectures, are relatively insensitive to outliers and generate accurate predicted target probability scores.

Fig. 1. The proposed binary hierarchical classifier applied to SAHS diagnosis. Since the classifier operates with four classes, its structure has 3 internal nodes and 4 leaf nodes. Each internal node is comprised of a set of features (Ψi) and a classifier (Φi). Each node is associated with a set of classes. In this study, classes are defined as: 0 = Non-SAHS; 1 = Mild-SAHS; 2 = Moderate-SAHS and 3 = Severe-SAHS.
[Figure: binary tree with root Ω1 = {0,1,2,3} (Ψ1, Φ1); internal nodes Ω2 = {0,1} (Ψ2, Φ2) and Ω3 = {2,3} (Ψ3, Φ3); leaf nodes Ω4 = {0}, Ω5 = {1}, Ω6 = {2}, Ω7 = {3}.]

3.4. Validation and performance indexes

Cross-validation is a technique used to ensure the statistical validity of classification results (Hjorth, 1994). Cross-validation iteratively partitions the sample into two sets of data: one is used for building the model and the other is used to test it. In this study, leave-one-out cross-validation (LOOCV) was used.

Predictions from each of the classifiers and actual labels were compared using confusion matrices. A confusion matrix shows the count of samples for each combination of predicted and true labels. Several measures revealing different aspects of the model performance were calculated from the entries of these tables: accuracy, which is the proportion of correctly classified samples; precision, defined as the proportion of correctly classified positive samples among all samples classified as positive; and F1-score, defined as the harmonic mean of precision and recall.

4. Results

4.1. Preselected features

Fig. 2 illustrates a segment of the overnight oximetry signal from a subject with severe SAHS (a), characterized by a cyclic pattern of oxygen desaturations due to apnea events, and a SAHS-negative subject (b) with a stable SpO2 profile. Fig. 2(c) shows the PSD estimated from these signals. PSD values are higher in the apnea frequency band in the case of the patient with SAHS.

Features were computed over the SpO2 signals after removing high-amplitude artefacts due to non-physiological events, and after applying a moving average filter to reduce the effect of low-amplitude artefacts. Table 2 details the average (mean ± SD) values of each of the features considered in this study over the four degrees of severity of SAHS.

4.2. Binary hierarchical classifier

One-way analysis of variance (ANOVA) was used to test the difference between the means of the subgroups. As a result of the ANOVA, the F statistic and the associated p-value were calculated. As expected, p-values were less than 0.05 in all cases. Therefore, the hypothesis that the severity of the disease influences the dependent data, that is, that the means of at least two of the severity subgroups differ significantly, can be accepted. Features were selected because a high discriminant power had been reported for all of them. Therefore, the results are coherent with the nature and purpose of each of the selected features in relation to SAHS.
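For reference, the F statistic of a one-way ANOVA is the ratio of the between-group mean square to the within-group mean square. The sketch below is a plain-Python illustration on made-up numbers, not the study data; the function name is an assumption, and the p-value, which requires the F distribution, is deliberately left out.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more samples than groups")

    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between-group sum of squares: spread of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: spread of the samples around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


if __name__ == "__main__":
    # Toy feature values for three hypothetical severity groups.
    toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
    print(one_way_anova_f(toy_groups))
```

For the toy data the between-group sum of squares is 26 with 2 degrees of freedom and the within-group sum of squares is 6 with 6 degrees of freedom, giving F = 13.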
Following ANOVA test, a Student-Newman-Keuls test for pair- wise comparison of subgroups was performed. The results of the ANOVA and the Student–Newman–Keuls test are presented in s w 60 80 100 (% ) a 1658 D. Sánchez-Morillo et al. / Expert System Table 2. Features with a relevant potential to rank SAHS severity are market with an asterisk (⁄). A top-down approach for building the BHC was followed. The tree construction started at the root node. The meta-class associ- ated with the root node contains all classes. It was then split into two meta-classes. The classes that were assigned to one set were more similar to each other than to the classes in the other set. In this case, since SAHS severity follows an ordinal scale associated 0 0.5 1 1.5 2 0 20 40 Time (h) Sp O 2 0 0.5 1 1.5 2 0 20 40 60 80 100 Time (h) Sp O 2 (% ) b Fig. 2. Examples of overnight pulse-oximetry recordings in the time domain for: (a) p spectral density for each of the cases is showed in (c). Continuous and discontinuous line subject respectively. Table 2 Average values and statistical significance of the preselected features. 
Data are presented Features that potentially could discriminate between severities were determined using th Feature Total Non-SAHS Mild-SAHS CT90 11.3 ± 20.5 1.5 ± 5.4 8.5 ± 20.5 Sat50 94.3 ± 3.1 95.8 ± 1.9 94.3 ± 2.9 M2t 8.7 ± 14.5 1.4 ± 0.9 3.5 ± 3.2 M4t 8.2 ± 6.2 8.4 ± 6.5 9.9 ± 7.5 P25–75 2.8 ± 3.0 1.3 ± 0.5 1.7 ± 0.9 P2.5–97.5 9.6 ± 7.1 4.1 ± 1.7 7.5 ± 4.4 ODI430 20.4 ± 21.0 3.5 ± 4.3 10.4 ± 9.0 ODI440 21.4 ± 23.5 3.2 ± 4.0 10.1 ± 9.2 ODI340 26.7 ± 21.7 7.1 ± 8.0 17.3 ± 11.7 ODI330 28.3 ± 24.6 6.8 ± 8.0 17.2 ± 12.4 RES3 20.7 ± 25.8 3.0 ± 3.7 9.0 ± 8.9 RES4 27.4 ± 27.6 6.3 ± 7.1 14.8 ± 11.7 PSD10/60 18.7 ± 49.2 1.7 ± 1.3 4.4 ± 4.4 PSD15/60 33.7 ± 85.7 2.8 ± 2.2 8.0 ± 7.8 PSD20/60 63.1 ± 149.5 5.0 ± 4.0 15.3 ± 14.7 PSD10/70 22.5 ± 56.9 2.0 ± 1.6 5.4 ± 5.3 PSD1575 38.1 ± 93.1 3.2 ± 2.5 9.3 ± 8.9 PSD20/75 66.8 ± 152.0 5.5 ± 4.3 17.0 ± 15.8 FDI 23.8 ± 24.8 4.7 ± 4.7 13.4 ± 10.1 PA3070 140.2 ± 258.3 14.3 ± 11.0 45.3 ± 39.2 PT 5590.2 ± 9294.3 922.1 ± 564.4 2256.0 ± 2066.1 PR30/70 0.02 ± 0.01 0.02 ± 0.01 0.02 ± 0.01 PA 264.2 ± 548.6 22.96 ± 18.77 76.6 ± 68.10 SD1 0.2 ± 0.2 0.1 ± 0.0 0.1 ± 0.1 ApEn 0.8 ± 0.3 0.5 ± 0.2 0.7 ± 0.2 DINDEX 1.4 ± 1.1 0.6 ± 0.2 0.9 ± 0.3 LZC 1.6 ± 0.3 1.4 ± 0.4 1.6 ± 0.3 CTM 0.4 ± 0.2 0.6 ± 0.2 0.4 ± 0.1 10 15 Power Spectral Density c ith Applications 41 (2014) 1654–1662 to the number of apnea/hypopnea events, the meta-classes distri- bution followed was: two classes with a lower AHI (non-SAHS and mild-SAHS) versus the two classes with a higher AHI (moder- ate-SAHS and severe-SAHS). Therefore, the SAHS diagnosis system proposed is based on a modular learning system comprised of a binary hierarchy of classi- fiers, each solving a two-class problem and having its own feature space. 2.5 2.5 0 0.02 0.04 0.06 0.08 -30 -25 -20 -15 -10 -5 0 5 Frequency (Hz) Po w er /fr eq u en cy (dB /H z) atient without SAHS and (b) patient with severe SAHS. 
The corresponding power s illustrate the PSD of the SpO2 signal from the non-SAHS and from the severe-SAHS as mean ± standard deviation. P values and F-ratios were calculated through ANOVA. e Student–Newman–Keuls test and are marked with an asterisk (⁄). Moderate-SAHS Severe-SAHS p⁄ F-ratio 7.4 ± 12.6 25.9 ± 25.6 were involved in the validation stage (moderate-SAHS and se- vere-SAHS subjects in the original dataset), and seven predictors were backward-selected: M4t, P25–75, ODI430, ODI330, RES3, RES4, ApEn. An accuracy of 79.3%, sensitivity of 76.7% and specificity of 82.6% were estimated. Precision and F-measure were 85.2% and 0.9 respectively. The area under the ROC curve (AUC) in the valida- tion data was 0.9 (Table 3). Table 4 details the confusion matrixes and the cross-validated [93.55/ 90.32] [92.17/ 89.57] [96.23/ 79.25] Decision Tree PNN 25] SVM {ODI440, PSD20/60, PR30/70} {P2.5–97.5, ODI440, SD1, ApEn, Δ Index, LZC, CTM} {M4t, P25–75, ODI430, ODI330, RES3, RES4, ApEn} ch l ccur Mild-SAHS 28 9 80.0% 88.8% Rest of Classes 7 71 True Moderate-SAHS Rest of Classes Se Sp Moderate-SAHS 19 5 73.1% 94.4% Rest of Classes 7 84 True Severe-SAHS Rest of Classes Se Sp Severe-SAHS 26 4 86.7% 95.3% Rest of Classes 4 81 D. Sánchez-Morillo et al. / Expert Systems with Applications 41 (2014) 1654–1662 1659 As can be appreciated in Fig 3, the architecture has a balanced binary hierarchical structure, in which the two macro-classes at each node had the same number of classes. At each node, the per- formance of several classifier configurations was compared by varying the features subset through SBS and LOOCV. DT, LDA, LR, KNN, GLRT, SVM, RVM and PNN classifiers were evaluated. Classification results achieved in the validation data by the pro- posed scheme are shown in Tables 3 and 4. In the root node, a DT was selected as the classifier with the best performance. This first stage is used to partition the set of four classes into two meta-clas- ses. 
The single seven-node decision tree illustrated in Fig. 4 was able to classify the two meta-classes with an accuracy of 89.6% in the validation set. Concerning the rest of the evaluated classifiers, accuracy ranged from 88.7% using a KNN to 83.5% using a GRLT classifier. The whole dataset was used in the LOOCV algorithm and the parameters ODI440, PSD20/60 and PR30/70 were finally considered as input features after the pruning stage. The achieved sensitivity and specificity were 90.6% and 87.3% respectively. Precision estimated was 85.2% and F-measure 0.8.

The second node of the BHC classified non-SAHS and mild-SAHS subjects. After testing and evaluating several algorithms (enumerated in Section 3.3), a PNN was the selected configuration. After SBS-LOOCV, seven predictor features were selected: P2.5–97.5, ODI440, SD1, ApEn, D Index, LZC and CTM. In the cross-validation process, which involved 62 samples (non-SAHS and mild-SAHS subjects in the original dataset), the minimum error occurred with 11 neurons in the model. An accuracy of 90.3% was achieved. Sensitivity was 94.6% and specificity 84.0%. Precision and F-measure were 89.7% and 90% respectively and the area under the ROC curve (AUC) was 0.9.

The third node of the hierarchical structure was designed to classify between moderate-SAHS and severe-SAHS subjects. A Support Vector Machine (SVM) with 23 support vectors was selected for classification. The SVM kernel function was the RBF. 53 samples [...]

[Figure 3: binary tree with leaf nodes Non-SAHS, Mild SAHS, Moderate SAHS and Severe SAHS]
Fig. 3. Detail of binary hierarchical classifier for the SAHS classification problem. Each leaf node in the binary tree is labeled with one of the four classes it represents. The numbers associated to each internal node (in brackets) represent the classification accuracy of the two-class classifier at that node, calculated over the training data and the test data respectively.

Table 3
Classification results achieved by the proposed scheme in the validation set. TP: true positives; TN: true negatives; FP: false positives; FN: false negatives; Se: sensitivity; Sp: specificity; Ac: accuracy; AUC: area under the ROC curve.

Internal node       TP   TN   FP   FN   Se      Sp      Ac      AUC
O1 Decision tree    48   55   7    5    90.6%   88.7%   89.6%   –
  Selected features: ODI440, PSD20/60, PR30/70
O2 PNN              35   21   4    2    94.6%   84.0%   48.7%   0.9
  Selected features: P2.5–97.5, ODI440, SD1, ApEn, D Index, LZC, CTM
O3 SVM              23   19   4    7    76.7%   82.6%   36.5%   0.9
  Selected features: M4t, P25–75, ODI430, ODI330, RES3, RES4, ApEn

Table 4
Diagnostic results achieved by the hierarchical binary classifier. Results for each severity degree are compared. Se: sensitivity; Sp: specificity.

                    Predicted                       Performance
True                Non-SAHS    Rest of classes     Se      Sp
Non-SAHS            22          3                   91.7%   96.7%
Rest of classes     2           88

True                Mild-SAHS   Rest of classes     Se      Sp
[...]

[...] performance achieved by the global hierarchical binary classifier for each category. Each degree of SAHS was treated as the positive category, and all other categories were grouped as the negative category. The overall accuracy was 82.6%.

Inter-rater agreement was used to evaluate the agreement between the proposed scheme and the classification based on the gold standard (AHI). Agreement was quantified by the Kappa (K) statistic (Fleiss, Levin, & Paik, 2003). Kappa was 0.8 (CI 0.7 to 0.9) with a standard error of 0.04.

Fig. 4. Description of the single decision tree used in the first node of the binary hierarchical classifier. First meta-class included the non-SAHS and mild-SAHS classes. Second meta-class included moderate-SAHS and severe-SAHS classes. The problem of searching for a set of features that best discriminates the resulting two meta-classes was solved using sequential backward selection (SBS) and leave-one-out cross validation (LOOCV). N: number of subjects classified at each node.

5. Discussion

In this paper, we have presented an objective evaluation of a novel hierarchical classification scheme adapted to classify the degrees of severity of SAHS from SpO2 recordings. Time-domain, frequency-domain and non-linear features reported in specialized literature were used to reflect the dynamic behaviour of these signals. Overall accuracy of the four-class SAHS diagnosis was 82.6%. The strength of agreement between the proposed scheme and the classification based on AHI was measured by the Kappa statistic. The result of 0.8 can be considered very good (Altmann, 1991).

Conventional features from time and spectral analysis (ODI440, PSD20/60 and PR30/70) achieved the highest diagnostic accuracy in the root node of the classifier. The evaluated decision tree demonstrated an accuracy of 89.6% for discriminating between non/mild-SAHS and moderate/severe-SAHS subjects. Only three well-known features were needed to capture the dynamic characteristics of the SpO2 profiles in the coarse classification between the two main meta-classes.

The best subset found by SBS for the separation of non-SAHS and mild-SAHS cases included time-domain characteristics (P2.5–97.5, ODI440, D Index) and non-linear parameters (SD1, ApEn, LZC, CTM). Differences in AHI between these classes were not sufficient to be reflected in spectral parameters. Therefore, frequency-domain features remained excluded from the subset of features used in the designed probabilistic neural network. Non-linear features were especially useful at this stage, where an accuracy of 90.3% was achieved.

Discrimination between moderate and severe subjects was performed using a support vector machine classifier. The best feature subset comprised features calculated by signal analysis in the time domain (M4t, P25–75, ODI430, ODI330, RES3 and RES4) and one non-linear parameter (ApEn).

In general, the main difficulties in classification were located in differentiating between mild-SAHS and moderate-severe SAHS cases. Of the 21 subjects misclassified, 7 corresponded to patients with mild-SAHS that were classified as moderate-SAHS (4) and vice versa (3). Four cases corresponded to misclassifications between non-SAHS and mild-SAHS subjects and four between moderate-SAHS and severe-SAHS. These diagnostic errors corresponded to borderline patients. These errors are possibly related to the fact that AHI and ODI differ especially in moderate to severe SAHS patients (Gross et al., 2011). Furthermore, hypopneas may not be considered because their definition includes events beyond changes in oxygen saturation. Such a situation can justify 9 misclassifications. These subjects were classified with a lower degree than the actual SAHS severity.

The drawback of a hierarchical classifier is that if a misclassification error occurs at some internal node, it cannot be corrected in the subsequent levels. 8 out of the 115 subjects were misclassified in the first stage and these errors remain along the classification procedure. Some alternative algorithms are being proposed to avoid such a situation and should be evaluated to improve the results presented here.

The worst diagnostic ability was detected in discriminating mild-SAHS and moderate-SAHS subjects from the rest of cases, which agrees with the qualitative results reported recently in (Marcos et al., 2012). Furthermore, although labelling SAHS-positive patients (AHI > 5 h⁻¹) as non-SAHS is the most relevant diagnostic error, only two mistakes were made (sensitivity of 91.7%).

Previous studies from our research group have been focused on SAHS detection from SpO2 analysis (Morillo et al., 2012, 2009; Morillo & Gross, 2013). However, a novel classification approach is proposed here.
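The drawback discussed above, namely that a misclassification at an internal node cannot be corrected at subsequent levels, can be illustrated with a minimal Python sketch of a three-node binary hierarchy. The node rules below are hypothetical single-feature thresholds: the feature name `odi` and the cut-offs 5, 15 and 30 are assumptions made only for illustration. They stand in for the decision tree, PNN and SVM and are not the classifiers evaluated in this work.

```python
from typing import Callable, Dict

# A node rule maps a feature dictionary to a binary decision.
Sample = Dict[str, float]
NodeRule = Callable[[Sample], bool]


def classify_hierarchical(x: Sample, root: NodeRule,
                          low_node: NodeRule, high_node: NodeRule) -> str:
    """Route a sample through the root, then through exactly one child node."""
    if root(x):
        # Moderate/severe meta-class: only the high node is consulted.
        return "severe-SAHS" if high_node(x) else "moderate-SAHS"
    # Non/mild meta-class: only the low node is consulted.
    return "mild-SAHS" if low_node(x) else "non-SAHS"


# Hypothetical stand-in rules (assumed thresholds on an assumed feature).
def root_rule(x: Sample) -> bool:
    return x["odi"] >= 15.0


def low_rule(x: Sample) -> bool:
    return x["odi"] >= 5.0


def high_rule(x: Sample) -> bool:
    return x["odi"] >= 30.0


def predict(x: Sample) -> str:
    return classify_hierarchical(x, root_rule, low_rule, high_rule)


if __name__ == "__main__":
    # A borderline sample just below the root threshold is sent to the
    # non/mild branch; the moderate and severe labels are then unreachable.
    print(predict({"odi": 14.0}))
```

Once the root has routed a borderline sample to the non/mild branch, the second-level node can only choose between non-SAHS and mild-SAHS, so a truly moderate subject stays misclassified whatever that node decides.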
As a novelty, the output of the algorithm is a categorical variable that indicates the severity of the disease for the input feature pattern. A few regression techniques to develop prediction models of the AHI have been reported lately. Univariate approaches such as simple linear regression (Chung, Liao, Elsaid, & Islam, 2012; Morillo et al., 2012), as well as multivariate adaptive regression splines (MARS) (Magalang et al., 2003), multiple linear regression (MLR) and multilayer perceptron (MLP) neural networks (Marcos, Hornero, Alvarez, Del Campo, & Aboy, 2010), have been evaluated for this purpose. Most of the aforementioned studies were focused on finding the optimal cut-off, based on maximal accuracy, for oxygen desaturation indexes to predict AHI. Nevertheless, these studies did not assess the ability of their methods to rank SAHS severity. Only in the approach in (Marcos et al., 2012) did the authors describe AHI prediction using an MLP algorithm in order to assess the severity of SAHS through SpO2 analysis. The method achieved moderate success, with an intraclass correlation coefficient of 0.9 and an accuracy of 72% in differentiating between severities. The overall accuracy of the presented strategy outperforms the accuracy reported in that study.

In summary, pulse oximetry is a convenient and cost-effective method to detect sleep apnea. Despite some of the limitations associated with SpO2, the proposed hierarchical classifier could improve the diagnostic capacity of oximetric signals recorded from overnight pulse oximetry by enabling a preliminary high-accuracy classification of the severity of the disease. As far as we know, a direct approach to severity assessment that performs the multi-class classification using a hierarchical architecture with binary classifiers and a divide-and-conquer strategy has not been addressed in the context of SAHS.
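Agreement between a categorical severity output and the AHI-based reference, as quantified in this work by the Kappa statistic, can be sketched as follows. This is a minimal Python illustration rather than the implementation used in the study: the AHI > 5 h⁻¹ criterion for SAHS-positive subjects is taken from the text, whereas the cut-offs of 15 and 30 h⁻¹ for moderate and severe SAHS, the function names and the example values are assumptions.

```python
from collections import Counter
from typing import List, Sequence

LABELS = ["non-SAHS", "mild-SAHS", "moderate-SAHS", "severe-SAHS"]


def severity_from_ahi(ahi: float) -> str:
    """Map an AHI value (events per hour) to a severity class.

    AHI > 5 as the SAHS-positive criterion follows the text; the 15 and 30
    cut-offs are assumed here for illustration.
    """
    if ahi <= 5.0:
        return LABELS[0]
    if ahi <= 15.0:
        return LABELS[1]
    if ahi <= 30.0:
        return LABELS[2]
    return LABELS[3]


def cohen_kappa(reference: Sequence[str], predicted: Sequence[str]) -> float:
    """Unweighted Cohen's kappa between two label sequences of equal length."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(reference)
    # Observed proportion of agreement.
    observed = sum(r == p for r, p in zip(reference, predicted)) / n
    # Agreement expected by chance from the marginal label frequencies.
    ref_counts = Counter(reference)
    pred_counts = Counter(predicted)
    labels = set(ref_counts) | set(pred_counts)
    expected = sum(ref_counts[l] * pred_counts[l] for l in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical AHI values and classifier outputs, not study data.
    ahi_values: List[float] = [2.0, 4.0, 8.0, 12.0, 18.0, 25.0, 35.0, 50.0]
    reference = [severity_from_ahi(a) for a in ahi_values]
    predicted = list(reference)
    predicted[3] = "moderate-SAHS"  # one borderline mild subject over-graded
    print(round(cohen_kappa(reference, predicted), 3))
```

Unlike raw accuracy, kappa discounts the agreement that two raters would reach by chance given how often each uses every label, which is why it is reported alongside the overall accuracy.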
We conclude that four-class classification has been shown to be an alternative to binary classification for the accurate diagnosis of SAHS. The method could be used in a domiciliary approach to the early diagnosis of SAHS.

6. Conclusions

In this paper, we have presented an objective evaluation of a classification scheme adapted to four-class SAHS diagnosis. A hierarchical binary classifier and features extracted from SpO2 were chosen to match reports of successful binary diagnosis. Performance of the severity classification strategy has been analysed and quantified by means of classification accuracy and receiver operating characteristic (ROC) analysis. The results showed high accuracy in the four-class SAHS diagnosis. An overall accuracy of 82.6% was achieved. Inter-rater agreement, used to evaluate the agreement between the proposed scheme and the classification based on the gold standard (AHI), was 0.8. The resulting strength of agreement can be considered remarkable.

We conclude that the proposed novel four-class hierarchical classifier has been shown to be an alternative to binary classification for the accurate diagnosis of SAHS. It could be useful in the diagnosis of SAHS in an ambulatory home-based setting and could alleviate the under-diagnosis rate and the waiting lists in sleep units.

References

Altmann, D. (1991). Practical statistics for medical research. London: Chapman and Hall.
Alvarez, D., Hornero, R., Abásolo, D., Del Campo, F., & Zamarrón, C. (2006). Nonlinear characteristics of blood oxygen saturation from nocturnal oximetry for obstructive sleep apnoea detection. Physiological Measurement, 27(4), 399–412. http://dx.doi.org/10.1088/0967-3334/27/4/006.
Alvarez, D., Hornero, R., García, M., Del Campo, F., & Zamarrón, C. (2007). Improving diagnostic ability of blood oxygen saturation from overnight pulse oximetry in obstructive sleep apnea detection by means of central tendency measure. Artificial Intelligence in Medicine, 41(1), 13–24.
Alvarez, D., Hornero, R., Marcos, J. V., & Del Campo, F. (2010). Multivariate analysis of blood oxygen saturation recordings in obstructive sleep apnea diagnosis. IEEE Transactions on Biomedical Engineering, 57(12), 2816–2824. http://dx.doi.org/10.1109/TBME.2010.2056924.
Banos, O., Damas, M., Pomares, H., Rojas, F., Delgado-Marquez, B., & Valenzuela, O. (2012). Human activity recognition based on a sensor weighting hierarchical classifier. Soft Computing, 17(2), 333–343. http://dx.doi.org/10.1007/s00500-012-0896-3.
Bishop, C. M. (1995). Neural networks for pattern recognition. New York, New York, USA: Oxford University Press.
Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification and regression trees. Belmont, CA: Wadsworth.
Cancela, B., Ortega, M., Fernández, A., & Penedo, M. G. (2013). Hierarchical framework for robust and fast multiple-target tracking in surveillance scenarios. Expert Systems with Applications, 40(4), 1116–1131. http://dx.doi.org/10.1016/j.eswa.2012.08.025.
Cao, K., Pang, L., Liang, J., & Tian, J. (2013). Fingerprint classification by a hierarchical classifier. Pattern Recognition, 46(12), 3186–3197. http://dx.doi.org/10.1016/j.patcog.2013.05.008.
Casasent, D., & Wang, Y. C. (2005). A hierarchical classifier using new support vector machine for automatic target recognition. Neural Networks, 18(2), 541–548.
Chung, F., Liao, P., Elsaid, H., & Islam, S. (2012). Oxygen desaturation index from nocturnal oximetry: A sensitive and specific tool to detect sleep-disordered breathing in surgical patients. Anesthesia, Analgesia, X(X), 1–8. http://dx.doi.org/10.1213/ANE.0b013e318248f4f5.
Cohen, J. D., Braver, T. S., & O’Reilly, R. C. (1996). A computational approach to prefrontal cortex, cognitive control and schizophrenia: recent developments and current challenges. Philosophical Transactions of the Royal Society of London – Series B: Biological Sciences, 351(1346), 1515–1527.
Cristianini, N., & Shawe-Taylor, J. (2000). An introduction to support vector machines. History, 47, 1–15.
Duda, R. O., Hart, P. E., & Stork, D. G. (2001). Pattern classification. In R. O. Duda, P. E. Hart, & D. G. Stork (Eds.). Neuropsychopharmacology, 2, 654.
Ferber, R., Millman, R. P., Coppola, M., Fleetham, J., Murray, C. F., Iber, C., et al. (1994). Portable recording in the assessment of obstructive sleep apnea: ASDA standards of practice. Sleep, 17, 378–392.
Fleiss, J. L., Levin, B., & Paik, M. (2003). Statistical methods for rates and proportions (3rd ed.). Hoboken: John Wiley & Sons.
Flemons, W. W., Douglas, N. J., Kuna, S. T., Rodenstein, D. O., & Wheatley, J. (2004). Access to diagnosis and treatment of patients with suspected sleep apnea. American Journal of Respiratory and Critical Care Medicine, 169(6), 668–672.
Friedman, J. H. (1989). Regularized discriminant analysis. Journal of the American Statistical Association, 84, 165–175.
Galar, M., Fernández, A., Barrenechea, E., Bustince, H., & Herrera, F. (2011). An overview of ensemble methods for binary classifiers in multi-class problems: experimental study on one-vs-one and one-vs-all schemes. Pattern Recognition, 44(8), 1761–1776. http://dx.doi.org/10.1016/j.patcog.2011.01.017.
Gross, N., Friedmann, J., Kunze, C., Stork, W., Morillo, D. S., Jimenez, A. L., et al. (2011). Increasing reliability and information content of pulse oximetric SAHS screening algorithms. In BIOSIGNALS: Proceedings of the international conference on bio-inspired systems and signal processing (pp. 438–445).
Hjorth, J. S. U. (1994). Computer intensive statistical methods: Validation model selection and bootstrap. London: Chapman and Hall.
Hornero, R., Alvarez, D., Abásolo, D., Del Campo, F., & Zamarrón, C. (2007). Utility of approximate entropy from overnight pulse oximetry data in the diagnosis of the obstructive sleep apnea syndrome. IEEE Transactions on Bio-medical Engineering, 54(1), 107–113. http://dx.doi.org/10.1109/TBME.2006.883821.
Hua, C.-C., & Yu, C.-C. (2007). Smoothed periodogram of oxyhemoglobin saturation by pulse oximetry in sleep apnea syndrome: an automated analysis. Chest, 131(3), 750–757. http://dx.doi.org/10.1378/chest.06-1624.
Jobson, J. (1991). Applied multivariate data analysis: Categorical and multivariate methods. New York, New York, USA: Springer.
Kumar, S., Ghosh, J., & Crawford, M. M. (2002). Hierarchical fusion of multiple classifiers for hyperspectral data analysis. Pattern Analysis and Applications, Special Issue on Fusion of Multiple Classifiers, 5(2), 210–220.
Kushida, C. A., Littner, M. R., Morgenthaler, T., Alessi, C. A., Bailey, D., Coleman, J., et al. (2005). Practice parameters for the indications for polysomnography and related procedures: an update for 2005. Sleep (Rochester), 28(4), 499–521.
Leger, D., Bayon, V., Laaban, J. P., & Philip, P. (2012). Impact of sleep apnea on economics. Sleep Medicine Reviews, 16(5), 455–462. http://dx.doi.org/10.1016/j.smrv.2011.10.001.
Lévy, P., Pépin, J. L., Deschaux-Blanc, C., Paramelle, B., & Brambilla, C. (1996). Accuracy of oximetry for detection of respiratory disturbances in sleep apnea syndrome. Chest, 109, 395–399.
Lin, C.-L., Yeh, C., Yen, C.-W., Hsu, W.-H., & Hang, L.-W. (2009). Comparison of the indices of oxyhemoglobin saturation by pulse oximetry in obstructive sleep apnea hypopnea syndrome. Chest, 135(1), 86–93. http://dx.doi.org/10.1378/chest.08-0057.
Lorena, A. C., de Carvalho, A. C., & Gama, J. M. (2008). A review on the combination of binary classifiers in multiclass problems. Artificial Intelligence Review, 30(1–4), 19–37.
MacDonough, R. N. (1995). Detection of signals in noise. (A. D. Whalen, Ed.). Access Online via Elsevier.
Magalang, U. J., Dmochowski, J., Veeramachaneni, S., Draw, A., Mador, M. J., El-Solh, A., et al. (2003). Prediction of the apnea-hypopnea index from overnight pulse oximetry. Chest, 124(5), 1694–1701.
Marcos, J. V., Hornero, R., Álvarez, D., Aboy, M., & Del Campo, F. (2012). Automated prediction of the apnea-hypopnea index from nocturnal oximetry recordings. IEEE Transactions on Bio-Medical Engineering, 59(1), 141–149. http://dx.doi.org/10.1109/TBME.2011.2167971.
Marcos, J. V., Hornero, R., Alvarez, D., Del Campo, F., & Aboy, M. (2010). Automated detection of obstructive sleep apnoea syndrome from oxygen saturation recordings using linear discriminant analysis. Medical & Biological Engineering & Computing, 48(9), 895–902.
Marcos, J. V., Hornero, R., Alvarez, D., Del Campo, F., & Zamarrón, C. (2009). Assessment of four statistical pattern recognition techniques to assist in obstructive sleep apnoea diagnosis from nocturnal oximetry. Medical Engineering & Physics, 31(8), 971–978. http://dx.doi.org/10.1016/j.medengphy.2009.05.010.
Marcos, J. V., Hornero, R., Alvarez, D., Del Campo, F., Zamarrón, C., & López, M. (2008). Utility of multilayer perceptron neural network classifiers in the diagnosis of the obstructive sleep apnoea syndrome from nocturnal oximetry. Computer Methods and Programs in Biomedicine, 92(1), 79–89. http://dx.doi.org/10.1016/j.cmpb.2008.05.006.
Marin, J. M., Agusti, A., Villar, I., Forner, M., Nieto, D., & Carrizo, S. J. (2012). Association between treated and untreated obstructive sleep apnea and risk of hypertension. JAMA: The Journal of the American Medical Association, 307(20), 2169–2176. http://dx.doi.org/10.1001/jama.2012.3418.
Morillo, D. S., & Gross, N. (2013). Probabilistic neural network approach for the detection of SAHS from overnight pulse oximetry. Medical & Biological Engineering & Computing, 51(3), 305–315. http://dx.doi.org/10.1007/s11517-012-0995-4.
Morillo, D. S., Gross, N., León, A., & Crespo, L. F. (2012). Automated frequency domain analysis of oxygen saturation as a screening tool for SAHS. Medical Engineering & Physics, 34(7), 946–953. http://dx.doi.org/10.1016/j.medengphy.2011.10.015.
Morillo, D. S., Rojas, J. L., Crespo, L. F., León, A., & Gross, N. (2009). Poincaré analysis of an overnight arterial oxygen saturation signal applied to the diagnosis of sleep apnea hypopnea syndrome. Physiological Measurement, 30(4), 405–420. http://dx.doi.org/10.1088/0967-3334/30/4/005.
Netzer, N., Eliasson, A. H., Netzer, C., & Kristo, D. A. (2001). Overnight pulse oximetry for sleep-disordered breathing in adults: A review. Chest, 120(2), 625–633.
Pang, K. P., Oto, F., Orl, F., & Terris, D. J. (2006). Screening for obstructive sleep apnea: An evidence-based analysis. American Journal of Otolaryngology, 27, 112–118. http://dx.doi.org/10.1016/j.amjoto.2005.09.002.
Pincus, S. M. (2001). Assessing serial irregularity and its implications for health. Annals of the New York Academy of Sciences, 954(1), 245–267.
Pudil, P., Novovicova, J., & Kittler, J. (1994). Floating search methods in feature selection. Pattern Recognition Letters, 15, 1119–1125.
Punjabi, N. M. (2008). The epidemiology of adult obstructive sleep apnea. Proceedings of the American Thoracic Society, 5, 136–143. http://dx.doi.org/10.1513/pats.200709-155MG.
Rauscher, H., Popp, W., & Zwick, H. (1991). Computerized detection of respiratory events during sleep from rapid increases in oxyhemoglobin saturation. Lung, 169(6), 335–342.
Rodriguez-Lujan, I., Santa Cruz, C., & Huerta, R. (2012). Hierarchical linear support vector machine. Pattern Recognition, 45(12), 4414–4427. http://dx.doi.org/10.1016/j.patcog.2012.06.002.
Rusch, T. L., Sankar, R., & Scharf, J. E. (1996). Signal processing methods for pulse oximetry. Computers in Biology and Medicine, 26(2), 143–159.
Specht, D. F. (1990). Probabilistic neural networks and the polynomial Adaline as complementary techniques for classification. IEEE Transactions on Neural Networks, 1(1), 111–121.
Tipping, M. E. (2001). Sparse bayesian learning and the relevance vector machine. The Journal of Machine Learning Research, 1, 211–244. http://dx.doi.org/10.1162/15324430152748236.
Tsakonas, A., & Gabrys, B. (2012). GRADIENT: Grammar-driven genetic programming framework for building multi-component, hierarchical predictive systems. Expert Systems with Applications, 39(18), 13253–13266. http://dx.doi.org/10.1016/j.eswa.2012.05.076.
Wang, Y. F., & Casasent, D. (2009). A Support Vector Hierarchical Method for Multi-class Classification and Rejection, 3281–3288.
Zamarrón, C., Gude, F., Barcala, J., Rodriguez, J. R., & Romero, P. V. (2003). Utility of oxygen saturation and heart rate spectral analysis obtained from pulse oximetric recordings in the diagnosis of sleep apnea syndrome. Chest, 123(5), 1567–1576.
Ziv, J., & Lempel, A. (1976). On the complexity of finite sequences. IEEE Transactions on Information Theory, IT-22(1).

Novel multiclass classification for home-based diagnosis of sleep apnea hypopnea syndrome
1 Introduction
2 Subjects and data
3 Methods
3.1 Features extraction
3.1.1 Time-domain analysis
3.1.2 Frequency-domain analysis
3.1.3 Non-linear analysis
3.2 Selection of features
3.3 Classification
3.4 Validation and performance indexes
4 Results
4.1 Preselected features
4.2 Binary hierarchical classifier
5 Discussion
6 Conclusions
References