1834 IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, VOL. 6, NO. 4, AUGUST 2013

PPI-SVM-Iterative FLDA Approach to Unsupervised Multispectral Image Classification

Hsian-Min Chen, Member, IEEE, Chinsu Lin, Member, IEEE, Shih-Yu Chen, Member, IEEE, Chia-Hsien Wen, Clayton Chi-Chang Chen, Yen-Chieh Ouyang, Member, IEEE, and Chein-I Chang, Fellow, IEEE

Abstract: This paper presents a new approach to unsupervised classification for multispectral imagery. It first implements the pixel purity index (PPI), which is commonly used in hyperspectral imaging for endmember extraction, to find seed samples without prior knowledge. It then uses the PPI-found samples as support vectors for a kernel-based support vector machine (SVM) to generate a set of initial training samples. To mitigate the randomness caused by PPI and the sensitivity of the support vectors used by the SVM, it further develops an iterative Fisher's linear discriminant analysis (IFLDA) that performs FLDA classification iteratively to produce a final set of training samples, which are then used to perform a follow-up supervised classification. However, when the image is very large, which is usually the case in multispectral imagery, the computational complexity of running PPI on the entire image is very high. To resolve this issue, Gaussian pyramid image processing is introduced to reduce the image size. The experimental results show that the proposed approach has great promise in unsupervised multispectral classification.

Index Terms: Fisher's linear discriminant analysis (FLDA), iterative Fisher's linear discriminant analysis (IFLDA), pixel purity index (PPI), support vector machine (SVM).

Manuscript received May 24, 2012; revised August 20, 2012; accepted October 07, 2012. Date of publication December 04, 2012; date of current version July 22, 2013. The work of H.-M. Chen was supported in part by the National Science Council (NSC) under contract 101-2221-E-241-003. The work of C. Lin was supported in part by the NSC under contracts NSC98-2221-E-005-096 and NSC100-2221-E-005-084. The work of Y.-C. Ouyang was supported in part by the NSC under contracts NSC98-2221-E-005-096 and NSC100-2221-E-005-084. (Corresponding author: C. C.-C. Chen.)

H.-M. Chen is with the Department of Biomedical Engineering, Hungkuang University, Taichung, Taiwan.

C. Lin is with the Department of Forestry and Natural Resources, National Chiayi University, Chiayi 60004, Taiwan.

S.-Y. Chen is with the Remote Sensing Signal and Image Processing Laboratory, Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore County, Baltimore, MD 21250 USA.

C.-H. Wen is with the Department of Computer Science and Information Management, Providence University, Taichung, Taiwan (e-mail: chwen@pu.edu.tw).

C. C.-C. Chen is with the Department of Radiology, Veterans General Hospital, Taichung, Taiwan, and the Department of Radiological Technology, Central Taiwan University of Science and Technology, Taichung, Taiwan.

Y.-C. Ouyang is with the Department of Electrical Engineering, National Chung Hsing University, Taichung, Taiwan.

C.-I Chang is with the Remote Sensing Signal and Image Processing Laboratory, Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore County, Baltimore, MD 21250 USA, with the Department of Electrical Engineering, National Taipei University of Technology, Taipei, Taiwan, and also with the Department of Computer Science and Information Engineering, Providence University, Taichung, Taiwan.

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/JSTARS.2012.2225097

I. INTRODUCTION

UNSUPERVISED multispectral image classification is generally more challenging than unsupervised hyperspectral image classification [1]-[3] for the following reasons. First, low spectral resolution generally results in mixed pixels. Second, due to low spatial resolution, many material substances often appear at subpixel scale, in which case no spatial information can be used for classification. Third and most importantly, with only a small number of spectral channels used for multispectral data acquisition, multispectral image classification must largely rely on spatial correlation rather than on the spectral information that is crucial in hyperspectral data exploitation. As a result, endmember extraction, which has received considerable interest in hyperspectral imaging, plays little role in multispectral imaging because endmembers are rare in multispectral data [4]. However, this does not exclude endmember extraction from applications to multispectral imagery. Many endmember extraction algorithms (EEAs) have been reported in the literature, such as the pixel purity index (PPI) [5]-[7], N-FINDR [8], [9], vertex component analysis (VCA) [10], and the simplex growing algorithm (SGA) [11]. But not all EEAs are applicable to multispectral images. In general, an EEA requires prior knowledge of the number of endmembers to be generated. According to [12], an endmember is defined as a pure spectral signature, and thus this number is generally upper-bounded by the band dimensionality. In multispectral imagery, this implies that the number of endmembers cannot exceed the number of bands used. As an example, the 4-band SPOT data used for the experiments in this paper have a total of 4 spectral bands.
In this case, no more than four endmembers can be generated. For example, if N-FINDR or one of its variants such as SGA is used to extract endmembers from such 4-band SPOT data, its use of maximal simplex volume as a criterion means that the dimensionality of the simplexes cannot exceed 4. In fact, because of the sum-to-one abundance constraint, this number is actually reduced to 3. On the other hand, also because of low spectral resolution, there are very few endmembers in multispectral images. This may be the major reason that endmember extraction has received little interest in multispectral image analysis and that very little has been reported in this area. PPI, however, is a completely different story. Since PPI is abundance-unconstrained and uses orthogonal projection as the criterion to find endmembers, the band-dimensionality constraint does not apply to it. In this case, the data samples extracted by PPI are not necessarily endmembers but rather samples of interest. What is a disadvantage when PPI is used for endmember extraction in hyperspectral images turns out to be an advantage when PPI is used to extract training samples, instead of endmembers, from multispectral images. It is our belief that PPI may be one of very few endmember extraction algorithms used in hyperspectral imaging that is actually applicable to multispectral imagery.

1939-1404/$31.00 © 2012 IEEE

This paper takes advantage of the property of PPI described above to derive an endmember extraction-based unsupervised classification technique for multispectral imagery. It makes use of PPI to find samples of interest, which can be considered as seed training samples for the data.
These PPI-found samples are then used as support vectors to train a kernel-based SVM [13] to produce a set of initial training samples, where the number of classes can be determined by the full band dimensionality of the multispectral imagery. Since PPI uses a random generator to produce the skewers on which data samples are orthogonally projected, the PPI-found samples are not reproducible. To resolve this inconsistency, an iterative Fisher's linear discriminant analysis (IFLDA) is further developed. It implements FLDA [14] iteratively to mitigate the instability caused by the use of random skewers and the sensitivity of the SVM to using PPI-found samples as support vectors. The resulting IFLDA-classified sample vectors are then used as a final set of training samples. They are thus obtained by a series of processes: first finding seed samples via PPI, then using SVM to generate initial training samples, and finally refining and correcting these samples by IFLDA. An algorithm that implements PPI coupled with SVM in conjunction with IFLDA to find unsupervised training samples is called the PPI-SVM-IFLDA algorithm. Finally, the PPI-SVM-IFLDA-produced samples are used as training samples for a follow-up classifier that performs supervised classification on the original data samples. This supervised classifier is once again implemented by IFLDA. The only difference is that this IFLDA is performed on the entire data space, whereas the IFLDA used within PPI-SVM-IFLDA performs classification only on the SVM-generated training samples, not on the original data samples.

On many occasions the image data size is too large for PPI to process due to its high computational complexity. To alleviate this problem, Gaussian pyramid image processing [15] is proposed to reduce the original image size and to help users select appropriate regions of interest (ROIs).
This method not only saves processing time but also effectively pinpoints ROIs. Such benefits will be demonstrated by experiments.

It should be noted that PPI cannot work by itself. It must be coupled with SVM and implemented in conjunction with IFLDA. This paper provides a potential application of PPI to multispectral imaging, which has not been investigated or explored over the past years.

II. PIXEL PURITY INDEX

The PPI [5] has been widely used for endmember extraction due to the availability of the popular ENVI software system, originally developed by Analytical Imaging and Geophysics (AIG), and has found many applications in different areas. Its idea can be briefly described as follows.

Consider a given set of data sample vectors. A random generator is used to generate a set of random unit vectors, referred to as skewers, which cover different random directions. All the data sample vectors are then projected onto each skewer in this randomly generated set. According to the geometry of convexity, if an endmember is considered as a pure signature, its projection should fall at the end points of some of these skewers, i.e., its projection should be either maximal or minimal. For each data sample vector, we count the number of skewers on which this particular sample vector yields either the maximal or the minimal projection. This number is called the PPI count of the sample vector. Generally speaking, the higher the PPI count of a sample vector, the more likely the sample vector is to be an endmember. However, for PPI to be effective, a large number of skewers is generally required so that the skewers cover as many random directions as possible. Unfortunately, so far no guideline has been suggested for determining how many skewers should be used for PPI. Empirically, hundreds if not thousands of skewers are required for PPI to perform well in hyperspectral data exploitation.
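The PPI counting procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the ENVI or MATLAB implementation cited in the paper: the function names `random_skewer`, `ppi_counts` and `ppi_seeds`, the Gaussian-based way of drawing random unit vectors, and the choice to credit only one sample per extreme when projections tie are all assumptions introduced here.

```python
import random


def random_skewer(dim, rng):
    """Draw a random unit vector (a "skewer") by normalizing Gaussian components."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = sum(x * x for x in v) ** 0.5
        if norm > 0.0:
            return [x / norm for x in v]


def ppi_counts(samples, n_skewers, seed=None):
    """Return, for every sample vector, the number of skewers on which its
    projection is the maximum or the minimum over all samples (its PPI count)."""
    rng = random.Random(seed)
    dim = len(samples[0])
    counts = [0] * len(samples)
    for _ in range(n_skewers):
        skewer = random_skewer(dim, rng)
        # Orthogonal projection of each sample onto the skewer direction.
        proj = [sum(x * s for x, s in zip(sample, skewer)) for sample in samples]
        indices = range(len(proj))
        # Assumption: on ties only the first extreme sample is credited.
        counts[max(indices, key=proj.__getitem__)] += 1
        counts[min(indices, key=proj.__getitem__)] += 1
    return counts


def ppi_seeds(samples, n_skewers, threshold=1, seed=None):
    """Indices of samples whose PPI count reaches the threshold (1 by default)."""
    counts = ppi_counts(samples, n_skewers, seed)
    return [i for i, c in enumerate(counts) if c >= threshold]
```

For instance, with the four corners of a square plus a few interior points as samples, only the corners can ever be extreme on a skewer, so `ppi_seeds` returns the corner indices and every interior point keeps a count of zero. Fixing `seed` makes a run repeatable, which also illustrates why unseeded runs are not reproducible.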
Details of a step-by-step MATLAB version of PPI can be found in [4], [6]. It is worth noting that multispectral images have only a few spectral bands, so the dimensionality reduction (DR) usually required for PPI is not needed here; PPI was originally developed for hyperspectral images with hundreds of bands. Also, the purpose of using PPI here is to find potential candidates for training samples, not necessarily endmembers, so the threshold on the PPI count can be set to 1. In this case, two major issues in implementing PPI, DR and determination of the threshold value, are resolved. The only remaining issue is how many skewers PPI requires, which can be determined empirically.

III. SUPPORT VECTOR MACHINES

A support vector machine (SVM) is a linear discriminant function that acts as a linear binary classifier and was originally developed in statistical machine learning theory [16]. It uses a set of support vectors as training samples to perform binary classification. More specifically, an SVM finds an optimal hyperplane, specified by a weighting vector, that maximizes the margin between two classes of support vectors. To further solve linearly non-separable problems, it also introduces kernels, specified by a nonlinear function that maps data samples into a high-dimensional feature space, so that nonlinear decision boundaries in the original space can be handled by linear decisions in the feature space. The resulting SVM is referred to as a kernel-based SVM (KSVM). Since KSVM is a well-established technique, details of its implementation are available in many references and books such as [13], so no further discussion is needed here.
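For reference, the standard textbook form of a kernel-based SVM decision rule with a radial basis function (RBF) kernel is given below. This is a generic formulation rather than the paper's own notation: the symbols \alpha_i (multipliers), y_i (class labels in {-1, +1}), b (bias), \sigma (kernel width) and n (number of support vectors) are assumptions introduced here, and the source does not state which multi-class scheme was used to extend the binary rule to several classes.

```latex
f(\mathbf{x}) = \operatorname{sign}\!\left( \sum_{i=1}^{n} \alpha_i \, y_i \, K(\mathbf{x}_i, \mathbf{x}) + b \right),
\qquad
K(\mathbf{x}, \mathbf{z}) = \exp\!\left( -\frac{\lVert \mathbf{x} - \mathbf{z} \rVert^{2}}{2\sigma^{2}} \right)
```

Only the support vectors \mathbf{x}_i enter the sum, which is why a small but well-chosen set of samples can be enough to define the decision boundary.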
It is worth noting that since KSVM uses the data samples near the separating hyperplanes as its support vectors, from a classification point of view these support vectors can be considered as the worst training samples because they lie close to the hyperplanes. If a KSVM performs well on these worst samples, it can certainly do better on other data samples. With this interpretation, these samples are indeed the worst training samples for classification.

Another comment on KSVM is noteworthy. The need for KSVM arises from the fact that the seed samples found by PPI are generally too few to constitute reliable data statistics for use as training samples. They therefore cannot be used directly by statistics-based classifiers such as FLDA. A major strength of KSVM is that its support vectors need not be numerous as long as they are well selected as worst samples for separation. The PPI-found seed samples appear to play the role of such worst training samples when used as support vectors for KSVM.

IV. ITERATIVE FLDA

The idea of designing an iterative FLDA (IFLDA) is derived from solving two major issues arising in training samples: the lack of training samples and the sensitivity to training samples. For FLDA to work effectively, a large number of training samples is generally needed to constitute reliable data statistics. This requirement can be met by taking the classified data samples that result from running KSVM on a small number of support vectors as training samples. However, since the support vectors used by KSVM are assumed to be the worst training samples, as noted in Section III, the KSVM-classified data samples may produce skewed data statistics for FLDA.
The IFLDA is developed to correct these potentially biased training samples through a series of self-iterations. FLDA is implemented iteratively in the sense that the classification results produced by FLDA in one iteration are used as training samples for FLDA in the next iteration. This iterative process is terminated when the classification errors produced by FLDA in two consecutive runs are within an error threshold. As an additional benefit, the proposed iterative process can further reduce the effect of random sampling. In other words, in some cases a randomly selected set of training samples makes the final classification non-reproducible and the results inconsistent. The use of IFLDA can resolve this dilemma.

An algorithm implementing the iterative classification described above is outlined as follows. The mathematical symbols and expressions in the steps are missing from this copy.

Algorithm for Implementing PPI-SVM-IFLDA

1. Initial conditions: Select FLDA as the classifier. Fix a threshold. Initialize the iteration counter. Input a training sample pool consisting of one set of training samples for each class. Initialize the class sets for all classes and calculate the initial classification rate.

2. At each iteration, the FLDA classifier performs classification using the current training set to produce a new set of classes, and the new classification rate is calculated from them.

3. Stopping rule:
(a) If [condition missing], the algorithm is terminated. Go to step 5. Otherwise, go to (b).
(b) Compare [expression missing] against the threshold. If the test is satisfied, the algorithm is terminated. Go to step 5. Otherwise, go to step 4.

4. Update the iteration counter and go to step 2.

5. Output the classes produced at the last iteration as the final classification result.

A flow chart that implements the above algorithm to perform iterative classification is provided in Fig. 1.

A note on the use of such an iterative process is worthwhile. It is natural that the same approach developed for IFLDA can also be applied to SVM to derive an iterative KSVM (IKSVM) if the FLDA classifier is replaced by KSVM.
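The iterative scheme outlined above can be sketched as follows. This is a hedged illustration, not the authors' implementation. It makes three assumptions that are not in the source: FLDA is stood in for by a nearest-class-mean rule under the pooled within-class scatter metric, the stopping test compares the fraction of labels that changed between two consecutive iterations with a threshold `eps` (the exact expressions of the paper's stopping rule are missing from this copy), and all names (`solve`, `flda_fit`, `flda_classify`, `iflda`, `max_iter`) are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def flda_fit(samples, labels, n_classes):
    """Return the class means and the pooled within-class scatter matrix."""
    d = len(samples[0])
    means = []
    scatter = [[0.0] * d for _ in range(d)]
    for c in range(n_classes):
        members = [s for s, lab in zip(samples, labels) if lab == c]
        if not members:
            raise ValueError("class %d has no training samples" % c)
        mu = [sum(col) / len(members) for col in zip(*members)]
        means.append(mu)
        for s in members:
            diff = [x - m for x, m in zip(s, mu)]
            for i in range(d):
                for j in range(d):
                    scatter[i][j] += diff[i] * diff[j]
    return means, scatter


def flda_classify(samples, means, scatter):
    """Assign each sample to the class whose mean is nearest in the metric
    defined by the inverse of the pooled within-class scatter."""
    labels = []
    for s in samples:
        dists = []
        for mu in means:
            diff = [x - m for x, m in zip(s, mu)]
            dists.append(sum(u * v for u, v in zip(diff, solve(scatter, diff))))
        labels.append(dists.index(min(dists)))
    return labels


def iflda(samples, init_labels, n_classes, eps=0.0, max_iter=20):
    """Re-train on the previous iteration's labels until the fraction of
    changed labels is at most eps (assumed stopping rule) or max_iter is hit."""
    labels = list(init_labels)
    for iteration in range(1, max_iter + 1):
        means, scatter = flda_fit(samples, labels, n_classes)
        new_labels = flda_classify(samples, means, scatter)
        changed = sum(a != b for a, b in zip(labels, new_labels)) / len(samples)
        labels = new_labels
        if changed <= eps:
            break
    return labels, iteration
```

In this sketch the initial labels play the role of the SVM-classified samples, and each pass replaces them with the labels produced by the classifier trained on the previous pass. Calling `iflda(samples, init_labels, n_classes)` returns the final labels together with the number of iterations used.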
Unfortunately, in most cases such an iterative KSVM does not improve classification performance, because KSVM uses the worst samples as support vectors to perform classification. This is not true for FLDA, mainly because FLDA is a statistics-based classifier whose performance can be improved through iterative processes that refine the data statistics of the training samples.

V. UNSUPERVISED CLASSIFICATION

It has been shown in [17] that potential target pixels can only be captured by high-order spectral statistics. To do so, the data must be pre-processed to remove the first- and second-order statistics prior to the implementation of PPI. Data sphering is a common practice used for this purpose, specifically for independent component analysis (ICA) [18]. It makes the sample mean zero to remove the first-order statistics and normalizes the data variance of each band image to unity to remove the second-order statistics. After the data are sphered, the PPI-SVM-IFLDA developed in Section IV is implemented to produce a final set of refined training samples for a follow-up supervised classifier. In our algorithm this supervised classifier is the same IFLDA used in PPI-SVM-IFLDA, the only difference being that the IFLDA is now implemented on the entire original data set instead of the sphered data.

Unsupervised PPI-SVM-IFLDA Classification Algorithm

1. Initial condition: Set the number of classes to the number of spectral bands.
2. Sphere the data.
3. Implement PPI-SVM-IFLDA on the sphered data to produce a desired set of training samples.
4. Use the training samples obtained in Step 3 to perform IFLDA on the original data to produce classification results.

Two comments are noteworthy.

1) Since multispectral imagery has a small number of spectral dimensions, PPI does not perform the dimensionality reduction that is generally done in hyperspectral imaging.
2) Due to the unavailability of prior knowledge, the number of classes to be classified is unknown. In this case, a general guideline is to set the number of classes to the number of spectral bands used for data acquisition. However, this can be adjusted once partial prior knowledge, such as that obtained by visual inspection, is available.

Fig. 1. Flow chart of an algorithm implementing PPI-SVM-IFLDA.

VI. EXPERIMENTS

A 4-band SPOT image scene shown in Fig. 2(a) was taken in 2008 over a mountain area in the central part of Taiwan, with a ground sampling distance of 10 m and a size of 32420 m × 37540 m. Since the image scene covers a very large remote site with many scattered areas, finding areas of interest is challenging. To resolve this dilemma, a Gaussian pyramid-based multi-resolution technique [15] was applied to pinpoint regions of interest (ROIs). The multiple-resolution images are shown in Fig. 2, where the 5th layer of the Gaussian pyramid was used for PPI-SVM-IFLDA. Fig. 3(a) shows the bright samples extracted by PPI with PPI counts greater than zero. These PPI-extracted samples were then used as support vectors for an RBF kernel-based SVM, with the kernel parameters used in [13], to classify the Gaussian pyramid-reduced image scene into the 4 classes shown in Fig. 3(b), where the number of classes, determined by the number of bands, was 4. The SVM-classified samples in Fig. 3(b) were then used as training samples for a follow-up supervised IFLDA. Fig. 4 shows the classification results of the IFLDA after 5 iterations of FLDA.

Fig. 2. Approximate SPOT image pyramids of a mountain area in central Taiwan. (a) 1st layer of Gaussian pyramid image (size: 3242 × 3754). (b) 2nd layer of Gaussian and Laplacian pyramid image (size: 1621 × 1877). (c) 3rd layer of Gaussian and Laplacian pyramid image (size: 811 × 939). (d) 4th layer of Gaussian and Laplacian pyramid image (size: 406 × 470). (e) 5th layer of Gaussian pyramid image (size: 203 × 235).

Fig. 3. A SPOT image of a mountain area. (a) PPI-extracted samples. (b) Data samples of 4 classes produced by SVM.

Fig. 4. Classification by IFLDA with 5 iterations.

Fig. 5. Ground truth of mountain obtained by Google Earth.

According to the ground truth provided in Fig. 5, the initial IFLDA classification results can be identified as vegetation, water and bare land. Therefore, an ROI marked by a square and shown in Fig. 6(a) was selected for further full-image-size classification by the same PPI-SVM-IFLDA classification algorithm described in Section V. The terrain in the selected ROI happened to be one that collapsed in a severe landslide disaster caused by a devastating earthquake, known as the 921 earthquake, which occurred on September 21, 1999 and in which a large area of the forest within this ROI was flattened and vanished. The selected ROI was then mapped back to the original full-size image, shown in Fig. 6(b) at the original resolution. Fig. 6(c) shows the bright samples extracted by PPI with PPI counts greater than zero. These PPI-extracted samples were then used as support vectors for an RBF kernel-based SVM (with kernel parameters as defined in [13]) to classify the image scene in Fig. 6(b) into the 4 classes in Fig. 6(d). The SVM-classified samples in Fig. 6(d) were further used as training samples for IFLDA to perform classification on the original ROI. Fig. 7 shows the classification results of the IFLDA after 5 iterations.

Fig. 6. A SPOT ROI image of a mountain area. (a) Mapping the selected ROI back to the original image. (b) ROI of 200 × 200 pixels. (c) PPI-extracted samples. (d) Training samples of 4 classes produced by SVM.

Since no prior knowledge was used in the unsupervised classification, Google Earth was used to verify the results. Although the mountain in Fig. 8 was covered by snow, the 4 classified classes could still be identified as bare land, vegetation, low-density grass and shadow. Interestingly, according to the ground truth obtained from the aerial view, these 4 classes were very close to those found by our algorithm.

To further demonstrate the effectiveness of the unsupervised PPI-SVM-IFLDA classification algorithm, the same experiment was repeated for another region, containing water, selected from Fig. 4. This region was mapped back to the original image in Fig. 9(a), with the ROI marked by a square, and Fig. 9(b) shows the ROI at its original resolution. Fig. 9(c) shows the bright samples extracted by PPI with PPI counts greater than zero. These PPI-extracted samples were then used as support vectors for an RBF kernel-based SVM to classify the image in Fig. 9(b) into the 4 classes in Fig. 9(d). The SVM-classified samples of the four classes in Fig. 9(d) were then used as training samples for IFLDA, whose classification results are shown in Fig. 10, with 4 iterations needed to complete the process. According to the Google Earth image shown in Fig. 11, the 4 classes obtained in Fig. 10 were identified as farm land, dam, vegetation and river bed, in close agreement with Fig. 11.

Based on the above experiments, it is evident that the proposed unsupervised PPI-SVM-IFLDA classification is indeed very effective without appealing to prior knowledge, which is a great advantage for unsupervised multispectral image classification. Three final comments are worthwhile.
1) Although many other unsupervised techniques available in the literature can be used for the same purpose, many of them run into two major issues: (1) the use of random initial conditions and (2) non-repeatable final results. For example, the C-means (CM) method, also known as ISODATA [14], is one of the most widely used unsupervised methods and was also tested in our experiments. Unfortunately, it was not effective. In most cases the results produced by the CM method were not good, and on some occasions it did not even converge. Even its best results were barely comparable to ours. Accordingly, its results are not included in the paper.

Fig. 7. Classification by PPI-SVM-IFLDA Classification Algorithm.

Fig. 8. Ground truth of ROI obtained by Google Earth.

2) The kernel parameters used by our SVMs were selected empirically, so the results presented in this paper can be further improved if optimal parameters are determined. Unfortunately, finding optimal values of these parameters is very difficult and challenging, and is beyond the scope of this paper. Nevertheless, a recent work [19] that determines these parameters automatically seems to provide a feasible solution to this problem.

Fig. 9. A SPOT ROI image of a mountain area. (a) Mapping the ROI to the original image. (b) ROI of 200 × 200 pixels. (c) PPI-extracted samples. (d) Classified samples of 4 classes produced by SVM.

Fig. 10. Classification by PPI-SVM-IFLDA with 4 iterations.

Fig. 11. Ground truth of ROI obtained by Google Earth.

3) Finally, the idea of using PPI is neat. PPI has been widely used in hyperspectral imaging for endmember extraction but has not been explored in multispectral imaging, since most pixels are heavily mixed and are not endmembers due to low spatial and spectral resolution.
Therefore, using PPI to find endmembers in multispectral images seems unrealistic and has not received interest in multispectral image processing. Instead, PPI is used here to find samples of interest. This paper is believed to be the first work to investigate and explore the potential of PPI in multispectral imaging, where it is used for extracting samples of interest rather than endmembers. However, because PPI is likely to find only a small number of samples, a classifier that does not need many training samples is required. The SVM is chosen primarily for this purpose.

VII. CONCLUSIONS

This paper investigates several issues of unsupervised classification in multispectral imagery and develops a new classification technique, the unsupervised PPI-SVM-IFLDA classification algorithm. It first implements PPI to find samples of interest that can be used as seed samples without prior knowledge. These PPI-found seed samples are then used as support vectors for KSVM to produce a large set of initial training samples. To mitigate the instability caused by the randomness of PPI and the sensitivity of the SVM to using PPI-found seed samples as support vectors, an iterative FLDA (IFLDA) is developed that implements FLDA iteratively to produce a final set of training samples. Several new contributions can be derived from the proposed unsupervised PPI-SVM-IFLDA classification algorithm. One is the use of PPI to find seed samples to be used as support vectors for an SVM. Here, it should be borne in mind that PPI is used to look for seed training samples, not endmembers as it was designed to do in hyperspectral imaging. This is quite different from the way PPI is usually thought of in hyperspectral imaging. To the authors' best knowledge, no work in the literature uses PPI in multispectral image applications.
Our proposed PPI-SVM-IFLDA-based algorithm may open a door to new approaches to unsupervised multispectral image classification and analysis. A second contribution is the development of an iterative FLDA (IFLDA), which can fix the instability problem caused by PPI and further improve classification. Although FLDA is a well-known technique in pattern classification, IFLDA is, to our knowledge, the first such work reported in the literature. It is also shown to be a convergent algorithm. The benefits derived from IFLDA are significant. As a matter of fact, two applications are currently under investigation. One is the reduction of the training sample pool. The other is the reduction of uncertainty caused by random sampling. IFLDA is particularly useful when the sample size is small.

A third contribution is the use of Gaussian pyramid images to obtain an initial classification from which ROIs in the original-resolution image can be located. This not only reduces computational complexity but also helps users find an appropriate ROI.

Finally and most importantly, our proposed PPI-SVM-IFLDA classification algorithm is an unsupervised technique that can be implemented without appealing to prior knowledge. According to our extensive experiments, it is practically useful for unsupervised multispectral image classification and works very effectively. Although there are also unsupervised methods, such as C-means (K-means, ISODATA) and the nearest neighbor rule, that can serve the same purpose as PPI, according to our extensive experiments none of them is effective. The main reason may be that all such methods are spatial domain-based clustering methods and require grouping the entire data into a finite number of regions from which seed samples can be found. Unfortunately, such spatial domain-found seed samples are generally smeared and cannot be used as training samples.
In addition, these methods require knowing in advance how many groups are to be clustered, where each group is considered as a spatial class, not a spectral class. By contrast, PPI is a spectral-based technique that looks for possible seed samples of interest from spectral characteristics, not spatial properties.

As mentioned in the introduction, although other endmember extraction techniques might be used for the same purpose, they are not applicable to multispectral images. One major reason that prevents such endmember extraction algorithms from being considered in this paper is that they require prior knowledge of the number of endmembers to be generated. The virtual dimensionality (VD) developed in [1], [20] is a spectral-based technique and cannot be used for this purpose because only a few spectral bands are available, in which case all bands should be used. Even then, this may not be enough, since there are usually more classes of interest than spectral bands. The situation is completely reversed in hyperspectral imagery, where there are more spectral bands than signatures or endmembers of interest, in which case VD and endmember extraction algorithms work. This is exactly one of the main contributions of this paper: it makes PPI applicable to multispectral imaging, while other endmember extraction algorithms are not. In other words, PPI is used here to find samples of interest to be used as seed samples, which are not necessarily endmembers. Since these PPI-found samples provide only seeds of training samples, a follow-up SVM is then used to find a set of initial training samples, with the PPI-found samples used as its support vectors. Finally, IFLDA is used to refine these training samples to obtain a final set of training samples. The chain combining PPI, SVM and IFLDA is thus a novel idea that demonstrates the great potential of PPI in multispectral imaging.
Most importantly, it also turns out that PPI may be the only endmember extraction algorithm available in hyperspectral imaging that can be used in multispectral imaging, because it does not need prior knowledge of the number of endmembers. As a result, the PPI-found samples are not necessarily endmembers but can be used as training samples. What is a disadvantage in hyperspectral endmember extraction thus becomes the main strength of PPI when it is used to find seed training samples for multispectral imaging.

REFERENCES

[1] C.-I Chang, Hyperspectral Imaging: Techniques for Spectral Detection and Classification. New York: Kluwer Academic/Plenum Publishers, 2003.
[2] J. M. Bioucas-Dias, A. Plaza, N. Dobigeon, M. Parente, Q. Du, P. Gader, and J. Chanussot, "Hyperspectral unmixing overview: Geometrical, statistical, and sparse regression-based approaches," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 5, no. 2, pp. 354–379, 2012.
[3] C. A. Lee, S. D. Gasster, A. Plaza, C.-I Chang, and B. Huang, "Recent developments in high performance computing for remote sensing: A review," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 4, no. 3, pp. 508–527, 2011.
[4] C.-I Chang, Hyperspectral Data Processing: Algorithm Design and Analysis. New York: Wiley, 2012.
[5] J. W. Boardman, F. A. Kruse, and R. O. Green, "Mapping target signatures via partial unmixing of AVIRIS data," in Summaries of JPL Airborne Earth Science Workshop, Pasadena, CA, 1995.
[6] C.-I Chang and A. Plaza, "Fast iterative algorithm for implementation of pixel purity index," IEEE Geosci. Remote Sens. Lett., vol. 3, no. 1, pp. 63–67, Jan. 2006.
[7] C.-I Chang, C.-C. Wu, and H. M. Chen, "Random pixel purity index algorithm," IEEE Geosci. Remote Sens. Lett., vol. 7, no. 2, pp. 324–328, Apr. 2010.
[8] M. E. Winter, "N-FINDR: An algorithm for fast autonomous spectral endmember determination in hyperspectral data," in Imaging Spectrometry V, Proc. SPIE, vol. 3753, pp. 266–277, 1999.
[9] W. Xiong, C.-I Chang, C.-C. Wu, K. K. Kalpakis, and H. M. Chen, "Fast algorithms to implement N-FINDR for hyperspectral endmember extraction," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 4, no. 3, pp. 545–564, 2011.
[10] J. M. P. Nascimento and J. M. Bioucas-Dias, "Vertex component analysis: A fast algorithm to unmix hyperspectral data," IEEE Trans. Geosci. Remote Sens., vol. 43, no. 4, pp. 898–910, Apr. 2005.
[11] C.-I Chang, C. Wu, W. Liu, and Y. C. Ouyang, "A growing method for simplex-based endmember extraction algorithms," IEEE Trans. Geosci. Remote Sens., vol. 44, no. 10, pp. 2804–2819, Oct. 2006.
[12] R. A. Schowengerdt, Remote Sensing: Models and Methods for Image Processing, 2nd ed. New York: Academic, 1997.
[13] S. Haykin, Neural Networks: A Comprehensive Foundation, 2nd ed. Englewood Cliffs, NJ: Prentice-Hall, 1999, ch. 6.
[14] R. O. Duda and P. E. Hart, Pattern Classification and Scene Analysis. New York: Wiley, 1973.
[15] P. J. Burt and E. H. Adelson, "The Laplacian pyramid as a compact image code," IEEE Trans. Commun., vol. COM-31, no. 4, pp. 532–540, 1983.
[16] V. N. Vapnik, Statistical Learning Theory. New York: Wiley, 1998.
[17] H. Ren, Q. Du, J. Wang, C.-I Chang, and J. Jensen, "Automatic target recognition for hyperspectral imagery using high order statistics," IEEE Trans. Aerosp. Electron. Syst., vol. 42, no. 4, pp. 1372–1385, Oct. 2006.
[18] A. Hyvarinen, J. Karhunen, and E. Oja, Independent Component Analysis. New York: Wiley, 2001.
[19] C.-H. Li, H.-H. Ho, Y.-L. Liu, C.-T. Lin, B.-C. Kuo, and J.-S. Tau, "An automatic method for selecting the parameter of the normalized kernel function to support vector machines," J. Inf. Sci. Eng., vol. 28, pp. 1–15, 2012.
[20] C.-I Chang and Q. Du, "Estimation of number of spectrally distinct signal sources in hyperspectral imagery," IEEE Trans. Geosci. Remote Sens., vol. 42, no. 3, pp. 608–619, Mar. 2004.

Hsian-Min Chen received the B.S. and M.S. degrees from Huafan University, Taipei, Taiwan, in 1999 and 2001, and the Ph.D. degree in the Department of Electrical Engineering at National Chung Hsing University, Taichung, Taiwan, in 2008. He is currently an Assistant Professor in the Department of Biomedical Engineering, HungKuang University, Taichung, Taiwan. His research interests include digital image processing, biomedical image processing, and medical informatics.

Chinsu Lin (M'11) received the B.S. degree from National Chung Hsing University (NCHU), Taichung, Taiwan, in 1986, and the M.S. and Ph.D. degrees from National Taiwan University (NTU), Taipei, Taiwan, in 1988 and 1996, all in forestry. Dr. Lin is a certified forester. He has been with the National Chiayi University (NCYU), Chiayi, Taiwan, since 1997 and is currently a professor in the Department of Forestry and Natural Resources. His research interests include multispectral and hyperspectral signal processing and pattern recognition. He is currently investigating applications of remote sensing techniques for forest ecosystem management; the major applications are biophysical and biochemical research, automatic delineation and mapping of individual tree canopy, volumetric modeling with airborne lidar data, forest biomass and productivity, carbon sequestration estimation, and individual-based tree species classification. He is also working to improve forest inventory accuracy by incorporating the techniques of hyperspectral and hyperspatial remote sensing, lidar remote sensing, GIS, and GPS.

Shih-Yu Chen (S'09–M'12) received the B.S. degree in electrical engineering from Da-Yeh University in 2005 and the M.S.E.E. degree from National Chung Hsing University, Taiwan, in 2010. He is currently pursuing the Ph.D. (EE) degree at the University of Maryland, Baltimore County. Mr.
Chen's research interests include medical images, remote sensing images, and vital sign signal processing.

Chia-Hsien Wen received the B.S. degree in computer science from Tamkang College, Taipei, Taiwan, in 1976, and the M.S. degree in applied mathematics and the Ph.D. degree in computer science from National Tsing Hua University, Hsinchu, Taiwan, in 1978 and 1994, respectively. He is currently an Associate Professor in the Department of Computer Science and Information Management, Providence University, Taiwan. Prior to joining Providence University in 2005, he was the Director of the Computing Center at Taichung Veterans General Hospital (TCVGH). His research interests include medical image processing, medical informatics, database management, and machine learning.

Clayton Chi-Chang Chen received the M.D. degree from China Medical College, Taichung, Taiwan, in 1981. Currently he is the Chairman of the Department of Radiology, Veterans General Hospital, Taichung, Taiwan, and an Associate Professor in the Department of Radiological Technology, Central Taiwan University of Science and Technology; the Department of Physical Therapy, Hungkuang University of Technology, Taichung, Taiwan; and the Department of Physical Therapy, National Yang-Ming University, Taipei, Taiwan. His research interests include biomedical image analysis, neuroradiology, CT, MRI, and functional MRI.

Yen-Chieh Ouyang (S'86–M'92) received the B.S.E.E. degree from Feng Chia University, Taiwan, in 1981, and the M.S. and Ph.D. degrees from the Department of Electrical Engineering, University of Memphis, Memphis, Tennessee, in 1987 and 1992, respectively. Dr. Ouyang joined the faculty of the Department of Electrical Engineering at National Chung Hsing University, Taiwan, in August 1992, where he is currently a Professor and the Director of the Multimedia Communication Laboratory.
His research interests include hyperspectral image processing, medical imaging, communication networks, network security in mobile networks, multimedia system design, and performance evaluation.

Chein-I Chang (S'81–M'87–SM'92–F'10) received the B.S. degree from Soochow University, Taipei, Taiwan, the M.S. degree from the Institute of Mathematics at National Tsing Hua University, Hsinchu, Taiwan, and the M.A. degree from the State University of New York at Stony Brook, all in mathematics. He also received M.S. and M.S.E.E. degrees from the University of Illinois at Urbana-Champaign and the Ph.D. degree in electrical engineering from the University of Maryland, College Park. He has been with the University of Maryland, Baltimore County (UMBC) since 1987 and is currently a Professor in the Department of Computer Science and Electrical Engineering. He was a visiting research specialist in the Institute of Information Engineering at the National Cheng Kung University, Tainan, Taiwan, from 1994 to 1995. His research interests include multispectral/hyperspectral image processing, automatic target recognition, and medical imaging. He has five patents and several pending on hyperspectral image processing. Dr. Chang received an NRC (National Research Council) senior research associateship award, 2002–2003, sponsored by the U.S. Army Soldier and Biological Chemical Command, Edgewood Chemical and Biological Center, Aberdeen Proving Ground, Maryland. He was also a distinguished lecturer chair at the National Chung Hsing University sponsored by the Ministry of Education in Taiwan, ROC, 2005–2006. He was a chair professor of the Environmental Restoration and Disaster Reduction Research Center and Department of Electrical Engineering, National Chung Hsing University, Taichung, Taiwan, and has been chair professor of remote sensing technology at the same institute since 2009.
Currently, he holds an international chair professorship at National Taipei University of Technology, Taipei, Taiwan. He was a distinguished visiting fellow/fellow professor sponsored by the National Science Council in Taiwan, 2009–2010. He was a plenary speaker for SPIE Optics+Applications, Remote Sensing Symposium, 2009. He was a keynote speaker at the User Conference of Hyperspectral Imaging 2010, Industrial Technology Research Institute (ITRI), Hsinchu, Taiwan; the 2009 Annual Meeting of the Radiological Society of the Republic of China, Taichung, Taiwan; the 2008 International Symposium on Spectral Sensing Research (ISSSR); the Conference on Computer Vision, Graphics, and Image Processing 2003 (CVGIP 2003), Kimen, Taiwan; and CVGIP 2012, Natou, Taiwan. He was guest editor of a special issue of Journal of High Speed Networks on Telemedicine and Applications (April 2000) and co-guest editor of another special issue of the same journal on Broadband Multimedia Sensor Networks in Healthcare Applications, April 2007. He was co-guest editor of a special issue on High Performance Computing of Hyperspectral Imaging for the International Journal of High Performance Computing Applications, December 2007, and of a special issue on Signal Processing and System Design in Health Care Applications for EURASIP Journal on Advances in Signal Processing, 2009. He has authored two books, Hyperspectral Imaging: Techniques for Spectral Detection and Classification (Kluwer Academic Publishers, 2003) and Hyperspectral Data Processing: Signal Processing Algorithm Design and Analysis (Wiley, 2012), edited two books, Recent Advances in Hyperspectral Signal and Image Processing (Transworld Research Network, India, 2006) and Hyperspectral Data Exploitation: Theory and Applications (Wiley, 2007), and co-edited with A. Plaza the book High Performance Computing in Remote Sensing (CRC Press, 2007).
He is currently working on a third book, Real Time Hyperspectral Image Processing: Algorithm Architecture and Implementation (Springer-Verlag, 2013). He was an Associate Editor in the area of hyperspectral signal processing for IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2001–2007. He is currently an associate editor of Artificial Intelligence Research and is on the editorial boards of Journal of High Speed Networks, Recent Patents on Mechanical Engineering, International Journal of Computational Sciences, Journal of Robotics and Artificial Intelligence, and Engineering and Open Remote Sensing Journal. Dr. Chang is a Fellow of SPIE.