Facial Expression Recognition Using Self Organizing Map

Dr. G. F. Sudha
Asst. Prof., Dept. of Electronics and Comm. Engg., Pondicherry Engg. College, Puducherry.
[email protected]

P. Jeyashree
Final year, M.Tech, Electronics and Comm. Engg., Pondicherry Engg. College, Puducherry.
[email protected]

Abstract

Facial expression recognition has attracted significant interest in the scientific community, as it plays a vital role in human-centered interfaces. Many applications require efficient facial expression recognition in order to achieve the desired results. In this paper, facial expression recognition using a Self Organizing Map (SOM) is presented. First, the Candide grid nodes are placed on the face landmarks of the image. Gabor filters then extract the feature vector values for the face recognition system. These values are used for classification and clustering in the SOM grid. The system adapts to the user's preferences by returning the more relevant images from the SOM grid, i.e., those onto which the responses have been most densely mapped. The main contribution of this work is the identification of a face image from a database along with its gesture. Results obtained on the test image database indicate that the system retrieves the relevant expressions accurately with a minimum number of training steps.

1. Motivation

Images of human faces are central to intelligent human-computer interaction. Face detection, recognition, tracking, pose estimation, and expression and gesture recognition are active research areas. Existing solutions [1, 2] use Geometric Deformation features for feature extraction and SVMs for clustering. Given a single image or a sequence of images, the goal is to identify human faces regardless of their positions, scales, orientations, poses, expressions, occlusions and lighting conditions. This is a challenging problem because faces are nonrigid objects with a high degree of variability. In this paper, the SOM has been used as the clustering algorithm. The simplicity and speed of computation of the SOM make the recognition process quicker and more accurate.

2. System Architecture

The architecture of the content-based image retrieval system used for face detection and recognition is shown in Figure 1.
Figure 1. System architecture (test/query image → feature extraction → similarity measure against the SOM grid of classified database images → retrieved images)

International Conference on Computational Intelligence and Multimedia Applications 2007, 0-7695-3050-8/07 $25.00 © 2007 IEEE, DOI 10.1109/ICCIMA.2007.164

Firstly, the images in the database are processed to extract the features that form the metadata information. These are used to index the images in the SOM grid. Next, the query image is analyzed to extract the visual features, which are used to retrieve similar images from the database. The similarity of two images is measured by computing the distance between their feature vectors. The retrieval system returns the images having maximum similarity, i.e., those whose distance from the query image is below some defined threshold.

3. Self Organizing Map

The SOM [3, 4] uses an ordered structure of neurons. The neurons are usually located at the nodes of a two-dimensional grid with rectangular or hexagonal cells. The neurons also interact with each other, and the distance between the neurons on the map lattice governs the degree of interaction. The SOM learning process consists of sequential corrections of the neurons. At every step of the learning process, a random vector is chosen from the initial data set and the best-matching neuron coefficient vector is identified; the winner is the neuron most similar to the input vector. The SOM learning process is given by the set of equations (1):

    ||X - W_c|| = min_i { ||X - W_i|| }
    w_i(t+1) = w_i(t) + Δw_i(t)
    Δw_i(t) = α(t) h_{c,i}(t) [x(t) - w_i(t)]
    h_{c,i}(t) = exp(-d(i,c)^2 / β(i,c)^2)        (1)

where X is the input at a particular instant, and W_c and W_i are the weights of the winning node and of the node under consideration.
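The learning process in equations (1) can be sketched in plain Python. This is a minimal illustrative sketch, not the paper's implementation: the grid size and the exponential decay schedules assumed for α(t) and the neighborhood width β are arbitrary choices for demonstration.

```python
import math
import random

def train_som(data, grid_h=8, grid_w=8, n_steps=1000, seed=0):
    """Train a 2-D SOM lattice on a list of feature vectors (equations (1))."""
    rng = random.Random(seed)
    dim = len(data[0])
    # One weight vector w_i per node of the 2-D lattice, randomly initialized
    weights = [[[rng.random() for _ in range(dim)]
                for _ in range(grid_w)] for _ in range(grid_h)]
    for t in range(n_steps):
        x = rng.choice(data)  # random input vector x(t) from the data set
        # Winner c: the node minimizing ||X - W_i||
        c = min(((i, j) for i in range(grid_h) for j in range(grid_w)),
                key=lambda n: sum((xk - wk) ** 2
                                  for xk, wk in zip(x, weights[n[0]][n[1]])))
        alpha = 0.5 * math.exp(-t / n_steps)  # decaying learning rate alpha(t)
        beta = 1.0 + (max(grid_h, grid_w) / 2) * math.exp(-t / n_steps)
        for i in range(grid_h):
            for j in range(grid_w):
                # Neighborhood h_{c,i}(t) = exp(-d(i,c)^2 / beta^2)
                d2 = (i - c[0]) ** 2 + (j - c[1]) ** 2
                h = math.exp(-d2 / beta ** 2)
                # Update: w_i(t+1) = w_i(t) + alpha(t) h_{c,i}(t) [x(t) - w_i(t)]
                weights[i][j] = [wk + alpha * h * (xk - wk)
                                 for xk, wk in zip(x, weights[i][j])]
    return weights

def best_match(weights, x):
    """Lattice coordinates of the node whose weight vector is closest to x."""
    grid_h, grid_w = len(weights), len(weights[0])
    return min(((i, j) for i in range(grid_h) for j in range(grid_w)),
               key=lambda n: sum((xk - wk) ** 2
                                 for xk, wk in zip(x, weights[n[0]][n[1]])))
```

The same best_match function serves both training (winner selection) and retrieval: a query feature vector is mapped to its best-matching node, and images indexed near that node are returned.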
h_{c,i}(t) is referred to as the neighborhood function, d(i,c) is the Euclidean distance from node i to the winning node c, α(t) is the learning rate and β(i,c) is the neighborhood size at time t. After the winning neuron is found, the neural network weights are corrected: the winning unit and its neighbors adapt to represent the input by moving their reference vectors towards the current input.

4. Feature Based Facial Expression Recognition Using Gabor Filter

The facial expression recognition system is composed of two subsystems: one for Candide grid node information extraction and one for grid node information classification. The grid node information extraction is performed by a tracking system, while the grid node information classification is performed by a SOM system. The most significant nodes, those around the eyes, eyebrows and mouth, should be chosen, since these regions are responsible for the formation of facial deformations. The facial expressions can then be identified with the help of the Candide grid nodes thus formed. The feature vectors are obtained by applying the Gabor filter [5] to the above grid regions, and these are placed in the classifier grid.

5. Experimental Results

The facial expression recognition system was tested on a database of 360 facial images, comprising 38 male and 22 female subjects with all six expressions, on a P4 system with a 2.5 GHz processor and 256 MB RAM. Out of the 38 male and 22 female subjects, 10 and 8 respectively are taken as the training set, and the rest are used for recognition. The images were convolved using both types of filters at 4 spatial frequencies and 6 orientations; six angular orientations from 0 to 150 degrees in 30-degree steps were used. The result of the initialization procedure, when seven nodes (four for the inner and outer corners of the eyes and three for the upper lip) are placed by the user, can be seen in Figure 2.
Figure 2. Grid initialization results for the (a) neutral, (b) smile, (c) surprise, (d) sad, (e) fear and (f) anger expressions

Figure 3. GUI for the (a) neutral, (b) smile, (c) surprise, (d) sad, (e) fear and (f) anger expressions

Figure 3 shows the GUI for taking an image under test and finding the gesture of that particular image, by first marking the Candide grid nodes and then using the SOM classifier for gesture identification.

6. Conclusions

This paper describes a face recognition system that identifies facial expressions such as neutral, smile, sad, surprise, fear, and anger. The system uses the SOM as the classifying algorithm to make the retrieval process simpler and less time consuming. The clustering of images is done with the feature vector values obtained using the Gabor filter. The system developed uses a minimum number of images for training. It has the added advantage of identifying the gesture of a facial image that is not present in the database. The system has useful applications in areas such as health condition monitoring, customer satisfaction studies and lie detection.

7. References

[1] I. Kotsia and I. Pitas, "Facial expression recognition in image sequences using Geometric Deformation Features and Support Vector Machines", IEEE Transactions on Image Processing, vol. 16, no. 1, January 2007.
[2] Y. Zhang and Q. Ji, "Active and dynamic information fusion for facial expression understanding from image sequences", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 5, pp. 699-714, May 2005.
[3] T. Kohonen, J. Hynninen, J. Kangas and J. Laaksonen, "SOM_PAK: The Self-Organizing Map Program Package", Technical Report A31, Helsinki University of Technology, 1996.
[4] T. Kohonen, "The self-organizing map", Proceedings of the IEEE, vol. 78, no. 9, 1990.
[5] N. Rose, "Facial expression classification using Gabor and log-Gabor filters", Proceedings of the International Conference on Automatic Face and Gesture Recognition (FGR'06), 2006.