Novel Technique using Optimal Salp Swarm Based Feature Fusion with Linear Multi k-SVM Classifier on Moving Object Imaging

Authors : G. Jemilda and S. Baulkani

Abstract: To develop a new automatic moving object segmentation and classification system, the Local Shape (LoS) and Histogram of Oriented Gradients (HoG) features are extracted from the level-1 and level-2 sub-bands. These extracted features are then fused by the feature-level Fusion using Salp Swarm Optimization (FFSSO) algorithm. For convenience, the fused features are hereafter called the w-LoSHoG descriptor. Moreover, the feature extraction technique is applied to the Least Enclosing Rectangle (LER) of the segmented object to increase the processing speed. The salp swarm algorithm is intended to reduce the computational load of the proposed classifier by removing redundant and irrelevant features from the feature vector. In addition, a growing number of training samples from similarly shaped classes can cause misclassification. Thus, a new layered kernel-based Support Vector Machine (k-SVM) classifier is developed by integrating the k-neural network classifier and the layered SVM classifier. Because the features are high dimensional, a single classifier is difficult to apply. To ease the computational load, this multi-classifier is integrated with a shadow elimination technique to classify the object categories of an intelligent transportation system, such as motorcycles, bicycles, cars and pedestrians.

How to cite this article:

G. Jemilda and S. Baulkani, 2020. Novel Technique using Optimal Salp Swarm Based Feature Fusion with Linear Multi k-SVM Classifier on Moving Object Imaging. Asian Journal of Information Technology, 19: 21-27.

DOI: 10.36478/ajit.2020.21.27

URL: https://medwelljournals.com/abstract/?doi=ajit.2020.21.27

INTRODUCTION

Intelligent transportation systems currently receive considerable attention in both research and commerce. A smarter transportation system is achieved by minimizing congestion, accidents and injuries. To further enhance the reliability, efficiency and safety of the transportation subsystem, improved transportation system management techniques have been developed. At present, intelligent transportation systems are managed effectively using one key technology, the wireless traffic video surveillance system. Tasks such as vehicle detection, vehicle tracking, vehicle classification and vehicle recognition are significant factors in the design of an efficient traffic video surveillance system^[1-3]. The first step in developing a traffic video surveillance system is to design an automated vehicle detection process. Essentially, this can be achieved by extracting the necessary details about moving vehicles and applying these details for correct classification and recognition. Traffic conditions can be monitored and analyzed accurately by classifying the moving objects into categories such as bicycles, motorcycles, pedestrians and cars. These categories also support the retrieval of an object from the video frames. In general, the performance of an overall classification system is affected by two significant factors: feature extraction from candidate objects and the classifier model. Over the past decades, many studies have been proposed for the detection of intelligent transportation application categories, namely cars, bicycles, motorcycles and pedestrians^[4-9]. In the last decade, extensive research has been done on moving object detection and tracking. Object tracking is difficult when the object structure, camera or scene is unconstrained, when objects move suddenly and when objects change quickly.
However, detecting an object in a video sequence and tracking it remain a challenge for researchers. Histograms of Oriented Gradients (HoG), Local Binary Patterns (LBP), Haar-like features and Haar wavelets are the common features included by Wang et al.^[4]. In previous works by Jing et al.^[10] and Mirjalili et al.^[11], the combination of the two feature descriptors, HoG-LBP, retains the strengths of each descriptor; hence, the detection performance is considerably improved. However, the HoG-LBP combination increases both the computational load of the classifier and the feature dimension. To rectify these issues, a new automatic moving object segmentation and classification system is proposed. This novel approach includes a new feature descriptor, feature selection using a new optimization algorithm and a new layered k-SVM classifier incorporating the shadow elimination technique, which reduces the complexity effectively. The main contributions of this study can be described as follows.

The use of the Least Enclosing Rectangle (LER) of a segmented object reduces the time consumed in extracting high-dimensional feature descriptors such as LBP and HoG.

The use of the new layered k-SVM classifier and the Shadow Elimination (SE) technique increases the classification accuracy.

Applying the classification technique only to a segmented object reduces the processing time and makes real-time operation feasible.

MATERIALS AND METHODS

The proposed research work focuses on the construction of an integrated moving object detection and classification system for better discrimination in real-time applications (e.g., intelligent transportation systems and human motion capture). The system is shown in Fig. 1.

Construction of LER window: The proposed object segmentation technique is implemented in the RGB color space and incorporates shadow elimination. The process has five basic steps. First, moving pixels are identified by computing the frame difference between the current and the previous frames. Second, the constituent pixels of the registered background regions are updated. Third, the moving objects are distinguished from the background region through the background difference calculation. Beyond the color-based modifications of the approach used for gray images, these first three steps also introduce a new function for registering a new object as a background region. Fourth, the shadow effect on the segmented object is reduced. Finally, in the fifth step, the vertical and horizontal histograms of the segmented image are determined to obtain the position of the LER window of an object. The complete LER window is acquired once the object has been perfectly segmented. Subsequently, a tracking algorithm is employed to obtain the LER window of the moving object.
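The first and fifth steps above can be illustrated with a minimal sketch. It is not the authors' MATLAB implementation: it assumes gray-level frames stored as nested lists instead of RGB frames, omits background registration and shadow reduction, and the function names and the threshold of 25 are hypothetical.

```python
def frame_difference_mask(current, previous, threshold=25):
    """Mark pixels whose gray-level change between two frames exceeds threshold.

    The threshold value is an illustrative assumption, not taken from the paper.
    """
    return [
        [1 if abs(c - p) > threshold else 0 for c, p in zip(cur_row, prev_row)]
        for cur_row, prev_row in zip(current, previous)
    ]


def ler_window(mask):
    """Return (top, left, bottom, right) of the least enclosing rectangle.

    The rectangle is read off the horizontal and vertical projection
    histograms of the binary mask. Returns None if no pixel is set.
    """
    row_hist = [sum(row) for row in mask]        # horizontal projection
    col_hist = [sum(col) for col in zip(*mask)]  # vertical projection
    rows = [i for i, count in enumerate(row_hist) if count > 0]
    cols = [j for j, count in enumerate(col_hist) if count > 0]
    if not rows or not cols:
        return None
    return rows[0], cols[0], rows[-1], cols[-1]


# Toy example: a bright 3x3 object appears in rows 1-3, columns 2-4.
previous = [[10] * 6 for _ in range(5)]
current = [row[:] for row in previous]
for r in range(1, 4):
    for c in range(2, 5):
        current[r][c] = 200

mask = frame_difference_mask(current, previous)
print(ler_window(mask))  # (1, 2, 3, 4)
```

Reading the window from the two projection histograms needs only one pass over the mask, which is in line with the stated goal of keeping segmentation cheap.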

Preprocessing: To better distinguish the features among the four classes of moving objects (i.e., pedestrians, cars, bicycles and motorcycles), a weight mask for the LER window is introduced.

Feature extraction: In the feature extraction step, Local Shape (LoS) and HoG features are extracted effectively.

Feature selection using BSSA: In this approach, all solutions are constrained to the binary values {0, 1}. Optimal features are selected from each video frame by defining a solution as a one-dimensional vector in which each cell holds 0 or 1. The length of the vector equals the number of w-LoSHoG features in a video frame. A value of 1 indicates that the feature is selected, while a value of 0 indicates that it is not. The selected optimal features are sent to the new layered k-SVM classifier for object classification.
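The binary solution encoding can be sketched as follows. The snippet only shows how a candidate solution vector is applied to a feature vector; the salp swarm search that produces the solution is not shown, and the function name and sample values are hypothetical.

```python
def select_features(features, solution):
    """Keep the features whose matching cell in the binary solution is 1."""
    if len(features) != len(solution):
        raise ValueError("solution length must equal the number of features")
    if any(bit not in (0, 1) for bit in solution):
        raise ValueError("solution cells must be 0 or 1")
    return [value for value, bit in zip(features, solution) if bit == 1]


# Illustrative feature vector and one candidate solution.
features = [0.12, 0.80, 0.05, 0.44, 0.91]
solution = [1, 0, 0, 1, 1]
print(select_features(features, solution))  # [0.12, 0.44, 0.91]
```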


Fig. 1:	Block diagram of the proposed approach

Layered k-SVM classifier with SE technique: A newly developed layered k-SVM classifier is employed to classify the four classes of moving objects, namely cars, bicycles, motorcycles and pedestrians. The classification is performed in two stages. In the first stage, bicycles and motorcycles are assigned to a single two-wheeled object class because of their similar shapes.

The LER window of an object is resized to obtain a consistent feature dimension for classifying objects of different sizes. A scaling factor is applied to the width and length of the LER window so that the maximum size of the rescaled LER (RLER) window is 128×62. If the original window of an object already satisfies this constraint, resizing is not necessary. If the object in the RLER window is classified as a two-wheeled object, a second classification is performed to decide whether it is a motorcycle or a bicycle.
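A minimal sketch of the rescaling rule follows. It rests on two assumptions that the text does not confirm: that 128 bounds the length and 62 bounds the width, and that a single scaling factor is used so the aspect ratio is preserved. The function name is hypothetical.

```python
MAX_LENGTH = 128  # assumed bound on the window length
MAX_WIDTH = 62    # assumed bound on the window width


def rler_size(length, width, max_length=MAX_LENGTH, max_width=MAX_WIDTH):
    """Return the (length, width) of the rescaled LER window.

    Windows that already fit are returned unchanged; larger windows are
    shrunk by one common scaling factor.
    """
    if length <= 0 or width <= 0:
        raise ValueError("window dimensions must be positive")
    if length <= max_length and width <= max_width:
        return length, width
    scale = min(max_length / length, max_width / width)
    return max(1, int(length * scale)), max(1, int(width * scale))


print(rler_size(100, 50))   # (100, 50)
print(rler_size(256, 124))  # (128, 62)
print(rler_size(256, 62))   # (128, 31)
```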

However, the Shadow Elimination (SE) technique incorporated with this classifier allows three classes of objects to be classified in the initial stage itself. In other words, the SE technique can reduce the shadow effects on the segmented object. In this section, the SE technique is used as a clue to distinguish the moving objects as fast as possible, instead of being applied to segment the moving object before classification. For instance, cars generate larger shadow areas than motorcycles; thus, the shadow effect makes it easier to identify whether the moving object is a bicycle or a motorcycle. Further, the proposed multi-SVM classifier is trained using 2N training samples. During testing, if the output of the proposed multi-SVM classifier is lower than zero, the object is recognized as a bicycle; otherwise (i.e., output>0), it is classified as a motorcycle.
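The two-stage decision can be sketched as below. This is an illustration only: a linear decision function stands in for the trained classifier, the weights and bias are hypothetical, and the first-stage label is taken as given. The text does not say how an output of exactly zero is handled; the sketch assigns it to the motorcycle branch.

```python
def linear_svm_output(weights, bias, features):
    """Signed output of a linear decision function w . x + b."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def classify_two_wheeler(weights, bias, features):
    """Second stage: negative output means bicycle, otherwise motorcycle."""
    output = linear_svm_output(weights, bias, features)
    return "bicycle" if output < 0 else "motorcycle"


def layered_classify(first_stage_label, weights, bias, features):
    """Run the second stage only for objects labelled as two-wheeled."""
    if first_stage_label != "two-wheeled":
        return first_stage_label
    return classify_two_wheeler(weights, bias, features)


# Hypothetical second-stage parameters and feature vectors.
weights, bias = [1.0, -1.0], 0.0
print(layered_classify("two-wheeled", weights, bias, [0.2, 0.9]))  # bicycle
print(layered_classify("two-wheeled", weights, bias, [0.9, 0.2]))  # motorcycle
print(layered_classify("car", weights, bias, [0.9, 0.2]))          # car
```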

RESULTS AND DISCUSSION

This section details the experimental outcomes and performance analysis of the proposed approaches.

Experimental setup: The performance of the proposed approaches is tested using objects segmented from four videos of various scenes. The implementation is done in MATLAB. The experimental results are evaluated and the performance is analyzed using True Positive Rate (TPR), False Positive Rate (FPR), Precision (P), Recall (R) and Accuracy (A). The size of each captured image in the video is fixed to 740×480 pixels. If the length or width of an object in the captured image is smaller than 15 pixels, the object is difficult to distinguish. Furthermore, feature extraction in this research relies on the pixels of the object in an image; therefore, the width and length of the object in the captured image should be large enough. The minimum number of pixels of interest in an LER window was fixed to 18. Finally, several conventional features are used as baselines to analyze the classification performance and dimensionality reduction of the proposed FFSSO optimization approach and multi k-SVM classifier.
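The size constraints above can be expressed as a simple check. The text does not state what happens to objects that fail these constraints, so treating them as unusable for classification is an assumption, and the function and constant names are hypothetical.

```python
MIN_SIDE = 15             # minimum object length and width in pixels
MIN_INTEREST_PIXELS = 18  # minimum number of pixels of interest in an LER window


def is_usable_object(length, width, interest_pixels):
    """Return True if the object is large enough for feature extraction."""
    return (
        length >= MIN_SIDE
        and width >= MIN_SIDE
        and interest_pixels >= MIN_INTEREST_PIXELS
    )


print(is_usable_object(40, 20, 300))  # True
print(is_usable_object(40, 12, 300))  # False
```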

Framework validation: In this research, each of the pedestrian, car, bicycle and motorcycle classes includes M training samples to train the proposed multi k-SVM classifier. The value of M was fixed to 2000 for each class. Figure 2 depicts the training samples collected for each class under different scenes.

Further, the LER windows of the moving objects and the scenes of the four test videos, which differ from the training video, are shown in Fig. 3. The four videos have durations of 1550, 1450, 1807 and 1365 s. Different backgrounds can be observed in these four videos, and the moving objects are captured from a side view. Using the proposed projection-based segmentation approach, the objects are segmented from the four videos with durations of 1323, 1244, 988 and 300 s, respectively. The proposed classification approach is tested using these segmented objects. The backgrounds registered in different frames by the update and background registration step are also depicted. If an object remains stable for a certain period of time, it is registered as new background. As soon as the object starts moving, its movement is identified using the frame difference technique. Then, the background is registered as a new background region.


Fig. 2:	Training samples collected for each class in different scenes

Figure 3b shows two different objects, a pedestrian and a bicycle, entering the scene at the same time. In this case, the bicycle is occluded by the pedestrian. Initially, the LER window covers the pedestrian alone. With the projection-based segmentation approach, however, the bicycle is segmented as soon as the occlusion ends and a new LER window is created (indicated in the second and third columns).

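Evaluation metrics: To assess the performance of the proposed approaches, the evaluation metrics True Positive Rate (TPR), False Positive Rate (FPR), Precision (P), Recall (R) and Accuracy (A) were adopted. They are defined in Eq. 1-5:

$$\mathrm{TPR} = \frac{TP}{TP + FN} \tag{1}$$

$$\mathrm{FPR} = \frac{FP}{FP + TN} \tag{2}$$

$$P = \frac{TP}{TP + FP} \tag{3}$$

$$R = \frac{TP}{TP + FN} \tag{4}$$

$$A = \frac{TP + TN}{TP + TN + FP + FN} \tag{5}$$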

Where:

TP	=	The total number of true positive pixels
TN	=	The total number of true negative pixels
FP	=	The total number of false positive pixels
FN	=	The total number of false negative pixels

Precision is the percentage of identified pixels that actually correspond to the moving object. Recall is the percentage of pixels corresponding to the moving object that are correctly identified. Accuracy is the percentage of pixels in the RLER window that are correctly detected or correctly rejected. To detect the objects against the background accurately, precision, recall, TPR and accuracy should be high while FPR should be low.
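The five metrics can be computed from the pixel counts as sketched below. The sketch uses the standard confusion-matrix definitions, which are assumed here to match Eq. 1-5. The function name and the sample counts are hypothetical, and a ratio with a zero denominator is reported as 0.0 by convention.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return TPR, FPR, precision, recall and accuracy from pixel counts."""

    def ratio(numerator, denominator):
        # Avoid division by zero when a class is absent from the window.
        return numerator / denominator if denominator else 0.0

    return {
        "TPR": ratio(tp, tp + fn),
        "FPR": ratio(fp, fp + tn),
        "P": ratio(tp, tp + fp),
        "R": ratio(tp, tp + fn),
        "A": ratio(tp + tn, tp + tn + fp + fn),
    }


# Illustrative counts only, not results from the paper.
metrics = classification_metrics(tp=80, tn=90, fp=10, fn=20)
print(metrics["TPR"], metrics["FPR"], metrics["A"])  # 0.8 0.1 0.85
```

Under these definitions TPR and recall coincide, so reporting both adds no independent information.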

Fig. 3:	Classification results of the proposed technique on four test videos (a) Video I (b) Video II (c) Video III and (d) Video IV

Comparison with conventional classifiers: The performance of the automatic moving object classification system (i.e., multi k-SVM with Salp swarm algorithm (multi k-SVM+SSA)) is analyzed by comparing it with hybrid classifiers such as convolutional neural network with genetic algorithm (CNN+GA), feed-forward neural network with Bayesian classifier (FFN+BC) and conventional neural network with back propagation algorithm (CNN+BP). The performance of the proposed system is analyzed by increasing the training data. Figure 4a-e depicts the TPR, FPR, Precision (P), Recall (R) and Accuracy (A) of the proposed system on moving object classification. As the training data increased, the proposed classification system achieved better performance than the other hybrid classifiers. This improvement is observed because the multi k-SVM classifier is developed by integrating the k-NN and SVM classifiers. For multi-class classification, the k-NN classifier is preferred because it classifies wholly on the basis of the distance between the training data and the test sample. Further, the newly fused w-LoSHoG feature descriptor is high dimensional, and the SVM classifier is known to behave well on high-dimensional data.


Fig. 4:	Performances of classifiers (a) TPR vs. training data (b) FPR vs. training data (c) Precision rate vs. training data (d) Recall rate vs. training data and (e) Accuracy vs. training data

Because of these advantages, the two classifiers, k-NN and SVM, are integrated to develop the layered k-SVM classifier. The SE technique is also incorporated into this classifier to avoid misclassification of training samples with similar appearance.
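The distance-based behaviour attributed to k-NN can be illustrated with a minimal sketch. It assumes that k-NN denotes the k-nearest-neighbour rule, as the distance-based description suggests, and it uses Euclidean distance with majority voting. It does not reproduce how the authors combine k-NN with the SVM; the function name, the value k=3 and the two-dimensional toy samples are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Label a query by majority vote among its k nearest training samples.

    train is a list of (feature_vector, label) pairs; distances are Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy two-dimensional feature vectors for two classes.
train = [
    ((0.0, 0.0), "pedestrian"), ((0.0, 1.0), "pedestrian"), ((1.0, 0.0), "pedestrian"),
    ((5.0, 5.0), "car"), ((5.0, 6.0), "car"), ((6.0, 5.0), "car"),
]
print(knn_predict(train, (0.2, 0.1)))  # pedestrian
print(knn_predict(train, (5.5, 5.2)))  # car
```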


CONCLUSION

In this study, an effective moving object segmentation and classification approach was presented. Initially, a projection-based segmentation method was proposed for object segmentation. The LoS and HoG features are extracted from the segmented object using the Haar DWT feature extraction process. A new feature descriptor called w-LoSHoG was developed by the FFSSO optimization approach. The Salp Swarm Algorithm (SSA) was applied to find an optimal weight score for fusing the extracted LoS and HoG features; hence, the dimensionality and the processing time were reduced. Finally, a new multi k-SVM classifier was developed by integrating the k-neural network classifier and the layered SVM classifier. To ease the computational load, this multi-classifier was developed to classify the object categories of an intelligent transportation system, such as motorcycles, bicycles, cars and pedestrians. The experimental results showed the effectiveness of the proposed methods compared with existing conventional single and hybrid classifiers in terms of TPR, FPR, precision rate, recall rate and accuracy. As future research, the degradation of video frames can be reduced by using improved lossless video surveillance techniques and, instead of classifying only large objects, small objects and their shadows can be used as input for classification.
