A Fuzzy Clustering Technique for Adapting Tournament Selection

Authors : Abdelaziz El Moujahid, Khalid Jebari, Abdelaziz Bouroumi and Aziz Ettouhami

Abstract: Tournament selection is a popular selection scheme commonly used in genetic algorithms. This method depends largely on the size of the tournament which influences the quality of the final solution. In this study, a new technique is used to dynamically adjust the tournament size using the fuzzy c-means algorithm. The efficiency of the proposed technique is shown by using several benchmark test functions.

How to cite this article:

Abdelaziz El Moujahid, Khalid Jebari, Abdelaziz Bouroumi and Aziz Ettouhami, 2011. A Fuzzy Clustering Technique for Adapting Tournament Selection. International Journal of Soft Computing, 6: 62-67.

DOI: 10.3923/ijscomp.2011.62.67

URL: https://medwelljournals.com/abstract/?doi=ijscomp.2011.62.67

INTRODUCTION

There are many selection schemes for Genetic Algorithms (GA). Several factors have contributed to the increased use of tournament selection as a selection mechanism for GA. Tournament selection is simple to code and efficient for both parallel and nonparallel architectures. In addition, tournament selection can also adjust the selection pressure to adapt to different application domains (Miller and Goldberg, 1995). Yet, the selection pressure varies with the size of the tournament, k. Hence, the importance of choosing this parameter as demonstrated by several studies (Goldberg and Deb, 1991; Muhlenbein and Voosen, 1993; Back, 1994; Blickle and Thiele, 1995a, b; Poli and Langdon, 2006).

Choosing good k values is particularly difficult in unsupervised contexts i.e., in situations where no prior knowledge is available on the nature of the problem. Large values of k leads to strong selection pressure and convergence of the GA to a local optima. Small k values on the other hand may prevent individuals with good fitness values from being candidates for the tournament selection (Poli and Langdon, 2006).

In this study, we present a new technique which allows determining k in a dynamic way. This technique is mainly based upon the Fuzzy C-Means algorithm (FCM) and uses the normalized partition entropy as a validation criterion. The value of k is the optimal number of clusters given by this unsupervised clustering procedure.

Tournament selection: The tournament selection was attributed to Wetzel in an unpublished research then it has been studied by Brindle in his doctoral thesis in 1981 (Goldberg and Deb, 1991). Afterwards, several researchers became interested in this method (Blickle and Thiele, 1995a, b; Miller and Goldberg, 1995; Filipovic et al., 2000; Julstrom and Robinson, 2000; Sokolov and Whitley, 2005; Vajda et al., 2008). The standard tournament selection can simply be described by the following pseudo-code:

Pseudo-code of standard tournament selection

•	Randomly select k individuals from the population; //(k>1, k: Tournament size)
•	Select the individual who has the best fitness value from selected individuals in step 1
•	Repeat steps 1 and 2 N times//(N is the population size)

The balance between exploitation and exploration is critical for the behavior of the GA. This balance can be adjusted by selection pressure of the selection operator (Blickle and Thiele, 1995a). Several techniques can be used in order to evaluate the selection pressure. The followings are a mong the mostly used in practice.

The takeover time (Goldberg and Deb, 1991) is the number of generations required for a single best individual to fill the entire population by using only the selection operator:

(1)

Where:

K	=	The tournament size
N	=	The population size

The loss of diversity (Blickle and Thiele, 1995a) is the proportion of individuals that are not selected during selection:

(2)

Selection intensity was used by Back (1995), Muhlenbein and Voosen (1993) and Blickle and Thiele (1995b):

(3)

Takeover time, loss of diversity and selection intensity depend on the size of the tournament hence, the importance of this parameter to adjust the selection pressure.

The difficulty lies in how to determine k. Many approaches have been proposed to determine k. Some well-known methods are mentioned here:

Fixed Tournament Size (FTS) (Goldberg and Deb, 1991): Tournament size k is fixed according to the form k = 2ⁱ, i = 1,...,5.

Tournament selection based on weight: Julstrom and Robinson (2000) introduced a tournament selection based on weight. The probability that individual j is selected is:

(4)

where, w denotes the weight of individual j:

(5)

Where, r is the exponential normalization factor and N the population size.

Fine-grained tournament selection: Filipovic et al. (2000) showed that the standard tournament selection does not allow precise setting of the balance between exploration and exploitation. In their method, the tournament size is not fixed but takes a value among a set of values.

Deterministic Tournament-size Control (DTC) (Vajda et al., 2008): The tournament size is a function of generation t but their empirical formula depends on two parameters P₁ and P₂. The tournament size increases linearly from P₁-P₂ for the first 1000 generations and afterwards binds to P₂. They chose: p₁, p₂ in {2,7,17},

(6)

MATERIALS AND METHODS

Proposed method: To dynamically determine k, we precede the selection process by an unsupervised fuzzy clustering procedure whose main goal is to detect the presence of homogeneous groups within the population of candidate solutions. This procedure gives us an idea about the population diversity which can help adjust the balance between exploration and exploitation by increasing or decreasing the selection pressure. For this we use the number of detected clusters as a value of k.

The fuzzy clustering procedure which offers a better way of modeling and representing overlapping clusters, consists in two main steps. The first step is an optimization procedure which involves the FCM algorithm (Bezdek, 1981) for different values of the number of clusters, c. The second step is a module for validating results using the normalized partition entropy (Bezdek, 1981) defined as follows:

(7)

Where, U denotes the matrix of membership degrees. By varying the parameter c within an interval of reasonable values and repeating FCM for each c value within this interval, several partitions can be produced for each generation. Choosing the best partition and therefore the number of clusters to use is finally done through a validation phase that assesses and compares the quality of different partitions. This classification phase has two outputs:

•	The number of clusters
•	A prototype for each cluster

k is chosen as the number of clusters found at the clustering phase and the prototype became a member of the next generation.

We named the method Fuzzy Adapting Tournament Selection (FATS).

Pseudo-code for FATS

Benchmark functions: To illustrate the performance of the method we applied it to the problem of minimizing several test functions that are commonly used in practice as benchmark for genetic algorithms.

Ackley function:

(8)

•	Number of variables: n = 5
•	Search domain: -30≤x_i≤30
•	Multimodal
•	The global minimum: x* = (0, …, 0), f (x*) = 0

Cosinus mixture function:

(9)

•	Number of variables: n = 40
•	Search domain: -1≤x_i≤1
•	Unimodale
•	The global minimum: x* = (0, …, 0), f (x) = 0.1n

Goldstein and price function:

(10)

•	Number of variables: n = 2
•	Search domain: -2≤xi≤2
•	Multimodal
•	The global minimum: x* = (0, -1), f (x*) = 3

Griewank function:

(11)

•	Number of variables: n = 5
•	Search domain: -600≤xi≤600
•	Multimodal
•	The global minimum: x* = (0, …, 0), f (x*) = 0

Rastrigin function:

(12)

•	Number of variables: n = 10
•	Search domain: -5.12≤xi≤5.12
•	Multimodal
•	The global minimum: x* = (0, …, 0), f (x*) = 0

Rosenbrock function:

(13)

•	Number of variables: n = 2
•	Search domain: -30≤xi≤30
•	Unimodal
•	The global minimum: x* = (1, 1), f (x*) = 0

Schwefel function:

(14)

•	Number of variables: n = 2
•	Search domain: -500≤xi≤500
•	Multimodal
•	The global minimum: x* = (420.97, 420.97), f (x*) = 0

Easom function:

(15)

•	Number of variables: n = 3
•	Search domain: -100≤xi≤100
•	Unimodal
•	The global minimum: x* = (π, π, π), f (x*) = - 1

Weierstrass function:

(16)

•	Number of variables: n = 1
•	Search domain: -100≤xi≤100
•	Multimodal
•	The global minimum: x* = (π, π, π), f (x*) = - 1
•	b>1, 1≤ s≤ 2 in the case (1≤ i≤ 30, s = 1.7, b = 5)

NUMERICAL RESULTS AND DISCUSSION

We compared the technique (FATS) with:

•	Fixed Tournament Selection (FTS) with k in {4,8,16}
•	Deterministic Tournament size (DTC) (Eq. 6)

The researchers calculate for each methods (FTS, DTC and FATS) relative error and takeover time. For GA, we chose the following parameters:

•	Initialization is random with a population size N = 100
•	The coding we used is the real code
•	For the crossover operator we used Simulated Binary Crossover SBX (Deb and Agrawal, 1995)
•	Crossover probability: 0.97
•	For mutation we considered the polynomial mutation (Deb and Kumar, 1995)
•	Mutation probability: 0.03
•	For the stopping criterion we have chosen as the maximum number of generations t_max = 100, another stop criterion is used, percentage of optimum in population defined as follows

(17)

Where, N = population size. To evaluate the quality of the optimum provided by FTS, DTC and FATS, we measure the quality of the Percentage Relative Error (PRE):

(18)

The f^* in Eq. 18 is the optimum provided by the algorithm for each technique, f the actual optimum which is a priori known. Table 1 shows the results: The researchers note that FATS provides significantly better results compared to FTS and DTC.

Takeover time is used for the second comparison. As Table 2 shows takeover time for FTS and DTC are independent of the test function. The two techniques do not reflect the characteristics of the test functions that differ according to their distributions and their unimodal or multimodal nature. This is explained by the fact that the takeover time for DTC depends only on generation number and the takeover time for FTS is the same for each test function and for each generation while FATS has an acquisition time that depends on the nature of the test function.

Indeed, during the first generations, the population is diversifying, so the number of clusters is high, the takeover time is small and the selection pressure is high. Over generations if diversity decreases, takeover time increased and vice versa.

The graph in Fig. 1 shows the variation of takeover time for Ackley function, cosine mixture function and Griewank function. Selection pressure is the same whatever the test function for the FTS, it is a function of generation for the DTC whereas the FATS depends on the nature of the function.

Table 1:	Percentage relative errors

Table 2:	Takeover time for test functions


Fig. 1:	Takeover time for 3 techniques


Fig. 2:	Proportion of the best individual for FTS, DTC and FATS

The proportion of individuals having the best fitness value for the three functions in Fig. 2 show that the variation in FATS is according to the nature of the function. In Fig. 2a, Cosine mixture which is a unimodal function, we have a FATS net diversity compared to FTS and DTC for the first generations. After, we have a high selection pressure resulting in the deflection angle of its curve. For Weierstrass (Fig. 2b) and Goldstein-Price (Fig. 2c), Multimodal functions, FATS has a lower selection pressure compared to the FTS and the DTC.

This result corresponds to the model defined in (Back and Hoffmeister, 1991): Unimodal function require a high pressure of selection to force the search space into the gradient direction. By cons, a multimodal function requires an explorative character of the research so, a low pressure of selection is used so as not to push the GA to converge to a local optimum.

CONCLUSION

In this study, we proposed a new technique for dynamically determining the tournament size using a fuzzy clustering method. The basic idea behind this approach consists of adjusting the selection pressure according to the population diversity in order to strike a balance between exploitation and exploration: FATS adjust appropriately the selection pressure according to the search result by dynamically using different tournament size values at different stages of the GA taking into account the nature of the problem.

In addition, the comparison between the results obtained by FATS, FTS and DTC, confirms the efficiency of the proposed technique in dealing with functions of different nature. In future study, the researchers propose other technique to dynamically adjust other parameters of genetic algorithms.

Related Links

Journals By Subject

International Journal of Soft Computing

A Fuzzy Clustering Technique for Adapting Tournament Selection

How to cite this article: