Journal of Animal and Veterinary Advances

Year: 2009
Volume: 8
Issue: 12
Page No. 2626 - 2630

Genomic Scan to Detect QTL Using SNP Markers for Simulated Data by Regression Analysis in Half-Sib Design

Authors : Hasan Koyun, Hayrettin Okut and Seyrani Koncagül

Abstract: The aim of the present study was to conduct a genome-wide screening for QTL (Quantitative Trait Loci) detection using the simulated phenotype and genotype data sets obtained from the QTL-MAS workshop 12. A genome scan was carried out in 45 half-sib families to identify QTL influencing a hypothetical trait. On each of six chromosomes, each carrying 1000 SNP loci, 11 informative markers at least 2 cM apart from one another were chosen based on PICs with the highest χ2-statistic. Half-sib data were pooled and used simultaneously in the analyses, which were conducted for each chromosome separately. Data were analyzed by generating an F-statistic every 1 cM on a linkage map by regression of phenotypes on the probabilities of inheriting an allele from the sire. Permutation tests for chromosome-wide significance thresholds were carried out over 1000 iterations. Across the families, significant putative QTL were detected on chromosomes 1 (27 cM), 2 (36 cM), 3 (18 cM), 4 (0 cM) and 5 (96 cM) (α = 0.01 and α = 0.05). No QTL exceeding the chromosome-wide significance level of p<0.05 or p<0.01 was detected on chromosome 6.

How to cite this article:

Hasan Koyun, Hayrettin Okut and Seyrani Koncagül, 2009. Genomic Scan to Detect QTL Using SNP Markers for Simulated Data by Regression Analysis in Half-Sib Design. Journal of Animal and Veterinary Advances, 8: 2626-2630.

INTRODUCTION

SNPs (Single Nucleotide Polymorphisms) are DNA sequence variations at a single nucleotide position (A, T, C or G) in the genome that differ between members of a species. SNPs are generally bi-allelic genetic markers that tend to be less polymorphic than RFLP (Restriction Fragment Length Polymorphism), SSR (Simple Sequence Repeat) and other multi-allelic genetic markers, although they are the most abundant class of DNA polymorphisms (Nielsen, 2000; Heaton et al., 2002).

As genetic markers, SNPs are essential for the development of genetic tests for economically important traits in livestock and for complex disease-related traits in other species as well as in humans. Because they are stable across many generations, SNPs can be used to establish ancestral relationships among individuals. Taking these relationships into account, one can reconstruct haplotypes for any region of interest in the genome (Cervino et al., 2005; Stone et al., 2005).

Significant associations of SNP marker alleles with the phenotype in question suggest either linkage of the marker to a QTL (Quantitative Trait Locus) or that the SNP itself lies within a QTL with a major effect on the phenotype. Additionally, in the last few years, there has been an increase in population-based studies to identify genomic regions that are associated with common diseases in humans and with economically important quantitative traits of livestock and plants (Cleves, 2005). Moreover, with DNA (SNP) microarrays or chips, genome-wide searches for QTL associations using SNPs are expected to become more popular in the near future, and genotyping of individuals cheaper, than with conventional genetic markers such as microsatellites.

The aim of the study was to perform a genomic scan to detect putative QTL locations along the chromosomes using the simulated data sets obtained from the QTL-MAS workshop 12.

MATERIALS AND METHODS

Data set: The simulated common data sets, comprising phenotypic and genotypic data, were obtained from the QTL-MAS workshop 12. The phenotypic data set consists of 5,865 individuals from seven generations. The genotype file contains the genotype of each animal in the pedigree described in the phenotype file. There are 6,000 loci evenly distributed over 6 chromosomes (1,000 markers/chromosome), with 0.1 centiMorgan (cM) between markers.

Statistical methods: The common data sets were analyzed and evaluated using the web-based GridQTL computer program (http://gridqt1.cap.ed.ac.uk:8080/gridsphere) developed by Seaton et al. (2002) and the SAS statistical software package (version 9.13). Analysis of SNP alleles, used to choose the SNPs with the highest PIC (Polymorphism Information Content) values, was performed using the ALLELE procedure of SAS. SNP-trait associations within and across families were detected using GridQTL with a single-QTL model.
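The PIC calculation behind marker choice can be illustrated with a minimal sketch. It assumes the commonly used definition PIC = 1 - Σp_i² - ΣΣ2p_i²p_j², where p_i are allele frequencies. The text does not describe how SAS computed the reported values, so the function names `allele_frequencies` and `pic` and the genotype layout below are hypothetical and used for illustration only.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return allele frequencies from a list of (allele1, allele2) genotypes."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    p = list(freqs)
    homozygosity = sum(x * x for x in p)
    cross = sum(2 * p[i] ** 2 * p[j] ** 2
                for i in range(len(p))
                for j in range(i + 1, len(p)))
    return 1.0 - homozygosity - cross


# Illustrative genotypes only; these are not taken from the workshop data.
example = [("A", "A"), ("A", "G"), ("G", "G"), ("A", "G")]
example_pic = pic(allele_frequencies(example).values())
```

Under these assumptions, a marker would be scored by calling `pic` on the frequencies estimated from the genotyped animals, and markers would then be ranked by that score.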

QTL analysis: Offspring of the 43-45 sires from the simulated data spanning 4 generations were evaluated. First, 11-13 informative SNPs were chosen from each of the six chromosomes, each containing 1000 SNP loci spaced 0.1 cM apart, based on the highest χ2-statistic values of Polymorphism Information Content (PIC values of SNP markers), using the ALLELE procedure of the SAS statistical software package (version 9.13). Then, the methods of Haley and Knott (1992), Knott et al. (1996) and De Koning et al. (1998, 2001) were adopted for the detection and mapping of QTL in half-sib families using least-squares simple regression analysis.
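The marker selection step can be sketched as a greedy procedure: rank markers by their informativeness score and accept each one only if it lies at least 2 cM from every marker already accepted. The greedy rule is an assumption, since the text states only the criteria (highest statistic, at least 2 cM spacing, 11-13 markers per chromosome). The function name `select_markers` and its parameters are hypothetical.

```python
def select_markers(scores, positions_cm, min_gap_cm=2.0, max_markers=13):
    """Greedily pick the highest-scoring markers that are at least
    min_gap_cm apart; returns marker indices ordered by map position."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    chosen = []
    for i in ranked:
        far_enough = all(abs(positions_cm[i] - positions_cm[j]) >= min_gap_cm
                         for j in chosen)
        if far_enough:
            chosen.append(i)
            if len(chosen) == max_markers:
                break
    return sorted(chosen, key=lambda i: positions_cm[i])


# Illustrative scores and positions only; not taken from the workshop data.
scores = [5, 9, 1, 8, 7]
positions_cm = [0.0, 1.0, 2.5, 3.0, 5.5]
selected = select_markers(scores, positions_cm)
```

A greedy pass is simple but not guaranteed to maximize the total score of the selected set. It is shown here only because it satisfies both stated constraints.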

The half-sib model of GridQTL runs within and across sire families. The analysis was carried out in a two-step procedure. Firstly, SNP marker data on the progeny of a common parent (sire) were combined in a multipoint approach to obtain the probability of inheriting one allele or the other from the sire at a particular region of the chromosome of interest. The calculated probabilities were combined into coefficients with values varying between 0 and 1. Secondly, the phenotypic values of half-sib family members were regressed on these coefficients in a within-common-sire regression analysis. A linear model with the fixed effects of generation and sex was fitted to the coefficients and phenotypic data. Appropriate F-statistic thresholds for p<0.05 and p<0.01 chromosome-wide type I error levels were generated by permutation tests as described by Churchill and Doerge (1994) and Doerge and Churchill (1996). The significance threshold levels and F-statistics (for p<0.05 and p<0.01) were computed by the GridQTL program. When the F-statistic exceeded the F-threshold value, this was taken to indicate a SNP-trait association. The analyses were carried out for each chromosome separately.
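The second step and the permutation thresholds can be sketched as follows. This is a simplified illustration, not the GridQTL implementation: it omits the fixed effects of generation and sex, takes the inheritance probabilities as given, and shuffles phenotypes within sire families, which is an assumption about how the permutations were done. The names `within_sire_f` and `permutation_threshold` and the data layout are hypothetical.

```python
import random


def within_sire_f(families, pos):
    """F-statistic for a regression nested within sires at one map position.

    families: list of families, each a list of (phenotype, probs) pairs, where
    probs[pos] is the probability of inheriting a given sire allele.
    Full model: one intercept and one slope per sire.
    Reduced model: one intercept per sire.
    """
    ss_reg = ss_res = 0.0
    n_total = n_slopes = 0
    for fam in families:
        ys = [y for y, _ in fam]
        xs = [probs[pos] for _, probs in fam]
        n = len(fam)
        mean_y, mean_x = sum(ys) / n, sum(xs) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        syy = sum((y - mean_y) ** 2 for y in ys)
        reg = sxy * sxy / sxx if sxx > 0 else 0.0  # SS explained by the slope
        ss_reg += reg
        ss_res += syy - reg
        n_total += n
        n_slopes += 1 if sxx > 0 else 0
    df_den = n_total - len(families) - n_slopes
    if n_slopes == 0 or df_den <= 0:
        return 0.0
    if ss_res <= 0:
        return float("inf")
    return (ss_reg / n_slopes) / (ss_res / df_den)


def permutation_threshold(families, n_pos, alpha=0.05, n_perm=1000, seed=0):
    """Empirical chromosome-wide threshold: approximately the (1 - alpha)
    quantile of the maximum F over all positions, across permuted data sets."""
    rng = random.Random(seed)
    maxima = []
    for _ in range(n_perm):
        shuffled = []
        for fam in families:
            ys = [y for y, _ in fam]
            rng.shuffle(ys)  # break the phenotype-genotype link within a sire
            shuffled.append([(y, probs) for y, (_, probs) in zip(ys, fam)])
        maxima.append(max(within_sire_f(shuffled, k) for k in range(n_pos)))
    maxima.sort()
    return maxima[min(n_perm - 1, int((1 - alpha) * n_perm))]
```

Taking the maximum F over all scanned positions in each permutation is what makes the resulting threshold chromosome-wide rather than pointwise.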

RESULTS AND DISCUSSION

Table 1 shows the selected informative SNP markers for each chromosome and the number of sire families used in the analysis. Although more than 13 informative SNP markers were identified by SAS for each chromosome, 11-13 SNP markers were chosen among them for ease of computation in GridQTL. Another reason for not choosing more SNP markers was to guarantee that adjacent SNP marker loci were at least 2 cM apart. The number of alleles ranged from 1 to 2 for each SNP locus.

Table 2 and Fig. 1 (a-f) show that the PIC values of SNP markers vary depending upon chromosomes. The highest SNP-PIC value (0.686) was obtained on chromosome 1 whereas the lowest SNP-PIC value (0.273) was seen on chromosome 2.

Table 3 shows the estimated QTL locations corresponding to the peaks of the F-statistics, as well as the results of the chromosome-wide analysis with 5 and 1% thresholds for the phenotype across sires. Significant QTL regions were identified within the intervals 9-47 cM on chromosome 1, 18-55 cM on chromosome 2, 0-50 cM on chromosome 3, 0-54 cM on chromosome 4 and 64-100 cM on chromosome 5, and putative QTL influencing the trait were detected at positions 27, 36, 18, 0 and 96 cM on these chromosomes, respectively (α = 0.01 and α = 0.05). No QTL exceeding the chromosome-wide significance level of p<0.05 or p<0.01 was detected on chromosome 6. The entire genomic scan and the chromosomal regions significant at the chromosome-wide level (p<0.05 or p<0.01) are shown in Fig. 2 (a-f).
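Reading a peak position and a significant interval off an F-statistic profile can be sketched as below. The sketch assumes that the reported interval is the contiguous run of scan positions around the peak whose F-statistic exceeds the 5% threshold. The text does not state how the intervals were defined, so this rule and the name `summarize_scan` are hypothetical.

```python
def summarize_scan(f_profile, thr_05, thr_01):
    """Summarize a 1 cM scan: peak position, significance level and the
    contiguous interval around the peak that exceeds the 5% threshold."""
    peak = max(range(len(f_profile)), key=lambda k: f_profile[k])
    f_peak = f_profile[peak]
    if f_peak <= thr_05:
        return {"peak_cm": peak, "level": None, "interval_cm": None}
    level = "p<0.01" if f_peak > thr_01 else "p<0.05"
    lo = hi = peak
    while lo > 0 and f_profile[lo - 1] > thr_05:
        lo -= 1
    while hi < len(f_profile) - 1 and f_profile[hi + 1] > thr_05:
        hi += 1
    return {"peak_cm": peak, "level": level, "interval_cm": (lo, hi)}


# Illustrative profile only; index k stands for position k cM.
summary = summarize_scan([1, 2, 5, 9, 6, 4, 1], thr_05=3, thr_01=8)
```

An interval defined this way describes where the test statistic is significant. It is not a formal confidence interval for the QTL position.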

Table 1: Selected SNP markers according to PICs based on χ2-test using allele and haplotype procedure in SAS statistical software for each chromosome
*SNPi = SNP ith cM apart from the left end of the corresponding chromosome

Fig. 1: Polymorphism Information Content (PIC) values and positions of SNP markers along chromosomes 1-6 (a-f)

Fig. 2: Genome scan of each chromosome to identify the positions of putative QTL influencing the trait along chromosomes 1-6 (a-f)

This study presents a pioneering example of a genome-wide QTL detection method using simulated SNP markers, as well as phenotypic and genotypic data for the trait in question. Based on a chromosome-wide screening protocol, it was concluded that the positions detected on chromosomes 1 (27 cM), 2 (36 cM), 3 (18 cM), 4 (0 cM) and 5 (96 cM) had significant putative QTL associations.

Table 2: Lowest and highest PIC values of SNP markers and their positions (in cM) on chromosomes

Table 3: The estimated QTL locations corresponding to the peak of F-statistics and chromosome-wide 5 and 1% thresholds for the phenotype for each chromosome

*LR = Likelihood Ratio

However, the detection of QTL influencing the trait was based on an F-statistic computed from sums of squares that account only for additive effects. Accordingly, such detection does not capture dominance and epistatic effects, nor does it estimate additive, dominance and epistasis coefficients, so QTL locations along the chromosomes are estimated with wide confidence intervals.

As pointed out earlier, the results presented here are from an initial genomic search that will enable further fine-mapping analysis of the putative QTL affecting the trait using multi-QTL model(s) for the detection of both QTL and QTL-SNP associations per chromosome. Consequently, in order to estimate QTL locations throughout the genome more precisely, calculations should consider more than one QTL at a time and should also determine additive, dominance and epistatic effects and their interactions.

ACKNOWLEDGEMENTS

We would like to express our special appreciation to the QTL-MAS 2008 workshop organizing committee, L. Ronnegard, F. Besnier, J.M. Alvarez-Castro, W. Ek, A. Johansson, L. Crooks and M. Petterson of the Group of Computational Genetics led by Örjan Carlborg, Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, Uppsala, Sweden, for supplying the common data sets and permitting their use. We would also like to thank Jules Hernandez-Sanchez of the Institute of Evolutionary Biology, University of Edinburgh, UK, for his valuable help and advice on running the GridQTL program and Ben Hayes of the Department of Primary Industries, Victoria, Australia, for his informative assistance regarding the QTL analyses.
