Features
An introduction to the powerful SNP analysis software "SNPAlyze".
High user-friendliness, similar to Excel
All data can be edited easily, much as in Excel. You can also drag, copy, cut, and paste to transfer data directly from an Excel data sheet. The edited data can be saved as a text file (TSV format).
TSV (Tab Separated Values) and CSV (Comma Separated Values) files, Excel files (xls/xlsx format), and SNPAlyze data files (slyz format) can be imported easily by following the on-screen directions. In addition, files exported from other systems, such as Biotage PSQ96 or ABI PRISM7900, can be imported.
The world’s fastest haplotype inference function *1
The Haplotype Inference function of SNPAlyze is the fastest in the world: about 1000 times faster than Arlequin. *2
Multilocus haplotype inference
Ver. 5.0 Pro can analyze 40 SNPs at once. (In Standard, the maximum number of SNPs that can be analyzed at once is 30.)
The time required for haplotype inference depends on the number of samples; haplotypes of 40 SNPs can be inferred in several seconds.
(If you wish to estimate haplotype frequencies for more than 40 SNPs, please contact us at info@dynacom.co.jp.)
Supports AIC evaluation in addition to chi-square and P values
SNPAlyze calculates not only conventional chi-square and P values but also AIC values. In general, a chi-square test involves some "ambiguity" because the significance level is chosen arbitrarily by the user.
The AIC removes the "ambiguity" of the chi-square test and can provide analytical results with much higher accuracy. SNPAlyze utilizes the AIC function for the following analyses:
Graphical display of Linkage Disequilibrium
The LD coefficients between multiple SNPs can be seen at a glance, so regions of strong LD can be identified easily.
The following three display settings are available in SNPAlyze:
 Comparative display of the analysis results for two different groups
 Comparative display of the analysis results for two different LD coefficients or statistics
 Superimposed display of the analysis results for two different groups
(available only for LD maps of the BMP type)
Supports LD coefficients calculated by Akaike’s information criterion (AIC) besides conventional LD coefficients (D, D’, r^2)
D, D’, and r^2 are the conventional measures of linkage disequilibrium. SNPAlyze can also evaluate linkage disequilibrium with the AIC, in addition to these conventional measures. *3
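As a rough illustration of the conventional coefficients, the sketch below computes D, D’, and r^2 for two biallelic loci from haplotype and allele frequencies using the standard textbook definitions. The function name and arguments are hypothetical and are not part of SNPAlyze; D’ keeps the sign of D here, whereas many tools report its absolute value.

```python
def ld_coefficients(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab : frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a  : frequency of allele A at locus 1
    p_b  : frequency of allele B at locus 2
    (Hypothetical helper for illustration only.)
    """
    d = p_ab - p_a * p_b
    # D' normalizes D by the largest value it could take given the allele frequencies
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    # r^2 is the squared correlation between the alleles at the two loci
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


if __name__ == "__main__":
    print(ld_coefficients(0.4, 0.5, 0.6))
```

With complete LD (for example p_ab = p_a = p_b = 0.5) both D’ and r^2 reach 1; with independent loci all three coefficients are 0.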
By considering one model in which the two SNP loci are assumed to be in LD (the dependent model, AIC (DM)) and another in which they are assumed to be in linkage equilibrium (the independent model, AIC (IM)), SNPAlyze evaluates the difference between the two. This evaluation is based on the following equation:

Estimation of diplotype distribution
Diplotype distribution can be estimated by using the EM algorithm. The combination of haplotypes that constitute each diplotype and the number of samples that correspond to identical diplotypes are displayed.
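To make the idea concrete, here is a minimal sketch of the EM algorithm for the simplest case of two biallelic SNPs. It is not SNPAlyze's implementation; the function name, the 0/1/2 genotype coding, and the fixed iteration count are assumptions made for illustration. Only double heterozygotes are phase-ambiguous, so the E-step splits them between the two possible diplotypes (00/11 and 01/10) and the M-step re-estimates the haplotype frequencies.

```python
from collections import Counter

HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def em_two_snp_haplotypes(genotypes, n_iter=100):
    """Estimate haplotype frequencies for two biallelic SNPs by EM.

    genotypes : list of (g1, g2), each the count (0, 1 or 2) of allele "1"
    Returns (freq, cis_weight): freq maps each haplotype to its estimated
    frequency; cis_weight is the estimated probability that a double
    heterozygote carries the 00/11 diplotype rather than 01/10.
    (Hypothetical helper for illustration only.)
    """
    if not genotypes:
        raise ValueError("at least one genotype is required")
    n_chrom = 2 * len(genotypes)
    fixed = Counter()       # haplotype counts whose phase is unambiguous
    n_double_het = 0
    for g1, g2 in genotypes:
        if g1 == 1 and g2 == 1:
            n_double_het += 1
        else:
            # At most one locus is heterozygous, so both haplotypes are known
            fixed[(int(g1 == 2), int(g2 == 2))] += 1
            fixed[(int(g1 >= 1), int(g2 >= 1))] += 1
    freq = {h: 0.25 for h in HAPLOTYPES}
    cis_weight = 0.5
    for _ in range(n_iter):
        # E-step: posterior probability of the 00/11 phase for double heterozygotes
        cis = freq[(0, 0)] * freq[(1, 1)]
        trans = freq[(0, 1)] * freq[(1, 0)]
        cis_weight = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: expected haplotype counts divided by the number of chromosomes
        for h in HAPLOTYPES:
            share = cis_weight if h in ((0, 0), (1, 1)) else 1 - cis_weight
            freq[h] = (fixed[h] + n_double_het * share) / n_chrom
    return freq, cis_weight


if __name__ == "__main__":
    data = [(0, 0)] * 5 + [(2, 2)] * 5 + [(1, 1)] * 10
    print(em_two_snp_haplotypes(data))
```

The returned weight corresponds to the diplotype assignment for the ambiguous samples; real multilocus inference has many more candidate haplotype pairs per sample, but follows the same two alternating steps.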
SNPAlyze tests for differences in haplotype frequencies between multiple groups, such as case and control groups. However, the chi-square test can produce a large error when a haplotype frequency is extremely small.
A permutation test is one method that effectively avoids this problem and yields an appropriate P value across a wide range of data. By using random numbers, it approximates the exact probability value. *4
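The principle can be sketched in a few lines. The example below is a simplified illustration, not SNPAlyze's procedure: it assumes each chromosome's haplotype is already known (real analyses work from estimated haplotype frequencies), and the function names, the number of permutations, and the fixed seed are all hypothetical choices. Group labels are shuffled repeatedly, and the P value is the fraction of shuffles whose chi-square statistic is at least as large as the observed one.

```python
import random


def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_tot[i] * col_tot[j] / n
            if exp > 0:
                stat += (obs - exp) ** 2 / exp
    return stat


def permutation_p_value(case_haps, control_haps, n_perm=2000, seed=0):
    """Approximate P value for a case/control difference in haplotype counts.

    case_haps, control_haps : lists of haplotype labels, one per chromosome
    (Hypothetical helper; inputs are assumed to be phased already.)
    """
    rng = random.Random(seed)
    labels = sorted(set(case_haps) | set(control_haps))

    def table(cases, controls):
        return [[cases.count(h) for h in labels],
                [controls.count(h) for h in labels]]

    observed = chi_square(table(list(case_haps), list(control_haps)))
    pooled = list(case_haps) + list(control_haps)
    n_case = len(case_haps)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment of case/control labels
        stat = chi_square(table(pooled[:n_case], pooled[n_case:]))
        if stat >= observed - 1e-12:
            hits += 1
    # Adding 1 to both counts keeps the estimate away from an impossible P = 0
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    cases = ["H1"] * 40 + ["H2"] * 18 + ["H3"] * 2
    controls = ["H1"] * 30 + ["H2"] * 29 + ["H3"] * 1
    print(permutation_p_value(cases, controls))
```

Because the reference distribution is built from the data themselves, the result does not rely on the large-sample approximation that breaks down for rare haplotypes.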
Processing of massive amounts of data (Pro version only)
A maximum of 10,000 samples can be analyzed, so large data sets that cannot be processed in the Standard version can be handled. The number of data items that can be analyzed depends on the computer's memory.
Microsatellites occur less frequently than SNPs across the whole genome, but microsatellite polymorphisms are far more diverse and carry much more information.
Therefore, efficient analysis is possible by combining SNP and microsatellite data.
Identification of htSNP (tagSNP) is now possible. Efficient genotyping is possible by utilizing the htSNP that represents a haplotype block.
If you perform multilocus haplotype inference, two or more combinations of htSNPs may exist. SNPAlyze outputs all possible sets.
In Ver.5.0, multiple datasheets can be opened in SNPAlyze. This makes the following possible:
Comparing analysis results among different datasheets.
Referring back to analysis results whenever you want to check them.
Haplotype Block Analysis
SNPAlyze can construct haplotype blocks by the following two methods:
Gabriel method (Gabriel et al., Science, 2002) *5
Four Gamete method (Wang et al., Am. J. Hum. Genet., 2002) *6
In addition, Case-Control Haplotype Analysis can be run directly on the constructed haplotype blocks.
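The core of the four-gamete idea can be shown with a short sketch. This is a deliberately simplified greedy version written for illustration, not the procedure of Wang et al. or of SNPAlyze: it assumes phased haplotypes given as strings of biallelic alleles, applies no frequency threshold, and uses hypothetical function names. If all four allele combinations are observed at a pair of SNPs, a past recombination between them is inferred and a block boundary is placed.

```python
def four_gamete_violation(haplotypes, i, j):
    """True if all four allele combinations occur at biallelic SNPs i and j."""
    return len({(h[i], h[j]) for h in haplotypes}) == 4


def four_gamete_blocks(haplotypes):
    """Greedily split SNPs into consecutive blocks free of four-gamete violations.

    haplotypes : non-empty list of equal-length strings such as "0110"
    Returns a list of (first_snp_index, last_snp_index) tuples.
    (Hypothetical helper for illustration only.)
    """
    n_snps = len(haplotypes[0])
    blocks = []
    start = 0
    for j in range(1, n_snps):
        # Close the current block as soon as SNP j conflicts with any SNP in it
        if any(four_gamete_violation(haplotypes, i, j) for i in range(start, j)):
            blocks.append((start, j - 1))
            start = j
    blocks.append((start, n_snps - 1))
    return blocks


if __name__ == "__main__":
    print(four_gamete_blocks(["000", "110", "111", "001"]))
```

In the usage example, SNPs 0 and 1 show only two gametes and stay together, while SNP 2 shows all four gametes with SNP 0 and therefore starts a new block.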
Automatic selection of polymorphic markers
In Ver.5.0, SNPAlyze can select appropriate polymorphic markers automatically. Markers are filtered by the Hardy-Weinberg Equilibrium (HWE) test, minor allele frequency (MAF), and polymorphic marker type.
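A filter of this kind might look like the following sketch, which uses the standard chi-square goodness-of-fit test for HWE on genotype counts. The function names and the default thresholds (MAF of at least 0.05, and 3.841, the 5% critical value of chi-square with 1 degree of freedom) are assumptions chosen for illustration; the criteria SNPAlyze actually applies are set by the user in the software.

```python
def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Minor allele frequency from genotype counts AA, AB, BB."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    return min(p, 1 - p)


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic (1 df) for deviation from Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def select_markers(markers, min_maf=0.05, max_hwe_chi2=3.841):
    """Keep markers that pass both the MAF and the HWE filter.

    markers : dict mapping marker name -> (n_AA, n_AB, n_BB)
    Thresholds are illustrative defaults, not SNPAlyze settings.
    """
    return [name for name, counts in markers.items()
            if minor_allele_frequency(*counts) >= min_maf
            and hwe_chi_square(*counts) <= max_hwe_chi2]


if __name__ == "__main__":
    example = {"snp_a": (25, 50, 25), "snp_b": (50, 0, 50), "snp_c": (99, 1, 0)}
    print(select_markers(example))
```

In the usage example, the second marker is rejected for a complete lack of heterozygotes and the third for being nearly monomorphic.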
HealthSketch is a multivariate analysis tool for clinical and/or lifestyle data. The following functions are available when SNPAlyze is used together with HealthSketch.
 Data passing between SNPAlyze and HealthSketch *7
 Combined analysis of DNA polymorphism data and clinical and/or lifestyle data
 Use of classification results obtained by clustering clinical information
 Logistic Regression Analysis *8
Handles genotyping data and all analysis data together
A SNPAlyze data file contains the genotyping data and all analysis data together. When you open a file saved in this format, the genotyping data and all analysis data appear. You can continue your analysis, or share the genotyping data and all analysis data by distributing the file to other SNPAlyze users. (Please bear in mind that this file includes the genotyping data.)
In case-control studies, SNPAlyze performs multiple testing correction using the FDR (false discovery rate). The FDR controls the proportion of errors among the test results for which the null hypothesis was rejected. SNPAlyze calculates q-values on the basis of the distribution of p-values. (The BH or Bootstrap method is available.)
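For the BH (Benjamini-Hochberg) option, the calculation can be sketched as follows. This is the standard step-up adjustment written as a hypothetical helper, not code taken from SNPAlyze, and it does not cover the Bootstrap method. Each p-value is multiplied by the number of tests divided by its rank, and a running minimum from the largest p-value downward keeps the q-values monotone.

```python
def bh_q_values(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values), in the input order.

    (Hypothetical helper for illustration only.)
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


if __name__ == "__main__":
    print(bh_q_values([0.01, 0.04, 0.03, 0.5]))
```

Declaring significant every SNP whose q-value is below, say, 0.05 then keeps the expected proportion of false positives among those calls at about 5%.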
SNPAlyze performs the Cochran-Armitage Trend Test for the Dominant, Recessive, and Genotype models on each SNP.
The Cochran-Armitage Trend Test investigates whether genes are associated with a disease by comparing two groups, a patient group and a non-patient group. The analysis assesses whether there is a linear trend between case-control category and allele counts.
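The statistic itself is compact enough to sketch. The example below uses the standard textbook form of the test on genotype counts ordered by the number of risk alleles; the function name and the weight vectors are assumptions for illustration, and how SNPAlyze parameterizes its models is not described here. Weights of (0, 1, 2) give the linear trend in allele count, while (0, 1, 1) and (0, 0, 1) are common choices for dominant and recessive coding.

```python
import math


def cochran_armitage(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test on genotype counts.

    cases, controls : counts per genotype, ordered by number of risk alleles
    weights         : score assigned to each genotype column
    Returns (chi_square, p_value); the statistic has 1 degree of freedom.
    (Hypothetical helper for illustration only.)
    """
    col_tot = [r + s for r, s in zip(cases, controls)]
    n_case, n_ctrl = sum(cases), sum(controls)
    n = n_case + n_ctrl
    sum_tr = sum(t * r for t, r in zip(weights, cases))
    sum_tn = sum(t * c for t, c in zip(weights, col_tot))
    sum_t2n = sum(t * t * c for t, c in zip(weights, col_tot))
    trend = n * sum_tr - n_case * sum_tn
    denom = n_case * n_ctrl * (n * sum_t2n - sum_tn ** 2)
    if denom == 0:
        return 0.0, 1.0  # no variation in scores or an empty group
    chi2 = n * trend ** 2 / denom
    # Upper tail of chi-square with 1 df, expressed with the error function
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    print(cochran_armitage((10, 20, 30), (30, 20, 10)))
```

Passing a different weights tuple is all that is needed to switch between the additive, dominant, and recessive codings in this sketch.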
 VCF file import
You can use a VCF file as an input file (VCF ver. 4.1 and 4.2 are supported).
However, the number of samples that can be imported differs between the Standard and Pro versions.
Please see "Product comparison between Standard & Pro Version" for more information.
 Effect size refers to the magnitude of the effect detected by a statistical test; examples include the "standardized difference between two groups" and "correlation measures of effect size".
The larger the absolute value, the larger the effect. Examples of correlation measures of effect size are phi (Φ) and Cramer’s V (V). Phi indexes the strength of the relationship between two variables in a 2 x 2 table and is derived from the chi-square test.
SNPAlyze calculates the effect size from each contingency table of 4 genetic models: Genotype, Allele, Recessive and Dominant.
Effect size is calculated from the chi-square value (χ^{2}), the total number of samples in the case and control groups (N), and the smaller of the number of rows and the number of columns of the contingency table (k), as expressed by the following equation.
V = \sqrt{χ^{2} / (N (k - 1))}
(For a 2 x 2 table, k = 2 and V equals Φ.)
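As a worked illustration, the sketch below computes this kind of effect size from a contingency table using the standard definition of Cramer's V, the square root of χ^2 / (N (k - 1)). The function name is hypothetical and the code is not taken from SNPAlyze.

```python
import math


def cramers_v(table):
    """Cramer's V for a contingency table given as a list of rows.

    For a 2 x 2 table this equals the phi coefficient.
    (Hypothetical helper for illustration only.)
    """
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    # Pearson chi-square statistic
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_tot[i] * col_tot[j] / n
            if exp > 0:
                chi2 += (obs - exp) ** 2 / exp
    k = min(len(table), len(col_tot))  # smaller of the row and column counts
    return math.sqrt(chi2 / (n * (k - 1)))


if __name__ == "__main__":
    # Rows: case, control; columns: the three genotypes of one SNP
    print(cramers_v([[10, 20, 30], [30, 20, 10]]))
```

A case/control table always has two rows, so k = 2 for the Genotype model (2 x 3) as well as for the Allele, Recessive, and Dominant models (2 x 2).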
Related information & Links
 List of papers that related to SNPAlyze
 Product comparison between Standard & Pro Version
 List of functions
 New features in Ver. 9.0
*1 As of September 2004 (according to our own investigation)
*2 Limited to SNP data.
*3 Shimo-Onoda K, Tanaka T, Furushima K, Nakajima T, Toh S, Harata S, Yone K, Komiya S, Adachi H, Nakamura E, Fujimiya H, Inoue I. Akaike’s information criterion for a measure of linkage disequilibrium. J Hum Genet 2002; 47(12): 649-55.
*4 Fallin D, Cohen A, Essioux L, Chumakov I, Blumenfeld M, Cohen D, Schork NJ. Genetic analysis of case/control data using estimated haplotype frequencies: application to APOE locus variation and Alzheimer’s disease. Genome Res. 2001 Jan; 11: 143-151.
Good, P. Permutation Tests: A Practical Guide to Resampling Methods for Testing Hypotheses. Second Edition. New York: Springer-Verlag, 2000.
*5 Stacey B. Gabriel, Stephen F. Schaffner, Huy Nguyen, Jamie M. Moore, Jessica Roy, Brendan Blumenstiel, John Higgins, Matthew DeFelice, Amy Lochner, Maura Faggart, Shau Neen Liu-Cordero, Charles Rotimi, Adebowale Adeyemo, Richard Cooper, Ryk Ward, Eric S. Lander, Mark J. Daly, David Altshuler. The Structure of Haplotype Blocks in the Human Genome. Science. 2002 Jun 21; 296(5576): 2225-9.
*6 Ning Wang, Joshua M. Akey, Kun Zhang, Ranajit Chakraborty, and Li Jin. Distribution of Recombination Crossovers and the Origin of Haplotype Blocks: Interplay of Population History, Recombination, and Mutation. Am J Hum Genet. 2002 Nov; 71(5): 1227-34.
*7 SNPAlyze Ver.5.0.2 (or later) and HealthSketch Ver.1.1 (or later) are required for data passing function.
*8 SNPAlyze Ver.7.0 (or later) and HealthSketch Ver.2.5 (or later) are required for Logistic Regression Analysis.