Features

An introduction to the powerful SNP analysis software "SNPAlyze".

Highly user-friendly, similar to Excel

All data can be edited easily, much as in Excel. You can also drag, copy, cut, and paste data directly from an Excel worksheet. The edited data can be saved as a text file (TSV format).


Easy data import

TSV (Tab Separated Values) and CSV (Comma Separated Values) files, Excel files (xls/xlsx format), and SNPAlyze data files (slyz format) can be imported easily by following the on-screen directions. In addition, files exported from other systems, such as Biotage PSQ96 or ABI PRISM7900, can be imported.


The world’s fastest haplotype inference function *1

The Haplotype Inference function of SNPAlyze is the fastest in the world: about 1000 times faster than Arlequin. *2


Multilocus haplotype inference

Ver. 5.0 Pro can analyze 40 SNPs at once. (In Standard, the maximum number of SNPs that can be analyzed at once is 30.)
The time required for haplotype inference depends on the number of samples; haplotypes for 40 SNPs can be inferred in several seconds.

(If you wish to estimate haplotype frequencies for more than 40 SNPs, please contact us at info@dynacom.co.jp.)


Supports AIC evaluation besides chi-square and P values

In addition to conventional chi-square and P values, SNPAlyze can calculate AIC values. A chi-square test generally involves some "ambiguity", because the significance level is chosen arbitrarily by the user.
The AIC eliminates this "ambiguity" and can provide analytical results with much higher accuracy. SNPAlyze uses the AIC for the following analyses:


Graphical display of Linkage Disequilibrium

The LD coefficients between multiple SNPs can be seen at a glance, and regions of strong LD can be identified easily.
The following three display settings are available in SNPAlyze:


Supports LD coefficients calculated by Akaike’s information criterion (AIC) besides conventional LD coefficients (D, D’, r^2)

D, D’, and r^2 are the conventional measures of linkage disequilibrium. SNPAlyze can evaluate linkage disequilibrium using AIC in addition to these conventional measures. *3
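As a rough illustration of the conventional coefficients, the sketch below computes D, D’ and r^2 for two biallelic loci from one haplotype frequency and the two allele frequencies. The function and argument names are hypothetical, and this is not SNPAlyze's own implementation.

```python
def ld_coefficients(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    # D' rescales D by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    # r^2 is the squared correlation between the alleles at the two loci.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


if __name__ == "__main__":
    print(ld_coefficients(0.5, 0.5, 0.5))   # complete LD
    print(ld_coefficients(0.25, 0.5, 0.5))  # linkage equilibrium
```

With complete LD, D’ and r^2 both reach 1; at linkage equilibrium all three values are 0.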

By considering a model in which two SNP loci are assumed to be in LD (AIC (IM)) and another model in which two SNP loci are assumed to be in linkage equilibrium (AIC (DM)), SNPAlyze evaluates their differences. This evaluation is based on the following equation:

AIC(LD) = AIC(IM) - AIC(DM)

Estimation of diplotype distribution

Diplotype distribution can be estimated by using the EM algorithm. The combination of haplotypes that constitute each diplotype and the number of samples that correspond to identical diplotypes are displayed.
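To make the idea concrete, here is a minimal sketch of the EM algorithm for the simplest case: two biallelic SNPs, where only double heterozygotes have an ambiguous phase. The genotype coding (0, 1, 2 copies of the second allele), the function name, and the stopping rule are assumptions made for this illustration; SNPAlyze's multilocus implementation is not described in this text.

```python
from collections import Counter

HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]
ALLELES = ((0, 0), (0, 1), (1, 1))  # alleles carried for genotype codes 0, 1, 2


def em_two_snp(genotypes, max_iter=200, tol=1e-12):
    """Estimate haplotype frequencies for two SNPs from unphased genotypes.

    genotypes: list of (g1, g2), each the count (0/1/2) of allele 1 at a locus.
    Returns (haplotype frequencies, expected number of double heterozygotes
    whose diplotype is 00/11 rather than 01/10).
    """
    counts = Counter(genotypes)
    n = sum(counts.values())
    n_double_het = counts.get((1, 1), 0)

    # Haplotype counts from samples whose phase is unambiguous.
    fixed = {h: 0.0 for h in HAPLOTYPES}
    for (g1, g2), c in counts.items():
        if (g1, g2) == (1, 1):
            continue
        a1, a2 = ALLELES[g1], ALLELES[g2]
        fixed[(a1[0], a2[0])] += c
        fixed[(a1[1], a2[1])] += c

    freqs = {h: 0.25 for h in HAPLOTYPES}
    cis = 0.5
    for _ in range(max_iter):
        # E-step: probability that a double heterozygote is 00/11.
        w_cis = freqs[(0, 0)] * freqs[(1, 1)]
        w_trans = freqs[(0, 1)] * freqs[(1, 0)]
        total = w_cis + w_trans
        cis = w_cis / total if total > 0 else 0.5
        expected = dict(fixed)
        expected[(0, 0)] += n_double_het * cis
        expected[(1, 1)] += n_double_het * cis
        expected[(0, 1)] += n_double_het * (1 - cis)
        expected[(1, 0)] += n_double_het * (1 - cis)
        # M-step: re-estimate frequencies from expected haplotype counts.
        new_freqs = {h: expected[h] / (2 * n) for h in HAPLOTYPES}
        change = max(abs(new_freqs[h] - freqs[h]) for h in HAPLOTYPES)
        freqs = new_freqs
        if change < tol:
            break
    return freqs, n_double_het * cis


if __name__ == "__main__":
    data = [(0, 0)] * 5 + [(2, 2)] * 5 + [(1, 1)] * 10
    print(em_two_snp(data))
```

In this toy data set the unambiguous samples carry only the 00 and 11 haplotypes, so the EM iterations assign essentially all double heterozygotes to the 00/11 diplotype.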


Use of permutation tests

SNPAlyze tests the difference in haplotype frequencies between multiple groups, such as case and control groups. However, the chi-square test may give a large error when a haplotype frequency is extremely small.

Permutation tests are one method that effectively avoids this problem by giving an appropriate P value for a wide range of data. By using random numbers, the method approximates the exact probability value. *4
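The general idea can be sketched as follows, under simplifying assumptions: each chromosome already has a known haplotype label, the test statistic is the ordinary chi-square statistic, and group labels are shuffled per chromosome. All names are hypothetical; a full analysis would typically shuffle individuals and re-estimate haplotype frequencies in every permutation.

```python
import random
from collections import Counter


def chi_square(groups, haplotypes):
    """Chi-square statistic of the group-by-haplotype contingency table."""
    n = len(haplotypes)
    row = Counter(groups)
    col = Counter(haplotypes)
    cell = Counter(zip(groups, haplotypes))
    stat = 0.0
    for g, row_total in row.items():
        for h, col_total in col.items():
            expected = row_total * col_total / n
            stat += (cell[(g, h)] - expected) ** 2 / expected
    return stat


def permutation_p_value(groups, haplotypes, n_perm=1000, seed=0):
    """Approximate the P value by shuffling the group labels n_perm times."""
    rng = random.Random(seed)
    observed = chi_square(groups, haplotypes)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Count permutations at least as extreme as the observed data.
        if chi_square(shuffled, haplotypes) >= observed - 1e-12:
            hits += 1
    # Adding 1 to both counts keeps the estimate away from exactly zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    groups = ["case"] * 20 + ["control"] * 20
    haps = ["H1"] * 20 + ["H2"] * 20
    print(permutation_p_value(groups, haps))
```

Because the P value comes from the permutation distribution itself, it does not rely on the chi-square approximation that breaks down for rare haplotypes.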


Processing of massive amounts of data (Pro version only)

A maximum of 10,000 samples can be analyzed. Massive data sets that cannot be processed in the Standard version can be analyzed. The amount of data that can be analyzed depends on the computer's memory.


Supports microsatellite data

Compared with SNPs, microsatellites occur less frequently across the whole genome, but microsatellite polymorphisms are far more diverse and carry much more information.

Therefore, efficient analysis is possible by combining SNP and microsatellite data.


htSNP identification

htSNPs (tagSNPs) can now be identified. Genotyping becomes more efficient when you use the htSNPs that represent a haplotype block.

When you perform multilocus haplotype inference, two or more combinations of htSNPs may exist. SNPAlyze outputs all possible sets.


Opening multiple datasheets

In Ver.5.0, multiple datasheets can be open in SNPAlyze at the same time. This makes it possible to:
Compare analysis results among different datasheets.
Refer back to analysis results whenever you want to check them.


Haplotype Block Analysis

SNPAlyze can construct haplotype blocks by the following two methods:
Gabriel method (Gabriel et al., Science, 2002) *5
Four Gamete method (Wang et al., Am. J. Hum. Genet., 2002) *6
In addition, the Case-Control Haplotype Analysis can be run directly on the constructed haplotype blocks.
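The core idea behind the four-gamete approach can be sketched as below: if all four possible haplotypes are observed for a pair of biallelic SNPs, a historical recombination between them is inferred, and a block boundary is placed. This is a simplified greedy sketch with hypothetical names, operating on already phased haplotypes; the published method and SNPAlyze's implementation may differ in details such as frequency thresholds.

```python
def four_gametes_present(haplotypes, i, j):
    """True if all four allele combinations occur at sites i and j."""
    return len({(h[i], h[j]) for h in haplotypes}) == 4


def four_gamete_blocks(haplotypes):
    """Split sites into consecutive blocks with no four-gamete pair inside.

    haplotypes: list of equal-length strings of '0'/'1' (phased, biallelic).
    Returns a list of (first_site, last_site) index pairs.
    """
    n_sites = len(haplotypes[0])
    blocks = []
    start = 0
    for j in range(1, n_sites):
        # Close the current block when site j conflicts with any site in it.
        if any(four_gametes_present(haplotypes, i, j) for i in range(start, j)):
            blocks.append((start, j - 1))
            start = j
    blocks.append((start, n_sites - 1))
    return blocks


if __name__ == "__main__":
    print(four_gamete_blocks(["000", "011", "100", "111"]))
```

In the example, sites 0 and 1 show all four gametes, so they fall into different blocks, while sites 1 and 2 show only two gametes and stay together.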


Automatic selection of polymorphic markers

In Ver.5.0, SNPAlyze can select appropriate polymorphic markers automatically. Markers are filtered by the Hardy-Weinberg Equilibrium (HWE) test, minor allele frequency (MAF), and polymorphic marker type.
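A filter of this kind might look like the following sketch for a biallelic SNP. It uses the 1-degree-of-freedom chi-square goodness-of-fit test for HWE; the function names and the threshold values (MAF of 0.05, HWE P value of 0.001) are assumptions chosen for illustration, not SNPAlyze defaults.

```python
import math


def maf_and_hwe(n_aa, n_ab, n_bb):
    """Return (MAF, HWE chi-square, HWE P value) from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return min(p, q), chi2, p_value


def keep_marker(n_aa, n_ab, n_bb, min_maf=0.05, min_hwe_p=0.001):
    """Keep a marker only if it passes both the MAF and the HWE filter."""
    maf, _, hwe_p = maf_and_hwe(n_aa, n_ab, n_bb)
    return maf >= min_maf and hwe_p >= min_hwe_p


if __name__ == "__main__":
    print(keep_marker(25, 50, 25))  # in HWE, common allele: kept
    print(keep_marker(50, 0, 50))   # no heterozygotes: rejected
```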


Cooperation with HealthSketch

HealthSketch is a multivariate analysis tool for clinical and/or lifestyle data. The following functions are available when SNPAlyze is used together with HealthSketch.


Handles genotyping data and all analysis data together

A SNPAlyze data file stores the genotyping data and all analysis data together. When you open a file saved in this format, the genotyping data and all analysis data appear. You can continue your analysis, or share the genotyping data and all analysis data by distributing the file to other SNPAlyze users. (Please keep in mind that this file includes the genotyping data.)


Use of FDR

In case-control studies, SNPAlyze performs multiple testing correction using the false discovery rate (FDR). The FDR controls the proportion of errors among the tests in which the null hypothesis was rejected. SNPAlyze calculates q-values on the basis of the distribution of p-values. (The BH or Bootstrap method is available.)
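For the BH (Benjamini-Hochberg) case, q-values can be derived from the sorted p-values as in the sketch below. The function name is hypothetical, and the Bootstrap method mentioned above is not covered here.

```python
def bh_q_values(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values), in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is the minimum of
    # p * m / rank over its own rank and all larger ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


if __name__ == "__main__":
    print(bh_q_values([0.01, 0.04, 0.03, 0.005]))
```

A test is then declared significant at FDR level alpha when its q-value is at most alpha.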


Cochran-Armitage Trend Test

SNPAlyze performs the Cochran-Armitage Trend Test for the Dominant, Recessive, and Genotype models for each SNP.
The Cochran-Armitage Trend Test investigates whether genes are associated with a disease by comparing two groups, a patient group and a non-patient group. The analysis assesses whether there is a linear trend between case-control category and allele counts.
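The test statistic can be sketched as follows for a 2 x 3 table of genotype counts, using the standard chi-square form of the Cochran-Armitage statistic with 1 degree of freedom. The function name and the default weights (0, 1, 2) are assumptions for this illustration; weights such as (0, 1, 1) or (0, 0, 1) are commonly used for dominant or recessive coding, but how SNPAlyze maps its models to weights is not stated here.

```python
import math


def cochran_armitage(cases, controls, weights=(0, 1, 2)):
    """Return (chi-square, P value) of the trend test.

    cases, controls: genotype counts in the same category order as weights.
    """
    totals = [r + s for r, s in zip(cases, controls)]
    n = sum(totals)
    n_cases = sum(cases)
    n_controls = n - n_cases
    sum_tr = sum(t * r for t, r in zip(weights, cases))
    sum_tn = sum(t * c for t, c in zip(weights, totals))
    sum_t2n = sum(t * t * c for t, c in zip(weights, totals))
    denom = n_cases * n_controls * (n * sum_t2n - sum_tn ** 2)
    if denom == 0:
        return 0.0, 1.0  # no variation in group or genotype score
    chi2 = n * (n * sum_tr - n_cases * sum_tn) ** 2 / denom
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2))


if __name__ == "__main__":
    print(cochran_armitage((10, 20, 30), (30, 20, 10)))
```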


VCF file import NEW!


Effect size NEW!

  • Effect size refers to the magnitude of the effect detected by a statistical test. Examples include the “standardized difference between two groups” and “correlation measures of effect size”.

    The larger the absolute value, the larger the effect. Examples of correlation measures of effect size are phi (Φ) and Cramer’s V (V).
  • The strength of the relationship between two variables (2 x 2) is derived from the chi-square test.
    SNPAlyze calculates the effect size from the contingency table of each of 4 genetic models: Genotype, Allele, Recessive and Dominant.
    The effect size is expressed by the following equation, using the chi-square value (χ2), the total number of samples in the case and control groups (N), and the smaller of the number of rows and the number of columns of the contingency table (k).

                                                      V = \sqrt{\chi^2 / (N (k - 1))}
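Assuming the equation is the usual Cramer’s V, V = sqrt(χ2 / (N (k - 1))), which reduces to phi for a 2 x 2 table, the calculation can be sketched as below. The function names are hypothetical and the tables are made-up examples.

```python
import math


def chi_square_statistic(table):
    """Return (chi-square, N) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    return chi2, n


def cramers_v(table):
    """Cramer's V; k is the smaller of the number of rows and columns."""
    chi2, n = chi_square_statistic(table)
    k = min(len(table), len(table[0]))
    return math.sqrt(chi2 / (n * (k - 1)))


if __name__ == "__main__":
    # Rows: case / control. Columns: the two categories of a 2 x 2 model.
    print(cramers_v([[10, 20], [20, 10]]))
```

For a Genotype model the table would have three columns, but k is still 2 because there are only two rows (case and control).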



*1 As of September 2004 (according to our own investigation).

*2 Limited to SNP data.

*3 Shimo-Onoda K, Tanaka T, Furushima K, Nakajima T, Toh S, Harata S, Yone K, Komiya S, Adachi H, Nakamura E, Fujimiya H, Inoue I. Akaike’s information criterion for a measure of linkage disequilibrium. J Hum Genet 2002; 47(12): 649-55.

*4 Fallin D, Cohen A, Essioux L, Chumakov I, Blumenfeld M, Cohen D, Schork NJ. Genetic analysis of case/control data using estimated haplotype frequencies: application to APOE locus variation and Alzheimer’s disease. Genome Res. 2001 Jan; 11: 143-151.

Good, P. Permutation Tests: A Practical Guide to Resampling Methods for Testing Hypotheses. Second Edition. New York: Springer-Verlag, 2000.

*5 Stacey B. Gabriel, Stephen F. Schaffner, Huy Nguyen, Jamie M. Moore, Jessica Roy, Brendan Blumenstiel, John Higgins, Matthew DeFelice, Amy Lochner, Maura Faggart, Shau Neen Liu-Cordero, Charles Rotimi, Adebowale Adeyemo, Richard Cooper, Ryk Ward, Eric S. Lander, Mark J. Daly, David Altshuler. The Structure of Haplotype Blocks in the Human Genome. Science. 2002 Jun 21;296(5576):2225-9.

*6 Ning Wang, Joshua M. Akey, Kun Zhang, Ranajit Chakraborty, and Li Jin. Distribution of Recombination Crossovers and the Origin of Haplotype Blocks: Interplay of Population History, Recombination, and Mutation. Am J Hum Genet. 2002 Nov;71(5):1227-34.

*7 SNPAlyze Ver.5.0.2 (or later) and HealthSketch Ver.1.1 (or later) are required for the data passing function.

*8 SNPAlyze Ver.7.0 (or later) and HealthSketch Ver.2.5 (or later) are required for Logistic Regression Analysis.