Professor Dr. med. Christian Scharfetter

Dept. of Psychiatry, Psychotherapy & Psychosomatics

Psychiatric Hospital, University of Zurich


Predicting IgM Levels from Multilocus Genotype

A genetically predisposed aberrancy of the inflammatory response system has been linked to various complex diseases. Of particular clinical interest, therefore, are "objective" classifiers that enable reliable prediction and thereby offer the opportunity for early intervention prior to the onset of clinical manifestations. To investigate the extent to which IgM levels can be reproducibly predicted for an individual patient from his or her multilocus genotype, we carried out a Neural Network (NN) analysis on a sufficiently large sample (n=1,042; genotyped for 5,728 SNPs of a conventionally designed 0.4 Mb genome scan) under the constraint of 10-fold cross-validation.

Since NN results tend to be over-optimistic even under stringent cross-validation, we were interested in the reproducibility of predictors across populations ("training" versus "test" samples) and across SNP sets (conventionally designed genome scan versus anonymous 500k-chip). To address these questions, we relied on independent test samples (n=746; genotyped for 545,080 SNPs of a 500k-chip) along with 6 different SNP sets, each comprising 5,728 SNPs drawn from the 500k-chip under the constraint of maximum informativeness and compatibility with the training SNPs.
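To make the 10-fold cross-validation constraint concrete, the following is a minimal sketch of how n=1,042 samples could be partitioned into 10 folds. It is not the procedure actually used in the study; the function name `kfold_indices`, the random seed, and the round-robin fold assignment are assumptions made for illustration only.

```python
import random

def kfold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Hypothetical helper: samples are shuffled once and then dealt
    round-robin into k folds, so fold sizes differ by at most one.
    """
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        # Training set: all samples that are not in the held-out fold.
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield train, test

# Sample size taken from the text; each sample is held out exactly once.
splits = list(kfold_indices(1042, k=10))
```

With 1,042 samples, this partition gives two folds of 105 and eight folds of 104 samples, and every sample serves as test data in exactly one fold.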

SNP Selection for Cross-Validation

Based on NCBI36 data, the coordinates X(k) of the 5,728 SNPs of our training sample were used to define surrounding intervals X(k)±0.1 Mb (k=1,2,...,5,728). Typically, 50-80 SNPs of the 500k-chip were located in each of these intervals and served as the pool from which 8 "optimal" SNPs were selected in terms of informativeness and vicinity to the original locus at X(k). Finally, 6 subsets of 5,728 SNPs each were constructed by randomly combining the selected SNPs from each interval [Figure]. This process led to mutual overlaps between the 6 subsets in the range of 14.6-16.6%. Due to missing data, typically 40 SNPs (0.7%) of each resulting set had to be excluded from analysis, so that on average only 5,688 SNPs per set were available for testing.
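The selection steps described above can be sketched roughly as follows. This is an illustrative reading under stated assumptions, not the authors' implementation: positions are treated as base-pair coordinates on a single chromosome, `informativeness` is a hypothetical per-SNP score (the text does not define the measure), candidates are ranked by score first and proximity second, and all function names are invented for this example.

```python
import random
from bisect import bisect_left, bisect_right

WINDOW_BP = 100_000  # +/- 0.1 Mb around each training locus X(k)

def candidates_in_window(chip_positions, center, window=WINDOW_BP):
    """Chip SNP positions within center +/- window (chip_positions must be sorted)."""
    lo = bisect_left(chip_positions, center - window)
    hi = bisect_right(chip_positions, center + window)
    return chip_positions[lo:hi]

def pick_optimal(cands, center, informativeness, n_best=8):
    """Assumed ranking: most informative first, ties broken by distance to the locus."""
    ranked = sorted(cands, key=lambda p: (-informativeness[p], abs(p - center)))
    return ranked[:n_best]

def build_subsets(training_positions, chip_positions, informativeness,
                  n_subsets=6, seed=0):
    """Draw one SNP per interval for each subset, at random from the "optimal" pool."""
    rng = random.Random(seed)
    subsets = [[] for _ in range(n_subsets)]
    for center in training_positions:
        pool = pick_optimal(candidates_in_window(chip_positions, center),
                            center, informativeness)
        if not pool:
            continue  # no chip SNP near this locus
        for subset in subsets:
            subset.append(rng.choice(pool))
    return subsets

def overlap_fraction(a, b):
    """Fraction of intervals in which two subsets picked the same SNP."""
    if not a:
        return 0.0
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)

# Toy data (invented): a chip SNP every 10 kb, three training loci.
chip = list(range(0, 1_000_000, 10_000))
scores = {p: 0.5 for p in chip}
subsets = build_subsets([200_000, 500_000, 800_000], chip, scores)
```

Under this simple model, a uniform draw from 8 candidates per interval gives an expected pairwise overlap of 1/8 = 12.5%, somewhat below the reported 14.6-16.6%. The actual procedure presumably differed in detail, for instance where fewer than 8 suitable candidates were available in an interval.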

Reproducibility of Multilocus Configuration

With clusters defined as at least 3 SNPs within a 0.5 Mb region, the training step yielded a configuration of 15 genomic loci (61 SNPs) that served as the reference for subsequent investigations into the reproducibility of classifiers across populations and SNP sets. Unexpectedly, the same algorithm applied to the 746 test samples with the 6 competing SNP sets typically yielded relatively reproducible results for 4 of the 6 SNP sets, whereas the results for the other 2 SNP sets consistently turned out to be largely arbitrary. Given current results, no more than 5 of the 15 genomic loci derived from the training samples appear to be reproducible in the test samples and independent of the SNP set.
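The cluster criterion (at least 3 SNPs within a 0.5 Mb region) and the notion of a locus being reproduced can be illustrated with a short sketch. This is one possible reading, not the algorithm used in the study: the greedy left-to-right scan, the rule that a locus counts as reproduced when the two cluster spans overlap, and the names `find_clusters` and `count_reproduced` are all assumptions. Positions are base-pair coordinates on a single chromosome.

```python
def find_clusters(positions, max_span=500_000, min_snps=3):
    """Greedy scan: group SNPs whose distance to the first SNP of the group
    is at most max_span; keep groups with at least min_snps members."""
    pos = sorted(positions)
    clusters = []
    i = 0
    while i < len(pos):
        j = i
        # Extend the group while the next SNP stays within max_span of pos[i].
        while j + 1 < len(pos) and pos[j + 1] - pos[i] <= max_span:
            j += 1
        if j - i + 1 >= min_snps:
            clusters.append(pos[i:j + 1])
            i = j + 1  # continue after the accepted cluster
        else:
            i += 1     # too few SNPs: try the next SNP as a starting point
    return clusters

def count_reproduced(reference, test):
    """Count reference clusters whose span overlaps at least one test cluster."""
    def overlaps(a, b):
        return a[0] <= b[-1] and b[0] <= a[-1]
    return sum(any(overlaps(r, t) for t in test) for r in reference)

# Toy positions (invented): two groups qualify as clusters, one pair does not.
training_hits = [100_000, 250_000, 400_000,
                 2_000_000, 2_100_000,
                 5_000_000, 5_200_000, 5_450_000, 5_490_000]
reference = find_clusters(training_hits)
```

In the toy data, the pair at 2.0-2.1 Mb is discarded because it contains only 2 SNPs, while the groups near 0.1-0.4 Mb and 5.0-5.49 Mb each qualify as a cluster.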

References

Stassen HH, Szegedi A, Scharfetter C: Modeling Activation of Inflammatory Response System. A Molecular-Genetic Neural Network Analysis. BMC Proceedings 2007, 1 (Suppl 1): S61, 1-6
Stassen HH, Anghelescu IG, Hell D, Hoffmann K, Rujescu D, Scharfetter C, Szegedi A, Tadic A: Linking autoantibody formation to genetic vulnerability to psychiatric disorders and psychotropic drug response. Int J Neuropsychopharmacol. 2008; 11 (Suppl. 1): 101
Stassen HH, Hoffmann K, Scharfetter C: The Difficulties of Reproducing Conventionally Derived Results through 500k-Chip Technology. BMC Genet 2009; 3 Suppl 7: S66
Stassen HH, Braun S, Bridler R, Seifritz E, Weisbrod M: Inflammatory Processes and Schizophrenia: Evidence from a Twin Study. Eur Neuropsychopharmacology 2017; 27 Suppl 4: S934-S935
Braun S, Bridler R, Müller N, Schwarz MJ, Seifritz E, Weisbrod M, Zgraggen A, Stassen HH: Inflammatory Processes and Schizophrenia: Two Independent Lines of Evidence from a Study of Twins Discordant and Concordant for Schizophrenic Disorders. European Archives of Psychiatry and Clinical Neuroscience 2017; 267: 377-389
Stassen HH: Heterogeneity of schizophrenic disorders and link to chronically elevated IgM values. Neurology, psychiatry and brain research 2018; 29: 23-24

 

Reproducing Conventionally Derived Results through 500k-Chip Technology

Independent training and test sets were used to address the question of population dependence. To quantify SNP-set dependence, 6 randomly selected subsets of SNPs from the 500k-chip data were used, each comprising 5,728 SNPs. Selection criteria for the 6 subsets were: (1) SNPs from ±0.1 Mb intervals around the SNPs of the original training set; (2) maximum informativeness of newly selected SNPs; and (3) less than 20% overlap between any two of the 6 subsets.