Partners:
Jena, Germany
London, UK
Utrecht, Netherlands
Amsterdam, Netherlands
Barcelona, Spain
Bonn, Germany
Heidelberg, Germany
Szeged, Hungary
Zurich, Switzerland
Marie Curie Action:
035987
|
|
Genotype-to-Phenotype Research Strategies
Standard (logistic) regression connects genotype with phenotype in a direct way, thus
greatly simplifying biology. In fact, genes code for proteins or RNA ("gene products")
which may interact in a variety of ways and influence the phenotype only after a
cascade of intermediate steps. Molecular-genetic Neural Nets (NNs) generalize
standard regression analysis in a very natural way by (1) implementing multistage gene
products through one or more intermediate "layer(s)", and (2) allowing for
(linear/nonlinear) interactions between genes and between gene products.
Molecular-Genetic Neural Nets
It is the advantage of NNs that the specific knowledge about the cascade of intermediate
steps, which ultimately lead from genotype to phenotype, can be incomplete or even unknown
("hidden layers"). In this case, the model’s gene product layers lack direct interpretation
and act in the sense of a "black box". However, the influence of each single gene
on the phenotype, as well as the interactions between genes, can always be quantified and
detailed through analysis of the weight matrices of the fitted model. In the simplest form,
molecular-genetic NNs connect each gene with its gene product, while these gene products
contribute to a one-dimensional phenotype, for example, IgM level or time to response to
treatment. Interactions between gene products are modeled explicitly by implementing one or
more gene product layers.
Backpropagation Algorithm
NN connects the "neurons" of input and output layers via one or more "hidden" layers. All
outputs are computed using sigmoid thresholding of the scalar product of the corresponding
weight and input vectors. Outputs at stage "s" are connected to each input of stage "s+1".
NN connections are realized through (1) weight matrices and (2) model fitting algorithms
minimizing an error function in the weight space (goodness of fit). The most popular fitting
strategy, the backpropagation algorithm, looks for the minimum of the error function using
the method of gradient descent:
Selecting an Initial Configuration
In principal, NNs may be used for selecting genes (SNPs) out of a pool of candidates. The
computational burden of such an approach can become unrealistic for larger data sets, in
particular when reproducibility has to be tested through k-fold cross-validation. On the
other hand, molecular-genetic NNs possess a sufficient performance when an initial gene
configuration is available —either through a priori knowledge or derived through other
methods— so that the initial configuration can be optimized by systematically adding or
removing genes.
|
|
Molecular-genetic Neural Nets may connect multiple genetic factors, as observed in each
individual patient, through a layer of gene products to a one-dimensional phenotype, for example,
IgM level, Within-pair concordance of monozygotic twins, or time to response to treatment under
consideration of interactions between all gene products. The model can easily be generalized to
multidimensional phenotypes, for example, the syndrome patterns underlying schizophrenic or
bipolar illness.
|