The Comprehensive Systems Biology Project (CSB)
CSB.DB (CSB.DB@MPIMP)
- A Comprehensive Systems-Biology Database -
Hosted at Max Planck Institute of Molecular Plant Physiology
NOTIFICATION: Permanent migration of all CSB.DB services and functionalities on 1.1.2016.
Herewith we inform you that all CSB.DB databases were migrated at the beginning of 2016. This includes all gene correlation and expression databases, the GMD@CSB.DB module, and all associated databases.
The BestFit software, a tool for non-aqueous fractionation data analysis, will also be made available by the Experimental Systems Biology Research Group headed by Dr. Patrick Giavalisco.
We thank all users, contributors, and collaborators of CSB.DB at the Max Planck Institute for their long-standing support.
Yours sincerely, the CSB.DB Curator and the CSB.DB Developmental Core Team
Help@CSB.DB: Coefficient - Transcript Co-Response Queries
If you want to get help directly related to a page/query, use the Info Pages / Medium Info Pages. Direct links are available at each (Query) Page.

If you are completely lost, here is a link to a short description of what CSB.DB is and is not.

Coefficient

Co-response can be measured by various statistical coefficients. Currently, no public convention exists as to which numerical approach is best applied to detect and validate co-response of changing transcript levels. For this reason we integrated a range of different statistical and computational algorithms, which are routinely applied in various research areas.
For information on a particular coefficient, see the corresponding section below.
Correlation Coefficients:

Correlation Coefficient


Correlation coefficients measure the magnitude and direction of the association between two variables. The correlation between them reflects the degree to which both are related. Correlation coefficients range from +1 to -1. The closer the correlation is to either +1 or -1, the stronger the relationship. If the correlation is near zero, there is little or no linear (or, for rank-based coefficients, monotonic) association between the two variables.
The magnitude is the strength of the correlation (relationship), that is, the strength of the tendency of two variables, X and Y, to move in the same (or opposite) direction. The direction of the correlation indicates how the two variables are related: a positive correlation means that the variables move in the same direction, a negative correlation that they move in opposite directions. In a positive relationship, as one variable increases, the other increases as well. In a negative relationship, as one variable increases, the other decreases.

Spearman's rs


Spearman's rs is a non-parametric measure of correlation. The Spearman correlation is based on the ranked observations of the two variables and therefore makes no assumption about the distribution of the values. It ranges from +1 to -1. Spearman's rs is the non-parametric 'variant' of the Pearson correlation, computed on ranked data, and can detect linear as well as non-linear monotonic relationships (continuously increasing or decreasing). Spearman's rs is robust to outliers and does not require bivariate normally distributed observations.
Computation: Spearman's rs was computed as Pearson's correlation on ranked expression measurements according to Sokal & Rohlf (1995). Confidence Interval & Power of Test were determined as suggested by Bonett & Wright (2000).
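
The description above (Pearson's correlation applied to ranks) can be sketched in a few lines of Python. This is only an illustrative sketch, not the CSB.DB implementation: the function names are hypothetical, and assigning tied values the mean of their ranks is an assumption, since the tie handling is not stated here.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks (assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values that starts at sorted position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product moment correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rs: Pearson's correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

For a monotonic but non-linear pair such as x = [1, 2, 3, 4, 5] and y = [1, 4, 9, 16, 25], `spearman(x, y)` reaches 1 while `pearson(x, y)` stays below 1, which illustrates why the rank-based coefficient also captures non-linear monotonic relationships.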

Kendall's τ


Kendall's τ is a non-parametric measure of correlation, which is intended to measure the strength of relationship. In contrast to other coefficients, the algebraic structure of Kendall's τ is much simpler. It can be computed directly from the actual observations without first converting them to ranks, as well as from ranks. It ranges from +1 to -1. Kendall's τ can measure linear as well as non-linear monotonic relationships (continuously increasing or decreasing) and is robust to outliers. A bivariate normal distribution of observations for the variables is not required to compute Kendall's τ.
Computation: Kendall's τ was computed according to Sokal & Rohlf (1995). Confidence Interval & Power of Test were determined as suggested by Bonett & Wright (2000).
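
To illustrate how Kendall's τ can be obtained directly from the observations, the following Python sketch counts concordant and discordant pairs. It implements the simple tau-a variant without any correction for ties, which is an assumption made for brevity; the formula of Sokal & Rohlf (1995) used by CSB.DB may treat ties differently. The function name is hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs.

    Tied pairs count as neither concordant nor discordant (no tie correction).
    """
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables change in the same direction
        elif sign < 0:
            discordant += 1   # the variables change in opposite directions
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs
```

For example, with x = [1, 2, 3, 4] and y = [1, 3, 2, 4] there are six pairs, five concordant and one discordant, so the sketch yields (5 - 1) / 6.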

Pearson's ρ (r)


The most common measure of correlation is the Pearson Product Moment Correlation (Pearson's ρ or Pearson's r). The Pearson correlation is a parametric measure of correlation and reflects the degree of linear relationship between two variables that are on an interval or ratio scale. It ranges from +1 to -1. The Pearson correlation can be affected by outliers, which may strongly increase or decrease the apparent strength of the relationship. The observations for both variables should be approximately (bivariate) normally distributed.
Computation: Pearson's ρ, Confidence Interval & Power of Test were computed according to Sokal & Rohlf (1995).
To compute the p-value yourself you can transform Pearson's r into approximately t distributed values with N-2 degrees of freedom as follows:
t = r / sqrt[(1 - r²) / (N - 2)]
To compute a two-tailed p-value in Excel you can therefore use the following expression, where THE_CELL is the cell containing r and N is the number of observations: =TDIST(ABS(THE_CELL/SQRT((1-THE_CELL*THE_CELL)/(N-2))),(N-2),2)
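
The same transformation can be sketched outside a spreadsheet. The Python example below computes the t statistic from r and N and approximates the two-tailed p-value by numerically integrating the Student t density with Simpson's rule. The function names and the number of integration steps are assumptions for illustration; a statistics library would normally be preferred for production use.

```python
from math import exp, lgamma, log, pi, sqrt


def t_statistic(r, n):
    """Transform Pearson's r into a t value with n - 2 degrees of freedom."""
    return r / sqrt((1 - r * r) / (n - 2))


def t_pdf(x, df):
    """Density of Student's t distribution (log-gamma avoids overflow)."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + x * x / df))


def two_tailed_p(r, n, steps=2000):
    """Approximate two-tailed p-value for r; steps must be even (Simpson's rule)."""
    if abs(r) >= 1:
        return 0.0  # perfect correlation: t is infinite
    df = n - 2
    upper = abs(t_statistic(r, n))
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # probability mass between 0 and |t|
    return max(0.0, 1 - 2 * area)
```

The result corresponds to the two-tailed setting (last argument 2) of the TDIST expression above.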
Mutual Information & Distance Measure

Mutual Information (mi)


The mutual information (mi) is a general measure of statistical dependence between two random variables. It quantifies the reduction in the uncertainty of one random variable given knowledge about another random variable. The natural logarithm was used to compute the mutual information. The MI was adjusted for systematic bias and converted into a distance by d(MI) = exp(-MI). The d(MI) is in the range of 0 to 1. Values near zero represent strong dependency, whereas 1 reflects that gene A and gene B are independent.
Computation of d(MI): The mi of gene A and gene B was computed based on the Shannon entropy (Shannon, 1948; overview: Steuer et al., 2002). Instead of using the log base 2 (Steuer et al., 2002, eq.(4)), we used the natural logarithm (ln). The obtained mutual information was adjusted for systematic bias (Steuer et al., 2002, eq.(20)) to obtain the true mi. Negative values obtained after this adjustment were treated as missing observations ('NA'). The true mi was then converted into a distance by use of d(MI) = exp(-MI). d(MI) is in the range of 0 to 1.
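
As a rough illustration of the steps described above, the following Python sketch estimates the mutual information from a simple equal-width histogram using the natural logarithm and converts it to a distance with exp(-MI). The binning scheme, the default number of bins, and the function names are assumptions for illustration only, and the bias adjustment of Steuer et al. (2002, eq.(20)) is deliberately omitted.

```python
from collections import Counter
from math import exp, log


def bin_indices(values, bins):
    """Assign each value to one of `bins` equal-width bins (hypothetical scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    # The maximum value would fall into bin index `bins`, so clamp it.
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=2):
    """Histogram estimate of MI in nats (natural logarithm), without bias adjustment."""
    bx, by = bin_indices(x, bins), bin_indices(y, bins)
    n = len(x)
    joint = Counter(zip(bx, by))
    count_x, count_y = Counter(bx), Counter(by)
    mi = 0.0
    for (i, j), c in joint.items():
        # p(x,y) * ln( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log(c * n / (count_x[i] * count_y[j]))
    return mi


def mi_distance(mi):
    """Convert a non-negative MI into a distance in the range 0 to 1."""
    return exp(-mi)
```

With this conversion an MI of zero (independent profiles) maps to a distance of 1, and increasing MI pushes the distance toward 0.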

Euclidean Distance


The Euclidean distance is a commonly used measure of distance. The distance between two points is the length of the straight line connecting them. For two vectors of observations, e.g. expression measurements for gene A and gene B, the Euclidean distance is the square root of the sum of the squared differences between their corresponding elements. Normalization can be done by dividing the Euclidean distance by the maximum of all Euclidean distances. The normalized Euclidean distance is in the range of 0 to 1. A value of 0 means the expression levels of gene A and gene B are closely related, whereas 1 reflects the most distant behaviour.
Computation of d(E): The Euclidean distance was computed according to Mirkin (1996). To normalize the Euclidean distance, each distance was divided by the maximum distance obtained for the whole matrix: d(E) = dE / max(dE). The resulting normalized Euclidean distances are in the range of 0 to 1.
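
The normalization d(E) = dE / max(dE) can be sketched as follows in Python. The standard definition of the Euclidean distance is used here; whether this matches the formulation of Mirkin (1996) in every detail is an assumption, and the function names and the dictionary layout of the expression profiles are hypothetical.

```python
from itertools import combinations
from math import sqrt


def euclidean(a, b):
    """Square root of the sum of squared element-wise differences."""
    return sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def normalized_distances(profiles):
    """Map each gene pair to its Euclidean distance divided by the matrix maximum.

    `profiles` maps a gene name to its vector of expression measurements.
    """
    raw = {
        (g, h): euclidean(profiles[g], profiles[h])
        for g, h in combinations(sorted(profiles), 2)
    }
    d_max = max(raw.values())
    if d_max == 0:
        return {pair: 0.0 for pair in raw}  # all profiles are identical
    return {pair: d / d_max for pair, d in raw.items()}
```

For three hypothetical profiles A = [0, 0], B = [3, 4] and C = [6, 8], the raw distances are 5, 10 and 5, so the pair (A, C) receives the normalized distance 1 and the other two pairs 0.5.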

For suggestions and questions feel free to contact the CSB.DB curator.