Explanation of the functional distribution of gene lists.

For the statistical test we use the hypergeometric distribution.

This means we calculate the p-value as follows:

$N=$all_count; Number of genes in the genome
(e.g. 6723 genes)

$S=$fun_all{$fun_set_num1}; Number of genes in the functional category
(e.g. 1500 genes in functional category 01)

$n=$set_count; Number of genes in the gene list
(e.g. 4 genes)

$k=$count_set1; Hits: number of genes from the gene list that fall in the functional category
(e.g. 2 of the 4 genes in the gene list are in functional category 01)

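formula:

Probability of exactly k hits:

       (S)   (N-S)
       ( ) x (   )
       (k)   (n-k)
  p = -------------
           (N)
           ( )
           (n)

P-value (probability of k or more hits; the sum runs over i = k ... n,
and terms with i > S are zero):

                     n
                 ________
                 \         (S)   (N-S)
                  \        ( ) x (   )
                   \       (i)   (n-i)
  P-value = pp =    >     -------------
                   /           (N)
                  /            ( )
                 /_______      (n)
                    i=k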
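
The formula above can be sketched in a few lines of Python. This is only an illustration and not the code of the original script: the function names hypergeom_pmf and hypergeom_p_value are hypothetical, and the arguments N, S, n, k have the meanings defined above. The sum stops at min(n, S) because terms beyond S contribute nothing.

```python
from math import comb


def hypergeom_pmf(N, S, n, k):
    """Probability of exactly k hits (hypothetical helper name).

    N: genes in the genome, S: genes in the functional category,
    n: genes in the gene list, k: hits in the functional category.
    """
    return comb(S, k) * comb(N - S, n - k) / comb(N, n)


def hypergeom_p_value(N, S, n, k):
    """Probability of k or more hits (upper tail of the distribution)."""
    upper = min(n, S)  # more hits than this are impossible
    return sum(hypergeom_pmf(N, S, n, i) for i in range(k, upper + 1))


if __name__ == "__main__":
    # Example values taken from the explanation above.
    N, S, n, k = 6723, 1500, 4, 2
    print("p  =", hypergeom_pmf(N, S, n, k))
    print("pp =", hypergeom_p_value(N, S, n, k))
```

A small P-value means that the gene list contains more genes of the functional category than would be expected by drawing n genes at random from the genome.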