| Organism | Source and release | Number of sequences |
| --- | --- | --- |
| Oryza sativa | v6 | 67393 |
| Arabidopsis thaliana | v9 | 33200 |
| Medicago truncatula | v2 | 38749 |
| Sorghum bicolor | v1.4 | 36338 |
| Populus trichocarpa | v1.1 | 45555 |
| Vitis vinifera | v1 | 30434 |
| Physcomitrella patens | v1.1 | 35938 |
| Selaginella moellendorffii | v1 | 34697 |
| Glycine max | v1 | 75778 |
| Ostreococcus tauri | v2 | 7725 |
| Chlamydomonas reinhardtii | v4 | 16706 |
| Cyanidioschyzon merolae | v1 | 5014 |
| Brachypodium distachyon | v1 | 44411 |
| Carica papaya | n/a | 24782 |
| Ricinus communis | v0.1 | 31221 |
| Zea mays | v4 | 53764 |
Remark about sequence identifiers
Sequences are usually identified by the locus tags defined by the consortia responsible for the annotation (e.g. At5g20240.1). For some draft genomes, we have modified the identifiers by combining the scaffold ID with a species code (e.g. Phypa_96903).
Correspondence with UniProtKB [4] (last update: 22 May 2009) was established using the ordered locus name when available (in the 'Gene names' section).
Otherwise, the mapping was based on the top BLAST hit, provided it had an identity score > 90%.
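The two-step mapping described above can be sketched as follows. This is only an illustration, not the pipeline actually used: the function name `map_to_uniprot`, the input structures, and the accessions in the usage note are hypothetical. It also assumes that the gene-model suffix (e.g. `.1`) is dropped before comparing locus names, that the comparison is case-insensitive, and that only the single best BLAST hit is considered.

```python
from typing import Dict, List, Optional, Tuple

IDENTITY_THRESHOLD = 90.0  # percent identity cutoff stated in the text


def map_to_uniprot(
    seq_id: str,
    locus_to_accession: Dict[str, str],
    blast_hits: List[Tuple[str, float]],
) -> Tuple[Optional[str], Optional[str]]:
    """Return (accession, method), or (None, None) if no mapping is found.

    locus_to_accession: hypothetical index of upper-cased ordered locus
        names (from the 'Gene names' section) to UniProtKB accessions.
    blast_hits: (accession, percent_identity) pairs for this sequence,
        ordered best hit first.
    """
    # Step 1: direct match on the ordered locus name.
    # Assumption: the suffix after the first '.' is a gene-model number.
    locus = seq_id.split(".")[0].upper()
    if locus in locus_to_accession:
        return locus_to_accession[locus], "ordered_locus"

    # Step 2: fall back to the top BLAST hit if its identity exceeds 90%.
    if blast_hits:
        accession, identity = blast_hits[0]
        if identity > IDENTITY_THRESHOLD:
            return accession, "blast"

    return None, None
```

For example, with placeholder accessions, `map_to_uniprot("At5g20240.1", {"AT5G20240": "ACC001"}, [])` resolves through the locus index, while `map_to_uniprot("Phypa_96903", {}, [("ACC002", 95.2)])` falls back to the BLAST hit.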
Uniprot Taxonomy
KEGG [5] data were downloaded from the KEGG Orthology (KO) database when available (last update: 02/02/2009).
GO terms, particularly the Plant GO slim, were obtained from InterPro and UniProt.
Clusters are tagged with selected PubMed IDs referenced in UniProt entries. Annotators can also add their own publications.