Generated on 11/6/2009 at 12:44:24 with version v0.3.2 (5 March 2009 11:12)

*Statistics for entities and category structure:
Known entities: 624658
Categories
:Categories with 100 entries and more (all words)
:Categories with 100 entries and more (first 2 words)
:Categories with 100 entries and more (first word)
:Clustering results

*Structure info of original file: /DATA_NLG/dewiki-20081011-pages-articles.xml
Stats
Total count : 67163150
Entries : 624648
Redirects : 467735
Homonyms : 193841

Score for the named entity classification task:
Label pers (52) Precision=0.830508474576271 Recall=0.942307692307692 FS=0.882882882882883 (49:59)
Label org (14) Precision=0.5625 Recall=0.642857142857143 FS=0.6 (9:16)
Label date (3) Precision=1 Recall=0.666666666666667 FS=0.8 (2:2)
Label place (42) Precision=0.760869565217391 Recall=0.833333333333333 FS=0.795454545454546 (35:46)
Label unk (50) Precision=0.848484848484849 Recall=0.56 FS=0.674698795180723 (28:33)
Label prod (26) Precision=0.645161290322581 Recall=0.769230769230769 FS=0.701754385964912 (20:31)
Page sets used for test = 209 (in test file 624648: 187 taken)
FScore = 0.742465101580511

Stats for graphs and redirects
Number of graphs: 624648
Redirects in graph = 1779131
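
The report does not say how the scores are derived. The figures are consistent with the following reading, which is an assumption checked only against the numbers shown: in each label line, the first parenthesised number is the count of reference items, and the trailing pair is correct predictions : total predictions. Under that reading, the overall FScore matches the unweighted (macro) average of the six per-label FS values. A minimal Python sketch of that arithmetic follows; the names `COUNTS`, `prf`, and `macro_f` are hypothetical and not taken from the tool.

```python
# Counts per label as (reference, correct, predicted), transcribed from the
# report under the ASSUMED reading "Label <name> (<reference>) ... (<correct>:<predicted>)".
COUNTS = {
    "pers": (52, 49, 59),
    "org": (14, 9, 16),
    "date": (3, 2, 2),
    "place": (42, 35, 46),
    "unk": (50, 28, 33),
    "prod": (26, 20, 31),
}


def prf(reference, correct, predicted):
    """Return (precision, recall, F-score) for one label."""
    precision = correct / predicted if predicted else 0.0
    recall = correct / reference if reference else 0.0
    total = precision + recall
    # F-score is the harmonic mean of precision and recall.
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


def macro_f(counts):
    """Unweighted mean of the per-label F-scores."""
    scores = [prf(*triple)[2] for triple in counts.values()]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    for label, triple in COUNTS.items():
        p, r, f = prf(*triple)
        print(f"Label {label} ({triple[0]}) Precision={p:.6f} Recall={r:.6f} FS={f:.6f}")
    print(f"FScore = {macro_f(COUNTS):.6f}")
```

With these counts, the reference totals sum to 187, which agrees with the "187 taken" figure in the test-set line.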