Generated on 11/19/2009 at 16:51:18 with version v0.3.2 (5 March 2009 11:12)

*Statistics for entities and category structure:


Known entities:737623

Categories :Categories with 100 entries or more (all words)

:Categories with 100 entries or more (first 2 words)

:Categories with 100 entries or more (first word)

:Clustering results



*Structure info of original file:/DATA_NLG/frwiki-20090315-pages-articles.xml

Stats
Total count : 86957831
Entries : 737613
Redir : 901592
Homonyms : 176209


Score for the named entity classification task:

Label fonc (12) Precision=1 Recall=0.75 FS=0.857142857142857 (9:9)
Label pers (175) Precision=0.939226519337017 Recall=0.971428571428571 FS=0.955056179775281 (170:181)
Label org (57) Precision=0.771929824561403 Recall=0.771929824561403 FS=0.771929824561403 (44:57)
Label time (32) Precision=0.964285714285714 Recall=0.84375 FS=0.9 (27:28)
Label loc (150) Precision=0.934640522875817 Recall=0.953333333333333 FS=0.943894389438944 (143:153)
Label unk (130) Precision=0.913385826771654 Recall=0.892307692307692 FS=0.90272373540856 (116:127)
Label prod (93) Precision=0.861702127659574 Recall=0.870967741935484 FS=0.866310160427808 (81:94)

Page sets used for test = 716 (in test file 737613: 649 taken) FScore global=0.908668158220537 (local=0.885293878107836)
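The per-label scores above can be rechecked from the counts printed on each line. The sketch below is not the tool's own code. It assumes that the number in parentheses after the label is the count of reference (gold) entities, and that the trailing "(a:b)" pair is correct:predicted. It also assumes that the "local" FScore is the plain mean of the per-label F-scores and the "global" FScore is their mean weighted by gold count. These readings are inferred from the figures only; the names COUNTS, prf, local_f and global_f are illustrative.

```python
# Per-label counts copied from the report: label -> (gold, correct, predicted).
# ASSUMPTION: "(N)" after the label is the gold count and "(a:b)" is
# correct:predicted. The report itself does not document these fields.
COUNTS = {
    "fonc": (12, 9, 9),
    "pers": (175, 170, 181),
    "org": (57, 44, 57),
    "time": (32, 27, 28),
    "loc": (150, 143, 153),
    "unk": (130, 116, 127),
    "prod": (93, 81, 94),
}


def prf(gold, correct, predicted):
    """Return (precision, recall, F-score) for one label."""
    precision = correct / predicted
    recall = correct / gold
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


scores = {label: prf(*counts) for label, counts in COUNTS.items()}
total_gold = sum(gold for gold, _, _ in COUNTS.values())

# ASSUMPTION: "local" is the unweighted mean of the per-label F-scores.
local_f = sum(f for _, _, f in scores.values()) / len(scores)

# ASSUMPTION: "global" is the mean of the F-scores weighted by gold count.
global_f = sum(COUNTS[label][0] * scores[label][2] for label in COUNTS) / total_gold

for label, (p, r, f) in scores.items():
    print(f"Label {label} Precision={p:.6f} Recall={r:.6f} FS={f:.6f}")
print(f"gold total={total_gold} FScore global={global_f:.6f} (local={local_f:.6f})")
```

Under these assumptions the gold counts sum to 649, which matches the "649 taken" figure, and both averages agree with the reported global and local values.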


Stats for graphs and redirections
Number of graphs:737613
Redirects in graph=2278785