www.wikimeta.com



Also try the Wikimeta Semantic Labelling Tool, based on NLGbAse.

Browse metadata

Browse metadata in English, French and Spanish (using keyword search).

NLGbAse, connecting text to semantic web knowledge


NLGbAse is an architecture for producing metadata and components devoted to Natural Language Processing (NLP), semantic analysis, and labelling tasks. NLGbAse transforms encyclopedic text content into structured knowledge fully integrated with the Linked Data network and the Semantic Web.

You can read a description of NLGbAse in the LREC 2010 paper, and detailed information about the components on the publications page.

Applications of metadata:

Metadata is available in French, Spanish and English (and can be quickly trained for other languages). It is used to build machine learning applications and to automatically train NLP tools such as Named Entity Recognition systems, translation lexicons, and information extractors.
Major applications of the metadata are:

NLP tools of NLGbAse:

The metadata of NLGbAse is used to produce resources and training corpora for information extraction tools. The example below shows sample results of the Semantic NER Tool. The tool detects named entities and then links them to their RDF descriptions on the Linked Data network. You can try this application online using the Wikimeta labelling tool.

Semantic labelling example
(rdf) Djibouti [LOC] ( (rdf) Arabic [LOC] Jībūtī [ORG] , (rdf) Somali [LOC] : Jabuuti [LOC] ) , officially the (rdf) Republic of Djibouti [LOC] , is a country in the (rdf) Horn of Africa [LOC] . It is bordered by (rdf) Eritrea [LOC] in the north , (rdf) Ethiopia [LOC] in the west and south , and (rdf) Somalia [LOC] in the southeast . The remainder of the border is formed by the (rdf) Red Sea [LOC] and the Gulf of Aden [LOC] .
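
To make the labelled output above easier to reuse, here is a minimal Python sketch that pulls the linked entities out of such a string. It assumes the plain-text rendering shown in the example, where a linked entity is preceded by `(rdf)` and followed by a bracketed tag; the function name `extract_linked_entities` and the regular expression are illustrative assumptions, not part of the Wikimeta API. Entities that carry a tag but no `(rdf)` marker (such as "Gulf of Aden" above) have no explicit start delimiter in this rendering, so the sketch deliberately skips them.

```python
import re

# Assumed rendering: "(rdf) <entity text> [TAG]" for entities linked to an RDF description.
LINKED_ENTITY = re.compile(r"\(rdf\)\s+(.+?)\s*\[([A-Z]+)\]")


def extract_linked_entities(labelled_text):
    """Return (entity, tag) pairs for every entity preceded by an "(rdf)" marker."""
    return LINKED_ENTITY.findall(labelled_text)


# Excerpt of the sample output shown above.
SAMPLE = (
    "It is bordered by (rdf) Eritrea [LOC] in the north , "
    "(rdf) Ethiopia [LOC] in the west and south , "
    "and (rdf) Somalia [LOC] in the southeast . "
    "The remainder of the border is formed by the (rdf) Red Sea [LOC] "
    "and the Gulf of Aden [LOC] ."
)

if __name__ == "__main__":
    for entity, tag in extract_linked_entities(SAMPLE):
        print(f"{tag}\t{entity}")
```
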

Some major applications of our API are:
  • Fine-grained labelling of large corpora
  • Automatic detection of emerging concepts and news events in open text (press, streaming information)
  • Automatic detection of related concepts in corpora (e.g. what is the biotope of a protein?)
  • Dynamic generation of keyword clouds
  • etc.

What's in the metadata

NLGbAse is a database of graph sets, representing more than 14 million possible written forms of terms:
* Each set is a term with all its possible written forms in multiple languages (at the moment, French, English, Italian, Spanish and German).
* Each set has a named entity tag (Enamex standard) such as "Location", "Person", "Place", "Date", "Organisation", or "Unknown" for encyclopedic terms.
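
The two points above can be pictured with a small Python sketch: one record per set, holding a term, its entity tag, and its written forms grouped by language, plus a reverse index from written form to set. The class name `TermSet`, the field names, the language codes, and the sample forms are all illustrative assumptions; they do not reflect the actual NLGbAse storage format or its real content.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class TermSet:
    """One set: a term, its entity tag, and its written forms per language."""
    term: str
    tag: str  # e.g. "Location", "Person", "Unknown"
    forms: Dict[str, Set[str]] = field(default_factory=dict)  # language code -> written forms


def build_index(term_sets: List[TermSet]) -> Dict[str, List[TermSet]]:
    """Map every lower-cased written form to the sets that contain it."""
    index: Dict[str, List[TermSet]] = {}
    for term_set in term_sets:
        for forms in term_set.forms.values():
            for form in forms:
                bucket = index.setdefault(form.lower(), [])
                if term_set not in bucket:  # a form shared by two languages is indexed once
                    bucket.append(term_set)
    return index


# Hypothetical sample data, for illustration only.
djibouti = TermSet(
    term="Djibouti",
    tag="Location",
    forms={
        "en": {"Djibouti", "Republic of Djibouti"},
        "fr": {"Djibouti", "République de Djibouti"},
    },
)

index = build_index([djibouti])

if __name__ == "__main__":
    match = index["république de djibouti"][0]
    print(match.term, match.tag)
```

Keying the index on the written form rather than on the term is what lets any spelling, in any language, resolve to the same set and the same tag.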

http://www.nlgbase.org/fr_sample.png
Sample of conceptual graph for a [person] named entity.


This graph for a singer's pseudonym (Akhenaton) also gives his real name (Philippe Fragione). Such information, used in a music search engine, can help increase the coverage of a query.

Graph representation
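
As a rough illustration of that search-engine use case, the following Python sketch expands a query with every other name found in the same set. The function `expand_query` and the `ALIAS_SETS` structure are hypothetical; only the Akhenaton / Philippe Fragione pair comes from the graph described above.

```python
def expand_query(query, alias_sets):
    """Return the query plus every alias that shares a set with it (case-insensitive match)."""
    wanted = query.casefold()
    expanded = {query}
    for aliases in alias_sets:
        if any(alias.casefold() == wanted for alias in aliases):
            expanded.update(aliases)
    return sorted(expanded)


# One alias set per entity; the single entry mirrors the example graph.
ALIAS_SETS = [{"Akhenaton", "Philippe Fragione"}]

if __name__ == "__main__":
    # A search engine could OR these names together to widen recall.
    print(" OR ".join(f'"{name}"' for name in expand_query("Akhenaton", ALIAS_SETS)))
```
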
Sample of conceptual graph for a [location] entity


All the tools delivered on this website are free to use for academic and research purposes. If you find these tools useful, please cite the published papers related to this system (especially the LREC 2010 paper):

@inproceedings{Charton2010,
address = {Malta},
author = {Charton, Eric and Torres-Moreno, J.M.},
booktitle = {Proceedings of LREC 2010},
editor = {{LREC 2010}},
number = {1},
publisher = {LREC 2010},
title = {{NLGbAse: a free linguistic resource for Natural Language Processing systems}},
year = {2010}
}