NLGbAse is an information extraction system based on structured information built from Wikipedia and wiki syntax. NLGbAse contains more than 2.7 million multilingual entities. Each entity carries statistical and semantic information. This information is exploited by information retrieval algorithms that can, in principle, extract an unlimited number of facts associated with each entity. The final objective of the project is to build a robust natural language modeling system.
When?
NLGbAse has been under development since August 2008. The first online version was launched in September 2008.
Who works or has worked on it or with it?
Dr Eric Charton: main author and designer of the project and its code. The project is part of E. Charton's Ph.D. thesis. Some of the published algorithms have been studied and developed at École Polytechnique de Montréal.
Also:
- Audrey Laroche and Philippe Langlais: worked on term translation
Revisiting Context-based Projection Methods for Term-Translation Spotting in Comparable Corpora. Paper accepted at Coling 2010
- Ludovic Bonnefoy and Romain Devaud: Master's students, worked on a Q&A prototype
Interrogations de moteurs de recherche par des requêtes formulées en langage naturel
Ludovic and Romain's student paper, published at Majestic 2010, received a Best Student award. See their paper here.
- Raphaël Rubino: Ph.D. student, works on information extraction for medical translation applications
See Raphaël's RANLP 2009 student paper here.
... And you?
Who helped us?
- Pr Juan Manuel Torres Moreno - Laboratoire Informatique d'Avignon, University of Avignon (scientific advisor and referee)
- Pr Georges Linares - Laboratoire Informatique d'Avignon, University of Avignon (infrastructure allowance from 2008 to 2009)
- Laboratoire Informatique d'Avignon, University of Avignon (publication grants in 2008 and 2009)
(c) Information:
To date, all parts of NLGbAse (including the tools and the database) are free to use and redistribute, provided that you keep the (c) information and cite the original provider (with a link to this site).
NLGbAse is built with mathematical algorithms from XML files provided by Wikipedia, the Free Encyclopedia.
No part (texts, sentences, images) of the generated, redistributable and downloadable NLGbAse content and tools is covered by the (c) policy of the Wikimedia Foundation.
NLGbAse is copyrighted (c) by E. Charton, 2008, 2009, 2010, 2011.
Cache files that might be displayed on this site are extracted from the XML files provided by Wikipedia, the Free Encyclopedia. Those texts are copyrighted and redistributable under the following conditions.
The text contained in Wikipedia is copyrighted (automatically, under the Berne Convention) by Wikipedia contributors and licensed to the public under the GNU Free Documentation License (GFDL). The full text of this license is at Wikipedia:Text of the GNU Free Documentation License.
Website publisher:
This web site is hosted by OVH. The publisher and owner of the site is Mr Eric Charton, as an individual. The applicable law for hosting responsibility is probably French law (as the site is hosted in France), but this is not certain, as the owner lives in North America. We are not sure which law applies to responsibility for the content. So if you want to go to court for any reason, it is probably better to talk with the owner first: he is very kind and polite ...
