About entity classification rules (Nov 17, 2008)

What is it?

In the expression "named entity", the word "named" restricts the task to those entities for which one or many rigid designators, as defined by Kripke, stand for the referent. For instance, the automotive company created by Henry Ford in 1903 is referred to as Ford or Ford Motor Company. Rigid designators include proper names as well as certain natural kind terms like biological species and substances. [See Wikipedia for a description acceptable for beginners.]
The named entity tags inserted in our base are produced by learning rules. In the first version of our system, those rules followed the ENAMEX standard. Now, in view of the ESTER EN evaluation campaign, we are beginning to introduce a more precise classification with subclasses. Please see the ESTER2 guidelines (only in French at the moment) for a precise description.

Basic Rules


Loc

    City, country, street, place (e.g. "Mount", "Reserve Of" or "Beach"), transport station, named place on a map (river, mountain), monument (e.g. a cathedral or Wall Street).

Pers

    Individual, fictitious or real.

Fonc

    Label describing a political, religious or nobility function or title.

Org

    Company, human group (e.g. a music group like the "Beatles"), agency, non-profit organisation, non-governmental organisation, administration (e.g. the CIA or the Ministry of Culture), radio station, television network, press company, holding, school, award (like the César or the Nobel Prize), museum, military corps (e.g. Régiment de Légion Etrangère), hotel, resort (like Disneyland Orlando), sporting club, sport organisation, union, political group.

Prod

    TV show, media product (e.g. the name of a DVD), trademarked product (e.g. "Mars" or "m&m"), sport event (e.g. "Tour de France"), movie, theatre show, computer language (e.g. Basic or Java), generally sold product (e.g. the trademarked name of a car, computer or plane), military product (missiles, helicopters), book, comics ...

Unk

    All other non-named data, notably "[year] in [Subject]" entries and encyclopedic entries (e.g. metal, noun, animals).
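
For readers who handle these tags in their own scripts, the basic classes above can be kept in a small lookup table. The Python sketch below is only an illustration: the names BASIC_TAGS and is_basic_tag are hypothetical and not part of our system, and the one-line summaries are abbreviated from the definitions above.

```python
# Hypothetical lookup table for the basic tags described above.
# The summaries are shortened; the full definitions are in the text.
BASIC_TAGS = {
    "Loc": "city, country, street, place, transport station, monument",
    "Pers": "individual, fictitious or real",
    "Fonc": "political, religious or nobility function or title",
    "Org": "company, human group, agency, administration, school, club",
    "Prod": "show, media product, trademarked product, event, book",
    "Unk": "all other non-named data",
}


def is_basic_tag(tag: str) -> bool:
    """Return True if `tag` names one of the basic classes (case-insensitive)."""
    known = {name.lower() for name in BASIC_TAGS}
    return tag.strip().lower() in known
```

The comparison ignores case, so "loc" and "Loc" are both accepted; a tag such as "animal" is rejected until it is added to the table.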


Time, amounts


Rules for sub-class extensions

Where possible, each tag is extended with a subclass description, e.g. fonc.pol or fonc.mil.
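
A subclassed tag such as fonc.pol can be split back into its basic class and its subclass. The following is a minimal Python sketch, assuming that the dot is the only separator; the function name split_tag is hypothetical and not part of our system.

```python
from typing import Optional, Tuple


def split_tag(tag: str) -> Tuple[str, Optional[str]]:
    """Split 'fonc.pol' into ('fonc', 'pol'); a bare tag has no subclass."""
    # partition() splits on the first dot only, so anything after it
    # (including further dots) is returned as the subclass part.
    base, sep, sub = tag.strip().partition(".")
    if not base:
        raise ValueError(f"empty base class in tag: {tag!r}")
    # No dot, or a dot with nothing after it: treat as no subclass.
    return base, (sub if sep and sub else None)


print(split_tag("fonc.pol"))  # ('fonc', 'pol')
print(split_tag("Loc"))       # ('Loc', None)
```

A tag without a subclass is returned unchanged with None as its second element, so tags from the first version of the system pass through the same function.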

The classification rules can be modified. If you want to use tags like "animal", "car", or "mountain", you just need to modify the learning class files with the appropriate document files.