WordNet is a semantic lexicon for the English language. It groups English words into sets of synonyms called synsets, provides short definitions, and records the various semantic relations between these synonym sets. The purpose is twofold: to produce a combination of dictionary and thesaurus that is more intuitively usable, and to support automatic text analysis and artificial intelligence applications. The database and software tools have been released under a BSD style license and can be downloaded and used freely. The database can also be browsed online.

WordNet was created and is being maintained at the Cognitive Science Laboratory of Princeton University under the direction of psychology professor George A. Miller. Development began in 1985. Over the years, the project received about $3 million of funding, mainly from government agencies interested in machine translation.

Database contents

As of 2005, the database contains about 150,000 words organized in over 115,000 synsets for a total of 203,000 word-sense pairs; in compressed form, it is about 12 megabytes in size.

WordNet distinguishes between nouns, verbs, adjectives and adverbs on the assumption that these are stored differently in the human brain. Every synset contains a group of synonymous words or collocations (a collocation is a sequence of words that go together to form a specific meaning, such as "car pool"); words typically participate in several synsets. The meaning of each synset is further clarified by a short defining gloss.

Every synset is connected to other synsets via a number of relations. These relations vary according to the type of word:

Nouns
  synonyms: synsets with similar meaning
  hypernyms: Y is a hypernym of X if every X is a (kind of) Y
  hyponyms: Y is a hyponym of X if every Y is a (kind of) X
  coordinate terms: Y is a coordinate term of X if X and Y share a hypernym
  holonym: Y is a holonym of X if X is a part of Y
  meronym: Y is a meronym of X if Y is a part of X

Verbs
  synonyms
  hypernym: the verb Y is a hypernym of the verb X if the activity X is a (kind of) Y
  coordinate terms: those verbs sharing a common hypernym

Adjectives
  synonyms and related nouns
  antonyms: adjectives of opposite meaning

Adverbs
  synonyms and root adjectives
  antonyms
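The noun relations above can be sketched with a small table. This is only an illustration: the identifiers are hypothetical, each synset is reduced to a single label with a single hypernym, and nothing here reflects how WordNet stores its data.

```python
from collections import defaultdict

# Toy hypernym table (child -> parent). Hypothetical labels, not WordNet's
# own identifiers; real synsets may have more than one hypernym.
HYPERNYM = {
    "dog": "canine",
    "wolf": "canine",
    "canine": "carnivore",
    "cat": "feline",
    "feline": "carnivore",
}

# Hyponymy is the inverse of hypernymy, so it can be derived from the table.
HYPONYMS = defaultdict(set)
for child, parent in HYPERNYM.items():
    HYPONYMS[parent].add(child)


def coordinate_terms(synset):
    """Return the synsets that share a hypernym with `synset`."""
    parent = HYPERNYM.get(synset)
    if parent is None:
        return set()
    return HYPONYMS[parent] - {synset}


print(sorted(HYPONYMS["canine"]))       # hyponyms of "canine"
print(sorted(coordinate_terms("dog")))  # coordinate terms of "dog"
```

Holonym and meronym pairs could be handled the same way, with a second table for the part-of relation and its inverse.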

WordNet also provides the polysemy count of a word: the number of synsets that contain the word. If a word participates in several synsets (i.e. has several senses), then typically some senses are much more common than others. WordNet quantifies this by a frequency score: in several sample texts all words were semantically tagged with the corresponding synset, and then it was counted how often a word appeared in a specific sense.
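As a rough sketch of these two measures, the following counts the synsets containing a word and tallies senses in a tiny tagged sample. The synset identifiers and the sample are invented for illustration and are not taken from WordNet.

```python
from collections import Counter

# Hypothetical synsets: identifier -> member words.
SYNSETS = {
    "bank.n.1": {"bank", "depository"},
    "bank.n.2": {"bank", "riverbank"},
    "bank.v.1": {"bank", "deposit"},
}

# Hypothetical sense-tagged sample text: (word, synset) pairs.
TAGGED = [
    ("bank", "bank.n.1"),
    ("bank", "bank.n.1"),
    ("bank", "bank.n.2"),
    ("deposit", "bank.v.1"),
]


def polysemy_count(word):
    """Number of synsets that contain the word."""
    return sum(1 for members in SYNSETS.values() if word in members)


def sense_frequencies(word):
    """How often the word was tagged with each synset in the sample."""
    return Counter(synset for w, synset in TAGGED if w == word)


print(polysemy_count("bank"))
print(sense_frequencies("bank").most_common())
```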

The database's interface is able to deduce the root form of a word from the user's input; only the root form is stored in the database.
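The text does not say how the root form is deduced. One plausible approach, sketched here purely as an assumption, is to try an exception list for irregular forms and then a few suffix rules, accepting a candidate only if it is a stored root form. The lexicon, exception list and rules below are toy data.

```python
# Toy data, assumed for illustration only.
LEXICON = {"dog", "church", "baby", "goose", "box"}    # stored root forms
EXCEPTIONS = {"geese": "goose"}                        # irregular forms
SUFFIX_RULES = [("ches", "ch"), ("xes", "x"), ("ies", "y"), ("s", "")]


def root_form(word):
    """Return the stored root form for `word`, or None if none is found."""
    word = word.lower()
    if word in LEXICON:
        return word
    if word in EXCEPTIONS:
        return EXCEPTIONS[word]
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix):
            candidate = word[: len(word) - len(suffix)] + replacement
            # Only accept candidates that are actually in the database.
            if candidate in LEXICON:
                return candidate
    return None


for form in ["Dogs", "churches", "babies", "geese", "xyz"]:
    print(form, "->", root_form(form))
```

Checking each candidate against the lexicon keeps the rules from producing non-words, at the cost of returning nothing for unknown input.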

Knowledge Structure
Both nouns and verbs are organized into hierarchies, defined by hypernym or IS A relationships. For instance, sense 1 of the word dog would have the hypernym hierarchy below; the words at the same level are synonyms of each other: some sense of dog is synonymous with some other senses of domestic dog and Canis familiaris, and so on. Each set of synonyms, also called a synset, has a unique index and shares its properties, such as a gloss (or dictionary) definition.

dog, domestic dog, Canis familiaris
  => canine, canid
    => carnivore
      => placental, placental mammal, eutherian, eutherian mammal
        => mammal
          => vertebrate, craniate
            => chordate
              => animal, animate being, beast, brute, creature, fauna
                => ...
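Following such a hierarchy upward amounts to repeatedly looking up the hypernym of the current synset. A minimal sketch, assuming one label and one hypernym per synset (the dictionary keys are illustrative, not WordNet's unique indexes):

```python
# One illustrative label per synset, child -> hypernym.
HYPERNYM = {
    "dog": "canine",
    "canine": "carnivore",
    "carnivore": "placental",
    "placental": "mammal",
    "mammal": "vertebrate",
    "vertebrate": "chordate",
    "chordate": "animal",
}


def hypernym_chain(synset):
    """Return the synset followed by its successive hypernyms."""
    chain = [synset]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain


print(" => ".join(hypernym_chain("dog")))
```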

At the top level, these hierarchies are organized into 25 primitive groups for nouns, and 15 for verbs. These groups form lexicographic files at a maintenance level.

In the case of adjectives, the organization is different. Two opposite 'head' senses work as binary poles, while 'satellite' synonyms connect to each of the heads via synonymy relations. Thus, the hierarchies, and the concept of lexicographic files, do not apply here the same way they do for nouns and verbs.
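The head and satellite arrangement can be pictured as two clusters joined at their heads. The sketch below is an assumption about how one might model it; the word lists are chosen for illustration and the function names are hypothetical.

```python
# Two opposite heads, each with its own satellites (illustrative data).
ANTONYM_HEADS = {"wet": "dry", "dry": "wet"}
SATELLITES = {
    "wet": {"damp", "moist", "soggy"},
    "dry": {"arid", "parched"},
}


def head_of(adjective):
    """Return the head an adjective belongs to, or None if unknown."""
    if adjective in ANTONYM_HEADS:
        return adjective
    for head, satellites in SATELLITES.items():
        if adjective in satellites:
            return head
    return None


def opposite_pole(adjective):
    """Return the opposite head, reached through the adjective's own head."""
    return ANTONYM_HEADS.get(head_of(adjective))


print(opposite_pole("damp"))
print(opposite_pole("arid"))
```

A satellite has no antonym of its own in this model; it reaches the opposite pole only through its head.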

Limitations

Unlike other dictionaries, WordNet does not include information about etymology, pronunciation or the forms of irregular verbs, and contains only limited information about usage.

The actual lexicographic and semantic information is maintained in lexicographer files, which are then processed by a tool called grind to produce the distributed database. Both grind and the lexicographer files are freely available, but modifying and maintaining the database remains difficult.

Related projects

The EuroWordNet project has produced WordNets for several European languages and linked them together; these are not freely available, however. The Global Wordnet project attempts to coordinate the production and linking of wordnets for all languages. Oxford University Press, the publisher of the Oxford English Dictionary, has voiced plans to produce its own online WordNet.

The eXtended WordNet is a project at the University of Texas at Dallas which aims to improve WordNet by semantically parsing the glosses, thus making the information contained in these definitions available for automatic knowledge processing systems. It is also freely available under a license similar to WordNet's.

The GCIDE project produces a dictionary by combining a public domain Webster's Dictionary from 1913 with some WordNet definitions and material provided by volunteers. It is released under the copyleft license GPL.

The hypernym/hyponym relationships among the noun synsets allow WordNet to be used as an ontology in the computer science sense. The SUMO upper ontology has produced a mapping from all of the WordNet synsets for nouns and verbs to SUMO classes. The OpenCyc upper ontology is also linked to WordNet. WordNet was a primary source for constructing the lower classes of the SENSUS ontology.
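Used as an ontology, the hypernym links support simple class reasoning, such as testing whether one concept is a kind of another or finding the most specific class two concepts share. The following is a sketch over a toy table with assumed labels, not a mapping to SUMO, OpenCyc or SENSUS.

```python
# Toy hypernym table (child -> parent), illustrative labels only.
HYPERNYM = {
    "dog": "canine",
    "cat": "feline",
    "canine": "carnivore",
    "feline": "carnivore",
    "carnivore": "mammal",
}


def ancestors(synset):
    """Return the hypernyms of `synset`, nearest first."""
    result = []
    while synset in HYPERNYM:
        synset = HYPERNYM[synset]
        result.append(synset)
    return result


def is_a(specific, general):
    """True if `specific` equals `general` or lies below it."""
    return specific == general or general in ancestors(specific)


def lowest_common_class(a, b):
    """Most specific class that covers both `a` and `b`, or None."""
    classes_of_b = set([b] + ancestors(b))
    for candidate in [a] + ancestors(a):
        if candidate in classes_of_b:
            return candidate
    return None


print(is_a("dog", "mammal"))
print(lowest_common_class("dog", "cat"))
```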

FrameNet is a similar project. It consists of a lexicon which is based on annotating over 100,000 sentences with their semantic properties. The unit in focus is the lexical frame, a type of state or event together with the properties associated with it.

JWNL (Java WordNet Library)
A Java API for accessing the WordNet relational dictionary. Free download, support and background information.

WordNet Perl Module
Perl OO interface to George Miller's WordNet database.

WordNet - A Lexical Database for English
Nouns, verbs, adjectives and adverbs are organized into synonym sets, each representing one underlying lexical concept.

WordNet Python Module
Python interface (by Oliver Steele) to WordNet database created by George Miller.






© 2005 GeneralAnswers.org