Word Sense Disambiguation (WSD) Series-2
Natural Language Problems & Solutions
Word Sense Disambiguation (WSD) has been a trending area of research in Natural Language Processing and Machine Learning. WSD addresses the ambiguity that arises when a word has different meanings in different contexts.
In computational linguistics, word-sense disambiguation is an open problem concerned with identifying which sense of a word is used in a sentence. Solving it benefits other language-processing tasks, such as discourse analysis, search-engine relevance, coherence, and inference.
Go through my first blog, WSD Series-1, for various approaches to solving this problem.
Disambiguation requires two strict inputs: a dictionary that specifies the senses to be disambiguated, and a corpus of language data to be disambiguated (some methods also require a training corpus of language examples). The WSD task has two variants: the “lexical sample” task, which disambiguates occurrences of a small, preselected set of target words, and the “all words” task, which disambiguates every word in a running text.
Lesk Algorithm
Given an ambiguous word and the context in which the word occurs, Lesk returns a Synset with the highest number of overlapping words between the context sentence and different definitions from each Synset.
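The overlap idea can be sketched without any libraries. The snippet below is a minimal, simplified Lesk, not NLTK's implementation: the two-sense inventory `TOY_SENSES`, its glosses, the stopword list, and the function names are all made up for illustration.

```python
import re

# Hypothetical two-sense inventory for "bank"; the glosses are illustrative
# and are not copied from WordNet.
TOY_SENSES = {
    "bank.financial": "an institution that accepts deposits and lends money to customers",
    "bank.river": "sloping land beside a river or other body of water",
}

# Small, assumed stopword list so that function words do not count as overlap.
STOPWORDS = {"a", "an", "the", "of", "to", "and", "or", "that",
             "in", "on", "at", "i", "is", "was"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    return {tok for tok in re.findall(r"[a-z]+", text.lower())
            if tok not in STOPWORDS}


def simplified_lesk(context, senses):
    """Return (sense, overlap) for the gloss sharing the most words with the context."""
    context_words = tokenize(context)
    best_sense, best_overlap = None, -1
    for sense, gloss in senses.items():
        overlap = len(context_words & tokenize(gloss))
        # Strict comparison: on a tie, the sense listed first is kept.
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense, best_overlap


print(simplified_lesk("I sat on the bank of the river and watched the water", TOY_SENSES))
print(simplified_lesk("She went to the bank to deposit money", TOY_SENSES))
```

With these toy glosses, the first sentence matches the river sense through "river" and "water", and the second matches the financial sense through "money" only. There is no stemming here, so "deposit" does not match "deposits"; a fuller version would normalize word forms before counting the overlap.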
WordNet is a lexical database, i.e. a dictionary, for the English language, designed specifically for natural language processing.
A Synset is a simple interface in NLTK for looking up words in WordNet. Synset instances are groupings of synonymous words that express the same concept. Some words have only one Synset and some have several.
Comparing a word's Synsets against the surrounding sentence makes it possible to identify which sense the word carries in that context.
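As a rough sketch of how this looks in NLTK, the helpers below list the Synsets of a word and run NLTK's built-in `lesk` function on a sentence. The helper names `wordnet_senses` and `nltk_lesk_sense` are my own, the whitespace tokenization is a simplifying assumption, and the code assumes NLTK is installed and the WordNet data has been fetched once with `nltk.download("wordnet")`.

```python
def wordnet_senses(word, pos=None):
    """Return (synset name, definition) pairs for every WordNet sense of `word`."""
    # Imported lazily: this needs NLTK plus the downloaded WordNet corpus.
    from nltk.corpus import wordnet as wn
    return [(syn.name(), syn.definition()) for syn in wn.synsets(word, pos=pos)]


def nltk_lesk_sense(sentence, word, pos=None):
    """Run NLTK's Lesk on a whitespace-tokenized sentence.

    Returns a Synset, or None when the word has no Synsets.
    """
    from nltk.wsd import lesk
    return lesk(sentence.lower().split(), word, pos=pos)
```

Calling `wordnet_senses("bank")` shows that the word has several Synsets, and `nltk_lesk_sense("I went to the bank to deposit money", "bank", pos="n")` returns the one Lesk judges closest to the sentence. The exact Synset depends on the WordNet version and on tokenization, so treat the result as a starting point rather than a guaranteed answer.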
Fetch data from here and feel free to contribute more solutions.
Major Applications of WSD
Text Mining and Information Extraction (IE)
In most applications, WSD is necessary for accurate text analysis. For example, WSD helps an intelligence-gathering system flag the correct words, for instance references to illegal drugs rather than medical drugs.
Machine Translation (MT)
Machine translation (MT) is the most obvious application of WSD. In MT, WSD performs lexical choice for words that have distinct translations for different senses; the senses are represented as words in the target language. Most machine translation systems, however, do not use an explicit WSD module.
Lexicography
WSD and lexicography can work together in a loop because modern lexicography is corpus-based. WSD provides lexicography with rough empirical sense groupings as well as statistically significant contextual indicators of sense.
Information Retrieval (IR)
Information retrieval (IR) may be defined as a software program that deals with the organization, storage, retrieval, and evaluation of information from document repositories, particularly textual information. An IR system helps users find the information they need, but it does not explicitly return answers to their questions. WSD can help here by resolving ambiguity in the queries given to the IR system.
Thanks for reading, and leave a few claps if you found this text helpful!
Feel free to contribute…