admin

MARKOV MODELS

MARKOV MODELS: OVERVIEW Markov models extract linguistic knowledge automatically from large corpora and use it for POS tagging. They are an alternative to laborious and time-consuming manual tagging. MARKOV PROPERTY The name Markov model is derived from the term Markov property. The Markov property is an assumption that allows the system to be analyzed. According to the Markov property, […]
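The excerpt above is cut off before it states the property, so the following is only a minimal sketch under a stated assumption: the common first-order form of the Markov property, in which each tag depends only on the tag immediately before it. The toy corpus, the `<s>` start marker, and the function names are hypothetical and exist purely for illustration. The sketch estimates tag-to-tag transition probabilities by counting, then scores a tag sequence as a product of those probabilities.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus of (word, tag) sentences, for illustration only.
TAGGED_CORPUS = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
    [("dogs", "NOUN"), ("bark", "VERB")],
]

START = "<s>"  # assumed marker for the beginning of a sentence


def transition_probabilities(corpus):
    """Estimate P(tag | previous tag) by relative frequency."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        prev = START
        for _word, tag in sentence:
            counts[prev][tag] += 1
            prev = tag
    return {
        prev: {tag: n / sum(following.values()) for tag, n in following.items()}
        for prev, following in counts.items()
    }


def sequence_probability(tags, probs):
    """Score a tag sequence, assuming each tag depends only on the previous one."""
    p = 1.0
    prev = START
    for tag in tags:
        # Transitions never seen in the corpus get probability 0 (no smoothing).
        p *= probs.get(prev, {}).get(tag, 0.0)
        prev = tag
    return p


probs = transition_probabilities(TAGGED_CORPUS)
print(sequence_probability(["DET", "NOUN", "VERB"], probs))
```

A full tagger would also need word-given-tag probabilities and some form of smoothing; both are left out here to keep the transition step visible.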


MAXIMUM ENTROPY

MAXIMUM ENTROPY TAGGING Maximum entropy tagging aims to find a model with maximum entropy. Here, maximum entropy means maximum randomness or minimum additional structure. It exploits some of the good properties of transformation-based learning and Markov model tagging. It allows flexibility in the cues used to disambiguate words. The outputs of maximum entropy tagging are tags […]


RULE BASED POS TAGGING

RULE-BASED PART-OF-SPEECH TAGGING Rule-based part-of-speech tagging is the oldest approach and uses hand-written rules for tagging. Rule-based taggers depend on a dictionary or lexicon to get the possible tags for each word to be tagged. Hand-written rules are used to identify the correct tag when a word has more than one possible tag. Disambiguation is done […]
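The two steps described above (lexicon lookup, then hand-written rules for ambiguous words) can be sketched as follows. This is an illustration only: the lexicon entries, the tag names, the two rules, and the NOUN default for unknown words are all hypothetical choices, not rules taken from the article.

```python
# Hypothetical lexicon: each word maps to every tag it may take.
LEXICON = {
    "i": ["PRON"],
    "the": ["DET"],
    "a": ["DET"],
    "to": ["TO"],
    "can": ["MODAL", "NOUN", "VERB"],
    "book": ["NOUN", "VERB"],
    "flight": ["NOUN"],
}


def disambiguate(candidates, prev_tag):
    """Pick one tag from the candidates using hand-written context rules."""
    if len(candidates) == 1:
        return candidates[0]
    # Rule 1 (assumed): after a determiner, prefer the noun reading.
    if prev_tag == "DET" and "NOUN" in candidates:
        return "NOUN"
    # Rule 2 (assumed): after "to" or a modal, prefer the verb reading.
    if prev_tag in ("TO", "MODAL") and "VERB" in candidates:
        return "VERB"
    # Fallback: the first tag listed in the lexicon.
    return candidates[0]


def tag(words):
    """Tag a list of words left to right, returning (word, tag) pairs."""
    tagged = []
    prev_tag = None
    for word in words:
        # Words missing from the lexicon default to NOUN (an assumption).
        candidates = LEXICON.get(word.lower(), ["NOUN"])
        chosen = disambiguate(candidates, prev_tag)
        tagged.append((word, chosen))
        prev_tag = chosen
    return tagged


print(tag(["I", "can", "book", "the", "flight"]))
```

Here "book" is tagged as a verb because it follows a modal, while the same word after "the" would be tagged as a noun. Real rule-based taggers use far larger lexicons and rule sets.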


BAUM WELCH ALGORITHM

The Baum-Welch algorithm, also known as the forward-backward algorithm, was invented by Leonard E. Baum and Lloyd R. Welch. It is a special case of the Expectation-Maximization (EM) method. The Baum-Welch algorithm is very effective for training a Markov model without using manually annotated corpora. It works by assigning initial probabilities to all the parameters. Then, until the training converges, […]


NLP REFERENCES

BOOKS FOR REFERENCE Ruslan Mitkov, The Oxford Handbook Of Computational Linguistics, Oxford University Press, 2003. Robert Dale, Hermann Moisl, Harold Somers, Handbook Of Natural Language Processing, Marcel Dekker Inc. James Allen, Natural Language Understanding, Pearson Education, 2003. Christopher D. Manning & Hinrich Schütze, Foundations Of Statistical Natural Language Processing, The MIT Press, 2001. Douglas Biber, Susan […]


TOKENIZATION OVERVIEW

This article presents an overview of tokenization and the challenges associated with it. WHAT IS TOKENIZATION? Tokenization is the process of breaking up a given text into units called tokens. Tokens may be words, numbers, or punctuation marks. Tokenization does this by locating word boundaries. The ending point of a word and the beginning of the next […]
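As a rough illustration of the definition above, the sketch below splits text into word, number, and punctuation tokens with a single regular expression. The pattern and the function name are assumptions made for this example. It handles only simple cases and sidesteps the harder challenges the article goes on to discuss.

```python
import re

# Assumed pattern, tried in order at each position:
#   1. a number, optionally with a decimal part (e.g. 3 or 3.5)
#   2. a word, optionally with one internal apostrophe (e.g. doesn't)
#   3. any single character that is neither a word character nor whitespace
TOKEN_PATTERN = re.compile(r"\d+(?:\.\d+)?|\w+(?:'\w+)?|[^\w\s]")


def tokenize(text):
    """Return the list of tokens found in text, in order of appearance."""
    return TOKEN_PATTERN.findall(text)


print(tokenize("Tokenization breaks text into 3 units, doesn't it?"))
```

Whitespace is used only to locate boundaries and never appears in the output. Abbreviations, hyphenated words, and languages without spaces between words would need additional rules.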


APPLICATIONS OF NATURAL LANGUAGE UNDERSTANDING

Natural language understanding is primarily used in text-based applications and dialogue-based applications. TEXT-BASED APPLICATIONS As the name implies, text-based applications deal with the processing of written text such as books, newspapers, manuals, reports, e-mail messages, and so on. Natural language understanding techniques are widely used in finding the required documentation on certain topics from a […]
