Natural language understanding is primarily used in text-based applications and dialogue-based applications.
Text-based Applications
As the name implies, text-based applications process written text such as books, newspapers, manuals, reports, and e-mail messages. Natural language understanding techniques are widely used for finding documents on a given topic in a database of texts, extracting information from messages, articles, or documents, machine translation, and summarizing texts for particular purposes. A customer, for example, may want to find news on gold prices over the past year, or a student may want abstracts of research papers. Language understanding systems can be employed for these tasks.
Alternative techniques exist for these applications, but they rely on blind approaches such as keyword matching, which limits their effectiveness. Handling complex retrieval tasks, for example, requires computing a representation of the information, which only natural language understanding techniques can achieve. The resulting representation can also be used later for inference.
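The limits of blind keyword matching can be seen in a minimal sketch. The query, the three sample documents, and the function name keyword_match are all invented for this illustration; real retrieval systems are considerably more elaborate.

```python
def keyword_match(query, documents):
    """Return the documents that contain every query word (case-insensitive)."""
    terms = set(query.lower().split())
    return [doc for doc in documents if terms <= set(doc.lower().split())]


# Hypothetical mini-collection, made up for this example.
documents = [
    "gold prices rose sharply this year",
    "the cost of bullion climbed over twelve months",
    "the gold medal winner discussed ticket prices",
]

hits = keyword_match("gold prices", documents)
for doc in hits:
    print(doc)
```

The second document is about the price of gold but shares no word with the query, so it is missed. The third has nothing to do with gold prices but contains both words, so it is returned. Matching on surface words alone cannot tell these cases apart.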
Dialogue-based Applications
Dialogue-based applications deal with spoken language. Interacting with computers through a keyboard, as in chatting, also falls under this category. Examples include question-answering systems, automated customer service over the telephone, tutoring systems, voice control of machines, and interactive problem-solving systems. Typical tasks include getting information from a database by sending queries and making bank transactions.
Dialogue systems face many challenges, including errors in the speech signal and differences between speakers. Not all speech recognition systems need much language understanding. For example, a voice-controlled television only has to recognize the words uttered and use them as commands to perform functions such as changing channels, raising or lowering the volume, adjusting the contrast, and switching the power on and off.
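A voice-controlled television of this kind can be approximated by a plain lookup table, with no language understanding involved. The sketch below assumes the speech recognizer has already produced a list of words; the command phrases and action names are hypothetical.

```python
# Hypothetical command table: recognized word sequence -> action name.
TV_COMMANDS = {
    ("channel", "up"): "next_channel",
    ("channel", "down"): "previous_channel",
    ("volume", "up"): "increase_volume",
    ("volume", "down"): "decrease_volume",
    ("power", "on"): "power_on",
    ("power", "off"): "power_off",
}


def interpret(recognized_words, commands=TV_COMMANDS):
    """Look up the recognized words; return None if they are not a known command."""
    key = tuple(word.lower() for word in recognized_words)
    return commands.get(key)


print(interpret(["Volume", "Up"]))
print(interpret(["what", "is", "on", "tonight"]))
```

Anything outside the fixed table, such as a question about tonight's programme, simply yields None. That is the point at which a system would need real language understanding.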
A natural language understanding system must have knowledge about what words mean, how words combine to form sentences, how word meanings combine to form sentence meanings, and so on. The different forms of knowledge required for natural language understanding are given below.
Phonetic and phonological knowledge
Phonetics is the study of language at the level of sounds while phonology is the study of combination of sounds into organized units of speech, the formation of syllables and larger units. Phonetic and phonological knowledge are essential for speech based systems as they deal with how words are related to the sounds that realize them.
Morphology concerns word formation. It is the study of how words are formed by combining sounds into minimal distinctive units of meaning called morphemes. Morphological knowledge concerns how words are constructed from morphemes.
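As a rough illustration of building words from morphemes, the sketch below splits a word by stripping a few known affixes. The affix lists, the minimum stem length, and the function name segment are assumptions made for this example. It is a naive heuristic, not a real morphological analyzer, and it ignores spelling changes such as "happy" becoming "happi" in "happiness".

```python
PREFIXES = ("un", "re")                 # tiny illustrative prefix list
SUFFIXES = ("ness", "ing", "ed", "s")   # tiny illustrative suffix list
MIN_STEM = 3                            # never shrink the stem below this length


def segment(word):
    """Split a word into [prefix, stem, suffixes...] by naive affix stripping."""
    stem = word.lower()
    prefixes, suffixes = [], []

    # Strip at most one prefix.
    for prefix in PREFIXES:
        if stem.startswith(prefix) and len(stem) - len(prefix) >= MIN_STEM:
            prefixes.append(prefix)
            stem = stem[len(prefix):]
            break

    # Strip suffixes repeatedly, outermost first.
    stripped = True
    while stripped:
        stripped = False
        for suffix in SUFFIXES:
            if stem.endswith(suffix) and len(stem) - len(suffix) >= MIN_STEM:
                suffixes.insert(0, suffix)
                stem = stem[:-len(suffix)]
                stripped = True
                break

    return prefixes + [stem] + suffixes


for word in ("unkindness", "replayed", "cats"):
    print(word, segment(word))
```

A heuristic like this over-segments words that merely look affixed ("united" would lose its "un"). This is why practical systems rely on a lexicon of stems in addition to affix rules.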
Syntax is the level at which we study how words combine to form phrases, phrases combine to form clauses and clauses join to make sentences. Syntactic analysis concerns sentence formation. It deals with how words can be put together to form correct sentences. It also determines what structural role each word plays in the sentence and what phrases are subparts of what other phrases.
Semantics concerns the meanings of words and sentences. It is the study of context-independent meaning, that is, the meaning a sentence has regardless of the context in which it is used. Defining the meaning of a sentence is very difficult because of the ambiguities involved.
Pragmatics extends semantics: it deals with the contextual aspects of meaning in particular situations. It concerns how sentences are used in different situations and how that use affects the interpretation of the sentence.
Discourse concerns connected sentences. It is the study of chunks of language that are bigger than a single sentence. Discourse knowledge concerns inter-sentential links, that is, how the immediately preceding sentences affect the interpretation of the next sentence. Discourse knowledge is important for interpreting pronouns and the temporal aspects of the information conveyed.
World knowledge is the everyday knowledge that all speakers share about the world. It includes general knowledge about the structure of the world and what each language user must know about the other user's beliefs and goals. This knowledge is essential for better language understanding.
Natural language understanding is concerned with the process of comprehending and using language once the words are recognized. The objective is to specify a computational model that matches human performance in linguistic tasks such as reading, writing, hearing, and speaking. Developing a natural language understanding model requires knowledge from many disciplines, including linguistics, psycholinguistics, philosophy, and computational linguistics. It is necessary to understand how language works, to combine these approaches into complex theories, and to realize those theories as computer programs. Testing the programs shows which cases fail so that the programs can be improved. Repeating this process brings us closer to understanding how human language processing occurs.
Representing and Understanding
Computing a representation of the meaning of a text is the most important component. For this purpose it is necessary to define the notion of representation; otherwise ambiguity becomes a great impediment. A language more precise than natural language is required to represent meaning. The representation must be precise and unambiguous, and it should capture the intuitive structure of natural language sentences. Making judgments on grammaticality is not a goal in language understanding: a robust system should be able to understand even a sentence that contains some mistakes.
Syntactic structure indicates how the words in a sentence are related to each other. Syntactic representations of languages are usually based on the notion of context-free grammars. A syntactic representation describes sentence structure in terms of which phrases are subparts of other phrases, usually in the form of a tree. These structures give details on the structure of the phrases and the part of speech of each word.
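To make the idea of a tree-shaped syntactic representation concrete, here is a minimal sketch of a top-down parser for a toy context-free grammar. The grammar rules, the lexicon, and the function names are assumptions chosen for illustration. The grammar has no left-recursive rules, which this simple strategy cannot handle.

```python
# Toy context-free grammar: non-terminal -> list of right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"], ["V"]],
}

# Toy lexicon: word -> part of speech.
LEXICON = {
    "the": "Det", "a": "Det",
    "dog": "N", "cat": "N",
    "chased": "V", "slept": "V",
}


def parse(symbol, words, start):
    """Yield (tree, next_position) for each way `symbol` can match from `start`."""
    if symbol not in GRAMMAR:  # a part of speech: match a single word
        if start < len(words) and LEXICON.get(words[start]) == symbol:
            yield (symbol, words[start]), start + 1
        return
    for rhs in GRAMMAR[symbol]:
        yield from expand(symbol, rhs, words, start, [])


def expand(symbol, rhs, words, pos, children):
    """Match the remaining right-hand-side symbols left to right."""
    if not rhs:
        yield (symbol, *children), pos
        return
    for child, next_pos in parse(rhs[0], words, pos):
        yield from expand(symbol, rhs[1:], words, next_pos, children + [child])


def parse_sentence(sentence):
    """Return the first parse tree covering the whole sentence, or None."""
    words = sentence.lower().split()
    for tree, end in parse("S", words, 0):
        if end == len(words):
            return tree
    return None


print(parse_sentence("the dog chased a cat"))
```

The result is a nested tuple whose nesting mirrors which phrases are subparts of which other phrases, with a part of speech attached to every word. A word sequence the grammar cannot derive, such as "dog the chased", yields None. A robust understanding system would need to go beyond this strict behaviour.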
Logical form refers to the representation of the context-independent meaning of a sentence. The logical form encodes the possible word senses and identifies the semantic relationships between words and phrases. An abstract set of semantic relationships between the verb and its noun phrases is used to capture these relationships. Once the semantic relationships are determined, some word senses may become impossible and can be eliminated from consideration. One of the key tasks in semantic interpretation is to determine which combinations of individual word meanings can form coherent sentence meanings. Exploiting such interconnections between word meanings can greatly reduce the number of possible senses for each word in a given sentence.
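The way semantic relationships eliminate word senses can be sketched with a small table of restrictions. Everything here is hypothetical: the sense labels, the semantic types, the role names "agent" and "theme", and the single verb entry are invented to illustrate the idea, not taken from any particular system.

```python
# Hypothetical sense inventory: noun -> {sense label: semantic type}.
NOUN_SENSES = {
    "boy": {"boy/person": "animate"},
    "ball": {"ball/sphere": "physical_object", "ball/dance": "event"},
}

# Hypothetical restrictions: verb -> {role: allowed semantic types}.
VERB_RESTRICTIONS = {
    "kick": {"agent": {"animate"}, "theme": {"physical_object"}},
}


def filter_senses(verb, role_fillers):
    """Keep only the noun senses whose type is allowed for the role they fill."""
    restrictions = VERB_RESTRICTIONS[verb]
    remaining = {}
    for role, noun in role_fillers.items():
        allowed = restrictions[role]
        remaining[noun] = [
            sense for sense, sem_type in NOUN_SENSES[noun].items()
            if sem_type in allowed
        ]
    return remaining


# "The boy kicked the ball": boy fills the agent role, ball the theme role.
print(filter_senses("kick", {"agent": "boy", "theme": "ball"}))
```

Because the theme of "kick" must be a physical object in this toy table, the "dance" sense of "ball" is discarded and only one reading of the sentence remains.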
The Final Meaning Representation
A natural language understanding system uses a general knowledge representation to represent and reason about its application domain. The final representation is the language in which all the knowledge relevant to the application is represented. The goal of contextual interpretation is to take a representation of the structure of a sentence and its logical form, and to map these into an expression in the knowledge representation that allows the system to perform the appropriate task in the domain. One such final representation language is first-order predicate calculus.
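As a small illustration of what a first-order representation allows, the sketch below checks the formula "for all x, Dog(x) implies there exists y such that Cat(y) and Chased(x, y)", one reading of "every dog chased a cat", against a finite set of facts. The domain, the facts, and the predicate names are invented for this example. A real system would use a general-purpose reasoner rather than a hand-written check.

```python
# Hypothetical finite domain of individuals.
DOMAIN = {"rex", "fido", "tom"}

# Hypothetical knowledge base of ground facts: (predicate, argument, ...).
FACTS = {
    ("Dog", "rex"), ("Dog", "fido"), ("Cat", "tom"),
    ("Chased", "rex", "tom"), ("Chased", "fido", "tom"),
}


def holds(facts, *atom):
    """True if the ground atom, e.g. ("Dog", "rex"), is among the facts."""
    return atom in facts


def every_dog_chased_a_cat(facts, domain):
    """Evaluate: forall x. Dog(x) -> exists y. (Cat(y) and Chased(x, y))."""
    return all(
        (not holds(facts, "Dog", x))
        or any(holds(facts, "Cat", y) and holds(facts, "Chased", x, y) for y in domain)
        for x in domain
    )


print(every_dog_chased_a_cat(FACTS, DOMAIN))
```

Removing the fact that fido chased tom makes the formula false. This is the kind of inference a keyword-level representation cannot support.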