Natural language generation (NLG) is the process of producing meaningful phrases, sentences, and paragraphs from an internal representation of what a speaker intends to say. It is a part of natural language processing. Generation proceeds in three phases: identifying the goals of the utterance, planning how those goals may be achieved by evaluating the situation and the available communicative resources, and realizing the plans as text. Information flows from intentions to text; the source is the state of mind of a speaker who has intentions and is acting in a situation. NLG is the inverse of language understanding, yet it is often a component of systems that also incorporate language understanding.
Components of a natural language generator
Speaker and Generator
To generate a text, we need a speaker (an application) and a generator (a program) that renders the application’s intentions into fluent prose relevant to the situation.
Components and Levels of Representation
The process of language generation involves the following interleaved tasks.
1. Content Selection: The information to be included in the utterance must be selected. Depending on how this information is reified into representational units, parts of the units may have to be omitted, other units added in by default, or perspectives taken on the units to reflect the speaker’s attitude toward them.
2. Textual Organization: The information must be textually organized. It must be ordered, both sequentially and in terms of linguistic relations such as modification or subordination. The coherence relationships among the units of the information must be reflected in this organization so that the reasons why the information was included will be apparent to the audience.
3. Linguistic Resources: Linguistic resources must be chosen to support the information’s realization. Ultimately these resources will come down to choices of particular words, idioms, syntactic constructions, productive morphological variations, and so on, but the form they take at the first moment that they are associated with the selected information will vary greatly between approaches. Note that to choose a resource is not ipso facto to deploy it in its final form, a fact that is not always appreciated.
4. Realization: The selected and organized resources must be realized as an actual text or voice output. This stage can itself involve several levels of representation and interleaved processes.
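The four tasks above can be illustrated with a minimal Python sketch. It is a toy built on assumptions, not a description of any particular generator: the Fact record, the importance score, the template table, and all function names are hypothetical, and the tasks run strictly in sequence here even though real generators interleave them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """Hypothetical representational unit held by the speaker."""
    attribute: str
    value: str
    importance: int  # assumed relevance score; higher means more relevant


def select_content(facts, min_importance):
    """Content selection: keep only the units relevant enough to mention."""
    return [f for f in facts if f.importance >= min_importance]


def organize(facts):
    """Textual organization: order the units, most important first."""
    return sorted(facts, key=lambda f: f.importance, reverse=True)


def choose_resources(facts):
    """Linguistic resources: pair each unit with a clause template.

    Choosing a template does not yet produce text; it only commits to a form.
    """
    templates = {
        "temperature": "the temperature is {value}",
        "sky": "the sky is {value}",
    }
    default = "the {attribute} is {value}"
    return [(templates.get(f.attribute, default), f) for f in facts]


def realize(choices):
    """Realization: fill the templates and assemble one sentence."""
    clauses = [t.format(attribute=f.attribute, value=f.value) for t, f in choices]
    if not clauses:
        return ""
    if len(clauses) == 1:
        body = clauses[0]
    else:
        body = ", ".join(clauses[:-1]) + " and " + clauses[-1]
    return body[0].upper() + body[1:] + "."


def generate(facts, min_importance=2):
    """Run the four tasks in order (a simplification of the interleaved process)."""
    return realize(choose_resources(organize(select_content(facts, min_importance))))


facts = [
    Fact("sky", "clear", 2),
    Fact("temperature", "21 degrees", 3),
    Fact("humidity", "40%", 1),
]
print(generate(facts))
```

With the sample facts, the humidity unit is dropped during content selection, the temperature is ordered before the sky, and the result is a single sentence joining the two remaining clauses.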
Application or Speaker
The application or speaker does the thinking and maintains the model of the situation. It initiates the process but does not itself take part in language generation. It establishes the potentially relevant content, stores the history of past transactions, and deploys a representation of what it knows. Together these form the situation: the selected subset of propositions that the speaker holds. The speaker has to make sense of the situation.
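As a rough illustration of the speaker's role, the following Python sketch keeps a store of propositions and a history of past transactions, and derives a situation from them. The Speaker class, the triple format for propositions, and the rule that already-mentioned propositions are excluded are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Speaker:
    """Hypothetical application-side model; it holds content but generates no text."""
    knowledge: set = field(default_factory=set)   # propositions as (subject, predicate, object)
    history: list = field(default_factory=list)   # propositions already communicated

    def situation(self, topic):
        """Select the subset of propositions about the topic not yet mentioned."""
        mentioned = set(self.history)
        return sorted(p for p in self.knowledge if p[0] == topic and p not in mentioned)

    def record(self, proposition):
        """Store a past transaction so it is not selected again."""
        self.history.append(proposition)


speaker = Speaker(knowledge={
    ("train", "departs", "09:15"),
    ("train", "platform", "4"),
    ("bus", "departs", "09:30"),
})
first = speaker.situation("train")
speaker.record(("train", "platform", "4"))
second = speaker.situation("train")
print(first)
print(second)
```

The generator would receive the output of situation as its input; the speaker itself never produces wording.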