Tropes, Semantic Text Analysis - Online Reference Manual

CHAPTER 2

Semantic Scenarios (part III)

Terminology extraction

The terminology extractor automatically identifies most significant expressions and compounds, as well as all nouns not classified in the existing Scenario. This tool is useful both for rapidly enriching the Scenarios of the software (for example, by grouping all acronyms with the expressions they stand for) and for obtaining a more precise classification (for example, by proposing to "hard-wire" terms that cause ambiguity problems and/or might generate "noise" in the list of Relations).

The terminology extractor serves three purposes:

  1. It automatically extracts from the text all compounds (i.e. repeated sequences of terms which contain at least one noun and are linguistically coherent) which might prove interesting for purposes of analysis.
  2. It suggests a list of references to complement the Scenario and/or draws up a list of everything that has not yet been classified.
  3. It speeds up the construction of the Scenario.
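
The definition of a compound given in point 1 (a repeated sequence of terms containing at least one noun) can be illustrated with a minimal Python sketch. This is not how Tropes is implemented: the function name `extract_compounds`, its parameters, and the `"NOUN"` tag are assumptions made for illustration, the input is assumed to be already tagged by part of speech, and the linguistic coherence check mentioned above is left out.

```python
from collections import Counter


def extract_compounds(tagged_tokens, min_len=2, max_len=3, min_count=2):
    """Return repeated word sequences that contain at least one noun.

    tagged_tokens: list of (word, tag) pairs; the tag "NOUN" marks nouns
    (an assumed tag set, not the one used by Tropes).
    """
    counts = Counter()
    for n in range(min_len, max_len + 1):
        for i in range(len(tagged_tokens) - n + 1):
            window = tagged_tokens[i:i + n]
            # Keep only sequences with at least one noun.
            if any(tag == "NOUN" for _, tag in window):
                phrase = " ".join(word.lower() for word, _ in window)
                counts[phrase] += 1
    # A compound must be repeated, so drop sequences seen fewer
    # than min_count times.
    return {phrase: c for phrase, c in counts.items() if c >= min_count}
```

Because no coherence filter is applied, this sketch also returns sequences that a linguist would not call compounds; a real extractor would need an additional grammatical check.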

The terminology extractor is semantically linked to the Scenario: when you select a family of terms, Tropes automatically tries to position the Scenario tool on the group that seems best suited to the selected expression. The software can also automatically search the text for a family of expressions. Both semantic linking functions can be deactivated (cf. the "Localize" options at the bottom right of the dialog box). Each extracted term is preceded by a color code indicating how frequent the term or expression is (dark = frequent, light = infrequent).

The extractor is run from the [Tools] menu of Tropes once you have analyzed a text or (preferably) several texts on which you want to base a classification. The following example displays the results of a terminological analysis of Darwin's The Origin of Species (195,000 occurrences):

[Screenshot: Terminology extraction]

Each term (or expression) extracted is displayed on a separate line, with its membership group. It is preceded by a colored square showing its frequency of occurrence.
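
The dark-to-light color code can be pictured as a simple mapping from frequency to shade. The sketch below is only an assumption about one possible mapping; the manual does not document the scale Tropes actually uses, and the function name `frequency_shade` is hypothetical.

```python
def frequency_shade(count, max_count):
    """Map a frequency to a grey level from 0 (dark) to 255 (light).

    Assumed linear scale: the most frequent term gets the darkest
    square, a term with zero occurrences the lightest.
    """
    if max_count <= 0:
        raise ValueError("max_count must be positive")
    # Clamp the ratio to [0, 1] so out-of-range counts stay valid.
    ratio = min(max(count / max_count, 0.0), 1.0)
    return round(255 * (1.0 - ratio))
```

With this mapping, a term that occurs as often as the most frequent term gets grey level 0, and rarer terms get progressively lighter squares.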

To enrich your classification, use the Scenario tool and select the terms that interest you in the terminology extractor.

In the example below, we have selected several expressions that seem interesting and that the software has grouped in the "action" category.

[Screenshot: Terminology extraction]

We then check the Insert box and click the Scenario button (on the right) to transfer them automatically into the Scenario.

The software will automatically create all the entries and corresponding group labels in the Scenario tool.

All you then have to do is save your Scenario ([Save] command in the [File] menu of the Scenario tool) to see the result in the analyzed text.

When you select a term while the Scenario tool is open, the software automatically tries to position the Scenario on the closest semantic category. To disable this behavior, uncheck the Scenario box in the [Localize] frame on the right of the dialog box.

Lastly, the terminology extractor has an [Options] tab that lets you fine-tune the level of extraction.



Copyright Acetic and Semantic Knowledge, all rights reserved
www.semantic-knowledge.com