Tropes, Semantic Text Analysis - Online Reference Manual

CHAPTER 4

Introduction to text analysis (Part II)


Statistical, probabilistic and cognitive analyses

Tropes carries out several kinds of text analysis: statistical, probabilistic and cognitive.

Among other things, statistics are used to build the graphs and to lay out the results.

The Frequent word categories and the Text Style are obtained by comparing how often each category occurs in the text with linguistic production norms. These norms were derived from the study of a large number of different texts. They are stored in built-in tables.
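The comparison with norms can be pictured with a short sketch. It is only an illustration of the idea, not the method Tropes uses: the norm values, the category names, the `ratio` threshold and the function `frequent_categories` are all hypothetical, and the statistical test Tropes actually applies is not described in this manual.

```python
from collections import Counter

# Hypothetical norm table: expected share of each word category in a text.
# These figures are invented for illustration; they are NOT Tropes' built-in norms.
NORMS = {"verb": 0.20, "adjective": 0.10, "pronoun": 0.10, "connector": 0.05}


def frequent_categories(tagged_words, norms=NORMS, ratio=1.5):
    """Return the categories whose observed share is at least `ratio` times the norm.

    `tagged_words` is a list of (word, category) pairs. The result maps each
    over-represented category to observed share / expected share.
    """
    counts = Counter(category for _, category in tagged_words)
    total = sum(counts.values())
    result = {}
    for category, expected in norms.items():
        observed = counts[category] / total if total else 0.0
        if observed >= ratio * expected:
            result[category] = round(observed / expected, 2)
    return result


sample = [
    ("I", "pronoun"), ("think", "verb"), ("we", "pronoun"), ("should", "verb"),
    ("go", "verb"), ("but", "connector"), ("you", "pronoun"), ("big", "adjective"),
    ("he", "pronoun"), ("runs", "verb"),
]
print(frequent_categories(sample))
```

A simple ratio is used here because it is easy to read; a real comparison would also need to take the length of the text into account.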


Equivalent classes and Relations between equivalents

The Equivalent classes group together closely related References (common nouns, proper nouns, trademarks) appearing frequently throughout the text. For example, “father” and “mother” are grouped together into the “family” class.

The Reference fields group together the words that make up the Equivalent classes, so that the software can build a representation of the context. To achieve this, the Semantic equivalents dictionary of Tropes is organized into three classification levels. At the lowest level are the References, which are merged into the broader Reference fields 2, which are in turn merged into Reference fields 1.

In the example below, the term “Lord Chancellor” belongs to the “minister” Reference, which is included in the “government” field 2, which is itself part of the “politics” field 1. The “politics” field 1 also covers other broad concepts, such as “political system”, “foreign policy”, etc.

Fields 1 | Fields 2         | References         | Words
Politics | Political system | Communism          | Communism
Politics | Political system | Communism          | Marxism
Politics | Political system | Democracy          | Democracy
Politics | Political system | Democracy          | Republic
Politics | Government       | Federal government | Federal government
Politics | Government       | Head of Government | Head of Government
Politics | Government       | Head of Government | Prime Minister
Politics | Government       | Minister           | Lord Chancellor
Politics | Government       | Minister           | Minister
Politics | Government       | Minister           | Secretary of State
Politics | Government       | Government         | Government
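
The three classification levels can be sketched as a simple lookup table. The entries below are copied from the example table; the names `DICTIONARY`, `classify` and `count_by_level` are hypothetical and say nothing about how the Tropes dictionary is actually stored.

```python
from collections import Counter

# word -> (field 1, field 2, Reference), taken from the example table.
# The storage format is an assumption made for this sketch.
DICTIONARY = {
    "communism": ("Politics", "Political system", "Communism"),
    "marxism": ("Politics", "Political system", "Communism"),
    "democracy": ("Politics", "Political system", "Democracy"),
    "republic": ("Politics", "Political system", "Democracy"),
    "federal government": ("Politics", "Government", "Federal government"),
    "head of government": ("Politics", "Government", "Head of Government"),
    "prime minister": ("Politics", "Government", "Head of Government"),
    "lord chancellor": ("Politics", "Government", "Minister"),
    "minister": ("Politics", "Government", "Minister"),
    "secretary of state": ("Politics", "Government", "Minister"),
    "government": ("Politics", "Government", "Government"),
}


def classify(word):
    """Return (field 1, field 2, Reference) for a word, or None if it is unknown."""
    return DICTIONARY.get(word.lower())


def count_by_level(words, level):
    """Count words at one level: 0 = field 1, 1 = field 2, 2 = Reference."""
    counts = Counter()
    for word in words:
        entry = classify(word)
        if entry is not None:
            counts[entry[level]] += 1
    return counts


print(classify("Lord Chancellor"))
print(count_by_level(["Marxism", "Republic", "Prime Minister"], level=1))
```

Counting at level 2 gives the finest view (References), while levels 1 and 0 show progressively broader groupings of the same words.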

Reflections on the size of the texts

Because most of the analyses performed by Tropes are statistical, the software is sensitive to the amount of text to be processed.

If your corpus consists of files containing very short texts (under one page), the results will probably not be significant, and the software may have difficulty resolving semantic ambiguities (there will not be enough words to clarify the implicit concepts). If you are working with a moderate number of short texts, we recommend that you put them together in a single file. If you are working with a large number of short texts, spread them over several files, grouping them in chronological order, for instance.
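
Putting short texts together can be done with any text editor, or with a few lines of script. The sketch below is one possible approach, assuming each text comes with an ISO date string (YYYY-MM-DD); the function name `merge_short_texts` and the blank-line separator are choices made for this example, not requirements of Tropes.

```python
def merge_short_texts(dated_texts):
    """Join short texts into one string, oldest first.

    `dated_texts` is an iterable of (iso_date, text) pairs. ISO dates
    (YYYY-MM-DD) sort chronologically when compared as plain strings.
    """
    ordered = sorted(dated_texts, key=lambda pair: pair[0])
    return "\n\n".join(text.strip() for _, text in ordered)


notes = [
    ("2004-03-12", "Second note.\n"),
    ("2004-01-05", "First note."),
    ("2004-07-30", "  Third note."),
]
merged = merge_short_texts(notes)
print(merged)
```

The merged string can then be saved as a single file and opened in Tropes.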

Conversely, if your corpus consists of files containing very long texts (over 100 pages) and/or utterances referring to completely different contexts, the relevance of the results may be affected: the discourses get mixed up. When working on large corpora, try to divide them, as far as possible, into files that do not exceed a few hundred pages. You can use a Scenario to merge the results obtained by the analysis of several files (see Zoom Reference manual).
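
Dividing a large corpus can be sketched as follows. This is a minimal illustration under stated assumptions: the page size of 500 words and the limit of 300 pages are hypothetical values standing in for "a few hundred pages", and `split_corpus` is not a Tropes function. The sketch only limits size; keeping texts from different contexts in separate files remains a decision for the analyst.

```python
WORDS_PER_PAGE = 500  # assumption: rough page size, not a Tropes constant
MAX_PAGES = 300       # assumption: stands in for "a few hundred pages"


def split_corpus(paragraphs, max_words=WORDS_PER_PAGE * MAX_PAGES):
    """Group paragraphs, in order, into parts of at most `max_words` words.

    Paragraphs are never cut, so a single paragraph longer than the
    budget ends up alone in an oversized part.
    """
    parts, current, size = [], [], 0
    for paragraph in paragraphs:
        n_words = len(paragraph.split())
        # Start a new part when adding this paragraph would exceed the budget.
        if current and size + n_words > max_words:
            parts.append(current)
            current, size = [], 0
        current.append(paragraph)
        size += n_words
    if current:
        parts.append(current)
    return parts


parts = split_corpus(["a b c", "d e", "f g h i"], max_words=5)
print(len(parts))
```

Each part can then be written to its own file and analyzed separately.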

A few “symptoms” indicate that the corpus you are analyzing is too large:

- It takes more than one minute for Tropes to process the text.

- The Equivalent classes are hardly usable: they often contain words of different meanings.

- Tropes has not detected any significant Actant, and this fact cannot be explained by the intensive use of personal pronouns.

Important note: because of internal technical limits of the software, some Reference classes may not be displayed if the text is very long.



Copyright Acetic and Semantic Knowledge, all rights reserved
www.semantic-knowledge.com