Tropes, Semantic Text Analysis - Online Reference Manual
The following notes apply to all heterogeneous corpora, i.e. files that gather utterances from many individuals and that have no linear coherence (the text is neither the discourse of a single narrator nor an interaction between several interlocutors that follows a logical sequence or a strict chronological order).
Since the propositional hashing performed by Tropes is based entirely on grammatical rules, you must use an unambiguous punctuation mark (a question mark or an exclamation mark) to force the software to separate the different utterances. For example, you can add an exclamation mark at the end of every answer to an Open question (Market Research), so that each answer is separated from the next one.
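The punctuation step above can be automated before the corpus is loaded into Tropes. The following is a minimal sketch, not part of Tropes itself: the function name `terminate_answers` and the assumption that the answers are already available as a list of strings are hypothetical.

```python
def terminate_answers(answers):
    """Join answers into one corpus string, each ending with '!' or '?'.

    Hypothetical helper: assumes `answers` is a list of strings,
    one per respondent, collected outside Tropes.
    """
    marked = []
    for answer in answers:
        # Remove surrounding whitespace and trailing weak punctuation.
        text = answer.strip().rstrip(" .;,:")
        if not text:
            continue  # skip answers that are empty once cleaned
        # Force an unambiguous end mark unless one is already present.
        if not text.endswith(("!", "?")):
            text += "!"
        marked.append(text)
    # One answer per line keeps the file easy to check by eye.
    return "\n".join(marked)


if __name__ == "__main__":
    sample = ["Too expensive.", "Why not?", "  ", "I like it"]
    print(terminate_answers(sample))
```

Answers that already end with a question or exclamation mark are left unchanged, so a genuine question is not turned into an exclamation.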
If the corpus includes answers to several questions, you will have to use Borders to group the answers together and/or to separate the answers from the questions. You will then analyze each answer separately.
If you have an indicator that lets you organize your corpus according to an external variable (geographic area, type of population, period, etc.), you may use it to split the utterances into several files (each file containing, for example, the utterances corresponding to a single value of the variable), which you can then analyze separately in order to compare them. You can also code these variables inside the texts and then use them as Borders.
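Splitting a corpus by an external variable can be sketched as follows. This is an illustration under stated assumptions: the input is taken to be a list of (variable value, utterance) pairs prepared outside Tropes, and the names `group_by_variable` and `write_groups` are hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def group_by_variable(records):
    """Group utterances by the value of an external variable.

    `records` is assumed to be an iterable of (value, utterance) pairs,
    for example ("north", "Delivery was late!").
    """
    groups = defaultdict(list)
    for value, utterance in records:
        groups[value].append(utterance)
    return dict(groups)


def write_groups(groups, out_dir):
    """Write one plain-text file per variable value and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for value, utterances in groups.items():
        path = out / f"{value}.txt"
        path.write_text("\n".join(utterances) + "\n", encoding="utf-8")
        paths.append(path)
    return paths


if __name__ == "__main__":
    records = [("north", "A!"), ("south", "B!"), ("north", "C!")]
    for p in write_groups(group_by_variable(records), "corpus_by_area"):
        print(p)
```

Each resulting file can then be opened and analyzed separately, and the results compared across values of the variable.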
If the corpus has no linear coherence (for example, when the utterances in each file have been compiled at random, without following any particular logic), the results that depend on the chronological analyses of Tropes (Most characteristic parts of text, Bundles, Episodes, Distribution graphs) will not be significant; do not try to interpret them.
Designed to extract information from substantial press corpora (Text Mining), this method can be used for Business Intelligence purposes, Press Reviews, Historical and/or Sociological Studies, and/or to generate the keywords of a thesaurus. To apply it, you have to use Tropes Zoom.
When analyzing a text file containing the transcript of the discourse of several individuals, start with an overall analysis of the corpus, then use Borders to process the utterances of the various characters separately, and compare the results obtained. When comparing the results, ask yourself the following questions: have all the participants been talking about the same thing? Did they use the same Actants? If not, then why not? Has anyone refused to reply to certain questions? Has anyone been trying to convince another participant? Why? Have they succeeded? Etc.
If you have time, you can resolve the anaphora manually, i.e. replace each personal pronoun with the person it refers to. Suppose, for example, that you wish to analyze the discourse of two characters (Peter and Paul) who have been talking about three other persons (Alan, Mary and Jane), and that the text contains many personal pronouns ("I", "you", "he", "she", etc.). Use the [find/replace] command of your word processor to replace some of these pronouns as follows: "I" and "You" by "Peter" or "Paul", "She" by "Mary" or "Jane", "He" by "Alan", etc. You will thus be able to count very precisely how many times each person has been mentioned, to know whether they are Actants or Acted, etc.
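For long transcripts, part of this find/replace work can be scripted. The sketch below is only an illustration: it assumes the text has already been cut into turns for which the speaker and the listener are known, and the function name `resolve_turn` and its parameters are hypothetical. Ambiguous pronouns (such as "she" when both Mary and Jane are discussed) still have to be resolved by hand.

```python
import re


def resolve_turn(text, speaker, listener, third_person=None):
    """Replace pronouns in one turn with the names they refer to.

    "I" becomes the speaker and "you" the listener. `third_person` is an
    optional mapping such as {"he": "Alan"} for pronouns that are
    unambiguous in this turn; leave ambiguous pronouns out of it.
    """
    mapping = {"i": speaker, "you": listener}
    for pronoun, name in (third_person or {}).items():
        mapping[pronoun.lower()] = name
    # Whole words only; contractions such as "I'm" are skipped so that
    # they are not turned into "Peter'm".
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, mapping)) + r")\b(?!['’])",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: mapping[m.group(1).lower()], text)


if __name__ == "__main__":
    turn = "I think you saw he left, and I'm sure."
    print(resolve_turn(turn, "Peter", "Paul", {"he": "Alan"}))
```

The replacement does not correct verb agreement ("Peter think"); like the word-processor approach, it is only meant to make the mentions of each person countable.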
When you transcribe spoken material, you must include punctuation in the text; otherwise the software will not be able to carry out the propositional hashing properly and the results of the analysis will be distorted.
TropesOntology GEXF exports are specifically designed for discourse analysis of several documents.
To analyze a play, use the above method (analyzing the utterances of multiple actors is almost equivalent to analyzing a conversation between different interlocutors).
When studying long texts, such as an entire book, first analyze each chapter separately; you can then make a synthesis by processing the whole text (see Reflections on the size of the texts above).
For example, you can compare:
Copyright Acetic and Semantic Knowledge, all rights reserved