Zoom, Semantic Search Engine - Online Reference Manual

Collecting statistics when indexing (Text Mining).

Zoom can automatically collect statistics on all the documents in your folder. To do so, use the [Statistics] tabsheet of the indexing dialog.

Two statistical functions are at your disposal:

  1. The [Create a Scenario accumulation file] option collects information on the semantic groups of your Scenario. This option is available only if you have defined a Scenario. The result file takes the name of the chosen Scenario with the [.FCS] extension (for example: MyScenario.fcs).
  2. The [Create a Categories accumulation file] option collects information on the word categories used in your documents. The result file is named Categ.xls. For further details on the word categories, consult the Tropes manual or the software help.

Each result file is a table in which the categories or the semantic classes appear as columns and the analyzed documents appear as rows. You therefore have a statistical record for every file in your search index folders. You can use these results to refine your analyses or to automate further processing outside the software.

You can weight these results by the total number of words in each text by checking the corresponding box. Weighting makes the figures independent of text size, so documents of different sizes are easier to compare.
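The effect of weighting can be illustrated with a minimal sketch. It assumes that weighting is a plain division of each raw count by the total number of words in the text; the manual does not give the exact formula Zoom applies, and the function name and the category names below are hypothetical.

```python
def weight_by_word_count(counts, total_words):
    """Divide each raw count by the total number of words in the text.

    Assumption: a simple ratio. The formula Zoom really uses is not
    documented in this manual.
    """
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return {name: value / total_words for name, value in counts.items()}


# Two hypothetical documents with the same proportions but different sizes.
short_doc = weight_by_word_count({"people": 5, "places": 2}, total_words=100)
long_doc = weight_by_word_count({"people": 50, "places": 20}, total_words=1000)

# The raw counts differ by a factor of ten, yet the weighted values match,
# which is what makes the two documents directly comparable.
print(short_doc == long_doc)
```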

Finally, you can have an application (for example, a statistics program, a spreadsheet or a word processor) start automatically as soon as indexing finishes. Any software that can read tables in the Windows format (also called Tabulated ANSI) will do.
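If you prefer to process a result file with a script rather than a spreadsheet, the sketch below reads a tab-separated table into a dictionary keyed by document name. It rests on several assumptions not confirmed by this manual: that the first line holds the column names, that the first column holds the document name, and that "ANSI" corresponds to the Windows-1252 code page (this depends on the Windows language version). The function name and the sample data are hypothetical.

```python
import csv
import io


def read_accumulation_table(stream):
    """Read a tab-separated table: one header line, then one row per document.

    Assumed layout: the first cell of each row is the document name and the
    remaining cells line up with the column names of the header.
    Values are kept as strings, since the number format is not documented.
    """
    reader = csv.reader(stream, delimiter="\t")
    header = next(reader)
    columns = header[1:]
    table = {}
    for row in reader:
        if not row:  # skip empty lines
            continue
        table[row[0]] = dict(zip(columns, row[1:]))
    return table


# Hypothetical sample standing in for a real result file.
sample = io.StringIO(
    "Document\tGroupA\tGroupB\n"
    "report1.txt\t3\t0\n"
    "report2.txt\t1\t4\n"
)
table = read_accumulation_table(sample)
print(table["report2.txt"]["GroupB"])
```

For a real file, the stream could be opened with `open(path, encoding="cp1252", newline="")`, again assuming the Windows-1252 code page.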

The result files (statistical tables) are created directly in the folder containing the indexing data.
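Since the result files sit next to the indexing data, a script can pick them up by name. The following sketch lists them in a given folder, assuming only the naming rules stated above (a Scenario name with the .FCS extension, and Categ.xls). The function name is hypothetical, and the demonstration uses a temporary folder with empty placeholder files rather than a real index.

```python
import tempfile
from pathlib import Path


def find_result_files(index_folder):
    """Return the statistical result files found directly in index_folder."""
    folder = Path(index_folder)
    return sorted(
        p
        for p in folder.iterdir()
        if p.is_file()
        and (p.suffix.lower() == ".fcs" or p.name.lower() == "categ.xls")
    )


# Demonstration on a temporary folder holding empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("MyScenario.fcs", "Categ.xls", "notes.txt"):
        (Path(tmp) / name).write_text("")
    found = [p.name for p in find_result_files(tmp)]

print(found)
```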

To modify the statistical parameters in depth, consult the Tropes help (Appendixes, Dynamic Extensions section).

For further information on creating a Scenario and on word categories, consult the related chapters in the Tropes documentation.

At the end of this manual, you will find an example of how to use a Scenario accumulation file.




Copyright Acetic and Semantic Knowledge, all rights reserved
www.semantic-knowledge.com