Zoom, Semantic Search Engine - Online Reference Manual

Appendixes



Search index and folder organization.

We recommend that you organize the contents of your main folder into sub-folders, for the following reasons:

- Current Windows versions cannot deal correctly with a large number of files within a single folder (considerable fall-off in performance, clogging of file management tools, etc.)

- You can use the names of the sub-folders of a main folder to carry additional information (for instance, by grouping the texts by source, by geographical area, by type of population, etc.) that can be reused when collecting statistics on your files (see the Indexing chapter).

Note that the Robot supplied with Zoom automatically builds a tree structure of sub-folders (in which Web pages are stored) so as to offer you the best possible performance when indexing the folders and consulting results. Structuring data in folders and sub-folders has only advantages if you use Tropes Zoom for desktop search.
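
As an illustration of the recommendation above, here is a minimal sketch that spreads the files of a flat folder into numbered sub-folders. It is not part of Zoom: the function name, the "part" prefix and the limit of 1,000 files per sub-folder are assumptions chosen for the example, and the script is meant to be run on a folder that has not been indexed yet.

```python
import shutil
from pathlib import Path


def spread_into_subfolders(main_folder, max_per_folder=1000, prefix="part"):
    """Move the files found directly in main_folder into numbered sub-folders.

    Hypothetical helper, not a Zoom feature. Returns a dict mapping each
    sub-folder name to the number of files moved into it.
    """
    main = Path(main_folder)
    # Sort for a reproducible layout; existing sub-folders are left untouched.
    files = sorted(p for p in main.iterdir() if p.is_file())
    counts = {}
    for i, path in enumerate(files):
        # Files 0..max-1 go to part_0001, the next batch to part_0002, etc.
        name = f"{prefix}_{i // max_per_folder + 1:04d}"
        target = main / name
        target.mkdir(exist_ok=True)
        shutil.move(str(path), str(target / path.name))
        counts[name] = counts.get(name, 0) + 1
    return counts
```

If you prefer sub-folder names that carry information (source, geographical area, etc.), you can replace the numbered name with any label you derive from your own files.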

Search Index merging and dispersal.

To group documents together inside a single folder or, on the contrary, to disperse your documents into several sub-folders, while keeping the search index coherent and taking advantage of the incremental performance of the Zoom search engine, you must:

  1. copy all the documents (text files, [.URL] Internet shortcuts and [.IDT] indexes) into each destination folder;

  2. rerun the “incremental” indexing to merge the search indexes into a single folder.

Since the semantic information collected in the course of the indexing is stored inside a small file ([.IDT]) associated with each analyzed document, you do not have to make a “global” build of the index to merge or disperse folders, provided that you have chosen to enable incremental build (see Indexing parameters).

These remarks are of interest only if you are processing huge folders (of more than 10,000 documents) and if you are using incremental build, which lets you rebuild a search index very quickly without re-analyzing each file.
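
The copy step of the procedure above could be scripted along the following lines. This is a sketch under stated assumptions, not a tool shipped with Zoom: the function name is hypothetical, and skipping the BASEDOC.* files assumes that the incremental indexing you rerun afterwards rebuilds them. The script refuses to overwrite a file that already exists in the destination.

```python
import shutil
from pathlib import Path


def copy_for_merge(source_folders, destination):
    """Copy documents, .URL shortcuts and .IDT indexes into one folder.

    Hypothetical helper. Returns the names of the copied files.
    """
    dest = Path(destination)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for folder in source_folders:
        for path in sorted(Path(folder).iterdir()):
            # Assumption: the global BASEDOC.* files are not copied, since
            # the procedure only lists documents, shortcuts and .IDT files.
            if not path.is_file() or path.stem.upper() == "BASEDOC":
                continue
            target = dest / path.name
            if target.exists():
                raise FileExistsError(f"{target} already exists; rename it first")
            # copy2 also preserves the file timestamps.
            shutil.copy2(path, target)
            copied.append(path.name)
    return copied
```

Once the copy is finished, rerun the incremental indexing on the destination folder from within Zoom.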

Files specific to the software.

Six files, specific to the software, are generated during the indexing of a folder:

Name of the file  Function      Description
BASEDOC.MFT       Global index  Information Retrieval index
BASEDOC.MIT       Global index  Information Retrieval index
BASEDOC.MWL       Global index  Information Retrieval index
BASEDOC.MVI       Version       Information on the search index version
BASEDOC.SCN       Scenario      Scenario used when indexing
Filename.IDT      File index    List of the equivalent classes of a text (files that remain on the hard drive only if you activate Incremental Build)

Do not modify these files: they are managed automatically by the software. However, you can delete them from your folders if you wish to erase all traces of the documentary index.
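
If you want to erase the index from a large tree of folders, a short script can locate the files listed above. The following is a sketch only: the function names are hypothetical, and it assumes that the files may appear in sub-folders and that their names can be matched without regard to case. By default it only reports what it would delete.

```python
from pathlib import Path

# Extensions of the global BASEDOC files listed in the table above.
GLOBAL_INDEX_SUFFIXES = {".MFT", ".MIT", ".MWL", ".MVI", ".SCN"}


def find_index_files(folder):
    """Return the index files found under folder, sub-folders included."""
    found = []
    for path in Path(folder).rglob("*"):
        if not path.is_file():
            continue
        suffix = path.suffix.upper()
        is_global = (path.stem.upper() == "BASEDOC"
                     and suffix in GLOBAL_INDEX_SUFFIXES)
        if is_global or suffix == ".IDT":
            found.append(path)
    return sorted(found)


def erase_index(folder, dry_run=True):
    """Delete the index files; with dry_run=True, only list them."""
    targets = find_index_files(folder)
    if not dry_run:
        for path in targets:
            path.unlink()
    return targets
```

Check the list returned by erase_index(folder) before calling it again with dry_run=False, since any file of yours that happens to use the .IDT extension would also be removed.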

The use of several files is essential for performance reasons, particularly because current Windows versions cannot deal correctly with files of more than 2 GB, while Zoom needs to store search indexes for folders that can contain more than one million documents.

When you are indexing, the software follows these two stages:

  1. it creates an index of the equivalent classes ([.IDT] file) of each processed document,

  2. it groups together indexes in metafiles, used by the Semantic Search Engine.

In the course of “incremental” indexing, stage 1 is carried out only when necessary.

Stage 2 is always carried out. With the incremental option, it merges the existing file indexes, which reduces the indexing time.

Reminder: to carry out incremental build, you must check the [Enable incremental build] box when indexing the base.
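
To make the two stages more concrete, here is a simplified model of how an incremental build could decide which documents to re-analyze. It does not reproduce Zoom's internal logic: the rule used here (a document needs stage 1 when its [.IDT] file is missing or older than the document) is an assumption, as are the function names and the idea that the [.IDT] file shares the document's base name.

```python
from pathlib import Path


def needs_stage_one(document, idt_suffix=".IDT"):
    """Guess whether a document must be analyzed again (stage 1).

    Assumption: analysis is needed when the companion index is missing
    or older than the document. Zoom's real criterion may differ.
    """
    doc = Path(document)
    idt = doc.with_suffix(idt_suffix)
    if not idt.exists():
        return True
    return idt.stat().st_mtime < doc.stat().st_mtime


def plan_incremental_build(folder, doc_suffixes=(".txt",)):
    """Split documents into those to analyze and those whose index is reused."""
    to_analyze, reused = [], []
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file() and path.suffix.lower() in doc_suffixes:
            if needs_stage_one(path):
                to_analyze.append(path)
            else:
                reused.append(path)
    # Stage 2 (grouping the file indexes into metafiles) would then run
    # over every document, whichever list it belongs to.
    return to_analyze, reused
```

In this model, the benefit of the incremental option is proportional to the size of the second list: the larger the share of documents whose index can be reused, the shorter the indexing.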




Copyright Acetic and Semantic Knowledge, all rights reserved
www.semantic-knowledge.com