Robot, Web Mining Spider - Online Reference Manual

Advanced parameters

To access these options, use the [Advanced] button under Scan settings on the [Settings] tabsheet in the right frame:

Web mining parameters

Among the Advanced scan settings, the Maximum concurrent downloads parameter sets the number of HTML page downloads running simultaneously. A value between 4 and 20 is a good range, depending on your ISP speed and your processor. Please note that a high concurrent downloads value may overload the remote server and/or exhaust your computer's resources.
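The idea of a cap on simultaneous downloads can be illustrated outside the Robot with a short Python sketch. It is not the Robot's own implementation: it simply assumes a thread pool whose size plays the role of Maximum concurrent downloads. The names `fetch_page`, `download_all` and `max_concurrent` are hypothetical and chosen for this example only.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch_page(url, timeout=10):
    """Download one page and return its raw bytes (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def download_all(urls, max_concurrent=4, fetch=fetch_page):
    """Download every URL with at most `max_concurrent` requests in flight.

    Results are returned in the same order as `urls`.
    """
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    # The pool size is the cap: no more than max_concurrent worker
    # threads exist, so no more than that many downloads run at once.
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        return list(pool.map(fetch, urls))
```

Calling `download_all(list_of_urls, max_concurrent=8)` would keep at most eight downloads active at a time. Raising the value speeds up a scan only as long as the connection, the processor and the remote server can keep up.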

Some computers need a proxy server to access the Internet. Please contact your network administrator to set these options.
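For readers who script their own downloads, the same kind of proxy setting can be sketched in Python with the standard `urllib.request` module. This is only an illustration and does not describe the Robot's proxy options: the host, the port and the helper names `proxy_url` and `make_opener` are assumptions made for the example, and the `use_proxy` flag merely mimics a "use proxy" checkbox.

```python
from urllib.request import ProxyHandler, build_opener


def proxy_url(host, port):
    """Build the proxy address string from a host name and a port number."""
    return "http://%s:%d" % (host, port)


def make_opener(use_proxy, host=None, port=None):
    """Return a urllib opener that goes through a proxy only if asked to.

    `use_proxy`, `host` and `port` are hypothetical parameters.
    """
    if not use_proxy:
        # An empty mapping tells urllib not to use any proxy,
        # including proxies detected from environment variables.
        return build_opener(ProxyHandler({}))
    if not host or not port:
        raise ValueError("host and port are required when use_proxy is True")
    address = proxy_url(host, port)
    # Route both plain and secure requests through the same proxy.
    return build_opener(ProxyHandler({"http": address, "https": address}))
```

An opener built with `make_opener(True, "proxy.example.com", 8080)` would send its requests through that (made-up) proxy when its `open()` method is called.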


If necessary, use the [Proxy] tabsheet to set the proxy options. If the Use proxy parameter is checked, you must fill in the proxy options on that tabsheet.

With the [Policy] tabsheet parameters, you can choose whether or not to respect the "Robot exclusion standard".


Important:

1 - The "Robot exclusion standard" is set by webmasters to limit the directories and files that a search engine is allowed to "harvest" (copy and index). See http://www.robotstxt.org for more information.

2 - If you choose to crawl every Web page, you do so under your own responsibility.
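How a crawler honours the "Robot exclusion standard" can be sketched with Python's standard `urllib.robotparser` module. This is a minimal illustration under stated assumptions, not the Robot's own logic: the rules, the example URLs and the helper names `build_parser` and `may_crawl` are invented for the example, and the `respect_robots` flag stands in for a policy switch.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt):
    """Parse the text of a robots.txt file into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_crawl(parser, url, user_agent="*", respect_robots=True):
    """Decide whether `url` may be downloaded.

    With respect_robots=False the rules are ignored and every page is
    allowed, which is the "crawl everything" choice made at your own
    responsibility.
    """
    if not respect_robots:
        return True
    return parser.can_fetch(user_agent, url)


# Example rules a webmaster might publish (invented for this sketch).
RULES = "User-agent: *\nDisallow: /private/\n"
```

With these rules, `may_crawl(build_parser(RULES), "http://example.com/private/a.html")` is refused, while a page outside `/private/` is allowed. To use a site's real rules, `RobotFileParser` also offers `set_url()` and `read()` to download its robots.txt file.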

Read the next part:
Start the Web Mining: Download sites




Copyright Acetic and Semantic Knowledge, all rights reserved
www.semantic-knowledge.com