Robot, Web Mining Spider - Online Reference Manual
  info@semantic-knowledge.com 

Choosing a Starting URL

The Robot, Zoom's integrated Web spider, is designed to collect Web sites and extract text from the Internet. It follows the links from a starting Web page to other pages until the process is finished (see parameters).
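The link-following behaviour described above can be illustrated with a short breadth-first traversal. This is only a sketch of the general technique, not the Robot's actual implementation: the names `crawl`, `LinkExtractor`, `fetch`, and `max_pages` are hypothetical, the page limit stands in for whatever stop conditions the parameters define, and keeping only `http://` links is an assumption based on the protocol restriction stated in this manual.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag found in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def crawl(start_url, fetch, max_pages=100):
    """Visit pages breadth-first, starting from start_url.

    fetch(url) is supplied by the caller and returns the page's HTML
    as a string, or None if the page could not be downloaded.
    max_pages is a hypothetical stop condition for this sketch.
    Returns the list of visited URLs in visiting order.
    """
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # skip pages that failed to download
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.hrefs:
            # Resolve relative links and drop any "#fragment" part.
            absolute, _fragment = urldefrag(urljoin(url, href))
            # Assumption: only plain HTTP links are followed.
            if absolute.startswith("http://") and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited
```

Passing `fetch` as an argument keeps the traversal logic separate from the network code, so the sketch can be tried against a dictionary of sample pages before being pointed at a real site.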

You must first choose a Starting URL from which to collect further information (Web pages or other Web sites), and select an output folder on your computer:

[Figure: Web mining parameters]

Note that any valid HTTP URL is suitable for starting the Web mining: for example, you can copy a URL from your Web browser and paste it into the Starting URL field.

The Robot is a fast multithreaded spider (it can download up to 40 pages simultaneously) designed for Information Retrieval and Semantic Analysis. To provide the best download throughput, the Robot discards images and multimedia files, but creates Internet shortcuts to the collected Web pages (so you can easily retrieve the original content).
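The combination described above (parallel downloads, skipping images and multimedia, and one Internet shortcut per collected page) can be sketched as follows. This is an illustration under stated assumptions, not the Robot's code: the function names, the `page0000` file naming, and the use of Content-Type prefixes to detect media are all hypothetical, and the shortcut is written in the common Windows `.url` format, which the Robot may or may not use.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

MAX_WORKERS = 40  # the simultaneous-download limit quoted in this manual

# Assumption: media is recognised by its Content-Type prefix.
SKIPPED_PREFIXES = ("image/", "audio/", "video/")


def is_text_page(content_type):
    """Return False for image and multimedia content types."""
    return not content_type.lower().startswith(SKIPPED_PREFIXES)


def write_shortcut(folder, name, url):
    """Write a Windows-style Internet shortcut pointing back to url."""
    path = Path(folder) / f"{name}.url"
    path.write_text(f"[InternetShortcut]\nURL={url}\n", encoding="utf-8")
    return path


def download_all(urls, fetch, folder):
    """Download urls in parallel and save text pages plus shortcuts.

    fetch(url) is supplied by the caller and returns a
    (content_type, text) pair. Returns one entry per URL, in input
    order: the shortcut path, or None for skipped media files.
    """
    def handle(item):
        index, url = item
        content_type, text = fetch(url)
        if not is_text_page(content_type):
            return None  # images and multimedia are discarded
        name = f"page{index:04d}"  # hypothetical naming scheme
        (Path(folder) / f"{name}.txt").write_text(text, encoding="utf-8")
        return write_shortcut(folder, name, url)

    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        return list(pool.map(handle, enumerate(urls)))
```

A thread pool suits this workload because each download spends most of its time waiting on the network, so many requests can overlap without heavy CPU use.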

This version captures only HTTP Web sites. It does not support the FTP, Newsgroups, or Gopher Internet protocols.
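A quick way to check whether an address qualifies as a Starting URL under these rules is to inspect its scheme and host. The helper below is a hypothetical sketch, not part of the product; in particular, it treats only the `http` scheme as acceptable, which is an assumption drawn from the wording above (whether `https` addresses are accepted is not stated here).

```python
from urllib.parse import urlsplit


def is_valid_start_url(url):
    """Return True if url looks like a usable HTTP starting address."""
    try:
        parts = urlsplit(url.strip())
    except ValueError:
        return False  # malformed address, e.g. a broken IPv6 host
    # Assumption: only the "http" scheme is accepted; a host is required.
    return parts.scheme == "http" and bool(parts.hostname)
```

For example, `is_valid_start_url("http://www.semantic-knowledge.com/")` returns `True`, while an `ftp://` or `gopher://` address returns `False`.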

Read the next part:
Defining main parameters




Copyright Acetic and Semantic Knowledge, all rights reserved
www.semantic-knowledge.com