|
Background.
|
ViewScore VSBot is ViewScore's web-indexing robot. The VSBot crawler collects documents from the Web to build a searchable index for search services that use the ViewScore Search Engine. These documents are discovered and crawled because other web pages contain links pointing to them.
As part of the crawling effort, the VSBot crawler honors the robots.txt standard so that we do not crawl or index pages whose content you do not want included in ViewScore Search Technology. If robots.txt disallows crawling a page, ViewScore will not read or use the contents of that page.
|
Site Submission Policy.
|
|
We add new sites to our index and update existing ones each time we crawl the web. Please note: you do not need to submit individual websites. ViewScore updates its index on a regular basis, so submitting updated or outdated links is not necessary. Dead links will 'fade out' of our index on the next crawl, when we update the entire index.
|
|
The VSBot Crawler.
|
|
VSBot is the user-agent for ViewScore.com's web crawler.
|
|
Q: How do I prevent my site or certain subdirectories from being crawled?
|
A: ViewScore VSBot obeys the Robots Exclusion Standard. Specifically, ViewScore VSBot adheres to the 1994 Robots Exclusion Standard (RES).
ViewScore VSBot will obey the first entry in the robots.txt file with a User-agent containing "VSBot". If there is no such record, it will obey the first entry with a User-agent of "*".
Disallowed documents, including "/" (the home page of the site), are not indexed, nor are links in those documents followed. ViewScore VSBot does read the home page at each site and uses it internally, but if the home page is disallowed it is neither indexed nor followed. If robots.txt disallows crawling a page, VSBot will not read or use the contents of that page. The URL of a disallowed page may still be included in ViewScore Search Technology as a "thin" document with no text content. Links and reference text on other public web pages may provide identifying information about a URL, and that information may be indexed as part of web search coverage.
Example robots.txt:
|
User-agent: VSBot
Disallow: /cgi-bin/
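|
The record-selection rule described above (first entry whose User-agent contains "VSBot", otherwise the first entry for "*") can be illustrated with a short Python sketch. This is not ViewScore's implementation; the function names are hypothetical, and the case-insensitive match and simple path-prefix comparison are assumptions based on the 1994 standard.

```python
def parse_records(robots_txt):
    """Split robots.txt text into records of (user_agents, disallow_prefixes)."""
    records, agents, rules = [], [], []
    for raw in robots_txt.splitlines() + [""]:
        stripped = raw.strip()
        if stripped.startswith("#"):
            # Comment-only lines are discarded and do not end a record.
            continue
        line = stripped.split("#", 1)[0].strip()
        if not line:
            # A blank line closes the current record.
            if agents:
                records.append((agents, rules))
            agents, rules = [], []
            continue
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            agents.append(value)
        elif field == "disallow" and value:
            # An empty Disallow value means "nothing is disallowed".
            rules.append(value)
    return records


def select_rules(records, token="vsbot"):
    """Pick the first record naming the crawler, else the first "*" record."""
    for agents, rules in records:
        if any(token in agent.lower() for agent in agents):  # assumed case-insensitive
            return rules
    for agents, rules in records:
        if "*" in agents:
            return rules
    return []  # no applicable record: nothing is disallowed


def is_allowed(robots_txt, path):
    """Return True if no Disallow prefix of the selected record matches path."""
    rules = select_rules(parse_records(robots_txt))
    return not any(path.startswith(prefix) for prefix in rules)


robots_txt = """User-agent: *
Disallow: /

User-agent: VSBot
Disallow: /cgi-bin/
"""
print(is_allowed(robots_txt, "/cgi-bin/search"))  # blocked by the VSBot record
print(is_allowed(robots_txt, "/index.html"))      # the "*" record is not consulted
```

In this sketch the "*" record is ignored once a VSBot record exists, so a site can block all other crawlers while still opening most of its pages to VSBot.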
|
|
Q: How can I reduce the number of requests you make on my web site?
|
A: Since we crawl billions of pages from the entire Web, we use a large number of systems for web crawling. Your web server may therefore log requests from a number of different VSBot crawler client IP addresses. The different crawler systems are coordinated to limit the activity on any single web server. We identify a single "web server" by its IP address, so if your host serves content from multiple IP addresses it may see higher levels of activity.
If there are directories on your web server that you do not want represented in web search results, use robot exclusion rules as described under "How do I prevent my site or certain subdirectories from being crawled?". An exclusion rule can reduce the number of pages VSBot reads from your server.
There is also a ViewScore VSBot-specific extension to robots.txt that allows you to lower our crawler's request rate.
You can add a "Crawl-delay: xx" instruction, where "xx" is the delay between successive crawler accesses. If the crawl rate is a problem for your server, you can raise the delay to 5, 10, or whatever value is comfortable for your server.
Setting a crawl-delay of 10 for ViewScore VSBot would look something like:
User-agent: VSBot
Crawl-delay: 10
In general, you should reduce total crawler activity on your server by disallowing unimportant content with a robots.txt rule. Setting a crawl-delay may limit the coverage and freshness of your content in ViewScore search results. If you do feel that a crawl-delay is necessary, use a small value so that VSBot can still discover and refresh your key content.
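|
To see why a large crawl-delay can limit coverage, the following Python sketch reads the Crawl-delay value with the standard library's urllib.robotparser and estimates how many requests fit into one day. It is only an illustration: the helper names are hypothetical, and treating the value as seconds and as a strict per-server limit are assumptions, since the text above does not state the unit.

```python
from urllib.robotparser import RobotFileParser


def crawl_delay_for(robots_txt, agent="VSBot"):
    """Return the Crawl-delay that applies to agent, or None if none is set."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(agent)


def max_fetches_per_day(delay_seconds):
    """Upper bound on daily requests, ASSUMING the delay is given in seconds."""
    if not delay_seconds:
        return None  # no delay configured, so no cap is implied
    return 86400 // delay_seconds


robots_txt = """User-agent: VSBot
Crawl-delay: 10
"""
delay = crawl_delay_for(robots_txt)
print(delay, max_fetches_per_day(delay))
```

Under these assumptions, a delay of 10 would allow at most 8640 requests per day, so a site with many more pages than that could not be fully refreshed within a day.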
|
Q: Why do you open external content within a ViewScore frame?
|
|
A: When ViewScore presents reviews or other content from external sources, we do so within a frame. This gives ViewScore users the opportunity to vote on the helpfulness of the content, and so to take part in the ViewScore process of determining which content serves users better. If you would like ViewScore to remove the frames from your content, send an email to support@viewscore.com and we will remove them. However, ViewScore users will then be unable to rank your content by how helpful it is, which could cause its ranking to drop and generate less traffic for your website.
|