User Manual for LinkCrawler 3.0.0
Oracle JRE 7 is required. You can download it from Oracle's website.
How to run the application
On Linux-based systems:
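The manual does not give the exact command, so here is a minimal sketch. It assumes the application jar is named LinkCrawler.jar; use the actual filename of the jar you downloaded.

```shell
# Launch LinkCrawler from the directory that contains the jar file.
# The jar name "LinkCrawler.jar" is an assumption; substitute the real filename.
java -jar LinkCrawler.jar

# If the application cannot create its Logs folder or report files,
# run it with elevated privileges instead:
sudo java -jar LinkCrawler.jar
```

Running from the jar's own directory matters because, as described below, the Logs folder and generated reports are created next to the jar file.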
How to crawl
On the “Crawl Website” tab, enter an absolute URL (including the scheme; both http:// and https:// are accepted), for example:
Note: Use the site's main URL so the entire site is crawled from a central starting point.
Then click Start and the application will begin the crawl job.
How to view and save the log
In version 3.0.0, the log is generated automatically. It is saved in the Logs folder, located in the same directory as the LinkCrawler jar file.
Make sure you are running the application with administrator privileges so it can create folders and files.
How to generate a report
Once a crawl job has finished, click Reports, then “Save in format…”, and choose HTML. The report will be generated in the same folder as the LinkCrawler application. Make sure you are running the application with administrator privileges so it can create folders and files.
How to use the exclusion list
To exclude a single page, type its full URL, for example:
Or, type a partial URL (a fragment) to ignore many pages at once, for example:
In this case, LinkCrawler ignores any URL that starts with “http://mysite.com/calendar/”.
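As a sketch, an exclusion list combining both styles might contain entries like the following. The first entry is a hypothetical page URL; the second is the fragment from the example above.

```
http://mysite.com/private-page.html
http://mysite.com/calendar/
```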
How to verify a Sitemap
To verify that your sitemap is valid for Google, click the XML Sitemap Verification tab, enter the sitemap URL, and click Check Sitemap. You can also copy the site URL used when crawling by clicking the copy button.
Note: If you enter only the main site URL (for example, http://carlosumanzor.com), LinkCrawler will attempt to use sitemap.xml at that location.
If any errors occur, a button will be enabled showing how many errors were found during the check.