Check an HTML or XHTML document by extracting its anchors and links with the W3C Link Checker.
- The link checker reads an HTML or XHTML document and extracts a list of anchors and links.
- It checks that no anchor is defined twice.
- It then checks that all the links are dereferenceable, including the fragments. It warns about HTTP redirects, including directory redirects.
- It can recursively check part of a Web site.
- There is a command line version and a CGI version. They both support HTTP basic authentication. This is achieved in the CGI version by passing through the authorization information from the user browser to the site tested.
- This link checker is a modified version of the W3C Link Checker, which is open source.
- A few modifications were necessary for proper installation.
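The first two steps in the list above (extracting anchors and links, then checking that no anchor is defined twice) can be sketched in a few lines. This is only an illustration in Python using the standard library, not the actual implementation of the W3C Link Checker, which is written in Perl. The names `AnchorLinkCollector`, `collect`, and `duplicate_anchors` are hypothetical, and the sketch assumes that anchors are `id` attributes or `<a name="...">`, and that links are `href` or `src` attributes.

```python
from collections import Counter
from html.parser import HTMLParser


class AnchorLinkCollector(HTMLParser):
    """Collect anchor names and link targets from one HTML document.

    Hypothetical helper for illustration; not part of the W3C Link Checker.
    """

    def __init__(self):
        super().__init__()
        self.anchors = []  # values of id attributes and <a name="...">
        self.links = []    # values of href and src attributes

    def handle_starttag(self, tag, attrs):
        # A tag such as <a id="x" name="x"> defines the anchor "x" once,
        # so anchor names are de-duplicated per tag before being recorded.
        names = set()
        for name, value in attrs:
            if not value:
                continue
            if name == "id" or (tag == "a" and name == "name"):
                names.add(value)
            elif name in ("href", "src"):
                self.links.append(value)
        self.anchors.extend(sorted(names))


def collect(html_text):
    """Return (anchors, links) found in html_text, in document order."""
    parser = AnchorLinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.anchors, parser.links


def duplicate_anchors(anchors):
    """Return the anchor names that are defined more than once, sorted."""
    counts = Counter(anchors)
    return sorted(name for name, n in counts.items() if n > 1)
```

Calling `collect()` on a page and passing the resulting anchors to `duplicate_anchors()` gives the list of names to report. Checking that each link is dereferenceable would additionally require HTTP requests, which this sketch leaves out.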
W3C Link Checker
Robots exclusion
The link checker honors robots exclusion rules. To place rules specific to the W3C Link Checker in /robots.txt files, sites can use the W3C-checklink user agent string. For example, to allow the link checker to access all documents on a server and to disallow all other robots, one could use the following:

    User-Agent: *
    Disallow: /

    User-Agent: W3C-checklink
    Disallow:
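To see how a robots.txt parser interprets these rules, the short Python sketch below feeds them to the standard library's `urllib.robotparser`. It only illustrates the rules themselves and is not how the link checker evaluates them internally. The URL `http://example.org/docs/page.html` and the user agent name `SomeOtherBot` are placeholders chosen for the example.

```python
from urllib.robotparser import RobotFileParser

# The rules from the text: block every robot except W3C-checklink.
ROBOTS_TXT = """\
User-Agent: *
Disallow: /

User-Agent: W3C-checklink
Disallow:
"""


def is_allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://example.org/docs/page.html"  # placeholder URL
    print("W3C-checklink:", is_allowed("W3C-checklink", url))
    print("SomeOtherBot:", is_allowed("SomeOtherBot", url))
```

An empty `Disallow:` line means that nothing is disallowed, which is why the W3C-checklink entry grants full access while the wildcard entry blocks everything else.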
Known Issues
If a link checker run in "summary only" mode takes a long time, some user agents may stop loading the results page due to a timeout. We have placed workarounds in the code hoping to avoid this, but have not yet found one that works reliably for all browsers. If you experience these timeouts, try avoiding "summary only" mode, or try using the link checker with another browser.
Visit: W3C Linkchecker
http://www.uspharmd.com/tools/checklink
Editorial Note:
This script has been modified from the original W3C Link Checker. This server does not have Time::HiRes installed.
- Time::HiRes usage was replaced by using sys.syscall.ph
- No other configuration files are needed
(Added: 21-Dec-2007)