sloth_link_crawler 0.0.2
(WIP!) Web crawler that crawls all links of a website. Aims to respect robots.txt and support custom user agents.
0.0.2
- Crawl website and return found links
- ADD Optional parameters
- ADD Custom user agent
- ADD Respect the domain's robots.txt
- ADD Filter to only return internal domain links
- ADD Debug mode to print results on console
- ADD Delay between scrape calls (GET requests)
- ADD Retry failed requests (max 5 retries)
0.0.1
- Initial version.