sloth_link_crawler 0.0.2

(WIP!) Web crawler that collects all links of a website. Respecting robots.txt and supporting custom user agents are the goals.

ARE YOU A SLOTH? - SLOTH LINK CRAWLER

This package is WIP! It crawls a URL and lists all links it finds. We try to respect the robots.txt file. See the example for how to integrate it into your codebase.

Features

  • Crawl a URL
  • List all links (internal only, or all)
  • Delay crawl requests
  • Custom user agent
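
To illustrate what "internal only" means, here is a minimal sketch that uses only `Uri` from `dart:core`. It is not the package's implementation; the helper name `isInternalLink` and the sample links are assumptions made for this example. A link counts as internal when, after being resolved against the base URL, it has the same host.

```dart
/// Hypothetical helper (not part of sloth_link_crawler):
/// returns true when [link], resolved against [baseUrl], has the same host.
bool isInternalLink(String baseUrl, String link) {
  final base = Uri.parse(baseUrl);
  // Relative links such as '/about' inherit the scheme and host of the base.
  final resolved = base.resolve(link);
  return resolved.host == base.host;
}

void main() {
  const baseUrl = 'https://example.com';
  final links = [
    '/about',
    'https://example.com/contact',
    'https://other.org/page',
  ];

  // Keep only the links that stay on the base host.
  final internal = links.where((l) => isInternalLink(baseUrl, l)).toList();
  print(internal);
}
```

Comparing hosts exactly treats subdomains such as `blog.example.com` as external. Whether the package does the same is not stated here.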

Getting started

With Dart:

 $ dart pub add sloth_link_crawler

With Flutter:

 $ flutter pub add sloth_link_crawler

Usage

Import the package:

import 'package:sloth_link_crawler/sloth_link_crawler.dart';

Create a crawler and start the crawl:

  Future<void> main() async {
    final crawler = SlothLinkCrawler(
      // Domain to crawl
      baseUrl: 'https://example.com',
      // If true, only URLs on this domain are listed
      onlyInternalDomainLinks: true,
      // If true, debugging information is printed to the console
      debugMode: true,
      // Delay between crawl requests
      duration: const Duration(seconds: 2),
      // Custom user agent
      userAgent: 'SlothLinkCrawler',
    );

    // List of all crawled URLs
    final List<String> result = await crawler.crawl();
    print(result);
  }
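
The `duration` parameter above sets a pause between crawl calls. The general idea of pausing between requests can be sketched with `Future.delayed` from `dart:core`. This is only an illustration, not the package's internals; `visitWithDelay` and the `visit` callback are hypothetical names, and the callback stands in for whatever actually fetches a page.

```dart
/// Hypothetical helper (not part of sloth_link_crawler):
/// calls [visit] for each URL in order, pausing [delay] between calls.
Future<void> visitWithDelay(
  List<String> urls,
  Duration delay,
  Future<void> Function(String url) visit,
) async {
  for (var i = 0; i < urls.length; i++) {
    await visit(urls[i]);
    // Pause between requests, but not after the last one.
    if (i < urls.length - 1) {
      await Future<void>.delayed(delay);
    }
  }
}

Future<void> main() async {
  final visited = <String>[];
  await visitWithDelay(
    ['https://example.com/', 'https://example.com/about'],
    const Duration(seconds: 2),
    // Stand-in for a real page fetch: just record the URL.
    (url) async {
      visited.add(url);
    },
  );
  print(visited);
}
```

A longer delay puts less load on the crawled server at the cost of a slower crawl.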

Publisher

Verified publisher: ynnob.com

License

BSD-3-Clause (license)

Dependencies

beautiful_soup_dart, http, http_interceptor, queue, robots_txt
