queryAll method

List<String> queryAll({
  required String tag,
  String? className,
  String? id,
})

Extracts all text content from HTML tags matching the specified criteria.

tag - The HTML tag to search for (e.g., 'h1', 'p', 'div')
className - Optional CSS class name to filter by
id - Optional ID attribute to filter by

Returns a list containing the cleaned text content of each matching element. Elements whose content is empty after cleaning are omitted.

Throws ScraperNotInitializedException if no HTML content has been loaded. Throws ParseException if parsing fails.

Implementation

List<String> queryAll({
  required String tag,
  String? className,
  String? id,
}) {
  if (_htmlContent == null) {
    throw ScraperNotInitializedException();
  }

  try {
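    final results = <String>[];
    // Build a pattern for the tag, optionally narrowed by class and id.
    final pattern = _buildTagPattern(tag, className: className, id: id);

    // Case-insensitive so 'DIV' matches 'div'; dotAll lets '.' span newlines.
    final regex = RegExp(pattern, caseSensitive: false, dotAll: true);
    final matches = regex.allMatches(_htmlContent!);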

    for (RegExpMatch match in matches) {
      String? content = match.group(1);
      if (content != null) {
        // Remove HTML tags from content and clean up whitespace
        String cleanContent = _cleanHtmlContent(content);
        if (cleanContent.isNotEmpty) {
          results.add(cleanContent);
        }
      }
    }

    return results;
  } catch (e) {
    throw ParseException(
        'Failed to parse HTML with tag pattern', _htmlContent, e);
  }
}
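
The private helpers _buildTagPattern and _cleanHtmlContent are not shown on this page. The following is a minimal sketch of how such helpers might work, so the matching and cleaning steps above can be tried in isolation. The names buildTagPattern, cleanHtmlContent, and queryAllSketch, and the exact regular expression, are assumptions for illustration and may differ from the library's actual implementation.

```dart
// Hypothetical stand-ins for the private helpers used by queryAll.
// Nothing here is taken from the library; it is an illustrative sketch.

const quote = '["\']'; // matches either quote character
const notQuote = '[^"\']'; // matches any character that is not a quote

/// Builds a pattern whose group 1 captures the inner HTML of
/// `<tag ...>...</tag>`, optionally requiring a class token and/or an id.
String buildTagPattern(String tag, {String? className, String? id}) {
  final t = RegExp.escape(tag);
  final buffer = StringBuffer('<$t\\b');
  if (className != null) {
    final c = RegExp.escape(className);
    // Lookahead: class attribute containing className as a whole token.
    buffer.write('(?=[^>]*\\sclass\\s*=\\s*'
        '$quote(?:$notQuote*\\s)?$c(?:\\s$notQuote*)?$quote)');
  }
  if (id != null) {
    final i = RegExp.escape(id);
    // Lookahead: id attribute equal to id.
    buffer.write('(?=[^>]*\\sid\\s*=\\s*$quote$i$quote)');
  }
  // Rest of the opening tag, lazily captured content, then the closing tag.
  buffer.write('[^>]*>(.*?)</$t\\s*>');
  return buffer.toString();
}

/// Strips nested tags, collapses runs of whitespace, and trims the result.
String cleanHtmlContent(String html) => html
    .replaceAll(RegExp(r'<[^>]*>'), '')
    .replaceAll(RegExp(r'\s+'), ' ')
    .trim();

/// Mirrors the flow of queryAll, but takes the HTML as an argument.
List<String> queryAllSketch(String html,
    {required String tag, String? className, String? id}) {
  final regex = RegExp(buildTagPattern(tag, className: className, id: id),
      caseSensitive: false, dotAll: true);
  return regex
      .allMatches(html)
      .map((m) => cleanHtmlContent(m.group(1) ?? ''))
      .where((text) => text.isNotEmpty)
      .toList();
}

void main() {
  const html = '''
<h1 id="title">Hello <b>world</b></h1>
<p class="intro lead">First   paragraph.</p>
<p>Second paragraph.</p>
<p class="lead"></p>
''';
  print(queryAllSketch(html, tag: 'p'));
  // [First paragraph., Second paragraph.]
  print(queryAllSketch(html, tag: 'p', className: 'lead'));
  // [First paragraph.]
  print(queryAllSketch(html, tag: 'h1', id: 'title'));
  // [Hello world]
}
```

Under this assumed lazy pattern, an element nested inside another element of the same tag (a div inside a div) would be cut off at the first closing tag. That limitation is inherent to regex-based matching, and a full HTML parser would be needed to handle such markup reliably.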