queryWithRegex method
Extracts text content from the loaded HTML using a regular expression pattern. Matching is case-insensitive, and . also matches line terminators (dotAll).
pattern - The regex pattern to match against
group - The capture group to extract (defaults to 1)
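Returns a list with the trimmed text captured by group for each match, in order of appearance. Captures that are null or empty after trimming are skipped.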
Throws ScraperNotInitializedException if the HTML content has not been loaded. Throws ParseException if the pattern fails to compile or any other error occurs while matching.
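The matching rules described above can be sketched with a small standalone function. This is only an illustration: extractGroups and the sample HTML are hypothetical and not part of the library, and the function takes the HTML as an argument rather than reading it from a scraper instance.

```dart
// Hypothetical standalone helper that mirrors the documented matching rules.
// It is not part of the library's API.
List<String> extractGroups(String html, String pattern, {int group = 1}) {
  // Same flags as the documented method: ignore case, let `.` span newlines.
  final regex = RegExp(pattern, caseSensitive: false, dotAll: true);
  final results = <String>[];
  for (final match in regex.allMatches(html)) {
    final content = match.group(group)?.trim();
    // Skip groups that did not participate or that are blank after trimming.
    if (content != null && content.isNotEmpty) {
      results.add(content);
    }
  }
  return results;
}

void main() {
  const html = '<H2> First </H2>\n<h2>\nSecond\n</h2><h2>   </h2>';

  // The lowercase pattern also matches <H2>, the second capture spans
  // line breaks, and the whitespace-only third heading is dropped.
  print(extractGroups(html, r'<h2>(.*?)</h2>'));

  // group: 0 selects the entire match instead of the first capture group.
  print(extractGroups(html, r'<h2>(.*?)</h2>', group: 0).first);
}
```

Passing group: 0 is handy when the surrounding markup is needed as well as the captured text.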
Implementation
List<String> queryWithRegex({
  required String pattern,
  int group = 1,
}) {
  if (_htmlContent == null) {
    throw ScraperNotInitializedException();
  }

  try {
    final results = <String>[];
    // Case-insensitive; dotAll lets `.` match across line breaks.
    final regex = RegExp(pattern, caseSensitive: false, dotAll: true);
    for (final match in regex.allMatches(_htmlContent!)) {
      // A group that did not participate in the match yields null.
      final content = match.group(group)?.trim();
      if (content != null && content.isNotEmpty) {
        results.add(content);
      }
    }
    return results;
  } catch (e) {
    // Wraps any failure, including an invalid pattern (FormatException).
    throw ParseException(
        'Failed to parse HTML with regex pattern: $pattern', _htmlContent, e);
  }
}
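Because the implementation wraps every failure in ParseException, a caller who wants to tell a malformed pattern apart from other errors may prefer to check the pattern first. A minimal sketch, assuming a hypothetical helper named isValidPattern that is not part of the library; it relies only on the fact that Dart's RegExp constructor throws FormatException for invalid syntax.

```dart
// Hypothetical helper, not part of the library: reports whether Dart can
// compile the pattern with the same flags the method uses.
bool isValidPattern(String pattern) {
  try {
    RegExp(pattern, caseSensitive: false, dotAll: true);
    return true;
  } on FormatException {
    // Thrown by the RegExp constructor when the syntax is invalid.
    return false;
  }
}

void main() {
  print(isValidPattern(r'<a href="(.*?)">')); // well-formed pattern
  print(isValidPattern(r'<a href="(.*?">')); // unbalanced parenthesis
}
```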