Hello! We are running our annual fundraising. Please consider making a donation if you value this freely available service or want to support people around the world working towards liberatory social change. https://riseup.net/donate.

Commit efa525c1 authored by jvoisin's avatar jvoisin
Browse files

Improve the robustness of the HTML parser

parent f67cd9d7
Pipeline #31809 passed with stages
in 2 minutes and 39 seconds
......@@ -104,6 +104,15 @@ class _HTMLParser(parser.HTMLParser):
self.tag_required_blocklist = required_blocklisted_tags
self.tag_blocklist = blocklisted_tags
def error(self, message): # pragma: no cover
""" Amusingly, Python's documentation doesn't mention that this
function needs to be implemented in subclasses of the parent class
of parser.HTMLParser. This was found by fuzzing,
triggering the following exception:
NotImplementedError: subclasses of ParserBase must override error()
raise ValueError(message)
def handle_starttag(self, tag: str, attrs: List[Tuple[str, Optional[str]]]):
# Ignore the type, because mypy is too stupid to infer
# that get_starttag_text() can't return None.
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment