I've been using the Html Agility Pack for more than a
year, though a bit less than two. It's a really, really
good library, and it has handled almost every kind of
terrible HTML structure we've thrown at it.
It isn't perfect, though. I find I have to rename form
tags to xform because it doesn't parse them properly. I
think it's the same for param. Html Agility Pack has a
built-in list of tags that should be singletons, and I
think form is one of them, if I remember right.
Anyway, with a very small amount of preprocessing, it
can handle almost anything.

Hi Steve,
The library's behavior for FORM is by design. It was
built that way because, in a lot of real-world HTML, FORM
overlaps other tags (because of its function).
You can change the FORM behavior through
HtmlNode.ElementsFlags: just remove FORM from that list.
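
For anyone who wants to try that, here is a minimal sketch of the change. It assumes the HtmlAgilityPack NuGet package is referenced and that the static table is exposed as HtmlNode.ElementsFlags, keyed by lowercase tag name. The class name, method name, and sample markup are made up for illustration only.

```csharp
using System;
using HtmlAgilityPack; // assumes the HtmlAgilityPack NuGet package is referenced

// Hypothetical helper, for illustration only.
public static class FormParsingSketch
{
    public static int CountFormChildren(string html)
    {
        // ElementsFlags is a static table shared by every parse in the process,
        // so removing "form" here affects all documents loaded afterwards.
        HtmlNode.ElementsFlags.Remove("form");

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // With the special flag removed, FORM should be parsed as an ordinary
        // container, so its inputs should appear as child nodes.
        HtmlNode form = doc.DocumentNode.SelectSingleNode("//form");
        return form == null ? 0 : form.ChildNodes.Count;
    }

    public static void Main()
    {
        // Made-up sample markup.
        string html = "<form action=\"/search\"><input name=\"q\"><input name=\"r\"></form>";
        Console.WriteLine(CountFormChildren(html));
    }
}
```

Because the table is static, it is probably safest to make this change once at startup, before any document is loaded, rather than per parse as the sketch does.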
HTH
Simon.
PS: Glad you guys like it :-) I used .EXE files to test
its robustness...