Sharkysoft home
Class Summary
Class | Description
HtmlCloseTag | Closing HTML tag.
HtmlComment | HTML comment.
HtmlComponent | Parsed HTML element.
HtmlEntities | Translate HTML entities.
HtmlError | Malformed HTML source.
HtmlOpenTag | Opening HTML tag.
HtmlParser | Parses HTML source.
HtmlRegularTag | HTML tag.
HtmlSpecialTag | Unusual HTML tag.
HtmlText | Uninterpreted HTML text.
HTML parsing.
Details: This package is useful for parsing simple HTML documents. It allows you to turn an HTML text stream into a stream of Java objects that represent the parsed contents of the stream. The classes of objects that can occur in the object stream are shown below.
[Figure: class hierarchy diagram of HtmlComponent and its subclasses]
All objects are instances of HtmlComponent, but since HtmlComponent is an abstract class, each object in the stream also belongs to one of the subclasses shown.
HtmlParser is responsible for converting the text stream into the stream of objects. This package is ideal for applications which must extract information from machine-
Here is a simple example that reads a web page and prints the (relative) URLs of all of the images displayed in it.*
*Actually, this claim is not entirely true. The example only lists images displayed using the <IMG> tag, and does not include the background image (which comes from the <BODY> tag) or other images brought in by other means. We wanted to keep the example simple.
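As a rough illustration of the task described above, the following sketch lists the SRC values of <IMG> tags using only java.util.regex. It does not use the classes of this package, whose method signatures are not shown on this page. The class name ImgLister and the method listImageUrls are hypothetical names chosen for this sketch. Like the example described above, it ignores the <BODY> background image.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration only.
// This class is NOT part of the package documented on this page.
class ImgLister {
    // Matches an <IMG ...> tag in any letter case and captures its attribute text.
    private static final Pattern IMG_TAG =
        Pattern.compile("<img\\b([^>]*)>", Pattern.CASE_INSENSITIVE);

    // Matches src=value where the value is double-quoted, single-quoted, or bare.
    private static final Pattern SRC_ATTR = Pattern.compile(
        "\\bsrc\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s\"'>]+))",
        Pattern.CASE_INSENSITIVE);

    // Returns the SRC value of every <IMG> tag, in document order.
    static List<String> listImageUrls(String html) {
        List<String> urls = new ArrayList<>();
        Matcher tag = IMG_TAG.matcher(html);
        while (tag.find()) {
            Matcher src = SRC_ATTR.matcher(tag.group(1));
            if (src.find()) {
                // Exactly one of the three capture groups holds the value.
                for (int g = 1; g <= 3; g++) {
                    if (src.group(g) != null) {
                        urls.add(src.group(g));
                        break;
                    }
                }
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        String page = "<html><body background=\"bg.gif\">"
            + "<IMG SRC=\"logo.png\"><p>text</p><img alt=x src=pics/cat.jpg>"
            + "</body></html>";
        for (String url : listImageUrls(page)) {
            System.out.println(url);
        }
    }
}
```

A pattern-based scan like this is only an approximation: it would also report <IMG> tags that appear inside HTML comments, and it can be confused by a '>' character inside a quoted attribute value. Handling such cases is the kind of work a real parser takes on.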