org.apache.nutch.parse.html
public class: HTMLMetaProcessor
java.lang.Object
org.apache.nutch.parse.html.HTMLMetaProcessor
Class for parsing META directives from DOM trees. This class
specifically handles Robots META directives (all, none, nofollow,
noindex), BASE HREF tags, and HTTP-EQUIV no-cache
instructions. All meta directives are stored in an HTMLMetaTags instance.
| Method from org.apache.nutch.parse.html.HTMLMetaProcessor Summary: |
|---|
| getMetaTags |
| Method from org.apache.nutch.parse.html.HTMLMetaProcessor Detail: |
public static final void getMetaTags(HTMLMetaTags metaTags,
Node node,
URL currURL) {
metaTags.reset();
getMetaTagsHelper(metaTags, node, currURL);
}
Resets metaTags, then sets its indicators to appropriate
values, based on any META tags found under the given
node.
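
The reset-then-scan behaviour described above can be illustrated with a minimal, self-contained sketch. It is not the Nutch implementation: the class `RobotsMetaSketch`, its fields, and the methods `collect`, `walk`, and `parseXhtml` are hypothetical names invented for this example, and it relies only on the JDK's `org.w3c.dom` API. It assumes well-formed XHTML input with lowercase attribute names, and it assumes that `none` implies both noindex and nofollow while `all` leaves the defaults unchanged. BASE HREF handling is left out.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical illustration only; not part of Nutch.
class RobotsMetaSketch {
    // Stand-ins for the indicators that an HTMLMetaTags instance would hold.
    boolean noIndex;
    boolean noFollow;
    boolean noCache;

    // Clears any earlier state, then scans the subtree under node,
    // mirroring the reset-then-helper shape of getMetaTags.
    static void collect(RobotsMetaSketch flags, Node node) {
        flags.noIndex = false;
        flags.noFollow = false;
        flags.noCache = false;
        walk(flags, node);
    }

    // Depth-first walk that inspects every <meta> element.
    private static void walk(RobotsMetaSketch flags, Node node) {
        if (node.getNodeType() == Node.ELEMENT_NODE
                && "meta".equalsIgnoreCase(node.getNodeName())) {
            Element meta = (Element) node;
            // getAttribute returns "" when the attribute is absent.
            String name = meta.getAttribute("name");
            String equiv = meta.getAttribute("http-equiv");
            String content = meta.getAttribute("content").toLowerCase(Locale.ROOT);
            if ("robots".equalsIgnoreCase(name)) {
                for (String token : content.split(",")) {
                    switch (token.trim()) {
                        case "none":
                            flags.noIndex = true;
                            flags.noFollow = true;
                            break;
                        case "noindex":
                            flags.noIndex = true;
                            break;
                        case "nofollow":
                            flags.noFollow = true;
                            break;
                        default:
                            // "all" and unknown tokens leave the defaults.
                            break;
                    }
                }
            } else if ("pragma".equalsIgnoreCase(equiv)
                    && content.contains("no-cache")) {
                flags.noCache = true;
            }
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            walk(flags, children.item(i));
        }
    }

    // Parses well-formed XHTML into a DOM tree using the JDK parser.
    static Node parseXhtml(String xhtml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            xhtml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XHTML", e);
        }
    }

    public static void main(String[] args) {
        RobotsMetaSketch flags = new RobotsMetaSketch();
        Node root = parseXhtml("<html><head>"
                + "<meta name=\"robots\" content=\"noindex, nofollow\"/>"
                + "</head><body/></html>");
        collect(flags, root);
        System.out.println("noIndex=" + flags.noIndex
                + " noFollow=" + flags.noFollow
                + " noCache=" + flags.noCache);
    }
}
```

Because `collect` clears the flags before walking the tree, the same `RobotsMetaSketch` instance can be reused across documents without carrying over indicators from an earlier page, which appears to be the purpose of the `metaTags.reset()` call in `getMetaTags`.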