org.apache.nutch.parse.html
public class: HTMLMetaProcessor
java.lang.Object
   org.apache.nutch.parse.html.HTMLMetaProcessor
Class for parsing META directives from DOM trees. It specifically handles Robots META directives (all, none, nofollow, noindex), BASE HREF tags, and HTTP-EQUIV no-cache instructions. All meta directives are stored in an HTMLMetaTags instance.
Method from org.apache.nutch.parse.html.HTMLMetaProcessor Summary:
getMetaTags
Methods from java.lang.Object:
equals,   getClass,   hashCode,   notify,   notifyAll,   toString,   wait,   wait,   wait
Method from org.apache.nutch.parse.html.HTMLMetaProcessor Detail:
 public static final void getMetaTags(HTMLMetaTags metaTags,
    Node node,
    URL currURL)
    Sets the indicators in metaTags to appropriate values, based on any META tags found under the given node.
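
The kind of traversal described above can be illustrated with a minimal standalone sketch. It is not Nutch's implementation and does not use HTMLMetaTags: the class RobotsMetaSketch, the Flags holder, and the collect, attr, and parse helpers are all hypothetical names introduced for illustration. It covers only the Robots directives (not BASE HREF or HTTP-EQUIV) and assumes well-formed XHTML input so that the JDK's built-in XML parser can build the DOM tree.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class RobotsMetaSketch {

    /** Hypothetical stand-in for HTMLMetaTags: holds only two robots flags. */
    public static final class Flags {
        public boolean noIndex;
        public boolean noFollow;
    }

    /** Visits node and all of its descendants, updating flags for each robots META tag found. */
    public static void collect(Flags flags, Node node) {
        if (node.getNodeType() == Node.ELEMENT_NODE
                && "meta".equalsIgnoreCase(node.getNodeName())) {
            String name = attr(node, "name");
            String content = attr(node, "content");
            if ("robots".equalsIgnoreCase(name) && content != null) {
                for (String token : content.toLowerCase(Locale.ROOT).split(",")) {
                    switch (token.trim()) {
                        case "none":
                            flags.noIndex = true;
                            flags.noFollow = true;
                            break;
                        case "noindex":
                            flags.noIndex = true;
                            break;
                        case "nofollow":
                            flags.noFollow = true;
                            break;
                        default:
                            break; // "all" and unknown tokens leave the flags unchanged
                    }
                }
            }
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collect(flags, children.item(i));
        }
    }

    /** Case-insensitive attribute lookup; returns null when the attribute is absent. */
    private static String attr(Node node, String wanted) {
        NamedNodeMap attrs = node.getAttributes();
        if (attrs == null) {
            return null;
        }
        for (int i = 0; i < attrs.getLength(); i++) {
            Node a = attrs.item(i);
            if (wanted.equalsIgnoreCase(a.getNodeName())) {
                return a.getNodeValue();
            }
        }
        return null;
    }

    /** Parses well-formed XHTML text into a DOM tree and collects the robots flags. */
    public static Flags parse(String xhtml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)));
        Flags flags = new Flags();
        collect(flags, doc);
        return flags;
    }

    public static void main(String[] args) throws Exception {
        Flags f = parse("<html><head><meta name=\"ROBOTS\" content=\"noindex, nofollow\"/>"
                + "</head><body/></html>");
        System.out.println("noIndex=" + f.noIndex + " noFollow=" + f.noFollow);
    }
}
```

As in the documented method, the caller supplies the result holder and the starting node, so the same holder can accumulate flags across several subtrees. The real getMetaTags additionally takes the current URL as currURL, which this sketch omits.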