nutch-1.0 » org.apache.nutch » parse
org.apache.nutch.parse
public class ParseUtil
java.lang.Object
   org.apache.nutch.parse.ParseUtil
A utility class with methods for common parsing tasks, such as iterating through a preferred list of Parsers to obtain Parse objects.
Field Summary
public static final  Log LOG     
Constructor:
 public ParseUtil(Configuration conf) 
    Parameters:
    conf -
Method from org.apache.nutch.parse.ParseUtil Summary:
parse,   parseByExtensionId
Methods from java.lang.Object:
equals,   getClass,   hashCode,   notify,   notifyAll,   toString,   wait,   wait,   wait
Method from org.apache.nutch.parse.ParseUtil Detail:
 public ParseResult parse(Content content) throws ParseException 
    Performs a parse by iterating through a List of preferred Parsers until a successful parse is performed and a Parse object is returned. If the parse is unsuccessful, a message is logged at the WARNING level, and an empty parse is returned.
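    The fallback behavior described for parse can be illustrated with a small, self-contained sketch. It does not use the Nutch classes: SketchParser, FallbackParseSketch, EMPTY_PARSE, and the warnings list are hypothetical stand-ins for Parser, ParseUtil, the empty parse, and the WARNING log, chosen so the example runs without Nutch on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for a parser; this is NOT the Nutch Parser interface.
interface SketchParser {
    // Returns the parsed text, or empty if this parser cannot handle the input.
    Optional<String> tryParse(String rawContent);
}

class FallbackParseSketch {
    // Hypothetical marker playing the role of the "empty parse".
    static final String EMPTY_PARSE = "";

    // Collected instead of logged, so the behavior is easy to inspect.
    final List<String> warnings = new ArrayList<>();

    // Tries each preferred parser in order; the first successful parse wins.
    String parse(List<SketchParser> preferredParsers, String rawContent) {
        for (SketchParser parser : preferredParsers) {
            Optional<String> parsed = parser.tryParse(rawContent);
            if (parsed.isPresent()) {
                return parsed.get();
            }
        }
        // No parser succeeded: record a warning and fall back to the empty parse.
        warnings.add("Unable to successfully parse content");
        return EMPTY_PARSE;
    }

    public static void main(String[] args) {
        // Succeeds only on input that starts with an <html> tag.
        SketchParser htmlOnly = raw -> raw.startsWith("<html>")
                ? Optional.of("html:" + raw)
                : Optional.<String>empty();
        // Succeeds on any non-empty input.
        SketchParser plainText = raw -> raw.isEmpty()
                ? Optional.<String>empty()
                : Optional.of("text:" + raw);

        FallbackParseSketch sketch = new FallbackParseSketch();
        List<SketchParser> parsers = Arrays.asList(htmlOnly, plainText);

        System.out.println(sketch.parse(parsers, "hello")); // prints "text:hello"
        System.out.println(sketch.parse(parsers, "").isEmpty()); // prints "true"
        System.out.println(sketch.warnings.size()); // prints "1"
    }
}
```

    In this sketch the order of the list is the preference order, so a more specific parser should be listed before a more general one.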
 public ParseResult parseByExtensionId(String extId,
    Content content) throws ParseException 
    Parses a Content object using the Parser specified by the parameter extId, i.e., the Parser's extension ID. If a suitable Parser is not found, a WARNING-level message is logged and a ParseException is thrown. If the parse is unsuccessful for any other reason, a WARNING-level message is logged and the result of ParseStatus.getEmptyParse() is returned.
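    The two failure modes described for parseByExtensionId (no parser for the ID versus a parser that fails) can be sketched as follows. This is an illustration only, not the Nutch implementation: ParseByIdSketch, SketchParseException, register, EMPTY_PARSE, and the warnings list are hypothetical names, and the parser is modeled as a plain function from raw text to an optional result.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical exception standing in for ParseException in this sketch.
class SketchParseException extends Exception {
    SketchParseException(String message) {
        super(message);
    }
}

class ParseByIdSketch {
    // Hypothetical marker playing the role of the "empty parse".
    static final String EMPTY_PARSE = "";

    // Collected instead of logged, so the behavior is easy to inspect.
    final List<String> warnings = new ArrayList<>();

    // Maps an extension ID to a parser function (an assumption of this sketch).
    private final Map<String, Function<String, Optional<String>>> parsersById = new HashMap<>();

    void register(String extId, Function<String, Optional<String>> parser) {
        parsersById.put(extId, parser);
    }

    String parseByExtensionId(String extId, String rawContent) throws SketchParseException {
        Function<String, Optional<String>> parser = parsersById.get(extId);
        if (parser == null) {
            // Failure mode 1: no parser for this ID, so warn and throw.
            warnings.add("No suitable parser found for " + extId);
            throw new SketchParseException("No parser registered for extension ID: " + extId);
        }
        Optional<String> parsed = parser.apply(rawContent);
        if (!parsed.isPresent()) {
            // Failure mode 2: the parser ran but failed, so warn and return the empty parse.
            warnings.add("Unable to successfully parse content with " + extId);
            return EMPTY_PARSE;
        }
        return parsed.get();
    }

    public static void main(String[] args) throws SketchParseException {
        ParseByIdSketch sketch = new ParseByIdSketch();
        // "parse-text" is an illustrative ID; this parser fails on empty input.
        sketch.register("parse-text", raw -> raw.isEmpty()
                ? Optional.<String>empty()
                : Optional.of("text:" + raw));

        System.out.println(sketch.parseByExtensionId("parse-text", "hi")); // prints "text:hi"
        System.out.println(sketch.parseByExtensionId("parse-text", "").isEmpty()); // prints "true"
    }
}
```

    A caller therefore has to handle both outcomes: catch the exception for an unknown ID, and check whether the returned parse is the empty one.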