Package org.apache.lucene.analysis.core
Class WhitespaceTokenizerFactory
java.lang.Object
org.apache.lucene.analysis.AbstractAnalysisFactory
org.apache.lucene.analysis.TokenizerFactory
org.apache.lucene.analysis.core.WhitespaceTokenizerFactory
Factory for WhitespaceTokenizer.
<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.WhitespaceTokenizerFactory" rule="unicode" maxTokenLen="256"/>
</analyzer>
</fieldType>
Options:
- rule: either "java" for WhitespaceTokenizer or "unicode" for UnicodeWhitespaceTokenizer
- maxTokenLen: maximum token length; should be greater than 0 and less than MAX_TOKEN_LENGTH_LIMIT (1024*1024). It rarely needs to be changed; when omitted, CharTokenizer::DEFAULT_MAX_TOKEN_LEN applies.
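The effect of the two options can be illustrated without Lucene on the classpath. The following is a minimal sketch in plain Java, not Lucene's implementation. It assumes that rule="java" splits on Character.isWhitespace(int), that rule="unicode" splits on code points carrying the Unicode White_Space property, and that a token reaching maxTokenLen is emitted and a new token started. The class name WhitespaceRuleSketch, the tokenize helper, and the hand-written code point list are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

class WhitespaceRuleSketch {

    // Approximates rule="java": a separator is whatever Character.isWhitespace accepts.
    static final IntPredicate JAVA_RULE = Character::isWhitespace;

    // Approximates rule="unicode": code points with the Unicode White_Space property.
    // Assumption: this hand-written list stands in for Lucene's own table.
    static final IntPredicate UNICODE_RULE = cp ->
            (cp >= 0x0009 && cp <= 0x000D) || cp == 0x0020 || cp == 0x0085
            || cp == 0x00A0 || cp == 0x1680 || (cp >= 0x2000 && cp <= 0x200A)
            || cp == 0x2028 || cp == 0x2029 || cp == 0x202F
            || cp == 0x205F || cp == 0x3000;

    // Splits text at separators; a token that reaches maxTokenLen chars is cut there.
    static List<String> tokenize(String text, IntPredicate isSeparator, int maxTokenLen) {
        if (maxTokenLen <= 0) {
            throw new IllegalArgumentException("maxTokenLen must be greater than 0");
        }
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (isSeparator.test(cp)) {
                flush(current, tokens);
            } else {
                current.appendCodePoint(cp);
                if (current.length() >= maxTokenLen) {
                    flush(current, tokens);
                }
            }
        }
        flush(current, tokens);
        return tokens;
    }

    // Emits the pending token, if any, and resets the buffer.
    private static void flush(StringBuilder current, List<String> tokens) {
        if (current.length() > 0) {
            tokens.add(current.toString());
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        String text = "foo\u00A0bar baz"; // U+00A0 is a non-breaking space
        System.out.println(tokenize(text, JAVA_RULE, 255).size() + " tokens with the java-style rule");
        System.out.println(tokenize(text, UNICODE_RULE, 255).size() + " tokens with the unicode-style rule");
        System.out.println(tokenize("abcdefgh", JAVA_RULE, 3));
    }
}
```

The non-breaking space U+00A0 is the visible difference in this sketch: Character.isWhitespace excludes it, so the java-style predicate keeps foo and bar together, while the Unicode-style predicate splits them.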
- Since: 3.1
- SPI Name (case-insensitive: if the name is 'htmlStrip', 'htmlstrip' can be used when looking up the service): "whitespace"
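The case-insensitive SPI name lookup described above can be pictured as a registry keyed by lowercased names. This is a minimal sketch of that idea only, not Lucene's service loader. The class SpiNameLookupSketch and its register and lookup methods are hypothetical; the real entry points are the forName and lookupClass methods inherited from TokenizerFactory.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class SpiNameLookupSketch {

    // Keys are stored lowercased so lookups can ignore case.
    private final Map<String, String> classNameByLowercaseName = new HashMap<>();

    // Registers a factory class name under its SPI name (hypothetical helper).
    void register(String spiName, String className) {
        classNameByLowercaseName.put(spiName.toLowerCase(Locale.ROOT), className);
    }

    // Resolves a name regardless of the case the caller used.
    String lookup(String name) {
        String className = classNameByLowercaseName.get(name.toLowerCase(Locale.ROOT));
        if (className == null) {
            throw new IllegalArgumentException("No factory registered under name '" + name + "'");
        }
        return className;
    }

    public static void main(String[] args) {
        SpiNameLookupSketch registry = new SpiNameLookupSketch();
        registry.register("whitespace",
                "org.apache.lucene.analysis.core.WhitespaceTokenizerFactory");
        System.out.println(registry.lookup("Whitespace"));
    }
}
```

Locale.ROOT keeps the lowercasing independent of the default locale, which matters for names containing letters such as 'I' under a Turkish locale.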
Field Summary
Fields
Modifier and Type      Field           Description
static final String    NAME            SPI name
static final String    RULE_JAVA
static final String    RULE_UNICODE
Fields inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
Constructor Summary
Constructors
Constructor                                            Description
WhitespaceTokenizerFactory()                           Default ctor for compatibility with SPI
WhitespaceTokenizerFactory(Map<String,String> args)    Creates a new WhitespaceTokenizerFactory
Method Summary
Methods inherited from class org.apache.lucene.analysis.TokenizerFactory
availableTokenizers, create, findSPIName, forName, lookupClass, reloadTokenizers
Methods inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
defaultCtorException, get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
Field Details
NAME
SPI name
RULE_JAVA
RULE_UNICODE
Constructor Details
WhitespaceTokenizerFactory
Creates a new WhitespaceTokenizerFactory
WhitespaceTokenizerFactory
public WhitespaceTokenizerFactory()
Default ctor for compatibility with SPI
Method Details
create
Specified by:
create in class TokenizerFactory