PatternTokenizerFactory

Factory for PatternTokenizer. This tokenizer uses regular-expression matching to break the input stream into tokens. It takes two arguments: "pattern" and "group".

  • "pattern" is the regular expression.
  • "group" selects which capture group to extract into tokens.

group=-1 (the default) is equivalent to "split". In this case, the tokens are the same as the output of String.split, with empty tokens removed.
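As a rough illustration of the split mode, the sketch below uses only the Kotlin standard library rather than the tokenizer itself. The helper name splitTokens and the sample pattern are hypothetical. It approximates the described behaviour by splitting on the pattern and dropping empty strings.

```kotlin
// Hypothetical helper, not part of PatternTokenizerFactory: approximates
// group=-1 by splitting on the pattern and discarding empty tokens.
fun splitTokens(input: String, pattern: String): List<String> =
    input.split(Regex(pattern)).filter { it.isNotEmpty() }

fun main() {
    // Split on runs of commas and whitespace; the trailing space would
    // otherwise produce an empty token at the end.
    println(splitTokens("aaa, bbb,,ccc ", "[,\\s]+")) // prints [aaa, bbb, ccc]
}
```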

Using group >= 0 selects the matching group as the token. For example, if you have:

pattern = \'([^\']+)\'
group = 0
input = aaa 'bbb' 'ccc'

the output will be two tokens: 'bbb' and 'ccc' (including the ' marks). With the same input but group=1, the output would be bbb and ccc (no ' marks).
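The group example above can be reproduced with a plain regex, as in the following sketch. It does not call the tokenizer. The helper groupTokens is a hypothetical stand-in that collects the chosen group from every match. The backslashes before the quote marks in the pattern are omitted here because a quote needs no escaping in a regular expression.

```kotlin
// Hypothetical helper, not part of PatternTokenizerFactory: approximates
// group >= 0 by collecting the chosen group from every match.
fun groupTokens(input: String, pattern: String, group: Int): List<String> =
    Regex(pattern).findAll(input)
        .map { it.groupValues[group] }
        .filter { it.isNotEmpty() } // zero-length tokens are skipped
        .toList()

fun main() {
    val input = "aaa 'bbb' 'ccc'"
    val pattern = "'([^']+)'"
    println(groupTokens(input, pattern, 0)) // prints ['bbb', 'ccc']
    println(groupTokens(input, pattern, 1)) // prints [bbb, ccc]
}
```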

NOTE: This Tokenizer does not output tokens that are of zero length.

<fieldType name="text_ptn" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.PatternTokenizerFactory" pattern="\'([^\']+)\'" group="1"/>
  </analyzer>
</fieldType>

Since

solr1.2

Constructors

constructor(args: MutableMap<String, String>)

Creates a new PatternTokenizerFactory.

constructor()

Default constructor for compatibility with SPI.

Types

object Companion

Properties

lateinit var originalArgs: Map<String, String>

Functions

open override fun create(factory: AttributeFactory): PatternTokenizer

Splits the input using the configured pattern.

fun get(args: MutableMap<String, String>, name: String): String?
fun get(args: MutableMap<String, String>, name: String, defaultVal: String): String
fun get(args: MutableMap<String, String>, name: String, allowedValues: MutableCollection<String>, defaultVal: String?, caseSensitive: Boolean): String?
fun getChar(args: MutableMap<String, String>, name: String, defaultValue: Char): Char
fun require(args: MutableMap<String, String>, name: String, allowedValues: MutableCollection<String>, caseSensitive: Boolean): String