Tokenizer
A Tokenizer is a TokenStream whose input is a Reader.
This is an abstract class; subclasses must override .incrementToken.
NOTE: Subclasses overriding .incrementToken must call AttributeSource.clearAttributes before setting attributes.
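The contract in the note above can be illustrated with a small, self-contained sketch. It deliberately does not use the library's classes: `SketchTokenizer`, `WhitespaceSketchTokenizer`, `SketchDemo`, and the `term` and `keyword` fields are hypothetical stand-ins invented for this example. The `keyword` flag exists only to show why clearing comes first: without the reset, a value set for one token would leak into the next.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a token stream whose input is a Reader.
// This is NOT the library's Tokenizer; it only mirrors the contract above.
abstract class SketchTokenizer {
    protected final Reader input;
    // Toy "attributes": the term text and a per-token flag.
    protected final StringBuilder term = new StringBuilder();
    protected boolean keyword;

    SketchTokenizer(Reader input) {
        this.input = input;
    }

    // Resets every toy attribute to its default value.
    protected final void clearAttributes() {
        term.setLength(0);
        keyword = false;
    }

    // Advances to the next token; returns false at end of input.
    abstract boolean incrementToken() throws IOException;
}

class WhitespaceSketchTokenizer extends SketchTokenizer {
    WhitespaceSketchTokenizer(Reader input) {
        super(input);
    }

    @Override
    boolean incrementToken() throws IOException {
        clearAttributes(); // first, so no stale values survive from the previous token
        int c;
        // Skip leading whitespace.
        while ((c = input.read()) != -1 && Character.isWhitespace(c)) {
            // nothing to do
        }
        if (c == -1) {
            return false; // end of input, no token produced
        }
        // Accumulate characters until the next whitespace or end of input.
        do {
            term.append((char) c);
        } while ((c = input.read()) != -1 && !Character.isWhitespace(c));
        // Only '#'-prefixed terms set the flag; all others rely on the reset above.
        if (term.charAt(0) == '#') {
            keyword = true;
        }
        return true;
    }
}

class SketchDemo {
    // Consumer loop: advance the stream until incrementToken reports the end.
    static List<String> run(String text) {
        SketchTokenizer t = new WhitespaceSketchTokenizer(new StringReader(text));
        List<String> out = new ArrayList<>();
        try {
            while (t.incrementToken()) {
                out.add(t.term + (t.keyword ? "[kw]" : ""));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```

In this sketch, `SketchDemo.run("a #b c")` yields `a`, `#b[kw]`, `c`. If the `clearAttributes()` call were removed, the flag set for `#b` would still be set when `c` is emitted, and the term buffer would keep growing across tokens.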
Inheritors
Functions
Expert: Adds a custom AttributeImpl instance with one or more Attribute interfaces.
Captures the state of all Attributes. The return value can be passed to .restoreState to restore the state of this or another AttributeSource.
Resets all Attributes in this AttributeSource by calling AttributeImpl.clear on each Attribute implementation.
Clones all AttributeImpl instances and returns them in a new AttributeSource instance. This method can be used, for example, to create another TokenStream with exactly the same attributes (using the AttributeSource constructor that takes an AttributeSource). You can also use it as a (slower) replacement for .captureState if you need to inspect or modify the captured state.
Copies the contents of this AttributeSource to the given target AttributeSource. The given instance has to provide all Attributes this instance contains. The actual attribute implementations must be identical in both AttributeSource instances; ideally both AttributeSource instances should use the same AttributeFactory. You can use this method as a replacement for .restoreState if you use .cloneAttributes instead of .captureState.
Resets all Attributes in this AttributeSource by calling AttributeImpl.end on each Attribute implementation.
The caller must pass in a Class value. Returns true, iff this AttributeSource contains the passed-in Attribute.
Returns true, iff this AttributeSource has any attributes.
Consumers (e.g., IndexWriter) use this method to advance the stream to the next token. Implementing classes must implement this method and update the appropriate AttributeImpls with the attributes of the next token.
Returns the current attribute values as a string, built by calling the .reflectWith method.
This method is for introspection of attributes; it should simply add the key/value pairs this AttributeSource holds to the given AttributeReflector.
Removes all attributes and their implementations from this AttributeSource.
Restores this state by copying the values of all attribute implementations that this state contains into the attribute implementations of the targetStream. The targetStream must contain a corresponding instance for each argument contained in this state (e.g. it is not possible to restore the state of an AttributeSource containing a TermAttribute into an AttributeSource using a Token instance as implementation).
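The capture and restore pair described above can be pictured with a toy model. The sketch below is not the library's implementation: `SketchAttributeSource`, its string-keyed attribute map, and the choice to throw `IllegalArgumentException` when the target lacks an attribute are all assumptions made for illustration. It shows only the two properties the descriptions state: a captured state is a snapshot that later changes do not affect, and restoring requires the target to hold every attribute the state contains.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical miniature of an attribute container; NOT the library's AttributeSource.
class SketchAttributeSource {
    private final Map<String, String> attributes = new LinkedHashMap<>();

    // Opaque snapshot of all attribute values at capture time.
    static final class State {
        private final Map<String, String> values;

        private State(Map<String, String> values) {
            this.values = values;
        }
    }

    // Registers an attribute with an empty default value.
    void addAttribute(String name) {
        attributes.putIfAbsent(name, "");
    }

    void set(String name, String value) {
        if (!attributes.containsKey(name)) {
            throw new IllegalArgumentException("unknown attribute: " + name);
        }
        attributes.put(name, value);
    }

    String get(String name) {
        return attributes.get(name);
    }

    // Copies the current values so later changes do not affect the snapshot.
    State captureState() {
        return new State(new LinkedHashMap<>(attributes));
    }

    // Copies snapshot values back; this source must hold every captured attribute.
    void restoreState(State state) {
        for (String name : state.values.keySet()) {
            if (!attributes.containsKey(name)) {
                throw new IllegalArgumentException("state contains an attribute this source lacks: " + name);
            }
        }
        attributes.putAll(state.values);
    }
}
```

A typical use is lookahead: a filter captures the state of the current token, advances the stream to peek at the next one, and then restores the saved state to emit the earlier token unchanged.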