Package org.apache.lucene.codecs
Class TermVectorsWriter
java.lang.Object
org.apache.lucene.codecs.TermVectorsWriter
- All Implemented Interfaces:
Closeable, AutoCloseable, Accountable
- Direct Known Subclasses:
Lucene90CompressingTermVectorsWriter
Codec API for writing term vectors:
- For every document, startDocument(int) is called, informing the Codec how many fields will be written.
- startField(FieldInfo, int, boolean, boolean, boolean) is called for each field in the document, informing the codec how many terms will be written for that field, and whether or not positions, offsets, or payloads are enabled.
- Within each field, startTerm(BytesRef, int) is called for each term.
- If offsets and/or positions are enabled, then addPosition(int, int, int, BytesRef) will be called for each term occurrence.
- After all documents have been written, finish(int) is called for verification/sanity-checks.
- Finally the writer is closed (close()).
- WARNING: This API is experimental and might change in incompatible ways in the next release.
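The call order described above can be sketched as follows. This is only an illustration, not Lucene code: RecordingVectorsSketch is a hypothetical stand-in whose methods mirror the callback names of this class but take simplified parameters (a String in place of BytesRef, a field name in place of FieldInfo, no payload argument), and the sample document is made up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, NOT org.apache.lucene.codecs.TermVectorsWriter.
// It mirrors the callback names with simplified parameter types and only
// records the order in which the callbacks are invoked.
class RecordingVectorsSketch {
  final List<String> calls = new ArrayList<>();

  void startDocument(int numVectorFields) {
    calls.add("startDocument(" + numVectorFields + ")");
  }

  void startField(String field, int numTerms, boolean positions, boolean offsets, boolean payloads) {
    calls.add("startField(" + field + ", " + numTerms + ")");
  }

  void startTerm(String term, int freq) {
    calls.add("startTerm(" + term + ", " + freq + ")");
  }

  void addPosition(int position, int startOffset, int endOffset) {
    calls.add("addPosition(" + position + ", " + startOffset + ", " + endOffset + ")");
  }

  void finishTerm() { calls.add("finishTerm"); }
  void finishField() { calls.add("finishField"); }
  void finishDocument() { calls.add("finishDocument"); }
  void finish(int numDocs) { calls.add("finish(" + numDocs + ")"); }
  void close() { calls.add("close"); }

  // Drives the callbacks for one document with a single field "body" holding
  // the text "a b a": term "a" occurs at positions 0 and 2, term "b" at 1.
  static List<String> writeOneDocument() {
    RecordingVectorsSketch w = new RecordingVectorsSketch();
    w.startDocument(1);                          // one vector field follows
    w.startField("body", 2, true, true, false);  // two distinct terms
    w.startTerm("a", 2);                         // freq 2, so two addPosition calls
    w.addPosition(0, 0, 1);
    w.addPosition(2, 4, 5);
    w.finishTerm();
    w.startTerm("b", 1);                         // freq 1, so one addPosition call
    w.addPosition(1, 2, 3);
    w.finishTerm();
    w.finishField();
    w.finishDocument();
    w.finish(1);                                 // one document was written
    w.close();
    return w.calls;
  }
}
```

Calling RecordingVectorsSketch.writeOneDocument() returns the recorded calls in order, which makes the nesting of document, field, term, and position callbacks easy to inspect.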
-
Field Summary
Fields inherited from interface org.apache.lucene.util.Accountable
NULL_ACCOUNTABLE -
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
protected final void | addAllDocVectors(Fields vectors, MergeState mergeState) | Safe (but slowish) default method to write every vector field in the document.
abstract void | addPosition(int position, int startOffset, int endOffset, BytesRef payload) | Adds a term position and offsets.
void | addProx(int numProx, DataInput positions, DataInput offsets) | Called by IndexWriter when writing new segments.
abstract void | close() |
abstract void | finish(int numDocs) | Called before close(), passing in the number of documents that were written.
void | finishDocument() | Called after a doc and all its fields have been added.
void | finishField() | Called after a field and all its terms have been added.
void | finishTerm() | Called after a term and all its positions have been added.
int | merge(MergeState mergeState) | Merges in the term vectors from the readers in mergeState.
abstract void | startDocument(int numVectorFields) | Called before writing the term vectors of the document.
abstract void | startField(FieldInfo info, int numTerms, boolean positions, boolean offsets, boolean payloads) | Called before writing the terms of the field.
abstract void | startTerm(BytesRef term, int freq) | Adds a term and its term frequency freq.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.lucene.util.Accountable
getChildResources, ramBytesUsed
-
Constructor Details
-
TermVectorsWriter
protected TermVectorsWriter()
Sole constructor. (For invocation by subclass constructors, typically implicit.)
-
-
Method Details
-
startDocument
Called before writing the term vectors of the document. startField(FieldInfo, int, boolean, boolean, boolean) will be called numVectorFields times. Note that if term vectors are enabled, this is called even if the document has no vector fields; in this case numVectorFields will be zero.
- Throws:
IOException
-
finishDocument
Called after a doc and all its fields have been added.
- Throws:
IOException
-
startField
public abstract void startField(FieldInfo info, int numTerms, boolean positions, boolean offsets, boolean payloads) throws IOException
Called before writing the terms of the field. startTerm(BytesRef, int) will be called numTerms times.
- Throws:
IOException
-
finishField
Called after a field and all its terms have been added.
- Throws:
IOException
-
startTerm
Adds a term and its term frequency freq. If this field has positions and/or offsets enabled, then addPosition(int, int, int, BytesRef) will be called freq times.
- Throws:
IOException
-
finishTerm
Called after a term and all its positions have been added.
- Throws:
IOException
-
addPosition
public abstract void addPosition(int position, int startOffset, int endOffset, BytesRef payload) throws IOException
Adds a term position and offsets.
- Throws:
IOException
-
finish
Called before close(), passing in the number of documents that were written. Note that this is intentionally redundant (equivalent to the number of calls to startDocument(int)), but a Codec should check that this is the case to detect the JRE bug described in LUCENE-1282.
- Throws:
IOException
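The redundancy check that finish(int) asks for can be sketched as below. This is a hedged illustration under stated assumptions: CountingVectorsSketch is a hypothetical stand-in rather than a Lucene class, and throwing IllegalStateException on a mismatch is an assumption of this sketch, not something this page specifies.

```java
// Hypothetical stand-in, NOT a Lucene class: counts startDocument calls and
// compares that count with the numDocs value later passed to finish.
class CountingVectorsSketch {
  private int docsStarted = 0;

  void startDocument(int numVectorFields) {
    docsStarted++; // called once per document, even when numVectorFields is 0
  }

  void finish(int numDocs) {
    // Assumption of this sketch: a mismatch is reported with an unchecked exception.
    if (docsStarted != numDocs) {
      throw new IllegalStateException(
          "wrote " + docsStarted + " docs but finish was called with numDocs=" + numDocs);
    }
  }

  // Helper for experimentation: returns true when finish(numDocs) accepts the
  // count after docsWritten calls to startDocument.
  static boolean finishAccepts(int docsWritten, int numDocs) {
    CountingVectorsSketch w = new CountingVectorsSketch();
    for (int i = 0; i < docsWritten; i++) {
      w.startDocument(0);
    }
    try {
      w.finish(numDocs);
      return true;
    } catch (IllegalStateException e) {
      return false;
    }
  }
}
```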
-
addProx
Called by IndexWriter when writing new segments.
This is an expert API that allows the codec to consume positions and offsets directly from the indexer.
The default implementation calls addPosition(int, int, int, BytesRef), but subclasses can override this if they want to efficiently write all the positions, then all the offsets, for example.
NOTE: This API is extremely expert and subject to change or removal!!!
- Throws:
IOException
- NOTE: This API is for internal purposes only and might change in incompatible ways in the next release.
-
merge
Merges in the term vectors from the readers in mergeState. The default implementation skips over deleted documents, and uses startDocument(int), startField(FieldInfo, int, boolean, boolean, boolean), startTerm(BytesRef, int), addPosition(int, int, int, BytesRef), and finish(int), returning the number of documents that were written. Implementations can override this method for more sophisticated merging (bulk-byte copying, etc).
- Throws:
IOException
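The skip-deleted-documents behaviour of the default merge can be illustrated with a small sketch. Everything here is a simplified assumption: MergeVectorsSketch and SegmentSketch are hypothetical stand-ins for the writer and for the readers in mergeState, a java.util.BitSet stands in for the live-docs information, and the per-document field, term, and position callbacks are reduced to recording a label.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical stand-in, NOT Lucene code: replays live documents from several
// segments and returns how many documents were written.
class MergeVectorsSketch {

  // Hypothetical segment: maxDoc documents; liveDocs == null means no deletions,
  // otherwise a set bit marks a live (not deleted) document.
  static final class SegmentSketch {
    final int maxDoc;
    final BitSet liveDocs;

    SegmentSketch(int maxDoc, BitSet liveDocs) {
      this.maxDoc = maxDoc;
      this.liveDocs = liveDocs;
    }
  }

  // Labels of the documents that were replayed, in write order.
  final List<String> written = new ArrayList<>();

  int merge(List<SegmentSketch> segments) {
    int docCount = 0;
    for (int seg = 0; seg < segments.size(); seg++) {
      SegmentSketch s = segments.get(seg);
      for (int doc = 0; doc < s.maxDoc; doc++) {
        if (s.liveDocs != null && !s.liveDocs.get(doc)) {
          continue; // deleted document: skipped, nothing is written for it
        }
        // A real writer would replay the start/finish callbacks for this
        // document here; the sketch only records that it was written.
        written.add("seg" + seg + "/doc" + doc);
        docCount++;
      }
    }
    return docCount; // number of documents written
  }

  // Example input: segment 0 has 3 docs with doc 1 deleted, segment 1 has 2 live docs.
  static MergeVectorsSketch runExample() {
    BitSet live = new BitSet(3);
    live.set(0);
    live.set(2);
    List<SegmentSketch> segments = new ArrayList<>();
    segments.add(new SegmentSketch(3, live));
    segments.add(new SegmentSketch(2, null));
    MergeVectorsSketch m = new MergeVectorsSketch();
    m.merge(segments);
    return m;
  }
}
```

An override that copies bytes in bulk would replace the inner loop, but it would still need to return the same document count.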
-
addAllDocVectors
Safe (but slowish) default method to write every vector field in the document.
- Throws:
IOException
-
close
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
- Throws:
IOException
-