SolrNet.ExtractParameters Class Reference
Contains parameters that can be specified when extracting a rich document to the index.
Public Member Functions | |
ExtractParameters (Stream content, string id, string resourceName) | |
Constructs a new ExtractParameters with required values. | |
ExtractParameters (FileStream content, string id) | |
Constructs a new ExtractParameters with required values. | |
Properties | |
string | Id [get, set] |
Provides the necessary unique id for the document being indexed. | |
string | ResourceName [get, set] |
Name of the file. Tika can use it as a hint for detecting the MIME type. | |
bool | AutoCommit [get, set] |
Causes Solr to do a commit after indexing the document, making it immediately searchable. | |
bool | ExtractOnly [get, set] |
If true, return the extracted content from Tika without indexing the document. This literally includes the extracted XHTML as a string in the response. | |
ExtractFormat | ExtractFormat [get, set] |
The format to specify for extraction. | |
bool | CaptureAttributes [get, set] |
Index attributes of the Tika XHTML elements into separate fields, named after the element. For example, when extracting from HTML, Tika can return the href attributes in <a> tags as fields named "a". | |
string | Capture [get, set] |
Name of a Tika XHTML element. Elements with this name are captured separately for adding to the Solr document. This can be useful for grabbing chunks of the XHTML into a separate field. For instance, it could be used to grab paragraphs (<p>) and index them into a separate field. | |
string | Prefix [get, set] |
Prefix all fields that are not defined in the schema with the given prefix. This is very useful when combined with dynamic field definitions. | |
string | DefaultField [get, set] |
If Prefix (Solr's uprefix parameter) is not specified and a field cannot be determined, the default field is used. | |
IEnumerable< ExtractField > | Fields [get, set] |
Collection of fields and their specified values. | |
string | XPath [get, set] |
When extracting, only return Tika XHTML content that satisfies the XPath expression. See http://lucene.apache.org/tika/documentation.html for details on the format of Tika XHTML. | |
bool | LowerNames [get, set] |
Map all field names to lowercase with underscores. For example, Content-Type would be mapped to content_type. | |
string | StreamType [get, set] |
MIME type of the file. If provided, Tika won't have to try to infer it from the ResourceName and content. | |
Stream | Content [get, set] |
The rich document to index. |
Detailed Description
Contains parameters that can be specified when extracting a rich document to the index.
See http://wiki.apache.org/solr/ExtractingRequestHandler#Input_Parameters
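As a rough illustration of how these parameters are typically used, the sketch below builds an ExtractParameters with the three-argument constructor and passes it to the Extract method of a SolrNet operations instance. The helper name IndexFile, the class name, and the way the solr instance is obtained are assumptions made for this example and are not part of the class documented here.

```csharp
using System.IO;
using SolrNet;

public static class ExtractExample
{
    // Hypothetical helper: 'solr' is assumed to be an already-configured
    // ISolrOperations<T> instance (how it is created is out of scope here).
    public static ExtractResponse IndexFile<T>(ISolrOperations<T> solr, string path, string id)
    {
        using (FileStream file = File.OpenRead(path))
        {
            // content = the rich document, id = unique document id,
            // resourceName = file name that Tika can use as a MIME type hint.
            var parameters = new ExtractParameters(file, id, Path.GetFileName(path));
            return solr.Extract(parameters);
        }
    }
}
```

The stream is disposed as soon as the request returns, so the caller only needs to supply a path and an id.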
Constructor & Destructor Documentation
SolrNet.ExtractParameters.ExtractParameters(Stream content, string id, string resourceName)
Constructs a new ExtractParameters with required values.
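- Parameters:
- content: The rich document to index.
- id: The unique id for the document being indexed.
- resourceName: Name of the file. Tika can use it as a hint for detecting the MIME type.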
SolrNet.ExtractParameters.ExtractParameters(FileStream content, string id)
Constructs a new ExtractParameters with required values.
- Parameters:
- content: The rich document to index.
- id: The unique id for the document being indexed.
Property Documentation
bool SolrNet.ExtractParameters.AutoCommit [get, set] |
Causes Solr to do a commit after indexing the document, making it immediately searchable.
For good performance when loading many documents, don't call commit until you are done.
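The advice above could look roughly like the following sketch, which indexes a batch of files with AutoCommit left off and issues a single commit at the end. The helper name IndexMany and the use of the file path as the document id are assumptions for illustration only.

```csharp
using System.Collections.Generic;
using System.IO;
using SolrNet;

public static class BulkExtractExample
{
    // Hypothetical helper: indexes every file, then commits once.
    public static void IndexMany<T>(ISolrOperations<T> solr, IEnumerable<string> paths)
    {
        foreach (string path in paths)
        {
            using (FileStream file = File.OpenRead(path))
            {
                // Assumption: the file path serves as the unique document id.
                var parameters = new ExtractParameters(file, path, Path.GetFileName(path))
                {
                    AutoCommit = false // do not commit after each document
                };
                solr.Extract(parameters);
            }
        }

        // One commit after the whole batch makes all documents searchable.
        solr.Commit();
    }
}
```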
string SolrNet.ExtractParameters.Capture [get, set] |
Name of a Tika XHTML element. Elements with this name are captured separately for adding to the Solr document. This can be useful for grabbing chunks of the XHTML into a separate field. For instance, it could be used to grab paragraphs (<p>) and index them into a separate field.
Content is also still captured into the overall "content" field.
bool SolrNet.ExtractParameters.CaptureAttributes [get, set] |
Index attributes of the Tika XHTML elements into separate fields, named after the element. For example, when extracting from HTML, Tika can return the href attributes in <a> tags as fields named "a".
Stream SolrNet.ExtractParameters.Content [get, set] |
The rich document to index.
string SolrNet.ExtractParameters.DefaultField [get, set] |
If Prefix (Solr's uprefix parameter) is not specified and a field cannot be determined, the default field is used.
ExtractFormat SolrNet.ExtractParameters.ExtractFormat [get, set] |
The format to specify for extraction.
bool SolrNet.ExtractParameters.ExtractOnly [get, set] |
If true, return the extracted content from Tika without indexing the document. This literally includes the extracted XHTML as a string in the response.
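A minimal sketch of extract-only usage, assuming the response object exposes the extracted text through a Content property and that ExtractFormat offers a Text value. The helper name ExtractText is hypothetical.

```csharp
using System.IO;
using SolrNet;

public static class ExtractOnlyExample
{
    // Hypothetical helper: returns the text Tika extracts, without indexing anything.
    public static string ExtractText<T>(ISolrOperations<T> solr, string path)
    {
        using (FileStream file = File.OpenRead(path))
        {
            // The id is still required by the constructor, even though
            // nothing is written to the index in extract-only mode.
            var parameters = new ExtractParameters(file, "extract-only", Path.GetFileName(path))
            {
                ExtractOnly = true,
                ExtractFormat = ExtractFormat.Text // plain text instead of XHTML
            };
            ExtractResponse response = solr.Extract(parameters);
            return response.Content;
        }
    }
}
```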
IEnumerable<ExtractField> SolrNet.ExtractParameters.Fields [get, set] |
Collection of fields and their specified values.
string SolrNet.ExtractParameters.Id [get, set] |
Provides the necessary unique id for the document being indexed.
string SolrNet.ExtractParameters.ResourceName [get, set] |
Name of the file. Tika can use it as a hint for detecting the MIME type.
bool SolrNet.ExtractParameters.LowerNames [get, set] |
Map all field names to lowercase with underscores. For example, Content-Type would be mapped to content_type.
string SolrNet.ExtractParameters.Prefix [get, set] |
Prefix all fields that are not defined in the schema with the given prefix. This is very useful when combined with dynamic field definitions.
Setting Prefix to "ignored_" would effectively ignore all unknown fields generated by Tika, given that the example schema contains <dynamicField name="ignored_*" type="ignored"/>
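A short sketch of the prefix technique, assuming the target schema defines a dynamic field matching ignored_* as in the Solr example schema. The helper name IndexIgnoringUnknownFields is hypothetical.

```csharp
using System.IO;
using SolrNet;

public static class PrefixExample
{
    // Hypothetical helper: unknown Tika fields are routed to ignored_* dynamic fields.
    public static void IndexIgnoringUnknownFields<T>(ISolrOperations<T> solr, string path, string id)
    {
        using (FileStream file = File.OpenRead(path))
        {
            var parameters = new ExtractParameters(file, id, Path.GetFileName(path))
            {
                // Assumption: the schema has <dynamicField name="ignored_*" type="ignored"/>.
                Prefix = "ignored_",
                LowerNames = true // e.g. Content-Type becomes content_type
            };
            solr.Extract(parameters);
        }
    }
}
```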
string SolrNet.ExtractParameters.StreamType [get, set] |
MIME type of the file. If provided, Tika won't have to try to infer it from the ResourceName and content.
string SolrNet.ExtractParameters.XPath [get, set] |
When extracting, only return Tika XHTML content that satisfies the XPath expression. See http://lucene.apache.org/tika/documentation.html for details on the format of Tika XHTML.
The documentation for this class was generated from the following file:
- SolrNet/ExtractParameters.cs
Generated on Sun May 3 2015 17:19:05 for SolrNet by 1.7.2