Contains parameters that can be specified when extracting a rich document to the index.

Public Member Functions
	ExtractParameters (Stream content, string id, string resourceName)
	Constructs a new ExtractParameters with required values.
	ExtractParameters (FileStream content, string id)
	Constructs a new ExtractParameters with required values.
Properties
string	Id `[get, set]`
	Unique id of the document being indexed.
string	ResourceName `[get, set]`
	Name of the file. Tika can use it as a hint for detecting the MIME type.
bool	AutoCommit `[get, set]`
	Causes Solr to do a commit after indexing the document, making it immediately searchable.
bool	ExtractOnly `[get, set]`
	If true, return the extracted content from Tika without indexing the document. This literally includes the extracted XHTML as a string in the response.
ExtractFormat	ExtractFormat `[get, set]`
	The format to specify for extraction.
bool	CaptureAttributes `[get, set]`
	Index attributes of the Tika XHTML elements into separate fields, named after the element. For example, when extracting from HTML, Tika can return the href attributes in <a> tags as fields named "a".
string	Capture `[get, set]`
	Name of a Tika XHTML element to capture separately for adding to the Solr document. This can be useful for grabbing chunks of the XHTML into a separate field. For instance, it could be used to grab paragraphs (<p>) and index them into a separate field.
string	Prefix `[get, set]`
	Prefix all fields that are not defined in the schema with the given prefix. This is very useful when combined with dynamic field definitions.
string	DefaultField `[get, set]`
	If Prefix (Solr's uprefix parameter) is not specified and a field cannot be determined, the default field is used.
IEnumerable< ExtractField >	Fields `[get, set]`
	Collection of fields and their specified values.
string	XPath `[get, set]`
	When extracting, only return Tika XHTML content that satisfies the XPath expression. See http://lucene.apache.org/tika/documentation.html for details on the format of Tika XHTML.
bool	LowerNames `[get, set]`
	Map all field names to lowercase with underscores. For example, Content-Type would be mapped to content_type.
string	StreamType `[get, set]`
	Mime type of the file - if provided, Tika won't have to try to infer it from the ResourceName and content.
Stream	Content `[get, set]`
	The rich document to index.

Detailed Description

Contains parameters that can be specified when extracting a rich document to the index.

See http://wiki.apache.org/solr/ExtractingRequestHandler#Input_Parameters
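A minimal usage sketch follows, assuming `solr` is an `ISolrOperations<T>` instance obtained through SolrNet's usual initialization and that the Solr server has the extracting request handler enabled. The class name, method names, and the option values chosen here are illustrative only.

```csharp
using System.IO;
using SolrNet;

public static class ExtractExample
{
    // Builds the parameters for one document. The option values are
    // illustrative; set only the ones your schema needs.
    public static ExtractParameters BuildParameters(Stream content, string id, string resourceName)
    {
        return new ExtractParameters(content, id, resourceName)
        {
            AutoCommit = false,             // commit separately, after a batch
            LowerNames = true,              // e.g. Content-Type -> content_type
            StreamType = "application/pdf"  // assumed known here, so Tika need not infer it
        };
    }

    // Sends the file at `path` to Solr for extraction and indexing.
    public static ExtractResponse Index<T>(ISolrOperations<T> solr, string path, string id)
    {
        using (var file = File.OpenRead(path))
        {
            return solr.Extract(BuildParameters(file, id, Path.GetFileName(path)));
        }
    }
}
```

Passing the file name as `resourceName` gives Tika a hint for MIME detection when `StreamType` is not set.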

Constructor & Destructor Documentation

SolrNet.ExtractParameters.ExtractParameters	(	Stream	content,
		string	id,
		string	resourceName
	)

Constructs a new ExtractParameters with required values.

Parameters:

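content	The rich document to index.
id	Unique id of the document being indexed.
resourceName	Name of the file. Tika can use it as a hint for detecting the MIME type.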

SolrNet.ExtractParameters.ExtractParameters	(	FileStream	content,
		string	id
	)

Constructs a new ExtractParameters with required values.

Parameters:

content	The rich document to index.
id	Unique id of the document being indexed.

Property Documentation

bool SolrNet.ExtractParameters.AutoCommit [get, set]

Causes Solr to do a commit after indexing the document, making it immediately searchable.

For good performance when loading many documents, don't call commit until you are done.
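The advice above can be sketched as follows: leave `AutoCommit` off for each document and commit once at the end. This assumes `solr` is an initialized `ISolrOperations<T>`; the class and method names, and the use of the file path as the document id, are illustrative choices.

```csharp
using System.Collections.Generic;
using System.IO;
using SolrNet;

public static class BulkExtract
{
    // Parameters for one document in a batch: no per-document commit.
    public static ExtractParameters ForBatch(Stream content, string id, string resourceName)
    {
        return new ExtractParameters(content, id, resourceName) { AutoCommit = false };
    }

    // Indexes every file, then commits once so all documents become
    // searchable together.
    public static void IndexAll<T>(ISolrOperations<T> solr, IEnumerable<string> paths)
    {
        foreach (var path in paths)
        {
            using (var file = File.OpenRead(path))
            {
                // Using the path as the id is an assumption for this sketch.
                solr.Extract(ForBatch(file, path, Path.GetFileName(path)));
            }
        }
        solr.Commit();
    }
}
```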

string SolrNet.ExtractParameters.Capture [get, set]

Name of a Tika XHTML element to capture separately for adding to the Solr document. This can be useful for grabbing chunks of the XHTML into a separate field. For instance, it could be used to grab paragraphs (<p>) and index them into a separate field.

Content is also still captured into the overall "content" field.

bool SolrNet.ExtractParameters.CaptureAttributes [get, set]

Index attributes of the Tika XHTML elements into separate fields, named after the element. For example, when extracting from HTML, Tika can return the href attributes in <a> tags as fields named "a".

Stream SolrNet.ExtractParameters.Content [get, set]

The rich document to index.

string SolrNet.ExtractParameters.DefaultField [get, set]

If Prefix (Solr's uprefix parameter) is not specified and a field cannot be determined, the default field is used.

ExtractFormat SolrNet.ExtractParameters.ExtractFormat [get, set]

The format to specify for extraction.

bool SolrNet.ExtractParameters.ExtractOnly [get, set]

If true, return the extracted content from Tika without indexing the document. This literally includes the extracted XHTML as a string in the response.
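As a sketch of extract-only use, the following requests the extracted text without indexing anything. It assumes an initialized `ISolrOperations<T>` named `solr`; the class name, method names, and the placeholder id are illustrative.

```csharp
using System.IO;
using SolrNet;

public static class ExtractOnlyExample
{
    // Parameters that ask Solr to return Tika's output instead of indexing it.
    public static ExtractParameters TextOnly(Stream content, string resourceName)
    {
        // The id is still a required constructor argument; the value used
        // here is a placeholder, since nothing is indexed.
        return new ExtractParameters(content, "extract-only", resourceName)
        {
            ExtractOnly = true,
            ExtractFormat = ExtractFormat.Text
        };
    }

    // Returns the extracted content carried in the response.
    public static string ReadText<T>(ISolrOperations<T> solr, string path)
    {
        using (var file = File.OpenRead(path))
        {
            ExtractResponse response = solr.Extract(TextOnly(file, Path.GetFileName(path)));
            return response.Content;
        }
    }
}
```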

IEnumerable<ExtractField> SolrNet.ExtractParameters.Fields [get, set]

Collection of fields and their specified values.

string SolrNet.ExtractParameters.Id [get, set]

Unique id of the document being indexed.

string SolrNet.ExtractParameters.ResourceName [get, set]

Name of the file. Tika can use it as a hint for detecting the MIME type.

bool SolrNet.ExtractParameters.LowerNames [get, set]

Map all field names to lowercase with underscores. For example, Content-Type would be mapped to content_type.
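The mapping itself is applied by Solr on the server, not by SolrNet. The sketch below only approximates it for illustration, under the assumption that every character that is not a letter or digit becomes an underscore; the helper name is hypothetical.

```csharp
using System.Text;

public static class LowerNamesSketch
{
    // Approximates the server-side mapping: lowercase letters and digits,
    // replace everything else with '_'. Not part of SolrNet.
    public static string Normalize(string fieldName)
    {
        var sb = new StringBuilder(fieldName.Length);
        foreach (var ch in fieldName)
        {
            sb.Append(char.IsLetterOrDigit(ch) ? char.ToLowerInvariant(ch) : '_');
        }
        return sb.ToString();
    }
}
```

With this sketch, `LowerNamesSketch.Normalize("Content-Type")` returns `"content_type"`, matching the example above.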

string SolrNet.ExtractParameters.Prefix [get, set]

Prefix all fields that are not defined in the schema with the given prefix. This is very useful when combined with dynamic field definitions.

For example, setting Prefix to "ignored_" would effectively ignore all unknown fields generated by Tika, given that the example schema contains <dynamicField name="ignored_*" type="ignored"/>.
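A short sketch of combining a prefix with a dynamic field follows. It assumes the schema defines a dynamic field matching `ignored_*`, as in the Solr example schema mentioned above; the class and method names are illustrative.

```csharp
using System.IO;
using SolrNet;

public static class PrefixExample
{
    // Unknown Tika fields are renamed to ignored_<name>, which the assumed
    // ignored_* dynamic field then matches.
    public static ExtractParameters IgnoreUnknownFields(Stream content, string id, string resourceName)
    {
        return new ExtractParameters(content, id, resourceName)
        {
            Prefix = "ignored_"
        };
    }
}
```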

string SolrNet.ExtractParameters.StreamType [get, set]

Mime type of the file - if provided, Tika won't have to try to infer it from the ResourceName and content.

string SolrNet.ExtractParameters.XPath [get, set]

When extracting, only return Tika XHTML content that satisfies the XPath expression. See http://lucene.apache.org/tika/documentation.html for details on the format of Tika XHTML.

The documentation for this class was generated from the following file:

SolrNet/ExtractParameters.cs

SolrNet: SolrNet.ExtractParameters Class Reference
