HTMLparser overview
Public Static Methods
Public Instance Constructors
HTMLparser | Overloaded. Initializes a new instance of the HTMLparser class. |
Public Instance Fields
bAutoExtractBetweenTagsOnly | If true (and either bAutoKeepComments or bAutoKeepScripts is true), oHTML is set to the data BETWEEN the tags, excluding the tags themselves; otherwise the FULL HTML is set. For example, '<!-- comments -->' is returned in full by default, but if this flag is true then only ' comments ' is returned |
bAutoKeepComments | If true (default), the HTML for the comment tags themselves AND the text between them is stored in the oHTML variable; otherwise oHTML is left empty, but you can always set it later |
bAutoKeepScripts | If true (default: false), the HTML for the script tags themselves AND the text between them is stored in the oHTML variable; otherwise oHTML is left empty, but you can always set it later |
bAutoMarkClosedTagsWithParamsAsOpen | By default, a tag that is closed BUT has parameters is treated as an open tag; this is not correct for proper XML parsing |
bCompressWhiteSpaceBeforeTag | If true (default), all whitespace before the start of a tag is compressed to a single space character (32 or 0x20), which makes the parser run a bit faster; if you need the exact whitespace before tags, set this flag to false |
oHE | Heuristics engine used by the tag parser to quickly match known tag and attribute names; it can be disabled, or you can add more tags to it to fit your most likely cases. It is currently tuned for HTML |
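The effect of the comment and whitespace flags above can be illustrated with a small standalone model. This is a minimal sketch in Python, not the library's code: the function names `comment_html` and `compress_whitespace_before_tags` are hypothetical, and the logic only mirrors the behaviour described in the field descriptions.

```python
import re


def comment_html(raw_comment: str,
                 keep_comments: bool = True,
                 between_tags_only: bool = False) -> str:
    """Model of what oHTML would hold for a comment chunk.

    keep_comments     stands in for bAutoKeepComments (hypothetical name)
    between_tags_only stands in for bAutoExtractBetweenTagsOnly
    """
    if not keep_comments:
        # Documented behaviour: oHTML is left empty.
        return ""
    if (between_tags_only
            and raw_comment.startswith("<!--")
            and raw_comment.endswith("-->")):
        # Drop the '<!--' and '-->' delimiters, keep the inner text verbatim.
        return raw_comment[4:-3]
    # Default: the full HTML, delimiters included.
    return raw_comment


def compress_whitespace_before_tags(html: str) -> str:
    """Model of bCompressWhiteSpaceBeforeTag: collapse each whitespace run
    that directly precedes '<' into a single space (0x20)."""
    return re.sub(r"\s+(?=<)", " ", html)


full = comment_html("<!-- comments -->")
inner = comment_html("<!-- comments -->", between_tags_only=True)
compact = compress_whitespace_before_tags("a \n\t <b>x  y</b>")
```

In this model `full` keeps the delimiters, `inner` is only the text between them, and whitespace that is not directly before a tag (the two spaces in `x  y`) is left untouched.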
Public Instance Properties
Public Instance Methods
ChangeToEntities | Parses a line and changes known entity characters into proper HTML entities |
CleanUp | Cleans up parser in preparation for next parsing |
Close | Closes object and releases all allocated resources |
Dispose | |
Equals (inherited from Object) | Determines whether the specified Object is equal to the current Object. |
GetHashCode (inherited from Object) | Serves as a hash function for a particular type, suitable for use in hashing algorithms and data structures like a hash table. |
GetType (inherited from Object) | Gets the Type of the current instance. |
Init | Overloaded. Initialises the parser with the HTML to be parsed from the provided string |
InitMiniEntities | Initialises mini-entities mode: only "nbsp" will be converted into a space, all other entities will be left as is |
LoadFromFile | Loads HTML from file |
ParseNext | Parses next chunk and returns it with |
ParseNextTag | Returns the next tag, or null at the end of the document; text is ignored completely |
Reset | Resets current parsed data to start |
SetChunkHashMode | Sets chunk param hash mode |
SetEncoding | Overloaded. Sets encoding |
SetRawHTML | Sets oHTML variable in a chunk to the raw HTML that was parsed for that chunk. |
ToString (inherited from Object) | Returns a String that represents the current Object. |
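The two entity-related methods listed above work in opposite directions: ChangeToEntities encodes characters as entities, while mini-entities mode limits decoding to "nbsp". The following Python sketch models that difference. It is an illustration under stated assumptions, not the library's implementation: the function names are hypothetical, and the `KNOWN_ENTITIES` table is an assumed subset, since the page does not list which characters the parser treats as known.

```python
# Assumed, illustrative subset of "known" entity characters.
KNOWN_ENTITIES = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
}


def change_to_entities(line: str) -> str:
    """Model of ChangeToEntities: replace each known character with its
    HTML entity, one character at a time so '&' is never encoded twice."""
    return "".join(KNOWN_ENTITIES.get(ch, ch) for ch in line)


def decode_mini_entities(text: str) -> str:
    """Model of mini-entities mode: only 'nbsp' becomes a space;
    every other entity is left exactly as written."""
    return text.replace("&nbsp;", " ")


encoded = change_to_entities("a < b & c")
decoded = decode_mini_entities("a&nbsp;b &amp; c")
```

Here `encoded` has `<` and `&` replaced by entities, while `decoded` turns only `&nbsp;` into a space and leaves `&amp;` alone.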
Protected Instance Methods
Finalize (inherited from Object) | Allows an Object to attempt to free resources and perform other cleanup operations before the Object is reclaimed by garbage collection. |
MemberwiseClone (inherited from Object) | Creates a shallow copy of the current Object. |
See Also
HTMLparser Class | Majestic12 Namespace