Predictions and Results of Data Mining
For retrieving information from a processed data mining model, Microsoft® SQL Server™ 2000 Analysis Services supports two different kinds of queries.
Query type | Definition |
---|---|
Prediction query | Returns the predicted values of a set of columns, whose contents are unknown, after applying the results contained within a processed data mining model to them |
Content query | Returns information about the values and rules discovered by training the mining model |
Prediction Queries
Prediction queries allow the user to make predictions for unknown case sets using contents from a previously trained data mining model. Prediction queries are run by means of the SELECT statement:
SELECT [FLATTENED] <SELECT-expressions> FROM <mining model name>
PREDICTION JOIN <source data query> ON <join condition>
[WHERE <WHERE-expression>]
The <Source Data Query> token identifies the set of new cases that will be predicted. <Mining Model Name> identifies the mining model that will be used to generate the predictions.
After the source data has been identified, a relationship between it and the data in the mining model must be defined. This is done using the ON clause of the PREDICTION JOIN statement.
Example
The following example attempts to predict the age of customers using the Age Prediction data mining model and the Customers and Sales cubes:
SELECT t.[Customer ID], [Age Prediction].[Age]
FROM [Age Prediction]
PREDICTION JOIN
(
SHAPE
{
SELECT [Customer ID], [Gender], FROM Customers ORDER BY [Customer ID]
}
APPEND
(
{SELECT [CustID], [Product Name], [Quantity] FROM Sales ORDER BY [CustID]}
RELATE [Customer ID] To [CustID]
)
AS [Product Purchases]
) as t
ON [Age Prediction] .Gender = t.Gender and
[Age Prediction] .[Product Purchases].[Product Name] = t.[Product Purchases].[Product Name] and
[Age Prediction] .[Product Purchases].[Quantity] = t.[Product Purchases].[Quantity]
For more information, see the OLE DB for Data Mining specification.
Content Queries
Browsing the content of a data mining model can provide important insight into the data. For example, it may expose patterns or trends that can be used to predict new data points or train expert systems. The content of the data mining model depends on the algorithm that generated it, and it can vary widely from algorithm to algorithm.
Querying the model directly will return contents of the model. For example, consider the following query:
SELECT * FROM MyOlapModel.CONTENT
This example provides a result table whose structure is the same as that of the MINING_MODEL_CONTENT schema rowset.
More sophisticated queries against the content in a data mining model are also possible. For example, consider the following content query:
SELECT Age
FROM HairColorPredictDMM.Content
WHERE Gender = 'Male' and HairColor = 'Black'
This query returns all of the nodes that are concerned with black-haired men.
Gender | Age | HairColor | P(HairColor) |
---|---|---|---|
Male | 2 | Black | .667 |
Male | 91 | Black | .300 |
Male | 45 | Black | .667 |
Male | NULL | Black | .600 |