INSERT INTO Statement
This statement has two functions: It populates local cubes with dimension members, and it trains data mining models. If the local cube is stored in multidimensional OLAP (MOLAP), the INSERT INTO statement also populates the local cube with data.
BNF
<insert-into-statement> ::= INSERT INTO <target-clause> [<options-clause>] [<bind-clause>] <source-clause>
|INSERT INTO <model> (<mapped model columns>) <source data query>
|INSERT INTO <model> (<mapped model columns>) VALUES <constant list>
|INSERT INTO <model>.COLUMN_VALUES(<mapped model columns>) <source data query>
<mapped model columns> ::= <column identifier> | <table identifier>(<column identifier> | SKIP), ...
<target-clause> ::= <cube-name> <open-paren> <target-element-list> <close-paren>
<target-element-list> ::= <target-element>[, <target-element-list>]
<target-element> ::= [<dim-name>.[<hierarchy-name>.]]<level-name>
| <time-dim-name> | <parent-child-dim-name>
| [Measures.]<measure-name>
| SKIPONECOLUMN
<level-name> ::= <simple-level-name>
| <simple-level-name>.NAME
| <simple-level-name>.KEY
| <simple-level-name>.Custom_Rollup
| .parent
| <simple-level-name>.SkipLevelColumn
<time-dim-name> ::= <dim-name-type-time>
| <dim-name-type-time>.NAME
| <dim-name-type-time>.KEY
<options-clause> ::= OPTIONS <options-list>
<options-list> ::= <option>[, <options-list>]
<option> ::= <defer-options>
| < analysis-options>
<defer-options> ::= DEFER_DATA
| ATTEMPT_DEFER
<analysis-options> ::= PASSTHROUGH
| ATTEMPT_ANALYSIS
<bind-clause> ::= BIND (<bind-list>)
<bind-list> ::= <simple-column-name>[,<simple-column-name>]
<simple-column-name> ::= <identifier>
<source-clause> ::= SELECT <columns-list>
FROM <tables-list>
[ WHERE <where-clause> ]
| DIRECTLYFROMCACHEDROWSET <hex-number> | DIRECTLYFROMMARSHALLEDROWSET <hex number>
<columns-list> ::= <column-expression> [, < columns-list> ]
<column-expression> ::= <column-expression-name>
<column-expression-name> ::= <column-name> [AS <alias-name>]
| <alias name> <column-name>
<column-name> ::= <table-name>.<column-name>
| <column-function>
| <ODBC scalar function>
| <braced-expression>
<column function> ::= <identifier>(. . .)
<ODBC scalar function> ::= {FN<column-function>}
<braced-expression> ::= (. . .)
<tables -list> ::= <table-expression> [, <tables-list>]
<table-expression> ::= <table-name> [ [AS] <table-alias>]
<table-alias> ::= <identifier>
<table-name> ::= <identifier>
<where-clause> ::= <where-condition> [AND <where-clause>]
<where-condition> ::= <join-constraint>
| <application constraint>
<join-constraint> ::= <column-name> = <column-name>
| <open-paren><column-name> = <column-name><close-paren>
<application-constraint> ::= (. . .)
| NOT (. . .)
| (. . .) OR (. . .)
<identifier> ::= <letter>{<letter>
|<digit>
|<underline>
|<dollar>
|<sharp>}. . .
Remarks
The behavior and use of this statement depend on whether you use it for OLAP or data mining.
Using INSERT INTO with Local Cubes
Names of elements in an INSERT INTO statement are level and measure names, sometimes qualified with dimension name or the keyword Measures to avoid ambiguity. The Measures keyword is case-sensitive in binary comparisons. If you use binary comparison or are unsure of your comparison method, use Measures as shown with only M in upper case.
Each level and each measure in a cube is derived from a column in the SELECT clause except the (All) level.
The columns specified in the associated SELECT clause are bound to the elements of the INSERT INTO statement in the order specified and in a one-to-one relationship.
Each level can be derived from two columns, with one used as a name column and the other used as a key column. Both columns must be in the same table. If there are two columns associated with a level, use the suffix .NAME or .KEY properties in the INSERT INTO statement after the level name.
If a column specified in the SELECT clause does not have a related element in the INSERT INTO statement, the keyword SKIPONECOLUMN can be used in the INSERT INTO statement as a placeholder for the unused column. SKIPONECOLUMN can be used more than once.
In the INSERT INTO statement, you can specify a dimension of TYPE TIME by using the name of the dimension. The dimension name is used to correlate the entire dimension with a single column in the source table that contains data with a date/time data type. The levels, of TYPE <level type>, identified for the time dimension in the CREATE_CUBE statement cause the time information to be extracted from the source column specified in the SELECT clause. For more information, see Example D later in this topic.
The WHERE clause can have both application and join constraints. The parser parses only join constraints. It uses the join constraint to find a path from all tables to the fact table and to the dimension tables. The application constraint is used only to specify constraints on a fact table and is passed through without modifications.
Expressions between parentheses are considered to be application constraints. For example, if the expression Sales.Product_ID = Products.Product_ID AND Sales.Customer_ID = Customers.Customer_ID is enclosed in parentheses, it is treated as an application constraint and is not used as a join constraint. It is the responsibility of the client application to ensure that parentheses are used only around application constraints: for example, (Product.Price < 100 AND Product.Category = 1).
The BIND clause is used to bind level and measure names specified in the INSERT INTO statement with column names used to create rowsets.
The AS <alias-name> syntax is not supported for local relational OLAP (ROLAP) cubes.
Using INSERT INTO with Data Mining
The INSERT INTO statement inserts training data into the model. The columns from the query are mapped to model columns through the <mapped model columns> section. The keyword SKIP is used to instruct the model to ignore columns that appear in the source data query that are not used in the model.
The INSERT INTO <model>.COLUMN_VALUES form inserts data directly into the models columns without training the model's algorithm. This allows you to provide column data to the model in a concise ordered manner that is useful when dealing with datasets containing hierarchies or ordered columns. The period (.) operator is used to specify columns that are part of a nested table. When using this form, columns that are part of a relation (either through RELATE TO or by being a KEY in a nested table) cannot be inserted individually and must be inserted together with all the columns in the relation.
Using Passthrough and Advanced Query Processing
The PASSTHROUGH option causes the SELECT clause to be passed directly to the source database without modification by PivotTable® Service. If PASSTHROUGH is not specified, PivotTable Service parses the query and formulates a set of queries equivalent to the original that is optimized for the source database and index structures. This set of queries is often more efficient than the specified query.
The DEFER_DATA option causes the query to be parsed locally and executed only when necessary to retrieve data to satisfy a user request. DEFER_DATA is used to specify that a local cube be defined in the ROLAP storage mode.
The ATTEMPT_DEFER option causes PivotTable Service to attempt to parse the query and defer data loading if successful, or, if the query cannot be parsed, to process the specified query immediately as if the PASSTHROUGH had been specified.
The ATTEMPT_ANALYSIS option causes PivotTable Service to attempt to parse the query and formulate an optimized set of queries. If the query cannot be parsed, PivotTable Services processes the query immediately as if the PASSTHROUGH had been specified.
Passthrough Compatibility Matrix
These options can be used together in combination. Use the following matrix to determine the effect of a pair of options while building a local cube.
Parse |
Neither PassThrough nor Attempt_Analysis | PassThrough |
Attempt_Analysis |
|
---|---|---|---|---|
Neither DEFER_DATA nor ATTEMPT_DEFER | Succeeded | MOLAP | MOLAP(*) | MOLAP |
Failed | Error | n/a | MOLAP(*) | |
DEFER_DATA | Succeeded | ROLAP | Error | ROLAP |
Failed | Error | n/a | Error | |
ATTEMPT_DEFER | Succeeded | ROLAP | MOLAP(*) | ROLAP |
Failed | MOLAP(*) | n/a | MOLAP(*) |
* Indicates that this pair of options will cause the SELECT clause to be passed through to the source database.
Examples
A. Using an Application Constraint
The following example demonstrates how to use an application constraint:
INSERT INTO MyCube (Year, Month.Name, Month.Key, [Product Group], [Product Name], Country, Sales, Cost)
OPTIONS DEFER_DATA
SELECT MyTable.Year, MyTable.Month, MONTH(MyTable.Month), MyTable.ProdGroup, MyTable.ProdName, MyTable.Country, MyTable.Sales, MyTable.Cost
FROM MyTable
WHERE MyTable.SalesRep = "Amir" and MyTable.CustomerGroup = "Industry"
B. Using the Passthrough Option
The following example demonstrates how to use the PASSTHROUGH option.
Note The PASSTHROUGH option specifies that the SELECT clause that follows it is to be passed directly to the database engine with no parsing by PivotTable Service. This option cannot be used with the DEFERED_DATA option.
INSERT INTO MyCube (Year, Month, [Product Group], [Product Name], Country, Sales, Cost)
OPTIONS PASSTHROUGH SELECT MyTable.Year, MyTable.Month, MyTable.ProdGroup, MyTable.ProdName, MyTable.Country, MyTable.Sales, MyTable.Cost
FROM MyTable
WHERE MyTable.SalesRep = "Amir" and MyTable.CustomerGroup = "Industry"
C. Using the DIRECTLYFROMCACHEDROWSET Keyword
The following example demonstrates how to use the DIRECTLYFROMCACHEDROWSET keyword.
Note The DIRECTLYFROMCACHEDROWSET keyword directs data to be read from the address in memory identified immediately after the keyword. It is the responsibility of the client application to specify the correct address in memory. At run time, the number is assumed to be the in-process address of an IUnknown pointer to an OLE DB rowset.
INSERT INTO MyCube (Year, Month, [Product Group], [Product Name], Country, Sales, Cost)
DIRECTLYFROMCACHEDROWSET 0x00001284
D. Using the DEFER DATA Option to Create a ROLAP Cube
The following example demonstrates how to create a ROLAP cube by using the DEFER DATA option:
CREATE CUBE MyCube (
DIMENSION TimeDim TYPE TIME,
LEVEL MyYear TYPE YEAR,
LEVEL MyQtr TYPE QUARTER,
LEVEL MyMonth TYPE MONTH,
DIMENSION Products,
LEVEL [Product Group],
LEVEL [Product Name],
DIMENSION Geography,
LEVEL State,
LEVEL City,
MEASURE [Sales]
FUNCTION SUM
FORMAT 'Currency',
MEASURE [Units Sold]
FUNCTION SUM
)
INSERT INTO MyCube (TimeDim, [Product Group], [Product Name], State, City, Sales, [Units Sold])
OPTIONS DEFER_DATA
SELECT MyTable.TransDate, MyTable.ProdGroup, MyTable.ProdName, MyTable.State, MyTable.City, MyTable.Sales, MyTable.UnitsSold
FROM MyTable
WHERE MyTable.SalesRep = "Jacobsen" and MyTable.CustomerGroup = "Industry"
E. Training a Data Mining Model
The following example trains a data mining model called [Age Prediction]. The training columns for this model are [Gender], [Product Name], [Product Type], and [Month].
INSERT INTO [Age Prediction].COLUMN_VALUES(Gender)
OPENROWSET('SQLOLEDB', '...', 'SELECT DISTINCT Gender FROM Customers')
INSERT INTO [Age Prediction].COLUMN_VALUES([Product Purchases].[Product Name],
[Product Purchases].[Product Type])
OPENROWSET('SQLOLEDB', '...', 'SELECT DISTINCT [Product Name], [Product Type] FROM Sales')
INSERT INTO [Age Prediction].COLUMN_VALUES( SKIP, [Month])
OPENROWSET('SQLOLEDB', '...', 'SELECT MonthID, Month FROM Months ORDER BY MonthID')