Oracle 8i Data Cartridge Developer's Guide Release 2 (8.1.6) Part Number A76937-01 |
|
Query Optimization, 3 of 7
You can compute and store user-defined statistics for domain indexes and columns. These statistics are in addition to the standard statistics that are already collected by ANALYZE
. User-defined selectivity and cost functions for functions and domain indexes can use standard and user-defined statistics in their computation. The internal representation of these statistics need not be known to Oracle; we only require that you provide methods for their access. You are solely responsible for defining the representation of such statistics and for maintaining them. It is important to note that user-collected statistics are only used by user-defined selectivity and cost functions; the optimizer use s only its standard statistics.
User-defined statistics collection, selectivity, and cost functions must be defined in a user-defined type. This type must have a form similar to a system-defined interface called ODCIStats
(Oracle Data Cartridge Interface Statistics) defined as follows:
CREATE INTERFACE ODCIStats AS ( -- Function to get current interface FUNCTION ODCIGetInterfaces(ifclist OUT ODCIObjectList) return NUMBER, -- User-defined statistics functions FUNCTION ODCIStatsCollect(col ODCIColInfo, options ODCIStatsOptions, statistics OUT RAW) return NUMBER, FUNCTION ODCIStatsCollect(ia ODCIIndexInfo, options ODCIStatsOptions, statistics OUT RAW) return NUMBER, FUNCTION ODCIStatsDelete(col ODCIColInfo) return NUMBER, FUNCTION ODCIStatsDelete(ia ODCIIndexInfo) return NUMBER, -- User-defined selectivity function FUNCTION ODCIStatsSelectivity(pred ODCIPredInfo, sel OUT NUMBER, args ODCIArgDescList, start <function_return_type>, stop <function_return_type>, <list of function arguments>) return NUMBER, -- User-defined cost function for functions and type methods FUNCTION ODCIStatsFunctionCost(func ODCIFuncInfo, cost OUT ODCICost,
args ODCIArgDescList, <list of function arguments>) return NUMBER, -- User-defined cost function for domain indexes FUNCTION ODCIStatsIndexCost(ia ODCIIndexInfo, sel NUMBER, cost OUT ODCICost, qi ODCIQueryInfo, pred ODCIPredInfo, args ODCIArgDescList, start <operator_return_type>, stop <operator_return_type>, <list of operator value arguments>) return NUMBER )
You can define a new object type, referred to as a statistics type, with a subset of functions from ODCIStats
. Since user-defined statistics collection, selectivity, and cost functions are all optional, a statistics type need not contain all the functions in ODCIStats
.
The types of the parameters of statistics type methods are system-defined ODCI
(Oracle Data Cartridge Interface) datatypes. Some of them are described in the reference to Extensible Indexing, and the rest are described in the reference chapter detailing the Extensible Optimizer.
The selectivity and cost functions must not change any database or package state. To that end, they must be defined with appropriate purity level pragmas, otherwise the optimizer will not call them.
Depending on the object that user-defined statistics are being associated with, not all the functions defined in a statistics type will be used. The table below lists the functions and default statistics that will be used by the optimizer.
There are two user-defined statistics collection functions, one for collecting statistics and the other for deleting them.
The first, ODCIStatsCollect
, is used to collect user-defined statistics; its interface depends on whether a column or domain index is being analyzed. It is called when analyzing a column of a table or a domain index and takes two parameters:
col
for the column being analyzed, or
ia
for the domain index being analyzed;
options
for options specified in the ANALYZE
command (e.g., the sample size when ANALYZE
ESTIMATE
is used).
As mentioned, the database does not interpret statistics collected by ODCIStatsCollect
. You can store output in a user-managed format or in a dictionary table (described in the Extensible Optimizer reference) provided for the purpose. The statistics collected by the ODCIStatsCollect
functions are returned in the output parameter, statistics
, as a RAW
datatype.
When an ANALYZE
DELETE
command is issued, user-collected statistics are deleted by calling the ODCIStatsDelete
function whose interface depends on whether the statistics for a column or domain index are being dropped. It takes a single parameter: col
, for the column whose user-defined statistics need to be deleted, or ia
, for the domain index whose statistics are to be deleted.
If a user-defined ODCIStatsCollect
function is present in a statistics type, the corresponding ODCIStatsDelete
function must also be present.
The return values of the ODCIStatsCollect
and ODCIStatsDelete
functions must be Success
(indicating success), Error
(indicating an error), or Warning
(indicating a warning); these return values are defined in a system package ODCIConst
(described in the Extensible Optimizer reference).
You will recall that user-defined selectivity functions are used only for predicates of the following forms:
operator(...) relational_operator <constant> <constant> relational_operator operator(...) operator(...) LIKE <constant>
A user-defined selectivity function, ODCIStatsSelectivity
, takes five sets of input parameters that describe the predicate:
pred
describing the function operator
and the relational operator relational_operator
;
args
describing the start and stop values (i.e., <constant
>) of the function and the actual arguments to the function operator
;
start
whose datatype is the same as that of the function's return value, describing the start value of the function;
stop
whose datatype is the same as that of the function's return value, describing the stop value of the function;
operator
.
The computed selectivity is returned in the output parameter sel
, in whole numbers, as a percentage, between 0 and 100, inclusive. The optimizer ignores invalid values.
The return value of the ODCIStatsSelectivity
function must be
As an example, consider a function myFunction
defined as follows:
myFunction
(a NUMBER, b VARCHAR2(10)) return NUMBER
A user-defined selectivity function for the function myFunction
would be as follows:
ODCIStatsSelectivity(pred ODCIPredInfo, sel OUT NUMBER, args ODCIArgDescList, start NUMBER, stop NUMBER, a NUMBER, b VARCHAR2(10)) return NUMBER
If the function myFunction
is called with literal arguments, e.g.,
myFunction (2, 'TEST') > 5
then the selectivity function is called as follows:
ODCIStatsSelectivity(<ODCIPredInfo constructor>, sel, <ODCIArgDescList constructor>, 5, NULL, 2, 'TEST')
If, on the other hand, the function myFunction
is called with some non-literals e.g.,
myFunction(Test_tab.col_a, 'TEST')> 5
where col_a
is a column in table Test_tab
, then the selectivity function is called as follows:
ODCIStatsSelectivity(<ODCIPredInfo constructor>, sel, <ODCIArgDescList constructor>, 5, NULL, NULL, 'TEST')
In other words, the start, stop, and function argument values are passed to the selectivity function only if they are literals; otherwise they are NULL
. The ODCIArgDescList
descriptor describes all its following arguments.
As already mentioned, user-defined cost functions are only used for predicates of the following forms:
operator(...) relational_operator <constant> <constant> relational_operator operator(...) operator(...) LIKE <constant>
You can define a function, ODCIStatsFunctionCost
, for computing the cost of stand-alone functions, package functions, or type methods. This function takes three sets of input parameters describing the predicate:
func
describing the function operator
;
args
describing the actual arguments to the function operator
;
operator
.
The ODCIStatsFunctionCost
function returns its computed cost in the cost
parameter. As mentioned, the returned cost can have two components -- CPU and I/O -- which are combined by the optimizer to compute a composite cost. The costs returned by user-defined cost functions must be positive whole numbers. Invalid values are ignored by the optimizer.
The return value of the ODCIStatsFunctionCost
function must be
Consider a function myFunction
defined as follows:
myFunction
(a NUMBER, b VARCHAR2(10)) return NUMBER
A user-defined cost function for the function myFunction
would be coded as follows:
ODCIStatsFunctionCost(func ODCIFuncInfo, cost OUT ODCICost, args ODCIArgDescList, a NUMBER, b VARCHAR2(10)) return NUMBER
If the function myFunction
is called with literal arguments, e.g.,
myFunction(2, 'TEST') > 5,
then the cost function is called as follows:
ODCIStatsFunctionCost(<ODCIFuncInfo constructor>, cost, <ODCIArgDescList constructor>, 2, 'TEST')
If, on the other hand, the function myFunction
is called with some non-literals, e.g.,
myFunction(Test_tab.col_a, 'TEST') > 5
where col_a
is a column in table Test_tab
, then the cost function is called as follows:
ODCIStatsFunctionCost(<ODCIFuncInfo constructor>, cost, <ODCIArgDescList constructor>, NULL, 'TEST')
In other words, function argument values are passed to the cost function only if they are literals; otherwise they are NULL
. The ODCIArgDescList
descriptor describes all its following arguments.
User-defined cost functions for domain indexes are used for the same type of predicates mentioned previously, except that operator
must be a user-defined operator for which a valid domain index access path exists.
The ODCIStatsIndexCost
function takes eight sets of parameters:
ia
describing the domain index;
sel
representing the user-computed selectivity of the predicate;
qi
containing additional information about the query;
pred
describing the predicate;
args
describing the start and stop values (i.e., <constant
>) of the operator and the actual arguments to the operator operator
;
start
, whose datatype is the same as that of the operator's return value, describing the start value of the operator;
stop
whose datatype is the same as that of the operator's return value, describing the stop value of the operator; and
operator
. The value arguments of an operator are the arguments excluding the first argument.
The computed cost of the domain index is returned in the output parameter, cost
.
ODCIStatsIndexCost
returns
Consider an operator
Contains(a_string
VARCHAR2(2000)
,b_string
VARCHAR2(10))
that returns 1 or 0 depending on whether or not the string b_string
is contained in the string a_string
. Further, assume that the operator is implemented by a domain index. A user-defined index cost function for this domain index would be coded as follows:
ODCIStatsIndexCost(ia ODCIIndexInfo, sel NUMBER, cost OUT ODCICost, qi ODCIQueryInfo, pred ODCIPredInfo, args ODCIArgDescList, start NUMBER, stop NUMBER, b_string VARCHAR2(10)) return NUMBER
Note that the first argument, a_string
, of Contains
does not appear as a parameter of ODCIStatsIndexCost
. This is because the first argument to an operator must be a column for the domain index to be used, and this column information is passed in via the ODCIIndexInfo
parameter. Only the operator arguments after the first (i.e., the "value" arguments) must appear as parameters to the ODCIStatsIndexCost
function.
If the operator is called, e.g.,
Contains(Test_tab.col_c,'TEST') <= 1
then the index cost function is called as follows:
ODCIStatsIndexCost(<ODCIIndexInfo constructor>, sel, cost, <ODCIQueryInfo constructor>, <ODCIPredInfo constructor>, <ODCIArgDescList constructor>, NULL, 1, 'TEST')
In other words, the start, stop, and operator argument values are passed to the index cost function only if they are literals; otherwise they are NULL
. The ODCIArgDescList
descriptor describes all its following arguments.
|
![]() Copyright © 1996-2000, Oracle Corporation. All Rights Reserved. |
|