written by Victor Varlamov, OCP at 07-JUL-2017
There is a Reporting system, where a lot (more than 100) of complex long running queries are launched by Reporting Engine, triggered by different business events. Queries are executed with quite different input parameters (list of customers) and data ranges (daily, weekly, monthly). Because of skewed data in tables, one report can return anywhere from 1 row to 1 million rows depending on the input parameters of the report (different clients have different quantities of rows in fact tables). Every report is implemented as a PL/SQL package with a main function that accepts input parameters, prepares some additional data, and then opens a static cursor with PL/SQL variables as parameters, and finally returns the opened cursor. The CURSOR_SHARING parameter is set to FORCE in the DB server. In such scenarios, SQL can lead to poor performance if the optimizer re-uses unsuitable execution plans that otherwise would not be used if a hard parse was forced using literals. Bind variable peeking can cause a suboptimal plan.
In the book Expert Oracle Practices, Alex Gorbachev relates a story – told to him by Tom Kyte – in which the query plan would change on Monday mornings if it was raining. It’s a true story.
“According to the end-users’ observations, if it was raining heavily Monday morning, then database performance was terrible. Any other day of the week, or on Mondays without rain, there were no problems. Talking to the DBA responsible for the system, Tom found that the performance problems continued until the DBA restarted the database, at which point performance went back to normal. That was the workaround: Rainy Mondays = Reboots.”
It is a real case, and this case was solved without any mysteries and magic, just good knowledge of how to Oracle DB works. I’ll show the solution at the end of the article.
Here is a simple example of how bind variable peeking works:
Let’s create test table with highly skewed data:
SQL> CREATE TABLE VVP_HARD_PARSE_TEST(C1 NUMBER, C2 NUMBER, C3 VARCHAR2(300));
Table created.
SQL> INSERT INTO VVP_HARD_PARSE_TEST
SELECT ROWNUM C1,
CASE
WHEN LEVEL < 9 THEN 1
WHEN MOD(ROWNUM, 100)=99 THEN 99
ELSE 1000000
END C2,
RPAD('A', 300, 'A') C3
FROM DUAL
CONNECT BY LEVEL <= 1000000;
1000000 rows inserted.
SQL> CREATE INDEX IND_VVP_HARD_PARSE_TEST_C2 ON VVP_HARD_PARSE_TEST(C2);
Index created.
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME => USER,
TABNAME => 'VVP_HARD_PARSE_TEST',
CASCADE => TRUE,
METHOD_OPT => 'FOR ALL INDEXED COLUMNS SIZE 254');
PL/SQL procedure successfully completed.
SQL> SELECT histogram FROM user_tab_columns WHERE table_name = 'VVP_HARD_PARSE_TEST'
AND column_name = 'C2';
HISTOGRAM
---------
FREQUENCY
SQL> SELECT c2, COUNT(*) FROM VVP_HARD_PARSE_TEST GROUP BY c2 ORDER BY 1;
C2 COUNT(*)
-----------------------
1 8
99 10000
1000000 989992
In other words, we have a table VVP_HARD_PARSE_TEST with 1 000 000 rows, where 10 000 rows have C2=99, 8 rows have C2=1 and others have C2=1000000. A histogram on columns C2 exists and tells Oracle about this data distribution. The above is known as a skewed data distribution, and a histogram can help Oracle to choose the right plan, depending on the value.
Let’s observe simple queries on VVP_HARD_PARSE_TEST table. Obviously for the query
SELECT * FROM VVP_HARD_PARSE_TEST WHERE c2 = :p
if :p=1 then INDEX RANGE SCAN is the fastest data access, if :p=1000000 then FULL TABLE SCAN is better. Query1000000 and Query1 are identical, the only difference is in comments to get a different SQL_IDs.
DECLARE p NUMBER; v NUMBER;
BEGIN
V := 0; p := 1000000;
FOR rec IN (SELECT /*+query1000000*/ * FROM VVP_HARD_PARSE_TEST WHERE c2 = p) LOOP
V := v + 1;
END LOOP;
dbms_output.put_line(v);
v : =0; p := 1;
FOR rec IN (SELECT /*+query1000000*/ * FROM VVP_HARD_PARSE_TEST WHERE c2 = p) LOOP
V := v + 1;
END LOOP;
dbms_output.put_line(v);
-----------------
V := 0; p := 1;
FOR rec IN (SELECT /*+query1*/ * FROM VVP_HARD_PARSE_TEST WHERE c2 = p) LOOP
V := v + 1;
END LOOP;
dbms_output.put_line(v);
v := 0; p := 1000000;
FOR rec IN (SELECT /*+query1*/ * FROM VVP_HARD_PARSE_TEST WHERE c2 = p) LOOP
V := v + 1;
END LOOP;
dbms_output.put_line(v);
END;
Now, let’s look at the query execution plans:
SQL> SELECT sql_id, child_number, executions, plan_hash_value, sql_text FROM v$sql WHERE sql_text LIKE 'SELECT % * FROM VVP_HARD_PARSE_TEST WHERE C2%';
SQL_ID CHILD_NUMBER EXECUTIONS PLAN_HASH_VALUE
SQL_TEXT
-------------------------------------------------
7rqnhhp6pahw2 0 2 2782757451
SELECT /*+query1000000*/ * FROM VVP_HARD_PARSE_TEST WHERE C2 = :B1
7xwt28hvw3u9s 0 2 2463783749
SELECT /*+query1*/ * FROM VVP_HARD_PARSE_TEST WHERE C2 = :B1
SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor(sql_id => '7rqnhhp6pahw2', format => 'basic +peeked_binds'));
SELECT /*+query1000000*/ * FROM VVP_HARD_PARSE_TEST WHERE C2 = :B1
Plan hash value: 2782757451
-------------------------------------------------
| Id | Operation | Name |
-------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | TABLE ACCESS FULL| VVP_HARD_PARSE_TEST |
-------------------------------------------------
Peeked Binds (identified by position):
--------------------------------------
1 - :B1 (NUMBER): 1000000
SQl> SELECT * FROM TABLE(dbms_xplan.display_cursor(sql_id => '7xwt28hvw3u9s', format => 'basic +peeked_binds'));
SELECT /*+query1*/ * FROM VVP_HARD_PARSE_TEST WHERE C2 = :B1
Plan hash value: 2463783749
------------------------------------------------------------------
| Id | Operation | Name |
------------------------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | TABLE ACCESS BY INDEX ROWID| VVP_HARD_PARSE_TEST |
| 2 | INDEX RANGE SCAN | IND_VVP_HARD_PARSE_TEST_C2 |
------------------------------------------------------------------
Peeked Binds (identified by position):
--------------------------------------
1 - :B1 (NUMBER): 1
So, as we can see, a plan for each query is built only once, when the query executed the first time (only one child cursor with CHILD_NUMBER=0 exists for each query). Each query executed twice (EXECUTION = 2). During a hard parse Oracle peeked behind the bind variables values and chose plan according to these values. But it used the same plan for the next execution despite the bind variable values have been changed for the second query execution! Suboptimal plans used for the queries during the second execution: Query1000000 with bind variable C2=1 used FULL TABLE SCAN instead of using INDEX RANGE SCAN, and QUERY1 with bind variable C2=1000000 used INDEX RANGE SCAN instead of using FULL TABLE SCAN.
Obviously fixing the application and using literals where appropriate is the correct approach, but it leads to dynamic SQL with well-known disadvantages. Another way is to switch off bind variable peeking (ALTER SESSION SET "_OPTIM_PEEK_USER_BINDS" = FALSE) and/or removing histograms (www.pythian.com/blog/stabilize-oracle-10gs-bind-peeking-behaviour).
One possible solution is to “mis-use” the row level security feature of Oracle (RLS, often also called Virtual Private Database, VPD, also known as FGAC, Fine Grained Access Control) to modify the queries on the fly and therefore forcing a hard parse every time a query uses one of the objects the RLS policy is applied to. This technique is described perfectly in Randolf Geist’s (oracle-randolf.blogspot.com/2009/02/how-to-force-hard-parse.html) and Jonathan Lewis’s (jonathanlewis.wordpress.com) blogs. Side effects of this method are the great increase of hard parses and the inability to manage query plans.
So, what we have done in the following. After analyzing our data we decided to split customers into 3 categories: Large, Middle and Small (L-M-S , or 9-5-1) according to quantity of deals (transactions) during the last year. Also quantity of rows in report strongly depends on data range parameter. (Monthly = Large, Weekly = Middle, Daily = Small or 9-5-1). Going further the decision is simple: make the policy function dependent on the client category and the data range. So for every query we have only 9 possible child cursors. Moreover, the query with different policies leads us to the same SQL_ID which gives us a possibility to implement SQL PLAN MANAGEMENT (sql plan baseline).
>SQL> CREATE TABLE HARD_PARSE_TABLE AS SELECT * FROM dual;
Table created.
SQL> CREATE TABLE CLIENTS_HP_STATISTICS (client_seqno NUMBER, client_id VARCHAR2(255), cnt_year NUMBER);
Table created.
SQL> INSERT INTO CLIENTS_HP_STATISTICS (client_seqno, client_id, cnt_year)
VALUES (1, 'SMALL CLIENT', 8);
1 row inserted.
SQL> INSERT INTO CLIENTS_HP_STATISTICS (client_seqno, client_id, cnt_year)
VALUES (99, 'MIDDLE CLIENT', 50001);
1 row inserted.
SQL> INSERT INTO CLIENTS_HP_STATISTICS (client_seqno, client_id, cnt_year)
VALUES (1000000,'LARGE CLIENT', 989992);
1 row inserted.
SQL> CREATE OR REPLACE PACKAGE FORCE_HARD_PARSE_PKG IS
gc_small CONSTANT NUMBER := 1;
gc_middle CONSTANT NUMBER := 5;
gc_large CONSTANT NUMBER := 9;
gc_client_middle CONSTANT NUMBER := 50000;
gc_client_large CONSTANT NUMBER := 500000;
gc_daterange_middle CONSTANT NUMBER := 10;
gc_daterange_large CONSTANT NUMBER := 50;
FUNCTION FORCE_HARD_PARSE(in_schema VARCHAR2, in_object VARCHAR2)
RETURN VARCHAR2;
PROCEDURE SET_PREDICATE (n NUMBER);
PROCEDURE SET_PREDICATES (p_daterange NUMBER DEFAULT NULL,
p_clientrange NUMBER DEFAULT NULL);
PROCEDURE CALC_PREDICATE;
PROCEDURE CALC_PREDICATES(p_date_interval NUMBER DEFAULT 1,
p_client_seqno NUMBER DEFAULT NULL,
p_client_id VARCHAR2 DEFAULT NULL,
p_client_seqno_list VARCHAR2 DEFAULT NULL
);
END FORCE_HARD_PARSE_PKG;
Package created.
SQL> CREATE OR REPLACE PACKAGE BODY FORCE_HARD_PARSE_PKG IS
g_predicate NUMBER; -- g_daterange || 0 || g_clientrange
g_daterange NUMBER; -- 1 - small, 5 - middle, 9 - large
g_clientrange NUMBER; -- 1 - small, 5 - middle, 9 - large
--
FUNCTION FORCE_HARD_PARSE(in_schema VARCHAR2, in_object VARCHAR2)
RETURN varchar2
IS
BEGIN
IF NVL(g_predicate, 0) = 0 THEN
RETURN NULL;
ELSE
RETURN TO_CHAR(g_predicate, 'TM') || ' = ' || TO_CHAR(g_predicate, 'TM');
END IF;
END FORCE_HARD_PARSE;
--
PROCEDURE SET_PREDICATE (n NUMBER)
IS
BEGIN
g_predicate := n;
END;
PROCEDURE SET_PREDICATES (p_daterange NUMBER DEFAULT NULL,
p_clientrange NUMBER DEFAULT NULL)
IS
BEGIN
IF p_daterange IS NOT NULL THEN
g_daterange := p_daterange;
CALC_PREDICATE;
END IF;
IF p_clientrange IS NOT NULL THEN
g_clientrange := p_clientrange;
CALC_PREDICATE;
END IF;
END SET_PREDICATES;
PROCEDURE CALC_PREDICATE
IS
BEGIN
g_predicate := NVL(g_daterange, 0) * 100 + NVL(g_clientrange, 0);
END CALC_PREDICATE;
PROCEDURE CALC_PREDICATES (p_date_interval NUMBER DEFAULT 1,
p_client_seqno NUMBER DEFAULT NULL,
p_client_id VARCHAR2 DEFAULT NULL,
p_client_seqno_list VARCHAR2 DEFAULT NULL)
IS
v_cnt NUMBER;
BEGIN
IF p_date_interval IS NOT NULL THEN
g_daterange := CASE
WHEN p_date_interval < gc_daterange_middle
THEN gc_small
WHEN p_date_interval < gc_daterange_large
THEN gc_middle
ELSE gc_large
END;
CALC_PREDICATE;
END IF;
IF COALESCE(p_client_seqno, p_client_id, p_client_seqno_list) IS NOT NULL
THEN
SELECT NVL(SUM(cnt_year), 0) AS cnt
INTO v_cnt
FROM CLIENTS_HP_STATISTICS t
WHERE 1=1
AND (p_client_seqno IS NULL OR p_client_seqno = t.client_seqno)
AND (p_client_id IS NULL OR p_client_id = t.client_id)
AND (p_client_seqno_list IS NULL OR t.client_seqno IN
(SELECT SUBSTR(s,
CASE
WHEN LEVEL > 1 THEN
INSTR(s, ',', 1, LEVEL - 1 ) + 1
ELSE 1
END,
INSTR(s, ',', 1, LEVEL) –
CASE
WHEN LEVEL > 1 THEN
INSTR(s, ',', 1, LEVEL – 1) + 1
ELSE 1
END)
FROM (SELECT p_client_seqno_list||',' AS s FROM DUAL)
CONNECT BY INSTR(s, ',', 1, LEVEL) > 0));
g_clientrange := CASE
WHEN v_cnt > gc_client_large THEN gc_large
WHEN v_cnt > gc_client_middle THEN gc_middle
ELSE gc_small
END;
CALC_PREDICATE;
END IF;
END CALC_PREDICATES;
END FORCE_HARD_PARSE_PKG;
Package body created.
SQL> EXEC DBMS_RLS.ADD_POLICY (USER, 'HARD_PARSE_TABLE', 'HARD_PARSE_POLICY', USER, 'FORCE_HARD_PARSE_PKG.FORCE_HARD_PARSE', 'select');
PL/SQL procedure successfully completed.
Now, if we are going to implement this technique in a report, we should add HARD_PARSE_TABLE into the query (it doesn’t change the query result at all!) and call CALC_PREDICATES function before the main query is executed.
Let’s see how this technique can be fixed similar to our previous example:
DECLARE p NUMBER; v NUMBER;
BEGIN
V := 0; p := 1000000;
FORCE_HARD_PARSE_PKG.SET_PREDICATE(1000000);
FOR rec IN (SELECT /*+query_hp1000000*/ *
FROM VVP_HARD_PARSE_TEST,
HARD_PARSE_TABLE
WHERE c2 = p) LOOP
V := v + 1;
END LOOP;
dbms_output.put_line(v);
v := 0; p := 1;
FORCE_HARD_PARSE_PKG.SET_PREDICATE(1);
FOR rec IN (SELECT /*+query_hp1000000*/ *
FROM VVP_HARD_PARSE_TEST,
HARD_PARSE_TABLE
WHERE c2 = p) LOOP
V := v + 1;
END LOOP;
dbms_output.put_line(v);
-----------------
V := 0; p := 1;
FORCE_HARD_PARSE_PKG.SET_PREDICATE(1);
FOR rec IN (SELECT /*+query_hp1*/ *
FROM VVP_HARD_PARSE_TEST,
HARD_PARSE_TABLE
WHERE c2 = p) LOOP
V := v + 1;
END LOOP;
dbms_output.put_line(v);
v := 0; p := 1000000;
FORCE_HARD_PARSE_PKG.SET_PREDICATE(1000000);
FOR rec IN (SELECT /*+query_hp1*/ *
FROM VVP_HARD_PARSE_TEST,
HARD_PARSE_TABLE
WHERE c2 = p) LOOP
V := v + 1;
END LOOP;
dbms_output.put_line(v);
END;
Let’s take a look on query execution plans:
SQL>SELECT sql_id, child_number, executions, plan_hash_value, sql_text, s.* FROM v$sql s WHERE sql_text LIKE 'SELECT % * FROM VVP_HARD_PARSE_TEST, HARD_PARSE_TABLE WHERE c2%' ORDER BY 1,2;
SQL_ID CHILD_NUMBER EXECUTIONS PLAN_HASH_VALUE
SQL_TEXT
--------------------------------------------------------------------------------
7wva3uqbgh4qf 0 1 1136240498
SELECT /*+query_hp1000000*/ * FROM VVP_HARD_PARSE_TEST, HARD_PARSE_TABLE WHERE C2 = :B1
7wva3uqbgh4qf 1 1 3246475190
SELECT /*+query_hp1000000*/ * FROM VVP_HARD_PARSE_TEST, HARD_PARSE_TABLE WHERE C2 = :B1
8cju3tfjvwm1p 0 1 3246475190
SELECT /*+query_hp1*/ * FROM VVP_HARD_PARSE_TEST, HARD_PARSE_TABLE WHERE C2 = :B1
8cju3tfjvwm1p 1 1 1136240498
SELECT /*+query_hp1*/ * FROM VVP_HARD_PARSE_TEST, HARD_PARSE_TABLE WHERE C2 = :B1
--
SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor(sql_id => '7wva3uqbgh4qf', cursor_child_no => 0, format => 'basic +peeked_binds'));
SELECT /*+query_hp1000000*/ * FROM VVP_HARD_PARSE_TEST,
HARD_PARSE_TABLE WHERE C2 = :B1
Plan hash value: 1136240498
----------------------------------------------------
| Id | Operation | Name |
----------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | MERGE JOIN CARTESIAN| |
| 2 | TABLE ACCESS FULL | HARD_PARSE_TABLE |
| 3 | BUFFER SORT | |
| 4 | TABLE ACCESS FULL | VVP_HARD_PARSE_TEST |
----------------------------------------------------
Peeked Binds (identified by position):
--------------------------------------
1 - :B1 (NUMBER): 1000000
--
SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor(sql_id => '7wva3uqbgh4qf', cursor_child_no => 1, format => 'basic +peeked_binds'));
SELECT /*+query_hp1000000*/ * FROM VVP_HARD_PARSE_TEST,
HARD_PARSE_TABLE WHERE C2 = :B1
Plan hash value: 3246475190
--------------------------------------------------------------------
| Id | Operation | Name |
--------------------------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | MERGE JOIN CARTESIAN | |
| 2 | TABLE ACCESS FULL | HARD_PARSE_TABLE |
| 3 | BUFFER SORT | |
| 4 | TABLE ACCESS BY INDEX ROWID| VVP_HARD_PARSE_TEST |
| 5 | INDEX RANGE SCAN | IND_VVP_HARD_PARSE_TEST_C2 |
--------------------------------------------------------------------
Peeked Binds (identified by position):
--------------------------------------
1 - :B1 (NUMBER): 1
--
SQL>SELECT * FROM TABLE(dbms_xplan.display_cursor(sql_id => '8cju3tfjvwm1p', cursor_child_no => 0, format => 'basic +peeked_binds'));
SELECT /*+query_hp1*/ * FROM VVP_HARD_PARSE_TEST,
HARD_PARSE_TABLE WHERE C2 = :B1
Plan hash value: 3246475190
--------------------------------------------------------------------
| Id | Operation | Name |
--------------------------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | MERGE JOIN CARTESIAN | |
| 2 | TABLE ACCESS FULL | HARD_PARSE_TABLE |
| 3 | BUFFER SORT | |
| 4 | TABLE ACCESS BY INDEX ROWID| VVP_HARD_PARSE_TEST |
| 5 | INDEX RANGE SCAN | IND_VVP_HARD_PARSE_TEST_C2 |
--------------------------------------------------------------------
Peeked Binds (identified by position):
--------------------------------------
1 - :B1 (NUMBER): 1
--
SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor(sql_id => '8cju3tfjvwm1p', cursor_child_no => 1, format => 'basic +peeked_binds'));
SELECT /*+query_hp1*/ * FROM VVP_HARD_PARSE_TEST,
HARD_PARSE_TABLE WHERE C2 = :B1
Plan hash value: 1136240498
----------------------------------------------------
| Id | Operation | Name |
----------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | MERGE JOIN CARTESIAN| |
| 2 | TABLE ACCESS FULL | HARD_PARSE_TABLE |
| 3 | BUFFER SORT | |
| 4 | TABLE ACCESS FULL | VVP_HARD_PARSE_TEST |
----------------------------------------------------
Peeked Binds (identified by position):
--------------------------------------
1 - :B1 (NUMBER): 1000000
Looks good. Each query executed twice, with a different child cursor for execution and different plans. For parameter C2=1000000 we see the FULL TABLE SCAN in the query execution plans for both queries, for parameter C2=1 we see the INDEX RANGE SCAN.
Finally, here is a solution for the Monday rainy mornings issue:
“It turned out that there was a cold backup on Sunday nights so all query plans would be regenerated upon first use on Monday morning. Now a certain user would start work earlier than everybody else and the values of the bind variables in her case resulted in a query plan performed well for values used by other users during the course of the work week. However, if it was raining on Monday mornings, this user would be late for work because of certain aspects of her Monday morning routine. A batch report would then be the first to execute the problem query on Monday morning but the values of the bind variables in that case resulted in a query plan that was terribly inappropriate for other values.“
Here are some useful views:
- dba_tab_histograms, all_tab_histograms, user_tab_histograms
- v$vpd_policy
- v$sql_bind_capture
- dba_hist_sqlbind