Some time ago I was asked to assist in fixing or at least finding the cause of a performance problem. The application ran fine until the Virtual Private Database (VPD) policy was applied. Oracle claims there should be near zero impact on your application when you implement VPD, then how is this possible?
First of all, the policy applied was a rather complex one. A complex query should be executed to determine if the current user has access to the record. Let’s say this query takes up a second, then I would expect my query to run about a second slower, maybe two. But the query took several minutes to complete when the VPD policy was applied. This didn’t make sense to me, so I decided to find out what was really happening.
To do this, I opened up my sandbox database to try and recreate this situation.
First I need to create two new users
create user vpd1 identified by vpd1 / grant connect, resource to vpd1 / create user vpd2 identified by vpd2 / grant connect, resource to vpd2 /
Then I created a simple table to hold the data that should be protected by the VPD policy:
drop table emp purge / create table emp (empno number(4) not null, ename varchar2(10), job varchar2(9), mgr number(4), hiredate date, sal number(7, 2), comm number(7, 2), deptno number(2)) ; insert into emp values (7369, 'SMITH', 'CLERK', 7902, to_date('17-12-1980', 'DD-MM-YYYY'), 800, null, 20); insert into emp values (7499, 'ALLEN', 'SALESMAN', 7698, to_date('20-02-1981', 'DD-MM-YYYY'), 1600, 300, 30); insert into emp values (7521, 'WARD', 'SALESMAN', 7698, to_date('22-02-1981', 'DD-MM-YYYY'), 1250, 500, 30); insert into emp values (7566, 'JONES', 'MANAGER', 7839, to_date('02-04-1981', 'DD-MM-YYYY'), 2975, null, 20); insert into emp values (7654, 'MARTIN', 'SALESMAN', 7698, to_date('28-09-1981', 'DD-MM-YYYY'), 1250, 1400, 30); insert into emp values (7698, 'BLAKE', 'MANAGER', 7839, to_date('01-05-1981', 'DD-MM-YYYY'), 2850, null, 30); insert into emp values (7782, 'CLARK', 'MANAGER', 7839, to_date('09-06-1981', 'DD-MM-YYYY'), 2450, null, 10); insert into emp values (7788, 'SCOTT', 'ANALYST', 7566, to_date('09-12-1982', 'DD-MM-YYYY'), 3000, null, 20); insert into emp values (7839, 'KING', 'PRESIDENT', null, to_date('17-11-1981', 'DD-MM-YYYY'), 5000, null, 10); insert into emp values (7844, 'TURNER', 'SALESMAN', 7698, to_date('08-09-1981', 'DD-MM-YYYY'), 1500, 0, 30); insert into emp values (7876, 'ADAMS', 'CLERK', 7788, to_date('12-01-1983', 'DD-MM-YYYY'), 1100, null, 20); insert into emp values (7900, 'JAMES', 'CLERK', 7698, to_date('03-12-1981', 'DD-MM-YYYY'), 950, null, 30); insert into emp values (7902, 'FORD', 'ANALYST', 7566, to_date('03-12-1981', 'DD-MM-YYYY'), 3000, null, 20); insert into emp values (7934, 'MILLER', 'CLERK', 7782, to_date('23-01-1982', 'DD-MM-YYYY'), 1300, null, 10); commit / drop table emp_vpd purge / create table emp_vpd as select * from emp / commit /
And of course I need to grant access to this table to the newly created users:
grant all on emp_vpd to vpd1 / grant all on emp_vpd to vpd2 /
On the table I need to create a policy function so I create a package (which mimics the customers package, just simpler) to do this:
create or replace package emp_vpd_policy as function first_policy(owner_in in varchar2 ,objname_in in varchar2) return varchar2; function allowed(empno_in in number ,deptno_in in number) return number; end emp_vpd_policy; / sho err create or replace package body emp_vpd_policy as function first_policy(owner_in in varchar2 ,objname_in in varchar2) return varchar2 is begin dbms_output.put_line('first policy'); if (user = 'VPD1') then return 'emp_vpd_policy.allowed(emp_vpd.empno, emp_vpd.deptno)=10'; elsif user = 'VPD2' then return 'emp_vpd_policy.allowed(emp_vpd.empno, emp_vpd.deptno)=20'; else return '1=1'; end if; end first_policy; function allowed(empno_in in number ,deptno_in in number) return number is begin dbms_output.put_line('emp_vpd_policy.allowed(' || empno_in || ',' || deptno_in || ')'); return deptno_in; end allowed; end emp_vpd_policy; / sho err
and then protect the EMP_VPD table using a policy:
begin sys.dbms_rls.add_policy(object_schema => 'DEMO' ,object_name => 'EMP_VPD' ,policy_name => 'EMP_VPD_SEL' ,function_schema => '&myuser' ,policy_function => 'EMP_VPD_POLICY.FIRST_POLICY' ,statement_types => 'SELECT'); end; /
The package will show what will happen when I perform a select on the table:
conn vpd1/vpd1 set serveroutput on size unlimited select * from demo.emp_vpd /
EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO ----- ---------- --------- ----- ----------- --------- --------- ------ 7782 CLARK MANAGER 7839 6/9/1981 2450.00 10 7839 KING PRESIDENT 11/17/1981 5000.00 10 7934 MILLER CLERK 7782 1/23/1982 1300.00 10 first policy first policy emp_vpd_policy.allowed(7369,20) emp_vpd_policy.allowed(7499,30) emp_vpd_policy.allowed(7521,30) emp_vpd_policy.allowed(7566,20) emp_vpd_policy.allowed(7654,30) emp_vpd_policy.allowed(7698,30) emp_vpd_policy.allowed(7782,10) emp_vpd_policy.allowed(7788,20) emp_vpd_policy.allowed(7839,10) emp_vpd_policy.allowed(7844,30) emp_vpd_policy.allowed(7876,20) emp_vpd_policy.allowed(7900,30) emp_vpd_policy.allowed(7902,20) emp_vpd_policy.allowed(7934,10)
In my case this is done rather quickly, there’s almost no difference in timing for the query with or without the policy applied. But as you can see, the policy is executed for each and every record that is being checked. Well, not really the policy itself, but the function that is defined in the policy. So if this function takes a lot of time and your table has a lot of records then the query will run for a very long time. There has got to be a better way to do this.
Let’s analyze what happens, the actual policy is executed twice. What if we use this architecture to our benefit. In the first pass we can setup some in memory data structure to hold whatever we need, this might take some time and then in the second pass we can use this data to be used in the actual check.
First we drop the policy so we can create a new one:
begin sys.dbms_rls.drop_policy(object_schema => '&myuser' ,object_name => 'EMP_VPD' ,policy_name => 'EMP_VPD_SEL'); end; /
For our implementation we need a Nested Table type to be created in the database:
create or replace type empnos_tt is table of number(4) /
Then we create a new package to hold the policy function.
create or replace package emp_vpd_pp as function sel( owner_in in varchar2 , objname_in in varchar2 ) return varchar2; function read_g_empnos return empnos_tt; end emp_vpd_pp; / sho err
The function SEL will be used in the policy. The function READ_G_EMPNOS is needed to retrieve the data in the package variable. Then the actual implementation of the package:
create or replace package body emp_vpd_pp as g_empnos empnos_tt; beenhere boolean := false; function sel( owner_in in varchar2 , objname_in in varchar2 ) return varchar2 is begin if not(beenhere) then if user = 'VPD1' then begin select emp.empno bulk collect into g_empnos from emp where emp.deptno = 10; exception when others then dbms_output.put_line(sqlerrm); end; elsif user = 'VPD2' then begin select emp.empno bulk collect into g_empnos from emp where emp.deptno = 20; exception when others then dbms_output.put_line(sqlerrm); end; end if; end if; beenhere := not(beenhere); if ((user = 'VPD1') or (user = 'VPD2')) then return 'emp_vpd.empno in (select column_value from table(emp_vpd_pp.read_g_empnos))'; else return '1=1'; end if; end sel; function read_g_empnos return empnos_tt is begin return (g_empnos); end; begin beenhere := false; end emp_vpd_pp; / sho err
In the initialization section of the package we initialize the Boolean variable. Then, when the policy function is executed for the first time (per query) we select the column values we need and save that into the package variable. The second time we execute the policy function we use the values saved in the predicate that is being added.
begin sys.dbms_rls.add_policy(object_schema => 'DEMO' ,object_name => 'EMP_VPD' ,policy_name => 'EMP_VPD_SEL' ,function_schema => 'DEMO' ,policy_function => 'EMP_VPD_PP.SEL' ,statement_types => 'SELECT'); end; /
Notice the predicate with the use of the Nested Table is executed always, but the Nested Table is only filled up in the first execution of the policy function. Using this technique the database only has to execute the expensive query once and its result can be used multiple times at almost no cost.
Using this policy function has exactly the same result, but the execution improved dramatically. Using this technique the database only has to execute the expensive query once per query instead of for every row.
This post is also available at the AMIS blog