Null Member in Collection?

If you’re working with collections you sometimes need to know if some value is a member of it. Fortunately, Oracle provided the very useful member of function, it’s highly optimised and will out-perform other methods, shown below :

create type TY_Numbers as table of number -- create a collection of numbers
/

declare
  procedure RunTests(pSampleSize integer, pTestCount integer) is
    BIG_NUMBER constant number := Power(2, 32); -- used to prevent timer wrapping
    vNums TY_Numbers;
    vStart number;
    m number;
    vDummy number;
    procedure TimingInfo(pTest varchar2 default null) is
    begin
      if pTest is not null then
        dbms_output.put_line(pTest||': '||to_char(mod(dbms_utility.get_time - vStart + BIG_NUMBER, BIG_NUMBER))||'. '||m||' members found');
      end if;
      m := 0; -- reset
      vStart := dbms_utility.get_time;
    end;
  begin
    dbms_output.put_line('----------- Tests, sample size =  '||pSampleSize);
    select level
    bulk collect
    into vNums
    from dual
    connect by level < pSampleSize 
    order by dbms_random.value;
    TimingInfo; -- initilises timing
  -----------
    for n in 1..pTestCount
    loop
      if n member of vNums then
        m := m + 1;
      end if;
    end loop;
    TimingInfo('Member of');
  -----------
    for n in 1..pTestCount
    loop
      for k in vNums.First..vNums.last
      loop
        if n = vNums(k) then
          m := m + 1;
          exit;
        end if;
      end loop;
    end loop;
    TimingInfo('Iterate');
  -----------
    for n in 1..pTestCount
    loop
      if TY_Numbers(n) multiset intersect vNums = TY_Numbers(n) then
        m := m + 1;
      end if;
    end loop;
    TimingInfo('Multiset intersect');
  -----------
    for n in 1..pTestCount
    loop
      if cardinality(TY_Numbers(n) multiset intersect vNums) = 1 then
        m := m + 1;
      end if;
    end loop;
    TimingInfo('Cardinality multiset intersect');
  -----------
    for n in 1..pTestCount
    loop
      select n
      into vDummy
      from table(vNums)
      where column_value = n;
      m := m + 1;
    end loop;
    TimingInfo('SQL table function');
  --------------
    for n in 1..pTestCount
    loop
      if cardinality(TY_Numbers(n) multiset except vNums) = 0 then
        m := m + 1;
      end if;
    end loop;
    TimingInfo('Cardinality multiset except');
  --------------
    for n in 1..pTestCount
    loop
      if TY_Numbers(n) submultiset of vNums then
        m := m + 1;
      end if;
    end loop;
    TimingInfo('Submultiset of');
  end;
begin
  RunTests(1000, 100);
  RunTests(10000, 1000);
  RunTests(100000, 10000);
end;
/

/*
----------- Tests, sample size = 1000, member checks = 100-----------
Member of: 1
Iterate: 0
Multiset intersect: 1
Cardinality multiset intersect: 1
SQL table function: 4
Cardinality multiset except: 0
Submultiset of: 1
----------- Tests, sample size = 10000, member checks = 1000-----------
Member of: 20
Iterate: 36
Multiset intersect: 91
Cardinality multiset intersect: 90
SQL table function: 235
Cardinality multiset except: 53
Submultiset of: 36
----------- Tests, sample size = 100000, member checks = 10000-----------
Member of: 1989
Iterate: 3568
Multiset intersect: 11979
Cardinality multiset intersect: 11893
SQL table function: 24417
Cardinality multiset except: 7826
Submultiset of: 3777
*/

As you can see, Member of beats all the other methods hands down. So that’s what we should always use, right?  Well no, not quite, there’s one small problem with it….

declare
  vNums TY_Numbers := TY_Numbers(1, 2, 3, 4, null, 5);
  procedure OP(pStr varchar2, pFlag boolean) is
  begin
    DBMS_OUTPUT.PUT_LINE ( pStr || ' => ' ||
           case
             when pFlag then 'True'
             when not pFlag then 'False'
             else 'null'
           end );
  end;
begin
  OP('null member of vNums', null member of vNums);
end;
/

/*
null member of vNums => null
*/

Null is not a member of a collection containing null?!!! Well funnily enough, Oracle do qualify this fact in their member of documentation..

A member_condition is a membership condition that tests whether an element is a member of a nested table. The return value is TRUE if expr is equal to a member of the specified nested table or varray. The return value is NULL if expr is null or if the nested table is empty.

OK, so how do we find if null is a member of a collection – in other words, does the collection contain a null? Well we could iterate the collection, but that seems inefficient given the tests above, as does using the table operator. Let’s alter the previous code for our other test cases and see which of these gives the expected result.


declare
  vNums TY_Numbers := TY_Numbers(1, 2, 3, 4, null, 5);
  procedure OP(pStr varchar2, pFlag boolean) is
  begin
    DBMS_OUTPUT.PUT_LINE ( pStr || ' => ' ||
           case
             when pFlag then 'True'
             when not pFlag then 'False'
             else 'null'
           end  ||' '||
           case when pFlag then '(CORRECT)' else '(INCORRECT)' end);
  end;
begin
  OP('member of', null member of vNums);
  OP('multiset intersect', TY_Numbers(null) multiset intersect vNums = TY_Numbers(null));
  OP('cardinality multiset intersect', cardinality(TY_Numbers(null) multiset intersect vNums) = 1);
  OP('cardinality multiset except', cardinality(TY_Numbers(null) multiset except vNums) = 0);
  OP('submultiset', TY_Numbers(null) submultiset of vNums);
end;
/

/*
member of => null (INCORRECT)
multiset intersect => null (INCORRECT)
cardinality multiset intersect => True (CORRECT)
cardinality multiset except => True (CORRECT)
submultiset => null (INCORRECT)
*/

So it is only the cardinality multiset approaches that work. Just for the record, cardinality is a multiset operator that provides a safe way of counting items in a collection, as it returns 0 on uninitialised / empty collections.

cardinality(TY_Numbers(null) multiset intersect vNums) = 1)

Is just saying, compares two collections, returning a collection containing the elements that are in both and count those elements.

cardinality(TY_Numbers(null) multiset except vNums) = 0)

Is just saying, compares two collections, returning a collection containing the distinct elements in the left that are not in the right, and count those elements.

Incidentally, the latter is the faster of the two, which makes sense as in this case the left hand side only has one member to find to make an empty set, after which I suspect Oracle short circuits the result.

Anyway, if you need to find if a null exists in a collection, your best bet for performance is probably…

cardinality(<collection1>(null) multiset except <collection2>) = 0)
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s