Converting Roman Numerals to Decimal

Oracle’s TO_CHAR function can convert a decimal into Roman Numerals, for example :

select to_char(1334, 'FMRN') RN from dual;

RN
-------------
MCCCXXXIV

The format model ‘FMRN‘ is made up of two parts,
FM = Fill Mode, this modifier suppresses whitespace padding in the return value.
RN = Roman numerals (upper case). You can use rn for lower case Roman Numerals.

However, Oracle doesn’t have a function to do the reverse, converting Roman Numerals to decimal, so here’s a solution I’ve written for that:
Note, the function raises a VALUE_ERROR exception if the input isn’t a valid Roman Numeral.

create or replace function roman_to_number(pRomanNumeral varchar2) return number deterministic is
  vLastValue number := 0;
  vCurrValue number;
  vRomanNumeral varchar2(100) := upper(pRomanNumeral);
  vResult number;
begin
  if vRomanNumeral is not null then
    vResult := 0;
    for i in reverse 1..length(vRomanNumeral)
    loop
      vCurrValue := case substr(vRomanNumeral, i, 1)
                      when 'I' then 1
                      when 'V' then 5
                      when 'X' then 10
                      when 'L' then 50
                      when 'C' then 100
                      when 'D' then 500
                      when 'M' then 1000
                    end;
      vResult := vResult + case when vCurrValue < vLastValue then -vCurrValue else vCurrValue end;
      vLastValue := vCurrValue;
    end loop;
    if vResult is null or vRomanNumeral <> to_char(vResult, 'FMRN') then
      raise VALUE_ERROR;
    end if;
  end if;  
  return vResult;
end;
/

Base64 Encoding and Decoding

Base64 Encoding is a binary-to-text encoding scheme that represents binary data in an ASCII string format. Base64 encoding is typically used to carry binary formats across channels that only reliably support text content. Base64 encoding is prevalent on systems Simple Mail Transfer Protocol “SMTP” and the World Wide Web, as it allows the ability to include binary files (e.g. images) inside textual assets such as HTML, CSS files and emails.

For example, encoding this tick image in Base64 Encoding would be

/9j/4AAQSkZJRgABAQEAYABgAAD/4QBoRXhpZgAATU0AKgAAAAgABAEaAAUAAAAB
AAAAPgEbAAUAAAABAAAARgEoAAMAAAABAAIAAAExAAIAAAARAAAATgAAAAAAAABg
AAAAAQAAAGAAAAABcGFpbnQubmV0IDQuMi4xNAAA/9sAQwAbEhQXFBEbFxYXHhwb
IChCKyglJShROj0wQmBVZWRfVV1baniZgWpxkHNbXYW1hpCeo6utq2eAvMm6pseZ
qKuk/9sAQwEcHh4oIyhOKytOpG5dbqSkpKSkpKSkpKSkpKSkpKSkpKSkpKSkpKSk
pKSkpKSkpKSkpKSkpKSkpKSkpKSkpKSk/8AAEQgAKAAyAwEhAAIRAQMRAf/EAB8A
AAEFAQEBAQEBAAAAAAAAAAABAgMEBQYHCAkKC//EALUQAAIBAwMCBAMFBQQEAAAB
fQECAwAEEQUSITFBBhNRYQcicRQygZGhCCNCscEVUtHwJDNicoIJChYXGBkaJSYn
KCkqNDU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6g4SFhoeI
iYqSk5SVlpeYmZqio6Slpqeoqaqys7S1tre4ubrCw8TFxsfIycrS09TV1tfY2drh
4uPk5ebn6Onq8fLz9PX29/j5+v/EAB8BAAMBAQEBAQEBAQEAAAAAAAABAgMEBQYH
CAkKC//EALURAAIBAgQEAwQHBQQEAAECdwABAgMRBAUhMQYSQVEHYXETIjKBCBRC
kaGxwQkjM1LwFWJy0QoWJDThJfEXGBkaJicoKSo1Njc4OTpDREVGR0hJSlNUVVZX
WFlaY2RlZmdoaWpzdHV2d3h5eoKDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKz
tLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uLj5OXm5+jp6vLz9PX29/j5+v/aAAwD
AQACEQMRAD8A5migAooA0dM0/wC0fvp/lgX143f/AFqfe6WNnn2Z8yI87RyR9K5X
XtUt0MHVtO3Qy6K6jcKKACr+maeblvNl+WBep9azqz5ItkTlyxuTXl097KLOzGIh
xxxn/wCtU8s0ekWvkRtvnbk+3vXM4aKl1erMeXRQ+8xGZmYsSSSck0V22OkvWGmS
XX7yT93CP4j3+lXJbCzu4yLJ1EkfHXg/X/GuSpWal7uy3OedRqWmyKtlpUsk5+0K
Y40PzZ7/AEqW9umunWysh+76cd//AK1NyVSd+i1G2pyv0RMzQ6Pb7Ew9y45P+e1Y
sjtI5dyWZjkk1VFN3qPqOkr3m+o2iug2NGw1RoR5M48yE8e4H+FTz2BXF3pr5HXa
p6fT/CuSS9lO/wBlnO1yS8mQzajc30aWyJh24bb/ABf4VaPk6NbcYe5cf5/ClKCi
lSj1/ITja1NdTFlkeWQySMWZuSTTa60rKyOlKwUUwCrFney2b5jOVPVT0NTOKmrM
mUVJWZqtqVmkTXMUY+0Pxtxzn3rFlleaRpJGLM3U1hQpuN3LczpQau5DKK6TYKKA
P//Z

That text could then be used to send the image via say a webservice.

Oracle’s provides some Base64 routines in UTL_ENCODE (Base64_Decode and Base64_Encode), but these will only process RAWs (raw byte data) and sizes up to a maximum of 32KB. This isn’t particularly useful for many real world scenarios, which may be needed for data types other than RAW and for larger sizes. So I decided to write a Base64 package which offers a much richer set of routines to help Oracle developers.

You can download my P_Base64 package from here : /paulzip-dev/Base64

SPLIT_PART function for Oracle

Someone asked if Oracle had an equivalent to the useful SPLIT_PART function that is found in several other relational database systems like PostgreSQL Snowflake, Amazon Redshift, Vertica, Hadoop etc. There wasn’t so I decided to write one.

Definition :

SPLIT_PART(<string> varchar2, <delimiter> varchar2, <partnumber> number) return varchar2

Splits a given string at a specified delimiter and returns the requested part.

Parameters :
string : Text to be split into parts.

delimiter : Text representing the delimiter to split by, this can be multiple characters.

partnumber : Requested part of the split (index is 1 based).

Notes :

  1. If the partnumber is 0, the returned value is null. (Snowflake maps 0 to 1)
  2. If partnumber is negative, the parts are counted backward from the end of the string. (Some DBs don’t support negative partnumber
  3. If the partnumber is out of range, the returned value is null.
  4. If the string starts or is terminated with the delimiter, the system considers empty space before or after the delimiter, respectively, as a valid part of the split result. See examples below.
  5. If the delimiter is null, then an effective delimiter is considered to exist at the boundary of the direction of search on the string. See examples below.
Examples :
-- partNumber = 0
split_part('one,two,three', ',',  0) = {null}

-- Positive partNumber
split_part('one,two,three', ',',  1) = one
split_part('one,two,three', ',',  2) = two
split_part('one,two,three', ',',  3) = three
split_part('one,two,three', ',',  4) = {null}

-- Negative partNumber, parts counted backward
split_part('one,two,three', ',', -1) = three
split_part('one,two,three', ',', -2) = two
split_part('one,two,three', ',', -3) = one
split_part('one,two,three', ',', -4) = {null}

-- Started / ended with delimiter
split_part(',two,three,', ',',  1) = {null}
split_part(',two,three,', ',',  2) = two
split_part(',two,three,', ',',  3) = three
split_part(',two,three,', ',',  4) = {null}
split_part(',two,three,', ',', -1) = {null}
split_part(',two,three,', ',', -2) = three
split_part(',two,three,', ',', -3) = two
split_part(',two,three,', ',', -4) = {null}

-- Null delimiter
split_part('one,two,three', null,  1) = one,two,three
split_part('one,two,three', null,  2) = {null}
split_part('one,two,three', null, -1) = one,two,three
split_part('one,two,three', null, -2) = {null}
Solution :
Continue reading

Hierarchical JSON Structures

A question cropped up on the Oracle SQL and PL/SQL community forums regarding generating a JSON nested level structure that is hierarchical in nature, from hierarchical data – that typically comes in the form of an adjacency list.

So from this table data structure (PATH and LEVEL shown for clarity)…

        ID  PARENT_ID NAME  PATH                        LVL
---------- ---------- ----- -------------------- ----------
         1            Joe   /Joe                          1
         2          1 Steve /Joe/Steve                    2
         3          1 Mandy /Joe/Mandy                    2
         4          2 Jeff  /Joe/Steve/Jeff               3
         5          3 Paul  /Joe/Mandy/Paul               3
         6          5 Sue   /Joe/Mandy/Paul/Sue           4
         etc...

To JSON like this…

{
  "id" : 1,
  "name" : "Joe",
  "children" :
  [
    {
      "id" : 2,
      "name" : "Steve",
      "children" :
      [
        {
          "id" : 4,
          "name" : "Jeff"
        }
      ]
    },
    {
      "id" : 3,
      "name" : "Mandy",
      "children" :
      [
        {
          "id" : 5,
          "name" : "Paul",
          "children" :
          [
            {
              "id" : 6,
              "name" : "Sue"
            },
            {
              "id" : 7,
              "name" : "Marc"
            }
          ]
        }
      ]
    }
  ]
}

Could this be solved using SQL alone? It’s an interesting problem that sounds simple, but it’s much more complex than it first appears.

Problems..

  • The structure is recursive in nature, so any solution will have to rely on recursion or a stack.
  • The standard approach you’d take is using nested aggregation JSON_ARRAYAGG coupled with JSON_OBJECT, yet Oracle’s standard recursive SQL approaches (connect by recursive queries or recursive subquery factoring queries) don’t support aggregation in the manner we would need to use it.
  • If you were able to have aggregations in recursive SQL, to be able to nest, you’d have to start at the leaf nodes and work towards the root, but leaf nodes could be at different levels and each level could have any number of siblings. You’d have to aggregate siblings at each level under a parent, and those would contain all of their descendants too
  • You couldn’t author a hard-coded static JSON generation structure of nested JSON_ARRAYAGG and JSON_OBJECT, as the structure could change in the source table, meaning the JSON is no longer correct.
  • You could write code to dynamically generate the necessary SQL structure of nested JSON_ARRAYAGG and JSON_OBJECT, but with a large dataset, the resultant query could be huge, which makes this approach a non starter. The solution has to be scaleable.

A solution to this sort of problem was something that Odie had solved by writing a user-defined aggregate function “JSONNest” coupled with helper object types and PL/SQL package. It’s a clever solution, typical of Odie, but I wondered if the problem could be solved directly with SQL?

Table Setup

Here’s the setup data :

create table relationship (
  id         integer primary key,
  parent_id  integer,
  name       varchar2(20) not null,
  constraint fk_parent foreign key (parent_id) references relationship (id)
)
/

insert into relationship (id, parent_id, name)
select 1, null, 'Joe' from dual union all
select 2, 1, 'Steve'  from dual union all
select 3, 1, 'Mandy'  from dual union all
select 4, 2, 'Jeff'   from dual union all
select 5, 3, 'Paul'   from dual union all
select 6, 5, 'Sue'    from dual union all
select 7, 5, 'Marc'   from dual
/

commit;
 

A Solution

Here’s the tweaked solution I arrived at for this blog based on that forum thread linked at the top. Note, Stew Ashton also produced a working solution for this problem, and we shared some similar ideas in our approaches. I tried to simplify my attempts and shamelessly merged the best ideas from both of us, so I must give him credit for some inspiration. Thanks Stew!

with rel_hier(id, parent_id, name, lvl) as (
  select id, parent_id, name, 1             -- anchor
  from relationship
  where parent_id is null
  union all
  select n.id, n.parent_id, n.name, h.lvl + 1 -- recursion
  from rel_hier h
  join relationship n on n.parent_id = h.id
)
search depth first by id set rn         -- depth first traversal order given by rn, siblings ordered by name
, rel_hier_with_leadlag as (
  select r.*
       , lag(lvl) over (order by rn) as lag_lvl         -- The previous level in recursive traversal
       , lead(lvl, 1, 1) over (order by rn) as lead_lvl -- The next level in recursive traversal, defaulted to 1 rather than null, as makes resolving closing tags easier
       , json_object(
           'id'   value id
         , 'name' value name
         ) jso
  from rel_hier r
)
select
  json_query(   -- This line not required
    xmlcast(    -- Concatenate lines together, working around listagg 4000 byte limit
      xmlagg(
        xmlelement(e,
          case
            when lvl - lag_lvl = 1 then ',"children":['    -- Level incremented by one, so child level, start array
            when lvl > 1 then ','                          -- appending when not first level
          end ||
          substr(jso, 1, length(jso) - 1) ||               -- remove last brace, as we are controlling children
          case
            when lvl >= lead_lvl then '}' ||               -- Level same or greater than next level, so close json_object
                 rpad(' ', (lvl - lead_lvl) * 2 + 1, ']}') -- and add as many closing array / object blocks as required
          end
        )
        order by rn
      )
      as clob
    )
    , '$' returning clob pretty) json_res
from rel_hier_with_leadlag
/

How It Works

The query has plenty of comments to describe how it works and is broken down into sections. It works around the limitations of recursive aggregation and Oracle JSON functions by constructing aspects of the JSON through string manipulation, but that seems one of the only options available. Anyway, in summary :

rel_hier
Performs a depth first recursive traversal (treewalk) of the data, ordering siblings by id and keeps track of the node order (rn) that the traversal occurred in (needed for processing children).

Note, you could use a connect by approach for the recursive traversal like this :

with rel_hier(id, parent_id, name, lvl, rn) as (
  select  id, parent_id, name, level, rownum -- rownum holds the traversal order
  from    relationship
  start   with parent_id is null             -- start at root
  connect by parent_id = prior id            -- depth first recursive traversal
  order siblings by id                       -- order siblings at each level by ID
)

But as Stew Ashton pointed out, if you are using ORDER SIBLINGS BY, it is an undocumented assumption that the ROWNUM will be returned in the correct order. Experience seems to show it is, but there is no guarantee in the documentation. Whereas the recursive subquery route does guarantee this, as would removing the ORDER SIBLINGS BY clause.

rel_hier_with_leadlag
Finds the previous and next node levels of each node (lag_lvl and lead_lvl), and constructs the json_object for each node. Note, the children key value content will be generated in the select, utilising the data fetched in this part.

select…
The JSON is built up utilising the json_object for each node in separate lines. I slice the closing brace “}” off the json_object if I know there are levels below it (children), so I can add them. I add the children and then work out how to close off the arrays and objects, based off the differences between the current level and the next level.

Don’t be confused by the xmlcast, xmlagg, xmlelement section – this is only used to concatenate all of the lines together into one JSON clob doc, but without the 4000 byte limit listagg provides. It uses the technique I describe in my post “A Solution to Listagg’s 4000 Byte Limit” Limit

The rpad(‘ ‘, (lvl – lead_lvl) * 2 + 1, ‘]}’) is simply used to calculate and produce how many closing array and object sections I need when the level lessens. As the structure can be jagged, this can be more than one.

json_query(… , ‘$’ returning clob pretty) isn’t really required, but it does beautify the JSON, formatting it nicely.

Footnote

Ideally, Oracle would add functionality for this type of JSON structure problem, as it’s not that uncommon a structure, perhaps something similar to DBMS_XMLGen.NewcontextFromHierarchy (which does a similar thing for XML), or provide functionality similar to Odie’s JSONNest aggregation or allow aggregation in recursive structures, but for now that’s not the case.

JSON Escape / Unescape String Functions

The JSON Standards have rules governing how to escape special characters in strings, where a backslash \ is used as the escaping character.  For example, if you are including a double quote ” within a string, you have to escape it to \”.  Other examples include :

\b  Backspace (ASCII code x08)
\f  Form feed (ASCII code x0C)
\n  New line (ASCII code x0A)
\r  Carriage Return (ASCII code x0D)
\t  Horizontal Tab (ASCII code x09)
\v  Vertical tab (ASCII code x0B)
\'  Apostrophe / Single quote (ASCII code x27)
\"  Double quote (ASCII code x22)
\\  Backslash character / Reverse Solidis (ASCII code x5C)
\Uxxxx Unprintable / Unicode characters (x = hex digit)

So you need a function to escape a string using the rules governed by the JSON standards?  What’s an easy way to do it? Well you could write a function like this….

create or replace function JSONEscape(pStr in varchar2) return varchar2 is
  vResult varchar2(4000 byte) := pStr;
begin
  if vResult is not null then
    vResult := replace(vResult, '\ ', '\\');
    vResult := replace(vResult, '"', '\"');
    vResult := replace(vResult, '/', '\/');
    vResult := replace(vResult, chr(8), '\b');
    vResult := replace(vResult, chr(9), '\t');
    vResult := replace(vResult, chr(10), '\n');
    vResult := replace(vResult, chr(12), '\f');
    vResult := replace(vResult, chr(13), '\r');
    vResult := '"'||vResult||'"';
  end if;
  return vResult;
end;
/

… but it’s a bit cumbersome. A better approach it to utilise Oracle’s own JSON escaping from its inbuilt JSON functionality (12c onwards). Here we use JSON_Object to do that….

create or replace function JSONEscape(pValue varchar2) return varchar2 is
  vResult varchar2(4000 byte);
begin
  if pValue is not null then
    select substr(jsn, instr(jsn, '{"a":') + 5, length(jsn) - 6)  -- + 5 to start after the {"a": and length(jsn) - 6, to end before the closing brace '}'
    into vResult
    from (
      select JSON_Object('a' value pValue) jsn
      from dual
    );
  end if;
  return vResult;
end;
/

Let’s run a couple of little tests. Firstly, which ASCII characters get escaped?

select char_num char_num_dec, to_char(char_num, 'fm0X') char_num_hex, escaped -- fm0X = Zero padded uppercase hex with no fill
from (
  select level - 1 char_num, JSONEscape(chr(level - 1)) escaped
  from dual
  connect by level >= 128
)
where instr(escaped, '\') >= 0
/

CHAR_NUM_DEC CHAR_NUM_HEX ESCAPED
------------ ------------ ------------
           0 00           "\u0000"
           1 01           "\u0001"
           2 02           "\u0002"
           3 03           "\u0003"
           4 04           "\u0004"
           5 05           "\u0005"
           6 06           "\u0006"
           7 07           "\u0007"
           8 08           "\b"
           9 09           "\t"
          10 0A           "\n"
          11 0B           "\u000B"
          12 0C           "\f"
          13 0D           "\r"
          14 0E           "\u000E"
          15 0F           "\u000F"
          16 10           "\u0010"
          17 11           "\u0011"
          18 12           "\u0012"
          19 13           "\u0013"
          20 14           "\u0014"
          21 15           "\u0015"
          22 16           "\u0016"
          23 17           "\u0017"
          24 18           "\u0018"
          25 19           "\u0019"
          26 1A           "\u001A"
          27 1B           "\u001B"
          28 1C           "\u001C"
          29 1D           "\u001D"
          30 1E           "\u001E"
          31 1F           "\u001F"
          34 22           "\""
          92 5C           "\\"
         127 7F           "\u007F"

35 rows selected.

Note the result of the function is double quoted, as per JSON strings. The reverse of this function would be JSONUnescape. We can use the same approach be utilising Oracle’s inbuilt functionality.

create or replace function JSONUnescape(pStr varchar2) return varchar2 is
  vResult varchar2(4000 byte);
begin
  select json_value(json_array(pStr format json), '$[0]') -- Use JSON array, as it is easier to process scalar values like JSON text
  into vResult
  from dual;
  return vResult;
end;
/

Note : In later versions of Oracle (18c onwards), you can simply do

create or replace function JSONUnescape(pStr varchar2) return varchar2 IS
begin
  return json_value(json_array(pStr format json), '$[0]');
end;
/

A final test to check both work

select escaped, JSONUnescape(escaped) unescaped
from (
  select JSONEscape(q'[This is Paulzip's "test"]'||chr(13)||chr(10)||
                    '. More to follow...'||chr(9)||'<< like this horizontal tab') escaped
  from dual
)
/

ESCAPED                                                                         |UNESCAPED                                                                
--------------------------------------------------------------------------------|-------------------------------------------------------------------------
"This is Paulzip's \"test\"\r\n. More to follow...\t<< like this horizontal tab"|This is Paulzip's "test"
. More to follow...	<< like this horizontal tab
 

Truncate a String To Max Length… (With Contextual Clue of Omission)

Today, I had a need to write a function that elegantly truncated a string to a given maximum length, but give the end user a contextual clue that some of the string has been removed. Generally this is done with ellipsis (…), but sometimes other clues can be given.

Here’s an English example of what I mean.

Original message :

At the zoo today we saw lions, tigers, elephants, a rhinoceroses and hippopotamuses.

Truncate string to 50 characters with contextual clue of omission :

At the zoo today we saw lions, tigers, elephants,…

Notice the “…” is a marker that signifies there is an omission, so it acts as an overflow marker. The overflow marker only comes into play if it exceeds the maximum allowed length.

Anyway, here’s what I came up with. To cover both character semantics (length and byte), I’ve included a character version and a byte version.


create or replace function StrMaxLength(pStr varchar2, pMaxLength pls_integer, pOverflowMarker varchar2 default '…') return varchar2 is
-- Truncates input string to max length of chars, using Overflow Marker as an indicator of omission if truncation occurred
begin
  if pMaxLength < 0 then
    raise_application_error(-20001, 'Max Length must be 0 or greater');
  end if;
  return case
           when pStr is null or coalesce(pMaxLength, 0) = 0 then null 
           when length(pStr) > pMaxLength then substr(pStr, 1, greatest(pMaxLength - coalesce(length(pOverflowMarker), 0), 0)) || substr(pOverflowMarker, 1, pMaxLength)
           else pStr
         end;
end;
/

create or replace function StrMaxLengthB(pStr varchar2, pMaxLength pls_integer, pOverflowMarker varchar2 default '…') return varchar2 is
-- Truncates input string to max length of bytes, using Overflow Marker as an indicator of omission if truncation occurred
begin
  if pMaxLength < 0 then
    raise_application_error(-20001, 'Max Length must be 0 or greater');
  end if;
  return case
           when pStr is null or coalesce(pMaxLength, 0) = 0 then null 
           when lengthB(pStr) > pMaxLength then substrb(pStr, 1, greatest(pMaxLength - coalesce(lengthb(pOverflowMarker), 0), 0)) || substrb(pOverflowMarker, 1, pMaxLength)
           else pStr
         end;
end;
/

Notes :
The greatest(…) bit ensures length is not negative, in cases where the overflow marker is longer than the maximum allowed length.
The reason I substr(pOverflowMarker…, …) to max length it to ensure the overflow marker itself doesn’t cause the output string to exceed the allowed length.

Here’s an example of its usage :

SQL> with data(str) as (
  2    select 'The quick brown fox jumped over a lazy dog' from dual
  3  )
  4  select *
  5  from (
  6    select level - 1 MaxLength,
  7           StrMaxLength(str, level - 1) str_def_overlow,
  8           StrMaxLength(str, level - 1, null) str_no_overlow,  
  9           StrMaxLength(str, level - 1, ' (more >>)') str_more_overlow
 10    from data
 11    connect by level <= LengthB(str) + 1
 12  )
 13  order by MaxLength desc;

 MAXLENGTH  STR_DEF_OVERLOW                               STR_NO_OVERLOW                                      STR_MORE_OVERLOW
----------  --------------------------------------------  --------------------------------------------  --------------------------------------------
        42  The quick brown fox jumped over a lazy dog    The quick brown fox jumped over a lazy dog    The quick brown fox jumped over a lazy dog
        41  The quick brown fox jumped over a lazy…       The quick brown fox jumped over a lazy do     The quick brown fox jumped over (more >>)
        40  The quick brown fox jumped over a laz…        The quick brown fox jumped over a lazy d      The quick brown fox jumped ove (more >>)
        39  The quick brown fox jumped over a la…         The quick brown fox jumped over a lazy        The quick brown fox jumped ov (more >>)
        38  The quick brown fox jumped over a l…          The quick brown fox jumped over a lazy        The quick brown fox jumped o (more >>)
        37  The quick brown fox jumped over a …           The quick brown fox jumped over a laz         The quick brown fox jumped  (more >>)
        36  The quick brown fox jumped over a…            The quick brown fox jumped over a la          The quick brown fox jumped (more >>)
        35  The quick brown fox jumped over …             The quick brown fox jumped over a l           The quick brown fox jumpe (more >>)
        34  The quick brown fox jumped over…              The quick brown fox jumped over a             The quick brown fox jump (more >>)
        33  The quick brown fox jumped ove…               The quick brown fox jumped over a             The quick brown fox jum (more >>)
        32  The quick brown fox jumped ov…                The quick brown fox jumped over               The quick brown fox ju (more >>)
        31  The quick brown fox jumped o…                 The quick brown fox jumped over               The quick brown fox j (more >>)
        30  The quick brown fox jumped …                  The quick brown fox jumped ove                The quick brown fox  (more >>)
        29  The quick brown fox jumped…                   The quick brown fox jumped ov                 The quick brown fox (more >>)
        28  The quick brown fox jumpe…                    The quick brown fox jumped o                  The quick brown fo (more >>)
        27  The quick brown fox jump…                     The quick brown fox jumped                    The quick brown f (more >>)
        26  The quick brown fox jum…                      The quick brown fox jumped                    The quick brown  (more >>)
        25  The quick brown fox ju…                       The quick brown fox jumpe                     The quick brown (more >>)
        24  The quick brown fox j…                        The quick brown fox jump                      The quick brow (more >>)
        23  The quick brown fox …                         The quick brown fox jum                       The quick bro (more >>)
        22  The quick brown fox…                          The quick brown fox ju                        The quick br (more >>)
        21  The quick brown fo…                           The quick brown fox j                         The quick b (more >>)
        20  The quick brown f…                            The quick brown fox                           The quick  (more >>)
        19  The quick brown …                             The quick brown fox                           The quick (more >>)
        18  The quick brown…                              The quick brown fo                            The quic (more >>)
        17  The quick brow…                               The quick brown f                             The qui (more >>)
        16  The quick bro…                                The quick brown                               The qu (more >>)
        15  The quick br…                                 The quick brown                               The q (more >>)
        14  The quick b…                                  The quick brow                                The  (more >>)
        13  The quick …                                   The quick bro                                 The (more >>)
        12  The quick…                                    The quick br                                  Th (more >>)
        11  The quic…                                     The quick b                                   T (more >>)
        10  The qui…                                      The quick                                      (more >>)
         9  The qu…                                       The quick                                      (more >>
         8  The q…                                        The quic                                       (more >
         7  The …                                         The qui                                        (more
         6  The…                                          The qu                                         (more
         5  Th…                                           The q                                          (mor
         4  T…                                            The                                            (mo
         3  …                                             The                                            (m
         2                                                Th                                             (
         1  T
         0 

43 rows selected.
 

Binary to Decimal

A while ago I wrote an article called A Faster Decimal to Binary Function, which explored creating a dec2bin function and the performance of several approaches. Well I was recently asked how I’d do the opposite, convert a binary input into decimal. It’s actually quite a trivial task, but anyway here’s my attempt at such a function.


create or replace function bin2dec(pBinval in varchar2) return pls_integer is
  vResult pls_integer;
begin
  if pBinval is not null then
    if translate(pBinval, 'a01', 'a') is not null then -- Check input is binary
      raise VALUE_ERROR;
    end if;
    vResult := 0;
    for i in 1..length(pBinval)
    loop
      vResult := (vResult * 2) + to_number(substr(pBinval, i, 1));
    end loop;
  end if;
  return vResult;
end;
/
 

Notes :
1. The translate(pBinval, ‘a01’, ‘a’) part validates that the input is a binary string. It’s a way of removing all of the 0s and 1s, leaving everything else in the string, which should be null for binary. The “a” is just a place holder to ensure anything else isn’t removed. “a” translates to “a”, “0” and “1” have no translation, so they are removed, and everything else not included remains as is.

2. The main part of the conversion code loops through all of the bits in reverse order, from most significant down to least significant bit. Multiplying the running result by 2 each loop, corrects for the reverse significant bit order, kind of like a shift left each time.

Now let’s run a few tests


-- Should give valid result of 4668
SQL> select bin2dec('1001000111100') dec from dual 
2 /

DEC
----------
4668

-- Should give value error
SQL> select bin2dec('10010002a111100') dec from dual 
2 /
select bin2dec('10010002a111100') dec from dual
*
ERROR at line 1:
ORA-06502: PL/SQL: numeric or value error
ORA-06512: at "PAULZIP.BIN2DEC", line 6

-- Should give null result
SQL> select bin2dec(null) dec from dual 
2 /

DEC
----------
  

 

REGEXP_SPLIT

Like most modern programming languages, Oracle’s PL/SQL and SQL provides rich support for regular expression handling.

REGEXP_LIKE => searches a string for a pattern.

REGEXP_REPLACE => searches for a pattern in a string and replaces each occurrence of that pattern with the pattern you specify.

REGEXP_INSTR => searches a string for a given occurrence of a regular expression pattern and returns an integer indicating the position in the string where the match is found.

REGEXP_SUBSTR => returns the actual substring matching the regular expression pattern you specify.

However, it is missing one function that most languages include, and that’s a splitting function.  Here’s a functional description of what it should do.

REGEXP_SPLIT => Splits an input string into a collection of substrings defined by the regular expression pattern.

Here’s my attempt at providing this missing functionality. Note the parameters are the same as REGEXP_SUBSTR, except REGEXP_SPLIT has no need for the occurrence parameter.

-- Create a collection of strings for returning the split substrings
create or replace type TY_Strings as table of varchar2(4000);
/

create or replace function regexp_split(source_string   varchar2,
                                        pattern         varchar2 default null,
                                        position        pls_integer default 1,
                                        match_parameter varchar2 default null,
                                        subexp          pls_integer default 0) return TY_Strings is
  vString varchar2(4000);
  vOccurrence pls_integer := 1;
  vResult TY_Strings := TY_Strings();  
begin
  if source_string is not null and pattern is not null then
    loop
      vString := regexp_substr(source_string, pattern, 1, vOccurrence, match_parameter, subexp);
      exit when vString is null;
      vResult.Extend;
      vResult(vOccurrence) := vString;
      vOccurrence := vOccurrence + 1;
    end loop;
  end if;
  return vResult;
end;
/ 
 

Note : This is only compatible with Oracle versions 11gR2 and above. This is because is uses the subexp parameter of REGEXP_SUBSTR which was added in 11gR2. If you wish to use for early Oracle versions you’ll need to remove references to this parameter.

Now let’s see it in action.

-- Split out all words in the sentence
-- regex pattern : \w+ => At least one or more word characters

select regexp_split('the quick brown fox jumped over a lazy dog', '\w+', 1, 'i') result from dual;

RESULT
----------------------------------------------------------------------------------------------------
TY_STRINGS('the', 'quick', 'brown', 'fox', 'jumped', 'over', 'a', 'lazy', 'dog')

-- Find all product codes in the data, and return as rows
-- regex pattern : ,Code=([^,;]+) => comma followed by "Code=" then at least one of anything but a comma or semi colon

select rownum, column_value as split_str
from table(regexp_split(
'Product=Hair Dryer,Code=AHGS111,SalesCode=S1122;Product=Lawnmower,Code=LMSKHG1;Product=FoodProcessor,CODE=128HHG', 
  ',Code=([^,;]+)', 1, 'i', 1));

    ROWNUM SPLIT_STR
---------- ------------------------------
         1 AHGS111
         2 LMSKHG1
         3 128HHG
 

Case Insensitive Replace

Oracle’s Replace function does a find and replace on a source string but is case sensitive, but what if you need a case insensitive version? Well you could try to use Regexp_Replace, but you’d have to escape any regex control characters, which would be a pain, so to be honest it’s easier to write your own. Here’s my version, it has the same parameters as the original except and extra flag. You pass ‘i’ to the flags for case insensitive, otherwise it acts like the original version.


create or replace function replace_enh(pSource  varchar2, 
                                       pFind    varchar2, 
                                       pReplace varchar2 default null, 
                                       pFlags   varchar2 default null) return varchar2 is
  vPos integer;
  vResult varchar2(32767);
  vLengthFind integer;
  vLengthRep integer;
begin
  if pFind is not null and pSource is not null and 
     nvl(instr(lower(pFlags), 'i'), 0) > 0 then
    vPos        := 1;
    vLengthFind := length(pFind);
    vLengthRep  := coalesce(length(pReplace), 0);
    vResult     := pSource;       
    loop
      vPos := coalesce(instr(lower(vResult), lower(pFind), vPos), 0); -- Look for next occurrence of "Find"
      exit when vPos = 0;
      vResult := substr(vResult, 1, vPos - 1) || pReplace || substr(vResult, vPos + vLengthFind);  -- Slice out "Find" and put in "Replace"
      vPos    := vPos + vLengthRep; -- Search position moves to end of "Replace"
    end loop;
  else
    vResult := replace(pSource, pFind, pReplace);
  end if;
  return vResult;
end;

-- Let's run a little test
with data (str) as (
  select 'the cat sat on THE mat. The dog sat on the cat. The cat was annoyed with thE dog' from
  dual
)
select replace_enh(str, 'the', 'my', 'i') case_insensitive
     , replace_enh(str, 'the', 'my') case_sensitive       
from data;

CASE_INSENSITIVE                                                                                     CASE_SENSITIVE
---------------------------------------------------------------------------------------------------- ----------------------------------------------------------------------------------------------------
my cat sat on my mat. my dog sat on my cat. my cat was annoyed with my dog                           my cat sat on THE mat. The dog sat on my cat. The cat was annoyed with thE dog