CLOB Concatenation

Adding string data to a CLOB in Oracle can be a tad slow, when compared to other string concatenation situations.  I first became aware of this concatenation performance issue from my own coding experience, but also from threads on Oracle Technical Network and a blog post by Jonathan Lewis.

Continue reading


How to Convert Accented Characters into Non Accented Ones

A while ago a user on an Oracle forum I post to asked how he could convert accented characters into non accented characters (note, he didn’t care about case sensitivity, just wanted rid of the accents).  This was because he was creating a database role from this via an Web API and Oracle was raising an ORA-00911 invalid character whenever an API call was made with an accented character.

I knew that when Oracle did binary sorts on strings, it needed to forget about accents, meaning it had such functionality built in somewhere.  So the solution I gave him ended up leveraging this.

Anyway, here’s an example of the solution using a full list of European accented characters.

with t as (
  select 'ÂÃÄÀÁÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ' str 
  from dual
select str, utl_raw.cast_to_varchar2(nlssort(str, 'nls_sort=binary_ai')) str2 
from t

STR                                                            STR2 
-------------------------------------------------------------- ---------------------------------------------------------------
ÂÃÄÀÁÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ aaaaaaæceeeeiiiiðnoooooouuuuyþssaaaaaaæceeeeiiiiðnoooooouuuuyþy 
1 row selected.

I use Oracle’s inbuilt nlssort functionality which converts a string into a raw (string of bytes) used for sorting, and I simply specify that raw is to be accent and case insensitive. I then simply cast that raw back into a varchar2 – which results in replacing accented chars with their non accented, non case sensitive equivalent!  If he wanted this uppercase, he could simply UPPER(..) it.

Note : Some special characters in Icelandic / Turkish have no unaccented equivalent, but then there’s nothing we can do about those.