In Part 1, I discussed an approach to converting accented (diacritic) characters into unaccented ones, using Oracle’s built-in NLSSORT function. I also mentioned its flaw: it loses capitalisation.
Another approach is to use Oracle’s TRANSLATE function, which replaces each character found in one list with the character at the same position in a second list. The only real work is building the list of diacritic characters and their unaccented equivalents. Fortunately, I’ve done that for you: the example below shows the lists in use, and also how you could wrap things up in an easy-to-use function.
Note: There are many diacritic characters in the world, so I’ve picked mainly Latin ones here, but you can add to the example as you see fit. Your list may also depend on whether you are using a Unicode character set, so tweak it to suit your needs.
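If you do extend the lists, keep in mind that TRANSLATE pairs the two strings purely by position, and any character in the first list with no counterpart in the second is silently removed from the result. The following is a minimal sketch of that behaviour and of a quick length check, using deliberately short lists of my own (not the full ones used in this post) and assuming your database and client character sets can represent the characters involved.

```sql
-- The fourth character of the "from" list (ý) has no partner in the
-- three-character "to" list, so TRANSLATE drops it from the result.
select translate('Žluťoučký', 'Žťčý', 'Ztc') as misaligned
from dual;

-- Quick sanity check after editing the lists: LENGTH counts characters,
-- so the two values should be equal when the lists are aligned.
select length('Žťčý') as from_len,
       length('Ztcy') as to_len
from dual;
```

With the mismatched lists the first query should come back as Zlutouck, with the final letter missing, which is the sort of quiet damage the length check is meant to catch.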
SQL> select
       translate(
         'Necht již hríšné saxofony dáblu rozezvucí sín údesnými tóny waltzu, tanga a quickstepu', -- Example text
         'ÁÀÄÂĀÅǍĄĂÃĈĆÇČĐĎÊËÉÈĚĒĖĘĞĢĜĤÌĮĪÏÎÍĴĶĻĹĽÑŃŇŅÒÖÓØŐÕÔŔŘŠŚŜŞȘŤŢȚÜÙÚÛŪŨŲŮŰŴÝŸŶŹŽŻäàāăąǎåáâãćçčĉđďèéêëěęėēĝğģĥìīįíîïıĵķĺłļľñńņňőøöòóôõřŕșšşśŝťțţūůųũüùúûűŵŷýÿžżź',
         'AAAAAAAAAACCCCDDEEEEEEEEGGGHIIIIIIJKLLLNNNNOOOOOOORRSSSSSTTTUUUUUUUUUWYYYZZZaaaaaaaaaaccccddeeeeeeeeggghiiiiiiijkllllnnnnooooooorrssssstttuuuuuuuuuwyyyzzz'
       ) accent_less
     from dual;

ACCENT_LESS
--------------------------------------------------------------------------------------
Necht jiz hrisne saxofony dablu rozezvuci sin udesnymi tony waltzu, tanga a quickstepu

SQL> create or replace function AccentLess(pStr varchar2) return varchar2 is
     begin
       -- Map each accented character to the unaccented one at the same position
       return translate(
         pStr,
         'ÁÀÄÂĀÅǍĄĂÃĈĆÇČĐĎÊËÉÈĚĒĖĘĞĢĜĤÌĮĪÏÎÍĴĶĻĹĽÑŃŇŅÒÖÓØŐÕÔŔŘŠŚŜŞȘŤŢȚÜÙÚÛŪŨŲŮŰŴÝŸŶŹŽŻäàāăąǎåáâãćçčĉđďèéêëěęėēĝğģĥìīįíîïıĵķĺłļľñńņňőøöòóôõřŕșšşśŝťțţūůųũüùúûűŵŷýÿžżź',
         'AAAAAAAAAACCCCDDEEEEEEEEGGGHIIIIIIJKLLLNNNNOOOOOOORRSSSSSTTTUUUUUUUUUWYYYZZZaaaaaaaaaaccccddeeeeeeeeggghiiiiiiijkllllnnnnooooooorrssssstttuuuuuuuuuwyyyzzz'
       );
     end;
     /

Function created.

SQL> select AccentLess('Necht již hríšné saxofony dáblu rozezvucí sín údesnými tóny waltzu, tanga a quickstepu') accent_less
     from dual;

ACCENT_LESS
--------------------------------------------------------------------------------------
Necht jiz hrisne saxofony dablu rozezvuci sin udesnymi tony waltzu, tanga a quickstepu
For those of you wondering, the test text is a Czech pangram (a sentence using every letter of a particular language’s alphabet). It translates to “Let the sinful saxophones of devils finally make the hall resonate with the frightful tones of waltz, tango and quickstep.”
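One place a function like AccentLess tends to earn its keep is accent-insensitive matching. The following is only a sketch of that idea: the customers data is made up inline for illustration, and it assumes the AccentLess function from this post has been created and that your character set can store the names shown.

```sql
-- Hypothetical sample data, built inline so the query is self-contained.
with customers as (
  select 'Dvořák' as customer_name from dual union all
  select 'Novák'  as customer_name from dual union all
  select 'Smith'  as customer_name from dual
)
select customer_name
from   customers
-- Strip accents and fold case on both sides before comparing.
where  upper(AccentLess(customer_name)) = upper(AccentLess('dvorak'));
```

Because the accents are stripped from both sides and UPPER takes care of case, a search for dvorak should pick up the Dvořák row. On a large table, bear in mind that wrapping a column in a function like this generally stops a plain index on that column from being used.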