200 Words – Parsing Non-Alphanumeric Characters and Ligatures in SQL Server
February 6, 2009 by: Brendan

I’m always running into situations where special characters need to be ignored when my applications communicate with SQL Server. I have library functions that I’ve written that parse out special characters, whether it be a specific set that are unique to the application, or simply all non-alphanumeric. While the functions work for simple cases, I often run into scenarios where I need to manage these filters at the View level. Having even a single function within your View can affect the performance of your queries. Instead, I leave the base View alone, and do the formatting locally within the stored procedure.
I came across an elegant solution, putting character codes into a static table (Utility.CharCode in this case) within your database. Then, you can update your variable or field with a simple UPDATE statement:
UPDATE Utility.CharCode
SET @txt_value = REPLACE(@txt_value, CHAR(CharCode), ”)
However, the two-letter combinations “oe” and “ae” are translated to specific character codes. Codes 140, 156, 198, and 230 must be eliminated from your table. Œ, and æ are ligatures in the Latin alphabet. Depending on your collation, SQL will translate the character code for these ligatures to the English letter equivalents, stripping any occurrences from your string.














Posted in

content rss

Recent Comments